Commit Graph

137641 Commits

Author SHA1 Message Date
Hannes Reinecke
98d40e7665 block: document BLK_STS_AGAIN usage
BLK_STS_AGAIN should only be used if RQF_NOWAIT is set and the bio
would block. So we'd better document that to avoid accidental misuse.

Signed-off-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20220524055631.85480-2-hare@suse.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-05-27 20:38:17 -06:00
Bart Van Assche
5d2ae14276 block: Fix the bio.bi_opf comment
Commit ef295ecf09 modified the Linux kernel such that the bottom bits
of the bi_opf member contain the operation instead of the topmost bits.
That commit did not update the comment next to bi_opf. Hence this patch.

From commit ef295ecf09:
-#define bio_op(bio)    ((bio)->bi_opf >> BIO_OP_SHIFT)
+#define bio_op(bio)    ((bio)->bi_opf & REQ_OP_MASK)

Cc: Christoph Hellwig <hch@lst.de>
Cc: Ming Lei <ming.lei@redhat.com>
Fixes: ef295ecf09 ("block: better op and flags encoding")
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20220511235152.1082246-1-bvanassche@acm.org
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-05-12 06:33:26 -06:00
Christoph Hellwig
5ce7729f25 block: reorder the REQ_ flags
Keep the op-specific flag last so that they are clearly separate from
the generic flags.  Various recent commits just kept adding new flags
at the end.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20220512061408.1826595-1-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-05-12 06:24:18 -06:00
Christoph Hellwig
f624506f98 kthread: unexport kthread_blkcg
kthread_blkcg is only used by the built-in blk-cgroup code.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Tejun Heo <tj@kernel.org>
Link: https://lore.kernel.org/r/20220420042723.1010598-16-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-05-02 14:06:20 -06:00
Christoph Hellwig
c97ab27157 blk-cgroup: remove unneeded includes from <linux/blk-cgroup.h>
Remove all the includes that aren't actually needed from
<linux/blk-cgroup.h> and push them to the actual source files where
needed.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Tejun Heo <tj@kernel.org>
Link: https://lore.kernel.org/r/20220420042723.1010598-12-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-05-02 14:06:20 -06:00
Christoph Hellwig
7f20ba7c42 blk-cgroup: remove pointless CONFIG_BLOCK ifdefs
No need to make BLK_CGROUP stubs conditional on CONFIG_BLOCK as they
can't be used without that.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Tejun Heo <tj@kernel.org>
Link: https://lore.kernel.org/r/20220420042723.1010598-11-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-05-02 14:06:20 -06:00
Christoph Hellwig
bbb1ebe7a9 blk-cgroup: replace bio_blkcg with bio_blkcg_css
All callers of bio_blkcg actually want the CSS, so replace it with an
interface that does return the CSS.  This now allows to move
struct blkcg_gq to block/blk-cgroup.h instead of exposing it in a
public header.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Tejun Heo <tj@kernel.org>
Link: https://lore.kernel.org/r/20220420042723.1010598-10-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-05-02 14:06:20 -06:00
Christoph Hellwig
f4a6a61cb6 blktrace: cleanup the __trace_note_message interface
Pass the cgroup_subsys_state instead of a the blkg so that blktrace
doesn't need to poke into blk-cgroup internals, and give the name a
blk prefix as the current name is way too generic for a public
interface.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Tejun Heo <tj@kernel.org>
Link: https://lore.kernel.org/r/20220420042723.1010598-9-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-05-02 14:06:20 -06:00
Christoph Hellwig
dec223c92a blk-cgroup: move struct blkcg to block/blk-cgroup.h
There is no real need to expose the blkcg structure to the whole kernel.
Move it to the private header an expose a helper to let the writeback
code access the cgwb_list member.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Tejun Heo <tj@kernel.org>
Link: https://lore.kernel.org/r/20220420042723.1010598-8-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-05-02 14:06:20 -06:00
Christoph Hellwig
397c9f46ee blk-cgroup: move blkcg_{pin,unpin}_online out of line
Move these two functions out of line as there is no good reason
to inline them.  Also switch to passing a cgroup_subsys_state
instead of doing the conversion in the caller to prepare for making
the blkcg structure private to blk-cgroup.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Tejun Heo <tj@kernel.org>
Link: https://lore.kernel.org/r/20220420042723.1010598-7-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-05-02 14:06:20 -06:00
Christoph Hellwig
216889aad3 blk-cgroup: move blk_cgroup_congested out line
There is no urgent need to inline this function, so move it out of line.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Tejun Heo <tj@kernel.org>
Link: https://lore.kernel.org/r/20220420042723.1010598-6-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-05-02 14:06:20 -06:00
Christoph Hellwig
db05628435 blk-cgroup: move blkcg_{get,set}_fc_appid out of line
No need to have these helpers inline.  Also remove the stubs and just use
an IS_ENABLED for the get side (the set side already is only built
conditionlly).

Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Tejun Heo <tj@kernel.org>
Link: https://lore.kernel.org/r/20220420042723.1010598-5-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-05-02 14:06:20 -06:00
Ming Lei
5f0614a55e block: change exported IO accounting interface from gendisk to bdev
Export IO accounting interfaces in terms of block_device now that
gendisk has become more internal to block core.

Rename __part_{start,end}_io_acct's first argument from part to bdev.
Rename __part_{start,end}_io_acct to bdev_{start,end}_io_acct and
export them.  Remove disk_{start,end}_io_acct and update caller (zram)
to use bdev_{start,end}_io_acct.

DM can now be updated to use bdev_{start,end}_io_acct.

Signed-off-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@kernel.org>
Link: https://lore.kernel.org/r/20220418022733.56168-2-snitzer@kernel.org
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-04-18 06:49:52 -06:00
Christoph Hellwig
44abff2c0b block: decouple REQ_OP_SECURE_ERASE from REQ_OP_DISCARD
Secure erase is a very different operation from discard in that it is
a data integrity operation vs hint.  Fully split the limits and helper
infrastructure to make the separation more clear.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Acked-by: Christoph Böhmwalder <christoph.boehmwalder@linbit.com> [drbd]
Acked-by: Ryusuke Konishi <konishi.ryusuke@gmail.com> [nifs2]
Acked-by: Jaegeuk Kim <jaegeuk@kernel.org> [f2fs]
Acked-by: Coly Li <colyli@suse.de> [bcache]
Acked-by: David Sterba <dsterba@suse.com> [btrfs]
Acked-by: Chao Yu <chao@kernel.org>
Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
Link: https://lore.kernel.org/r/20220415045258.199825-27-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-04-17 19:49:59 -06:00
Christoph Hellwig
7b47ef52d0 block: add a bdev_discard_granularity helper
Abstract away implementation details from file systems by providing a
block_device based helper to retrieve the discard granularity.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Acked-by: Christoph Böhmwalder <christoph.boehmwalder@linbit.com> [drbd]
Acked-by: Ryusuke Konishi <konishi.ryusuke@gmail.com>
Acked-by: David Sterba <dsterba@suse.com> [btrfs]
Link: https://lore.kernel.org/r/20220415045258.199825-26-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-04-17 19:49:59 -06:00
Christoph Hellwig
70200574cc block: remove QUEUE_FLAG_DISCARD
Just use a non-zero max_discard_sectors as an indicator for discard
support, similar to what is done for write zeroes.

The only places where needs special attention is the RAID5 driver,
which must clear discard support for security reasons by default,
even if the default stacking rules would allow for it.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Acked-by: Christoph Böhmwalder <christoph.boehmwalder@linbit.com> [drbd]
Acked-by: Jan Höppner <hoeppner@linux.ibm.com> [s390]
Acked-by: Coly Li <colyli@suse.de> [bcache]
Acked-by: David Sterba <dsterba@suse.com> [btrfs]
Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
Link: https://lore.kernel.org/r/20220415045258.199825-25-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-04-17 19:49:59 -06:00
Christoph Hellwig
cf0fbf894b block: add a bdev_max_discard_sectors helper
Add a helper to query the number of sectors support per each discard bio
based on the block device and use this helper to stop various places from
poking into the request_queue to see if discard is supported and if so how
much.  This mirrors what is done e.g. for write zeroes as well.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Acked-by: Christoph Böhmwalder <christoph.boehmwalder@linbit.com> [drbd]
Acked-by: Coly Li <colyli@suse.de> [bcache]
Acked-by: David Sterba <dsterba@suse.com> [btrfs]
Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
Link: https://lore.kernel.org/r/20220415045258.199825-24-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-04-17 19:49:59 -06:00
Christoph Hellwig
5c4b4a5c6f block: move {bdev,queue_limit}_discard_alignment out of line
No need to inline these fairly larger helpers.  Also fix the return value
to be unsigned, just like the field in struct queue_limits.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Link: https://lore.kernel.org/r/20220415045258.199825-22-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-04-17 19:49:59 -06:00
Christoph Hellwig
4e1462ffe8 block: remove queue_discard_alignment
Just use bdev_alignment_offset in disk_discard_alignment_show instead.
That helpers is the same except for an always false branch that doesn't
matter in this slow path.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Link: https://lore.kernel.org/r/20220415045258.199825-20-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-04-17 19:49:59 -06:00
Christoph Hellwig
89098b075c block: move bdev_alignment_offset and queue_limit_alignment_offset out of line
No need to inline these fairly larger helpers.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Link: https://lore.kernel.org/r/20220415045258.199825-19-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-04-17 19:49:59 -06:00
Christoph Hellwig
640f2a2391 block: use bdev_alignment_offset in disk_alignment_offset_show
This does the same as the open coded variant except for an extra branch,
and allows to remove queue_alignment_offset entirely.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Link: https://lore.kernel.org/r/20220415045258.199825-18-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-04-17 19:49:59 -06:00
Christoph Hellwig
2aba0d19f4 block: add a bdev_max_zone_append_sectors helper
Add a helper to check the max supported sectors for zone append based on
the block_device instead of having to poke into the block layer internal
request_queue.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
Link: https://lore.kernel.org/r/20220415045258.199825-16-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-04-17 19:49:59 -06:00
Christoph Hellwig
36d254893a block: add a bdev_stable_writes helper
Add a helper to check the stable writes flag based on the block_device
instead of having to poke into the block layer internal request_queue.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
Link: https://lore.kernel.org/r/20220415045258.199825-15-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-04-17 19:49:59 -06:00
Christoph Hellwig
a557e82e5a block: add a bdev_fua helper
Add a helper to check the FUA flag based on the block_device instead of
having to poke into the block layer internal request_queue.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
Link: https://lore.kernel.org/r/20220415045258.199825-14-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-04-17 19:49:59 -06:00
Christoph Hellwig
08e688fdb8 block: add a bdev_write_cache helper
Add a helper to check the write cache flag based on the block_device
instead of having to poke into the block layer internal request_queue.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Acked-by: David Sterba <dsterba@suse.com> [btrfs]
Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
Link: https://lore.kernel.org/r/20220415045258.199825-13-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-04-17 19:49:59 -06:00
Christoph Hellwig
10f0d2a517 block: add a bdev_nonrot helper
Add a helper to check the nonrot flag based on the block_device instead
of having to poke into the block layer internal request_queue.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Acked-by: David Sterba <dsterba@suse.com> [btrfs]
Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
Link: https://lore.kernel.org/r/20220415045258.199825-12-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-04-17 19:49:59 -06:00
Christoph Hellwig
817e8b51eb target: pass a block_device to target_configure_unmap_from_queue
The SCSI target drivers is a consumer of the block layer and shoul
d generally work on struct block_device.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
Link: https://lore.kernel.org/r/20220415045258.199825-3-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-04-17 19:49:58 -06:00
Christoph Hellwig
066ff57101 block: turn bio_kmalloc into a simple kmalloc wrapper
Remove the magic autofree semantics and require the callers to explicitly
call bio_init to initialize the bio.

This allows bio_free to catch accidental bio_put calls on bio_init()ed
bios as well.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Coly Li <colyli@suse.de>
Acked-by: Mike Snitzer <snitzer@kernel.org>
Link: https://lore.kernel.org/r/20220406061228.410163-5-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-04-17 19:30:41 -06:00
Mike Snitzer
b53f3dcd70 block: allow use of per-cpu bio alloc cache by block drivers
Refine per-cpu bio alloc cache interfaces so that DM and other block
drivers can properly create and use the cache:

DM uses bioset_init_from_src() to do its final bioset initialization,
so must update bioset_init_from_src() to set BIOSET_PERCPU_CACHE if
%src bioset has a cache.

Also move bio_clear_polled() to include/linux/bio.h to allow users
outside of block core.

Signed-off-by: Mike Snitzer <snitzer@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20220324203526.62306-3-snitzer@kernel.org
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-04-17 16:49:40 -06:00
Mike Snitzer
0df71650c0 block: allow using the per-cpu bio cache from bio_alloc_bioset
Replace the BIO_PERCPU_CACHE bio-internal flag with a REQ_ALLOC_CACHE
one that can be passed to bio_alloc / bio_alloc_bioset, and implement
the percpu cache allocation logic in a helper called from
bio_alloc_bioset.  This allows any bio_alloc_bioset user to use the
percpu caches instead of having the functionality tied to struct kiocb.

Signed-off-by: Mike Snitzer <snitzer@kernel.org>
[hch: refactored a bit]
Signed-off-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20220324203526.62306-2-snitzer@kernel.org
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-04-17 16:49:40 -06:00
Linus Torvalds
de6e933668 Merge tag 'gpio-fixes-for-v5.18-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux
Pull gpio fixes from Bartosz Golaszewski:
 "A single fix for gpio-sim and two patches for GPIO ACPI pulled from
  Andy:

   - fix the set/get_multiple() callbacks in gpio-sim

   - use correct format characters in gpiolib-acpi

   - use an unsigned type for pins in gpiolib-acpi"

* tag 'gpio-fixes-for-v5.18-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux:
  gpio: sim: fix setting and getting multiple lines
  gpiolib: acpi: Convert type for pin to be unsigned
  gpiolib: acpi: use correct format characters
2022-04-16 17:01:43 -07:00
Linus Torvalds
92edbe32e3 Merge tag 'random-5.18-rc3-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/crng/random
Pull random number generator fixes from Jason Donenfeld:

 - Per your suggestion, random reads now won't fail if there's a page
   fault after some non-zero amount of data has been read, which makes
   the behavior consistent with all other reads in the kernel.

 - Rather than an inconsistent mix of random_get_entropy() returning an
   unsigned long or a cycles_t, now it just returns an unsigned long.

 - A memcpy() was replaced with an memmove(), because the addresses are
   sometimes overlapping. In practice the destination is always before
   the source, so not really an issue, but better to be correct than
   not.

* tag 'random-5.18-rc3-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/crng/random:
  random: use memmove instead of memcpy for remaining 32 bytes
  random: make random_get_entropy() return an unsigned long
  random: allow partial reads if later user copies fail
2022-04-16 16:42:53 -07:00
Linus Torvalds
90ea17a9e2 Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Pull SCSI fixes from James Bottomley:
 "13 fixes, all in drivers.

  The most extensive changes are in the iscsi series (affecting drivers
  qedi, cxgbi and bnx2i), the next most is scsi_debug, but that's just a
  simple revert and then minor updates to pm80xx"

* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
  scsi: iscsi: MAINTAINERS: Add Mike Christie as co-maintainer
  scsi: qedi: Fix failed disconnect handling
  scsi: iscsi: Fix NOP handling during conn recovery
  scsi: iscsi: Merge suspend fields
  scsi: iscsi: Fix unbound endpoint error handling
  scsi: iscsi: Fix conn cleanup and stop race during iscsid restart
  scsi: iscsi: Fix endpoint reuse regression
  scsi: iscsi: Release endpoint ID when its freed
  scsi: iscsi: Fix offload conn cleanup when iscsid restarts
  scsi: iscsi: Move iscsi_ep_disconnect()
  scsi: pm80xx: Enable upper inbound, outbound queues
  scsi: pm80xx: Mask and unmask upper interrupt vectors 32-63
  Revert "scsi: scsi_debug: Address races following module load"
2022-04-16 13:38:26 -07:00
Bartosz Golaszewski
0ebb4fbe31 Merge tag 'intel-gpio-v5.18-2' of gitolite.kernel.org:pub/scm/linux/kernel/git/andy/linux-gpio-intel into gpio/for-current
intel-gpio for v5.18-2

* Couple of fixes related to handling unsigned value of the pin from ACPI

gpiolib:
 -  acpi: Convert type for pin to be unsigned
 -  acpi: use correct format characters
2022-04-16 21:57:00 +02:00
Linus Torvalds
59250f8a7f Merge branch 'akpm' (patches from Andrew)
Merge misc fixes from Andrew Morton:
 "14 patches.

  Subsystems affected by this patch series: MAINTAINERS, binfmt, and
  mm (tmpfs, secretmem, kasan, kfence, pagealloc, zram, compaction,
  hugetlb, vmalloc, and kmemleak)"

* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
  mm: kmemleak: take a full lowmem check in kmemleak_*_phys()
  mm/vmalloc: fix spinning drain_vmap_work after reading from /proc/vmcore
  revert "fs/binfmt_elf: use PT_LOAD p_align values for static PIE"
  revert "fs/binfmt_elf: fix PT_LOAD p_align values for loaders"
  hugetlb: do not demote poisoned hugetlb pages
  mm: compaction: fix compiler warning when CONFIG_COMPACTION=n
  mm: fix unexpected zeroed page mapping with zram swap
  mm, page_alloc: fix build_zonerefs_node()
  mm, kfence: support kmem_dump_obj() for KFENCE objects
  kasan: fix hw tags enablement when KUNIT tests are disabled
  irq_work: use kasan_record_aux_stack_noalloc() record callstack
  mm/secretmem: fix panic when growing a memfd_secret
  tmpfs: fix regressions from wider use of ZERO_PAGE
  MAINTAINERS: Broadcom internal lists aren't maintainers
2022-04-15 15:57:18 -07:00
Marco Elver
2dfe63e61c mm, kfence: support kmem_dump_obj() for KFENCE objects
Calling kmem_obj_info() via kmem_dump_obj() on KFENCE objects has been
producing garbage data due to the object not actually being maintained
by SLAB or SLUB.

Fix this by implementing __kfence_obj_info() that copies relevant
information to struct kmem_obj_info when the object was allocated by
KFENCE; this is called by a common kmem_obj_info(), which also calls the
slab/slub/slob specific variant now called __kmem_obj_info().

For completeness, kmem_dump_obj() now displays if the object was
allocated by KFENCE.

Link: https://lore.kernel.org/all/20220323090520.GG16885@xsang-OptiPlex-9020/
Link: https://lkml.kernel.org/r/20220406131558.3558585-1-elver@google.com
Fixes: b89fb5ef0c ("mm, kfence: insert KFENCE hooks for SLUB")
Fixes: d3fb45f370 ("mm, kfence: insert KFENCE hooks for SLAB")
Signed-off-by: Marco Elver <elver@google.com>
Reviewed-by: Hyeonggon Yoo <42.hyeyoo@gmail.com>
Reported-by: kernel test robot <oliver.sang@intel.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>	[slab]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-04-15 14:49:55 -07:00
Linus Torvalds
fb649bda6f Merge tag 'block-5.18-2022-04-15' of git://git.kernel.dk/linux-block
Pull block fixes from Jens Axboe:

 - Moving of lower_48_bits() to the block layer and a fix for the
   unaligned_be48 added with that originally (Alexander, Keith)

 - Fix a bad WARN_ON() for trim size checking (Ming)

 - A polled IO timeout fix for null_blk (Ming)

 - Silence IO error printing for dead disks (Christoph)

 - Compat mode range fix (Khazhismel)

 - NVMe pull request via Christoph:
     - Tone down the error logging added this merge window a bit
       (Chaitanya Kulkarni)
     - Quirk devices with non-unique unique identifiers (Christoph)

* tag 'block-5.18-2022-04-15' of git://git.kernel.dk/linux-block:
  block: don't print I/O error warning for dead disks
  block/compat_ioctl: fix range check in BLKGETSIZE
  nvme-pci: disable namespace identifiers for Qemu controllers
  nvme-pci: disable namespace identifiers for the MAXIO MAP1002/1202
  nvme: add a quirk to disable namespace identifiers
  nvme: don't print verbose errors for internal passthrough requests
  block: null_blk: end timed out poll request
  block: fix offset/size check in bio_trim()
  asm-generic: fix __get_unaligned_be48() on 32 bit platforms
  block: move lower_48_bits() to block
2022-04-15 11:38:55 -07:00
Linus Torvalds
0647b9cc7f Merge tag 'io_uring-5.18-2022-04-14' of git://git.kernel.dk/linux-block
Pull io_uring fixes from Jens Axboe:

 - Ensure we check and -EINVAL any use of reserved or struct padding.

   Although we generally always do that, it's missed in two spots for
   resource updates, one for the ring fd registration from this merge
   window, and one for the extended arg. Make sure we have all of them
   handled. (Dylan)

 - A few fixes for the deferred file assignment (me, Pavel)

 - Add a feature flag for the deferred file assignment so apps can tell
   we handle it correctly (me)

 - Fix a small perf regression with the current file position fix in
   this merge window (me)

* tag 'io_uring-5.18-2022-04-14' of git://git.kernel.dk/linux-block:
  io_uring: abort file assignment prior to assigning creds
  io_uring: fix poll error reporting
  io_uring: fix poll file assign deadlock
  io_uring: use right issue_flags for splice/tee
  io_uring: verify pad field is 0 in io_get_ext_arg
  io_uring: verify resv is 0 in ringfd register/unregister
  io_uring: verify that resv2 is 0 in io_uring_rsrc_update2
  io_uring: move io_uring_rsrc_update2 validation
  io_uring: fix assign file locking issue
  io_uring: stop using io_wq_work as an fd placeholder
  io_uring: move apoll->events cache
  io_uring: io_kiocb_update_pos() should not touch file for non -1 offset
  io_uring: flag the fact that linked file assignment is sane
2022-04-15 11:33:20 -07:00
Linus Torvalds
38a5e3fb17 Merge tag 'vfio-v5.18-rc3' of https://github.com/awilliam/linux-vfio
Pull vfio fix from Alex Williamson:

 - Fix VF token checking for vfio-pci variant drivers (Jason Gunthorpe)

* tag 'vfio-v5.18-rc3' of https://github.com/awilliam/linux-vfio:
  vfio/pci: Fix vf_token mechanism when device-specific VF drivers are used
2022-04-14 16:06:56 -07:00
Linus Torvalds
d20339fa93 Merge tag 'net-5.18-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from Paolo Abeni:
 "Including fixes from wireless and netfilter.

  Current release - regressions:

   - smc: fix af_ops of child socket pointing to released memory

   - wifi: ath9k: fix usage of driver-private space in tx_info

  Previous releases - regressions:

   - ipv6: fix panic when forwarding a pkt with no in6 dev

   - sctp: use the correct skb for security_sctp_assoc_request

   - smc: fix NULL pointer dereference in smc_pnet_find_ib()

   - sched: fix initialization order when updating chain 0 head

   - phy: don't defer probe forever if PHY IRQ provider is missing

   - dsa: revert "net: dsa: setup master before ports"

   - dsa: felix: fix tagging protocol changes with multiple CPU ports

   - eth: ice:
      - fix use-after-free when freeing @rx_cpu_rmap
      - revert "iavf: fix deadlock occurrence during resetting VF
        interface"

   - eth: lan966x: stop processing the MAC entry is port is wrong

  Previous releases - always broken:

   - sched:
      - flower: fix parsing of ethertype following VLAN header
      - taprio: check if socket flags are valid

   - nfc: add flush_workqueue to prevent uaf

   - veth: ensure eth header is in skb's linear part

   - eth: stmmac: fix altr_tse_pcs function when using a fixed-link

   - eth: macb: restart tx only if queue pointer is lagging

   - eth: macvlan: fix leaking skb in source mode with nodst option"

* tag 'net-5.18-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (52 commits)
  net: bcmgenet: Revert "Use stronger register read/writes to assure ordering"
  rtnetlink: Fix handling of disabled L3 stats in RTM_GETSTATS replies
  net: dsa: felix: fix tagging protocol changes with multiple CPU ports
  tun: annotate access to queue->trans_start
  nfc: nci: add flush_workqueue to prevent uaf
  net: dsa: realtek: don't parse compatible string for RTL8366S
  net: dsa: realtek: fix Kconfig to assure consistent driver linkage
  net: ftgmac100: access hardware register after clock ready
  Revert "net: dsa: setup master before ports"
  macvlan: Fix leaking skb in source mode with nodst option
  netfilter: nf_tables: nft_parse_register can return a negative value
  net: lan966x: Stop processing the MAC entry is port is wrong.
  net: lan966x: Fix when a port's upper is changed.
  net: lan966x: Fix IGMP snooping when frames have vlan tag
  net: lan966x: Update lan966x_ptp_get_nominal_value
  sctp: Initialize daddr on peeled off socket
  net/smc: Fix af_ops of child socket pointing to released memory
  net/smc: Fix NULL pointer dereference in smc_pnet_find_ib()
  net/smc: use memcpy instead of snprintf to avoid out of bounds read
  net: macb: Restart tx only if queue pointer is lagging
  ...
2022-04-14 11:58:19 -07:00
Linus Torvalds
b9b4c79e58 Merge tag 'sound-5.18-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound
Pull sound fixes from Takashi Iwai:
 "This became an unexpectedly large pull request due to various
  regression fixes in the previous kernels.

  The majority of fixes are a series of patches to address the
  regression at probe errors in devres'ed drivers, while there are yet
  more fixes for the x86 SG allocations and for USB-audio buffer
  management. In addition, a few HD-audio quirks and other small fixes
  are found"

* tag 'sound-5.18-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: (52 commits)
  ALSA: usb-audio: Limit max buffer and period sizes per time
  ALSA: memalloc: Add fallback SG-buffer allocations for x86
  ALSA: nm256: Don't call card private_free at probe error path
  ALSA: mtpav: Don't call card private_free at probe error path
  ALSA: rme9652: Fix the missing snd_card_free() call at probe error
  ALSA: hdspm: Fix the missing snd_card_free() call at probe error
  ALSA: hdsp: Fix the missing snd_card_free() call at probe error
  ALSA: oxygen: Fix the missing snd_card_free() call at probe error
  ALSA: lx6464es: Fix the missing snd_card_free() call at probe error
  ALSA: cmipci: Fix the missing snd_card_free() call at probe error
  ALSA: aw2: Fix the missing snd_card_free() call at probe error
  ALSA: als300: Fix the missing snd_card_free() call at probe error
  ALSA: lola: Fix the missing snd_card_free() call at probe error
  ALSA: bt87x: Fix the missing snd_card_free() call at probe error
  ALSA: sis7019: Fix the missing error handling
  ALSA: intel_hdmi: Fix the missing snd_card_free() call at probe error
  ALSA: via82xx: Fix the missing snd_card_free() call at probe error
  ALSA: sonicvibes: Fix the missing snd_card_free() call at probe error
  ALSA: rme96: Fix the missing snd_card_free() call at probe error
  ALSA: rme32: Fix the missing snd_card_free() call at probe error
  ...
2022-04-14 11:08:12 -07:00
Linus Torvalds
ec9c57a732 Merge tag 'fscache-fixes-20220413' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs
Pull fscache fixes from David Howells:
 "Here's a collection of fscache and cachefiles fixes and misc small
  cleanups. The two main fixes are:

   - Add a missing unmark of the inode in-use mark in an error path.

   - Fix a KASAN slab-out-of-bounds error when setting the xattr on a
     cachefiles volume due to the wrong length being given to memcpy().

  In addition, there's the removal of an unused parameter, removal of an
  unused Kconfig option, conditionalising a bit of procfs-related stuff
  and some doc fixes"

* tag 'fscache-fixes-20220413' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs:
  fscache: remove FSCACHE_OLD_API Kconfig option
  fscache: Use wrapper fscache_set_cache_state() directly when relinquishing
  fscache: Move fscache_cookies_seq_ops specific code under CONFIG_PROC_FS
  fscache: Remove the cookie parameter from fscache_clear_page_bits()
  docs: filesystems: caching/backend-api.rst: fix an object withdrawn API
  docs: filesystems: caching/backend-api.rst: correct two relinquish APIs use
  cachefiles: Fix KASAN slab-out-of-bounds in cachefiles_set_volume_xattr
  cachefiles: unmark inode in use in error path
2022-04-14 10:51:20 -07:00
Jason Gunthorpe
1ef3342a93 vfio/pci: Fix vf_token mechanism when device-specific VF drivers are used
get_pf_vdev() tries to check if a PF is a VFIO PF by looking at the driver:

       if (pci_dev_driver(physfn) != pci_dev_driver(vdev->pdev)) {

However now that we have multiple VF and PF drivers this is no longer
reliable.

This means that security tests realted to vf_token can be skipped by
mixing and matching different VFIO PCI drivers.

Instead of trying to use the driver core to find the PF devices maintain a
linked list of all PF vfio_pci_core_device's that we have called
pci_enable_sriov() on.

When registering a VF just search the list to see if the PF is present and
record the match permanently in the struct. PCI core locking prevents a PF
from passing pci_disable_sriov() while VF drivers are attached so the VFIO
owned PF becomes a static property of the VF.

In common cases where vfio does not own the PF the global list remains
empty and the VF's pointer is statically NULL.

This also fixes a lockdep splat from recursive locking of the
vfio_group::device_lock between vfio_device_get_from_name() and
vfio_device_get_from_dev(). If the VF and PF share the same group this
would deadlock.

Fixes: ff53edf6d6 ("vfio/pci: Split the pci_driver code out of vfio_pci_core.c")
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/0-v3-876570980634+f2e8-vfio_vf_token_jgg@nvidia.com
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2022-04-13 11:37:44 -06:00
Jason A. Donenfeld
b0c3e796f2 random: make random_get_entropy() return an unsigned long
Some implementations were returning type `unsigned long`, while others
that fell back to get_cycles() were implicitly returning a `cycles_t` or
an untyped constant int literal. That makes for weird and confusing
code, and basically all code in the kernel already handled it like it
was an `unsigned long`. I recently tried to handle it as the largest
type it could be, a `cycles_t`, but doing so doesn't really help with
much.

Instead let's just make random_get_entropy() return an unsigned long all
the time. This also matches the commonly used `arch_get_random_long()`
function, so now RDRAND and RDTSC return the same sized integer, which
means one can fallback to the other more gracefully.

Cc: Dominik Brodowski <linux@dominikbrodowski.net>
Cc: Theodore Ts'o <tytso@mit.edu>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2022-04-13 13:58:57 +02:00
Takashi Iwai
925ca893b4 ALSA: memalloc: Add fallback SG-buffer allocations for x86
The recent change for memory allocator replaced the SG-buffer handling
helper for x86 with the standard non-contiguous page handler.  This
works for most cases, but there is a corner case I obviously
overlooked, namely, the fallback of non-contiguous handler without
IOMMU.  When the system runs without IOMMU, the core handler tries to
use the continuous pages with a single SGL entry.  It works nicely for
most cases, but when the system memory gets fragmented, the large
allocation may fail frequently.

Ideally the non-contig handler could deal with the proper SG pages,
it's cumbersome to extend for now.  As a workaround, here we add new
types for (minimalistic) SG allocations, instead, so that the
allocator falls back to those types automatically when the allocation
with the standard API failed.

BTW, one better (but pretty minor) improvement from the previous
SG-buffer code is that this provides the proper mmap support without
the PCM's page fault handling.

Fixes: 2c95b92ecd ("ALSA: memalloc: Unify x86 SG-buffer handling (take#3)")
BugLink: https://gitlab.freedesktop.org/pipewire/pipewire/-/issues/2272
BugLink: https://bugzilla.suse.com/show_bug.cgi?id=1198248
Cc: <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/20220413054808.7547-1-tiwai@suse.de
Signed-off-by: Takashi Iwai <tiwai@suse.de>
2022-04-13 07:48:53 +02:00
Linus Torvalds
a19944809f Merge tag 'hardening-v5.18-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux
Pull hardening fixes from Kees Cook:

 - latent_entropy: Use /dev/urandom instead of small GCC seed (Jason
   Donenfeld)

 - uapi/stddef.h: add missed include guards (Tadeusz Struk)

* tag 'hardening-v5.18-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
  gcc-plugins: latent_entropy: use /dev/urandom
  uapi/linux/stddef.h: Add include guards
2022-04-12 14:29:40 -10:00
Linus Torvalds
c1488c9751 Merge tag 'nfsd-5.18-1' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux
Pull nfsd fixes from Chuck Lever:

 - Fix a write performance regression

 - Fix crashes during request deferral on RDMA transports

* tag 'nfsd-5.18-1' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux:
  SUNRPC: Fix the svc_deferred_event trace class
  SUNRPC: Fix NFSD's request deferral on RDMA transports
  nfsd: Clean up nfsd_file_put()
  nfsd: Fix a write performance regression
  SUNRPC: Return true/false (not 1/0) from bool functions
2022-04-12 14:23:19 -10:00
Alexander Lobakin
b97687527b asm-generic: fix __get_unaligned_be48() on 32 bit platforms
While testing the new macros for working with 48 bit containers,
I faced a weird problem:

32 + 16: 0x2ef6e8da 0x79e60000
48: 0xffffe8da + 0x79e60000

All the bits starting from the 32nd were getting 1d in 9/10 cases.
The debug showed:

p[0]: 0x00002e0000000000
p[1]: 0x00002ef600000000
p[2]: 0xffffffffe8000000
p[3]: 0xffffffffe8da0000
p[4]: 0xffffffffe8da7900
p[5]: 0xffffffffe8da79e6

that the value becomes a garbage after the third OR, i.e. on
`p[2] << 24`.
When the 31st bit is 1 and there's no explicit cast to an unsigned,
it's being considered as a signed int and getting sign-extended on
OR, so `e8000000` becomes `ffffffffe8000000` and messes up the
result.
Cast the @p[2] to u64 as well to avoid this. Now:

32 + 16: 0x7ef6a490 0xddc10000
48: 0x7ef6a490 + 0xddc10000

p[0]: 0x00007e0000000000
p[1]: 0x00007ef600000000
p[2]: 0x00007ef6a4000000
p[3]: 0x00007ef6a4900000
p[4]: 0x00007ef6a490dd00
p[5]: 0x00007ef6a490ddc1

Fixes: c2ea5fcf53 ("asm-generic: introduce be48 unaligned accessors")
Signed-off-by: Alexander Lobakin <alobakin@pm.me>
Link: https://lore.kernel.org/r/20220412215220.75677-1-alobakin@pm.me
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-04-12 16:31:38 -06:00
Takashi Iwai
fee2b871d8 ALSA: core: Add snd_card_free_on_error() helper
This is a small helper function to handle the error path more easily
when an error happens during the probe for the device with the
device-managed card.  Since devres releases in the reverser order of
the creations, usually snd_card_free() gets called at the last in the
probe error path unless it already reached snd_card_register() calls.
Due to this nature, when a driver expects the resource releases in
card->private_free, this might be called too lately.

As a workaround, one should call the probe like:

 static int __some_probe(...) { // do real probe.... }

 static int some_probe(...)
 {
	return snd_card_free_on_error(dev, __some_probe(dev, ...));
 }

so that the snd_card_free() is called explicitly at the beginning of
the error path from the probe.

This function will be used in the upcoming fixes to address the
regressions by devres usages.

Fixes: e8ad415b7a ("ALSA: core: Add managed card creation")
Cc: <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/20220412093141.8008-2-tiwai@suse.de
Signed-off-by: Takashi Iwai <tiwai@suse.de>
2022-04-12 11:47:57 +02:00
Mike Christie
44ac97109e scsi: iscsi: Fix NOP handling during conn recovery
If a offload driver doesn't use the xmit workqueue, then when we are doing
ep_disconnect libiscsi can still inject PDUs to the driver. This adds a
check for if the connection is bound before trying to inject PDUs.

Link: https://lore.kernel.org/r/20220408001314.5014-9-michael.christie@oracle.com
Tested-by: Manish Rangankar <mrangankar@marvell.com>
Reviewed-by: Lee Duncan <lduncan@suse.com>
Reviewed-by: Chris Leech <cleech@redhat.com>
Signed-off-by: Mike Christie <michael.christie@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-04-11 22:09:35 -04:00