Commit Graph

121167 Commits

Author SHA1 Message Date
Randy Dunlap
fe5e26a70c nvme-fc: drop a duplicated word in a comment
Drop the repeated word "a" in a comment.

Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Christoph Hellwig <hch@lst.de>
2020-07-29 07:45:20 +02:00
Coly Li
ffa4703275 bcache: add bucket_size_hi into struct cache_sb_disk for large bucket
The large bucket feature is to extend bucket_size from 16bit to 32bit.

When create cache device on zoned device (e.g. zoned NVMe SSD), making
a single bucket cover one or more zones of the zoned device is the
simplest way to support zoned device as cache by bcache.

But current maximum bucket size is 16MB and a typical zone size of zoned
device is 256MB, this is the major motiviation to extend bucket size to
a larger bit width.

This patch is the basic and first change to support large bucket size,
the major changes it makes are,
- Add BCH_FEATURE_INCOMPAT_LARGE_BUCKET for the large bucket feature,
  INCOMPAT means it introduces incompatible on-disk format change.
- Add BCH_FEATURE_INCOMPAT_FUNCS(large_bucket, LARGE_BUCKET) routines.
- Adds __le16 bucket_size_hi into struct cache_sb_disk at offset 0x8d0
  for the on-disk super block format.
- For the in-memory super block struct cache_sb, member bucket_size is
  extended from __u16 to __32.
- Add get_bucket_size() to combine the bucket_size and bucket_size_hi
  from struct cache_sb_disk into an unsigned int value.

Since we already have large bucket size helpers meta_bucket_pages(),
meta_bucket_bytes() and alloc_meta_bucket_pages(), they make sure when
bucket size > 8MB, the memory allocation for bcache meta data bucket
won't fail no matter how large the bucket size extended. So these meta
data buckets are handled properly when the bucket size width increase
from 16bit to 32bit, we don't need to worry about them.

Signed-off-by: Coly Li <colyli@suse.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-07-25 07:38:21 -06:00
Coly Li
4c1ccd0896 bcache: struct cache_sb is only for in-memory super block now
We have struct cache_sb_disk for on-disk super block already, it is
unnecessary to keep the in-memory super block format exactly mapping
to the on-disk struct layout.

This patch adds code comments to notice that struct cache_sb is not
exactly mapping to cache_sb_disk, and removes the useless member csum
and pad[5].

Although struct cache_sb does not belong to uapi, but there are still
some on-disk format related macros reference it and it is unncessary to
get rid of such dependency now. So struct cache_sb will continue to stay
in include/uapi/linux/bache.h for now.

Signed-off-by: Coly Li <colyli@suse.de>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-07-25 07:38:20 -06:00
Coly Li
d721a43ff6 bcache: increase super block version for cache device and backing device
The new added super block version BCACHE_SB_VERSION_BDEV_WITH_FEATURES
(5) BCACHE_SB_VERSION_CDEV_WITH_FEATURES value (6), is for the feature
set bits.

Devices have super block version equal to the new version will have
three new members for feature set bits in the on-disk super block,
        __le64                  feature_compat;
        __le64                  feature_incompat;
        __le64                  feature_ro_compat;

They are used for further new features which may introduce on-disk
format change, and avoid unncessary super block version increase.

The very basic features handling code skeleton is also initialized in
this patch.

Signed-off-by: Coly Li <colyli@suse.de>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-07-25 07:38:20 -06:00
Randy Dunlap
c333f9495c raid: md_p.h: drop duplicated word in a comment
Drop the doubled word "the" in a comment.

Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: Song Liu <song@kernel.org>
Cc: linux-raid@vger.kernel.org
Signed-off-by: Song Liu <songliubraving@fb.com>
2020-07-21 22:05:32 -07:00
Niklas Cassel
659bf827ba block: add max_active_zones to blk-sysfs
Add a new max_active zones definition in the sysfs documentation.
This definition will be common for all devices utilizing the zoned block
device support in the kernel.

Export max_active_zones according to this new definition for NVMe Zoned
Namespace devices, ZAC ATA devices (which are treated as SCSI devices by
the kernel), and ZBC SCSI devices.

Add the new max_active_zones member to struct request_queue, rather
than as a queue limit, since this property cannot be split across stacking
drivers.

For SCSI devices, even though max active zones is not part of the ZBC/ZAC
spec, export max_active_zones as 0, signifying "no limit".

Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com>
Reviewed-by: Javier González <javier@javigon.com>
Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-07-15 14:26:11 -06:00
Niklas Cassel
e15864f8ea block: add max_open_zones to blk-sysfs
Add a new max_open_zones definition in the sysfs documentation.
This definition will be common for all devices utilizing the zoned block
device support in the kernel.

Export max open zones according to this new definition for NVMe Zoned
Namespace devices, ZAC ATA devices (which are treated as SCSI devices by
the kernel), and ZBC SCSI devices.

Add the new max_open_zones member to struct request_queue, rather
than as a queue limit, since this property cannot be split across stacking
drivers.

Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com>
Reviewed-by: Javier González <javier@javigon.com>
Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-07-15 14:26:11 -06:00
Keith Busch
240e6ee272 nvme: support for zoned namespaces
Add support for NVM Express Zoned Namespaces (ZNS) Command Set defined
in NVM Express TP4053. Zoned namespaces are discovered based on their
Command Set Identifier reported in the namespaces Namespace
Identification Descriptor list. A successfully discovered Zoned
Namespace will be registered with the block layer as a host managed
zoned block device with Zone Append command support. A namespace that
does not support append is not supported by the driver.

Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: Javier González <javier.gonz@samsung.com>
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Hans Holmberg <hans.holmberg@wdc.com>
Signed-off-by: Dmitry Fomichev <dmitry.fomichev@wdc.com>
Signed-off-by: Ajay Joshi <ajay.joshi@wdc.com>
Signed-off-by: Aravind Ramesh <aravind.ramesh@wdc.com>
Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com>
Signed-off-by: Matias Bjørling <matias.bjorling@wdc.com>
Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
Signed-off-by: Keith Busch <keith.busch@wdc.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
2020-07-08 16:16:20 +02:00
Keith Busch
be93e87e78 nvme: support for multiple Command Sets Supported and Effects log pages
The Commands Supported and Effects log page was extended with a CSI
field that enables the host to query the log page for each command set
supported. Retrieve this log page for each command set that an attached
namespace supports, and save a pointer to that log in the namespace head.

Reviewed-by: Matias Bjørling <matias.bjorling@wdc.com>
Reviewed-by: Javier González <javier.gonz@samsung.com>
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Reviewed-by: Daniel Wagner <dwagner@suse.de>
Signed-off-by: Keith Busch <keith.busch@wdc.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
2020-07-08 16:16:20 +02:00
Niklas Cassel
71010c3094 nvme: implement multiple I/O Command Set support
Implements support for multiple I/O Command Sets.  NVMe TP 4056
introduces a method to enumerate multiple command sets per namespace. If
the command set is exposed, this method for enumeration will be used
instead of the traditional method that uses the CC.CSS register command
set register for command set identification.

For namespaces where the Command Set Identifier is not supported or
recognized, the specific namespace will not be created.

Reviewed-by: Javier González <javier.gonz@samsung.com>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Reviewed-by: Matias Bjørling <matias.bjorling@wdc.com>
Reviewed-by: Daniel Wagner <dwagner@suse.de>
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
2020-07-08 16:16:19 +02:00
Matias Bjørling
82394db738 block: add capacity field to zone descriptors
In the zoned storage model, the sectors within a zone are typically all
writeable. With the introduction of the Zoned Namespace (ZNS) Command
Set in the NVM Express organization, the model was extended to have a
specific writeable capacity.

Extend the zone descriptor data structure with a zone capacity field to
indicate to the user how many sectors in a zone are writeable.

Introduce backward compatibility in the zone report ioctl by extending
the zone report header data structure with a flags field to indicate if
the capacity field is available.

Reviewed-by: Jens Axboe <axboe@kernel.dk>
Reviewed-by: Javier González <javier.gonz@samsung.com>
Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Reviewed-by: Daniel Wagner <dwagner@suse.de>
Signed-off-by: Matias Bjørling <matias.bjorling@wdc.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
2020-07-08 16:16:19 +02:00
Jens Axboe
482c6b614a Merge tag 'v5.8-rc4' into for-5.9/drivers
Merge in 5.8-rc4 for-5.9/block to setup for-5.9/drivers, to provide
a clean base and making the life for the NVMe changes easier.

Signed-off-by: Jens Axboe <axboe@kernel.dk>

* tag 'v5.8-rc4': (732 commits)
  Linux 5.8-rc4
  x86/ldt: use "pr_info_once()" instead of open-coding it badly
  MIPS: Do not use smp_processor_id() in preemptible code
  MIPS: Add missing EHB in mtc0 -> mfc0 sequence for DSPen
  .gitignore: Do not track `defconfig` from `make savedefconfig`
  io_uring: fix regression with always ignoring signals in io_cqring_wait()
  x86/ldt: Disable 16-bit segments on Xen PV
  x86/entry/32: Fix #MC and #DB wiring on x86_32
  x86/entry/xen: Route #DB correctly on Xen PV
  x86/entry, selftests: Further improve user entry sanity checks
  x86/entry/compat: Clear RAX high bits on Xen PV SYSENTER
  i2c: mlxcpld: check correct size of maximum RECV_LEN packet
  i2c: add Kconfig help text for slave mode
  i2c: slave-eeprom: update documentation
  i2c: eg20t: Load module automatically if ID matches
  i2c: designware: platdrv: Set class based on DMI
  i2c: algo-pca: Add 0x78 as SCL stuck low status for PCA9665
  mm/page_alloc: fix documentation error
  vmalloc: fix the owner argument for the new __vmalloc_node_range callers
  mm/cma.c: use exact_nid true to fix possible per-numa cma leak
  ...
2020-07-08 08:02:13 -06:00
Linus Torvalds
7fec3ce50a Merge tag 'pci-v5.8-fixes-1' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci
Pull PCI fix from Bjorn Helgaas:
 "Fix a pcie_find_root_port() simplification that broke power management
  because it didn't handle the edge case of finding the Root Port of a
  Root Port itself (Mika Westerberg)""

* tag 'pci-v5.8-fixes-1' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci:
  PCI: Make pcie_find_root_port() work for Root Ports
2020-07-03 12:14:51 -07:00
Linus Torvalds
7cc2a8ea10 Merge tag 'block-5.8-2020-07-01' of git://git.kernel.dk/linux-block
Pull block fixes from Jens Axboe:

 - Use kvfree_sensitive() for the block keyslot free (Eric)

 - Sync blk-mq debugfs flags (Hou)

 - Memory leak fix in virtio-blk error path (Hou)

* tag 'block-5.8-2020-07-01' of git://git.kernel.dk/linux-block:
  virtio-blk: free vblk-vqs in error path of virtblk_probe()
  block/keyslot-manager: use kvfree_sensitive()
  blk-mq-debugfs: update blk_queue_flag_name[] accordingly for new flags
2020-07-02 15:13:51 -07:00
Linus Torvalds
c93493b7cd Merge tag 'io_uring-5.8-2020-07-01' of git://git.kernel.dk/linux-block
Pull io_uring fixes from Jens Axboe:
 "One fix in here, for a regression in 5.7 where a task is waiting in
  the kernel for a condition, but that condition won't become true until
  task_work is run. And the task_work can't be run exactly because the
  task is waiting in the kernel, so we'll never make any progress.

  One example of that is registering an eventfd and queueing io_uring
  work, and then the task goes and waits in eventfd read with the
  expectation that it'll get woken (and read an event) when the io_uring
  request completes. The io_uring request is finished through task_work,
  which won't get run while the task is looping in eventfd read"

* tag 'io_uring-5.8-2020-07-01' of git://git.kernel.dk/linux-block:
  io_uring: use signal based task_work running
  task_work: teach task_work_add() to do signal_wake_up()
2020-07-02 14:56:22 -07:00
Christoph Hellwig
1008fe6dc3 block: remove the all_bdevs list
Instead just iterate over the inodes for the block device superblock.

Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-07-01 08:08:25 -06:00
Christoph Hellwig
47b5e00322 block: remove the unused bd_private field from struct block_device
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-07-01 08:08:23 -06:00
Christoph Hellwig
e556f6ba10 block: remove the bd_queue field from struct block_device
Just use bd_disk->queue instead.

Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-07-01 08:08:20 -06:00
Christoph Hellwig
6b7b181b67 block: remove the bd_block_size field from struct block_device
We can trivially calculate the block size from the inodes i_blkbits
variable.  Use that instead of keeping two redundant copies of the
information in slightly different formats.

Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-07-01 08:08:17 -06:00
Christoph Hellwig
5a6c35f9af block: remove direct_make_request
Now that submit_bio_noacct has a decent blk-mq fast path there is no
more need for this bypass.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-07-01 07:27:24 -06:00
Christoph Hellwig
ed00aabd5e block: rename generic_make_request to submit_bio_noacct
generic_make_request has always been very confusingly misnamed, so rename
it to submit_bio_noacct to make it clear that it is submit_bio minus
accounting and a few checks.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-07-01 07:27:24 -06:00
Christoph Hellwig
c62b37d96b block: move ->make_request_fn to struct block_device_operations
The make_request_fn is a little weird in that it sits directly in
struct request_queue instead of an operation vector.  Replace it with
a block_device_operations method called submit_bio (which describes much
better what it does).  Also remove the request_queue argument to it, as
the queue can be derived pretty trivially from the bio.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-07-01 07:27:24 -06:00
Christoph Hellwig
f695ca3886 block: remove the request_queue argument from blk_queue_split
The queue can be trivially derived from the bio, so pass one less
argument.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-07-01 07:27:23 -06:00
Mika Westerberg
5396956cc7 PCI: Make pcie_find_root_port() work for Root Ports
Commit 6ae72bfa65 ("PCI: Unify pcie_find_root_port() and
pci_find_pcie_root_port()") broke acpi_pci_bridge_d3() because calling
pcie_find_root_port() on a Root Port returned NULL when it should return
the Root Port, which in turn broke power management of PCIe hierarchies.

Rework pcie_find_root_port() so it returns its argument when it is already
a Root Port.

[bhelgaas: test device only once, test for PCIe]
Fixes: 6ae72bfa65 ("PCI: Unify pcie_find_root_port() and pci_find_pcie_root_port()")
Link: https://lore.kernel.org/r/20200622161248.51099-1-mika.westerberg@linux.intel.com
Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2020-06-30 16:58:27 -05:00
Linus Torvalds
615bc218d6 Merge tag 'fixes-v5.8-rc3-a' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security
Pull security subsystem fixes from James Morris:
 "Two simple fixes for v5.8:

   - Fix hook iteration and default value for inode_copy_up_xattr
     (KP Singh)

   - Fix the key_permission LSM hook function type (Sami Tolvanen)"

* tag 'fixes-v5.8-rc3-a' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security:
  security: Fix hook iteration and default value for inode_copy_up_xattr
  security: fix the key_permission LSM hook function type
2020-06-30 12:21:53 -07:00
Oleg Nesterov
e91b481623 task_work: teach task_work_add() to do signal_wake_up()
So that the target task will exit the wait_event_interruptible-like
loop and call task_work_run() asap.

The patch turns "bool notify" into 0,TWA_RESUME,TWA_SIGNAL enum, the
new TWA_SIGNAL flag implies signal_wake_up().  However, it needs to
avoid the race with recalc_sigpending(), so the patch also adds the
new JOBCTL_TASK_WORK bit included in JOBCTL_PENDING_MASK.

TODO: once this patch is merged we need to change all current users
of task_work_add(notify = true) to use TWA_RESUME.

Cc: stable@vger.kernel.org # v5.7
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-06-30 12:18:08 -06:00
Ming Lei
65c7636943 blk-mq: pass request queue into get/put budget callback
blk-mq budget is abstract from scsi's device queue depth, and it is
always per-request-queue instead of hctx.

It can be quite absurd to get a budget from one hctx, then dequeue a
request from scheduler queue, and this request may not belong to this
hctx, at least for bfq and deadline.

So fix the mess and always pass request queue to get/put budget
callback.

Signed-off-by: Ming Lei <ming.lei@redhat.com>
Tested-by: Baolin Wang <baolin.wang7@gmail.com>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Douglas Anderson <dianders@chromium.org>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Cc: Sagi Grimberg <sagi@grimberg.me>
Cc: Baolin Wang <baolin.wang7@gmail.com>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Douglas Anderson <dianders@chromium.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-06-30 07:51:48 -06:00
Linus Torvalds
2cfa46dadd Merge branch 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6
Pull crypto fixes from Herbert Xu:
 "This fixes two race conditions, one in padata and one in af_alg"

* 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
  padata: upgrade smp_mb__after_atomic to smp_mb in padata_do_serial
  crypto: af_alg - fix use-after-free in af_alg_accept() due to bh_lock_sock()
2020-06-29 10:06:26 -07:00
Christoph Hellwig
42fdc5e49c blk-mq: remove the BLK_MQ_REQ_INTERNAL flag
Just check for a non-NULL elevator directly to make the code more clear.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-06-29 09:56:18 -06:00
Christoph Hellwig
db18a53e5b blk-cgroup: remove blkcg_bio_issue_check
blkcg_bio_issue_check is a giant inline function that does three entirely
different things.  Factor out the blk-cgroup related bio initalization
into a new helper, and the open code the sequence in the only caller,
relying on the fact that all the actual functionality is stubbed out for
non-cgroup builds.

Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-06-29 09:09:08 -06:00
Christoph Hellwig
93b8063804 blk-cgroup: move rcu locking from blkcg_bio_issue_check to blk_throtl_bio
The only thing in blkcg_bio_issue_check that needs to be under
rcu_read_lock is blk_throtl_bio, so move the locking there.

Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-06-29 09:09:08 -06:00
Christoph Hellwig
81630e27ff blk-cgroup: remove the !bio->bi_blkg check in blkcg_bio_issue_check
This is purely a sanity check for grave programming errors.  Remove it
to simplify further work in this area.

Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-06-29 09:09:08 -06:00
Christoph Hellwig
28fc591ff9 block: move the bio cgroup associatation helpers to blk-cgroup.c
Keep the cgroup code together.

Acked-by: Tejun Heo <tj@kernel.org>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-06-29 09:09:08 -06:00
Christoph Hellwig
a18b9b1590 block: move bio_associate_blkg_from_page to mm/page_io.c
bio_associate_blkg_from_page is a special purpose helper for swap bios
that doesn't need access to bio internals.  Move it to the swap code
instead of having it in bio.c.

Acked-by: Tejun Heo <tj@kernel.org>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-06-29 09:09:08 -06:00
Christoph Hellwig
db9819c76c block: remove bio_disassociate_blkg
bio_disassociate_blkg has two callers, of which one immediately assigns
a new value to >bi_blkg.  Just open code the function in the two callers.

Acked-by: Tejun Heo <tj@kernel.org>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-06-29 09:09:08 -06:00
Hou Tao
bfe373f608 blk-mq-debugfs: update blk_queue_flag_name[] accordingly for new flags
Else there may be magic numbers in /sys/kernel/debug/block/*/state.

Signed-off-by: Hou Tao <houtao1@huawei.com>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-06-29 07:45:09 -06:00
Linus Torvalds
668f532da4 Merge tag 'timers-urgent-2020-06-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull timer fix from Ingo Molnar:
 "A single DocBook fix"

* tag 'timers-urgent-2020-06-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  timekeeping: Fix kerneldoc system_device_crosststamp & al
2020-06-28 11:59:08 -07:00
Linus Torvalds
bc53f67d24 Merge tag 'efi-urgent-2020-06-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull EFI fixes from Ingo Molnar:

 - Fix build regression on v4.8 and older

 - Robustness fix for TPM log parsing code

 - kobject refcount fix for the ESRT parsing code

 - Two efivarfs fixes to make it behave more like an ordinary file
   system

 - Style fixup for zero length arrays

 - Fix a regression in path separator handling in the initrd loader

 - Fix a missing prototype warning

 - Add some kerneldoc headers for newly introduced stub routines

 - Allow support for SSDT overrides via EFI variables to be disabled

 - Report CPU mode and MMU state upon entry for 32-bit ARM

 - Use the correct stack pointer alignment when entering from mixed mode

* tag 'efi-urgent-2020-06-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  efi/libstub: arm: Print CPU boot mode and MMU state at boot
  efi/libstub: arm: Omit arch specific config table matching array on arm64
  efi/x86: Setup stack correctly for efi_pe_entry
  efi: Make it possible to disable efivar_ssdt entirely
  efi/libstub: Descriptions for stub helper functions
  efi/libstub: Fix path separator regression
  efi/libstub: Fix missing-prototype warning for skip_spaces()
  efi: Replace zero-length array and use struct_size() helper
  efivarfs: Don't return -EINTR when rate-limiting reads
  efivarfs: Update inode modification time for successful writes
  efi/esrt: Fix reference count leak in esre_create_sysfs_entry.
  efi/tpm: Verify event log header before parsing
  efi/x86: Fix build with gcc 4
2020-06-28 11:42:16 -07:00
Linus Torvalds
91a9a90d04 Merge tag 'sched_urgent_for_5.8_rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull scheduler fixes from Borislav Petkov:
 "The most anticipated fix in this pull request is probably the horrible
  build fix for the RANDSTRUCT fail that didn't make -rc2. Also included
  is the cleanup that removes those BUILD_BUG_ON()s and replaces it with
  ugly unions.

  Also included is the try_to_wake_up() race fix that was first
  triggered by Paul's RCU-torture runs, but was independently hit by
  Dave Chinner's fstest runs as well"

* tag 'sched_urgent_for_5.8_rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  sched/cfs: change initial value of runnable_avg
  smp, irq_work: Continue smp_call_function*() and irq_work*() integration
  sched/core: s/WF_ON_RQ/WQ_ON_CPU/
  sched/core: Fix ttwu() race
  sched/core: Fix PI boosting between RT and DEADLINE tasks
  sched/deadline: Initialize ->dl_boosted
  sched/core: Check cpus_mask, not cpus_ptr in __set_cpus_allowed_ptr(), to fix mask corruption
  sched/core: Fix CONFIG_GCC_PLUGIN_RANDSTRUCT build fail
2020-06-28 10:37:39 -07:00
Linus Torvalds
098c793821 Merge tag 'x86_urgent_for_5.8_rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fixes from Borislav Petkov:

 - AMD Memory bandwidth counter width fix, by Babu Moger.

 - Use the proper length type in the 32-bit truncate() syscall variant,
   by Jiri Slaby.

 - Reinit IA32_FEAT_CTL during wakeup to fix the case where after
   resume, VMXON would #GP due to VMX not being properly enabled, by
   Sean Christopherson.

 - Fix a static checker warning in the resctrl code, by Dan Carpenter.

 - Add a CR4 pinning mask for bits which cannot change after boot, by
   Kees Cook.

 - Align the start of the loop of __clear_user() to 16 bytes, to improve
   performance on AMD zen1 and zen2 microarchitectures, by Matt Fleming.

* tag 'x86_urgent_for_5.8_rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/asm/64: Align start of __clear_user() loop to 16-bytes
  x86/cpu: Use pinning mask for CR4 bits needing to be 0
  x86/resctrl: Fix a NULL vs IS_ERR() static checker warning in rdt_cdp_peer_get()
  x86/cpu: Reinitialize IA32_FEAT_CTL MSR on BSP during wakeup
  syscalls: Fix offset type of ksys_ftruncate()
  x86/resctrl: Fix memory bandwidth counter width for AMD
2020-06-28 10:35:01 -07:00
Linus Torvalds
c141b30e99 Merge tag 'rcu_urgent_for_5.8_rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull RCU-vs-KCSAN fixes from Borislav Petkov:
 "A single commit that uses "arch_" atomic operations to avoid the
  instrumentation that comes with the non-"arch_" versions.

  In preparation for that commit, it also has another commit that makes
  these "arch_" atomic operations available to generic code.

  Without these commits, KCSAN uses can see pointless errors"

* tag 'rcu_urgent_for_5.8_rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  rcu: Fixup noinstr warnings
  locking/atomics: Provide the arch_atomic_ interface to generic code
2020-06-28 10:29:38 -07:00
Linus Torvalds
a358505d8a Merge tag 'x86_entry_for_5.8' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 entry fixes from Borislav Petkov:
 "This is the x86/entry urgent pile which has accumulated since the
  merge window.

  It is not the smallest but considering the almost complete entry core
  rewrite, the amount of fixes to follow is somewhat higher than usual,
  which is to be expected.

  Peter Zijlstra says:
   'These patches address a number of instrumentation issues that were
    found after the x86/entry overhaul. When combined with rcu/urgent
    and objtool/urgent, these patches make UBSAN/KASAN/KCSAN happy
    again.

    Part of making this all work is bumping the minimum GCC version for
    KASAN builds to gcc-8.3, the reason for this is that the
    __no_sanitize_address function attribute is broken in GCC releases
    before that.

    No known GCC version has a working __no_sanitize_undefined, however
    because the only noinstr violation that results from this happens
    when an UB is found, we treat it like WARN. That is, we allow it to
    violate the noinstr rules in order to get the warning out'"

* tag 'x86_entry_for_5.8' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/entry: Fix #UD vs WARN more
  x86/entry: Increase entry_stack size to a full page
  x86/entry: Fixup bad_iret vs noinstr
  objtool: Don't consider vmlinux a C-file
  kasan: Fix required compiler version
  compiler_attributes.h: Support no_sanitize_undefined check with GCC 4
  x86/entry, bug: Comment the instrumentation_begin() usage for WARN()
  x86/entry, ubsan, objtool: Whitelist __ubsan_handle_*()
  x86/entry, cpumask: Provide non-instrumented variant of cpu_is_offline()
  compiler_types.h: Add __no_sanitize_{address,undefined} to noinstr
  kasan: Bump required compiler version
  x86, kcsan: Add __no_kcsan to noinstr
  kcsan: Remove __no_kcsan_or_inline
  x86, kcsan: Remove __no_kcsan_or_inline usage
2020-06-28 09:42:47 -07:00
Peter Zijlstra
8c4890d1c3 smp, irq_work: Continue smp_call_function*() and irq_work*() integration
Instead of relying on BUG_ON() to ensure the various data structures
line up, use a bunch of horrible unions to make it all automatic.

Much of the union magic is to ensure irq_work and smp_call_function do
not (yet) see the members of their respective data structures change
name.

Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Frederic Weisbecker <frederic@kernel.org>
Link: https://lkml.kernel.org/r/20200622100825.844455025@infradead.org
2020-06-28 17:01:20 +02:00
Peter Zijlstra
4f311afc20 sched/core: Fix CONFIG_GCC_PLUGIN_RANDSTRUCT build fail
As a temporary build fix, the proper cleanup needs more work.

Reported-by: Guenter Roeck <linux@roeck-us.net>
Reported-by: Eric Biggers <ebiggers@kernel.org>
Suggested-by: Eric Biggers <ebiggers@kernel.org>
Suggested-by: Kees Cook <keescook@chromium.org>
Fixes: a148866489 ("sched: Replace rq::wake_list")
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2020-06-28 17:01:20 +02:00
Linus Torvalds
3cd1c5d582 Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Pull SCSI fixes from James Bottomley:
 "Six small fixes, five in drivers and one to correct another minor
  regression from cc97923a5b ("block: move dma drain handling to
  scsi") where we still need the drain stub to be built in to the kernel
  for the modular libata, non-modular SAS driver case"

* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
  scsi: mptscsih: Fix read sense data size
  scsi: zfcp: Fix panic on ERP timeout for previously dismissed ERP action
  scsi: lpfc: Avoid another null dereference in lpfc_sli4_hba_unset()
  scsi: libata: Fix the ata_scsi_dma_need_drain stub
  scsi: qla2xxx: Keep initiator ports after RSCN
  scsi: qla2xxx: Set NVMe status code for failed NVMe FCP request
2020-06-27 15:20:03 -07:00
Linus Torvalds
c322f5399f Merge tag 'vfio-v5.8-rc3' of git://github.com/awilliam/linux-vfio
Pull VFIO fixes from Alex Williamson:

 - Fix double free of eventfd ctx (Alex Williamson)

 - Fix duplicate use of capability ID (Alex Williamson)

 - Fix SR-IOV VF memory enable handling (Alex Williamson)

* tag 'vfio-v5.8-rc3' of git://github.com/awilliam/linux-vfio:
  vfio/pci: Fix SR-IOV VF handling with MMIO blocking
  vfio/type1: Fix migration info capability ID
  vfio/pci: Clear error and request eventfd ctx after releasing
2020-06-27 15:17:21 -07:00
Linus Torvalds
f05baa066d Merge tag 'dma-mapping-5.8-4' of git://git.infradead.org/users/hch/dma-mapping
Pull dma-mapping fixes from Christoph Hellwig:

 - fix dma coherent mmap in nommu (me)

 - more AMD SEV fallout (David Rientjes, me)

 - fix alignment in dma_common_*_remap (Eric Auger)

* tag 'dma-mapping-5.8-4' of git://git.infradead.org/users/hch/dma-mapping:
  dma-remap: align the size in dma_common_*_remap()
  dma-mapping: DMA_COHERENT_POOL should select GENERIC_ALLOCATOR
  dma-direct: add missing set_memory_decrypted() for coherent mapping
  dma-direct: check return value when encrypting or decrypting memory
  dma-direct: re-encrypt memory if dma_direct_alloc_pages() fails
  dma-direct: always align allocation size in dma_direct_alloc_pages()
  dma-direct: mark __dma_direct_alloc_pages static
  dma-direct: re-enable mmap for !CONFIG_MMU
2020-06-27 13:06:22 -07:00
Linus Torvalds
6116dea80d Merge tag 'kgdb-5.8-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/danielt/linux
Pull kgdb fixes from Daniel Thompson:
 "The main change here is a fix for a number of unsafe interactions
  between kdb and the console system. The fixes are specific to kdb
  (pure kgdb debugging does not use the console system at all). On
  systems with an NMI then kdb, if it is enabled, must get messages to
  the user despite potentially running from some "difficult" calling
  contexts. These fixes avoid using the console system where we have
  been provided an alternative (safer) way to interact with the user
  and, if using the console system in unavoidable, use oops_in_progress
  for deadlock avoidance. These fixes also ensure kdb honours the
  console enable flag.

  Also included is a fix that wraps kgdb trap handling in an RCU read
  lock to avoids triggering diagnostic warnings. This is a wide lock
  scope but this is OK because kgdb is a stop-the-world debugger. When
  we stop the world we put all the CPUs into holding pens and this
  inhibits RCU update anyway"

* tag 'kgdb-5.8-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/danielt/linux:
  kgdb: Avoid suspicious RCU usage warning
  kdb: Switch to use safer dbg_io_ops over console APIs
  kdb: Make kdb_printf() console handling more robust
  kdb: Check status of console prior to invoking handlers
  kdb: Re-factor kdb_printf() message write code
2020-06-27 08:53:49 -07:00
Linus Torvalds
bd37cdf8ba Merge tag 'iommu-fixes-v5.8-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu
Pull iommu fixes from Joerg Roedel:
 "A couple of Intel VT-d fixes:

   - Make Intel SVM code 64bit only. The code uses pgd_t* and the IOMMU
     only supports long-mode page-table formats, so its broken on 32bit
     anyway.

   - Make sure GFX quirks in for Intel VT-d are not applied to untrusted
     devices. Those devices might gain full memory access otherwise.

   - Identity mapping setup fix.

   - Fix ACS enabling when Intel IOMMU is off and untrusted devices are
     detected.

   - Two smaller fixes for coherency and IO page-table setup"

* tag 'iommu-fixes-v5.8-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu:
  iommu/vt-d: Fix misuse of iommu_domain_identity_map()
  iommu/vt-d: Update scalable mode paging structure coherency
  iommu/vt-d: Enable PCI ACS for platform opt in hint
  iommu/vt-d: Don't apply gfx quirks to untrusted devices
  iommu/vt-d: Set U/S bit in first level page table by default
  iommu/vt-d: Make Intel SVM code 64-bit only
2020-06-26 12:30:07 -07:00
Linus Torvalds
6a6c9b220a Merge tag 'drm-fixes-2020-06-26' of git://anongit.freedesktop.org/drm/drm
Pull drm fixes from Dave Airlie:
 "Usual rc3 pickup, lots of little fixes all over.

  The core VT registration regression fix is probably the largest,
  otherwise ttm, amdgpu and tegra are the bulk, with some minor driver
  fixes.

  No i915 pull this week which may or may not mean I get 2x of it next
  week, we'll see how it goes.

  core:
   - fix VT registration regression

  ttm:
   - fix two fence leaks

  amdgpu:
   - Fix missed mutex unlock in DC error path
   - Fix firmware leak for sdma5
   - DC bpc property fixes

  amdkfd:
   - Fix memleak in an error path

  radeon:
   - Fix copy paste typo in NI DPM spll validation

  rcar-du:
   - build fix

  tegra:
   - add missing zpos property
   - child driver registeration fix
   - debugfs cleanup fix
   - doc fix

  mcde:
   - reorder fbdev setup

  panel:
   - fix connector type
   - fix orienation for some panels

  sun4i:
   - fix dma/iommu configuration

  uvesafb:
   - respect blank flag"

* tag 'drm-fixes-2020-06-26' of git://anongit.freedesktop.org/drm/drm: (25 commits)
  drm/amd: fix potential memleak in err branch
  drm/amd/display: Fix ineffective setting of max bpc property
  drm/amd/display: Enable output_bpc property on all outputs
  drm/amdgpu: add fw release for sdma v5_0
  drm/fb-helper: Fix vt restore
  drm/radeon: fix fb_div check in ni_init_smc_spll_table()
  drm/amdgpu/display: Unlock mutex on error
  drm/sun4i: mixer: Call of_dma_configure if there's an IOMMU
  drm: panel-orientation-quirks: Use generic orientation-data for Acer S1003
  drm: panel-orientation-quirks: Add quirk for Asus T101HA panel
  video: fbdev: uvesafb: fix "noblank" option handling
  drm/panel-simple: fix connector type for newhaven_nhd_43_480272ef_atxl
  drm/panel-simple: fix connector type for LogicPD Type28 Display
  drm: rcar-du: Fix build error
  drm: mcde: Fix forgotten user of drm->dev_private
  drm: mcde: Fix display initialization problem
  drm/tegra: Add zpos property for cursor planes
  gpu: host1x: Detach driver on unregister
  gpu: host1x: Correct trivial kernel-doc inconsistencies
  drm/tegra: hub: Register child devices
  ...
2020-06-26 12:27:25 -07:00