Bisection steps on the way to 5.12-rc1
There's something broken with the 582cd91f69 ("Merge tag
'for-5.12/block-2021-02-17' of git://git.kernel.dk/linux-block") merge,
so let's take it in smaller chunks for now.
Change-Id: I9c655a3e714d044eca7a0da5f094605a5d7fdd5e
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Enable in-kernel MTE (Memory Tagging Extension) support via
CONFIG_KASAN_HW_TAGS=y. With this change in-kernel MTE will be
auto-enabled during boot on hardware that supports MTE.
Currently, in-kernel MTE is only supported for slab and page_alloc
allocations. Future changes might include support for vmalloc, stack,
and globals.
By default:
- MTE works in synchronous mode, which means that tag faults are
reported at the point of occurrence.
- When a tag fault is detected, a report is printed into the kernel log.
Only the first tag fault gets reported. No panic occurs unless either
"kasan.fault=panic" or "panic_on_warn" is set via command line.
- A report contains the address and a stack trace of the access.
There are no alloc/free stack traces for the accessed page or slab
object (as specified via CONFIG_CMDLINE in this change).
These defaults can be overridden via command line parameters, see
Documentation/dev-tools/kasan.rst for details. In particular, using
the "kasan=off" command line parameter will turn in-kernel MTE off.
Note that enabling alloc/free stacktraces requires specifying both
"kasan.stacktrace=on" and "stack_depot_disable=off".
On MTE-enabled hardware, a performance impact of ~10% is expected, but
there is no such hardware yet to run benchmarks. A future integration of
in-kernel MTE with init_on_alloc/free might significantly reduce the
performance impact.
There is no performance impact when in-kernel MTE is disabled via
command line or when hardware without MTE (pre-ARMv8.5) is in use.
There is still a side-effect of TTBR1 TBI (Top Byte Ignore) getting
enabled with CONFIG_KASAN_HW_TAGS=y.
Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
Bug: 172318110
Change-Id: I2f9bb845ae43292c182532e5e42f43e07b4d0d56
Wire up f2fs with fscrypt direct I/O support. Direct I/O with fscrypt is
only supported through blk-crypto (i.e. CONFIG_BLK_INLINE_ENCRYPTION must
have been enabled, the 'inlinecrypt' mount option must have been specified,
and either hardware inline encryption support must be present or
CONFIG_BLK_INLINE_ENCRYPTION_FALLBACK must have been enabled). Further,
direct I/O on encrypted files is only supported when I/O is aligned
to the filesystem block size (which is *not* necessarily the same as the
block device's block size).
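The alignment requirement above can be illustrated with a minimal userspace
sketch. The helper name dio_fs_block_aligned() is hypothetical and is not
part of this patch; it only models the file offset and length part of the
check and assumes the filesystem block size is a nonzero power of two.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper (not from the patch): returns true when both the
 * file offset and the length of a direct I/O request are multiples of
 * the filesystem block size. Assumes blocksize is a nonzero power of two.
 */
bool dio_fs_block_aligned(uint64_t pos, uint64_t len, uint32_t blocksize)
{
	uint64_t mask = (uint64_t)blocksize - 1;

	/* Any low bit set in pos or len means the request is misaligned. */
	return ((pos | len) & mask) == 0;
}
```

With a 4096-byte filesystem block size, a request at offset 512 is rejected
by this check even if the block device itself uses 512-byte blocks.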
Signed-off-by: Eric Biggers <ebiggers@google.com>
Co-developed-by: Satya Tangirala <satyat@google.com>
Signed-off-by: Satya Tangirala <satyat@google.com>
Acked-by: Jaegeuk Kim <jaegeuk@kernel.org>
Bug: 162255927
Link: https://lore.kernel.org/r/20200724184501.1651378-6-satyat@google.com
Change-Id: I2efde5aed559ba59f964d7d2e54f73414062daf8
Signed-off-by: Eric Biggers <ebiggers@google.com>
Wire up ext4 with fscrypt direct I/O support. Direct I/O with fscrypt is
only supported through blk-crypto (i.e. CONFIG_BLK_INLINE_ENCRYPTION must
have been enabled, the 'inlinecrypt' mount option must have been specified,
and either hardware inline encryption support must be present or
CONFIG_BLK_INLINE_ENCRYPTION_FALLBACK must have been enabled). Further,
direct I/O on encrypted files is only supported when I/O is aligned
to the filesystem block size (which is *not* necessarily the same as the
block device's block size).
fscrypt_limit_io_blocks() is called before setting up the iomap to ensure
that the blocks of each bio that iomap will submit will have contiguous
DUNs. Note that fscrypt_limit_io_blocks() is normally a no-op, since
the DUNs usually just increment along with the logical blocks. But it's needed
to handle an edge case in one of the fscrypt IV generation methods.
Signed-off-by: Eric Biggers <ebiggers@google.com>
Co-developed-by: Satya Tangirala <satyat@google.com>
Signed-off-by: Satya Tangirala <satyat@google.com>
Reviewed-by: Jaegeuk Kim <jaegeuk@kernel.org>
Bug: 162255927
Link: https://lore.kernel.org/r/20200724184501.1651378-5-satyat@google.com
Change-Id: Ia3d869cefabdff070f4e77c46190351f6cb5d74c
Signed-off-by: Eric Biggers <ebiggers@google.com>
Set bio crypt contexts on bios by calling into fscrypt when required.
No DUN contiguity checks are done - callers are expected to set up the
iomap correctly to ensure that each bio submitted by iomap will not have
blocks with noncontiguous DUNs by calling fscrypt_limit_io_blocks()
appropriately.
Signed-off-by: Eric Biggers <ebiggers@google.com>
Co-developed-by: Satya Tangirala <satyat@google.com>
Signed-off-by: Satya Tangirala <satyat@google.com>
Bug: 162255927
Link: https://lore.kernel.org/r/20200724184501.1651378-4-satyat@google.com
Change-Id: I34bd73001d53c854b5905799d3a9c31762914763
Signed-off-by: Eric Biggers <ebiggers@google.com>
Set bio crypt contexts on bios by calling into fscrypt when required,
and explicitly check for DUN continuity when adding pages to the bio.
(While DUN continuity is usually implied by logical block contiguity,
this is not the case when using certain fscrypt IV generation methods
like IV_INO_LBLK_32).
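A minimal sketch of why the explicit check is needed, assuming that an
IV_INO_LBLK_32 DUN is the 32-bit sum of a hashed inode number and the
logical block number. Both function names are hypothetical userspace
models, not the kernel's implementation.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical model of an IV_INO_LBLK_32 DUN: the sum of the hashed
 * inode number and the logical block number, truncated to 32 bits.
 */
uint32_t dun_ino_lblk_32(uint32_t hashed_ino, uint64_t lblk)
{
	return (uint32_t)(hashed_ino + lblk);
}

/*
 * Returns true if a block with DUN next_dun directly follows a bio whose
 * first block has DUN first_dun and which already holds nr_blocks blocks.
 * The comparison is done in 64 bits so that a 32-bit wraparound shows up
 * as a mismatch.
 */
bool dun_is_contiguous(uint64_t first_dun, uint64_t nr_blocks,
		       uint64_t next_dun)
{
	return next_dun == first_dun + nr_blocks;
}
```

In this model, logical blocks 0, 1, and 2 of a file whose hashed inode
number is 0xFFFFFFFE get DUNs 0xFFFFFFFE, 0xFFFFFFFF, and 0, so the third
block is logically contiguous but not DUN-contiguous.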
Signed-off-by: Eric Biggers <ebiggers@google.com>
Co-developed-by: Satya Tangirala <satyat@google.com>
Signed-off-by: Satya Tangirala <satyat@google.com>
Reviewed-by: Jaegeuk Kim <jaegeuk@kernel.org>
Bug: 162255927
Link: https://lore.kernel.org/r/20200724184501.1651378-3-satyat@google.com
Change-Id: I57ff74185004371c01ec35d806b0749583375c58
Signed-off-by: Eric Biggers <ebiggers@google.com>
The upstream version of the direct I/O on encrypted files patch series
missed exporting this function, which is needed if ext4 is built as a
module.
Bug: 162255927
Fixes: 914bc8e7646a ("FROMLIST: fscrypt: Add functions for direct I/O support")
Change-Id: Ib827b4743423c7446436a47fcf95b255466288a3
Signed-off-by: Satya Tangirala <satyat@google.com>
Introduce fscrypt_dio_supported() to check whether a direct I/O request
is unsupported due to encryption constraints.
Also introduce fscrypt_limit_io_blocks() to limit how many blocks can be
added to a bio being prepared for direct I/O. This is needed for
filesystems that use the iomap direct I/O implementation to avoid DUN
wraparound in the middle of a bio (which is possible with the
IV_INO_LBLK_32 IV generation method). Elsewhere fscrypt_mergeable_bio()
is used for this, but iomap operates on logical ranges directly, so
filesystems using iomap won't have a chance to call fscrypt_mergeable_bio()
on every block added to a bio. So we need this function which limits a
logical range in one go.
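The idea of limiting a logical range in one go can be sketched in userspace
as follows. This is an illustration under the assumption that the
IV_INO_LBLK_32 DUN is (hashed inode number + logical block number) mod 2^32;
limit_io_blocks_32() is a hypothetical name and not the function added by
this patch.

```c
#include <stdint.h>

/*
 * Hypothetical model: clamp nr_blocks so that the DUNs of the range
 * starting at lblk never wrap from UINT32_MAX back to 0 in the middle
 * of the range.
 */
uint64_t limit_io_blocks_32(uint32_t hashed_ino, uint64_t lblk,
			    uint64_t nr_blocks)
{
	/* DUN of the first block in the range (assumed 32-bit sum). */
	uint32_t dun = (uint32_t)(hashed_ino + lblk);
	/* Number of blocks that fit before the DUN would wrap to 0. */
	uint64_t until_wrap = (uint64_t)UINT32_MAX + 1 - dun;

	return nr_blocks < until_wrap ? nr_blocks : until_wrap;
}
```

For most ranges the result equals nr_blocks, which matches the description
of the limit being a no-op in the common case.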
Signed-off-by: Eric Biggers <ebiggers@google.com>
Co-developed-by: Satya Tangirala <satyat@google.com>
Signed-off-by: Satya Tangirala <satyat@google.com>
Bug: 162255927
Link: https://lore.kernel.org/r/20200724184501.1651378-2-satyat@google.com
Change-Id: I1dbd4f382d510d9b779d5e44a77fadf7040cf077
Signed-off-by: Eric Biggers <ebiggers@google.com>
Revert the direct I/O support for encrypted files so that we can bring
in the latest version of the patches from the mailing list. This is
needed because in v5.5 and later, the ext4 support (via fs/iomap/) is
broken as-is -- not only is the second call to fscrypt_limit_dio_pages()
in the wrong place, but bios can exceed the intended nr_pages limit due
to multipage bvecs. In order to fix this we need the v6 patches which
make fs/ext4/ handle the limiting instead of fs/iomap/.
On android-mainline, this fixes a failure in vts_kernel_encryption_test
(specifically, FBEPolicyTest#TestAesEmmcOptimizedPolicy) when run on a
device that uses the inlinecrypt mount option on ext4 (e.g. db845c).
Bug: 162255927
Bug: 171462575
Change-Id: I0da753dc9e0e7bc8d84bbcadfdfcdb9328cdb8d8
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Satya Tangirala <satyat@google.com>
Enable GPIO LEDs as module for use on Amlogic boards such as Khadas
VIM3 family.
Bug: 179406580
Change-Id: I0952970ae3b7c2c1d6c09ca4a405c2815caf92ae
Signed-off-by: Kevin Hilman <khilman@baylibre.com>
Steps on the way to 5.12-rc1.
Resolves merge issues with:
fs/verity/signature.c
include/linux/fsverity.h
Cc: Eric Biggers <ebiggers@google.com>
Cc: Paul Lawrence <paullawrence@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: I38ec011e7931f81341afed6cf24de550234b893b
Steps on the way to 5.12-rc1.
Resolves merge conflicts in:
drivers/base/power/main.c
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: Iad58c40d8ec08d50284751ed3530186f373546d1
In order to provide better reviewer suggestions and to introduce fine
grained additional approval permissions, sprinkle some OWNERS files over
the tree.
This essentially grants additional OWNERS permissions in
- net/
- arch/arm*/net
- arch/x86/net
- drivers/net
- fs/crypto
- fs/f2fs
- fs/verity
All other permissions are already covered by ref/meta/config:OWNERS.
Signed-off-by: Matthias Maennich <maennich@google.com>
Change-Id: I02a6563e4dd63842af37c50f7180bf38e9b9b4be
TTY/Staging/USB merges on the way to 5.12-rc1
Resolves merge issues with:
drivers/usb/gadget/configfs.c
drivers/usb/typec/tcpm/tcpm.c
include/linux/usb/tcpm.h
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: Ic094f04295e248c167193506a4bd1fc8b1856a93
Big arm merge on the way to 5.12-rc1
Resolves conflicts with:
arch/arm/mach-prima2/rstc.c
arch/arm64/boot/dts/amlogic/Makefile
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: I58a576c81f4f78f59111389a472b94c5e751d31d
Updates the documentation and comments for the MODULE_SCMVERSION feature.
Bug: 180027765
Fixes: 4b9c11a373 ("ANDROID: modules: introduce the MODULE_SCMVERSION config")
Change-Id: I648b31c4810c777ec3d2cb141b61f5924559c76f
Signed-off-by: Will McVicker <willmcvicker@google.com>
First steps of the 5.12-rc1 merge, the large networking chunk.
Resolves merge conflicts in:
net/core/filter.c
net/ipv6/route.c
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: Id1650a4e7ab7104647e85beddddb672f779d4d1f
With cgroup v2 hierarchy enabled PSI accounts stalls for each cgroup
separately and aggregates at each level of the hierarchy. That causes
additional overhead since psi_avgs_work would be called for each
cgroup in the hierarchy.
In Android we use PSI only at the system level, therefore this overhead
can be avoided.
Introduce CONFIG_PSI_PER_CGROUP_ACCT that controls per-cgroup PSI
tracking and is disabled by default.
Bug: 178872719
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
Change-Id: I70a418aba76b46a27eb9e66080434aa870496384
(cherry picked from commit bd3983c8a8)
(cherry picked from commit 5e00dceecb6d26c2d7382045b488fd9d1ced1c11)
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
Backing file needs to have write permissions for all users
even though the mounted view doesn't - otherwise incfs can't
change the internal file data.
Bug: 180535478
Test: adb install <apk>
Signed-off-by: Yurii Zubrytskyi <zyy@google.com>
Signed-off-by: Paul Lawrence <paullawrence@google.com>
Change-Id: I5d7915b28072cff1508ba45b56e844cb678ca466
Pull libata updates from Jens Axboe:
"Regulators management addition from Florian, and a trivial change to
avoid comma separated statements from Joe"
* tag 'for-5.12/libata-2021-02-17' of git://git.kernel.dk/linux-block:
ata: Avoid comma separated statements
ata: ahci_brcm: Add back regulators management
Pull oprofile and dcookies removal from Viresh Kumar:
"Remove oprofile and dcookies support
The 'oprofile' user-space tools don't use the kernel OPROFILE support
any more, and haven't in a long time. User-space has been converted to
the perf interfaces.
The dcookies stuff is only used by the oprofile code. Now that
oprofile's support is getting removed from the kernel, there is no
need for dcookies as well.
Remove kernel's old oprofile and dcookies support"
* tag 'oprofile-removal-5.12' of git://git.kernel.org/pub/scm/linux/kernel/git/vireshk/linux:
fs: Remove dcookies support
drivers: Remove CONFIG_OPROFILE support
arch: xtensa: Remove CONFIG_OPROFILE support
arch: x86: Remove CONFIG_OPROFILE support
arch: sparc: Remove CONFIG_OPROFILE support
arch: sh: Remove CONFIG_OPROFILE support
arch: s390: Remove CONFIG_OPROFILE support
arch: powerpc: Remove oprofile
arch: powerpc: Stop building and using oprofile
arch: parisc: Remove CONFIG_OPROFILE support
arch: mips: Remove CONFIG_OPROFILE support
arch: microblaze: Remove CONFIG_OPROFILE support
arch: ia64: Remove rest of perfmon support
arch: ia64: Remove CONFIG_OPROFILE support
arch: hexagon: Don't select HAVE_OPROFILE
arch: arc: Remove CONFIG_OPROFILE support
arch: arm: Remove CONFIG_OPROFILE support
arch: alpha: Remove CONFIG_OPROFILE support
Pull xfs updates from Darrick Wong:
"There's a lot going on this time, which seems about right for this
drama-filled year.
Community developers added some code to speed up freezing when
read-only workloads are still running, refactored the logging code,
added checks to prevent file extent counter overflow, reduced iolock
cycling to speed up fsync and gc scans, and started the slow march
towards supporting filesystem shrinking.
There's a huge refactoring of the internal speculative preallocation
garbage collection code which fixes a bunch of bugs, makes the gc
scheduling per-AG and hence multithreaded, and standardizes the retry
logic when we try to reserve space or quota, can't, and want to
trigger a gc scan. We also enable multithreaded quotacheck to reduce
mount times further. This is also preparation for background file gc,
which may or may not land for 5.13.
We also fixed some deadlocks in the rename code, fixed a quota
accounting leak when FSSETXATTR fails, restored the behavior that
write faults to an mmap'd region actually cause a SIGBUS, fixed a bug
where sgid directory inheritance wasn't quite working properly, and
fixed a bug where symlinks weren't working properly in ecryptfs. We
also now advertise the inode btree counters feature that was
introduced two cycles ago.
Summary:
- Fix an ABBA deadlock when renaming files on overlayfs.
- Make sure that we can't overflow the inode extent counters when
adding to or removing extents from a file.
- Make directory sgid inheritance work the same way as all the other
filesystems.
- Don't drain the buffer cache on freeze and ro remount, which should
reduce the amount of time if read-only workloads are continuing
during the freeze.
- Fix a bug where symlink size isn't reported to the vfs in ecryptfs.
- Disentangle log cleaning from log covering. This refactoring sets
us up for future changes to the log, though for now it simply means
that we can use covering for freezes, and cleaning becomes
something we only do at unmount.
- Speed up file fsyncs by reducing iolock cycling.
- Fix delalloc blocks leaking when changing the project id fails
because of input validation errors in FSSETXATTR.
- Fix oversized quota reservation when converting unwritten extents
during a DAX write.
- Create a transaction allocation helper function to standardize the
idiom of allocating a transaction, reserving blocks, locking
inodes, and reserving quota. Replace all the open-coded logic for
file creation, file ownership changes, and file modifications to
use them.
- Actually shut down the fs if the incore quota reservations get
corrupted.
- Fix background block garbage collection scans to not block and to
actually clean out CoW staging extents properly.
- Run block gc scans when we run low on project quota.
- Use the standardized transaction allocation helpers to make it so
that ENOSPC and EDQUOT errors during reservation will back out,
invoke the block gc scanner, and try again. This is preparation for
introducing background inode garbage collection in the next cycle.
- Combine speculative post-EOF block garbage collection with
speculative copy on write block garbage collection.
- Enable multithreaded quotacheck.
- Allow sysadmins to tweak the CPU affinities and maximum concurrency
levels of quotacheck and background blockgc worker pools.
- Expose the inode btree counter feature in the fs geometry ioctl.
- Cleanups of the growfs code in preparation for starting work on
filesystem shrinking.
- Fix all the bloody gcc warnings that the maintainer knows about. :P
- Fix a RST syntax error.
- Don't trigger bmbt corruption assertions after the fs shuts down.
- Restore behavior of forcing SIGBUS on a shut down filesystem when
someone triggers a mmap write fault (or really, any buffered
write)"
* tag 'xfs-5.12-merge-5' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux: (85 commits)
xfs: consider shutdown in bmapbt cursor delete assert
xfs: fix boolreturn.cocci warnings
xfs: restore shutdown check in mapped write fault path
xfs: fix rst syntax error in admin guide
xfs: fix incorrect root dquot corruption error when switching group/project quota types
xfs: get rid of xfs_growfs_{data,log}_t
xfs: rename `new' to `delta' in xfs_growfs_data_private()
libxfs: expose inobtcount in xfs geometry
xfs: don't bounce the iolock between free_{eof,cow}blocks
xfs: expose the blockgc workqueue knobs publicly
xfs: parallelize block preallocation garbage collection
xfs: rename block gc start and stop functions
xfs: only walk the incore inode tree once per blockgc scan
xfs: consolidate the eofblocks and cowblocks workers
xfs: consolidate incore inode radix tree posteof/cowblocks tags
xfs: remove trivial eof/cowblocks functions
xfs: hide xfs_icache_free_cowblocks
xfs: hide xfs_icache_free_eofblocks
xfs: relocate the eofb/cowb workqueue functions
xfs: set WQ_SYSFS on all workqueues in debug mode
...
Pull iomap updates from Darrick Wong:
"The big change in this cycle is some new code to make it possible for
XFS to try unaligned directio overwrites without taking locks. If the
block is fully written and within EOF (i.e. doesn't require any
further fs intervention) then we can let the unlocked write proceed.
If not, we fall back to synchronizing direct writes.
Summary:
- Adjust the final parameter of iomap_dio_rw.
- Add a new flag to request that iomap directio writes return EAGAIN
if the write is not a pure overwrite within EOF; this will be used
to reduce lock contention with unaligned direct writes on XFS.
- Amend XFS' directio code to eliminate exclusive locking for
unaligned direct writes if the circumstances permit"
* tag 'iomap-5.12-merge-2' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
xfs: reduce exclusive locking on unaligned dio
xfs: split the unaligned DIO write code out
xfs: improve the reflink_bounce_dio_write tracepoint
xfs: simplify the read/write tracepoints
xfs: remove the buffered I/O fallback assert
xfs: cleanup the read/write helper naming
xfs: make xfs_file_aio_write_checks IOCB_NOWAIT-aware
xfs: factor out a xfs_ilock_iocb helper
iomap: add a IOMAP_DIO_OVERWRITE_ONLY flag
iomap: pass a flags argument to iomap_dio_rw
iomap: rename the flags variable in __iomap_dio_rw
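The try-then-fall-back flow described in the iomap summary above can be
modeled with a small userspace sketch. Everything here is hypothetical
(struct file_state, try_overwrite_only(), choose_dio_path()); it only
illustrates the decision "pure overwrite within EOF, else EAGAIN and
serialize" and is not the XFS or iomap code.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified view of the file state relevant to the write. */
struct file_state {
	uint64_t eof;		/* current end of file, in bytes */
	bool range_written;	/* target blocks are already fully written */
};

enum dio_path { DIO_PATH_OPTIMISTIC, DIO_PATH_SERIALIZED };

/*
 * Models a direct write issued in "overwrite only" mode: it succeeds only
 * for a pure overwrite within EOF and otherwise returns -EAGAIN.
 */
int try_overwrite_only(const struct file_state *f, uint64_t pos, uint64_t len)
{
	if (!f->range_written || pos + len > f->eof)
		return -EAGAIN;
	return 0;
}

/* Try the optimistic path first; fall back to the serialized path. */
enum dio_path choose_dio_path(const struct file_state *f, uint64_t pos,
			      uint64_t len)
{
	if (try_overwrite_only(f, pos, len) == 0)
		return DIO_PATH_OPTIMISTIC;
	return DIO_PATH_SERIALIZED;
}
```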
Pull pstore fix from Kees Cook:
"Fix a CONFIG typo (Jiri Bohac)"
* tag 'pstore-v5.12-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
pstore: Fix typo in compression option name
Pull fsverity updates from Eric Biggers:
"Add an ioctl which allows reading fs-verity metadata from a file.
This is useful when a file with fs-verity enabled needs to be served
somewhere, and the other end wants to do its own fs-verity compatible
verification of the file. See the commit messages for details.
This new ioctl has been tested using new xfstests I've written for it"
* tag 'fsverity-for-linus' of git://git.kernel.org/pub/scm/fs/fscrypt/fscrypt:
fs-verity: support reading signature with ioctl
fs-verity: support reading descriptor with ioctl
fs-verity: support reading Merkle tree with ioctl
fs-verity: add FS_IOC_READ_VERITY_METADATA ioctl
fs-verity: don't pass whole descriptor to fsverity_verify_signature()
fs-verity: factor out fsverity_get_descriptor()
Pull nfsd updates from Chuck Lever:
- Update NFSv2 and NFSv3 XDR decoding functions
- Further improve support for re-exporting NFS mounts
- Convert NFSD stats to per-CPU counters
- Add batch Receive posting to the server's RPC/RDMA transport
* tag 'nfsd-5.12' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux: (65 commits)
nfsd: skip some unnecessary stats in the v4 case
nfs: use change attribute for NFS re-exports
NFSv4_2: SSC helper should use its own config.
nfsd: cstate->session->se_client -> cstate->clp
nfsd: simplify nfsd4_check_open_reclaim
nfsd: remove unused set_client argument
nfsd: find_cpntf_state cleanup
nfsd: refactor set_client
nfsd: rename lookup_clientid->set_client
nfsd: simplify nfsd_renew
nfsd: simplify process_lock
nfsd4: simplify process_lookup1
SUNRPC: Correct a comment
svcrdma: DMA-sync the receive buffer in svc_rdma_recvfrom()
svcrdma: Reduce Receive doorbell rate
svcrdma: Deprecate stat variables that are no longer used
svcrdma: Restore read and write stats
svcrdma: Convert rdma_stat_sq_starve to a per-CPU counter
svcrdma: Convert rdma_stat_recv to a per-CPU counter
svcrdma: Refactor svc_rdma_init() and svc_rdma_clean_up()
...
Pull erofs updates from Gao Xiang:
"This contains a somewhat important but rarely reproduced fix reported
a month ago for platforms which have a weak memory model (e.g. arm64).
The root cause is that test_bit/set_bit atomic operations are actually
implemented in relaxed forms, and uninitialized fields governed by an
atomic bit could be observed in advance due to memory reordering, so
memory barrier pairs should be used.
There is also a trivial fix of crafted blkszbits generated by
syzkaller.
Summary:
- fix shift-out-of-bounds of crafted blkszbits generated by syzkaller
- ensure initialized fields can only be observed after bit is set"
* tag 'erofs-for-5.12-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs:
erofs: initialized fields can only be observed after bit is set
erofs: fix shift-out-of-bounds of blkszbits
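The ordering problem described in the erofs pull message can be illustrated
with a userspace C11 analogue. This is a sketch only: struct lazy_obj and
both functions are hypothetical, and C11 release/acquire atomics stand in
for the kernel's barrier pairs rather than reproducing the erofs fix.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical object whose payload is governed by an "initialized" flag. */
struct lazy_obj {
	int payload;			/* plain field, filled in lazily */
	atomic_bool initialized;	/* flag that governs payload */
};

/* Writer: initialize the field first, then publish the flag. */
void lazy_obj_publish(struct lazy_obj *o, int value)
{
	o->payload = value;
	/* Release: the payload store cannot be reordered after this store. */
	atomic_store_explicit(&o->initialized, true, memory_order_release);
}

/* Reader: only touch the field after observing the flag. */
bool lazy_obj_read(struct lazy_obj *o, int *out)
{
	/* Acquire: pairs with the release store in lazy_obj_publish(). */
	if (!atomic_load_explicit(&o->initialized, memory_order_acquire))
		return false;
	*out = o->payload;
	return true;
}
```

With relaxed ordering on both sides, a reader on a weakly ordered CPU could
see the flag set and still read a stale payload; the release/acquire pair
rules that out.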
Pull f2fs updates from Jaegeuk Kim:
"We've added two major features: 1) compression level and 2)
checkpoint_merge, in this round.
Compression level extends the 'compress_algorithm' mount option to accept
a parameter in the form <algorithm>:<level>. This lets users tune the
lz4 and zstd compression level, so that f2fs compression can achieve a
higher compression ratio.
checkpoint_merge creates a kernel daemon that merges
concurrent checkpoint requests as much as possible to eliminate
redundant checkpoint issues. Plus, we can eliminate the sluggish issue
caused by slow checkpoint operation when the checkpoint is done in a
process context in a cgroup having low i/o budget and cpu shares.
Enhancements:
- add compress level for lz4 and zstd in mount option
- checkpoint_merge mount option
- deprecate f2fs_trace_io
Bug fixes:
- flush data when enabling checkpoint back
- handle corner cases of mount options
- missing ACL update and lock for I_LINKABLE flag
- attach FIEMAP_EXTENT_MERGED in f2fs_fiemap
- fix potential deadlock in compression flow
- fix wrong submit_io condition
As usual, we've cleaned up many code flows and fixed minor bugs"
* tag 'f2fs-for-5.12-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs: (32 commits)
Documentation: f2fs: fix typo s/automaic/automatic
f2fs: give a warning only for readonly partition
f2fs: don't grab superblock freeze for flush/ckpt thread
f2fs: add ckpt_thread_ioprio sysfs node
f2fs: introduce checkpoint_merge mount option
f2fs: relocate inline conversion from mmap() to mkwrite()
f2fs: fix a wrong condition in __submit_bio
f2fs: remove unnecessary initialization in xattr.c
f2fs: fix to avoid inconsistent quota data
f2fs: flush data when enabling checkpoint back
f2fs: deprecate f2fs_trace_io
f2fs: Remove readahead collision detection
f2fs: remove unused stat_{inc, dec}_atomic_write
f2fs: introduce sb_status sysfs node
f2fs: fix to use per-inode maxbytes
f2fs: compress: fix potential deadlock
libfs: unexport generic_ci_d_compare() and generic_ci_d_hash()
f2fs: fix to set/clear I_LINKABLE under i_lock
f2fs: fix null page reference in redirty_blocks
f2fs: clean up post-read processing
...
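As an illustration of the <algorithm>:<level> format mentioned in the f2fs
pull message, here is a minimal userspace parsing sketch. The function
parse_compress_opt() is hypothetical and is not the f2fs mount option
parser; in particular it does not model which levels are valid for which
algorithm.

```c
#include <limits.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical parser for a value such as "zstd:3" or "lz4".
 * On success returns 0, copies the algorithm name into alg, and stores the
 * level in *level (-1 if no level was given). Returns -1 on malformed input.
 */
int parse_compress_opt(const char *opt, char *alg, size_t alg_size, int *level)
{
	const char *colon = strchr(opt, ':');
	size_t alg_len = colon ? (size_t)(colon - opt) : strlen(opt);
	char *end;
	long val;

	/* Reject an empty name or one that does not fit the buffer. */
	if (alg_len == 0 || alg_len >= alg_size)
		return -1;
	memcpy(alg, opt, alg_len);
	alg[alg_len] = '\0';

	*level = -1;
	if (!colon)
		return 0;

	/* The level must be a complete, non-negative decimal number. */
	val = strtol(colon + 1, &end, 10);
	if (end == colon + 1 || *end != '\0' || val < 0 || val > INT_MAX)
		return -1;
	*level = (int)val;
	return 0;
}
```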
Pull btrfs updates from David Sterba:
"This brings updates to space handling, performance improvements, and bug
fixes. The subpage block size and zoned mode features have reached
a state where they're usable but with limitations.
Performance or related:
- do not block on deleted block group mutex in the cleaner, avoids
some long stalls
- improved flushing: make it work better with ticket space
reservations and avoid excessive transaction commits in some
scenarios, slightly improves throughput for random write load
- preemptive background flushing: separate the logic from ticket
reservations, improve the accounting and decisions when to flush in
low space conditions
- less lock contention related to running delayed refs, let just one
thread do the flushing when there are many inside transaction
commit
- dbench workload improvements: avoid unnecessary work when logging
inodes, fewer fallbacks to transaction commit and thus less waiting
for it (+7% throughput, -20% latency)
Core:
- subpage block size
- currently read-only support
- refactor and generalize code where sectorsize is assumed to be
page size, add the subpage handling everywhere
- the read-write support is on the way, page sizes are still
limited to 4K or 64K
- zoned mode, first working version but with limitations
- SMR/ZBC/ZNS friendly allocation mode, utilizing the "no fixed
location for structures" and chunked allocation
- superblock as the only fixed data structure needs special
handling, uses 2 consecutive zones as a ring buffer
- tree-log support with a dedicated block group to avoid unordered
writes
- emulated zones on non-zoned devices
- not yet working
- all non-single block group profiles, requires more zone write
pointer synchronization between the multiple block groups
- fitrim due to dependency on space cache, can be implemented
Fixes:
- ref-verify: proper tree owner and node level tracking
- fix pinned byte accounting, causing some early ENOSPC now more
likely due to other changes in delayed refs
Other:
- error handling fixes and improvements
- more error injection points
- more function documentation
- more and updated tracepoints
- subset of W=1 checked by default
- update comments to allow more automatic kdoc parameter checks"
* tag 'for-5.12-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux: (144 commits)
btrfs: zoned: enable to mount ZONED incompat flag
btrfs: zoned: deal with holes writing out tree-log pages
btrfs: zoned: reorder log node allocation on zoned filesystem
btrfs: zoned: serialize log transaction on zoned filesystems
btrfs: zoned: extend zoned allocator to use dedicated tree-log block group
btrfs: split alloc_log_tree()
btrfs: zoned: relocate block group to repair IO failure in zoned filesystems
btrfs: zoned: enable relocation on a zoned filesystem
btrfs: zoned: support dev-replace in zoned filesystems
btrfs: zoned: implement copying for zoned device-replace
btrfs: zoned: implement cloning for zoned device-replace
btrfs: zoned: mark block groups to copy for device-replace
btrfs: zoned: do not use async metadata checksum on zoned filesystems
btrfs: zoned: wait for existing extents before truncating
btrfs: zoned: serialize metadata IO
btrfs: zoned: introduce dedicated data write path for zoned filesystems
btrfs: zoned: enable zone append writing for direct IO
btrfs: zoned: use ZONE_APPEND write for zoned mode
btrfs: save irq flags when looking up an ordered extent
btrfs: zoned: cache if block group is on a sequential zone
...
Pull AFFS fix from David Sterba:
"One minor fix for error handling in rename exchange"
* tag 'affs-for-5.12-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
fs/affs: release old buffer head on error path
Pull jfs updates from David Kleikamp:
"A few jfs fixes"
* tag 'jfs-5.12' of git://github.com/kleikamp/linux-shaggy:
fs/jfs: fix potential integer overflow on shift of a int
jfs: turn diLog(), dataLog() and txLog() into void functions
JFS: more checks for invalid superblock
Pull fcntl fix from Jeff Layton.
* tag 'locks-v5.12' of git://git.kernel.org/pub/scm/linux/kernel/git/jlayton/linux:
fcntl: make F_GETOWN(EX) return 0 on dead owner task
Pull namei updates from Al Viro:
"Most of that pile is LOOKUP_CACHED series; the rest is a couple of
misc cleanups in the general area...
There's a minor bisect hazard in the end of series, and normally I
would've just folded the fix into the previous commit, but this branch
is shared with Jens' tree, with stuff on top of it in there, so that
would've required rebases outside of vfs.git"
* 'work.namei' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
fix handling of nd->depth on LOOKUP_CACHED failures in try_to_unlazy*
fs: expose LOOKUP_CACHED through openat2() RESOLVE_CACHED
fs: add support for LOOKUP_CACHED
saner calling conventions for unlazy_child()
fs: make unlazy_walk() error handling consistent
fs/namei.c: Remove unlikely of status being -ECHILD in lookup_fast()
do_tmpfile(): don't mess with finish_open()
Pull ELF compat updates from Al Viro:
"Sanitizing ELF compat support, especially for triarch architectures:
- X32 handling cleaned up
- MIPS64 uses compat_binfmt_elf.c both for O32 and N32 now
- Kconfig side of things regularized
Eventually I hope to have compat_binfmt_elf.c killed, with both native
and compat built from fs/binfmt_elf.c, with -DELF_BITS={64,32} passed
by kbuild, but that's a separate story - not included here"
* 'work.elf-compat' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
get rid of COMPAT_ELF_EXEC_PAGESIZE
compat_binfmt_elf: don't bother with undef of ELF_ARCH
Kconfig: regularize selection of CONFIG_BINFMT_ELF
mips compat: switch to compat_binfmt_elf.c
mips: don't bother with ELF_CORE_EFLAGS
mips compat: don't bother with ELF_ET_DYN_BASE
mips: KVM_GUEST makes no sense for 64bit builds...
mips: kill unused definitions in binfmt_elf[on]32.c
mips binfmt_elf*32.c: use elfcore-compat.h
x32: make X32, !IA32_EMULATION setups able to execute x32 binaries
[amd64] clean PRSTATUS_SIZE/SET_PR_FPVALID up properly
elf_prstatus: collect the common part (everything before pr_reg) into a struct
binfmt_elf: partially sanitize PRSTATUS_SIZE and SET_PR_FPVALID
Pull sendfile updates from Al Viro:
"Make sendfile() to pipe destination do the right thing, should make
'fs/pipe: allow sendfile() to pipe again' redundant"
* 'work.sendfile' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
teach sendfile(2) to handle send-to-pipe directly
take the guts of file-to-pipe splice into a helper function
do_splice_to(): move the logics for limiting the read length in
Pull PNP updates from Rafael Wysocki:
"These make two janitorial changes of the code.
Specifics:
- Add printf annotation to a logging function (Tom Rix)
- Use DEFINE_SPINLOCK() for defining a spinlock so as to initialize
it statically (Zheng Yongjun)"
* tag 'pnp-5.12-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
PNP: pnpbios: Use DEFINE_SPINLOCK() for spinlock
PNP: add printf attribute to log function