If userspace does not include the trailing end of batch message, then
nfnetlink aborts the transaction. This allows to check that ruleset
updates trigger no errors.
After this patch, invoking this command from the prerouting chain:
# nft -c add rule x y fib saddr . oif type local
fails since oif is not supported there.
This patch fixes the lack of rule validation from the abort/check path
to catch configuration errors such as the one above.
Fixes: a654de8fdc ("netfilter: nf_tables: fix chain dependency validation")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
If netfilter changes the packet mark when mangling, the packet is
rerouted using the route_me_harder set of functions. Prior to this
commit, there's one big difference between route_me_harder and the
ordinary initial routing functions, described in the comment above
__ip_queue_xmit():
/* Note: skb->sk can be different from sk, in case of tunnels */
int __ip_queue_xmit(struct sock *sk, struct sk_buff *skb, struct flowi *fl,
That function goes on to correctly make use of sk->sk_bound_dev_if,
rather than skb->sk->sk_bound_dev_if. And indeed the comment is true: a
tunnel will receive a packet in ndo_start_xmit with an initial skb->sk.
It will make some transformations to that packet, and then it will send
the encapsulated packet out of a *new* socket. That new socket will
basically always have a different sk_bound_dev_if (otherwise there'd be
a routing loop). So for the purposes of routing the encapsulated packet,
the routing information as it pertains to the socket should come from
that socket's sk, rather than the packet's original skb->sk. For that
reason __ip_queue_xmit() and related functions all do the right thing.
One might argue that all tunnels should just call skb_orphan(skb) before
transmitting the encapsulated packet into the new socket. But tunnels do
*not* do this -- and this is wisely avoided in skb_scrub_packet() too --
because features like TSQ rely on skb->destructor() being called when
that buffer space is truely available again. Calling skb_orphan(skb) too
early would result in buffers filling up unnecessarily and accounting
info being all wrong. Instead, additional routing must take into account
the new sk, just as __ip_queue_xmit() notes.
So, this commit addresses the problem by fishing the correct sk out of
state->sk -- it's already set properly in the call to nf_hook() in
__ip_local_out(), which receives the sk as part of its normal
functionality. So we make sure to plumb state->sk through the various
route_me_harder functions, and then make correct use of it following the
example of __ip_queue_xmit().
Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Reviewed-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Pull fallthrough fix from Gustavo A. R. Silva:
"This fixes a ton of fall-through warnings when building with Clang
12.0.0 and -Wimplicit-fallthrough"
* tag 'fallthrough-fixes-clang-5.10-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gustavoars/linux:
include: jhash/signal: Fix fall-through warnings for Clang
Pull rdma fixes from Jason Gunthorpe:
"The good news is people are testing rc1 in the RDMA world - the bad
news is testing of the for-next area is not as good as I had hoped, as
we really should have caught at least the rdma_connect_locked() issue
before now.
Notable merge window regressions that didn't get caught/fixed in time
for rc1:
- Fix in kernel users of rxe, they were broken by the rapid fix to
undo the uABI breakage in rxe from another patch
- EFA userspace needs to read the GID table but was broken with the
new GID table logic
- Fix user triggerable deadlock in mlx5 using devlink reload
- Fix deadlock in several ULPs using rdma_connect from the CM handler
callbacks
- Memory leak in qedr"
* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma:
RDMA/qedr: Fix memory leak in iWARP CM
RDMA: Add rdma_connect_locked()
RDMA/uverbs: Fix false error in query gid IOCTL
RDMA/mlx5: Fix devlink deadlock on net namespace deletion
RDMA/rxe: Fix small problem in network_type patch
In preparation to enable -Wimplicit-fallthrough for Clang, explicitly
add break statements instead of letting the code fall through to the
next case.
This patch adds four break statements that, together, fix almost 40,000
warnings when building Linux 5.10-rc1 with Clang 12.0.0 and this[1] change
reverted. Notice that in order to enable -Wimplicit-fallthrough for Clang,
such change[1] is meant to be reverted at some point. So, this patch helps
to move in that direction.
Something important to mention is that there is currently a discrepancy
between GCC and Clang when dealing with switch fall-through to empty case
statements or to cases that only contain a break/continue/return
statement[2][3][4].
Now that the -Wimplicit-fallthrough option has been globally enabled[5],
any compiler should really warn on missing either a fallthrough annotation
or any of the other case-terminating statements (break/continue/return/
goto) when falling through to the next case statement. Making exceptions
to this introduces variation in case handling which may continue to lead
to bugs, misunderstandings, and a general lack of robustness. The point
of enabling options like -Wimplicit-fallthrough is to prevent human error
and aid developers in spotting bugs before their code is even built/
submitted/committed, therefore eliminating classes of bugs. So, in order
to really accomplish this, we should, and can, move in the direction of
addressing any error-prone scenarios and get rid of the unintentional
fallthrough bug-class in the kernel, entirely, even if there is some minor
redundancy. Better to have explicit case-ending statements than continue to
have exceptions where one must guess as to the right result. The compiler
will eliminate any actual redundancy.
[1] commit e2079e93f5 ("kbuild: Do not enable -Wimplicit-fallthrough for clang for now")
[2] https://github.com/ClangBuiltLinux/linux/issues/636
[3] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91432
[4] https://godbolt.org/z/xgkvIh
[5] commit a035d552a9 ("Makefile: Globally enable fall-through warning")
Co-developed-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Pull AFS fixes from David Howells:
- Fix copy_file_range() to an afs file now returning EINVAL if the
splice_write file op isn't supplied.
- Fix a deref-before-check in afs_unuse_cell().
- Fix a use-after-free in afs_xattr_get_acl().
- Fix afs to not try to clear PG_writeback when laundering a page.
- Fix afs to take a ref on a page that it sets PG_private on and to
drop that ref when clearing PG_private. This is done through recently
added helpers.
- Fix a page leak if write_begin() fails.
- Fix afs_write_begin() to not alter the dirty region info stored in
page->private, but rather do this in afs_write_end() instead when we
know what we actually changed.
- Fix afs_invalidatepage() to alter the dirty region info on a page
when partial page invalidation occurs so that we don't inadvertantly
include a span of zeros that will get written back if a page gets
laundered due to a remote 3rd-party induced invalidation.
We mustn't, however, reduce the dirty region if the page has been
seen to be mapped (ie. we got called through the page_mkwrite vector)
as the page might still be mapped and we might lose data if the file
is extended again.
- Fix the dirty region info to have a lower resolution if the size of
the page is too large for this to be encoded (e.g. powerpc32 with 64K
pages).
Note that this might not be the ideal way to handle this, since it
may allow some leakage of undirtied zero bytes to the server's copy
in the case of a 3rd-party conflict.
To aid the last two fixes, two additional changes:
- Wrap the manipulations of the dirty region info stored in
page->private into helper functions.
- Alter the encoding of the dirty region so that the region bounds can
be stored with one fewer bit, making a bit available for the
indication of mappedness.
* tag 'afs-fixes-20201029' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs:
afs: Fix dirty-region encoding on ppc32 with 64K pages
afs: Fix afs_invalidatepage to adjust the dirty region
afs: Alter dirty range encoding in page->private
afs: Wrap page->private manipulations in inline functions
afs: Fix where page->private is set during write
afs: Fix page leak on afs_write_begin() failure
afs: Fix to take ref on page when PG_private is set
afs: Fix afs_launder_page to not clear PG_writeback
afs: Fix a use after free in afs_xattr_get_acl()
afs: Fix tracing deref-before-check
afs: Fix copy_file_range()
Pull ext4 fixes from Ted Ts'o:
"Bug fixes for the new ext4 fast commit feature, plus a fix for the
'data=journal' bug fix.
Also use the generic casefolding support which has now landed in
fs/libfs.c for 5.10"
* tag 'ext4_for_linus_fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4:
ext4: indicate that fast_commit is available via /sys/fs/ext4/feature/...
ext4: use generic casefolding support
ext4: do not use extent after put_bh
ext4: use IS_ERR() for error checking of path
ext4: fix mmap write protection for data=journal mode
jbd2: fix a kernel-doc markup
ext4: use s_mount_flags instead of s_mount_state for fast commit state
ext4: make num of fast commit blocks configurable
ext4: properly check for dirty state in ext4_inode_datasync_dirty()
ext4: fix double locking in ext4_fc_commit_dentry_updates()
Fix afs_invalidatepage() to adjust the dirty region recorded in
page->private when truncating a page. If the dirty region is entirely
removed, then the private data is cleared and the page dirty state is
cleared.
Without this, if the page is truncated and then expanded again by truncate,
zeros from the expanded, but no-longer dirty region may get written back to
the server if the page gets laundered due to a conflicting 3rd-party write.
It mustn't, however, shorten the dirty region of the page if that page is
still mmapped and has been marked dirty by afs_page_mkwrite(), so a flag is
stored in page->private to record this.
Fixes: 4343d00872 ("afs: Get rid of the afs_writeback record")
Signed-off-by: David Howells <dhowells@redhat.com>
The afs filesystem uses page->private to store the dirty range within a
page such that in the event of a conflicting 3rd-party write to the server,
we write back just the bits that got changed locally.
However, there are a couple of problems with this:
(1) I need a bit to note if the page might be mapped so that partial
invalidation doesn't shrink the range.
(2) There aren't necessarily sufficient bits to store the entire range of
data altered (say it's a 32-bit system with 64KiB pages or transparent
huge pages are in use).
So wrap the accesses in inline functions so that future commits can change
how this works.
Also move them out of the tracing header into the in-directory header.
There's not really any need for them to be in the tracing header.
Signed-off-by: David Howells <dhowells@redhat.com>
There are two flows for handling RDMA_CM_EVENT_ROUTE_RESOLVED, either the
handler triggers a completion and another thread does rdma_connect() or
the handler directly calls rdma_connect().
In all cases rdma_connect() needs to hold the handler_mutex, but when
handler's are invoked this is already held by the core code. This causes
ULPs using the 2nd method to deadlock.
Provide a rdma_connect_locked() and have all ULPs call it from their
handlers.
Link: https://lore.kernel.org/r/0-v2-53c22d5c1405+33-rdma_connect_locking_jgg@nvidia.com
Reported-and-tested-by: Guoqing Jiang <guoqing.jiang@cloud.ionos.com>
Fixes: 2a7cec5381 ("RDMA/cma: Fix locking for the RDMA_CM_CONNECT state")
Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
Acked-by: Jack Wang <jinpu.wang@cloud.ionos.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Max Gurtovoy <mgurtovoy@nvidia.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Under some circumstances, the compiler generates .ctors.* sections. This
is seen doing a cross compile of x86_64 from a powerpc64el host:
x86_64-linux-gnu-ld: warning: orphan section `.ctors.65435' from `kernel/trace/trace_clock.o' being
placed in section `.ctors.65435'
x86_64-linux-gnu-ld: warning: orphan section `.ctors.65435' from `kernel/trace/ftrace.o' being
placed in section `.ctors.65435'
x86_64-linux-gnu-ld: warning: orphan section `.ctors.65435' from `kernel/trace/ring_buffer.o' being
placed in section `.ctors.65435'
Include these orphans along with the regular .ctors section.
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Tested-by: Stephen Rothwell <sfr@canb.auug.org.au>
Fixes: 83109d5d5f ("x86/build: Warn on orphan section placement")
Signed-off-by: Kees Cook <keescook@chromium.org>
Acked-by: Nick Desaulniers <ndesaulniers@google.com>
Link: https://lore.kernel.org/r/20201005025720.2599682-1-keescook@chromium.org
When a mlx5 core devlink instance is reloaded in different net namespace,
its associated IB device is deleted and recreated.
Example sequence is:
$ ip netns add foo
$ devlink dev reload pci/0000:00:08.0 netns foo
$ ip netns del foo
mlx5 IB device needs to attach and detach the netdevice to it through the
netdev notifier chain during load and unload sequence. A below call graph
of the unload flow.
cleanup_net()
down_read(&pernet_ops_rwsem); <- first sem acquired
ops_pre_exit_list()
pre_exit()
devlink_pernet_pre_exit()
devlink_reload()
mlx5_devlink_reload_down()
mlx5_unload_one()
[...]
mlx5_ib_remove()
mlx5_ib_unbind_slave_port()
mlx5_remove_netdev_notifier()
unregister_netdevice_notifier()
down_write(&pernet_ops_rwsem);<- recurrsive lock
Hence, when net namespace is deleted, mlx5 reload results in deadlock.
When deadlock occurs, devlink mutex is also held. This not only deadlocks
the mlx5 device under reload, but all the processes which attempt to
access unrelated devlink devices are deadlocked.
Hence, fix this by mlx5 ib driver to register for per net netdev notifier
instead of global one, which operats on the net namespace without holding
the pernet_ops_rwsem.
Fixes: 4383cfcc65 ("net/mlx5: Add devlink reload")
Link: https://lore.kernel.org/r/20201026134359.23150-1-parav@nvidia.com
Signed-off-by: Parav Pandit <parav@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Commit 453431a549 ("mm, treewide: rename kzfree() to
kfree_sensitive()") renamed kzfree() to kfree_sensitive(),
but it left a compatibility definition of kzfree() to avoid
being too disruptive.
Since then a few more instances of kzfree() have slipped in.
Just get rid of them and remove the compatibility definition
once and for all.
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Pull perf fix from Thomas Gleixner:
"A single fix to compute the field offset of the SNOOPX bit in the data
source bitmask of perf events correctly"
* tag 'perf-urgent-2020-10-25' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
perf: correct SNOOPX field offset
Pull locking fix from Thomas Gleixner:
"Just a trivial fix for kernel-doc warnings"
* tag 'locking-urgent-2020-10-25' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
locking/seqlocks: Fix kernel-doc warnings
Pull more parisc updates from Helge Deller:
- During this merge window O_NONBLOCK was changed to become 000200000,
but we missed that the syscalls timerfd_create(), signalfd4(),
eventfd2(), pipe2(), inotify_init1() and userfaultfd() do a strict
bit-wise check of the flags parameter.
To provide backward compatibility with existing userspace we
introduce parisc specific wrappers for those syscalls which filter
out the old O_NONBLOCK value and replaces it with the new one.
- Prevent HIL bus driver to get stuck when keyboard or mouse isn't
attached
- Improve error return codes when setting rtc time
- Minor documentation fix in pata_ns87415.c
* 'parisc-5.10-2' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux:
ata: pata_ns87415.c: Document support on parisc with superio chip
parisc: Add wrapper syscalls to fix O_NONBLOCK flag usage
hil/parisc: Disable HIL driver when it gets stuck
parisc: Improve error return codes when setting rtc time
Pull more xen updates from Juergen Gross:
- a series for the Xen pv block drivers adding module parameters for
better control of resource usge
- a cleanup series for the Xen event driver
* tag 'for-linus-5.10b-rc1c-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
Documentation: add xen.fifo_events kernel parameter description
xen/events: unmask a fifo event channel only if it was masked
xen/events: only register debug interrupt for 2-level events
xen/events: make struct irq_info private to events_base.c
xen: remove no longer used functions
xen-blkfront: Apply changed parameter name to the document
xen-blkfront: add a parameter for disabling of persistent grants
xen-blkback: add a parameter for disabling of persistent grants
Pull random32 updates from Willy Tarreau:
"Make prandom_u32() less predictable.
This is the cleanup of the latest series of prandom_u32
experimentations consisting in using SipHash instead of Tausworthe to
produce the randoms used by the network stack.
The changes to the files were kept minimal, and the controversial
commit that used to take noise from the fast_pool (f227e3ec3b) was
reverted. Instead, a dedicated "net_rand_noise" per_cpu variable is
fed from various sources of activities (networking, scheduling) to
perturb the SipHash state using fast, non-trivially predictable data,
instead of keeping it fully deterministic. The goal is essentially to
make any occasional memory leakage or brute-force attempt useless.
The resulting code was verified to be very slightly faster on x86_64
than what is was with the controversial commit above, though this
remains barely above measurement noise. It was also tested on i386 and
arm, and build- tested only on arm64"
Link: https://lore.kernel.org/netdev/20200808152628.GA27941@SDF.ORG/
* tag '20201024-v4-5.10' of git://git.kernel.org/pub/scm/linux/kernel/git/wtarreau/prandom:
random32: add a selftest for the prandom32 code
random32: add noise from network and scheduling activity
random32: make prandom_u32() output unpredictable
Pull block fixes from Jens Axboe:
- NVMe pull request from Christoph
- rdma error handling fixes (Chao Leng)
- fc error handling and reconnect fixes (James Smart)
- fix the qid displace when tracing ioctl command (Keith Busch)
- don't use BLK_MQ_REQ_NOWAIT for passthru (Chaitanya Kulkarni)
- fix MTDT for passthru (Logan Gunthorpe)
- blacklist Write Same on more devices (Kai-Heng Feng)
- fix an uninitialized work struct (zhenwei pi)"
- lightnvm out-of-bounds fix (Colin)
- SG allocation leak fix (Doug)
- rnbd fixes (Gioh, Guoqing, Jack)
- zone error translation fixes (Keith)
- kerneldoc markup fix (Mauro)
- zram lockdep fix (Peter)
- Kill unused io_context members (Yufen)
- NUMA memory allocation cleanup (Xianting)
- NBD config wakeup fix (Xiubo)
* tag 'block-5.10-2020-10-24' of git://git.kernel.dk/linux-block: (27 commits)
block: blk-mq: fix a kernel-doc markup
nvme-fc: shorten reconnect delay if possible for FC
nvme-fc: wait for queues to freeze before calling update_hr_hw_queues
nvme-fc: fix error loop in create_hw_io_queues
nvme-fc: fix io timeout to abort I/O
null_blk: use zone status for max active/open
nvmet: don't use BLK_MQ_REQ_NOWAIT for passthru
nvmet: cleanup nvmet_passthru_map_sg()
nvmet: limit passthru MTDS by BIO_MAX_PAGES
nvmet: fix uninitialized work for zero kato
nvme-pci: disable Write Zeroes on Sandisk Skyhawk
nvme: use queuedata for nvme_req_qid
nvme-rdma: fix crash due to incorrect cqe
nvme-rdma: fix crash when connect rejected
block: remove unused members for io_context
blk-mq: remove the calling of local_memory_node()
zram: Fix __zram_bvec_{read,write}() locking order
skd_main: remove unused including <linux/version.h>
sgl_alloc_order: fix memory leak
lightnvm: fix out-of-bounds write to array devices->info[]
...
Pull io_uring fixes from Jens Axboe:
- fsize was missed in previous unification of work flags
- Few fixes cleaning up the flags unification creds cases (Pavel)
- Fix NUMA affinities for completely unplugged/replugged node for io-wq
- Two fallout fixes from the set_fs changes. One local to io_uring, one
for the splice entry point that io_uring uses.
- Linked timeout fixes (Pavel)
- Removal of ->flush() ->files work-around that we don't need anymore
with referenced files (Pavel)
- Various cleanups (Pavel)
* tag 'io_uring-5.10-2020-10-24' of git://git.kernel.dk/linux-block:
splice: change exported internal do_splice() helper to take kernel offset
io_uring: make loop_rw_iter() use original user supplied pointers
io_uring: remove req cancel in ->flush()
io-wq: re-set NUMA node affinities if CPUs come online
io_uring: don't reuse linked_timeout
io_uring: unify fsize with def->work_flags
io_uring: fix racy REQ_F_LINK_TIMEOUT clearing
io_uring: do poll's hash_node init in common code
io_uring: inline io_poll_task_handler()
io_uring: remove extra ->file check in poll prep
io_uring: make cached_cq_overflow non atomic_t
io_uring: inline io_fail_links()
io_uring: kill ref get/drop in personality init
io_uring: flags-based creds init in queue
Pull misc vfs updates from Al Viro:
"Assorted stuff all over the place (the largest group here is
Christoph's stat cleanups)"
* 'work.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
fs: remove KSTAT_QUERY_FLAGS
fs: remove vfs_stat_set_lookup_flags
fs: move vfs_fstatat out of line
fs: implement vfs_stat and vfs_lstat in terms of vfs_fstatat
fs: remove vfs_statx_fd
fs: omfs: use kmemdup() rather than kmalloc+memcpy
[PATCH] reduce boilerplate in fsid handling
fs: Remove duplicated flag O_NDELAY occurring twice in VALID_OPEN_FLAGS
selftests: mount: add nosymfollow tests
Add a "nosymfollow" mount option.
Pull dma-mapping fixes from Christoph Hellwig:
- document the new dma_{alloc,free}_pages() API
- two fixups for the dma-mapping.h split
* tag 'dma-mapping-5.10-1' of git://git.infradead.org/users/hch/dma-mapping:
dma-mapping: document dma_{alloc,free}_pages
dma-mapping: move more functions to dma-map-ops.h
ARM/sa1111: add a missing include of dma-map-ops.h
With the removal of the interrupt perturbations in previous random32
change (random32: make prandom_u32() output unpredictable), the PRNG
has become 100% deterministic again. While SipHash is expected to be
way more robust against brute force than the previous Tausworthe LFSR,
there's still the risk that whoever has even one temporary access to
the PRNG's internal state is able to predict all subsequent draws till
the next reseed (roughly every minute). This may happen through a side
channel attack or any data leak.
This patch restores the spirit of commit f227e3ec3b ("random32: update
the net random state on interrupt and activity") in that it will perturb
the internal PRNG's statee using externally collected noise, except that
it will not pick that noise from the random pool's bits nor upon
interrupt, but will rather combine a few elements along the Tx path
that are collectively hard to predict, such as dev, skb and txq
pointers, packet length and jiffies values. These ones are combined
using a single round of SipHash into a single long variable that is
mixed with the net_rand_state upon each invocation.
The operation was inlined because it produces very small and efficient
code, typically 3 xor, 2 add and 2 rol. The performance was measured
to be the same (even very slightly better) than before the switch to
SipHash; on a 6-core 12-thread Core i7-8700k equipped with a 40G NIC
(i40e), the connection rate dropped from 556k/s to 555k/s while the
SYN cookie rate grew from 5.38 Mpps to 5.45 Mpps.
Link: https://lore.kernel.org/netdev/20200808152628.GA27941@SDF.ORG/
Cc: George Spelvin <lkml@sdf.org>
Cc: Amit Klein <aksecurity@gmail.com>
Cc: Eric Dumazet <edumazet@google.com>
Cc: "Jason A. Donenfeld" <Jason@zx2c4.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Kees Cook <keescook@chromium.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: tytso@mit.edu
Cc: Florian Westphal <fw@strlen.de>
Cc: Marc Plumb <lkml.mplumb@gmail.com>
Tested-by: Sedat Dilek <sedat.dilek@gmail.com>
Signed-off-by: Willy Tarreau <w@1wt.eu>
Non-cryptographic PRNGs may have great statistical properties, but
are usually trivially predictable to someone who knows the algorithm,
given a small sample of their output. An LFSR like prandom_u32() is
particularly simple, even if the sample is widely scattered bits.
It turns out the network stack uses prandom_u32() for some things like
random port numbers which it would prefer are *not* trivially predictable.
Predictability led to a practical DNS spoofing attack. Oops.
This patch replaces the LFSR with a homebrew cryptographic PRNG based
on the SipHash round function, which is in turn seeded with 128 bits
of strong random key. (The authors of SipHash have *not* been consulted
about this abuse of their algorithm.) Speed is prioritized over security;
attacks are rare, while performance is always wanted.
Replacing all callers of prandom_u32() is the quick fix.
Whether to reinstate a weaker PRNG for uses which can tolerate it
is an open question.
Commit f227e3ec3b ("random32: update the net random state on interrupt
and activity") was an earlier attempt at a solution. This patch replaces
it.
Reported-by: Amit Klein <aksecurity@gmail.com>
Cc: Willy Tarreau <w@1wt.eu>
Cc: Eric Dumazet <edumazet@google.com>
Cc: "Jason A. Donenfeld" <Jason@zx2c4.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Kees Cook <keescook@chromium.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: tytso@mit.edu
Cc: Florian Westphal <fw@strlen.de>
Cc: Marc Plumb <lkml.mplumb@gmail.com>
Fixes: f227e3ec3b ("random32: update the net random state on interrupt and activity")
Signed-off-by: George Spelvin <lkml@sdf.org>
Link: https://lore.kernel.org/netdev/20200808152628.GA27941@SDF.ORG/
[ willy: partial reversal of f227e3ec3b5c; moved SIPROUND definitions
to prandom.h for later use; merged George's prandom_seed() proposal;
inlined siprand_u32(); replaced the net_rand_state[] array with 4
members to fix a build issue; cosmetic cleanups to make checkpatch
happy; fixed RANDOM32_SELFTEST build ]
Signed-off-by: Willy Tarreau <w@1wt.eu>
Pull more RISC-V updates from Palmer Dabbelt:
"Just a single patch set: the remainder of Christoph's work to remove
set_fs, including the RISC-V portion"
* tag 'riscv-for-linus-5.10-mw1' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
riscv: remove address space overrides using set_fs()
riscv: implement __get_kernel_nofault and __put_user_nofault
riscv: refactor __get_user and __put_user
riscv: use memcpy based uaccess for nommu again
asm-generic: make the set_fs implementation optional
asm-generic: add nommu implementations of __{get,put}_kernel_nofault
asm-generic: improve the nommu {get,put}_user handling
uaccess: provide a generic TASK_SIZE_MAX definition
Pull ARM SoC-related driver updates from Olof Johansson:
"Various driver updates for platforms. A bulk of this is smaller fixes
or cleanups, but some of the new material this time around is:
- Support for Nvidia Tegra234 SoC
- Ring accelerator support for TI AM65x
- PRUSS driver for TI platforms
- Renesas support for R-Car V3U SoC
- Reset support for Cortex-M4 processor on i.MX8MQ
There are also new socinfo entries for a handful of different SoCs and
platforms"
* tag 'armsoc-drivers' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: (131 commits)
drm/mediatek: reduce clear event
soc: mediatek: cmdq: add clear option in cmdq_pkt_wfe api
soc: mediatek: cmdq: add jump function
soc: mediatek: cmdq: add write_s_mask value function
soc: mediatek: cmdq: add write_s value function
soc: mediatek: cmdq: add read_s function
soc: mediatek: cmdq: add write_s_mask function
soc: mediatek: cmdq: add write_s function
soc: mediatek: cmdq: add address shift in jump
soc: mediatek: mtk-infracfg: Fix kerneldoc
soc: amlogic: pm-domains: use always-on flag
reset: sti: reset-syscfg: fix struct description warnings
reset: imx7: add the cm4 reset for i.MX8MQ
dt-bindings: reset: imx8mq: add m4 reset
reset: Fix and extend kerneldoc
reset: reset-zynqmp: Added support for Versal platform
dt-bindings: reset: Updated binding for Versal reset driver
reset: imx7: Support module build
soc: fsl: qe: Remove unnessesary check in ucc_set_tdm_rxtx_clk
soc: fsl: qman: convert to use be32_add_cpu()
...
Pull ARM SoC platform updates from Olof Johansson:
"SoC changes, a substantial part of this is cleanup of some of the
older platforms that used to have a bunch of board files.
In particular:
- Remove non-DT i.MX platforms that haven't seen activity in years,
it's time to remove them.
- A bunch of cleanup and removal of platform data for TI/OMAP
platforms, moving over to genpd for power/reset control (yay!)
- Major cleanup of Samsung S3C24xx and S3C64xx platforms, moving them
closer to multiplatform support (not quite there yet, but getting
close).
There are a few other changes too, smaller fixlets, etc. For new
platform support, the primary ones are:
- New SoC: Hisilicon SD5203, ARM926EJ-S platform.
- Cpufreq support for i.MX7ULP"
* tag 'armsoc-soc' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: (121 commits)
ARM: mstar: Select MStar intc
ARM: stm32: Replace HTTP links with HTTPS ones
ARM: debug: add UART early console support for SD5203
ARM: hisi: add support for SD5203 SoC
ARM: omap3: enable off mode automatically
clk: imx: imx35: Remove mx35_clocks_init()
clk: imx: imx31: Remove mx31_clocks_init()
clk: imx: imx27: Remove mx27_clocks_init()
ARM: imx: Remove unused definitions
ARM: imx35: Retrieve the IIM base address from devicetree
ARM: imx3: Retrieve the AVIC base address from devicetree
ARM: imx3: Retrieve the CCM base address from devicetree
ARM: imx31: Retrieve the IIM base address from devicetree
ARM: imx27: Retrieve the CCM base address from devicetree
ARM: imx27: Retrieve the SYSCTRL base address from devicetree
ARM: s3c64xx: bring back notes from removed debug-macro.S
ARM: s3c24xx: fix Wunused-variable warning on !MMU
ARM: samsung: fix PM debug build with DEBUG_LL but !MMU
MAINTAINERS: mark linux-samsung-soc list non-moderated
ARM: imx: Remove remnant board file support pieces
...
Pull ARM SoC fixes from Olof Johansson:
"I had queued up a batch of fixes that got a bit close to the release
for sending in before the merge window opened, so I'm including them
in the merge window batch instead.
Mostly smaller DT tweaks and fixes, the usual mix that we tend to have
through the releases"
* tag 'armsoc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc:
ARM: dts: iwg20d-q7-common: Fix touch controller probe failure
ARM: OMAP2+: Restore MPU power domain if cpu_cluster_pm_enter() fails
ARM: dts: am33xx: modify AM33XX_IOPAD for #pinctrl-cells = 2
soc: actions: include header to fix missing prototype
arm64: dts: ti: k3-j721e: Rename mux header and update macro names
soc: qcom: pdr: Fixup array type of get_domain_list_resp message
arm64: dts: qcom: pm660: Fix missing pound sign in interrupt-cells
arm64: dts: qcom: kitakami: Temporarily disable SDHCI1
arm64: dts: sdm630: Temporarily disable SMMUs by default
arm64: dts: sdm845: Fixup OPP table for all qup devices
arm64: dts: allwinner: h5: remove Mali GPU PMU module
ARM: dts: sun8i: r40: bananapi-m2-ultra: Fix dcdc1 regulator
soc: xilinx: Fix error code in zynqmp_pm_probe()
Pull more power management updates from Rafael Wysocki:
"First of all, the adaptive voltage scaling (AVS) drivers go to new
platform-specific locations as planned (this part was reported to have
merge conflicts against the new arm-soc updates in linux-next).
In addition to that, there are some fixes (intel_idle, intel_pstate,
RAPL, acpi_cpufreq), the addition of on/off notifiers and idle state
accounting support to the generic power domains (genpd) code and some
janitorial changes all over.
Specifics:
- Move the AVS drivers to new platform-specific locations and get rid
of the drivers/power/avs directory (Ulf Hansson).
- Add on/off notifiers and idle state accounting support to the
generic power domains (genpd) framework (Ulf Hansson, Lina Iyer).
- Ulf will maintain the PM domain part of cpuidle-psci (Ulf Hansson).
- Make intel_idle disregard ACPI _CST if it cannot use the data
returned by that method (Mel Gorman).
- Modify intel_pstate to avoid leaving useless sysfs directory
structure behind if it cannot be registered (Chen Yu).
- Fix domain detection in the RAPL power capping driver and prevent
it from failing to enumerate the Psys RAPL domain (Zhang Rui).
- Allow acpi-cpufreq to use ACPI _PSD information with Family 19 and
later AMD chips (Wei Huang).
- Update the driver assumptions comment in intel_idle and fix a
kerneldoc comment in the runtime PM framework (Alexander Monakov,
Bean Huo).
- Avoid unnecessary resets of the cached frequency in the schedutil
cpufreq governor to reduce overhead (Wei Wang).
- Clean up the cpufreq core a bit (Viresh Kumar).
- Make assorted minor janitorial changes (Daniel Lezcano, Geert
Uytterhoeven, Hubert Jasudowicz, Tom Rix).
- Clean up and optimize the cpupower utility somewhat (Colin Ian
King, Martin Kaistra)"
* tag 'pm-5.10-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (23 commits)
PM: sleep: remove unreachable break
PM: AVS: Drop the avs directory and the corresponding Kconfig
PM: AVS: qcom-cpr: Move the driver to the qcom specific drivers
PM: runtime: Fix typo in pm_runtime_set_active() helper comment
PM: domains: Fix build error for genpd notifiers
powercap: Fix typo in Kconfig "Plance" -> "Plane"
cpufreq: schedutil: restore cached freq when next_f is not changed
acpi-cpufreq: Honor _PSD table setting on new AMD CPUs
PM: AVS: smartreflex Move driver to soc specific drivers
PM: AVS: rockchip-io: Move the driver to the rockchip specific drivers
PM: domains: enable domain idle state accounting
PM: domains: Add curly braces to delimit comment + statement block
PM: domains: Add support for PM domain on/off notifiers for genpd
powercap/intel_rapl: enumerate Psys RAPL domain together with package RAPL domain
powercap/intel_rapl: Fix domain detection
intel_idle: Ignore _CST if control cannot be taken from the platform
cpuidle: Remove pointless stub
intel_idle: mention assumption that WBINVD is not needed
MAINTAINERS: Add section for cpuidle-psci PM domain
cpufreq: intel_pstate: Delete intel_pstate sysfs if failed to register the driver
...
Pull more SCSI updates from James Bottomley:
"The set of core changes here is Christoph's submission path cleanups.
These introduced a couple of regressions when first proposed so they
got held over from the initial merge window pull request to give more
testing time, which they've now had and Syzbot has confirmed the
regression it detected is fixed.
The other main changes are two driver updates (arcmsr, pm80xx) and
assorted minor clean ups"
* tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (38 commits)
scsi: qla2xxx: Fix return of uninitialized value in rval
scsi: core: Set sc_data_direction to DMA_NONE for no-transfer commands
scsi: sr: Initialize ->cmd_len
scsi: arcmsr: Update driver version to v1.50.00.02-20200819
scsi: arcmsr: Add support for ARC-1886 series RAID controllers
scsi: arcmsr: Fix device hot-plug monitoring timer stop
scsi: arcmsr: Remove unnecessary syntax
scsi: pm80xx: Driver version update
scsi: pm80xx: Increase the number of outstanding I/O supported to 1024
scsi: pm80xx: Remove DMA memory allocation for ccb and device structures
scsi: pm80xx: Increase number of supported queues
scsi: sym53c8xx_2: Fix sizeof() mismatch
scsi: isci: Fix a typo in a comment
scsi: qla4xxx: Fix inconsistent format argument type
scsi: myrb: Fix inconsistent format argument types
scsi: myrb: Remove redundant assignment to variable timeout
scsi: bfa: Fix error return in bfad_pci_init()
scsi: fcoe: Simplify the return expression of fcoe_sysfs_setup()
scsi: snic: Simplify the return expression of svnic_cq_alloc()
scsi: fnic: Simplify the return expression of vnic_wq_copy_alloc()
...
Pull networking fixes from Jakub Kicinski:
"Cross-tree/merge window issues:
- rtl8150: don't incorrectly assign random MAC addresses; fix late in
the 5.9 cycle started depending on a return code from a function
which changed with the 5.10 PR from the usb subsystem
Current release regressions:
- Revert "virtio-net: ethtool configurable RXCSUM", it was causing
crashes at probe when control vq was not negotiated/available
Previous release regressions:
- ixgbe: fix probing of multi-port 10 Gigabit Intel NICs with an MDIO
bus, only first device would be probed correctly
- nexthop: Fix performance regression in nexthop deletion by
effectively switching from recently added synchronize_rcu() to
synchronize_rcu_expedited()
- netsec: ignore 'phy-mode' device property on ACPI systems; the
property is not populated correctly by the firmware, but firmware
configures the PHY so just keep boot settings
Previous releases - always broken:
- tcp: fix to update snd_wl1 in bulk receiver fast path, addressing
bulk transfers getting "stuck"
- icmp: randomize the global rate limiter to prevent attackers from
getting useful signal
- r8169: fix operation under forced interrupt threading, make the
driver always use hard irqs, even on RT, given the handler is light
and only wants to schedule napi (and do so through a _irqoff()
variant, preferably)
- bpf: Enforce pointer id generation for all may-be-null register
type to avoid pointers erroneously getting marked as null-checked
- tipc: re-configure queue limit for broadcast link
- net/sched: act_tunnel_key: fix OOB write in case of IPv6 ERSPAN
tunnels
- fix various issues in chelsio inline tls driver
Misc:
- bpf: improve just-added bpf_redirect_neigh() helper api to support
supplying nexthop by the caller - in case BPF program has already
done a lookup we can avoid doing another one
- remove unnecessary break statements
- make MCTCP not select IPV6, but rather depend on it"
* tag 'net-5.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (62 commits)
tcp: fix to update snd_wl1 in bulk receiver fast path
net: Properly typecast int values to set sk_max_pacing_rate
netfilter: nf_fwd_netdev: clear timestamp in forwarding path
ibmvnic: save changed mac address to adapter->mac_addr
selftests: mptcp: depends on built-in IPv6
Revert "virtio-net: ethtool configurable RXCSUM"
rtnetlink: fix data overflow in rtnl_calcit()
net: ethernet: mtk-star-emac: select REGMAP_MMIO
net: hdlc_raw_eth: Clear the IFF_TX_SKB_SHARING flag after calling ether_setup
net: hdlc: In hdlc_rcv, check to make sure dev is an HDLC device
bpf, libbpf: Guard bpf inline asm from bpf_tail_call_static
bpf, selftests: Extend test_tc_redirect to use modified bpf_redirect_neigh()
bpf: Fix bpf_redirect_neigh helper api to support supplying nexthop
mptcp: depends on IPV6 but not as a module
sfc: move initialisation of efx->filter_sem to efx_init_struct()
mpls: load mpls_gso after mpls_iptunnel
net/sched: act_tunnel_key: fix OOB write in case of IPv6 ERSPAN tunnels
net/sched: act_gate: Unlock ->tcfa_lock in tc_setup_flow_action()
net: dsa: bcm_sf2: make const array static, makes object smaller
mptcp: MPTCP_IPV6 should depend on IPV6 instead of selecting it
...
Pull clone/dedupe/remap code refactoring from Darrick Wong:
"Move the generic file range remap (aka reflink and dedupe) functions
out of mm/filemap.c and fs/read_write.c and into fs/remap_range.c to
reduce clutter in the first two files"
* tag 'vfs-5.10-merge-1' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
vfs: move the generic write and copy checks out of mm
vfs: move the remap range helpers to remap_range.c
vfs: move generic_remap_checks out of mm
Pull KVM updates from Paolo Bonzini:
"For x86, there is a new alternative and (in the future) more scalable
implementation of extended page tables that does not need a reverse
map from guest physical addresses to host physical addresses.
For now it is disabled by default because it is still lacking a few of
the existing MMU's bells and whistles. However it is a very solid
piece of work and it is already available for people to hammer on it.
Other updates:
ARM:
- New page table code for both hypervisor and guest stage-2
- Introduction of a new EL2-private host context
- Allow EL2 to have its own private per-CPU variables
- Support of PMU event filtering
- Complete rework of the Spectre mitigation
PPC:
- Fix for running nested guests with in-kernel IRQ chip
- Fix race condition causing occasional host hard lockup
- Minor cleanups and bugfixes
x86:
- allow trapping unknown MSRs to userspace
- allow userspace to force #GP on specific MSRs
- INVPCID support on AMD
- nested AMD cleanup, on demand allocation of nested SVM state
- hide PV MSRs and hypercalls for features not enabled in CPUID
- new test for MSR_IA32_TSC writes from host and guest
- cleanups: MMU, CPUID, shared MSRs
- LAPIC latency optimizations ad bugfixes"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (232 commits)
kvm: x86/mmu: NX largepage recovery for TDP MMU
kvm: x86/mmu: Don't clear write flooding count for direct roots
kvm: x86/mmu: Support MMIO in the TDP MMU
kvm: x86/mmu: Support write protection for nesting in tdp MMU
kvm: x86/mmu: Support disabling dirty logging for the tdp MMU
kvm: x86/mmu: Support dirty logging for the TDP MMU
kvm: x86/mmu: Support changed pte notifier in tdp MMU
kvm: x86/mmu: Add access tracking for tdp_mmu
kvm: x86/mmu: Support invalidate range MMU notifier for TDP MMU
kvm: x86/mmu: Allocate struct kvm_mmu_pages for all pages in TDP MMU
kvm: x86/mmu: Add TDP MMU PF handler
kvm: x86/mmu: Remove disallowed_hugepage_adjust shadow_walk_iterator arg
kvm: x86/mmu: Support zapping SPTEs in the TDP MMU
KVM: Cache as_id in kvm_memory_slot
kvm: x86/mmu: Add functions to handle changed TDP SPTEs
kvm: x86/mmu: Allocate and free TDP MMU roots
kvm: x86/mmu: Init / Uninit the TDP MMU
kvm: x86/mmu: Introduce tdp_iter
KVM: mmu: extract spte.h and spte.c
KVM: mmu: Separate updating a PTE from kvm_set_pte_rmapp
...
Pull virtio updates from Michael Tsirkin:
"vhost, vdpa, and virtio cleanups and fixes
A very quiet cycle, no new features"
* tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost:
MAINTAINERS: add URL for virtio-mem
vhost_vdpa: remove unnecessary spin_lock in vhost_vring_call
vringh: fix __vringh_iov() when riov and wiov are different
vdpa/mlx5: Setup driver only if VIRTIO_CONFIG_S_DRIVER_OK
s390: virtio: PV needs VIRTIO I/O device protection
virtio: let arch advertise guest's memory access restrictions
vhost_vdpa: Fix duplicate included kernel.h
vhost: reduce stack usage in log_used
virtio-mem: Constify mem_id_table
virtio_input: Constify id_table
virtio-balloon: Constify id_table
vdpa/mlx5: Fix failure to bring link up
vdpa/mlx5: Make use of a specific 16 bit endianness API
Pull arch task_work cleanups from Jens Axboe:
"Two cleanups that don't fit other categories:
- Finally get the task_work_add() cleanup done properly, so we don't
have random 0/1/false/true/TWA_SIGNAL confusing use cases. Updates
all callers, and also fixes up the documentation for
task_work_add().
- While working on some TIF related changes for 5.11, this
TIF_NOTIFY_RESUME cleanup fell out of that. Remove some arch
duplication for how that is handled"
* tag 'arch-cleanup-2020-10-22' of git://git.kernel.dk/linux-block:
task_work: cleanup notification modes
tracehook: clear TIF_NOTIFY_RESUME in tracehook_notify_resume()
* pm-cpufreq:
cpufreq: schedutil: restore cached freq when next_f is not changed
acpi-cpufreq: Honor _PSD table setting on new AMD CPUs
cpufreq: intel_pstate: Delete intel_pstate sysfs if failed to register the driver
cpufreq: Improve code around unlisted freq check
* pm-cpuidle:
intel_idle: Ignore _CST if control cannot be taken from the platform
cpuidle: Remove pointless stub
intel_idle: mention assumption that WBINVD is not needed
MAINTAINERS: Add section for cpuidle-psci PM domain
When starting a HP machine with HIL driver but without an HIL keyboard
or HIL mouse attached, it may happen that data written to the HIL loop
gets stuck (e.g. because the transaction queue is full). Usually one
will then have to reboot the machine because all you see is and endless
output of:
Transaction add failed: transaction already queued?
In the higher layers hp_sdc_enqueue_transaction() is called to queued up
a HIL packet. This function returns an error code, and this patch adds
the necessary checks for this return code and disables the HIL driver if
further packets can't be sent.
Tested on a HP 730 and a HP 715/64 machine.
Signed-off-by: Helge Deller <deller@gmx.de>
Cc: <stable@vger.kernel.org>
With the set_fs change, we can no longer rely on copy_{to,from}_user()
accepting a kernel pointer, and it was bad form to do so anyway. Clean
this up and change the internal helper that io_uring uses to deal with
kernel pointers instead. This puts the offset copy in/out in __do_splice()
instead, which just calls the same helper.
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Pull Kbuild updates from Masahiro Yamada:
- Support 'make compile_commands.json' to generate the compilation
database more easily, avoiding stale entries
- Support 'make clang-analyzer' and 'make clang-tidy' for static checks
using clang-tidy
- Preprocess scripts/modules.lds.S to allow CONFIG options in the
module linker script
- Drop cc-option tests from compiler flags supported by our minimal
GCC/Clang versions
- Use always 12-digits commit hash for CONFIG_LOCALVERSION_AUTO=y
- Use sha1 build id for both BFD linker and LLD
- Improve deb-pkg for reproducible builds and rootless builds
- Remove stale, useless scripts/namespace.pl
- Turn -Wreturn-type warning into error
- Fix build error of deb-pkg when CONFIG_MODULES=n
- Replace 'hostname' command with more portable 'uname -n'
- Various Makefile cleanups
* tag 'kbuild-v5.10' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: (34 commits)
kbuild: Use uname for LINUX_COMPILE_HOST detection
kbuild: Only add -fno-var-tracking-assignments for old GCC versions
kbuild: remove leftover comment for filechk utility
treewide: remove DISABLE_LTO
kbuild: deb-pkg: clean up package name variables
kbuild: deb-pkg: do not build linux-headers package if CONFIG_MODULES=n
kbuild: enforce -Werror=return-type
scripts: remove namespace.pl
builddeb: Add support for all required debian/rules targets
builddeb: Enable rootless builds
builddeb: Pass -n to gzip for reproducible packages
kbuild: split the build log of kallsyms
kbuild: explicitly specify the build id style
scripts/setlocalversion: make git describe output more reliable
kbuild: remove cc-option test of -Werror=date-time
kbuild: remove cc-option test of -fno-stack-check
kbuild: remove cc-option test of -fno-strict-overflow
kbuild: move CFLAGS_{KASAN,UBSAN,KCSAN} exports to relevant Makefiles
kbuild: remove redundant CONFIG_KASAN check from scripts/Makefile.kasan
kbuild: do not create built-in objects for external module builds
...
Pull VFIO updates from Alex Williamson:
- New fsl-mc vfio bus driver supporting userspace drivers of objects
within NXP's DPAA2 architecture (Diana Craciun)
- Support for exposing zPCI information on s390 (Matthew Rosato)
- Fixes for "detached" VFs on s390 (Matthew Rosato)
- Fixes for pin-pages and dma-rw accesses (Yan Zhao)
- Cleanups and optimize vconfig regen (Zenghui Yu)
- Fix duplicate irq-bypass token registration (Alex Williamson)
* tag 'vfio-v5.10-rc1' of git://github.com/awilliam/linux-vfio: (30 commits)
vfio iommu type1: Fix memory leak in vfio_iommu_type1_pin_pages
vfio/pci: Clear token on bypass registration failure
vfio/fsl-mc: fix the return of the uninitialized variable ret
vfio/fsl-mc: Fix the dead code in vfio_fsl_mc_set_irq_trigger
vfio/fsl-mc: Fixed vfio-fsl-mc driver compilation on 32 bit
MAINTAINERS: Add entry for s390 vfio-pci
vfio-pci/zdev: Add zPCI capabilities to VFIO_DEVICE_GET_INFO
vfio/fsl-mc: Add support for device reset
vfio/fsl-mc: Add read/write support for fsl-mc devices
vfio/fsl-mc: trigger an interrupt via eventfd
vfio/fsl-mc: Add irq infrastructure for fsl-mc devices
vfio/fsl-mc: Added lock support in preparation for interrupt handling
vfio/fsl-mc: Allow userspace to MMAP fsl-mc device MMIO regions
vfio/fsl-mc: Implement VFIO_DEVICE_GET_REGION_INFO ioctl call
vfio/fsl-mc: Implement VFIO_DEVICE_GET_INFO ioctl
vfio/fsl-mc: Scan DPRC objects on vfio-fsl-mc driver bind
vfio: Introduce capability definitions for VFIO_DEVICE_GET_INFO
s390/pci: track whether util_str is valid in the zpci_dev
s390/pci: stash version in the zpci_dev
vfio/fsl-mc: Add VFIO framework skeleton for fsl-mc devices
...
Pull remoteproc updates from Bjorn Andersson:
"This introduces support for the Mediatek MT9182 SCP and controlling
the Cortex R5F processors found in TI K3 platforms. It clones the
longstanding debugfs interface for controlling crash handling to
sysfs. Lastly it solves a bug where after a warm reset of Qualcomm
platforms the modem would crash upon first boot"
* tag 'rproc-v5.10' of git://git.kernel.org/pub/scm/linux/kernel/git/andersson/remoteproc:
remoteproc/mediatek: Remove non-standard dsb()
remoteproc: Add recovery configuration to the sysfs interface
remoteproc: Add coredump as part of sysfs interface
remoteproc: Change default dump configuration to "disabled"
remoteproc: k3-r5: Add loading support for on-chip SRAM regions
remoteproc: k3-r5: Initialize TCM memories for ECC
remoteproc: k3-r5: Add a remoteproc driver for R5F subsystem
dt-bindings: remoteproc: Add bindings for R5F subsystem on TI K3 SoCs
remoteproc/mediatek: Add support for mt8192 SCP
remoteproc: Fixup coredump debugfs disable request
remoteproc: qcom_q6v5: Assign mpss region to Q6 before MBA boot
remoteproc/mediatek: fix null pointer dereference on null scp pointer
remoteproc: stm32: Fix pointer assignement
remoteproc: scp: add COMPILE_TEST dependency
Pull clk updates from Stephen Boyd:
"This contains no changes to the core framework. It is a collection of
various clk driver updates.
The biggest driver updates in terms of lines of code is the Allwinner
driver, closely followed by the Qualcomm and Mediatek drivers. All of
those hit high because we add so many lines of clk data. Coming in
fourth place is i.MX which also adds a bunch of clk data. This
accounts for the new driver additions this time around.
Otherwise the patches are lots of little cleanups and fixes for
various clk drivers that have baked in linux-next for a while. I
suppose one highlight or theme is that more clk drivers are being
updated to work as modules, which is interesting to see such critical
SoC infrastructure work as a loadable module.
New Drivers:
- Support qcom SM8150/SM8250 video and display clks
- Support Mediatek MT8167 clks
- Add clock for CRC block found on vf610 SoCs
- Add support for the Renesas R-Car V3U (R8A779A0) SoC
- Add support for the VSP for Resizing clock on Renesas RZ/G1H
- Support Allwinner A100 SoC clks
Removed Drivers:
- Remove i.MX21 clock driver, as i.MX21 platform support is being
dropped
Updates:
- Change how qcom's display port clks work
- Small non-critical fixes for TI clk driver
- Remove various unused variables in clk drivers
- Allow Rockchip clk driver to be a module
- Remove most __clk_lookup() calls in Samsung drivers (yay!)
- Support building i.MX ARMv8 platforms clock driver as module
- Some kerneldoc fixes here and there
- A couple of minor i.MX clk data corrections
- Update audio clock inverter and fdiv2 flag on Amlogic g12
- Make amlogic clk drivers configurable in Kconfig
- Fix Renesas VSP clock names to match corrected hardware
documentation
- Sigma-delta modulation on Allwinner R40
- Various fixes for at91 clk driver
- Use semicolons instead of commas in some places
- Mark some variables const so they can move to RO memory"
* tag 'clk-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux: (102 commits)
clk: imx8mq: Fix usdhc parents order
clk: qcom: gdsc: Keep RETAIN_FF bit set if gdsc is already on
clk: Restrict CLK_HSDK to ARC_SOC_HSDK
clk: at91: sam9x60: support only two programmable clocks
clk: ingenic: Respect CLK_SET_RATE_PARENT in .round_rate
clk: ingenic: Don't tag custom clocks with CLK_SET_RATE_PARENT
clk: ingenic: Don't use CLK_SET_RATE_GATE for PLL
clk: ingenic: Use readl_poll_timeout instead of custom loop
clk: ingenic: Use to_clk_info() macro for all clocks
clk: bcm2835: add missing release if devm_clk_hw_register fails
clk: at91: clk-sam9x60-pll: remove unused variable
clk: at91: clk-main: update key before writing AT91_CKGR_MOR
clk: at91: remove the checking of parent_name
clk: clk-prima2: fix return value check in prima2_clk_init()
clk: mmp2: Fix the display clock divider base
clk: pxa: Constify static struct clk_ops
clk: baikal-t1: Mark Ethernet PLL as critical
clk: qoriq: modify MAX_PLL_DIV to 32
clk: axi-clkgen: Set power bits for fractional mode
clk: axi-clkgen: Add support for fractional dividers
...