Masami Hiramatsu says:
====================
Hi,
These fprobe patches are for fixing the warnings by Smatch and sparse.
This is arch independent part of the fixes.
Thank you,
---
====================
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
'dma-ranges' in the example is written for cell sizes of 2 cells, but
the schema and example specify sizes of 1 cell. As the h/w has a bus
address of >32-bits, cell sizes of 2 is correct. Update the schema's
'#address-cells' and '#size-cells' to be 2 and adjust the example
throughout.
There's no error currently because dtc only checks 'dma-ranges' is a
correct multiple number of cells (3) and the schema checking is based on
bracketing of entries.
Signed-off-by: Rob Herring <robh@kernel.org>
Link: https://lore.kernel.org/r/20220301233501.2110047-1-robh@kernel.org
Pull ptrace cleanups from Eric Biederman:
"This set of changes removes tracehook.h, moves modification of all of
the ptrace fields inside of siglock to remove races, adds a missing
permission check to ptrace.c
The removal of tracehook.h is quite significant as it has been a major
source of confusion in recent years. Much of that confusion was around
task_work and TIF_NOTIFY_SIGNAL (which I have now decoupled making the
semantics clearer).
For people who don't know tracehook.h is a vestiage of an attempt to
implement uprobes like functionality that was never fully merged, and
was later superseeded by uprobes when uprobes was merged. For many
years now we have been removing what tracehook functionaly a little
bit at a time. To the point where anything left in tracehook.h was
some weird strange thing that was difficult to understand"
* tag 'ptrace-cleanups-for-v5.18' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace:
ptrace: Remove duplicated include in ptrace.c
ptrace: Check PTRACE_O_SUSPEND_SECCOMP permission on PTRACE_SEIZE
ptrace: Return the signal to continue with from ptrace_stop
ptrace: Move setting/clearing ptrace_message into ptrace_stop
tracehook: Remove tracehook.h
resume_user_mode: Move to resume_user_mode.h
resume_user_mode: Remove #ifdef TIF_NOTIFY_RESUME in set_notify_resume
signal: Move set_notify_signal and clear_notify_signal into sched/signal.h
task_work: Decouple TIF_NOTIFY_SIGNAL and task_work
task_work: Call tracehook_notify_signal from get_signal on all architectures
task_work: Introduce task_work_pending
task_work: Remove unnecessary include from posix_timers.h
ptrace: Remove tracehook_signal_handler
ptrace: Remove arch_syscall_{enter,exit}_tracehook
ptrace: Create ptrace_report_syscall_{entry,exit} in ptrace.h
ptrace/arm: Rename tracehook_report_syscall report_syscall
ptrace: Move ptrace_report_syscall into ptrace.h
Pull shm ucounts fix from Eric Biederman:
"The introduction of a new failure mode when the code was converted to
ucounts resulted in user_shm_lock misbehaving.
The change simplifies the code to make the code easier to follow and
removes the known misbehaviors"
* tag 'ucount-rlimit-for-v5.18' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace:
mm/mlock: fix two bugs in user_shm_lock()
Pull networking fixes from Jakub Kicinski:
"Including fixes from netfilter.
Current release - regressions:
- llc: only change llc->dev when bind() succeeds, fix null-deref
Current release - new code bugs:
- smc: fix a memory leak in smc_sysctl_net_exit()
- dsa: realtek: make interface drivers depend on OF
Previous releases - regressions:
- sched: act_ct: fix ref leak when switching zones
Previous releases - always broken:
- netfilter: egress: report interface as outgoing
- vsock/virtio: enable VQs early on probe and finish the setup before
using them
Misc:
- memcg: enable accounting for nft objects"
* tag 'net-5.18-rc0' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (39 commits)
Revert "selftests: net: Add tls config dependency for tls selftests"
net/smc: Send out the remaining data in sndbuf before close
net: move net_unlink_todo() out of the header
net: dsa: bcm_sf2_cfp: fix an incorrect NULL check on list iterator
net: bnxt_ptp: fix compilation error
selftests: net: Add tls config dependency for tls selftests
memcg: enable accounting for nft objects
net/sched: act_ct: fix ref leak when switching zones
net/smc: fix a memory leak in smc_sysctl_net_exit()
selftests: tls: skip cmsg_to_pipe tests with TLS=n
octeontx2-af: initialize action variable
net: sparx5: switchdev: fix possible NULL pointer dereference
net/x25: Fix null-ptr-deref caused by x25_disconnect
qlcnic: dcb: default to returning -EOPNOTSUPP
net: sparx5: depends on PTP_1588_CLOCK_OPTIONAL
net: hns3: fix phy can not link up when autoneg off and reset
net: hns3: add NULL pointer check for hns3_set/get_ringparam()
net: hns3: add netdev reset check for hns3_set_tunable()
net: hns3: clean residual vf config after disable sriov
net: hns3: add max order judgement for tx spare buffer
...
If there is already an entry present that is of order >= XA_CHUNK_SHIFT
when we call xas_create_range(), xas_create_range() will misinterpret
that entry as a node and dereference xa_node->parent, generally leading
to a crash that looks something like this:
general protection fault, probably for non-canonical address 0xdffffc0000000001:
0000 [#1] PREEMPT SMP KASAN
KASAN: null-ptr-deref in range [0x0000000000000008-0x000000000000000f]
CPU: 0 PID: 32 Comm: khugepaged Not tainted 5.17.0-rc8-syzkaller-00003-g56e337f2cf13 #0
RIP: 0010:xa_parent_locked include/linux/xarray.h:1207 [inline]
RIP: 0010:xas_create_range+0x2d9/0x6e0 lib/xarray.c:725
It's deterministically reproducable once you know what the problem is,
but producing it in a live kernel requires khugepaged to hit a race.
While the problem has been present since xas_create_range() was
introduced, I'm not aware of a way to hit it before the page cache was
converted to use multi-index entries.
Fixes: 6b24ca4a1a ("mm: Use multi-index entries in the page cache")
Reported-by: syzbot+0d2b0bf32ca5cfd09f2e@syzkaller.appspotmail.com
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
The current autocork algorithms will delay the data transmission
in BH context to smc_release_cb() when sock_lock is hold by user.
So there is a possibility that when connection is being actively
closed (sock_lock is hold by user now), some corked data still
remains in sndbuf, waiting to be sent by smc_release_cb(). This
will cause:
- smc_close_stream_wait(), which is called under the sock_lock,
has a high probability of timeout because data transmission is
delayed until sock_lock is released.
- Unexpected data sends may happen after connction closed and use
the rtoken which has been deleted by remote peer through
LLC_DELETE_RKEY messages.
So this patch will try to send out the remaining corked data in
sndbuf before active close process, to ensure data integrity and
avoid unexpected data transmission after close.
Reported-by: Guangguan Wang <guangguan.wang@linux.alibaba.com>
Fixes: 6b88af839d ("net/smc: don't send in the BH context if sock_owned_by_user")
Signed-off-by: Wen Gu <guwen@linux.alibaba.com>
Acked-by: Karsten Graul <kgraul@linux.ibm.com>
Link: https://lore.kernel.org/r/1648447836-111521-1-git-send-email-guwen@linux.alibaba.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Currently the way the tid (tree connection) status is tracked
is confusing. The same enum is used for structs cifs_tcon
and cifs_ses and TCP_Server_info, but each of these three has
different states that they transition among. The current
code also unnecessarily uses camelCase.
Convert from use of statusEnum to a new tid_status_enum for
tree connections. The valid states for a tid are:
TID_NEW = 0,
TID_GOOD,
TID_EXITING,
TID_NEED_RECON,
TID_NEED_TCON,
TID_IN_TCON,
TID_NEED_FILES_INVALIDATE, /* unused, considering removing in future */
TID_IN_FILES_INVALIDATE
It also removes CifsNeedTcon, CifsInTcon, CifsNeedFilesInvalidate and
CifsInFilesInvalidate from the statusEnum used for session and
TCP_Server_Info since they are not relevant for those.
A follow on patch will fix the places where we use the
tcon->need_reconnect flag to be more consistent with the tid->status.
Also fixes a bug that was:
Reported-by: kernel test robot <lkp@intel.com>
Reviewed-by: Shyam Prasad N <sprasad@microsoft.com>
Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
Pull kgdb update from Daniel Thompson:
"Only a single patch this cycle. Fix an obvious mistake with the kdb
memory accessors.
It was a stupid mistake (to/from backwards) but it has been there for
a long time since many architectures tolerated it with surprisingly
good grace"
* tag 'kgdb-5.18-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/danielt/linux:
kdb: Fix the putarea helper function
Pull hexagon update from Brian Cain:
"Maintainer email update"
* tag 'hexagon-5.18-0' of git://git.kernel.org/pub/scm/linux/kernel/git/bcain/linux:
MAINTAINERS: update hexagon maintainer email, tree
Pull microblaze updates from Michal Simek:
- Small fixups
- Remove unused pci_phys_mem_access_prot()
* tag 'microblaze-v5.18' of git://git.monstr.eu/linux-2.6-microblaze:
microblaze/PCI: Remove pci_phys_mem_access_prot() dead code
microblaze: add const to of_device_id
microblaze: fix typo in a comment
The bug is here:
return rule;
The list iterator value 'rule' will *always* be set and non-NULL
by list_for_each_entry(), so it is incorrect to assume that the
iterator value will be NULL if the list is empty or no element
is found.
To fix the bug, return 'rule' when found, otherwise return NULL.
Fixes: ae7a5aff78 ("net: dsa: bcm_sf2: Keep copy of inserted rules")
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Xiaomeng Tong <xiam0nd.tong@gmail.com>
Link: https://lore.kernel.org/r/20220328032431.22538-1-xiam0nd.tong@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Pull livepatching updates from Petr Mladek:
- Forced transitions block only to-be-removed livepatches [Chengming]
- Detect when ftrace handler could not be disabled in self-tests [David]
- Calm down warning from a static analyzer [Tom]
* tag 'livepatching-for-5.18' of git://git.kernel.org/pub/scm/linux/kernel/git/livepatching/livepatching:
livepatch: Reorder to use before freeing a pointer
livepatch: Don't block removal of patches that are safe to unload
livepatch: Skip livepatch tests if ftrace cannot be configured
The Architecture chapter of the KUnit documentation tried to include
copies of the kernel-doc for a couple of things, despite these already
existing in the API documentation. This lead to some warnings:
architecture:31: ./include/kunit/test.h:3: WARNING: Duplicate C declaration, also defined at dev-tools/kunit/api/test:66.
Declaration is '.. c:struct:: kunit_case'.
architecture:163: ./include/kunit/test.h:1217: WARNING: Duplicate C declaration, also defined at dev-tools/kunit/api/test:1217.
Declaration is '.. c:macro:: KUNIT_ARRAY_PARAM'.
architecture.rst:3: WARNING: Duplicate C declaration, also defined at dev-tools/kunit/api/test:66.
Declaration is '.. c:struct:: kunit_case'.
architecture.rst:1217: WARNING: Duplicate C declaration, also defined at dev-tools/kunit/api/test:1217.
Declaration is '.. c:macro:: KUNIT_ARRAY_PARAM'.
Get rid of these, and cleanup the mentions of the struct and macro in
question so that sphinx generates a link to the existing copy of the
documentation in the api/test document.
Fixes: bc145b370c ("Documentation: KUnit: Added KUnit Architecture")
Signed-off-by: David Gow <davidgow@google.com>
Reviewed-by: Brendan Higgins <brendanhiggins@google.com>
Tested-by: Brendan Higgins <brendanhiggins@google.com>
Link: https://lore.kernel.org/r/20220326054414.637293-1-davidgow@google.com
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Pull xen updates from Juergen Gross:
- A bunch of minor cleanups
- A fix for kexec in Xen dom0 when executed on a high cpu number
- A fix for resuming after suspend of a Xen guest with assigned PCI
devices
- A fix for a crash due to not disabled preemption when resuming as Xen
dom0
* tag 'for-linus-5.18-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
xen: fix is_xen_pmu()
xen: don't hang when resuming PCI device
arch:x86:xen: Remove unnecessary assignment in xen_apic_read()
xen/grant-table: remove readonly parameter from functions
xen/grant-table: remove gnttab_*transfer*() functions
drivers/xen: use helper macro __ATTR_RW
x86/xen: Fix kerneldoc warning
xen: delay xen_hvm_init_time_ops() if kdump is boot on vcpu>=32
xen: use time_is_before_eq_jiffies() instead of open coding it
jgnop mnemonic is only available since binutils 2.36,
kernel minimal required version is 2.23. Stick to brcl
to avoid build errors.
Reported-by: Nathan Chancellor <nathan@kernel.org>
Fixes: 4afeb67071 ("s390/alternatives: use instructions instead of byte patterns")
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
If mlx5_vdpa gets unloaded while a VM is running, the workqueue will be
destroyed. However, vhost might still have reference to the kick
function and might attempt to push new works. This could lead to null
pointer dereference.
To fix this, set mvdev->wq to NULL just before destroying and verify
that the workqueue is not NULL in mlx5_vdpa_kick_vq before attempting to
push a new work.
Fixes: 5262912ef3 ("vdpa/mlx5: Add support for control VQ and MAC setting")
Signed-off-by: Eli Cohen <elic@nvidia.com>
Link: https://lore.kernel.org/r/20220321141303.9586-1-elic@nvidia.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
vhost_iotlb_add_range_ctx() handles the range [0, ULONG_MAX] by
splitting it into two ranges and adding them separately. The return
value of adding the first range to the iotlb is currently ignored.
Check the return value and bail out in case of an error.
Signed-off-by: Anirudh Rayabharam <mail@anirudhrb.com>
Link: https://lore.kernel.org/r/20220312141121.4981-1-mail@anirudhrb.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Fixes: e2ae38cf3d ("vhost: fix hung thread due to erroneous iotlb entries")
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
When MAC Address has been modified in guest, we only re-add the
Mac to mpfs, it is not enough, because the guest network will
not work correctly: the reply package from outside will go
straight away to the host VF net interface.
This patch recreate the flow rules, and make it work correctly.
Signed-off-by: Michael Qiu <qiudayu@archeros.com>
Link: https://lore.kernel.org/r/1648446492-17614-1-git-send-email-08005325@163.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Eli Cohen <elic@nvidia.com>
virtio pci config structures may in future have non-standard bar
values in the bar field. We should anticipate this by skipping any
structures containing such a reserved value.
The bar value should never change: check for harmful modified values
we re-read it from the config space in vp_modern_map_capability().
Also clean up an existing check to consistently use PCI_STD_NUM_BARS.
Signed-off-by: Keir Fraser <keirf@google.com>
Link: https://lore.kernel.org/r/20220323140727.3499235-1-keirf@google.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Now it's possible to control supported hashflows.
Added hashflow set/get callbacks.
Also, disabling RXH_IP_SRC/DST for TCP would disable then for UDP.
TCP and UDP supports only:
ethtool -U eth0 rx-flow-hash tcp4 sd
RXH_IP_SRC + RXH_IP_DST
ethtool -U eth0 rx-flow-hash tcp4 sdfn
RXH_IP_SRC + RXH_IP_DST + RXH_L4_B_0_1 + RXH_L4_B_2_3
Disabling happens because VirtioNET hashtype for IP doesn't check L4 proto,
it works for all IP packets(TCP, UDP, ICMP, etc.).
For TCP and UDP, it's possible to set IP+PORT hashes.
But disabling IP hashes will disable them for TCP and UDP simultaneously.
It's possible to set IP+PORT for TCP/UDP and disable/enable IP
for everything else(UDP, ICMP, etc.).
Signed-off-by: Andrew Melnychenko <andrew@daynix.com>
Link: https://lore.kernel.org/r/20220328175336.10802-5-andrew@daynix.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Added features for RSS.
Added initialization, RXHASH feature and ethtool ops.
By default RSS/RXHASH is disabled.
Virtio RSS "IPv6 extensions" hashes disabled.
Added ethtools ops to set key and indirection table.
Signed-off-by: Andrew Melnychenko <andrew@daynix.com>
Link: https://lore.kernel.org/r/20220328175336.10802-3-andrew@daynix.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
After waking up a suspended VM, the kernel prints the following trace
for virtio drivers which do not directly call virtio_device_ready() in
the .restore:
PM: suspend exit
irq 22: nobody cared (try booting with the "irqpoll" option)
Call Trace:
<IRQ>
dump_stack_lvl+0x38/0x49
dump_stack+0x10/0x12
__report_bad_irq+0x3a/0xaf
note_interrupt.cold+0xb/0x60
handle_irq_event+0x71/0x80
handle_fasteoi_irq+0x95/0x1e0
__common_interrupt+0x6b/0x110
common_interrupt+0x63/0xe0
asm_common_interrupt+0x1e/0x40
? __do_softirq+0x75/0x2f3
irq_exit_rcu+0x93/0xe0
sysvec_apic_timer_interrupt+0xac/0xd0
</IRQ>
<TASK>
asm_sysvec_apic_timer_interrupt+0x12/0x20
arch_cpu_idle+0x12/0x20
default_idle_call+0x39/0xf0
do_idle+0x1b5/0x210
cpu_startup_entry+0x20/0x30
start_secondary+0xf3/0x100
secondary_startup_64_no_verify+0xc3/0xcb
</TASK>
handlers:
[<000000008f9bac49>] vp_interrupt
[<000000008f9bac49>] vp_interrupt
Disabling IRQ #22
This happens because we don't invoke .enable_cbs callback in
virtio_device_restore(). That callback is used by some transports
(e.g. virtio-pci) to enable interrupts.
Let's fix it, by calling virtio_device_ready() as we do in
virtio_dev_probe(). This function calls .enable_cts callback and sets
DRIVER_OK status bit.
This fix also avoids setting DRIVER_OK twice for those drivers that
call virtio_device_ready() in the .restore.
Fixes: d50497eb4e ("virtio_config: introduce a new .enable_cbs method")
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Link: https://lore.kernel.org/r/20220322114313.116516-1-sgarzare@redhat.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
When using pthreads, one has to compile and link with -lpthread,
otherwise e.g. glibc is not guaranteed to be reentrant.
This replaces -lpthread.
Reported-by: Matthew Wilcox <willy@infradead.org>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Allow an admin creating a vdpa device to specify the max MTU for the
net device.
For example, to create a device with max MTU of 1000, the following
command can be used:
$ vdpa dev add name vdpa-a mgmtdev auxiliary/mlx5_core.sf.1 mtu 1000
This configuration mechanism assumes that vdpa is the sole real user of
the function. mlx5_core could theoretically change the mtu of the
function using the ip command on the mlx5_core net device but this
should not be done.
Reviewed-by: Si-Wei Liu<si-wei.liu@oracle.com>
Signed-off-by: Eli Cohen <elic@nvidia.com>
Link: https://lore.kernel.org/r/20220221121927.194728-1-elic@nvidia.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>