The commit 314001f0bf ("af_unix: Add OOB support") introduced OOB for
AF_UNIX, but it lacks some changes for POLLPRI. Let's add the missing
piece.
In the selftest, normal datagrams are sent followed by OOB data, so this
commit replaces `POLLIN | POLLPRI` with just `POLLPRI` in the first test
case.
Fixes: 314001f0bf ("af_unix: Add OOB support")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.co.jp>
Signed-off-by: David S. Miller <davem@davemloft.net>
Out-of-band data automatically places a "mark" showing wherein the
sequence the out-of-band data would have been. If the out-of-band data
implies cancelling everything sent so far, the "mark" is helpful to flush
them. When the socket's read pointer reaches the "mark", the ioctl() below
sets a non zero value to the arg `atmark`:
The out-of-band data is queued in sk->sk_receive_queue as well as ordinary
data and also saved in unix_sk(sk)->oob_skb. It can be used to test if the
head of the receive queue is the out-of-band data meaning the socket is at
the "mark".
While testing that, unix_ioctl() reads unix_sk(sk)->oob_skb locklessly.
Thus, all accesses to oob_skb need some basic protection to avoid
load/store tearing which KCSAN detects when these are called concurrently:
- ioctl(fd_a, SIOCATMARK, &atmark, sizeof(atmark))
- send(fd_b_connected_to_a, buf, sizeof(buf), MSG_OOB)
BUG: KCSAN: data-race in unix_ioctl / unix_stream_sendmsg
write to 0xffff888003d9cff0 of 8 bytes by task 175 on cpu 1:
unix_stream_sendmsg (net/unix/af_unix.c:2087 net/unix/af_unix.c:2191)
sock_sendmsg (net/socket.c:705 net/socket.c:725)
__sys_sendto (net/socket.c:2040)
__x64_sys_sendto (net/socket.c:2048)
do_syscall_64 (arch/x86/entry/common.c:50 arch/x86/entry/common.c:80)
entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:113)
read to 0xffff888003d9cff0 of 8 bytes by task 176 on cpu 0:
unix_ioctl (net/unix/af_unix.c:3101 (discriminator 1))
sock_do_ioctl (net/socket.c:1128)
sock_ioctl (net/socket.c:1242)
__x64_sys_ioctl (fs/ioctl.c:52 fs/ioctl.c:874 fs/ioctl.c:860 fs/ioctl.c:860)
do_syscall_64 (arch/x86/entry/common.c:50 arch/x86/entry/common.c:80)
entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:113)
value changed: 0xffff888003da0c00 -> 0xffff888003da0d00
Reported by Kernel Concurrency Sanitizer on:
CPU: 0 PID: 176 Comm: unix_race_oob_i Not tainted 5.17.0-rc5-59529-g83dc4c2af682 #12
Hardware name: Red Hat KVM, BIOS 1.11.0-2.amzn2 04/01/2014
Fixes: 314001f0bf ("af_unix: Add OOB support")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.co.jp>
Signed-off-by: David S. Miller <davem@davemloft.net>
The function soc_device_match() is difficult to read for various
reasons:
- There are two loop conditions using different styles: "while (...)"
(which is BTW always true) vs. "if ... break",
- The are two return condition using different logic: "if ... return
foo" vs. "if ... else return bar".
Make the code easier to read by:
1. Removing the always-true "!ret" loop condition, and dropping the
now unneeded pre-initialization of "ret",
2. Converting "if ... break" to a proper "while (...)" loop condition,
3. Inverting the logic of the second return condition.
Reviewed-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Link: https://lore.kernel.org/r/9f9107c06f7d065ae6581e5290ef5d72f7298fd1.1646132835.git.geert+renesas@glider.be
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
There are 3 copies of the same device cleanup code used for probe failure,
testing re-probing, and device unbinding. Changes to this code often miss
at least one of the copies of the code. See commits d0243bbd5d ("drivers
core: Free dma_range_map when driver probe failed") and d8f7a5484f
("driver core: Free DMA range map when device is released") for example.
Let's refactor the code to its own function.
Signed-off-by: Rob Herring <robh@kernel.org>
Link: https://lore.kernel.org/r/20220223225257.1681968-2-robh@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
There is a race between reset and the transmit paths that can lead to
ibmvnic_xmit() accessing an scrq after it has been freed in the reset
path. It can result in a crash like:
Kernel attempted to read user page (0) - exploit attempt? (uid: 0)
BUG: Kernel NULL pointer dereference on read at 0x00000000
Faulting instruction address: 0xc0080000016189f8
Oops: Kernel access of bad area, sig: 11 [#1]
...
NIP [c0080000016189f8] ibmvnic_xmit+0x60/0xb60 [ibmvnic]
LR [c000000000c0046c] dev_hard_start_xmit+0x11c/0x280
Call Trace:
[c008000001618f08] ibmvnic_xmit+0x570/0xb60 [ibmvnic] (unreliable)
[c000000000c0046c] dev_hard_start_xmit+0x11c/0x280
[c000000000c9cfcc] sch_direct_xmit+0xec/0x330
[c000000000bfe640] __dev_xmit_skb+0x3a0/0x9d0
[c000000000c00ad4] __dev_queue_xmit+0x394/0x730
[c008000002db813c] __bond_start_xmit+0x254/0x450 [bonding]
[c008000002db8378] bond_start_xmit+0x40/0xc0 [bonding]
[c000000000c0046c] dev_hard_start_xmit+0x11c/0x280
[c000000000c00ca4] __dev_queue_xmit+0x564/0x730
[c000000000cf97e0] neigh_hh_output+0xd0/0x180
[c000000000cfa69c] ip_finish_output2+0x31c/0x5c0
[c000000000cfd244] __ip_queue_xmit+0x194/0x4f0
[c000000000d2a3c4] __tcp_transmit_skb+0x434/0x9b0
[c000000000d2d1e0] __tcp_retransmit_skb+0x1d0/0x6a0
[c000000000d2d984] tcp_retransmit_skb+0x34/0x130
[c000000000d310e8] tcp_retransmit_timer+0x388/0x6d0
[c000000000d315ec] tcp_write_timer_handler+0x1bc/0x330
[c000000000d317bc] tcp_write_timer+0x5c/0x200
[c000000000243270] call_timer_fn+0x50/0x1c0
[c000000000243704] __run_timers.part.0+0x324/0x460
[c000000000243894] run_timer_softirq+0x54/0xa0
[c000000000ea713c] __do_softirq+0x15c/0x3e0
[c000000000166258] __irq_exit_rcu+0x158/0x190
[c000000000166420] irq_exit+0x20/0x40
[c00000000002853c] timer_interrupt+0x14c/0x2b0
[c000000000009a00] decrementer_common_virt+0x210/0x220
--- interrupt: 900 at plpar_hcall_norets_notrace+0x18/0x2c
The immediate cause of the crash is the access of tx_scrq in the following
snippet during a reset, where the tx_scrq can be either NULL or an address
that will soon be invalid:
ibmvnic_xmit()
{
...
tx_scrq = adapter->tx_scrq[queue_num];
txq = netdev_get_tx_queue(netdev, queue_num);
ind_bufp = &tx_scrq->ind_buf;
if (test_bit(0, &adapter->resetting)) {
...
}
But beyond that, the call to ibmvnic_xmit() itself is not safe during a
reset and the reset path attempts to avoid this by stopping the queue in
ibmvnic_cleanup(). However just after the queue was stopped, an in-flight
ibmvnic_complete_tx() could have restarted the queue even as the reset is
progressing.
Since the queue was restarted we could get a call to ibmvnic_xmit() which
can then access the bad tx_scrq (or other fields).
We cannot however simply have ibmvnic_complete_tx() check the ->resetting
bit and skip starting the queue. This can race at the "back-end" of a good
reset which just restarted the queue but has not cleared the ->resetting
bit yet. If we skip restarting the queue due to ->resetting being true,
the queue would remain stopped indefinitely potentially leading to transmit
timeouts.
IOW ->resetting is too broad for this purpose. Instead use a new flag
that indicates whether or not the queues are active. Only the open/
reset paths control when the queues are active. ibmvnic_complete_tx()
and others wake up the queue only if the queue is marked active.
So we will have:
A. reset/open thread in ibmvnic_cleanup() and __ibmvnic_open()
->resetting = true
->tx_queues_active = false
disable tx queues
...
->tx_queues_active = true
start tx queues
B. Tx interrupt in ibmvnic_complete_tx():
if (->tx_queues_active)
netif_wake_subqueue();
To ensure that ->tx_queues_active and state of the queues are consistent,
we need a lock which:
- must also be taken in the interrupt path (ibmvnic_complete_tx())
- shared across the multiple queues in the adapter (so they don't
become serialized)
Use rcu_read_lock() and have the reset thread synchronize_rcu() after
updating the ->tx_queues_active state.
While here, consolidate a few boolean fields in ibmvnic_adapter for
better alignment.
Based on discussions with Brian King and Dany Madden.
Fixes: 7ed5b31f4a ("net/ibmvnic: prevent more than one thread from running in reset")
Reported-by: Vaishnavi Bhat <vaish123@in.ibm.com>
Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The Google Coreboot implementation requires IOMEM functions
(memmremap, memunmap, devm_memremap), but does not specify this is its
Kconfig. This results in build errors when HAS_IOMEM is not set, such as
on some UML configurations:
/usr/bin/ld: drivers/firmware/google/coreboot_table.o: in function `coreboot_table_probe':
coreboot_table.c:(.text+0x311): undefined reference to `memremap'
/usr/bin/ld: coreboot_table.c:(.text+0x34e): undefined reference to `memunmap'
/usr/bin/ld: drivers/firmware/google/memconsole-coreboot.o: in function `memconsole_probe':
memconsole-coreboot.c:(.text+0x12d): undefined reference to `memremap'
/usr/bin/ld: memconsole-coreboot.c:(.text+0x17e): undefined reference to `devm_memremap'
/usr/bin/ld: memconsole-coreboot.c:(.text+0x191): undefined reference to `memunmap'
/usr/bin/ld: drivers/firmware/google/vpd.o: in function `vpd_section_destroy.isra.0':
vpd.c:(.text+0x300): undefined reference to `memunmap'
/usr/bin/ld: drivers/firmware/google/vpd.o: in function `vpd_section_init':
vpd.c:(.text+0x382): undefined reference to `memremap'
/usr/bin/ld: vpd.c:(.text+0x459): undefined reference to `memunmap'
/usr/bin/ld: drivers/firmware/google/vpd.o: in function `vpd_probe':
vpd.c:(.text+0x59d): undefined reference to `memremap'
/usr/bin/ld: vpd.c:(.text+0x5d3): undefined reference to `memunmap'
collect2: error: ld returned 1 exit status
Fixes: a28aad66da ("firmware: coreboot: Collapse platform drivers into bus core")
Acked-By: anton ivanov <anton.ivanov@cambridgegreys.com>
Acked-By: Julius Werner <jwerner@chromium.org>
Signed-off-by: David Gow <davidgow@google.com>
Link: https://lore.kernel.org/r/20220225041502.1901806-1-davidgow@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
ADSP/MDSP/SDSP are by default secured, which means it can only be loaded
with a Signed process.
Where as CDSP can be either be secured/unsecured. non-secured Compute DSP
would allow users to load unsigned process and run hexagon instructions,
but blocking access to secured hardware within the DSP. Where as signed
process with secure CDSP would be allowed to access all the dsp resources.
This patch adds basic code to create device nodes as per device tree property.
Signed-off-by: Srinivas Kandagatla <srinivas.kandagatla@linaro.org>
Link: https://lore.kernel.org/r/20220214161002.6831-6-srinivas.kandagatla@linaro.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Currently fastrpc misc device instance is within channel context struct
with a kref. So we have 2 structs with refcount, both of them managing the
same channel context structure.
Separate fastrpc device from channel context and by adding a dedicated
fastrpc_device structure, this should clean the structures a bit and also help
when adding secure device node support.
Signed-off-by: Srinivas Kandagatla <srinivas.kandagatla@linaro.org>
Link: https://lore.kernel.org/r/20220214161002.6831-2-srinivas.kandagatla@linaro.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Multiple pr_infos generate newlines, so the hexdump looks like...
> 0x81: count=16, status:
> 01
> 00
> 20
(...16 lines...)
We switch to a single %*ph hexdump, using the built-in %ph format,
which leads to this:
[52769.348789] usb 2-1.3.1: Clearing ep0x83.
[52769.349729] usb 2-1.3.1: ep_status=0x81, count=16,...
...status=01:00:20:40:05:04:04:00:20:53:00:00:00:00:00:00
Signed-off-by: Christian Vogel <vogelchr@vogel.cx>
Link: https://lore.kernel.org/r/20220311192833.1792-2-vogelchr@vogel.cx
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Functions like mhi_read_reg_field(), mhi_poll_reg_field() and
mhi_write_reg_field() could be modified to not depend on the shift value
passed as an argument. Instead, the bitfield operation could be used to
extract the shift value from the mask itself.
This eliminates the need to define _SHIFT (and _SHFT) macros and
simplifies the code a bit. For shift values those cannot be determined
during build time, "__ffs()" helper is used find the shift value during
runtime.
While at it, let's also get rid of 32-bit masks like CHDBOFF_CHDBOFF_MASK
by doing the full 32-bit register read.
Suggested-by: Alex Elder <elder@linaro.org>
Reviewed-by: Alex Elder <elder@linaro.org>
Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Link: https://lore.kernel.org/r/20220301160308.107452-6-manivannan.sadhasivam@linaro.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>