commit 171ed0fcd8 upstream.
Commit 14a3ae34bf ("cxl: Prevent read/write to AFU config space while AFU
not configured") introduced a rwsem to fix an invalid memory access that
occurred when someone attempts to access the config space of an AFU on a
vPHB whilst the AFU is deconfigured, such as during EEH recovery.
It turns out that it's possible to run into a nested locking issue when EEH
recovery fails and a full device hotplug is required.
cxl_pci_error_detected() deconfigures the AFU, taking a writer lock on
configured_rwsem. When EEH recovery fails, the EEH code calls
pci_hp_remove_devices() to remove the device, which in turn calls
cxl_remove() -> cxl_pci_remove_afu() -> pci_deconfigure_afu(), which tries
to grab the writer lock that's already held.
Standard rwsem semantics don't express what we really want to do here and
don't allow for nested locking. Fix this by replacing the rwsem with an
atomic_t which we can control more finely. Allow the AFU to be locked
multiple times so long as there are no readers.
Fixes: 14a3ae34bf ("cxl: Prevent read/write to AFU config space while AFU not configured")
Signed-off-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Acked-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 14a3ae34bf upstream.
During EEH recovery, we deconfigure all AFUs whilst leaving the
corresponding vPHB and virtual PCI device in place.
If something attempts to interact with the AFU's PCI config space (e.g.
running lspci) after the AFU has been deconfigured and before it's
reconfigured, cxl_pcie_{read,write}_config() will read invalid values from
the deconfigured struct cxl_afu and proceed to Oops when they try to
dereference pointers that have been set to NULL during deconfiguration.
Add a rwsem to struct cxl_afu so we can prevent interaction with config
space while the AFU is deconfigured.
Reported-by: Pradipta Ghosh <pradghos@in.ibm.com>
Suggested-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com>
Signed-off-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Signed-off-by: Vaibhav Jain <vaibhav@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 239a3b6636 upstream.
When TX descriptors are filled in, the buffer DMA address is split
between the tx_desc->buf_phys_addr field (high-order bits) and
tx_desc->packet_offset field (5 low-order bits).
However, when we re-calculate the DMA address from the TX descriptor in
mvpp2_txq_inc_put(), we do not take tx_desc->packet_offset into
account. This means that when the DMA address is not aligned on a 32
bytes boundary, we end up calling dma_unmap_single() with a DMA address
that was not the one returned by dma_map_single().
This inconsistency is detected by the kernel when DMA_API_DEBUG is
enabled. We fix this problem by properly calculating the DMA address in
mvpp2_txq_inc_put().
Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 4920e3cf77 upstream.
The current implementation of setup_randomness uses the stack address
and therefore the pointer to the SYSIB 3.2.2 block as input data
address. Furthermore the length of the input data is the number of
virtual-machine description blocks which is typically one.
This means that typically a single zero byte is fed to
add_device_randomness.
Fix both of these and use the address of the first virtual machine
description block as input data address and also use the correct
length.
Fixes: bcfcbb6bae ("s390: add system information as device randomness")
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit da8fd820f3 upstream.
Commit bcfcbb6bae ("s390: add system information as device
randomness") intended to add some virtual machine specific information
to the randomness pool.
Unfortunately it uses the page allocator before it is ready to use. In
result the page allocator always returns NULL and the setup_randomness
function never adds anything to the randomness pool.
To fix this use memblock_alloc and memblock_free instead.
Fixes: bcfcbb6bae ("s390: add system information as device randomness")
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit fb94a687d9 upstream.
Return a sensible value if TASK_SIZE if called from a kernel thread.
This gets us around an issue with copy_mount_options that does a magic
size calculation "TASK_SIZE - (unsigned long)data" while in a kernel
thread and data pointing to kernel space.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit a4a81d8eeb upstream.
In binutils/libbfd (bfd/elf.c) it is enforced that all s390 specific ELF
notes like e.g. NT_S390_PREFIX or NT_S390_CTRS have "LINUX" specified
as note name. Otherwise the notes are ignored.
For /proc/vmcore we currently use "CORE" for these notes.
Up to now this has not been a real problem because the dump analysis tool
"crash" does not check the note name. But it will break all programs that
use libbfd for processing ELF notes.
So fix this and use "LINUX" for all s390 specific notes to comply with
libbfd.
Reported-by: Philipp Rudo <prudo@linux.vnet.ibm.com>
Reviewed-by: Philipp Rudo <prudo@linux.vnet.ibm.com>
Signed-off-by: Michael Holzheu <holzheu@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit a63f53e34d upstream.
Since commit dd22f551 "block: Change direct_access calling convention",
the device size calculation in dcssblk_direct_access() is off-by-one.
This results in bdev_direct_access() always returning -ENXIO because the
returned value is not page aligned.
Fix this by adding 1 to the dev_sz calculation.
Fixes: dd22f551 ("block: Change direct_access calling convention")
Signed-off-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 1e4a382fdc upstream.
For devices with multiple input queues, tiqdio_call_inq_handlers()
iterates over all input queues and clears the device's DSCI
during each iteration. If the DSCI is re-armed during one
of the later iterations, we therefore do not scan the previous
queues again.
The re-arming also raises a new adapter interrupt. But its
handler does not trigger a rescan for the device, as the DSCI
has already been erroneously cleared.
This can result in queue stalls on devices with multiple
input queues.
Fix it by clearing the DSCI just once, prior to scanning the queues.
As the code is moved in front of the loop, we also need to access
the DSCI directly (ie irq->dsci) instead of going via each queue's
parent pointer to the same irq. This is not a functional change,
and a follow-up patch will clean up the other users.
In practice, this bug only affects CQ-enabled HiperSockets devices,
ie. devices with sysfs-attribute "hsuid" set. Setting a hsuid is
needed for AF_IUCV socket applications that use HiperSockets
communication.
Fixes: 104ea556ee ("qdio: support asynchronous delivery of storage blocks")
Reviewed-by: Ursula Braun <ubraun@linux.vnet.ibm.com>
Signed-off-by: Julian Wiedmann <jwi@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 96794e4ed4 upstream.
Guest segment selector is 16 bit field and guest segment base is natural
width field. Fix two incorrect invocations accordingly.
Without this patch, build fails when aggressive inlining is used with ICC.
Signed-off-by: Chao Peng <chao.p.peng@linux.intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit e1e8a9624f upstream.
User controlled KVM guests do not support the dirty log, as they have
no single gmap that we can check for changes.
As they have no single gmap, kvm->arch.gmap is NULL and all further
referencing to it for dirty checking will result in a NULL
dereference.
Let's return -EINVAL if a caller tries to sync dirty logs for a
UCONTROL guest.
Fixes: 15f36eb ("KVM: s390: Add proper dirty bitmap support to S390 kvm.")
Signed-off-by: Janosch Frank <frankja@linux.vnet.ibm.com>
Reported-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 1c9c858e2f upstream.
The MKS Instruments SCOM-0800 and SCOM-0801 cards (originally by Tenta
Technologies) are 3U CompactPCI serial cards with 4 and 8 serial ports,
respectively. The first 4 ports are implemented by an OX16PCI954 chip,
and the second 4 ports are implemented by an OX16C954 chip on a local
bus, bridged by the second PCI function of the OX16PCI954. The ports
are jumper-selectable as RS-232 and RS-422/485, and the UARTs use a
non-standard oscillator frequency of 20 MHz (base_baud = 1250000).
Signed-off-by: Ian Abbott <abbotti@mev.co.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 82f2341c94 upstream.
Currently N_HDLC line discipline uses a self-made singly linked list for
data buffers and has n_hdlc.tbuf pointer for buffer retransmitting after
an error.
The commit be10eb7589
("tty: n_hdlc add buffer flushing") introduced racy access to n_hdlc.tbuf.
After tx error concurrent flush_tx_queue() and n_hdlc_send_frames() can put
one data buffer to tx_free_buf_list twice. That causes double free in
n_hdlc_release().
Let's use standard kernel linked list and get rid of n_hdlc.tbuf:
in case of tx error put current data buffer after the head of tx_buf_list.
Signed-off-by: Alexander Popov <alex.popov@linux.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
We see this happens multiple times in heavy workload in systrace
and AMS stuck in uid_lock.
Running process: Process 953
Running thread: android.ui
State: Uninterruptible Sleep
Start:
1,025.628 ms
Duration:
27,955.949 ms
On CPU:
Running instead: system_server
Args:
{kernel callsite when blocked:: "uid_procstat_write+0xb8/0x144"}
Changing to rt_mutex can mitigate the priority inversion
Bug: 34991231
Bug: 34193533
Test: on marlin
Change-Id: I28eb3971331cea60b1075740c792ab87d103262c
Signed-off-by: Wei Wang <wvw@google.com>
A task can cancel writes made by other tasks. In rare cases,
cancelled_write_bytes is larger than write_bytes if the task
itself didn't make any write. This doesn't affect total size
but may cause confusion when looking at IO usage on individual
tasks.
Bug: 35851986
Change-Id: If6cb549aeef9e248e18d804293401bb2b91918ca
Signed-off-by: Jin Qian <jinqian@google.com>
This module tracks cputime and io stats.
Signed-off-by: Jin Qian <jinqian@google.com>
Bug: 34198239
Change-Id: I9ee7d9e915431e0bb714b36b5a2282e1fdcc7342
IO usages are accounted in foreground and background buckets.
For each uid, io usage is calculated in two steps.
delta = current total of all uid tasks - previus total
current bucket += delta
Bucket is determined by current uid stat. Userspace writes to
/proc/uid_procstat/set <uid> <stat> when uid stat is updated.
/proc/uid_io/stats shows IO usage in this format.
<uid> <foreground IO> <background IO>
Signed-off-by: Jin Qian <jinqian@google.com>
Bug: 34198239
Change-Id: Ib8bebda53e7a56f45ea3eb0ec9a3153d44188102
commit 26c2154816 ("ANDROID: sched: Introduce Window Assisted Load
Tracking (WALT)") which was a forward port of WALT from android 4.4
missed locking for 2 set_task_cpu() calls, re-added here.
Signed-off-by: Andres Oportus <andresoportus@google.com>
(cherry picked from commit 0ddb8e0b78)
Since, arm64 can support all offset within a double word limit. Therefore,
now support other lengths within that range as well.
Signed-off-by: Pratyush Anand <panand@redhat.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Pavel Labath <labath@google.com>
Change-Id: Ibcb263a3903572336ccbf96e0180d3990326545a
Bug: 30919905
(cherry picked from commit fdfeff0f9e)
Arm64 hardware does not always report a watchpoint hit address that
matches one of the watchpoints set. It can also report an address
"near" the watchpoint if a single instruction access both watched and
unwatched addresses. There is no straight-forward way, short of
disassembling the offending instruction, to map that address back to
the watchpoint.
Previously, when the hardware reported a watchpoint hit on an address
that did not match our watchpoint (this happens in case of instructions
which access large chunks of memory such as "stp") the process would
enter a loop where we would be continually resuming it (because we did
not recognise that watchpoint hit) and it would keep hitting the
watchpoint again and again. The tracing process would never get
notified of the watchpoint hit.
This commit fixes the problem by looking at the watchpoints near the
address reported by the hardware. If the address does not exactly match
one of the watchpoints we have set, it attributes the hit to the
nearest watchpoint we have. This heuristic is a bit dodgy, but I don't
think we can do much more, given the hardware limitations.
Signed-off-by: Pavel Labath <labath@google.com>
[panand: reworked to rebase on his patches]
Signed-off-by: Pratyush Anand <panand@redhat.com>
[will: use __ffs instead of ffs - 1]
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Pavel Labath <labath@google.com>
Change-Id: I714dfaa3947d89d89a9e9a1ea84914d44ba0faa3
Bug: 30919905
ARM64 hardware supports watchpoint at any double word aligned address.
However, it can select any consecutive bytes from offset 0 to 7 from that
base address. For example, if base address is programmed as 0x420030 and
byte select is 0x1C, then access of 0x420032,0x420033 and 0x420034 will
generate a watchpoint exception.
Currently, we do not have such modularity. We can only program byte,
halfword, word and double word access exception from any base address.
This patch adds support to overcome above limitations.
Signed-off-by: Pratyush Anand <panand@redhat.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Pavel Labath <labath@google.com>
Change-Id: I28b1ca63f63182c10c3d6b6b3bacf6c56887ddbe
Bug: 30919905
(cherry picked from commit 651be3cb08)
We only support breakpoint/watchpoint of length 1, 2, 4 and 8. If we can
support other length as well, then user may watch more data with less
number of watchpoints (provided hardware supports it). For example: if we
have to watch only 4th, 5th and 6th byte from a 64 bit aligned address, we
will have to use two slots to implement it currently. One slot will watch a
half word at offset 4 and other a byte at offset 6. If we can have a
watchpoint of length 3 then we can watch it with single slot as well.
ARM64 hardware does support such functionality, therefore adding these new
definitions in generic layer.
Signed-off-by: Pratyush Anand <panand@redhat.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Pavel Labath <labath@google.com>
Change-Id: Ie17ed89ca526e4fddf591bb4e556fdfb55fc2eac
Bug: 30919905
Reapplying fix by Darren Whobrey (Change 69674)
Fixes issues: 20545, 59667 and 61390.
With prior version of f_accessory.c, UsbAccessories would not
unbind cleanly when application is closed or i/o stopped
while the usb cable is still connected. The accessory gadget
driver would be left in an invalid state which was not reset
on subsequent binding or opening. A reboot was necessary to clear.
In some phones this issues causes the phone to reboot upon
unplugging the USB cable.
Main problem was that acc_disconnect was being called on I/O error
which reset disconnected and online.
Minor fix required to properly track setting and unsetting of
disconnected and online flags. Also added urb Q wakeup's on unbind
to help unblock waiting threads.
Tested on Nexus 7 grouper. Expected behaviour now observed:
closing accessory causes blocked i/o to interrupt with IOException.
Accessory can be restarted following closing of file handle
and re-opening.
This is a generic fix that applies to all devices.
Change-Id: I4e08b326730dd3a2820c863124cee10f7cb5501e
Signed-off-by: Darren Whobrey <d.whobrey@mildai.org>
Signed-off-by: Anson Jacob <ansonjacob.aj@gmail.com>
sock_i_uid() acquires the sk_callback_lock which does not exist for
sockets in TCP_NEW_SYN_RECV state. This results in errors showing up
as spinlock bad magic. Fix this by looking for the full sock as
suggested by Eric.
Callstack for reference -
-003|rwlock_bug
-004|arch_read_lock
-004|do_raw_read_lock
-005|raw_read_lock_bh
-006|sock_i_uid
-007|from_kuid_munged(inline)
-007|reset_timer
-008|idletimer_tg_target
-009|ipt_do_table
-010|iptable_mangle_hook
-011|nf_iterate
-012|nf_hook_slow
-013|NF_HOOK_COND(inline)
-013|ip_output
-014|ip_local_out
-015|ip_build_and_send_pkt
-016|tcp_v4_send_synack
-017|atomic_sub_return(inline)
-017|reqsk_put(inline)
-017|tcp_conn_request
-018|tcp_v4_conn_request
-019|tcp_rcv_state_process
-020|tcp_v4_do_rcv
-021|tcp_v4_rcv
-022|ip_local_deliver_finish
-023|NF_HOOK_THRESH(inline)
-023|NF_HOOK(inline)
-023|ip_local_deliver
-024|ip_rcv_finish
-025|NF_HOOK_THRESH(inline)
-025|NF_HOOK(inline)
-025|ip_rcv
-026|deliver_skb(inline)
-026|deliver_ptype_list_skb(inline)
-026|__netif_receive_skb_core
-027|__netif_receive_skb
-028|netif_receive_skb_internal
-029|netif_receive_skb
Change-Id: Ic8f3a3d2d7af31434d1163b03971994e2125d552
Signed-off-by: Subash Abhinov Kasiviswanathan <subashab@codeaurora.org>
Cc: Eric Dumazet <edumazet@google.com>
Andoid files frequently have spaces in them, as do cmdline strings.
Replace these spaces with '_', so tools that parse these tracepoints
don't get terribly confused.
Change-Id: I1cbbedf5c803aa6a58d9b8b7836e9125683c49d1
Signed-off-by: Mohan Srinivasan <srmohan@google.com>
(cherry picked from commit 5035d5f0933758dd515327d038e5bef7e40dbaa7)
(cherry picked from commit 6f4a2453a1)