[ Upstream commit 8759fec4af ]
Currently, inner IV/DIGEST data are only copied once into the hash
engines and not set explicitly before launching a request that is not a
first frag. This is an issue especially when multiple ahash reqs are
computed in parallel or chained with cipher request, as the state of the
request being computed is not updated into the hash engine. It leads to
non-deterministic corrupted digest results.
Fixes: commit 2786cee8e5 ("crypto: marvell - Move SRAM I/O operations to step functions")
Signed-off-by: Romain Perier <romain.perier@free-electrons.com>
Acked-by: Boris Brezillon <boris.brezillon@free-electrons.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 0ea617a298 ]
On an error, snd_ctl_add already free's kctrl, so calling snd_ctl_free_one
to free it again leads to a double free error. Fix this by removing
the extraneous snd_ctl_free_one call.
Issue found using static analysis with CoverityScan, CID 1372908
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Acked-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 3e38df136e ]
BUG: KASAN: slab-out-of-bounds in nf_tables_rule_destroy+0xf1/0x130 at addr ffff88006a4c35c8
Read of size 8 by task nft/1607
When we've destroyed last valid expr, nft_expr_next() returns an invalid expr.
We must not dereference it unless it passes != nft_expr_last() check.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit c2e756ff9e ]
Using smp_processor_id() causes splats with PREEMPT_RCU:
[19379.552780] BUG: using smp_processor_id() in preemptible [00000000] code: ping/32389
[19379.552793] caller is debug_smp_processor_id+0x17/0x19
[...]
[19379.552823] Call Trace:
[19379.552832] [<ffffffff81274e9e>] dump_stack+0x67/0x90
[19379.552837] [<ffffffff8129a4d4>] check_preemption_disabled+0xe5/0xf5
[19379.552842] [<ffffffff8129a4fb>] debug_smp_processor_id+0x17/0x19
[19379.552849] [<ffffffffa07c42dd>] nft_queue_eval+0x35/0x20c [nft_queue]
No need to disable preemption since we only fetch the numeric value, so
let's use raw_smp_processor_id() instead.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 91ca1a8c58 ]
At the end of function ad7150_write_event_config(), directly returns 0.
As a result, the errors will be ignored by the callers. It may be better
to return variable "ret".
Signed-off-by: Pan Bian <bianpan2016@163.com>
Signed-off-by: Jonathan Cameron <jic23@kernel.org>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit db4e5376d0 ]
In function cm3232_reg_init(), it returns 0 even if the last call to
i2c_smbus_write_byte_data() returns a negative value (indicates error).
As a result, the return value may be inconsistent with the execution
status, and the caller of cm3232_reg_init() will not be able to detect
the error. This patch fixes the bug.
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=188641
Signed-off-by: Pan Bian <bianpan2016@163.com>
Signed-off-by: Jonathan Cameron <jic23@kernel.org>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit d15697de60 ]
The driver does not check if mapping dma memory succeed.
The patch adds the checks and failure handling.
Found by Linux Driver Verification project (linuxtesting.org).
Signed-off-by: Alexey Khoroshilov <khoroshilov@ispras.ru>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 76f43b4c0a ]
mesh_sync_offset_adjust_tbtt() implements Extensible synchronization
framework ([1] 13.13.2 Extensible synchronization framework). It shall
not operate the flag "TBTT Adjusting subfield" ([1] 8.4.2.100.8 Mesh
Capability), since it is used only for MBCA ([1] 13.13.4 Mesh beacon
collision avoidance, see 13.13.4.4.3 TBTT scanning and adjustment
procedures for detail). So this patch remove the flag operations.
[1] IEEE Std 802.11 2012
Signed-off-by: Masashi Honma <masashi.honma@gmail.com>
[remove adjusting_tbtt entirely, since it's now unused]
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit ad6d8004fa ]
Currently the chip name buffer is allocated on the stack and the
address of the buffer is passed to the gpio framework. It's invalid
after probe() returns, so the sysfs label attribute displays garbage.
Use devm_kasprintf() for each string instead.
Signed-off-by: Bartosz Golaszewski <bgolaszewski@baylibre.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 972aa2c708 ]
Setting shutup when the action is HDA_FIXUP_ACT_PRE_PROBE might
not have the desired effect since it could be overridden by
another more generic shutup function. Prevent this by setting
the more specific shutup function on HDA_FIXUP_ACT_PROBE.
Signed-off-by: Gabriele Mazzotta <gabriele.mzt@gmail.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 486b5c22ea ]
With the added support for the bnxt_re RDMA driver, both drivers can be
allocating completion rings in any order. The firmware does not know
which completion ring should be receiving async events. Add an
extra step to tell firmware the completion ring number for receiving
async events after bnxt_en allocates the completion rings.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 097e46d2ae ]
ath10k_wmi_tlv_op_pull_fw_stats() uses tb = ath10k_wmi_tlv_parse_alloc(...)
function, which allocates memory. If any of the three error-paths are
taken, this tb needs to be freed.
Signed-off-by: Christian Lamparter <chunkeey@googlemail.com>
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit d2e202c06c ]
With command to get board_id from otp, in the case of following
boot get otp board id result 0x00000000 board_id 0 chip_id 0
boot using board name 'bus=pci,bmi-chip-id=0,bmi-board-id=0"
...
failed to fetch board data for bus=pci,bmi-chip-id=0,bmi-board-id=0 from
ath10k/QCA6174/hw3.0/board-2.bin
The invalid board_id=0 will be used as index to search in the board-2.bin.
Ignore the case with board_id=0, as it means the otp is not carrying
the board id information.
Signed-off-by: Ryan Hsu <ryanhsu@qca.qualcomm.com>
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 88407beb1b ]
Ath10k reports the phy capability that supports P2P_DEVICE interface.
When we use the P2P supported wpa_supplicant to start connection, it'll
create two interfaces, one is wlan0 (vdev_id=0) and one is P2P_DEVICE
p2p-dev-wlan0 which is for p2p control channel (vdev_id=1).
ath10k_pci mac vdev create 0 (add interface) type 2 subtype 0
ath10k_add_interface: vdev_id: 0, txpower: 0, bss_power: 0
...
ath10k_pci mac vdev create 1 (add interface) type 2 subtype 1
ath10k_add_interface: vdev_id: 1, txpower: 0, bss_power: 0
And the txpower in per vif bss_conf will only be set to valid tx power when
the interface is assigned with channel_ctx.
But this P2P_DEVICE interface will never be used for any connection, so
that the uninitialized bss_conf.txpower=0 is assinged to the
arvif->txpower when interface created.
Since the txpower configuration is firmware per physical interface.
So the smallest txpower of all vifs will be the one limit the tx power
of the physical device, that causing the low txpower issue on other
active interfaces.
wlan0: Limiting TX power to 21 (24 - 3) dBm
ath10k_pci mac vdev_id 0 txpower 21
ath10k_mac_txpower_recalc: vdev_id: 1, txpower: 0
ath10k_mac_txpower_recalc: vdev_id: 0, txpower: 21
ath10k_pci mac txpower 0
This issue only happens when we use the wpa_supplicant that supports
P2P or if we use the iw tool to create the control P2P_DEVICE interface.
Signed-off-by: Ryan Hsu <ryanhsu@qca.qualcomm.com>
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 74c8719b8e ]
If we have sdio work requests received when sdio card reset is
happening, we may end up accessing older save_adapter pointer
later which is already freed during card reset.
This patch solves the problem by cancelling those pending requests.
Signed-off-by: Amitkumar Karwar <akarwar@marvell.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 7bb387c5ab ]
IP_MULTICAST_IF fails if sk_bound_dev_if is already set and the new index
does not match it. e.g.,
ntpd[15381]: setsockopt IP_MULTICAST_IF 192.168.1.23 fails: Invalid argument
Relax the check in setsockopt to allow setting mc_index to an L3 slave if
sk_bound_dev_if points to an L3 master.
Make a similar change for IPv6. In this case change the device lookup to
take the rcu_read_lock avoiding a refcnt. The rcu lock is also needed for
the lookup of a potential L3 master device.
This really only silences a setsockopt failure since uses of mc_index are
secondary to sk_bound_dev_if if it is set. In both cases, if either index
is an L3 slave or master, lookups are directed to the same FIB table so
relaxing the check at setsockopt time causes no harm.
Patch is based on a suggested change by Darwin for a problem noted in
their code base.
Suggested-by: Darwin Dingel <darwin.dingel@alliedtelesis.co.nz>
Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit dffd0cfa06 ]
As part of an effort to clean up fscrypt-related error codes, make
FS_IOC_SET_ENCRYPTION_POLICY fail with ENOTDIR when the file descriptor
does not refer to a directory. This is more descriptive than EINVAL,
which was ambiguous with some of the other error cases.
I am not aware of any users who might be relying on the previous error
code of EINVAL, which was never documented anywhere, and in some buggy
kernels did not exist at all as the S_ISDIR() check was missing.
This failure case will be exercised by an xfstest.
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 54475f531b ]
As part of an effort to clean up fscrypt-related error codes, make
attempting to create a file in an encrypted directory that hasn't been
"unlocked" fail with ENOKEY. Previously, several error codes were used
for this case, including ENOENT, EACCES, and EPERM, and they were not
consistent between and within filesystems. ENOKEY is a better choice
because it expresses that the failure is due to lacking the encryption
key. It also matches the error code returned when trying to open an
encrypted regular file without the key.
I am not aware of any users who might be relying on the previous
inconsistent error codes, which were never documented anywhere.
This failure case will be exercised by an xfstest.
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit fc318d64f3 ]
The zx_dma driver supports cyclic transfer mode. Let's set DMA_CYCLIC
cap_mask bit to make that clear, and avoid unnecessary failure when
clients request channel via dma_request_chan_by_mask() with DMA_CYCLIC
bit set in mask.
Signed-off-by: Shawn Guo <shawn.guo@linaro.org>
Reviewed-by: Jun Nie <jun.nie@linaro.org>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 790d929b54 ]
When adjusting PLL_CPUX on A33, the PLL is temporarily driven too high,
and the system hangs.
Add a notifier to avoid this situation by temporarily switching to a
known stable 24 MHz oscillator.
Signed-off-by: Icenowy Zheng <icenowy@aosc.xyz>
Signed-off-by: Maxime Ripard <maxime.ripard@free-electrons.com>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 70421257c0 ]
As the SPDIF was rarely documented on the earlier Allwinner SoCs
it was assumed that it had a similar clock register to the one
described in the H3 User Manual.
However this is not the case and it looks to shares the same setup
as the I2S clock registers.
Signed-off-by: Marcus Cooper <codekipper@gmail.com>
Signed-off-by: Maxime Ripard <maxime.ripard@free-electrons.com>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 0f0861e31e ]
If 'sun4i_backend_drm_format_to_layer()' does not return 0, then 'val' is
left unmodified.
As it is not initialized either, the return value can be anything.
It is likely that returning the error code was expected here.
As the only caller of 'sun4i_backend_update_layer_formats()' does not check
the return value, this fix is purely theorical.
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Signed-off-by: Maxime Ripard <maxime.ripard@free-electrons.com>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 977509f7c5 ]
Previously we didn't check the type of device before trying to apply Type 1
(PCI-X) or Type 2 (PCIe) Setting Records from _HPX.
We don't support PCI-X Setting Records, so this was harmless, but the
warning was useless.
We do support PCIe Setting Records, and we didn't check whether a device
was PCIe before applying settings. I don't think anything bad happened on
non-PCIe devices because pcie_capability_clear_and_set_word(),
pcie_cap_has_lnkctl(), etc., would fail before doing any harm. But it's
ugly to depend on those internals.
Check the device type before attempting to apply Type 1 and Type 2 Setting
Records (Type 0 records are applicable to PCI, PCI-X, and PCIe devices).
A side benefit is that this prevents useless "not supported" warnings when
a BIOS supplies a Type 1 (PCI-X) Setting Record and we try to apply it to
every single device:
pci 0000:00:00.0: PCI-X settings not supported
After this patch, we'll get the warning only when a BIOS supplies a Type 1
record and we have a PCI-X device to which it should be applied.
Link: https://bugzilla.kernel.org/show_bug.cgi?id=187731
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 584a8279a4 ]
The first message to a remote node should prompt a new
connection even if it is RDMA operation. For RDMA operation
the MR mapping can fail because connections is not yet up.
Since the connection establishment is asynchronous,
we make sure the map failure because of unavailable
connection reach to the user by appropriate error code.
Before returning to the user, lets trigger the connection
so that its ready for the next retry.
Signed-off-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 4aea7a5c5e upstream.
When e1000e_poll() is not fast enough to keep up with incoming traffic, the
adapter (when operating in msix mode) raises the Other interrupt to signal
Receiver Overrun.
This is a double problem because 1) at the moment e1000_msix_other()
assumes that it is only called in case of Link Status Change and 2) if the
condition persists, the interrupt is repeatedly raised again in quick
succession.
Ideally we would configure the Other interrupt to not be raised in case of
receiver overrun but this doesn't seem possible on this adapter. Instead,
we handle the first part of the problem by reverting to the practice of
reading ICR in the other interrupt handler, like before commit 16ecba59bc
("e1000e: Do not read ICR in Other interrupt"). Thanks to commit
0a8047ac68 ("e1000e: Fix msi-x interrupt automask") which cleared IAME
from CTRL_EXT, reading ICR doesn't interfere with RxQ0, TxQ0 interrupts
anymore. We handle the second part of the problem by not re-enabling the
Other interrupt right away when there is overrun. Instead, we wait until
traffic subsides, napi polling mode is exited and interrupts are
re-enabled.
Reported-by: Lennart Sorensen <lsorense@csclub.uwaterloo.ca>
Fixes: 16ecba59bc ("e1000e: Do not read ICR in Other interrupt")
Signed-off-by: Benjamin Poirier <bpoirier@suse.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: Amit Pundir <amit.pundir@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 19110cfbb3 upstream.
Lennart reported the following race condition:
\ e1000_watchdog_task
\ e1000e_has_link
\ hw->mac.ops.check_for_link() === e1000e_check_for_copper_link
/* link is up */
mac->get_link_status = false;
/* interrupt */
\ e1000_msix_other
hw->mac.get_link_status = true;
link_active = !hw->mac.get_link_status
/* link_active is false, wrongly */
This problem arises because the single flag get_link_status is used to
signal two different states: link status needs checking and link status is
down.
Avoid the problem by using the return value of .check_for_link to signal
the link status to e1000e_has_link().
Reported-by: Lennart Sorensen <lsorense@csclub.uwaterloo.ca>
Signed-off-by: Benjamin Poirier <bpoirier@suse.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: Amit Pundir <amit.pundir@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 9523feac27 upstream.
Because userspace gets Very Unhappy when calls like stat() and execve()
return -EINTR on 9p filesystem mounts. For instance, when bash is
looking in PATH for things to execute and some SIGCHLD interrupts
stat(), bash can throw a spurious 'command not found' since it doesn't
retry the stat().
In practice, hitting the problem is rare and needs a really
slow/bogged down 9p server.
Signed-off-by: Tuomas Tynkkynen <tuomas@tuxera.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit a0b3bc8553 upstream.
fscrypt_initialize(), which allocates the global bounce page pool when
an encrypted file is first accessed, uses "double-checked locking" to
try to avoid locking fscrypt_init_mutex. However, it doesn't use any
memory barriers, so it's theoretically possible for a thread to observe
a bounce page pool which has not been fully initialized. This is a
classic bug with "double-checked locking".
While "only a theoretical issue" in the latest kernel, in pre-4.8
kernels the pointer that was checked was not even the last to be
initialized, so it was easily possible for a crash (NULL pointer
dereference) to happen. This was changed only incidentally by the large
refactor to use fs/crypto/.
Solve both problems in a trivial way that can easily be backported: just
always take the mutex. It's theoretically less efficient, but it
shouldn't be noticeable in practice as the mutex is only acquired very
briefly once per encrypted file.
Later I'd like to make this use a helper macro like DO_ONCE(). However,
DO_ONCE() runs in atomic context, so we'd need to add a new macro that
allows blocking.
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 4bdced5c9a upstream.
When a CPU lowers its priority (schedules out a high priority task for a
lower priority one), a check is made to see if any other CPU has overloaded
RT tasks (more than one). It checks the rto_mask to determine this and if so
it will request to pull one of those tasks to itself if the non running RT
task is of higher priority than the new priority of the next task to run on
the current CPU.
When we deal with large number of CPUs, the original pull logic suffered
from large lock contention on a single CPU run queue, which caused a huge
latency across all CPUs. This was caused by only having one CPU having
overloaded RT tasks and a bunch of other CPUs lowering their priority. To
solve this issue, commit:
b6366f048e ("sched/rt: Use IPI to trigger RT task push migration instead of pulling")
changed the way to request a pull. Instead of grabbing the lock of the
overloaded CPU's runqueue, it simply sent an IPI to that CPU to do the work.
Although the IPI logic worked very well in removing the large latency build
up, it still could suffer from a large number of IPIs being sent to a single
CPU. On a 80 CPU box, I measured over 200us of processing IPIs. Worse yet,
when I tested this on a 120 CPU box, with a stress test that had lots of
RT tasks scheduling on all CPUs, it actually triggered the hard lockup
detector! One CPU had so many IPIs sent to it, and due to the restart
mechanism that is triggered when the source run queue has a priority status
change, the CPU spent minutes! processing the IPIs.
Thinking about this further, I realized there's no reason for each run queue
to send its own IPI. As all CPUs with overloaded tasks must be scanned
regardless if there's one or many CPUs lowering their priority, because
there's no current way to find the CPU with the highest priority task that
can schedule to one of these CPUs, there really only needs to be one IPI
being sent around at a time.
This greatly simplifies the code!
The new approach is to have each root domain have its own irq work, as the
rto_mask is per root domain. The root domain has the following fields
attached to it:
rto_push_work - the irq work to process each CPU set in rto_mask
rto_lock - the lock to protect some of the other rto fields
rto_loop_start - an atomic that keeps contention down on rto_lock
the first CPU scheduling in a lower priority task
is the one to kick off the process.
rto_loop_next - an atomic that gets incremented for each CPU that
schedules in a lower priority task.
rto_loop - a variable protected by rto_lock that is used to
compare against rto_loop_next
rto_cpu - The cpu to send the next IPI to, also protected by
the rto_lock.
When a CPU schedules in a lower priority task and wants to make sure
overloaded CPUs know about it. It increments the rto_loop_next. Then it
atomically sets rto_loop_start with a cmpxchg. If the old value is not "0",
then it is done, as another CPU is kicking off the IPI loop. If the old
value is "0", then it will take the rto_lock to synchronize with a possible
IPI being sent around to the overloaded CPUs.
If rto_cpu is greater than or equal to nr_cpu_ids, then there's either no
IPI being sent around, or one is about to finish. Then rto_cpu is set to the
first CPU in rto_mask and an IPI is sent to that CPU. If there's no CPUs set
in rto_mask, then there's nothing to be done.
When the CPU receives the IPI, it will first try to push any RT tasks that is
queued on the CPU but can't run because a higher priority RT task is
currently running on that CPU.
Then it takes the rto_lock and looks for the next CPU in the rto_mask. If it
finds one, it simply sends an IPI to that CPU and the process continues.
If there's no more CPUs in the rto_mask, then rto_loop is compared with
rto_loop_next. If they match, everything is done and the process is over. If
they do not match, then a CPU scheduled in a lower priority task as the IPI
was being passed around, and the process needs to start again. The first CPU
in rto_mask is sent the IPI.
This change removes this duplication of work in the IPI logic, and greatly
lowers the latency caused by the IPIs. This removed the lockup happening on
the 120 CPU machine. It also simplifies the code tremendously. What else
could anyone ask for?
Thanks to Peter Zijlstra for simplifying the rto_loop_start atomic logic and
supplying me with the rto_start_trylock() and rto_start_unlock() helper
functions.
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Clark Williams <williams@redhat.com>
Cc: Daniel Bristot de Oliveira <bristot@redhat.com>
Cc: John Kacur <jkacur@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Scott Wood <swood@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20170424114732.1aac6dc4@gandalf.local.home
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 9cac9d2fb2 upstream.
VIDIOC_DQEVENT and VIDIOC_QUERY_EXT_CTRL should give the same output for
the control flags field.
This patch creates a new function user_flags(), that calculates the user
exported flags value (which is different than the kernel internal flags
structure). This function is then used by all the code that exports the
internal flags to userspace.
Reported-by: Dimitrios Katsaros <patcherwork@gmail.com>
Signed-off-by: Ricardo Ribalda Delgado <ricardo.ribalda@gmail.com>
Signed-off-by: Hans Verkuil <hans.verkuil@cisco.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 3e45067f94 upstream.
The ioctl LIRC_SET_REC_TIMEOUT would set a timeout of 704ns if called
with a timeout of 4294968us.
Signed-off-by: Sean Young <sean@mess.org>
Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>