[ Upstream commit af2eca5320 ]
The incoming mode might have a missing vrefresh field if it came from
drmModeSetCrtc(), which the kernel is supposed to calculate using
drm_mode_vrefresh(). We could either use that or the adjusted_mode's
original vrefresh value.
However, we can maintain a more exact vrefresh value (not just the
integer approximation), by scaling by the ratio of our clocks.
v2: Use math suggested by Andrzej Hajda instead.
v3: Simplify math now that adjusted_mode->clock isn't padded.
v4: Drop some parens.
Signed-off-by: Eric Anholt <eric@anholt.net>
Link: https://patchwork.freedesktop.org/patch/msgid/20170815234722.20700-2-eric@anholt.net
Reviewed-by: Andrzej Hajda <a.hajda@samsung.com>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit f187851b9b ]
When failing to enter broadcast timer mode for an idle state that
requires it, a new state is selected that does not require broadcast,
but the broadcast variable remains set. This causes
tick_broadcast_exit to be called despite not having entered broadcast
mode.
This causes the WARN_ON_ONCE(!irqs_disabled()) to trigger in some
cases. It does not appear to cause problems for code today, but seems
to violate the interface so should be fixed.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 74717b28cb ]
If there is any non expired timer in the queue, the RTC alarm is never set.
This is an issue when adding a timer that expires before the next non
expired timer.
Ensure the RTC alarm is set in that case.
Fixes: 2b2f5ff00f ("rtc: interface: ignore expired timers when enqueuing new timers")
Signed-off-by: Alexandre Belloni <alexandre.belloni@free-electrons.com>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 7f3ed79188 ]
The HDMI DDC clock found in the CCU is the parent of the actual DDC
clock within the HDMI controller. That clock is also named "hdmi-ddc".
Rename the one in the CCU to "ddc". This makes more sense than renaming
the one in the HDMI controller to something else.
Fixes: c6e6c96d8f ("clk: sunxi-ng: Add A31/A31s clocks")
Signed-off-by: Chen-Yu Tsai <wens@csie.org>
Signed-off-by: Maxime Ripard <maxime.ripard@free-electrons.com>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 5e437b1d7e ]
After the loop in hns_roce_v1_mr_free_work_fn function, it is possible that
all qps will have been freed (in which case ne will be 0). If that
happens, then later in the function when we dereference hr_qp we will
get an exception. Check ne is not 0 to make sure we actually have an
hr_qp left to work on.
This patch fixes the smatch error as below:
drivers/infiniband/hw/hns/hns_roce_hw_v1.c:1009 hns_roce_v1_mr_free_work_fn()
error: we previously assumed 'hr_qp' could be null
Signed-off-by: Wei Hu (Xavier) <xavier.huwei@huawei.com>
Signed-off-by: Lijun Ou <oulijun@huawei.com>
Signed-off-by: Shaobo Xu <xushaobo2@huawei.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 1f372c7bfb ]
The NS for DAD are sent on admin up as long as a valid qdisc is found.
A race condition exists by which these packets will not egress the
interface if the operational state of the lower device is not yet up.
The solution is to delay DAD until the link is operationally up
according to RFC2863. Rather than only doing this, follow the existing
code checks by deferring IPv6 device initialization altogether. The fix
allows DAD on devices like tunnels that are controlled by userspace
control plane. The fix has no impact on regular deployments, but means
that there is no IPv6 connectivity until the port has been opened in
the case of port-based network access control, which should be
desirable.
Signed-off-by: Mike Manning <mmanning@brocade.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 17a9180994 ]
When we process VF mailboxes, the driver is likely going to also queue
up messages to the switch manager. This process merely queues up the
FIFO, but doesn't actually begin the transmission process. Because we
hold the mailbox lock during this VF processing, the PF<->SM mailbox is
not getting processed at this time. Ensure that we actually process the
PF<->SM mailbox in between each PF<->VF mailbox.
This should ensure prompt transmission of the messages queued up after
each VF message is received and handled.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Krishneil Singh <krishneil.k.singh@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit a99897f550 ]
Odroid HC1 board has built-in JMicron USB to SATA bridge, which supports
UAS protocol. Compile-in support for it (instead of enabling it as module)
to make sure that all built-in storage devices are available for rootfs.
The bridge itself also supports fallback to standard USB Mass Storage
protocol, but USB Mass Storage class doesn't bind to it when UAS is
compiled as module and modules are not (yet) available.
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 523184972b ]
With virtual PCI-Express chipsets, we now see userspace/guest drivers
trying to match the physical MPS setting to a virtual downstream port.
Of course a lone physical device surrounded by virtual interconnects
cannot make a correct decision for a proper MPS setting. Instead,
let's virtualize the MPS control register so that writes through to
hardware are disallowed. Userspace drivers like QEMU assume they can
write anything to the device and we'll filter out anything dangerous.
Since mismatched MPS can lead to AER and other faults, let's add it
to the kernel side rather than relying on userspace virtualization to
handle it.
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit c53d11f669 ]
Currently there is a bug in which the PF driver fails to inform clients
of a VF reset which then causes clients to leak resources. The bug
exists because we were incorrectly checking the I40E_VF_STATE_PRE_ENABLE
bit.
When a VF is first init we go through a reset to initialize variables
and allocate resources but we don't want to inform clients of this first
reset since the client isn't fully enabled yet so we set a state bit
signifying we're in a "pre-enabled" client state. During the first
reset we should be clearing the bit, allowing all following resets to
notify the client of the reset when the bit is not set. This patch
fixes the issue by negating the 'test_and_clear_bit' check to accurately
reflect the behavior we want.
Signed-off-by: Alan Brady <alan.brady@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 184fc2b9a8 ]
Firmware update fails with: status x17 add_status x56 on the final write
If multiple DMA buffers are used for the download, some firmware revs
have difficulty with signatures and crcs split across the dma buffer
boundaries. Resolve by making all writes be a single 4k page in length.
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 3e256ac5b1 ]
We've had support for setting both a minimum and maximum bandwidth via
.ndo_set_vf_bw since commit 883a9ccbae ("fm10k: Add support for SR-IOV
to driver", 2014-09-20).
Likely because we do not support minimum rates, the declaration
mis-ordered the "unused" parameter, which causes warnings when analyzed
with cppcheck.
Fix this warning by properly declaring the min_rate and max_rate
variables in the declaration and definition (rather than using
"unused"). Also rename "rate" to max_rate so as to clarify that we only
support setting the maximum rate.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Krishneil Singh <krishneil.k.singh@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 46d69e141d ]
If the driver is built as a module, autoload won't work because the module
alias information is not filled. So user-space can't match the registered
device with the corresponding module.
Export the module alias information using the MODULE_DEVICE_TABLE() macro.
Before this patch:
$ modinfo snd_soc_msm8916_analog | grep alias
$
After this patch:
$ modinfo snd_soc_msm8916_analog | grep alias
alias: of:N*T*Cqcom,pm8916-wcd-analog-codecC*
alias: of:N*T*Cqcom,pm8916-wcd-analog-codec
Signed-off-by: Nicolas Dechesne <nicolas.dechesne@linaro.org>
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 1ae2eaaa22 ]
As SCTP supports up to 65535 streams, that can lead to very large
allocations in sctp_stream_init(). As Xin Long noticed, systems with
small amounts of memory are more prone to not have enough memory and
dump warnings on dmesg initiated by user actions. Thus, silence them.
Also, if the reallocation of stream->out is not necessary, skip it and
keep the memory we already have.
Reported-by: Xin Long <lucien.xin@gmail.com>
Tested-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 80e4d70b06 ]
In xmon, touch_nmi_watchdog() is not expected to be checking that
other CPUs have not touched the watchdog, so the code will just call
touch_nmi_watchdog() once before re-enabling hard interrupts.
Just update our CPU's state, and ignore apparently stuck SMP threads.
Arguably touch_nmi_watchdog should check for SMP lockups, and callers
should be fixed, but that's not trivial for the input code of xmon.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 064996d62a ]
The SMP hardlockup watchdog cross-checks other CPUs for lockups, which
causes xmon headaches because it's assuming interrupts hard disabled
means no watchdog troubles. Try to improve that by calling
touch_nmi_watchdog() in obvious places where secondaries are spinning.
Also annotate these spin loops with spin_begin/end calls.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit c70458890f ]
Add pm_runtime_get_sync and pm_runtime_put calls to set_fmt callback
function. This fixes a bus error during boot when CONFIG_SUSPEND is
defined when this function gets called while the device is runtime
disabled and device registers are accessed while the clock is disabled.
Signed-off-by: Ed Blake <ed.blake@sondrel.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 664611e7e0 ]
The macro used to set the microphone bias level causes the
snd_soc_write() call to overwrite other fields in the CDC_A_MICB_1_VAL
register. The macro also does not return the proper level value
to use. This fixes this by preserving all bits from the register
that are not the level while setting the level.
Signed-off-by: Jean-François Têtu <jean-francois.tetu@savoirfairelinux.com>
Acked-by: Srinivas Kandagatla <srinivas.kandagatla@linaro.org>
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 86acc79071 ]
Previously, if an non-fatal error was reported by an endpoint, we
called report_error_detected() for the endpoint, every sibling on the
bus, and their descendents. If any of them did not implement the
.error_detected() method, do_recovery() failed, leaving all these
devices unrecovered.
For example, the system described in the bugzilla below has two devices:
0000:74:02.0 [19e5:a230] SAS controller, driver has .error_detected()
0000:74:03.0 [19e5:a235] SATA controller, driver lacks .error_detected()
When a device such as 74:02.0 reported a non-fatal error, do_recovery()
failed because 74:03.0 lacked an .error_detected() method. But per PCIe
r3.1, sec 6.2.2.2.2, such an error does not compromise the Link and
does not affect 74:03.0:
Non-fatal errors are uncorrectable errors which cause a particular
transaction to be unreliable but the Link is otherwise fully functional.
Isolating Non-fatal from Fatal errors provides Requester/Receiver logic
in a device or system management software the opportunity to recover from
the error without resetting the components on the Link and disturbing
other transactions in progress. Devices not associated with the
transaction in error are not impacted by the error.
Report non-fatal errors only to the endpoint that reported them. We really
want to check for AER_NONFATAL here, but the current code structure doesn't
allow that. Looking for pci_channel_io_normal is the best we can do now.
Link: https://bugzilla.kernel.org/show_bug.cgi?id=197055
Fixes: 6c2b374d74 ("PCI-Express AER implemetation: AER core and aerdriver")
Signed-off-by: Gabriele Paoloni <gabriele.paoloni@huawei.com>
Signed-off-by: Dongdong Liu <liudongdong3@huawei.com>
[bhelgaas: changelog]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit be664cbefc ]
Currently, when setting up the IRQ for a q_vector, we set an affinity
hint based on the v_idx of that q_vector. Meaning a loop iterates on
v_idx, which is an incremental value, and the cpumask is created based
on this value.
This is a problem in systems with multiple logical CPUs per core (like in
simultaneous multithreading (SMT) scenarios). If we disable some logical
CPUs, by turning SMT off for example, we will end up with a sparse
cpu_online_mask, i.e., only the first CPU in a core is online, and
incremental filling in q_vector cpumask might lead to multiple offline
CPUs being assigned to q_vectors.
Example: if we have a system with 8 cores each one containing 8 logical
CPUs (SMT == 8 in this case), we have 64 CPUs in total. But if SMT is
disabled, only the 1st CPU in each core remains online, so the
cpu_online_mask in this case would have only 8 bits set, in a sparse way.
In general case, when SMT is off the cpu_online_mask has only C bits set:
0, 1*N, 2*N, ..., C*(N-1) where
C == # of cores;
N == # of logical CPUs per core.
In our example, only bits 0, 8, 16, 24, 32, 40, 48, 56 would be set.
Instead, we should only assign hints for CPUs which are online. Even
better, the kernel already provides a function, cpumask_local_spread()
which takes an index and returns a CPU, spreading the interrupts across
local NUMA nodes first, and then remote ones if necessary.
Since we generally have a 1:1 mapping between vectors and CPUs, there
is no real advantage to spreading vectors to local CPUs first. In order
to avoid mismatch of the default XPS hints, we'll pass -1 so that it
spreads across all CPUs without regard to the node locality.
Note that we don't need to change the q_vector->affinity_mask as this is
initialized to cpu_possible_mask, until an actual affinity is set and
then notified back to us.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 227630cccd ]
This commit fixes 2 issues with host-wake irq trigger type handling
in hci_bcm:
1) bcm_setup_sleep sets sleep_params.host_wake_active based on
bcm_device.irq_polarity, but bcm_request_irq was always requesting
IRQF_TRIGGER_RISING as trigger type independent of irq_polarity.
This was a problem when the irq is described as a GpioInt rather then
an Interrupt in the DSDT as for GpioInt-s the value passed to request_irq
is honored. This commit fixes this by requesting the correct trigger
type depending on bcm_device.irq_polarity.
2) bcm_device.irq_polarity was used to directly store an ACPI polarity
value (ACPI_ACTIVE_*). This is undesirable because hci_bcm is also
used with device-tree and checking for something like ACPI_ACTIVE_LOW
in a non ACPI specific function like bcm_request_irq feels wrong.
This commit fixes this by renaming irq_polarity to irq_active_low
and changing its type to a bool.
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 52ca7d0f7b ]
The PCA9552 lines can be used either for driving LEDs or as GPIOs. The
manual states that for LEDs, the operation is open-drain:
The LSn LED select registers determine the source of the LED data.
00 = output is set LOW (LED on)
01 = output is set high-impedance (LED off; default)
10 = output blinks at PWM0 rate
11 = output blinks at PWM1 rate
For GPIOs it suggests a pull-up so that the open-case drives the line
high:
For use as output, connect external pull-up resistor to the pin
and size it according to the DC recommended operating
characteristics. LED output pin is HIGH when the output is
programmed as high-impedance, and LOW when the output is
programmed LOW through the ‘LED selector’ register. The output
can be pulse-width controlled when PWM0 or PWM1 are used.
Now, I have a hardware design that uses the LED controller to control
LEDs. However, for $reasons, we're using the leds-gpio driver to drive
the them. The reasons are here are a tangent but lead to the discovery
of the inversion, which manifested as the LEDs being set to full
brightness at boot when we expected them to be off.
As we're driving the LEDs through leds-gpio, this means wending our way
through the gpiochip abstractions. So with that in mind we need to
describe an active-low GPIO configuration to drive the LEDs as though
they were GPIOs.
The set() gpiochip callback in leds-pca955x does the following:
...
if (val)
pca955x_led_set(&led->led_cdev, LED_FULL);
else
pca955x_led_set(&led->led_cdev, LED_OFF);
...
Where LED_FULL = 255. pca955x_led_set() in turn does:
...
switch (value) {
case LED_FULL:
ls = pca955x_ledsel(ls, ls_led, PCA955X_LS_LED_ON);
break;
...
Where PCA955X_LS_LED_ON is defined as:
#define PCA955X_LS_LED_ON 0x0 /* Output LOW */
So here we have some type confusion: We've crossed domains from GPIO
behaviour to LED behaviour without accounting for possible inversions
in the process.
Stepping back to leds-gpio for a moment, during probe() we call
create_gpio_led(), which eventually executes:
if (template->default_state == LEDS_GPIO_DEFSTATE_KEEP) {
state = gpiod_get_value_cansleep(led_dat->gpiod);
if (state < 0)
return state;
} else {
state = (template->default_state == LEDS_GPIO_DEFSTATE_ON);
}
...
ret = gpiod_direction_output(led_dat->gpiod, state);
In the devicetree the GPIO is annotated as active-low, and
gpiod_get_value_cansleep() handles this for us:
int gpiod_get_value_cansleep(const struct gpio_desc *desc)
{
int value;
might_sleep_if(extra_checks);
VALIDATE_DESC(desc);
value = _gpiod_get_raw_value(desc);
if (value < 0)
return value;
if (test_bit(FLAG_ACTIVE_LOW, &desc->flags))
value = !value;
return value;
}
_gpiod_get_raw_value() in turn calls through the get() callback for the
gpiochip implementation, so returning to our get() implementation in
leds-pca955x we find we extract the raw value from hardware:
static int pca955x_gpio_get_value(struct gpio_chip *gc, unsigned int offset)
{
struct pca955x *pca955x = gpiochip_get_data(gc);
struct pca955x_led *led = &pca955x->leds[offset];
u8 reg = pca955x_read_input(pca955x->client, led->led_num / 8);
return !!(reg & (1 << (led->led_num % 8)));
}
This behaviour is not symmetric with that of set(), where the val is
inverted by the driver.
Closing the loop on the GPIO_ACTIVE_LOW inversions,
gpiod_direction_output(), like gpiod_get_value_cansleep(), handles it
for us:
int gpiod_direction_output(struct gpio_desc *desc, int value)
{
VALIDATE_DESC(desc);
if (test_bit(FLAG_ACTIVE_LOW, &desc->flags))
value = !value;
else
value = !!value;
return _gpiod_direction_output_raw(desc, value);
}
All-in-all, with a value of 'keep' for default-state property in a
leds-gpio child node, the current state of the hardware will in-fact be
inverted; precisely the opposite of what was intended.
Rework leds-pca955x so that we avoid the incorrect inversion and clarify
the semantics with respect to GPIO.
Signed-off-by: Andrew Jeffery <andrew@aj.id.au>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Tested-by: Joel Stanley <joel@jms.id.au>
Tested-by: Matt Spinler <mspinler@linux.vnet.ibm.com>
Signed-off-by: Jacek Anaszewski <jacek.anaszewski@gmail.com>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit a94b9367e0 ]
After rwlock is replaced with rcu and spinlock, ip6_pol_route() will be
called with only rcu held. That means rt6 route deletion could happen
simultaneously with rt6_make_pcpu_rt(). This could potentially cause
memory leak if rt6_release() is called right before rt6_make_pcpu_rt()
on the same route.
This patch grabs rt->rt6i_ref safely before calling rt6_make_pcpu_rt()
to make sure rt6_release() will not get triggered while
rt6_make_pcpu_rt() is in progress. And rt6_release() is called after
rt6_make_pcpu_rt() is finished.
Note: As we are incrementing rt->rt6i_ref in ip6_pol_route(), there is a
very slim chance that fib6_purge_rt() will be triggered unnecessarily
when deleting a route if ip6_pol_route() running on another thread picks
this route as well and tries to make pcpu cache for it.
Signed-off-by: Wei Wang <weiwan@google.com>
Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit d1d90147c9 ]
Since commit 4ad23a9764 ("MD: use per-cpu counter for writes_pending"),
the wait_queue is only got invoked if THREAD_WAKEUP is not set previously.
With above change, I can see process_metadata_update could always hang on
the wait queue, because mddev->thread could stay on 'D' status and the
THREAD_WAKEUP flag is not cleared since there are lots of place to wake up
mddev->thread. Then deadlock happened as follows:
linux175:~ # ps aux|grep md|grep D
root 20117 0.0 0.0 0 0 ? D 03:45 0:00 [md0_raid1]
root 20125 0.0 0.0 0 0 ? D 03:45 0:00 [md0_cluster_rec]
linux175:~ # cat /proc/20117/stack
[<ffffffffa0635604>] dlm_lock_sync+0x94/0xd0 [md_cluster]
[<ffffffffa0635674>] lock_token+0x34/0xd0 [md_cluster]
[<ffffffffa0635804>] metadata_update_start+0x64/0x110 [md_cluster]
[<ffffffffa04d985b>] md_update_sb.part.58+0x9b/0x860 [md_mod]
[<ffffffffa04da035>] md_update_sb+0x15/0x30 [md_mod]
[<ffffffffa04dc066>] md_check_recovery+0x266/0x490 [md_mod]
[<ffffffffa06450e2>] raid1d+0x42/0x810 [raid1]
[<ffffffffa04d2252>] md_thread+0x122/0x150 [md_mod]
[<ffffffff81091741>] kthread+0x101/0x140
linux175:~ # cat /proc/20125/stack
[<ffffffffa0636679>] recv_daemon+0x3f9/0x5c0 [md_cluster]
[<ffffffffa04d2252>] md_thread+0x122/0x150 [md_mod]
[<ffffffff81091741>] kthread+0x101/0x140
So let's revert the part of code in the commit to resovle the problem since
we can't get lots of benefits of previous change.
Fixes: 4ad23a9764 ("MD: use per-cpu counter for writes_pending")
Signed-off-by: Guoqing Jiang <gqjiang@suse.com>
Signed-off-by: Shaohua Li <shli@fb.com>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit b5dc5d4d1f ]
Similarly to CFQ, BFQ has its write-throttling heuristics, and it
is better not to combine them with further write-throttling
heuristics of a different nature.
So this commit disables write-back throttling for a device if BFQ
is used as I/O scheduler for that device.
Signed-off-by: Luca Miccio <lucmiccio@gmail.com>
Signed-off-by: Paolo Valente <paolo.valente@linaro.org>
Tested-by: Oleksandr Natalenko <oleksandr@natalenko.name>
Tested-by: Lee Tibbert <lee.tibbert@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 4831ca9e4a ]
The allocation for elem may fail (especially because we're using
GFP_ATOMIC) so best to check for a null return. This fixes a potential
null pointer dereference when assigning elem->pool.
Detected by CoverityScan CID#1357507 ("Dereference null return value")
Fixes: 8700e3e7c4 ("Soft RoCE driver")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit e72a060151 ]
Introduce register mask for data-ready status register since
pressure sensors (e.g. LPS22HB) export just two channels
(BIT(0) and BIT(1)) and BIT(2) is marked reserved while in
st_sensors_new_samples_available() value read from status register
is masked using 0x7.
Moreover do not mask status register using active_scan_mask since
now status value is properly masked and if the result is not zero the
interrupt has to be consumed by the driver. This fix an issue on LPS25H
and LPS331AP where channel definition is swapped respect to status
register.
Furthermore that change allows to properly support new devices
(e.g LIS2DW12) that report just ZYXDA (data-ready) field in status register
to figure out if the interrupt has been generated by the device.
Fixes: 97865fe413 (iio: st_sensors: verify interrupt event to status)
Signed-off-by: Lorenzo Bianconi <lorenzo.bianconi@st.com>
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 784548c40d ]
This patch replaces hash_for_each function with hash_for_each_safe
when calling __i40e_del_filter. The hash_for_each_safe function is
the right one to use when iterating over a hash table to safely remove
a hash entry. Otherwise, incorrect values may be read from freed memory.
Detected by CoverityScan, CID 1402048 Read from pointer after free
Signed-off-by: Lihong Yang <lihong.yang@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 035ed07208 ]
On some i.MX6 platforms which do not have speed grading
check, opp table will not be created in platform code,
so cpufreq driver prints the following error message:
cpu cpu0: dev_pm_opp_get_opp_count: OPP table not found (-19)
However, this is not really an error in this case because the
imx6q-cpufreq driver first calls dev_pm_opp_get_opp_count()
and if it fails, it means that platform code does not provide
OPP and then dev_pm_opp_of_add_table() will be called.
In order to avoid such confusing error message, move it to
debug level.
It is up to the caller of dev_pm_opp_get_opp_count() to check its
return value and decide if it will print an error or not.
Signed-off-by: Fabio Estevam <fabio.estevam@nxp.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 27d6162944 ]
When creating virtual functions, create the "virtfn%u" and "physfn" links
in sysfs *before* attaching the driver instead of after. When we attach
the driver to the new virtual network interface first, there is a race when
the driver attaches to the new sends out an "add" udev event, and the
network interface naming software (biosdevname or systemd, for example)
tries to look at these links.
Signed-off-by: Stuart Hayes <stuart.w.hayes@gmail.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 2ce9a36452 ]
Whenever an I/O for a RAID volume fails with IOCStatus
MPI2_IOCSTATUS_SCSI_IOC_TERMINATED and SCSIStatus equal to
(MPI2_SCSI_STATE_TERMINATED | MPI2_SCSI_STATE_NO_SCSI_STATUS) then
return the I/O to SCSI midlayer with "DID_RESET" (i.e. retry the IO
infinite times) set in the host byte.
Previously, the driver was completing the I/O with "DID_SOFT_ERROR"
which causes the I/O to be quickly retried. However, firmware needed
more time and hence I/Os were failing.
Signed-off-by: Sreekanth Reddy <Sreekanth.Reddy@broadcom.com>
Reviewed-by: Tomas Henzl <thenzl@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 357027786f ]
When checking to see if a PCI bus can safely be reset, we previously
checked to see if any of the children had their PCI_DEV_FLAGS_NO_BUS_RESET
flag set. Children marked with that flag are known not to behave well
after a bus reset.
Some PCIe root port bridges also do not behave well after a bus reset,
sometimes causing the devices behind the bridge to become unusable.
Add a check for PCI_DEV_FLAGS_NO_BUS_RESET being set in the bridge device
to allow these bridges to be flagged, and prevent their secondary buses
from being reset.
Signed-off-by: David Daney <david.daney@cavium.com>
[jglauber@cavium.com: fixed typo]
Signed-off-by: Jan Glauber <jglauber@cavium.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 46bea48ac2 ]
The kvm slabs can consume a significant amount of system memory
and indeed in our production environment we have observed that
a lot of machines are spending significant amount of memory that
can not be left as system memory overhead. Also the allocations
from these slabs can be triggered directly by user space applications
which has access to kvm and thus a buggy application can leak
such memory. So, these caches should be accounted to kmemcg.
Signed-off-by: Shakeel Butt <shakeelb@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 778f81d6cd ]
If crypto4xx is used in conjunction with dm-crypt, the available
ring buffer elements are not enough to handle the load properly.
On an aes-cbc-essiv:sha256 encrypted swap partition the read
performance is abyssal: (tested with hdparm -t)
/dev/mapper/swap_crypt:
Timing buffered disk reads: 14 MB in 3.68 seconds = 3.81 MB/sec
The patch increases both PPC4XX_NUM_SD and PPC4XX_NUM_PD to 256.
This improves the performance considerably:
/dev/mapper/swap_crypt:
Timing buffered disk reads: 104 MB in 3.03 seconds = 34.31 MB/sec
Furthermore, PPC4XX_LAST_SD, PPC4XX_LAST_GD and PPC4XX_LAST_PD
can be easily calculated from their respective PPC4XX_NUM_*
constant.
Signed-off-by: Christian Lamparter <chunkeey@gmail.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit d51fe3ba97 ]
The post-divider for the audio PLL is in bits [29:26], as specified
in the user manual, not [19:16] as currently programmed in the code.
The post-divider has a default register value of 2, i.e. a divider
of 3. This means the clock rate fed to the audio codec would be off.
This was discovered when porting sigma-delta modulation for the PLL
to sun5i, which needs the post-divider to be 1.
Fix the bit offset, so we do actually force the post-divider to a
certain value.
Fixes: 5e73761786 ("clk: sunxi-ng: Add sun5i CCU driver")
Signed-off-by: Chen-Yu Tsai <wens@csie.org>
Signed-off-by: Maxime Ripard <maxime.ripard@free-electrons.com>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>