[ Upstream commit 42c8eb3f6e ]
The driver may sleep under a spinlock, and the function call path is:
vt6655_suspend (acquire the spinlock)
pci_set_power_state
__pci_start_power_transition (drivers/pci/pci.c)
msleep --> may sleep
To fix it, pci_set_power_state is called without having a spinlock.
This bug is found by my static analysis tool and my code review.
Signed-off-by: Jia-Ju Bai <baijiaju1990@163.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 235b6003fb ]
When reshaping a fully degraded raid5/raid6 to a larger
nubmer of devices, the new device(s) are not in-sync
and so that can make the newly grown stripe appear to be
"failed".
To avoid this, we set the R5_Expanded flag to say "Even though
this device is not fully in-sync, this block is safe so
don't treat the device as failed for this stripe".
This flag is set for data devices, not not for parity devices.
Consequently, if you have a RAID6 with two devices that are partly
recovered and a spare, and start a reshape to include the spare,
then when the reshape gets past the point where the recovery was
up to, it will think the stripes are failed and will get into
an infinite loop, failing to make progress.
So when contructing parity on an EXPAND_READY stripe,
set R5_Expanded.
Reported-by: Curt <lightspd@gmail.com>
Signed-off-by: NeilBrown <neilb@suse.com>
Signed-off-by: Shaohua Li <shli@fb.com>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 1c363531dd ]
The build robot is complaining on Blackfin:
drivers/pinctrl/pinctrl-adi2.c: In function 'port_setup':
>> drivers/pinctrl/pinctrl-adi2.c:221:21: error: dereferencing
pointer to incomplete type 'struct gpio_port_t'
writew(readw(®s->port_fer) & ~BIT(offset),
^~
drivers/pinctrl/pinctrl-adi2.c: In function 'adi_gpio_ack_irq':
>> drivers/pinctrl/pinctrl-adi2.c:266:18: error: dereferencing
pointer to incomplete type 'struct bfin_pint_regs'
if (readl(®s->invert_set) & pintbit)
^~
It seems the driver need to include <asm/gpio.h> and <asm/irq.h>
to compile.
The Blackfin architecture was re-defining the Kconfig
PINCTRL symbol which is not OK, so replaced this with
PINCTRL_BLACKFIN_ADI2 which selects PINCTRL and PINCTRL_ADI2
just like most arches do.
Further, the old GPIO driver symbol GPIO_ADI was possible to
select at the same time as selecting PINCTRL. This was not
working because the arch-local <asm/gpio.h> header contains
an explicit #ifndef PINCTRL clause making compilation break
if you combine them. The same is true for DEBUG_MMRS.
Make sure the ADI2 pinctrl driver is not selected at the same
time as the old GPIO implementation. (This should be converted
to use gpiolib or pincontrol and move to drivers/...) Also make
sure the old GPIO_ADI driver or DEBUG_MMRS is not selected at
the same time as the new PINCTRL implementation, and only make
PINCTRL_ADI2 selectable for the Blackfin families that actually
have it.
This way it is still possible to add e.g. I2C-based pin
control expanders on the Blackfin.
Cc: Steven Miao <realmz6@gmail.com>
Cc: Huanhuan Feng <huanhuan.feng@analog.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit bd3486ded7 upstream.
When babble condition happens, the musb controller might automatically
turns off VBUS. On DA8xx platform, the controller generates drvvbus
interrupt for turning off VBUS along with the babble interrupt.
In this case, we should handle the babble interrupt first and recover
from the babble condition.
This change ignores the drvvbus interrupt if babble interrupt is also
generated at the same time, so the babble recovery routine works
properly.
Signed-off-by: Bin Liu <b-liu@ti.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 05c14c0313 ]
In the hv-24x7 code there is a function memord() which tries to
implement a sort function return -1, 0, 1. However one of the
conditions is incorrect, such that it can never be true, because we
will have already returned.
I don't believe there is a bug in practice though, because the
comparisons are an optimisation prior to calling memcmp().
Fix it by swapping the second comparision, so it can be true.
Reported-by: David Binderman <dcb314@hotmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit dfb2e6f46b ]
This patch cleans up a lot of warnings when unloading the driver.
A current example of the stack trace starts with:
[ 142.570715] sysfs group 'power' not found for kobject 'port-5:0'
There can be hundreds of these messages during a driver unload.
I am resubmitting this patch on behalf of Martin Wilck with his
permission.
His original patch can be found here:
https://www.spinics.net/lists/linux-scsi/msg102085.html
This patch did not help until Hannes's
commit 9441284fbc39 ("scsi-fixup-kernel-warning-during-rmmod")
was applied to the kernel.
---------------------------
Original patch description:
---------------------------
Unloading the hpsa driver causes warnings
[ 1063.793652] WARNING: CPU: 1 PID: 4850 at ../fs/sysfs/group.c:237 device_del+0x54/0x240()
[ 1063.793659] sysfs group ffffffff81cf21a0 not found for kobject 'port-2:0'
with two different stacks:
1)
[ 1063.793774] [<ffffffff81448af4>] device_del+0x54/0x240
[ 1063.793780] [<ffffffff8145178a>] transport_remove_classdev+0x4a/0x60
[ 1063.793784] [<ffffffff81451216>] attribute_container_device_trigger+0xa6/0xb0
[ 1063.793802] [<ffffffffa0105d46>] sas_port_delete+0x126/0x160 [scsi_transport_sas]
[ 1063.793819] [<ffffffffa036ebcc>] hpsa_free_sas_port+0x3c/0x70 [hpsa]
2)
[ 1063.797103] [<ffffffff81448af4>] device_del+0x54/0x240
[ 1063.797118] [<ffffffffa0105d4e>] sas_port_delete+0x12e/0x160 [scsi_transport_sas]
[ 1063.797134] [<ffffffffa036ebcc>] hpsa_free_sas_port+0x3c/0x70 [hpsa]
This is caused by the fact that host device hostX is deleted before the
SAS transport devices hostX/port-a:b.
This patch fixes this by reverting the order of device deletions.
Tested-by: Don Brace <don.brace@microsemi.com>
Reviewed-by: Don Brace <don.brace@microsemi.com>
Signed-off-by: Martin Wilck <mwilck@suse.de>
Signed-off-by: Don Brace <don.brace@microsemi.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 55ca38b425 ]
I am resubmitting this patch on behalf of Martin Wilck with his
permission.
The original patch can be found here:
https://www.spinics.net/lists/linux-scsi/msg102083.html
This patch did not help until Hannes's
commit 9441284fbc39 ("scsi-fixup-kernel-warning-during-rmmod")
was applied to the kernel.
--------------------------------------
Original patch description from Martin:
--------------------------------------
When the hpsa module is unloaded using rmmod, dangling
symlinks remain under /sys/class/sas_phy. Fix this by
calling sas_phy_delete() rather than sas_phy_free (which,
according to comments, should not be called for PHYs that
have been set up successfully, anyway).
Tested-by: Don Brace <don.brace@microsemi.com>
Reviewed-by: Don Brace <don.brace@microsemi.com>
Signed-off-by: Martin Wilck <mwilck@suse.de>
Signed-off-by: Don Brace <don.brace@microsemi.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 16b6c8bb68 ]
When removing a device, for example a VF being removed due to SR-IOV
teardown, a "soft" hot-unplug via 'echo 1 > remove' in sysfs, or an actual
hot-unplug, we first remove the procfs and sysfs attributes for the device
before attempting to release the device from any driver bound to it.
Unbinding the driver from the device can take time. The device might need
to write out data or it might be actively in use. If it's in use by
userspace through a vfio driver, the unbind might block until the user
releases the device. This leads to a potentially non-trivial amount of
time where the device exists, but we've torn down the interfaces that
userspace uses to examine devices, for instance lspci might generate this
sort of error:
pcilib: Cannot open /sys/bus/pci/devices/0000:01:0a.3/config
lspci: Unable to read the standard configuration space header of device 0000:01:0a.3
We don't seem to have any dependence on this teardown ordering in the
kernel, so let's unbind the driver first, which is also more symmetric with
the instantiation of the device in pci_bus_add_device().
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 5e422f5e4f ]
There was one spot in xfs_bmap_add_extent_unwritten_real that didn't use the
passed in new extent state but always converted to normal, leading to wrong
behavior when converting from normal to unwritten.
Only found by code inspection, it seems like this code path to move partial
extent from written to unwritten while merging it with the next extent is
rarely exercised.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 9f2a450580 ]
It is possible for mkfs to format very small filesystems with too
small of an internal log with respect to the various minimum size
and block count requirements. If this occurs when the log happens to
be smaller than the scan window used for cycle verification and the
scan wraps the end of the log, the start_blk calculation in
xlog_find_head() underflows and leads to an attempt to scan an
invalid range of log blocks. This results in log recovery failure
and a failed mount.
Since there may be filesystems out in the wild with this kind of
geometry, we cannot simply refuse to mount. Instead, cap the scan
window for cycle verification to the size of the physical log. This
ensures that the cycle verification proceeds as expected when the
scan wraps the end of the log.
Reported-by: Zorro Lang <zlang@redhat.com>
Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit c157313791 ]
Currently, Cache missed IOs are identified by s->cache_miss, but actually,
there are many situations that missed IOs are not assigned a value for
s->cache_miss in cached_dev_cache_miss(), for example, a bypassed IO
(s->iop.bypass = 1), or the cache_bio allocate failed. In these situations,
it will go to out_put or out_submit, and s->cache_miss is null, which leads
bch_mark_cache_accounting() to treat this IO as a hit IO.
[ML: applied by 3-way merge]
Signed-off-by: tang.junhui <tang.junhui@zte.com.cn>
Reviewed-by: Michael Lyle <mlyle@lyle.org>
Reviewed-by: Coly Li <colyli@suse.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 330a4db89d ]
mutex_destroy does nothing most of time, but it's better to call
it to make the code future proof and it also has some meaning
for like mutex debug.
As Coly pointed out in a previous review, bcache_exit() may not be
able to handle all the references properly if userspace registers
cache and backing devices right before bch_debug_init runs and
bch_debug_init failes later. So not exposing userspace interface
until everything is ready to avoid that issue.
Signed-off-by: Liang Chen <liangchen.linux@gmail.com>
Reviewed-by: Michael Lyle <mlyle@lyle.org>
Reviewed-by: Coly Li <colyli@suse.de>
Reviewed-by: Eric Wheeler <bcache@linux.ewheeler.net>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit cc555b09d8 ]
This patch fixes a deadlock caused when the jdata flag is set for
inodes that are already on the ordered write list. Since it is
on the ordered write list, log_flush calls gfs2_ordered_write which
calls filemap_fdatawrite. But since the inode had the jdata flag
set, that calls gfs2_jdata_writepages, which tries to start a new
transaction. A new transaction cannot be started because it tries
to acquire the log_flush rwsem which is already locked by the log
flush operation.
The bottom line is: We cannot switch an inode from ordered to jdata
until we eliminate any ordered data pages (via log flush) or any
log_flush operation afterward will create the circular dependency
above. So we need to flush the log before setting the diskflags to
switch the file mode, then we need to remove the inode from the
ordered writes list.
Before this patch, the log flush was done for jdata->ordered, but
that's wrong. If we're going from jdata to ordered, we don't need
to call gfs2_log_flush because the call to filemap_fdatawrite will
do it for us:
filemap_fdatawrite() -> __filemap_fdatawrite_range()
__filemap_fdatawrite_range() -> do_writepages()
do_writepages() -> gfs2_jdata_writepages()
gfs2_jdata_writepages() -> gfs2_log_flush()
This patch modifies function do_gfs2_set_flags so that if a file
has its jdata flag set, and it's already on the ordered write list,
the log will be flushed and it will be removed from the list
before setting the flag.
Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Acked-by: Abhijith Das <adas@redhat.com>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 07209fcf33 ]
There is a particular situation when the cooling device is cpufreq and the heat
dissipation is not efficient enough where the temperature increases little by
little until reaching the critical threshold and leading to a SoC reset.
The behavior is reproducible on a hikey6220 with bad heat dissipation (eg.
stacked with other boards).
Running a simple C program doing while(1); for each CPU of the SoC makes the
temperature to reach the passive regulation trip point and ends up to the
maximum allowed temperature followed by a reset.
This issue has been also reported by running the libhugetlbfs test suite.
What is observed is a ping pong between two cpu frequencies, 1.2GHz and 900MHz
while the temperature continues to grow.
It appears the step wise governor calls get_target_state() the first time with
the throttle set to true and the trend to 'raising'. The code selects logically
the next state, so the cpu frequency decreases from 1.2GHz to 900MHz, so far so
good. The temperature decreases immediately but still stays greater than the
trip point, then get_target_state() is called again, this time with the
throttle set to true *and* the trend to 'dropping'. From there the algorithm
assumes we have to step down the state and the cpu frequency jumps back to
1.2GHz. But the temperature is still higher than the trip point, so
get_target_state() is called with throttle=1 and trend='raising' again, we jump
to 900MHz, then get_target_state() is called with throttle=1 and
trend='dropping', we jump to 1.2GHz, etc ... but the temperature does not
stabilizes and continues to increase.
[ 237.922654] thermal thermal_zone0: Trip0[type=1,temp=65000]:trend=1,throttle=1
[ 237.922678] thermal thermal_zone0: Trip1[type=1,temp=75000]:trend=1,throttle=1
[ 237.922690] thermal cooling_device0: cur_state=0
[ 237.922701] thermal cooling_device0: old_target=0, target=1
[ 238.026656] thermal thermal_zone0: Trip0[type=1,temp=65000]:trend=2,throttle=1
[ 238.026680] thermal thermal_zone0: Trip1[type=1,temp=75000]:trend=2,throttle=1
[ 238.026694] thermal cooling_device0: cur_state=1
[ 238.026707] thermal cooling_device0: old_target=1, target=0
[ 238.134647] thermal thermal_zone0: Trip0[type=1,temp=65000]:trend=1,throttle=1
[ 238.134667] thermal thermal_zone0: Trip1[type=1,temp=75000]:trend=1,throttle=1
[ 238.134679] thermal cooling_device0: cur_state=0
[ 238.134690] thermal cooling_device0: old_target=0, target=1
In this situation the temperature continues to increase while the trend is
oscillating between 'dropping' and 'raising'. We need to keep the current state
untouched if the throttle is set, so the temperature can decrease or a higher
state could be selected, thus preventing this oscillation.
Keeping the next_target untouched when 'throttle' is true at 'dropping' time
fixes the issue.
The following traces show the governor does not change the next state if
trend==2 (dropping) and throttle==1.
[ 2306.127987] thermal thermal_zone0: Trip0[type=1,temp=65000]:trend=1,throttle=1
[ 2306.128009] thermal thermal_zone0: Trip1[type=1,temp=75000]:trend=1,throttle=1
[ 2306.128021] thermal cooling_device0: cur_state=0
[ 2306.128031] thermal cooling_device0: old_target=0, target=1
[ 2306.231991] thermal thermal_zone0: Trip0[type=1,temp=65000]:trend=2,throttle=1
[ 2306.232016] thermal thermal_zone0: Trip1[type=1,temp=75000]:trend=2,throttle=1
[ 2306.232030] thermal cooling_device0: cur_state=1
[ 2306.232042] thermal cooling_device0: old_target=1, target=1
[ 2306.335982] thermal thermal_zone0: Trip0[type=1,temp=65000]:trend=0,throttle=1
[ 2306.336006] thermal thermal_zone0: Trip1[type=1,temp=75000]:trend=0,throttle=1
[ 2306.336021] thermal cooling_device0: cur_state=1
[ 2306.336034] thermal cooling_device0: old_target=1, target=1
[ 2306.439984] thermal thermal_zone0: Trip0[type=1,temp=65000]:trend=2,throttle=1
[ 2306.440008] thermal thermal_zone0: Trip1[type=1,temp=75000]:trend=2,throttle=0
[ 2306.440022] thermal cooling_device0: cur_state=1
[ 2306.440034] thermal cooling_device0: old_target=1, target=0
[ ... ]
After a while, if the temperature continues to increase, the next state becomes
2 which is 720MHz on the hikey. That results in the temperature stabilizing
around the trip point.
[ 2455.831982] thermal thermal_zone0: Trip0[type=1,temp=65000]:trend=1,throttle=1
[ 2455.832006] thermal thermal_zone0: Trip1[type=1,temp=75000]:trend=1,throttle=0
[ 2455.832019] thermal cooling_device0: cur_state=1
[ 2455.832032] thermal cooling_device0: old_target=1, target=1
[ 2455.935985] thermal thermal_zone0: Trip0[type=1,temp=65000]:trend=0,throttle=1
[ 2455.936013] thermal thermal_zone0: Trip1[type=1,temp=75000]:trend=0,throttle=0
[ 2455.936027] thermal cooling_device0: cur_state=1
[ 2455.936040] thermal cooling_device0: old_target=1, target=1
[ 2456.043984] thermal thermal_zone0: Trip0[type=1,temp=65000]:trend=0,throttle=1
[ 2456.044009] thermal thermal_zone0: Trip1[type=1,temp=75000]:trend=0,throttle=0
[ 2456.044023] thermal cooling_device0: cur_state=1
[ 2456.044036] thermal cooling_device0: old_target=1, target=1
[ 2456.148001] thermal thermal_zone0: Trip0[type=1,temp=65000]:trend=1,throttle=1
[ 2456.148028] thermal thermal_zone0: Trip1[type=1,temp=75000]:trend=1,throttle=1
[ 2456.148042] thermal cooling_device0: cur_state=1
[ 2456.148055] thermal cooling_device0: old_target=1, target=2
[ 2456.252009] thermal thermal_zone0: Trip0[type=1,temp=65000]:trend=2,throttle=1
[ 2456.252041] thermal thermal_zone0: Trip1[type=1,temp=75000]:trend=2,throttle=0
[ 2456.252058] thermal cooling_device0: cur_state=2
[ 2456.252075] thermal cooling_device0: old_target=2, target=1
IOW, this change is needed to keep the state for a cooling device if the
temperature trend is oscillating while the temperature increases slightly.
Without this change, the situation above leads to a catastrophic crash by a
hardware reset on hikey. This issue has been reported to happen on an OMAP
dra7xx also.
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Cc: Keerthy <j-keerthy@ti.com>
Cc: John Stultz <john.stultz@linaro.org>
Cc: Leo Yan <leo.yan@linaro.org>
Tested-by: Keerthy <j-keerthy@ti.com>
Reviewed-by: Keerthy <j-keerthy@ti.com>
Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit c68ee58d9e ]
On i.MX6 SoCs without VPU (in my case MCIMX6D4AVT10AC), the hdmi driver
fails to probe:
[ 2.540030] dwhdmi-imx 120000.hdmi: Unsupported HDMI controller
(0000:00:00)
[ 2.548199] imx-drm display-subsystem: failed to bind 120000.hdmi
(ops dw_hdmi_imx_ops): -19
[ 2.557403] imx-drm display-subsystem: master bind failed: -19
That's because hdmi_isfr's parent, video_27m, is not correctly ungated.
As explained in commit 5ccc248cc5 ("ARM: imx6q: clk: Add support for
mipi_core_cfg clock as a shared clock gate"), video_27m is gated by
CCM_CCGR3[CG8].
On i.MX6 SoCs with VPU, the hdmi is working thanks to the
CCM_CMEOR[mod_en_ov_vpu] bit which makes the video_27m ungated whatever
is in CCM_CCGR3[CG8]. The issue can be reproduced by setting
CCMEOR[mod_en_ov_vpu] to 0.
Make the HDMI work in every case by setting hdmi_isfr's parent to
mipi_core_cfg.
Signed-off-by: Sébastien Szymanski <sebastien.szymanski@armadeus.com>
Reviewed-by: Fabio Estevam <fabio.estevam@nxp.com>
Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit c955bf3998 ]
Since the previous setup always sets the PLL using crystal 26MHz, this
doesn't always happen in every MediaTek platform. So the patch added
flexibility for assigning extra member for determining the PLL source
clock.
Signed-off-by: Chen Zhong <chen.zhong@mediatek.com>
Signed-off-by: Sean Wang <sean.wang@mediatek.com>
Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 592e254502 ]
_calc_vm_trans() does not handle the situation when some of the passed
flags are 0 (which can happen if these VM flags do not make sense for
the architecture). Improve the _calc_vm_trans() macro to return 0 in
such situation. Since all passed flags are constant, this does not add
any runtime overhead.
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 594e25e734 ]
The function fd_execute_unmap() in target_core_file.c calles
ret = file->f_op->fallocate(file, mode, pos, len);
Some filesystems implement fallocate() to return error if
length is zero (e.g. btrfs) but according to SCSI Block
Commands spec UNMAP should return success for zero length.
Signed-off-by: Jiang Yi <jiangyilism@gmail.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 24528f089d ]
When is pr_reg->isid_present_at_reg is false,this function should return.
This fixes a regression originally introduced by:
commit d2843c173e
Author: Andy Grover <agrover@redhat.com>
Date: Thu May 16 10:40:55 2013 -0700
target: Alter core_pr_dump_initiator_port for ease of use
Signed-off-by: tangwenji <tang.wenji@zte.com.cn>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 71e24d7731 ]
The current code checks the completion map to look for the first token
that is complete. In some cases, a completion can come in but the
token can still be on lease to the caller processing the completion.
If this completed but unreleased token is the first token found in the
bitmap by another tasks trying to acquire a token, then the
__test_and_set_bit call will fail since the token will still be on
lease. The acquisition will then fail with an EBUSY.
This patch reorganizes the acquisition code to look at the
opal_async_token_map for an unleased token. If the token has no lease
it must have no outstanding completions so we should never see an
EBUSY, unless we have leased out too many tokens. Since
opal_async_get_token_inrerruptible is protected by a semaphore, we
will practically never see EBUSY anymore.
Fixes: 8d72482322 ("powerpc/powernv: Infrastructure to support OPAL async completion")
Signed-off-by: William A. Kennington III <wak@google.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit c5504f724c ]
Information about ipvs in different network namespace can be seen via procfs.
How to reproduce:
# ip netns add ns01
# ip netns add ns02
# ip netns exec ns01 ip a add dev lo 127.0.0.1/8
# ip netns exec ns02 ip a add dev lo 127.0.0.1/8
# ip netns exec ns01 ipvsadm -A -t 10.1.1.1:80
# ip netns exec ns02 ipvsadm -A -t 10.1.1.2:80
The ipvsadm displays information about its own network namespace only.
# ip netns exec ns01 ipvsadm -Ln
IP Virtual Server version 1.2.1 (size=4096)
Prot LocalAddress:Port Scheduler Flags
-> RemoteAddress:Port Forward Weight ActiveConn InActConn
TCP 10.1.1.1:80 wlc
# ip netns exec ns02 ipvsadm -Ln
IP Virtual Server version 1.2.1 (size=4096)
Prot LocalAddress:Port Scheduler Flags
-> RemoteAddress:Port Forward Weight ActiveConn InActConn
TCP 10.1.1.2:80 wlc
But I can see information about other network namespace via procfs.
# ip netns exec ns01 cat /proc/net/ip_vs
IP Virtual Server version 1.2.1 (size=4096)
Prot LocalAddress:Port Scheduler Flags
-> RemoteAddress:Port Forward Weight ActiveConn InActConn
TCP 0A010101:0050 wlc
TCP 0A010102:0050 wlc
# ip netns exec ns02 cat /proc/net/ip_vs
IP Virtual Server version 1.2.1 (size=4096)
Prot LocalAddress:Port Scheduler Flags
-> RemoteAddress:Port Forward Weight ActiveConn InActConn
TCP 0A010102:0050 wlc
Signed-off-by: KUWAZAWA Takuya <albatross0@gmail.com>
Acked-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit cd77b5ce20 ]
The call to /proc/cpuinfo in turn calls cpufreq_quick_get() which
returns the last frequency requested by the kernel, but may not
reflect the actual frequency the processor is running at. This patch
makes a call to cpufreq_get() instead which returns the current
frequency reported by the hardware.
Fixes: fb5153d05a ("powerpc: powernv: Implement ppc_md.get_proc_freq()")
Signed-off-by: Shriya <shriyak@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 3ad3f8ce50 ]
PCIe PME and native hotplug share the same interrupt number, so hotplug
interrupts are also processed by PME. In some cases, e.g., a Link Down
interrupt, a device may be present but unreachable, so when we try to
read its Root Status register, the read fails and we get all ones data
(0xffffffff).
Previously, we interpreted that data as PCI_EXP_RTSTA_PME being set, i.e.,
"some device has asserted PME," so we scheduled pcie_pme_work_fn(). This
caused an infinite loop because pcie_pme_work_fn() tried to handle PME
requests until PCI_EXP_RTSTA_PME is cleared, but with the link down,
PCI_EXP_RTSTA_PME can't be cleared.
Check for the invalid 0xffffffff data everywhere we read the Root Status
register.
1469d17dd3 ("PCI: pciehp: Handle invalid data when reading from
non-existent devices") added similar checks in the hotplug driver.
Signed-off-by: Qiang Zheng <zhengqiang10@huawei.com>
[bhelgaas: changelog, also check in pcie_pme_work_fn(), use "~0" to follow
other similar checks]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 288e7560e4 ]
The used 0x1f mask is only valid for am335x family of SoC, different family
using this type of crossbar might have different number of electable
events. In case of am43xx family 0x3f mask should have been used for
example.
Instead of trying to handle each family's mask, just use u8 type to store
the mux value since the event offsets are aligned to byte offset.
Fixes: 42dbdcc6bf ("dmaengine: ti-dma-crossbar: Add support for crossbar on AM33xx/AM43xx")
Signed-off-by: Peter Ujfalusi <peter.ujfalusi@ti.com>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit c987694755 ]
While usb_control_msg function expects timeout in miliseconds, a value
of HZ is used. Replace it with USB_CTRL_GET_TIMEOUT and also fix error
message which looks like:
udlfb: Read EDID byte 78 failed err ffffff92
as error is either negative errno or number of bytes transferred use %d
format specifier.
Returned EDID is in second byte, so return error when less than two bytes
are received.
Fixes: 18dffdf891 ("staging: udlfb: enhance EDID and mode handling support")
Signed-off-by: Ladislav Michl <ladis@linux-mips.org>
Cc: Bernie Thompson <bernie@plugable.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 760bf578ed ]
This fixes the following races:
1. core_alua_do_transition_tg_pt could have read
tg_pt_gp_alua_access_state and gone into this if chunk:
if (!explicit &&
atomic_read(&tg_pt_gp->tg_pt_gp_alua_access_state) ==
ALUA_ACCESS_STATE_TRANSITION) {
and then core_alua_do_transition_tg_pt_work could update the
state. core_alua_do_transition_tg_pt would then only set
tg_pt_gp_alua_pending_state and the tg_pt_gp_alua_access_state would
not get updated with the second calls state.
2. core_alua_do_transition_tg_pt could be setting
tg_pt_gp_transition_complete while the tg_pt_gp_transition_work
is already completing. core_alua_do_transition_tg_pt then waits on the
completion that will never be called.
To handle these issues, we just call flush_work which will return when
core_alua_do_transition_tg_pt_work has completed so there is no need
to do the complete/wait. And, if core_alua_do_transition_tg_pt_work
was running, instead of trying to sneak in the state change, we just
schedule up another core_alua_do_transition_tg_pt_work call.
Note that this does not handle a possible race where there are multiple
threads call core_alua_do_transition_tg_pt at the same time. I think
we need a mutex in target_tg_pt_gp_alua_access_state_store.
Signed-off-by: Mike Christie <mchristi@redhat.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit d7175373f2 ]
The implicit transition time tells initiators the min time
to wait before timing out a transition. We currently schedule
the transition to occur in tg_pt_gp_implicit_trans_secs
seconds so there is no room for delays. If
core_alua_do_transition_tg_pt_work->core_alua_update_tpg_primary_metadata
needs to write out info to a remote file, then the initiator can
easily time out the operation.
Signed-off-by: Mike Christie <mchristi@redhat.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 207ee84133 ]
If tcmu-runner is processing a STPG and needs to change the kernel's
ALUA state then we cannot use the same work queue for task management
requests and ALUA transitions, because we could deadlock. The problem
occurs when a STPG times out before tcmu-runner is able to
call into target_tg_pt_gp_alua_access_state_store->
core_alua_do_port_transition -> core_alua_do_transition_tg_pt ->
queue_work. In this case, the tmr is on the work queue waiting for
the STPG to complete, but the STPG transition is now queued behind
the waiting tmr.
Note:
This bug will also be fixed by this patch:
http://www.spinics.net/lists/target-devel/msg14560.html
which switches the tmr code to use the system workqueues.
For both, I am not sure if we need a dedicated workqueue since
it is not a performance path and I do not think we need WQ_MEM_RECLAIM
to make forward progress to free up memory like the block layer does.
Signed-off-by: Mike Christie <mchristi@redhat.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit e1699d2d7b ]
This is a story about 4 distinct (and very old) btrfs bugs.
Commit c8b978188c ("Btrfs: Add zlib compression support") added
three data corruption bugs for inline extents (bugs #1-3).
Commit 93c82d5750 ("Btrfs: zero page past end of inline file items")
fixed bug #1: uncompressed inline extents followed by a hole and more
extents could get non-zero data in the hole as they were read. The fix
was to add a memset in btrfs_get_extent to zero out the hole.
Commit 166ae5a418 ("btrfs: fix inline compressed read err corruption")
fixed bug #2: compressed inline extents which contained non-zero bytes
might be replaced with zero bytes in some cases. This patch removed an
unhelpful memset from uncompress_inline, but the case where memset is
required was missed.
There is also a memset in the decompression code, but this only covers
decompressed data that is shorter than the ram_bytes from the extent
ref record. This memset doesn't cover the region between the end of the
decompressed data and the end of the page. It has also moved around a
few times over the years, so there's no single patch to refer to.
This patch fixes bug #3: compressed inline extents followed by a hole
and more extents could get non-zero data in the hole as they were read
(i.e. bug #3 is the same as bug #1, but s/uncompressed/compressed/).
The fix is the same: zero out the hole in the compressed case too,
by putting a memset back in uncompress_inline, but this time with
correct parameters.
The last and oldest bug, bug #0, is the cause of the offending inline
extent/hole/extent pattern. Bug #0 is a subtle and mostly-harmless quirk
of behavior somewhere in the btrfs write code. In a few special cases,
an inline extent and hole are allowed to persist where they normally
would be combined with later extents in the file.
A fast reproducer for bug #0 is presented below. A few offending extents
are also created in the wild during large rsync transfers with the -S
flag. A Linux kernel build (git checkout; make allyesconfig; make -j8)
will produce a handful of offending files as well. Once an offending
file is created, it can present different content to userspace each
time it is read.
Bug #0 is at least 4 and possibly 8 years old. I verified every vX.Y
kernel back to v3.5 has this behavior. There are fossil records of this
bug's effects in commits all the way back to v2.6.32. I have no reason
to believe bug #0 wasn't present at the beginning of btrfs compression
support in v2.6.29, but I can't easily test kernels that old to be sure.
It is not clear whether bug #0 is worth fixing. A fix would likely
require injecting extra reads into currently write-only paths, and most
of the exceptional cases caused by bug #0 are already handled now.
Whether we like them or not, bug #0's inline extents followed by holes
are part of the btrfs de-facto disk format now, and we need to be able
to read them without data corruption or an infoleak. So enough about
bug #0, let's get back to bug #3 (this patch).
An example of on-disk structure leading to data corruption found in
the wild:
item 61 key (606890 INODE_ITEM 0) itemoff 9662 itemsize 160
inode generation 50 transid 50 size 47424 nbytes 49141
block group 0 mode 100644 links 1 uid 0 gid 0
rdev 0 flags 0x0(none)
item 62 key (606890 INODE_REF 603050) itemoff 9642 itemsize 20
inode ref index 3 namelen 10 name: DB_File.so
item 63 key (606890 EXTENT_DATA 0) itemoff 8280 itemsize 1362
inline extent data size 1341 ram 4085 compress(zlib)
item 64 key (606890 EXTENT_DATA 4096) itemoff 8227 itemsize 53
extent data disk byte 5367308288 nr 20480
extent data offset 0 nr 45056 ram 45056
extent compression(zlib)
Different data appears in userspace during each read of the 11 bytes
between 4085 and 4096. The extent in item 63 is not long enough to
fill the first page of the file, so a memset is required to fill the
space between item 63 (ending at 4085) and item 64 (beginning at 4096)
with zero.
Here is a reproducer from Liu Bo, which demonstrates another method
of creating the same inline extent and hole pattern:
Using 'page_poison=on' kernel command line (or enable
CONFIG_PAGE_POISONING) run the following:
# touch foo
# chattr +c foo
# xfs_io -f -c "pwrite -W 0 1000" foo
# xfs_io -f -c "falloc 4 8188" foo
# od -x foo
# echo 3 >/proc/sys/vm/drop_caches
# od -x foo
This produce the following on my box:
Correct output: file contains 1000 data bytes followed
by zeros:
0000000 cdcd cdcd cdcd cdcd cdcd cdcd cdcd cdcd
*
0001740 cdcd cdcd cdcd cdcd 0000 0000 0000 0000
0001760 0000 0000 0000 0000 0000 0000 0000 0000
*
0020000
Actual output: the data after the first 1000 bytes
will be different each run:
0000000 cdcd cdcd cdcd cdcd cdcd cdcd cdcd cdcd
*
0001740 cdcd cdcd cdcd cdcd 6c63 7400 635f 006d
0001760 5f74 6f43 7400 435f 0053 5f74 7363 7400
0002000 435f 0056 5f74 6164 7400 645f 0062 5f74
(...)
Signed-off-by: Zygo Blaxell <ce3g8jdj@umail.furryterror.org>
Reviewed-by: Liu Bo <bo.li.liu@oracle.com>
Reviewed-by: Chris Mason <clm@fb.com>
Signed-off-by: Chris Mason <clm@fb.com>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 033853325f ]
Currently client doesn't respect max sizes server returns in CREATE_SESSION.
nfs4_session_set_rwsize() gets called and server->rsize, server->wsize are 0
so they never get set to the sizes returned by the server.
Signed-off-by: Olga Kornievskaia <kolga@netapp.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 822f5845f7 ]
The Intel Compute Stick STCK1A8LFC and Weibu F3C platforms both
log 2 error messages during boot:
efi: requested map not found.
esrt: ESRT header is not in the memory map.
Searching the web, this seems to affect many other platforms too.
Since these messages are logged as errors, they appear on-screen during
the boot process even when using the "quiet" boot parameter used by
distros.
Demote the ESRT error to a warning so that it does not appear on-screen,
and delete the error logging from efi_mem_desc_lookup; both callsites
of that function log more specific messages upon failure.
Out of curiosity I looked closer at the Weibu F3C. There is no entry in
the UEFI-provided memory map which corresponds to the ESRT pointer, but
hacking the code to map it anyway, the ESRT does appear to be valid with
2 entries.
Signed-off-by: Daniel Drake <drake@endlessm.com>
Cc: Matt Fleming <matt@codeblueprint.co.uk>
Acked-by: Peter Jones <pjones@redhat.com>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit e7ede72a6d ]
The current symbols__fixup_end() heuristic for the last entry in the rb
tree is suboptimal as it leads to not being able to recognize the symbol
in the call graph in a couple of corner cases, for example:
i) If the symbol has a start address (f.e. exposed via kallsyms)
that is at a page boundary, then the roundup(curr->start, 4096)
for the last entry will result in curr->start == curr->end with
a symbol length of zero.
ii) If the symbol has a start address that is shortly before a page
boundary, then also here, curr->end - curr->start will just be
very few bytes, where it's unrealistic that we could perform a
match against.
Instead, change the heuristic to roundup(curr->start, 4096) + 4096, so
that we can catch such corner cases and have a better chance to find
that specific symbol. It's still just best effort as the real end of the
symbol is unknown to us (and could even be at a larger offset than the
current range), but better than the current situation.
Alexei reported that he recently run into case i) with a JITed eBPF
program (these are all page aligned) as the last symbol which wasn't
properly shown in the call graph (while other eBPF program symbols in
the rb tree were displayed correctly). Since this is a generic issue,
lets try to improve the heuristic a bit.
Reported-and-Tested-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Fixes: 2e538c4a18 ("perf tools: Improve kernel/modules symbol lookup")
Link: http://lkml.kernel.org/r/bb5c80d27743be6f12afc68405f1956a330e1bc9.1489614365.git.daniel@iogearbox.net
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 4cbe4dac82 ]
Some Hypervisors detach VFs from VMs by instantly causing an FLR event
to be generated for a VF.
In the mlx4 case, this will cause that VF's comm channel to be disabled
before the VM has an opportunity to invoke the VF device's "shutdown"
method.
For such Hypervisors, there is a race condition between the VF's
shutdown method and its internal-error detection/reset thread.
The internal-error detection/reset thread (which runs every 5 seconds) also
detects a disabled comm channel. If the internal-error detection/reset
flow wins the race, we still get delays (while that flow tries repeatedly
to detect comm-channel recovery).
The cited commit fixed the command timeout problem when the
internal-error detection/reset flow loses the race.
This commit avoids the unneeded delays when the internal-error
detection/reset flow wins.
Fixes: d585df1c5c ("net/mlx4_core: Avoid command timeouts during VF driver device shutdown")
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Reported-by: Simon Xiao <sixiao@microsoft.com>
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>