After commit 205d5f733f ("ANDROID: GKI: Enable symbol trimming
and strict mode"), HiKey960 stopped booting properly.
Will McVicker suggested we add a new symbol list:
android/abi_gki_aarch64_hikey960
So this patch does this. However, we also ran into an issue with
one of the configs we had enabled needing filp_open() which is
no longer allowed. So this patch also disables
CONFIG_NVME_TARGET (which wasn't really necessary) to avoid
this.
Fixes: 205d5f733f ("ANDROID: GKI: Enable symbol trimming and strict mode")
Signed-off-by: John Stultz <john.stultz@linaro.org>
Change-Id: I3b7a862805f134f1dfa49e2a0c2e47ee5bb3b83a
In addition to reverting commit 7b05bf7710 ("Revert "block/mq-deadline:
Prioritize high-priority requests""), this patch uses 'jiffies' instead
of ktime_get() in the code for aging lower priority requests.
This patch has been tested as follows:
Measured QD=1/jobs=1 IOPS for nullb with the mq-deadline scheduler.
Result without and with this patch: 555 K IOPS.
Measured QD=1/jobs=8 IOPS for nullb with the mq-deadline scheduler.
Result without and with this patch: about 380 K IOPS.
Ran the following script:
set -e
scriptdir=$(dirname "$0")
if [ -e /sys/module/scsi_debug ]; then modprobe -r scsi_debug; fi
modprobe scsi_debug ndelay=1000000 max_queue=16
sd=''
while [ -z "$sd" ]; do
sd=$(basename /sys/bus/pseudo/drivers/scsi_debug/adapter*/host*/target*/*/block/*)
done
echo $((100*1000)) > "/sys/block/$sd/queue/iosched/prio_aging_expire"
if [ -e /sys/fs/cgroup/io.prio.class ]; then
cd /sys/fs/cgroup
echo restrict-to-be >io.prio.class
echo +io > cgroup.subtree_control
else
cd /sys/fs/cgroup/blkio/
echo restrict-to-be >blkio.prio.class
fi
echo $$ >cgroup.procs
mkdir -p hipri
cd hipri
if [ -e io.prio.class ]; then
echo none-to-rt >io.prio.class
else
echo none-to-rt >blkio.prio.class
fi
{ "${scriptdir}/max-iops" -a1 -d32 -j1 -e mq-deadline "/dev/$sd" >& ~/low-pri.txt & }
echo $$ >cgroup.procs
"${scriptdir}/max-iops" -a1 -d32 -j1 -e mq-deadline "/dev/$sd" >& ~/hi-pri.txt
Result:
* 11000 IOPS for the high-priority job
* 40 IOPS for the low-priority job
If the prio aging expiry time is changed from 100s into 0, the IOPS results
change into 6712 and 6796 IOPS.
The max-iops script is a script that runs fio with the following arguments:
--bs=4K --gtod_reduce=1 --ioengine=libaio --ioscheduler=${arg_e} --runtime=60
--norandommap --rw=read --thread --buffered=0 --numjobs=${arg_j}
--iodepth=${arg_d} --iodepth_batch_submit=${arg_a}
--iodepth_batch_complete=$((arg_d / 2)) --name=${positional_argument_1}
--filename=${positional_argument_1}
Cc: Damien Le Moal <damien.lemoal@wdc.com>
Cc: Niklas Cassel <Niklas.Cassel@wdc.com>
Cc: Hannes Reinecke <hare@suse.de>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
(cherry picked from commit b4d170687c4f3cdef7a0f928d9bb81b7ad1162b3 git://git.kernel.dk/linux-block/ for-5.16 block)
Change-Id: Ic236be29728a1003c7342b1f3049dec44030ef24
Signed-off-by: Bart Van Assche <bvanassche@google.com>
Calculating the sum over all CPUs of per-CPU counters frequently is
inefficient. Hence switch from per-CPU to individual counters. Three
counters are protected by the mq-deadline spinlock since these are
only accessed from contexts that already hold that spinlock. The fourth
counter is atomic because protecting it with the mq-deadline spinlock
would trigger lock contention.
Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com>
Reviewed-by: Niklas Cassel <Niklas.Cassel@wdc.com>
Cc: Hannes Reinecke <hare@suse.de>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
(cherry picked from commit 30d1c42fb962e38480a260760969f8aeb4f9a000 git://git.kernel.dk/linux-block/ for-5.16 block)
Change-Id: Id343ac31be31a38a04876c16e05e18cd224011d8
Signed-off-by: Bart Van Assche <bvanassche@google.com>
Check a statistics invariant at module unload time. When running
blktests, the invariant is verified every time a request queue is
removed and hence is verified at least once per test.
Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com>
Reviewed-by: Niklas Cassel <Niklas.Cassel@wdc.com>
Cc: Hannes Reinecke <hare@suse.de>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
(cherry picked from commit 5eb9e5202056efdf49f1fdcfd9f9f86b81a207c9 git://git.kernel.dk/linux-block/ for-5.16 block)
Change-Id: I912a080a13b7c9c5674d56ba525d97424ad72186
Signed-off-by: Bart Van Assche <bvanassche@google.com>
The scheduler .insert_requests() callback is called when a request is
queued for the first time and also when it is requeued. Only count a
request the first time it is queued. Additionally, since the mq-deadline
scheduler only performs zone locking for requests that have been
inserted, skip the zone unlock code for requests that have not been
inserted into the mq-deadline scheduler.
Fixes: 38ba64d12d ("block/mq-deadline: Track I/O statistics")
Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com>
Reviewed-by: Niklas Cassel <Niklas.Cassel@wdc.com>
Cc: Hannes Reinecke <hare@suse.de>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
(cherry picked from commit 73ac1fd7994b0ea2e62c921f413530dc6c1724ac git://git.kernel.dk/linux-block/ for-5.16 block)
Change-Id: I418f845a80bb0716400944fbd466ec3016551937
Signed-off-by: Bart Van Assche <bvanassche@google.com>
If CONFIG_BLK_DEBUG_FS=n:
block/mq-deadline.c:274:12: warning: ‘dd_queued’ defined but not used [-Wunused-function]
274 | static u32 dd_queued(struct deadline_data *dd, enum dd_prio prio)
| ^~~~~~~~~
Fix this by moving dd_queued() just before the sole function that calls
it.
Fixes: 7b05bf7710 ("Revert "block/mq-deadline: Prioritize high-priority requests"")
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Fixes: 38ba64d12d ("block/mq-deadline: Track I/O statistics")
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Link: https://lore.kernel.org/r/20210830091128.1854266-1-geert@linux-m68k.org
Signed-off-by: Jens Axboe <axboe@kernel.dk>
(cherry picked from commit 7b05bf7710)
Signed-off-by: Bart Van Assche <bvanassche@google.com>
Change-Id: Ia33ad5231b41b44d7f4038b065d18bf9734c30f2
This reverts commit fb926032b3.
Zhen reports that this commit slows down mq-deadline on a 128 thread
box, going from 258K IOPS to 170-180K. My testing shows that Optane
gen2 IOPS goes from 2.3M IOPS to 1.2M IOPS on a 64 thread box.
Looking in detail at the code, the main culprit here is needing to sum
percpu counters in the dispatch hot path, leading to very high CPU
utilization there. To make matters worse, the code currently needs to
sum 2 percpu counters, and it does so in the most naive way of iterating
possible CPUs _twice_.
Since we're close to release, revert this commit and we can re-do it
with regular per-priority counters instead for the 5.15 kernel.
Link: https://lore.kernel.org/linux-block/20210826144039.2143-1-thunder.leizhen@huawei.com/
Reported-by: Zhen Lei <thunder.leizhen@huawei.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
(cherry picked from commit 7b05bf7710)
Signed-off-by: Bart Van Assche <bvanassche@google.com>
Change-Id: Ic931ffdc0f2f790d10cf93be6e8186163bf9c75a
This reverts commit 08a9ad8bf6 ("block/mq-deadline: Add cgroup support")
and a follow-up commit c06bc5a3fb ("block/mq-deadline: Remove a
WARN_ON_ONCE() call"). The added cgroup support has the following issues:
* It breaks cgroup interface file format rule by adding custom elements to a
nested key-value file.
* It registers mq-deadline as a cgroup-aware policy even though all it's
doing is collecting per-cgroup stats. Even if we need these stats, this
isn't the right way to add them.
* It hasn't been reviewed from cgroup side.
Cc: Bart Van Assche <bvanassche@acm.org>
Cc: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Change-Id: Ic0cd1c656cd452de09602dbe1b4bad8b541bf450
(cherry picked from commit 0f78399551)
Signed-off-by: Bart Van Assche <bvanassche@google.com>
Controllers with multiple queues have their IRQ-handelers pinned to a
CPU. The core shouldn't need to complete the request on a remote CPU.
Remove this case and always raise the softirq to complete the request.
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Daniel Wagner <dwagner@suse.de>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
(cherry picked from commit 0a2efafbb1)
Signed-off-by: Bart Van Assche <bvanassche@google.com>
Change-Id: If75d3d9ec65568b7e0a2e9b75b1d8940424d976b
Improves the checks on the zones of a zoned block device done in
blk_revalidate_disk_zones() by making sure that the device report_zones
method did report at least one zone and that the zones reported exactly
cover the entire disk capacity, that is, that there are no missing zones
at the end of the disk sector range.
Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
(cherry picked from commit 2afdeb23e4)
Signed-off-by: Bart Van Assche <bvanassche@google.com>
Change-Id: Ie5c3b50fbf96493506cc23b49f49406c498e3b0c
The __UNIQUE_ID() macro causes problems as it turns out to not be
deterministic across different compiler runs as it relies on the
__COUNTER__ macro which could have been used on other .h files previous
to this .h file being included.
This shows up specifically when building with "LTO=thin" vs. "LTO=full"
as different build paths seem to be triggered.
As the structure name isn't really needed at all here, we were just
including it for older compilers that could not handle anonymous
structures in a union, just drop the whole thing which resolves the abi
naming issue.
Bug: 210255585
Reported-by: Giuliano Procida <gprocida@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: I6b9449fa9d26ffc5d66b2f0f3b41e2d5f3003f68
In 5.9 and later, pages pinned for DMA should use pin_user_pages_remote
rather than get_user_pages_remote. In particular this allows them to
interact better with page migration for CMA allocation.
Bug: 199213963
Signed-off-by: Jesse Hall <jessehall@google.com>
Change-Id: I2068acae64a996c0e81befeb820cba4e6c9f87ac
Although KVM can be compiled out of the kernel, it cannot be disabled
at runtime. Allow this possibility by introducing a new mode that
will prevent KVM from initialising.
This is useful in the (limited) circumstances where you don't want
KVM to be available (what is wrong with you?), or when you want
to install another hypervisor instead (good luck with that).
Bug: 192819132
Test: boot in EL2, pass kvm-arm.mode=none on cmdline
Link: https://lore.kernel.org/kvmarm/20210903091652.985836-1-maz@kernel.org/
Signed-off-by: Marc Zyngier <maz@kernel.org>
Change-Id: If4d2775b8f1f4119b5f390391f64c65221a17fde
struct hid_device was not being tracked as a "stable" symbol in the
past, but that looks to change with some future abi requirements. So
add needed padding now, to ensure that we can support this over the
long-term.
This does not change the existing api at all as this symbol was not
supported yet.
Bug: 151154716
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: I8c3064fb7a19006a29dabbaf25c9ed1737f62e75
When building modules(CONFIG_...=m), I found some of module versions
are incorrect and set to 0.
This can be found in build log for first clean build which shows
WARNING: EXPORT symbol "XXXX" [drivers/XXX/XXX.ko] version generation failed,
symbol will not be versioned.
But in second build(incremental build), the WARNING disappeared and the
module version becomes valid CRC and make someone who want to change
modules without updating kernel image can't insert their modules.
The problematic code is
+ $(foreach n, $(filter-out FORCE,$^), \
+ $(if $(wildcard $(n).symversions), \
+ ; cat $(n).symversions >> $@.symversions))
For example:
rm -f fs/notify/built-in.a.symversions ; rm -f fs/notify/built-in.a; \
llvm-ar cDPrST fs/notify/built-in.a fs/notify/fsnotify.o \
fs/notify/notification.o fs/notify/group.o ...
`foreach n` shows nothing to `cat` into $(n).symversions because
`if $(wildcard $(n).symversions)` return nothing, but actually
they do exist during this line was executed.
-rw-r--r-- 1 root root 168580 Jun 13 19:10 fs/notify/fsnotify.o
-rw-r--r-- 1 root root 111 Jun 13 19:10 fs/notify/fsnotify.o.symversions
The reason is the $(n).symversions are generated at runtime, but
Makefile wildcard function expends and checks the file exist or not
during parsing the Makefile.
Thus fix this by use `test` shell command to check the file
existence in runtime.
Rebase from both:
1. [https://lore.kernel.org/lkml/20210616080252.32046-1-lecopzer.chen@mediatek.com/]
2. [https://lore.kernel.org/lkml/20210702032943.7865-1-lecopzer.chen@mediatek.com/]
Fixes: 38e8918490 ("kbuild: lto: fix module versioning")
Co-developed-by: Sami Tolvanen <samitolvanen@google.com>
Signed-off-by: Lecopzer Chen <lecopzer.chen@mediatek.com>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
(cherry picked from commit 1d11053dc6)
Bug: 187129171
Signed-off-by: Connor O'Brien <connoro@google.com>
Change-Id: Ibbc839e1d31a18ec0642cbb55c3157879af68654
Since commit:
bad1e1c663 ("arm64: mte: switch GCR_EL1 in kernel entry and exit")
we saved/restored the user GCR_EL1 value at exception boundaries, and
update_gcr_el1_excl() is no longer used for this. However it is used to
restore the kernel's GCR_EL1 value when returning from a suspend state.
Thus, the comment is misleading (and an ISB is necessary).
When restoring the kernel's GCR value, we need an ISB to ensure this is
used by subsequent instructions. We don't necessarily get an ISB by
other means (e.g. if the kernel is built without support for pointer
authentication). As __cpu_setup() initialised GCR_EL1.Exclude to 0xffff,
until a context synchronization event, allocation tag 0 may be used
rather than the desired set of tags.
This patch drops the misleading comment, adds the missing ISB, and for
clarity folds update_gcr_el1_excl() into its only user.
Fixes: bad1e1c663 ("arm64: mte: switch GCR_EL1 in kernel entry and exit")
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Andrey Konovalov <andreyknvl@gmail.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Vincenzo Frascino <vincenzo.frascino@arm.com>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20210714143843.56537-2-mark.rutland@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
(cherry picked from commit 59f44069e0)
Bug: 187129171
Signed-off-by: Connor O'Brien <connoro@google.com>
Change-Id: I56d7512ca889025a16b72ea9c08fd7d39f5478ae
scmi_resp_sensor_reading_complete structure is meant to represent an
SCMI asynchronous reading complete message. The readings field with
a 64bit type forces padding and breaks reads in scmi_sensor_reading_get.
Split it in two adjacent 32bit readings_low/high subfields to avoid the
padding within the structure. Alternatively we could to mark the structure
packed.
Link: https://lore.kernel.org/r/20210628170042.34105-1-cristian.marussi@arm.com
Fixes: e2083d3673 ("firmware: arm_scmi: Add SCMI v3.0 sensors timestamped reads")
Signed-off-by: Cristian Marussi <cristian.marussi@arm.com>
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
(cherry picked from commit 187a002b07)
Bug: 187129171
Signed-off-by: Connor O'Brien <connoro@google.com>
Change-Id: Icf803023a528e0219bcf8b42b73b3ec45366ea23
v4l2_ctrl_new_std() fails if the caller provides no 'step' parameter for
integer control, so define it to fix following error:
s5p_mfc_dec_ctrls_setup:1166: Adding control (1) failed
Fixes: c3042bff91 ("media: s5p-mfc: Use display delay and display enable std controls")
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
(cherry picked from commit 61c6f04a98)
Bug: 187129171
Signed-off-by: Connor O'Brien <connoro@google.com>
Change-Id: I50de27537f5136100c122aba6667b35cac1db18d
The function software_node_notify() - the function that creates
and removes the symlinks between the node and the device - was
called unconditionally in device_add_software_node() and
device_remove_software_node(), but it needs to be called in
those functions only in the special case where the node is
added to a device that has already been registered.
This fixes NULL pointer dereference that happens if
device_remove_software_node() is used with device that was
never registered.
Fixes: b622b24519 ("software node: Allow node addition to already existing device")
Reported-and-tested-by: Dominik Brodowski <linux@dominikbrodowski.net>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Heikki Krogerus <heikki.krogerus@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
(cherry picked from commit 5dca69e26f)
Bug: 187129171
Signed-off-by: Connor O'Brien <connoro@google.com>
Change-Id: I93ae2de517a385f732b31c01895acf807d67ece7
commit 6579c8d97a ("clk: Mark fwnodes when their clock provider is added")
revealed that clk/bcm/clk-raspberrypi.c driver calls
devm_of_clk_add_hw_provider(), with a NULL dev->of_node, which resulted in a
NULL pointer dereference in of_clk_add_hw_provider() when calling
fwnode_dev_initialized().
Returning 0 is reducing the if conditions in driver code and is being
consistent with the CONFIG_OF=n inline stub that returns 0 when CONFIG_OF
is disabled. The downside is that drivers will maybe register clkdev lookups
when they don't need to and waste some memory.
Fixes: 6579c8d97a ("clk: Mark fwnodes when their clock provider is added")
Fixes: 3c9ea42802 ("clk: Mark fwnodes when their clock provider is added/removed")
Reported-by: Marek Szyprowski <m.szyprowski@samsung.com>
Tested-by: Guenter Roeck <linux@roeck-us.net>
Tested-by: Nathan Chancellor <nathan@kernel.org>
Reviewed-by: Stephen Boyd <sboyd@kernel.org>
Reviewed-by: Saravana Kannan <saravanak@google.com>
Reviewed-by: Nicolas Saenz Julienne <nsaenz@kernel.org>
Signed-off-by: Tudor Ambarus <tudor.ambarus@microchip.com>
Link: https://lore.kernel.org/r/20210426065618.588144-1-tudor.ambarus@microchip.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit bb4031b8af)
Bug: 187129171
Signed-off-by: Connor O'Brien <connoro@google.com>
Change-Id: I3aec52abf070284bf8ab5223d0a08ee50eccd74b
Commit 34dc2efb39 ("memblock: fix section mismatch warning") marked
memblock_bottom_up() and memblock_set_bottom_up() as __init, but they
could be referenced from non-init functions like
memblock_find_in_range_node() on architectures that enable
CONFIG_ARCH_KEEP_MEMBLOCK.
For such builds kernel test robot reports:
WARNING: modpost: vmlinux.o(.text+0x74fea4): Section mismatch in reference from the function memblock_find_in_range_node() to the function .init.text:memblock_bottom_up()
The function memblock_find_in_range_node() references the function __init memblock_bottom_up().
This is often because memblock_find_in_range_node lacks a __init annotation or the annotation of memblock_bottom_up is wrong.
Replace __init annotations with __init_memblock annotations so that the
appropriate section will be selected depending on
CONFIG_ARCH_KEEP_MEMBLOCK.
Link: https://lore.kernel.org/lkml/202103160133.UzhgY0wt-lkp@intel.com
Link: https://lkml.kernel.org/r/20210316171347.14084-1-rppt@kernel.org
Fixes: 34dc2efb39 ("memblock: fix section mismatch warning")
Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
Reviewed-by: Arnd Bergmann <arnd@arndb.de>
Reported-by: kernel test robot <lkp@intel.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Acked-by: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
(cherry picked from commit a024b7c285)
Bug: 187129171
Signed-off-by: Connor O'Brien <connoro@google.com>
Change-Id: I365419baf230ba9123e751cd6f14ebbdb715cd75
When FUSE passthrough is used, the lower file system file is manipulated
directly, but neither mtime, atime or ctime of the referencing FUSE file
is updated.
Fix by updating the file times when passthrough operations are
performed.
Bug: 200779468
Reported-by: Fengnan Chang <changfengnan@vivo.com>
Reported-by: Ed Tsai <ed.tsai@mediatek.com>
Signed-off-by: Alessio Balsini <balsini@google.com>
Change-Id: I35b72196b2cc1d79a9f62ddb32e2cfa934c3b6d3
This reverts commit cac70628fb.
This functionality is provided by upstream commit:
a353397b0d ("usb: gadget: f_fs: Re-use SS descriptors for SuperSpeedPlus")
Signed-off-by: Lee Jones <lee.jones@linaro.org>
Change-Id: I02ad16f4ac031f87e1567c00daaa3f6213a0a06d
(cherry picked from commit 4e307150ee)
After commit 205d5f733f ("ANDROID: GKI: Enable symbol trimming
and strict mode"), db845c stopped booting properly.
Suggested solution is to add a new symbol list for db845c:
android/abi_gki_aarch64_db845c
Bug: 146449535
Fixes: 205d5f733f ("ANDROID: GKI: Enable symbol trimming and strict mode")
Signed-off-by: Amit Pundir <amit.pundir@linaro.org>
Change-Id: I88bcce520479eec92a8f76b002229b3fb6ea86f8
Changes in 5.10.68
drm/bridge: lt9611: Fix handling of 4k panels
btrfs: fix upper limit for max_inline for page size 64K
io_uring: ensure symmetry in handling iter types in loop_rw_iter()
xen: reset legacy rtc flag for PV domU
bnx2x: Fix enabling network interfaces without VFs
arm64/sve: Use correct size when reinitialising SVE state
PM: base: power: don't try to use non-existing RTC for storing data
PCI: Add AMD GPU multi-function power dependencies
drm/amd/amdgpu: Increase HWIP_MAX_INSTANCE to 10
drm/etnaviv: return context from etnaviv_iommu_context_get
drm/etnaviv: put submit prev MMU context when it exists
drm/etnaviv: stop abusing mmu_context as FE running marker
drm/etnaviv: keep MMU context across runtime suspend/resume
drm/etnaviv: exec and MMU state is lost when resetting the GPU
drm/etnaviv: fix MMU context leak on GPU reset
drm/etnaviv: reference MMU context when setting up hardware state
drm/etnaviv: add missing MMU context put when reaping MMU mapping
s390/sclp: fix Secure-IPL facility detection
x86/pat: Pass valid address to sanitize_phys()
x86/mm: Fix kern_addr_valid() to cope with existing but not present entries
tipc: fix an use-after-free issue in tipc_recvmsg
ethtool: Fix rxnfc copy to user buffer overflow
net/{mlx5|nfp|bnxt}: Remove unnecessary RTNL lock assert
net-caif: avoid user-triggerable WARN_ON(1)
ptp: dp83640: don't define PAGE0
dccp: don't duplicate ccid when cloning dccp sock
net/l2tp: Fix reference count leak in l2tp_udp_recv_core
r6040: Restore MDIO clock frequency after MAC reset
tipc: increase timeout in tipc_sk_enqueue()
drm/rockchip: cdn-dp-core: Make cdn_dp_core_resume __maybe_unused
perf machine: Initialize srcline string member in add_location struct
net/mlx5: FWTrace, cancel work on alloc pd error flow
net/mlx5: Fix potential sleeping in atomic context
nvme-tcp: fix io_work priority inversion
events: Reuse value read using READ_ONCE instead of re-reading it
net: ipa: initialize all filter table slots
gen_compile_commands: fix missing 'sys' package
vhost_net: fix OoB on sendmsg() failure.
net/af_unix: fix a data-race in unix_dgram_poll
net: dsa: destroy the phylink instance on any error in dsa_slave_phy_setup
x86/uaccess: Fix 32-bit __get_user_asm_u64() when CC_HAS_ASM_GOTO_OUTPUT=y
tcp: fix tp->undo_retrans accounting in tcp_sacktag_one()
selftest: net: fix typo in altname test
qed: Handle management FW error
udp_tunnel: Fix udp_tunnel_nic work-queue type
dt-bindings: arm: Fix Toradex compatible typo
ibmvnic: check failover_pending in login response
KVM: PPC: Book3S HV: Tolerate treclaim. in fake-suspend mode changing registers
bnxt_en: make bnxt_free_skbs() safe to call after bnxt_free_mem()
net: hns3: pad the short tunnel frame before sending to hardware
net: hns3: change affinity_mask to numa node range
net: hns3: disable mac in flr process
net: hns3: fix the timing issue of VF clearing interrupt sources
mm/memory_hotplug: use "unsigned long" for PFN in zone_for_pfn_range()
dt-bindings: mtd: gpmc: Fix the ECC bytes vs. OOB bytes equation
mfd: db8500-prcmu: Adjust map to reality
PCI: Add ACS quirks for NXP LX2xx0 and LX2xx2 platforms
fuse: fix use after free in fuse_read_interrupt()
PCI: tegra194: Fix handling BME_CHGED event
PCI: tegra194: Fix MSI-X programming
PCI: tegra: Fix OF node reference leak
mfd: Don't use irq_create_mapping() to resolve a mapping
PCI: rcar: Fix runtime PM imbalance in rcar_pcie_ep_probe()
tracing/probes: Reject events which have the same name of existing one
PCI: cadence: Use bitfield for *quirk_retrain_flag* instead of bool
PCI: cadence: Add quirk flag to set minimum delay in LTSSM Detect.Quiet state
PCI: j721e: Add PCIe support for J7200
PCI: j721e: Add PCIe support for AM64
PCI: Add ACS quirks for Cavium multi-function devices
watchdog: Start watchdog in watchdog_set_last_hw_keepalive only if appropriate
octeontx2-af: Add additional register check to rvu_poll_reg()
Set fc_nlinfo in nh_create_ipv4, nh_create_ipv6
net: usb: cdc_mbim: avoid altsetting toggling for Telit LN920
block, bfq: honor already-setup queue merges
PCI: ibmphp: Fix double unmap of io_mem
ethtool: Fix an error code in cxgb2.c
NTB: Fix an error code in ntb_msit_probe()
NTB: perf: Fix an error code in perf_setup_inbuf()
s390/bpf: Fix optimizing out zero-extensions
s390/bpf: Fix 64-bit subtraction of the -0x80000000 constant
s390/bpf: Fix branch shortening during codegen pass
mfd: axp20x: Update AXP288 volatile ranges
backlight: ktd253: Stabilize backlight
PCI: of: Don't fail devm_pci_alloc_host_bridge() on missing 'ranges'
PCI: iproc: Fix BCMA probe resource handling
netfilter: Fix fall-through warnings for Clang
netfilter: nft_ct: protect nft_ct_pcpu_template_refcnt with mutex
KVM: arm64: Restrict IPA size to maximum 48 bits on 4K and 16K page size
PCI: Fix pci_dev_str_match_path() alloc while atomic bug
mfd: tqmx86: Clear GPIO IRQ resource when no IRQ is set
tracing/boot: Fix a hist trigger dependency for boot time tracing
mtd: mtdconcat: Judge callback existence based on the master
mtd: mtdconcat: Check _read, _write callbacks existence before assignment
KVM: arm64: Fix read-side race on updates to vcpu reset state
KVM: arm64: Handle PSCI resets before userspace touches vCPU state
PCI: Sync __pci_register_driver() stub for CONFIG_PCI=n
mtd: rawnand: cafe: Fix a resource leak in the error handling path of 'cafe_nand_probe()'
ARC: export clear_user_page() for modules
perf unwind: Do not overwrite FEATURE_CHECK_LDFLAGS-libunwind-{x86,aarch64}
perf bench inject-buildid: Handle writen() errors
gpio: mpc8xxx: Fix a resources leak in the error handling path of 'mpc8xxx_probe()'
gpio: mpc8xxx: Use 'devm_gpiochip_add_data()' to simplify the code and avoid a leak
net: dsa: tag_rtl4_a: Fix egress tags
selftests: mptcp: clean tmp files in simult_flows
net: hso: add failure handler for add_net_device
net: dsa: b53: Fix calculating number of switch ports
net: dsa: b53: Set correct number of ports in the DSA struct
netfilter: socket: icmp6: fix use-after-scope
fq_codel: reject silly quantum parameters
qlcnic: Remove redundant unlock in qlcnic_pinit_from_rom
ip_gre: validate csum_start only on pull
net: dsa: b53: Fix IMP port setup on BCM5301x
bnxt_en: fix stored FW_PSID version masks
bnxt_en: Fix asic.rev in devlink dev info command
bnxt_en: log firmware debug notifications
bnxt_en: Consolidate firmware reset event logging.
bnxt_en: Convert to use netif_level() helpers.
bnxt_en: Improve logging of error recovery settings information.
bnxt_en: Fix possible unintended driver initiated error recovery
mfd: lpc_sch: Partially revert "Add support for Intel Quark X1000"
mfd: lpc_sch: Rename GPIOBASE to prevent build error
net: renesas: sh_eth: Fix freeing wrong tx descriptor
x86/mce: Avoid infinite loop for copy from user recovery
bnxt_en: Fix error recovery regression
net: dsa: bcm_sf2: Fix array overrun in bcm_sf2_num_active_ports()
Linux 5.10.68
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: I81a2e88d9586c1c08ab91028da2283f6c01dd95a
commit 02319bf15a upstream.
After d12e1c4649 ("net: dsa: b53: Set correct number of ports in the
DSA struct") we stopped setting dsa_switch::num_ports to DSA_MAX_PORTS,
which created an off by one error between the statically allocated
bcm_sf2_priv::port_sts array (of size DSA_MAX_PORTS). When
dsa_is_cpu_port() is used, we end-up accessing an out of bounds member
and causing a NPD.
Fix this by iterating with the appropriate port count using
ds->num_ports.
Fixes: d12e1c4649 ("net: dsa: b53: Set correct number of ports in the DSA struct")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit eca4cf12ac upstream.
The recent patch has introduced a regression by not reading the reset
count in the ERROR_RECOVERY async event handler. We may have just
gone through a reset and the reset count has just incremented. If
we don't update the reset count in the ERROR_RECOVERY event handler,
the health check timer will see that the reset count has changed and
will initiate an unintended reset.
Restore the unconditional update of the reset count in
bnxt_async_event_process() if error recovery watchdog is enabled.
Also, update the reset count at the end of the reset sequence to
make it even more robust.
Fixes: 1b2b918319 ("bnxt_en: Fix possible unintended driver initiated error recovery")
Reviewed-by: Edwin Peer <edwin.peer@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 81065b35e2 upstream.
There are two cases for machine check recovery:
1) The machine check was triggered by ring3 (application) code.
This is the simpler case. The machine check handler simply queues
work to be executed on return to user. That code unmaps the page
from all users and arranges to send a SIGBUS to the task that
triggered the poison.
2) The machine check was triggered in kernel code that is covered by
an exception table entry. In this case the machine check handler
still queues a work entry to unmap the page, etc. but this will
not be called right away because the #MC handler returns to the
fix up code address in the exception table entry.
Problems occur if the kernel triggers another machine check before the
return to user processes the first queued work item.
Specifically, the work is queued using the ->mce_kill_me callback
structure in the task struct for the current thread. Attempting to queue
a second work item using this same callback results in a loop in the
linked list of work functions to call. So when the kernel does return to
user, it enters an infinite loop processing the same entry for ever.
There are some legitimate scenarios where the kernel may take a second
machine check before returning to the user.
1) Some code (e.g. futex) first tries a get_user() with page faults
disabled. If this fails, the code retries with page faults enabled
expecting that this will resolve the page fault.
2) Copy from user code retries a copy in byte-at-time mode to check
whether any additional bytes can be copied.
On the other side of the fence are some bad drivers that do not check
the return value from individual get_user() calls and may access
multiple user addresses without noticing that some/all calls have
failed.
Fix by adding a counter (current->mce_count) to keep track of repeated
machine checks before task_work() is called. First machine check saves
the address information and calls task_work_add(). Subsequent machine
checks before that task_work call back is executed check that the address
is in the same page as the first machine check (since the callback will
offline exactly one page).
Expected worst case is four machine checks before moving on (e.g. one
user access with page faults disabled, then a repeat to the same address
with page faults enabled ... repeat in copy tail bytes). Just in case
there is some code that loops forever enforce a limit of 10.
[ bp: Massage commit message, drop noinstr, fix typo, extend panic
messages. ]
Fixes: 5567d11c21 ("x86/mce: Send #MC singal from task work")
Signed-off-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: <stable@vger.kernel.org>
Link: https://lkml.kernel.org/r/YT/IJ9ziLqmtqEPu@agluck-desk2.amr.corp.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 0341d5e3d1 ]
The cur_tx counter must be incremented after TACT bit of
txdesc->status was set. However, a CPU is possible to reorder
instructions and/or memory accesses between cur_tx and
txdesc->status. And then, if TX interrupt happened at such a
timing, the sh_eth_tx_free() may free the descriptor wrongly.
So, add wmb() before cur_tx++.
Otherwise NETDEV WATCHDOG timeout is possible to happen.
Fixes: 86a74ff21a ("net: sh_eth: add support for Renesas SuperH Ethernet")
Signed-off-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit cdff1eda69 ]
One MIPS platform (mach-rc32434) defines GPIOBASE. This macro
conflicts with one of the same name in lpc_sch.c. Rename the latter one
to prevent the build error.
../drivers/mfd/lpc_sch.c:25: error: "GPIOBASE" redefined [-Werror]
25 | #define GPIOBASE 0x44
../arch/mips/include/asm/mach-rc32434/rb.h:32: note: this is the location of the previous definition
32 | #define GPIOBASE 0x050000
Cc: Denis Turischev <denis@compulab.co.il>
Fixes: e82c60ae7d ("mfd: Introduce lpc_sch for Intel SCH LPC bridge")
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Lee Jones <lee.jones@linaro.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 922e8ce883 ]
The IRQ support for SCH GPIO is not specific to the Intel Quark SoC.
Moreover the IRQ routing is quite interesting there, so while it's
needs a special support, the driver haven't it anyway yet.
Due to above remove basically redundant code of IRQ support.
This reverts commit ec689a8a81.
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Lee Jones <lee.jones@linaro.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 1b2b918319 ]
If error recovery is already enabled, bnxt_timer() will periodically
check the heartbeat register and the reset counter. If we get an
error recovery async. notification from the firmware (e.g. change in
primary/secondary role), we will immediately read and update the
heartbeat register and the reset counter. If the timer for the next
health check expires soon after this, we may read the heartbeat register
again in quick succession and find that it hasn't changed. This will
trigger error recovery unintentionally.
The likelihood is small because we also reset fw_health->tmr_counter
which will reset the interval for the next health check. But the
update is not protected and bnxt_timer() can miss the update and
perform the health check without waiting for the full interval.
Fix it by only reading the heartbeat register and reset counter in
bnxt_async_event_process() if error recovery is trasitioning to the
enabled state. Also add proper memory barriers so that when enabling
for the first time, bnxt_timer() will see the tmr_counter interval and
perform the health check after the full interval has elapsed.
Fixes: 7e914027f7 ("bnxt_en: Enable health monitoring.")
Reviewed-by: Edwin Peer <edwin.peer@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit f4d95c3c19 ]
We currently only log the error recovery settings if it is enabled.
In some cases, firmware disables error recovery after it was
initially enabled. Without logging anything, the user will not be
aware of this change in setting.
Log it when error recovery is disabled. Also, change the reset count
value from hexadecimal to decimal.
Reviewed-by: Edwin Peer <edwin.peer@broadcom.com>
Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit a44daa8fcb ]
Firmware is capable of generating asynchronous debug notifications.
The event data is opaque to the driver and is simply logged. Debug
notifications can be enabled by turning on hardware status messages
using the ethtool msglvl interface.
Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com>
Signed-off-by: Edwin Peer <edwin.peer@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Acked-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 6fdab8a3ad ]
The current asic.rev is incomplete and does not include the metal
revision. Add the metal revision and decode the complete asic
revision into the more common and readable form (A0, B0, etc).
Fixes: 7154917a12 ("bnxt_en: Refactor bnxt_dl_info_get().")
Reviewed-by: Edwin Peer <edwin.peer@broadcom.com>
Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 63f8428b40 ]
Broadcom's b53 switches have one IMP (Inband Management Port) that needs
to be programmed using its own designed register. IMP port may be
different than CPU port - especially on devices with multiple CPU ports.
For that reason it's required to explicitly note IMP port index and
check for it when choosing a register to use.
This commit fixes BCM5301x support. Those switches use CPU port 5 while
their IMP port is 8. Before this patch b53 was trying to program port 5
with B53_PORT_OVERRIDE_CTRL instead of B53_GMII_PORT_OVERRIDE_CTRL(5).
It may be possible to also replace "cpu_port" usages with
dsa_is_cpu_port() but that is out of the scope of thix BCM5301x fix.
Fixes: 967dd82ffc ("net: dsa: b53: Add support for Broadcom RoboSwitch")
Signed-off-by: Rafał Miłecki <rafal@milecki.pl>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 8a0ed250f9 ]
The GRE tunnel device can pull existing outer headers in ipge_xmit.
This is a rare path, apparently unique to this device. The below
commit ensured that pulling does not move skb->data beyond csum_start.
But it has a false positive if ip_summed is not CHECKSUM_PARTIAL and
thus csum_start is irrelevant.
Refine to exclude this. At the same time simplify and strengthen the
test.
Simplify, by moving the check next to the offending pull, making it
more self documenting and removing an unnecessary branch from other
code paths.
Strengthen, by also ensuring that the transport header is correct and
therefore the inner headers will be after skb_reset_inner_headers.
The transport header is set to csum_start in skb_partial_csum_set.
Link: https://lore.kernel.org/netdev/YS+h%2FtqCJJiQei+W@shredder/
Fixes: 1d011c4803 ("ip_gre: add validation for csum_start")
Reported-by: Ido Schimmel <idosch@idosch.org>
Suggested-by: Alexander Duyck <alexander.duyck@gmail.com>
Signed-off-by: Willem de Bruijn <willemb@google.com>
Reviewed-by: Alexander Duyck <alexanderduyck@fb.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 9ddbc2a00d ]
Previous commit 68233c583a removes the qlcnic_rom_lock()
in qlcnic_pinit_from_rom(), but remains its corresponding
unlock function, which is odd. I'm not very sure whether the
lock is missing, or the unlock is redundant. This bug is
suggested by a static analysis tool, please advise.
Fixes: 68233c583a ("qlcnic: updated reset sequence")
Signed-off-by: Dinghao Liu <dinghao.liu@zju.edu.cn>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>