fuse could use get_user_pages_fast by iov_iter_get_pages at
fuse_copy_fill so close the false positive by attributing
it by put_user_page.
Page pinned via pid 670, ts 4554195916 ns
PFN 83125 Block 162 type Movable Flags 0xfffffc008001e(referenced|uptodate|dirty|lru|swapbacked)
try_grab_compound_head+0x1e8/0x240
internal_get_user_pages_fast+0x66d/0xca0
iov_iter_get_pages+0xd4/0x3a0
fuse_copy_fill+0x197/0x200
fuse_copy_one+0x6e/0xf0
fuse_dev_do_read.constprop.0+0x435/0x7e0
fuse_dev_read+0x5d/0x90
new_sync_read+0x115/0x1a0
vfs_read+0xf4/0x180
ksys_read+0x5f/0xe0
do_syscall_64+0x33/0x80
entry_SYSCALL_64_after_hwframe+0x44/0xae
Bug: 183414571
Signed-off-by: Minchan Kim <minchan@kernel.org>
Signed-off-by: Minchan Kim <minchan@google.com>
Change-Id: Idc80d4a34b546f25e8f6dbc68313d39586e914d9
CMA allocation can fail by temporal page refcount increasement
by get_page API as well as get_user_pages friends.
However, since get_page is one of the most hot function, it is
hard to hook get_page to get callstack everytime due to
performance concern. Furthermore, get_page could be nested
multiple times so we couldn't track all of the pin sites on
limited space of page_pinner.
Thus, here approach is keep tracking of put_page callsite rather
than get_page once VM found the page migration failed.
It's based on assumption:
1. Since it's temporal page refcount, it could be released soon
before overflowing dmesg log buffer
2. developer can find the pair of get_page by reviewing put_page.
By default, it's eanbled. If you want to disable it:
echo 0 > $debugfs/page_pinner/failure_tracking
You can capture the tracking using:
cat $debugfs/page_pinner/alloc_contig_failed
note: the example below is artificial:
Page pinned ts 386067292 us count 0
PFN 10162530 Block 9924 type Isolate Flags 0x800000000008000c(uptodate|dirty|swapbacked)
__page_pinner_migration_failed+0x30/0x104
putback_lru_page+0x90/0xac
putback_movable_pages+0xc4/0x204
__alloc_contig_migrate_range+0x290/0x31c
alloc_contig_range+0x114/0x2bc
cma_alloc+0x2d8/0x698
cma_alloc_write+0x58/0xb8
simple_attr_write+0xd4/0x124
debugfs_attr_write+0x50/0xd8
full_proxy_write+0x70/0xf8
vfs_write+0x168/0x3a8
ksys_write+0x7c/0xec
__arm64_sys_write+0x20/0x30
el0_svc_common+0xa4/0x180
do_el0_svc+0x28/0x88
el0_svc+0x14/0x24
Page pinned ts 385867394 us count 0
PFN 10162530 Block 9924 type Isolate Flags 0x800000000008000c(uptodate|dirty|swapbacked)
__page_pinner_migration_failed+0x30/0x104
__alloc_contig_migrate_range+0x200/0x31c
alloc_contig_range+0x114/0x2bc
cma_alloc+0x2d8/0x698
cma_alloc_write+0x58/0xb8
simple_attr_write+0xd4/0x124
debugfs_attr_write+0x50/0xd8
full_proxy_write+0x70/0xf8
vfs_write+0x168/0x3a8
ksys_write+0x7c/0xec
__arm64_sys_write+0x20/0x30
el0_svc_common+0xa4/0x180
do_el0_svc+0x28/0x88
el0_svc+0x14/0x24
el0_sync_handler+0x88/0xec
el0_sync+0x198/0x1c0
Bug: 183414571
Signed-off-by: Minchan Kim <minchan@kernel.org>
Signed-off-by: Minchan Kim <minchan@google.com>
Change-Id: Ie79902c18390eb9f320d823839bb9d9a7fdcdb31
For CMA allocation, it's really critical to migrate a page but
sometimes it fails. One of the reasons is some driver holds a
page refcount for a long time so VM couldn't migrate the page
at that time.
The concern here is there is no way to find the who hold the
refcount of the page effectively. This patch introduces feature
to keep tracking page's pinner. All get_page sites are vulnerable
to pin a page for a long time but the cost to keep track it would
be significat since get_page is the most frequent kernel operation.
Furthermore, the page could be not user page but kernel page which
is not related to the page migration failure. So, this patch keeps
tracking only get_user_pages/follow_page with (FOLL_GET|PIN friends
because they are the very common APIs to pin user pages which could
cause migration failure and the less frequent than get_page so
runtime cost wouldn't be that big but could cover many cases
effectively.
This patch also introduces put_user_page API. It aims for attributing
"the pinner releases the page from now on" while it release the
page refcount. Thus, any user of get_user_pages/follow_page(FOLL_GET)
must use put_user_page as pair of those functions. Otherwise,
page_pinner will treat them long term pinner as false postive but
nothing should affect stability.
* $debugfs/page_pinner/threshold
It indicates threshold(microsecond) to flag long term pinning.
It's configurable(Default is 300000us). Once you write new value
to the threshold, old data will clear.
* $debugfs/page_pinner/longterm_pinner
It shows call sites where the duration of pinning was greater than
the threshold. Internally, it uses a static array to keep 4096
elements and overwrites old ones once overflow happens. Therefore,
you could lose some information.
example)
Page pinned ts 76953865787 us count 1
PFN 9856945 Block 9625 type Movable Flags 0x8000000000080014(uptodate|lru|swapbacked)
__set_page_pinner+0x34/0xcc
try_grab_page+0x19c/0x1a0
follow_page_pte+0x1c0/0x33c
follow_page_mask+0xc0/0xc8
__get_user_pages+0x178/0x414
__gup_longterm_locked+0x80/0x148
internal_get_user_pages_fast+0x140/0x174
pin_user_pages_fast+0x24/0x40
CCC
BBB
AAA
__arm64_sys_ioctl+0x94/0xd0
el0_svc_common+0xa4/0x180
do_el0_svc+0x28/0x88
el0_svc+0x14/0x24
note: page_pinner doesn't guarantee attributing/unattributing are
atomic if they happen at the same time. It's just best effort so
false-positive could happen.
Bug: 183414571
Signed-off-by: Minchan Kim <minchan@kernel.org>
Signed-off-by: Minchan Kim <minchan@google.com>
Change-Id: Ife37ec360eef993d390b9c131732218a4dfd2f04
Fix the following build warning:
include/trace/hooks/psi.h:17:18: warning: declaration of
'struct psi_trigger' will not be visible outside of
this function [-Wvisibility]
Bug: 178721511
Fixes: commit b79d1815c4 ("ANDROID: psi: Add vendor hooks for PSI tracing")
Change-Id: I4c8c9730f5a0c94fa8d93c6995ce25f385b52043
Signed-off-by: Georgi Djakov <quic_c_gdjako@quicinc.com>
When servicemanager process added service proxy from other process
register the service, we want to know the matching relation between
handle in the process and service name. When binder transaction
happened, We want to know what process calls what method on what service.
Bug: 186604985
Signed-off-by: zhengding chen <chenzhengding@oppo.com>
Change-Id: I813d1cde10294d8665f899f7fef0d444ec1f1f5e
DMA-BUF per-buffer attachment stats are present at
/sys/kernel/dma-buf/buffers/<ino>/attachments. Since a kset of name
'attachment' is created everytime a buffer is allocated, disable
uevents from the kset to avoid their userspace overhead.
The ksets 'dma-buf' and 'buffers' do not need uevents disabled since
they are emitted just once but the patch does it for uniformity.
Bug: 186155231
Change-Id: I29009dd2aa0bc018c18bca7a1e0a068f4480cf13
Signed-off-by: Hridya Valsaraju <hridya@google.com>
Changes in 5.10.33
vhost-vdpa: protect concurrent access to vhost device iotlb
gpio: omap: Save and restore sysconfig
KEYS: trusted: Fix TPM reservation for seal/unseal
vdpa/mlx5: Set err = -ENOMEM in case dma_map_sg_attrs fails
pinctrl: lewisburg: Update number of pins in community
block: return -EBUSY when there are open partitions in blkdev_reread_part
pinctrl: core: Show pin numbers for the controllers with base = 0
arm64: dts: allwinner: Revert SD card CD GPIO for Pine64-LTS
bpf: Permits pointers on stack for helper calls
bpf: Allow variable-offset stack access
bpf: Refactor and streamline bounds check into helper
bpf: Tighten speculative pointer arithmetic mask
locking/qrwlock: Fix ordering in queued_write_lock_slowpath()
perf/x86/intel/uncore: Remove uncore extra PCI dev HSWEP_PCI_PCU_3
perf/x86/kvm: Fix Broadwell Xeon stepping in isolation_ucodes[]
perf auxtrace: Fix potential NULL pointer dereference
perf map: Fix error return code in maps__clone()
HID: google: add don USB id
HID: alps: fix error return code in alps_input_configured()
HID cp2112: fix support for multiple gpiochips
HID: wacom: Assign boolean values to a bool variable
soc: qcom: geni: shield geni_icc_get() for ACPI boot
dmaengine: xilinx: dpdma: Fix descriptor issuing on video group
dmaengine: xilinx: dpdma: Fix race condition in done IRQ
ARM: dts: Fix swapped mmc order for omap3
net: geneve: check skb is large enough for IPv4/IPv6 header
dmaengine: tegra20: Fix runtime PM imbalance on error
s390/entry: save the caller of psw_idle
arm64: kprobes: Restore local irqflag if kprobes is cancelled
xen-netback: Check for hotplug-status existence before watching
cavium/liquidio: Fix duplicate argument
kasan: fix hwasan build for gcc
csky: change a Kconfig symbol name to fix e1000 build error
ia64: fix discontig.c section mismatches
ia64: tools: remove duplicate definition of ia64_mf() on ia64
x86/crash: Fix crash_setup_memmap_entries() out-of-bounds access
net: hso: fix NULL-deref on disconnect regression
USB: CDC-ACM: fix poison/unpoison imbalance
Linux 5.10.33
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: I638db3c919ad938eaaaac3d687175252edcd7990
If the timestamp of the .config file is updated, config_data.gz is
regenerated, then vmlinux is re-linked. This occurs even if the content
of the .config has not changed at all.
This issue was mitigated by commit 67424f61f8 ("kconfig: do not write
.config if the content is the same"); Kconfig does not update the
.config when it ends up with the identical configuration.
The issue is remaining when the .config is created by *_defconfig with
some config fragment(s) applied on top.
This is typical for powerpc and mips, where several *_defconfig targets
are constructed by using merge_config.sh.
One workaround is to have the copy of the .config. The filechk rule
updates the copy, kernel/config_data, by checking the content instead
of the timestamp.
With this commit, the second run with the same configuration avoids
the needless rebuilds.
$ make ARCH=mips defconfig all
[ snip ]
$ make ARCH=mips defconfig all
*** Default configuration is based on target '32r2el_defconfig'
Using ./arch/mips/configs/generic_defconfig as base
Merging arch/mips/configs/generic/32r2.config
Merging arch/mips/configs/generic/el.config
Merging ./arch/mips/configs/generic/board-boston.config
Merging ./arch/mips/configs/generic/board-ni169445.config
Merging ./arch/mips/configs/generic/board-ocelot.config
Merging ./arch/mips/configs/generic/board-ranchu.config
Merging ./arch/mips/configs/generic/board-sead-3.config
Merging ./arch/mips/configs/generic/board-xilfpga.config
#
# configuration written to .config
#
SYNC include/config/auto.conf
CALL scripts/checksyscalls.sh
CALL scripts/atomic/check-atomics.sh
CHK include/generated/compile.h
CHK include/generated/autoksyms.h
Reported-by: Elliot Berman <eberman@codeaurora.org>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Bug: 179648610
(cherry picked from commit b33976d90d1ea7652fff662dcc2234f352346a33
https://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild.git
kbuild)
[eberman: Fixed minor conflicts in kernel/.gitignore]
Change-Id: I8c93147c8d5a48d0f5e9abf855870b10c1a24efc
Signed-off-by: Elliot Berman <quic_eberman@quicinc.com>
When CPUs with mismatched support for 32-bit EL0 are detected early,
before the system capabilities are finalised, then initialisation of
the compat hwcaps is left to the boot CPU to configure as it would do
for a system without a mismatch.
Unfortunately, this initialisation is only carried out if
system_supports_32bit_el0(), which isn't initialised until later in the
boot process via a CPU hotplug notifier. Consequently, the compat hwcaps
reported do not necessarily indicate all of the CPU features available.
Move the initialisation of compat hwcaps to the CPU hotplug notifier
itself so that we can ensure that they are configured as soon as we have
detected a mismatch.
Bug: 186482502
Cc: Quentin Perret <qperret@google.com>
Cc: Huang Yiwei <hyiwei@codeaurora.org>
Signed-off-by: Will Deacon <willdeacon@google.com>
Change-Id: I54edcc34f96be9dec15aa44407441c0227c17753
Take the 4 instruction byte swapping sequence from the decompressor's
head.S, and turn it into a rev_l GAS macro for general use. While
at it, make it use the 'rev' instruction when compiling for v6 or
later.
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Tested-by: Geert Uytterhoeven <geert+renesas@glider.be>
Reviewed-by: Nicolas Pitre <nico@fluxnic.net>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
(cherry picked from commit 6468e898c6)
Bug: 178411248
Change-Id: I8433e97d2880f75cace215f1a8daadec7f29929c
Signed-off-by: Eric Biggers <ebiggers@google.com>
DTB stores all values as 32-bit big-endian integers.
Add a macro to convert such values to native CPU endianness, to reduce
duplication.
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Reviewed-by: Nicolas Pitre <nico@fluxnic.net>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
(cherry picked from commit 0557ac83fd)
Bug: 178411248
Change-Id: I0807f36352dbfd5f5808959e358a7469dc9753bb
Signed-off-by: Eric Biggers <ebiggers@google.com>
Patch series "kfence: optimize timer scheduling", v2.
We have observed that mostly-idle systems with KFENCE enabled wake up
otherwise idle CPUs, preventing such to enter a lower power state.
Debugging revealed that KFENCE spends too much active time in
toggle_allocation_gate().
While the first version of KFENCE was using all the right bits to be
scheduling optimal, and thus power efficient, by simply using wait_event()
+ wake_up(), that code was unfortunately removed.
As KFENCE was exposed to various different configs and tests, the
scheduling optimal code slowly disappeared. First because of hung task
warnings, and finally because of deadlocks when an allocation is made by
timer code with debug objects enabled. Clearly, the "fixes" were not too
friendly for devices that want to be power efficient.
Therefore, let's try a little harder to fix the hung task and deadlock
problems that we have with wait_event() + wake_up(), while remaining as
scheduling friendly and power efficient as possible.
Crucially, we need to defer the wake_up() to an irq_work, avoiding any
potential for deadlock.
The result with this series is that on the devices where we observed a
power regression, power usage returns back to baseline levels.
This patch (of 3):
On mostly-idle systems, we have observed that toggle_allocation_gate() is
a cause of frequent wake-ups, preventing an otherwise idle CPU to go into
a lower power state.
A late change in KFENCE's development, due to a potential deadlock [1],
required changing the scheduling-friendly wait_event_timeout() and
wake_up() to an open-coded wait-loop using schedule_timeout(). [1]
https://lkml.kernel.org/r/000000000000c0645805b7f982e4@google.com
To avoid unnecessary wake-ups, switch to using wait_event_timeout().
Unfortunately, we still cannot use a version with direct wake_up() in
__kfence_alloc() due to the same potential for deadlock as in [1].
Instead, add a level of indirection via an irq_work that is scheduled if
we determine that the kfence_timer requires a wake_up().
Link: https://lkml.kernel.org/r/20210421105132.3965998-1-elver@google.com
Link: https://lkml.kernel.org/r/20210421105132.3965998-2-elver@google.com
Fixes: 0ce20dd840 ("mm: add Kernel Electric-Fence infrastructure")
Signed-off-by: Marco Elver <elver@google.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Jann Horn <jannh@google.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Hillf Danton <hdanton@sina.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Bug: 172317151
Bug: 185280916
Test: power team confirmed there's no regression
(cherry picked from commit 0ac66fbbadde5547db190e9d87ff2e77d245f9a2
https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git akpm)
Signed-off-by: Alexander Potapenko <glider@google.com>
Change-Id: Idba39683f0b76eb3f70df1113e43d94845fab5bf
Because memblock allocations are registered with kmemleak, the KFENCE
pool was seen by kmemleak as one large object. Later allocations
through kfence_alloc() that were registered with kmemleak via
slab_post_alloc_hook() would then overlap and trigger a warning.
Therefore, once the pool is initialized, we can remove (free) it from
kmemleak again, since it should be treated as allocator-internal and be
seen as "free memory".
The second problem is that kmemleak is passed the rounded size, and not
the originally requested size, which is also the size of KFENCE objects.
To avoid kmemleak scanning past the end of an object and trigger a
KFENCE out-of-bounds error, fix the size if it is a KFENCE object.
For simplicity, to avoid a call to kfence_ksize() in
slab_post_alloc_hook() (and avoid new IS_ENABLED(CONFIG_DEBUG_KMEMLEAK)
guard), just call kfence_ksize() in mm/kmemleak.c:create_object().
Link: https://lkml.kernel.org/r/20210317084740.3099921-1-elver@google.com
Signed-off-by: Marco Elver <elver@google.com>
Reported-by: Luis Henriques <lhenriques@suse.de>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Tested-by: Luis Henriques <lhenriques@suse.de>
Cc: Alexander Potapenko <glider@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Andrey Konovalov <andreyknvl@google.com>
Cc: Jann Horn <jannh@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Bug: 172317151
Test: build and run on an ARM64 device
(cherry picked from commit 9551158069)
Signed-off-by: Alexander Potapenko <glider@google.com>
Change-Id: Ida4d747d3b81b5e8a6f1047b864da89e8e110e61
Allow disabling symbol trimming on the command line when running
build.sh. This allows us to make GKI builds without trimming and without
modifying the build config. The main use case is when we want to update
the symbol list in a mixed build system.
Bug: 186549137
Signed-off-by: Will McVicker <willmcvicker@google.com>
Change-Id: I16d1c348270b4dbb378f009857286acd7b6d8aa3
Prior change
ANDROID: Incremental fs: stat should return actual used blocks
adds blocks to getattr. Unfortunately the code always looks for the
backing file, and pseudo files don't have backing files, so getattr
fails for pseudo files.
Bug: 186567511
Test: incfs_test passes, can do incremental installs on test device
Signed-off-by: Paul Lawrence <paullawrence@google.com>
Change-Id: Ia3df87f3683e095d05c822b69747515963c95f1c
(cherry picked from commit 9d00e67d8b)
Currently, the sched code checks if the rq clock has been
updated after its lock has been held when CONFIG_SCHED_DEBUG
is enabled. It tracks this by clearing the RQCF_UPDATED bit
when a lock is acquired and setting it upon a subsequent
update_rq_clock() call. It warns if rq clock is read without
RQCF_UPDATED flag indicating the code path missed updating
the clock.
When migrate_tasks() is called during a pause_cpus() event,
the local variable orf is updated with the contents of *rf,
prior to the call to update_rq_clock(). As a result, when
migrate_tasks() restores *rf from the local variable the
RQCF_UPDATED flag is lost. This clearing out of the
RQCF_UPDATED flag leads to a warning when the next task
is being pushed out.
For example in migrate_tasks()
orf = rf; // save flags, RQCF_UPDATE cleared
update_rq_clock() // set RQCF_UPDATE
for()
...
__migrate_task(dead_rq, new_cpu)
...
--> if migration, restore dead_rq's flags with orf.
--> We loose RQCF_UPDATE
rq_relock(dead_rq, orf)
This leaves the current cpu's rq clock_update_flags with the
RQCF_UPDATED flag cleared, an error condition with
CONFIG_SCHED_DEBUG enabled.
Fix the issue for by ensuring that the local variable orf
has the RQCF_UPDATED flag set, allowing the current
CPU's rq to have the flag set and leaving it in a good state
for future usage.
pause_cpus() is currently Android specific. As cpu_pause does
not rely on stop_machine_cpuslocked() like the regular
hotunplug path does, there's a risk for another CPU to
read the rq_clock, after we cleared RQCF_UPDATE, when using
pause_cpus(). This change will have little or no impact outside
of Android currently. If pause_cpus() or drain_rq_cpu_stop()
are merged upstream this change should be merged as well.
Bug: 186222712
Change-Id: Id241122e1449cdd4dcd15f94eb68735b40e3d6f5
Signed-off-by: Stephen Dickey <dickey@codeaurora.org>
Transparent huge pages are supported for read-only non-shmem files,
but are only used for vmas with VM_DENYWRITE. This condition ensures that
file THPs are protected from writes while an application is running
(ETXTBSY). Any existing file THPs are then dropped from the page cache
when a file is opened for write in do_dentry_open(). Since sys_mmap
ignores MAP_DENYWRITE, this constrains the use of file THPs to vmas
produced by execve().
Systems that make heavy use of shared libraries (e.g. Android) are unable
to apply VM_DENYWRITE through the dynamic linker, preventing them from
benefiting from the resultant reduced contention on the TLB.
This patch reduces the constraint on file THPs allowing use with any
executable mapping from a file not opened for write (see
inode_is_open_for_write()). It also introduces additional conditions to
ensure that files opened for write will never be backed by file THPs.
Restricting the use of THPs to executable mappings eliminates the risk that
a read-only file later opened for write would encounter significant
latencies due to page cache truncation.
The ld linker flag '-z max-page-size=(hugepage size)' can be used to
produce executables with the necessary layout. The dynamic linker must
map these file's segments at a hugepage size aligned vma for the mapping to
be backed with THPs.
Comparison of the performance characteristics of 4KB and 2MB-backed
libraries follows; the Android dex2oat tool was used to AOT compile an
example application on a single ARM core.
4KB Pages:
==========
count event_name # count / runtime
598,995,035,942 cpu-cycles # 1.800861 GHz
81,195,620,851 raw-stall-frontend # 244.112 M/sec
347,754,466,597 iTLB-loads # 1.046 G/sec
2,970,248,900 iTLB-load-misses # 0.854122% miss rate
Total test time: 332.854998 seconds.
2MB Pages:
==========
count event_name # count / runtime
592,872,663,047 cpu-cycles # 1.800358 GHz
76,485,624,143 raw-stall-frontend # 232.261 M/sec
350,478,413,710 iTLB-loads # 1.064 G/sec
803,233,322 iTLB-load-misses # 0.229182% miss rate
Total test time: 329.826087 seconds
A check of /proc/$(pidof dex2oat64)/smaps shows THPs in use:
/apex/com.android.art/lib64/libart.so
FilePmdMapped: 4096 kB
/apex/com.android.art/lib64/libart-compiler.so
FilePmdMapped: 2048 kB
Bug: 158135888
Link: https://lore.kernel.org/patchwork/patch/1408266/
Reviewed-by: William Kucharski <william.kucharski@oracle.com>
Acked-by: Hugh Dickins <hughd@google.com>
Acked-by: Song Liu <song@kernel.org>
Signed-off-by: Collin Fijalkovich <cfijalkovich@google.com>
Change-Id: I75c693a4b4e7526d374ef2c010bde3094233eef2
commit a8b3b51961 upstream.
suspend() does its poisoning conditionally, resume() does it
unconditionally. On a device with combined interfaces this
will balance, on a device with two interfaces the counter will
go negative and resubmission will fail.
Both actions need to be done conditionally.
Fixes: 6069e3e927 ("USB: cdc-acm: untangle a circular dependency between callback and softint")
Signed-off-by: Oliver Neukum <oneukum@suse.com>
Cc: stable <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/20210421074513.4327-1-oneukum@suse.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 2ad5692db7 upstream.
Commit 8a12f88361 ("net: hso: fix null-ptr-deref during tty device
unregistration") fixed the racy minor allocation reported by syzbot, but
introduced an unconditional NULL-pointer dereference on every disconnect
instead.
Specifically, the serial device table must no longer be accessed after
the minor has been released by hso_serial_tty_unregister().
Fixes: 8a12f88361 ("net: hso: fix null-ptr-deref during tty device unregistration")
Cc: stable@vger.kernel.org
Cc: Anirudh Rayabharam <mail@anirudhrb.com>
Reported-by: Leonardo Antoniazzi <leoanto@aruba.it>
Signed-off-by: Johan Hovold <johan@kernel.org>
Reviewed-by: Anirudh Rayabharam <mail@anirudhrb.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 5849cdf8c1 upstream.
Commit in Fixes: added support for kexec-ing a kernel on panic using a
new system call. As part of it, it does prepare a memory map for the new
kernel.
However, while doing so, it wrongly accesses memory it has not
allocated: it accesses the first element of the cmem->ranges[] array in
memmap_exclude_ranges() but it has not allocated the memory for it in
crash_setup_memmap_entries(). As KASAN reports:
BUG: KASAN: vmalloc-out-of-bounds in crash_setup_memmap_entries+0x17e/0x3a0
Write of size 8 at addr ffffc90000426008 by task kexec/1187
(gdb) list *crash_setup_memmap_entries+0x17e
0xffffffff8107cafe is in crash_setup_memmap_entries (arch/x86/kernel/crash.c:322).
317 unsigned long long mend)
318 {
319 unsigned long start, end;
320
321 cmem->ranges[0].start = mstart;
322 cmem->ranges[0].end = mend;
323 cmem->nr_ranges = 1;
324
325 /* Exclude elf header region */
326 start = image->arch.elf_load_addr;
(gdb)
Make sure the ranges array becomes a single element allocated.
[ bp: Write a proper commit message. ]
Fixes: dd5f726076 ("kexec: support for kexec on panic using new system call")
Signed-off-by: Mike Galbraith <efault@gmx.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Dave Young <dyoung@redhat.com>
Cc: <stable@vger.kernel.org>
Link: https://lkml.kernel.org/r/725fa3dc1da2737f0f6188a1a9701bead257ea9d.camel@gmx.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit f4bf09dc3a ]
The ia64_mf() macro defined in tools/arch/ia64/include/asm/barrier.h is
already defined in <asm/gcc_intrin.h> on ia64 which causes libbpf
failing to build:
CC /usr/src/linux/tools/bpf/bpftool//libbpf/staticobjs/libbpf.o
In file included from /usr/src/linux/tools/include/asm/barrier.h:24,
from /usr/src/linux/tools/include/linux/ring_buffer.h:4,
from libbpf.c:37:
/usr/src/linux/tools/include/asm/../../arch/ia64/include/asm/barrier.h:43: error: "ia64_mf" redefined [-Werror]
43 | #define ia64_mf() asm volatile ("mf" ::: "memory")
|
In file included from /usr/include/ia64-linux-gnu/asm/intrinsics.h:20,
from /usr/include/ia64-linux-gnu/asm/swab.h:11,
from /usr/include/linux/swab.h:8,
from /usr/include/linux/byteorder/little_endian.h:13,
from /usr/include/ia64-linux-gnu/asm/byteorder.h:5,
from /usr/src/linux/tools/include/uapi/linux/perf_event.h:20,
from libbpf.c:36:
/usr/include/ia64-linux-gnu/asm/gcc_intrin.h:382: note: this is the location of the previous definition
382 | #define ia64_mf() __asm__ volatile ("mf" ::: "memory")
|
cc1: all warnings being treated as errors
Thus, remove the definition from tools/arch/ia64/include/asm/barrier.h.
Signed-off-by: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit e2af9da4f8 ]
Fix IA64 discontig.c Section mismatch warnings.
When CONFIG_SPARSEMEM=y and CONFIG_MEMORY_HOTPLUG=y, the functions
computer_pernodesize() and scatter_node_data() should not be marked as
__meminit because they are needed after init, on any memory hotplug
event. Also, early_nr_cpus_node() is called by compute_pernodesize(),
so early_nr_cpus_node() cannot be __meminit either.
WARNING: modpost: vmlinux.o(.text.unlikely+0x1612): Section mismatch in reference from the function arch_alloc_nodedata() to the function .meminit.text:compute_pernodesize()
The function arch_alloc_nodedata() references the function __meminit compute_pernodesize().
This is often because arch_alloc_nodedata lacks a __meminit annotation or the annotation of compute_pernodesize is wrong.
WARNING: modpost: vmlinux.o(.text.unlikely+0x1692): Section mismatch in reference from the function arch_refresh_nodedata() to the function .meminit.text:scatter_node_data()
The function arch_refresh_nodedata() references the function __meminit scatter_node_data().
This is often because arch_refresh_nodedata lacks a __meminit annotation or the annotation of scatter_node_data is wrong.
WARNING: modpost: vmlinux.o(.text.unlikely+0x1502): Section mismatch in reference from the function compute_pernodesize() to the function .meminit.text:early_nr_cpus_node()
The function compute_pernodesize() references the function __meminit early_nr_cpus_node().
This is often because compute_pernodesize lacks a __meminit annotation or the annotation of early_nr_cpus_node is wrong.
Link: https://lkml.kernel.org/r/20210411001201.3069-1-rdunlap@infradead.org
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: Mike Rapoport <rppt@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 416dcc5ce9 ]
Fix the following coccicheck warning:
./drivers/net/ethernet/cavium/liquidio/cn66xx_regs.h:413:6-28:
duplicated argument to & or |
The CN6XXX_INTR_M1UPB0_ERR here is duplicate.
Here should be CN6XXX_INTR_M1UNB0_ERR.
Signed-off-by: Wan Jiabing <wanjiabing@vivo.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 2afeec08ab ]
The logic in connect() is currently written with the assumption that
xenbus_watch_pathfmt() will return an error for a node that does not
exist. This assumption is incorrect: xenstore does allow a watch to
be registered for a nonexistent node (and will send notifications
should the node be subsequently created).
As of commit 1f2565780 ("xen-netback: remove 'hotplug-status' once it
has served its purpose"), this leads to a failure when a domU
transitions into XenbusStateConnected more than once. On the first
domU transition into Connected state, the "hotplug-status" node will
be deleted by the hotplug_status_changed() callback in dom0. On the
second or subsequent domU transition into Connected state, the
hotplug_status_changed() callback will therefore never be invoked, and
so the backend will remain stuck in InitWait.
This failure prevents scenarios such as reloading the xen-netfront
module within a domU, or booting a domU via iPXE. There is
unfortunately no way for the domU to work around this dom0 bug.
Fix by explicitly checking for existence of the "hotplug-status" node,
thereby creating the behaviour that was previously assumed to exist.
Signed-off-by: Michael Brown <mbrown@fensystems.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 738fa58ee1 ]
If instruction being single stepped caused a page fault, the kprobes
is cancelled to let the page fault handler continue as a normal page
fault. But the local irqflags are disabled so cpu will restore pstate
with DAIF masked. After pagefault is serviced, the kprobes is
triggerred again, we overwrite the saved_irqflag by calling
kprobes_save_local_irqflag(). NOTE, DAIF is masked in this new saved
irqflag. After kprobes is serviced, the cpu pstate is retored with
DAIF masked.
This patch is inspired by one patch for riscv from Liao Chang.
Signed-off-by: Jisheng Zhang <Jisheng.Zhang@synaptics.com>
Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
Link: https://lore.kernel.org/r/20210412174101.6bfb0594@xhacker.debian
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit a994eddb94 ]
Currently psw_idle does not allocate a stack frame and does not
save its r14 and r15 into the save area. Even though this is valid from
call ABI point of view, because psw_idle does not make any calls
explicitly, in reality psw_idle is an entry point for controlled
transition into serving interrupts. So, in practice, psw_idle stack
frame is analyzed during stack unwinding. Depending on build options
that r14 slot in the save area of psw_idle might either contain a value
saved by previous sibling call or complete garbage.
[task 0000038000003c28] do_ext_irq+0xd6/0x160
[task 0000038000003c78] ext_int_handler+0xba/0xe8
[task *0000038000003dd8] psw_idle_exit+0x0/0x8 <-- pt_regs
([task 0000038000003dd8] 0x0)
[task 0000038000003e10] default_idle_call+0x42/0x148
[task 0000038000003e30] do_idle+0xce/0x160
[task 0000038000003e70] cpu_startup_entry+0x36/0x40
[task 0000038000003ea0] arch_call_rest_init+0x76/0x80
So, to make a stacktrace nicer and actually point for the real caller of
psw_idle in this frequently occurring case, make psw_idle save its r14.
[task 0000038000003c28] do_ext_irq+0xd6/0x160
[task 0000038000003c78] ext_int_handler+0xba/0xe8
[task *0000038000003dd8] psw_idle_exit+0x0/0x6 <-- pt_regs
([task 0000038000003dd8] arch_cpu_idle+0x3c/0xd0)
[task 0000038000003e10] default_idle_call+0x42/0x148
[task 0000038000003e30] do_idle+0xce/0x160
[task 0000038000003e70] cpu_startup_entry+0x36/0x40
[task 0000038000003ea0] arch_call_rest_init+0x76/0x80
Reviewed-by: Sven Schnelle <svens@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit a1ebdb3741 ]
Also some omap3 devices like n900 seem to have eMMC and micro-sd swapped
around with commit 21b2cec61c ("mmc: Set PROBE_PREFER_ASYNCHRONOUS for
drivers that existed in v4.4").
Let's fix the issue with aliases as discussed on the mailing lists. While
the mmc aliases should be board specific, let's first fix the issue with
minimal changes.
Cc: Aaro Koskinen <aaro.koskinen@iki.fi>
Cc: Peter Ujfalusi <peter.ujfalusi@gmail.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 0c9fdcdba6 ]
Currently, GENI devices like i2c-qcom-geni fails to probe in ACPI boot,
if interconnect support is enabled. That's because interconnect driver
only supports DT right now. As interconnect is not necessarily required
for basic function of GENI devices, let's shield geni_icc_get() call,
and then all other ICC calls become nop due to NULL icc_path, so that
GENI devices keep working for ACPI boot.
Reviewed-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Signed-off-by: Shawn Guo <shawn.guo@linaro.org>
Link: https://lore.kernel.org/r/20210114112928.11368-1-shawn.guo@linaro.org
Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 2a2b09c867 ]
In lk 5.11.0-rc2 connecting a USB based Silicon Labs HID to I2C
bridge evaluation board (CP2112EK) causes this warning:
gpio gpiochip0: (cp2112_gpio): detected irqchip that is shared
with multiple gpiochips: please fix the driver
Simply copy what other gpio related drivers do to fix this
particular warning: replicate the struct irq_chip object in each
device instance rather than have a static object which makes that
object (incorrectly) shared by each device.
Signed-off-by: Douglas Gilbert <dgilbert@interlog.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit fa8ba6e5dc ]
When input_register_device() fails, no error return code is assigned.
To fix this bug, ret is assigned with -ENOENT as error return code.
Reported-by: TOTE Robot <oslab@tsinghua.edu.cn>
Signed-off-by: Jia-Ju Bai <baijiaju1990@gmail.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Sasha Levin <sashal@kernel.org>