Patch series "userfaultfd: support minor fault handling for shmem", v2.
Overview
========
See my original series [1] for a detailed overview of minor fault handling
in general. The feature in this series works exactly like the hugetlbfs
version (from userspace's perspective).
I'm sending this as a separate series because:
- The original minor fault handling series has a full set of R-Bs, and seems
close to being merged. So, it seems reasonable to start looking at this next
step, which extends the basic functionality.
- shmem is different enough that this series may require some additional work
before it's ready, and I don't want to delay the original series
unnecessarily by bundling them together.
Use Case
========
In some cases it is useful to have VM memory backed by tmpfs instead of
hugetlbfs. So, this feature will be used to support the same VM live
migration use case described in my original series.
Additionally, Android folks (Lokesh Gidra <lokeshgidra@google.com>) hope
to optimize the Android Runtime garbage collector using this feature:
"The plan is to use userfaultfd for concurrently compacting the heap.
With this feature, the heap can be shared-mapped at another location where
the GC-thread(s) could continue the compaction operation without the need
to invoke userfault ioctl(UFFDIO_COPY) each time. OTOH, if and when Java
threads get faults on the heap, UFFDIO_CONTINUE can be used to resume
execution. Furthermore, this feature enables updating references in the
'non-moving' portion of the heap efficiently. Without this feature,
unnecessary page copying (ioctl(UFFDIO_COPY)) would be required."
[1] https://lore.kernel.org/linux-fsdevel/20210301222728.176417-1-axelrasmussen@google.com/T/#t
This patch (of 5):
Modify the userfaultfd register API to allow registering shmem VMAs in
minor mode. Modify the shmem mcopy implementation to support
UFFDIO_CONTINUE in order to resolve such faults.
Combine the shmem mcopy handler functions into a single
shmem_mcopy_atomic_pte, which takes a mode parameter. This matches how
the hugetlbfs implementation is structured, and lets us remove a good
chunk of boilerplate.
Link: https://lkml.kernel.org/r/20210302000133.272579-1-axelrasmussen@google.com
Link: https://lkml.kernel.org/r/20210302000133.272579-2-axelrasmussen@google.com
Signed-off-by: Axel Rasmussen <axelrasmussen@google.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Jerome Glisse <jglisse@redhat.com>
Cc: Joe Perches <joe@perches.com>
Cc: Lokesh Gidra <lokeshgidra@google.com>
Cc: Mike Rapoport <rppt@linux.vnet.ibm.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: Shaohua Li <shli@fb.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Wang Qing <wangqing@vivo.com>
Cc: Brian Geffon <bgeffon@google.com>
Cc: Cannon Matthews <cannonmatthews@google.com>
Cc: "Dr . David Alan Gilbert" <dgilbert@redhat.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Michel Lespinasse <walken@google.com>
Cc: Mina Almasry <almasrymina@google.com>
Cc: Oliver Upton <oupton@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
(cherry picked from commit 4cc6e15679966aa49afc5b114c3c83ba0ac39b05
https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git akpm)
Link: https://lore.kernel.org/patchwork/patch/1388146/
Conflicts:
mm/shmem.c
(1. Manual rebase
2. Enclosed shmem_mcopy_atomic_pte() with CONFIG_USERFAULTFD to avoid
compile errors when USERFAULTFD is not enabled.)
Signed-off-by: Lokesh Gidra <lokeshgidra@google.com>
Bug: 160737021
Bug: 169683130
Change-Id: Idcd822b2a124a089121b9ad8c65061f6979126ec
Patch series "userfaultfd: add minor fault handling", v9.
Overview
========
This series adds a new userfaultfd feature, UFFD_FEATURE_MINOR_HUGETLBFS.
When enabled (via the UFFDIO_API ioctl), this feature means that any
hugetlbfs VMAs registered with UFFDIO_REGISTER_MODE_MISSING will *also*
get events for "minor" faults. By "minor" fault, I mean the following
situation:
Let there exist two mappings (i.e., VMAs) to the same page(s) (shared
memory). One of the mappings is registered with userfaultfd (in minor
mode), and the other is not. Via the non-UFFD mapping, the underlying
pages have already been allocated & filled with some contents. The UFFD
mapping has not yet been faulted in; when it is touched for the first
time, this results in what I'm calling a "minor" fault. As a concrete
example, when working with hugetlbfs, we have huge_pte_none(), but
find_lock_page() finds an existing page.
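The precondition described above (two mappings backed by the same pages, with
the contents filled in through only one of them) can be sketched in plain
userspace C. This is a minimal illustration, not part of the series: it uses
tmpfile() as a stand-in for a tmpfs or hugetlbfs file, involves no userfaultfd
registration at all, and the function name is hypothetical.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Map the same file-backed page twice and check that data written through
 * the first mapping is visible through the second one.
 * Returns 1 if the contents match, 0 if they differ, -1 on setup failure.
 */
static int two_mappings_share_pages(void)
{
    const char msg[] = "filled via the first mapping";
    long page = sysconf(_SC_PAGESIZE);
    FILE *backing = tmpfile();  /* stand-in for a tmpfs/hugetlbfs file */
    char *writer, *reader;
    int fd, same;

    if (!backing || page <= 0)
        return -1;
    assert(sizeof(msg) <= (size_t)page);

    fd = fileno(backing);
    if (ftruncate(fd, page) != 0) {
        fclose(backing);
        return -1;
    }

    /* First mapping: the one used to populate the page. */
    writer = mmap(NULL, page, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    /* Second mapping: in the real use case this one is UFFD-registered. */
    reader = mmap(NULL, page, PROT_READ, MAP_SHARED, fd, 0);
    if (writer == MAP_FAILED || reader == MAP_FAILED) {
        if (writer != MAP_FAILED)
            munmap(writer, page);
        if (reader != MAP_FAILED)
            munmap(reader, page);
        fclose(backing);
        return -1;
    }

    memcpy(writer, msg, sizeof(msg));
    /* First touch of 'reader' faults it in; the page already exists. */
    same = memcmp(reader, msg, sizeof(msg)) == 0;

    munmap(writer, page);
    munmap(reader, page);
    fclose(backing);
    return same;
}
```

In the minor fault case, the first touch through the second mapping is the
point at which userfaultfd would deliver the event.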
We also add a new ioctl to resolve such faults: UFFDIO_CONTINUE. The idea
is, userspace resolves the fault by either a) doing nothing if the
contents are already correct, or b) updating the underlying contents using
the second, non-UFFD mapping (via memcpy/memset or similar, or something
fancier like RDMA, or etc...). In either case, userspace issues
UFFDIO_CONTINUE to tell the kernel "I have ensured the page contents are
correct, carry on setting up the mapping".
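As a rough sketch of the resolution step described above, userspace could wrap
the new ioctl as follows. The wrapper name is hypothetical, and the sketch
assumes uapi headers that already define UFFDIO_CONTINUE and struct
uffdio_continue; on older headers it just reports -ENOSYS.

```c
#include <assert.h>
#include <errno.h>
#include <linux/userfaultfd.h>
#include <sys/ioctl.h>

/*
 * Tell the kernel that the contents of [start, start + len) are correct and
 * that it may finish setting up the mapping.
 * Returns 0 on success or a negative errno value.
 */
static int uffd_continue_range(int uffd, unsigned long start, unsigned long len)
{
    assert(len != 0);   /* an empty range is a caller bug */
#ifdef UFFDIO_CONTINUE
    struct uffdio_continue cont = {
        .range = { .start = start, .len = len },
        .mode = 0,
    };

    if (ioctl(uffd, UFFDIO_CONTINUE, &cont) == -1)
        return -errno;
    return 0;
#else
    (void)uffd;
    (void)start;
    return -ENOSYS;     /* headers predate UFFDIO_CONTINUE */
#endif
}
```

The caller is expected to have updated the page through the non-UFFD mapping
(or decided that no update is needed) before calling this.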
Use Case
========
Consider the use case of VM live migration (e.g. under QEMU/KVM):
1. While a VM is still running, we copy the contents of its memory to a
target machine. The pages are populated on the target by writing to the
non-UFFD mapping, using the setup described above. The VM is still running
(and therefore its memory is likely changing), so this may be repeated
several times, until we decide the target is "up to date enough".
2. We pause the VM on the source, and start executing on the target machine.
During this gap, the VM's user(s) will *see* a pause, so it is desirable to
minimize this window.
3. Between the last time any page was copied from the source to the target, and
when the VM was paused, the contents of that page may have changed - and
therefore the copy we have on the target machine is out of date. Although we
can keep track of which pages are out of date, for VMs with large amounts of
memory, it is "slow" to transfer this information to the target machine. We
want to resume execution before such a transfer would complete.
4. So, the guest begins executing on the target machine. The first time it
touches its memory (via the UFFD-registered mapping), userspace wants to
intercept this fault. Userspace checks whether or not the page is up to date,
and if not, copies the updated page from the source machine, via the non-UFFD
mapping. Finally, whether a copy was performed or not, userspace issues a
UFFDIO_CONTINUE ioctl to tell the kernel "I have ensured the page contents
are correct, carry on setting up the mapping".
We don't have to do all of the final updates on-demand. The userfaultfd manager
can, in the background, also copy over updated pages once it receives the map of
which pages are up-to-date or not.
Interaction with Existing APIs
==============================
Because this is a feature, a registered VMA could potentially receive both
missing and minor faults. I spent some time thinking through how the
existing API interacts with the new feature:
UFFDIO_CONTINUE cannot be used to resolve non-minor faults, as it does not
allocate a new page. If UFFDIO_CONTINUE is used on a non-minor fault:
- For non-shared memory or shmem, -EINVAL is returned.
- For hugetlb, -EFAULT is returned.
UFFDIO_COPY and UFFDIO_ZEROPAGE cannot be used to resolve minor faults.
Without modifications, the existing codepath assumes a new page needs to
be allocated. This is okay, since userspace must have a second
non-UFFD-registered mapping anyway, thus there isn't much reason to want
to use these in any case (just memcpy or memset or similar).
- If UFFDIO_COPY is used on a minor fault, -EEXIST is returned.
- If UFFDIO_ZEROPAGE is used on a minor fault, -EEXIST is returned (or -EINVAL
in the case of hugetlb, as UFFDIO_ZEROPAGE is unsupported in any case).
- UFFDIO_WRITEPROTECT simply doesn't work with shared memory, and returns
-ENOENT in that case (regardless of the kind of fault).
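The error cases listed above can be summarized as a small lookup function.
This is only a model of the text in this cover letter, not kernel code: the
enum and function names are hypothetical, hugetlb is assumed to be a shared
mapping, and a return value of 0 means "no error described here", not a
guarantee of success.

```c
#include <assert.h>
#include <errno.h>

enum uffd_op { OP_COPY, OP_ZEROPAGE, OP_CONTINUE, OP_WRITEPROTECT };
enum fault_kind { FAULT_MISSING, FAULT_MINOR };
enum mem_kind { MEM_PRIVATE, MEM_SHMEM, MEM_HUGETLB /* assumed shared */ };

/* Return the errno described in the text for this combination, else 0. */
static int described_result(enum uffd_op op, enum fault_kind fault,
                            enum mem_kind mem)
{
    /* Minor faults need a second mapping, so they imply shared memory. */
    assert(!(fault == FAULT_MINOR && mem == MEM_PRIVATE));

    switch (op) {
    case OP_CONTINUE:
        /* Does not allocate a page, so it only resolves minor faults. */
        if (fault != FAULT_MINOR)
            return mem == MEM_HUGETLB ? -EFAULT : -EINVAL;
        return 0;
    case OP_COPY:
        /* Assumes a new page must be allocated. */
        return fault == FAULT_MINOR ? -EEXIST : 0;
    case OP_ZEROPAGE:
        /* Unsupported on hugetlb regardless of the kind of fault. */
        if (mem == MEM_HUGETLB)
            return -EINVAL;
        return fault == FAULT_MINOR ? -EEXIST : 0;
    case OP_WRITEPROTECT:
        /* Does not work with shared memory, whatever the fault. */
        return mem == MEM_PRIVATE ? 0 : -ENOENT;
    }
    return 0;
}
```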
Future Work
===========
This series only supports hugetlbfs. I have a second series in flight to
support shmem as well, extending the functionality. This series is more
mature than the shmem support at this point, and the functionality works
fully on hugetlbfs, so this series can be merged first and then shmem
support will follow.
This patch (of 6):
This feature allows userspace to intercept "minor" faults. By "minor"
faults, I mean the following situation:
Let there exist two mappings (i.e., VMAs) to the same page(s). One of the
mappings is registered with userfaultfd (in minor mode), and the other is
not. Via the non-UFFD mapping, the underlying pages have already been
allocated & filled with some contents. The UFFD mapping has not yet been
faulted in; when it is touched for the first time, this results in what
I'm calling a "minor" fault. As a concrete example, when working with
hugetlbfs, we have huge_pte_none(), but find_lock_page() finds an existing
page.
This commit adds the new registration mode, and sets the relevant flag on
the VMAs being registered. In the hugetlb fault path, if we find that we
have huge_pte_none(), but find_lock_page() does indeed find an existing
page, then we have a "minor" fault, and if the VMA has the userfaultfd
registration flag, we call into userfaultfd to handle it.
This is implemented as a new registration mode, instead of an API feature.
This is because the alternative implementation has significant drawbacks
[1].
However, doing it this way requires we allocate a VM_* flag for the new
registration mode. On 32-bit systems, there are no unused bits, so this
feature is only supported on architectures with
CONFIG_ARCH_USES_HIGH_VMA_FLAGS. When attempting to register a VMA in
MINOR mode on 32-bit architectures, we return -EINVAL.
[1] https://lore.kernel.org/patchwork/patch/1380226/
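The 32-bit limitation described above comes down to whether a flag bit fits in
the flags word. A minimal model follows, assuming a hypothetical bit number
(37 is used purely for illustration) and hypothetical helper names; the real
kernel decides this at build time through Kconfig rather than at run time.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

#define EXAMPLE_MINOR_BIT 37    /* hypothetical bit position for the new flag */

/* A flag bit is usable only if it lies inside the flags word. */
static bool flag_bit_fits(unsigned int bit, unsigned int word_bits)
{
    return bit < word_bits;
}

/* Model of registering in MINOR mode with a flags word of the given width. */
static int register_minor_mode(unsigned int word_bits)
{
    assert(word_bits == 32 || word_bits == 64);
    if (!flag_bit_fits(EXAMPLE_MINOR_BIT, word_bits))
        return -EINVAL;     /* no room for the flag: reject registration */
    return 0;
}
```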
Link: https://lkml.kernel.org/r/20210301222728.176417-1-axelrasmussen@google.com
Link: https://lkml.kernel.org/r/20210301222728.176417-2-axelrasmussen@google.com
Signed-off-by: Axel Rasmussen <axelrasmussen@google.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Chinwen Chang <chinwen.chang@mediatek.com>
Cc: Huang Ying <ying.huang@intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jann Horn <jannh@google.com>
Cc: Jerome Glisse <jglisse@redhat.com>
Cc: Lokesh Gidra <lokeshgidra@google.com>
Cc: "Matthew Wilcox (Oracle)" <willy@infradead.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: "Michal Koutný" <mkoutny@suse.com>
Cc: Michel Lespinasse <walken@google.com>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Mike Rapoport <rppt@linux.vnet.ibm.com>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: Shaohua Li <shli@fb.com>
Cc: Shawn Anastasio <shawn@anastas.io>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Steven Price <steven.price@arm.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Adam Ruprecht <ruprecht@google.com>
Cc: Axel Rasmussen <axelrasmussen@google.com>
Cc: Cannon Matthews <cannonmatthews@google.com>
Cc: "Dr . David Alan Gilbert" <dgilbert@redhat.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Mina Almasry <almasrymina@google.com>
Cc: Oliver Upton <oupton@google.com>
Cc: Kirill A. Shutemov <kirill@shutemov.name>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
(cherry picked from commit 82a150ec394f6b944e26786b907fc0deab5b2064
https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git akpm)
Link: https://lore.kernel.org/patchwork/patch/1388132/
Conflicts:
arch/arm64/Kconfig
fs/userfaultfd.c
mm/hugetlb.c
(All related to SPF feature. Resolved by manual rebase)
Signed-off-by: Lokesh Gidra <lokeshgidra@google.com>
Bug: 160737021
Bug: 169683130
Change-Id: I43b37272d531341439ceaa03213d0e2415e04688
There are two tracepoints in dwc3_readl() and dwc3_writel().
This patch will export the tracepoints so that vendor modules
can use them.
Bug: 184920962
Signed-off-by: Ray Chi <raychi@google.com>
Change-Id: I1170d853be1fa1c47afbba133567b1996418d8e8
Commit ed47acc0c8 ("ASoC: soc-core: Prevent warning if no DMI table is
present") changed soc-core.c by adding #include <linux/acpi.h>. That
caused the visibility of other symbols to suddenly change and so
genksyms changed for some soc-core.c functions when really nothing
changed at all.
Work around this "fun" by providing a __GENKSYMS__ check to include the
acpi.h file or not. Ugh.
Bug: 161946584
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: I4b3c5634de2336af6bbf99f25fd9250a365991bf
This reverts commit e21d2b9235
It breaks the abi but we can bring it back later on when the KABI update
happens in a few days.
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: I7a5861c037be3e35973893d8c91eda9133bf8595
This reverts commit 7973a0dad0.
It breaks the abi but we can bring it back later on when the KABI update
happens in a few days.
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: I01fcc3fd586cb0e748524355403b3871c41df2b7
This reverts commit 1a5751d58b.
It breaks the abi but we can bring it back later on when the KABI update
happens in a few days.
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: I713290c735f2c01291c539ab346341fd9aac91ad
Changes in 5.10.28
arm64: mm: correct the inside linear map range during hotplug check
bpf: Fix fexit trampoline.
virtiofs: Fail dax mount if device does not support it
ext4: shrink race window in ext4_should_retry_alloc()
ext4: fix bh ref count on error paths
fs: nfsd: fix kconfig dependency warning for NFSD_V4
rpc: fix NULL dereference on kmalloc failure
iomap: Fix negative assignment to unsigned sis->pages in iomap_swapfile_activate
ASoC: rt1015: fix i2c communication error
ASoC: rt5640: Fix dac- and adc- vol-tlv values being off by a factor of 10
ASoC: rt5651: Fix dac- and adc- vol-tlv values being off by a factor of 10
ASoC: sgtl5000: set DAP_AVC_CTRL register to correct default value on probe
ASoC: es8316: Simplify adc_pga_gain_tlv table
ASoC: soc-core: Prevent warning if no DMI table is present
ASoC: cs42l42: Fix Bitclock polarity inversion
ASoC: cs42l42: Fix channel width support
ASoC: cs42l42: Fix mixer volume control
ASoC: cs42l42: Always wait at least 3ms after reset
NFSD: fix error handling in NFSv4.0 callbacks
kernel: freezer should treat PF_IO_WORKER like PF_KTHREAD for freezing
vhost: Fix vhost_vq_reset()
io_uring: fix ->flags races by linked timeouts
scsi: st: Fix a use after free in st_open()
scsi: qla2xxx: Fix broken #endif placement
staging: comedi: cb_pcidas: fix request_irq() warn
staging: comedi: cb_pcidas64: fix request_irq() warn
ASoC: rt5659: Update MCLK rate in set_sysclk()
ASoC: rt711: add snd_soc_component remove callback
thermal/core: Add NULL pointer check before using cooling device stats
locking/ww_mutex: Simplify use_ww_ctx & ww_ctx handling
locking/ww_mutex: Fix acquire/release imbalance in ww_acquire_init()/ww_acquire_fini()
nvmet-tcp: fix kmap leak when data digest in use
io_uring: imply MSG_NOSIGNAL for send[msg]()/recv[msg]() calls
static_call: Align static_call_is_init() patching condition
ext4: do not iput inode under running transaction in ext4_rename()
io_uring: call req_set_fail_links() on short send[msg]()/recv[msg]() with MSG_WAITALL
net: mvpp2: fix interrupt mask/unmask skip condition
flow_dissector: fix TTL and TOS dissection on IPv4 fragments
can: dev: move driver related infrastructure into separate subdir
net: introduce CAN specific pointer in the struct net_device
can: tcan4x5x: fix max register value
brcmfmac: clear EAP/association status bits on linkdown events
ath11k: add ieee80211_unregister_hw to avoid kernel crash caused by NULL pointer
rtw88: coex: 8821c: correct antenna switch function
netdevsim: dev: Initialize FIB module after debugfs
iwlwifi: pcie: don't disable interrupts for reg_lock
ath10k: hold RCU lock when calling ieee80211_find_sta_by_ifaddr()
net: ethernet: aquantia: Handle error cleanup of start on open
appletalk: Fix skb allocation size in loopback case
net: ipa: remove two unused register definitions
net: ipa: fix register write command validation
net: wan/lmc: unregister device when no matching device is found
net: 9p: advance iov on empty read
bpf: Remove MTU check in __bpf_skb_max_len
ACPI: tables: x86: Reserve memory occupied by ACPI tables
ACPI: processor: Fix CPU0 wakeup in acpi_idle_play_dead()
ALSA: usb-audio: Apply sample rate quirk to Logitech Connect
ALSA: hda: Re-add dropped snd_poewr_change_state() calls
ALSA: hda: Add missing sanity checks in PM prepare/complete callbacks
ALSA: hda/realtek: fix a determine_headset_type issue for a Dell AIO
ALSA: hda/realtek: call alc_update_headset_mode() in hp_automute_hook
ALSA: hda/realtek: fix mute/micmute LEDs for HP 640 G8
xtensa: fix uaccess-related livelock in do_page_fault
xtensa: move coprocessor_flush to the .text section
KVM: SVM: load control fields from VMCB12 before checking them
KVM: SVM: ensure that EFER.SVME is set when running nested guest or on nested vmexit
PM: runtime: Fix race getting/putting suppliers at probe
PM: runtime: Fix ordering in pm_runtime_get_suppliers()
tracing: Fix stack trace event size
s390/vdso: copy tod_steering_delta value to vdso_data page
s390/vdso: fix tod_steering_delta type
mm: fix race by making init_zero_pfn() early_initcall
drm/amdkfd: dqm fence memory corruption
drm/amdgpu: fix offset calculation in amdgpu_vm_bo_clear_mappings()
drm/amdgpu: check alignment on CPU page for bo map
reiserfs: update reiserfs_xattrs_initialized() condition
drm/imx: fix memory leak when fails to init
drm/tegra: dc: Restore coupling of display controllers
drm/tegra: sor: Grab runtime PM reference across reset
vfio/nvlink: Add missing SPAPR_TCE_IOMMU depends
pinctrl: rockchip: fix restore error in resume
extcon: Add stubs for extcon_register_notifier_all() functions
extcon: Fix error handling in extcon_dev_register
firmware: stratix10-svc: reset COMMAND_RECONFIG_FLAG_PARTIAL to 0
usb: dwc3: pci: Enable dis_uX_susphy_quirk for Intel Merrifield
video: hyperv_fb: Fix a double free in hvfb_probe
firewire: nosy: Fix a use-after-free bug in nosy_ioctl()
usbip: vhci_hcd fix shift out-of-bounds in vhci_hub_control()
USB: quirks: ignore remote wake-up on Fibocom L850-GL LTE modem
usb: musb: Fix suspend with devices connected for a64
usb: xhci-mtk: fix broken streams issue on 0.96 xHCI
cdc-acm: fix BREAK rx code path adding necessary calls
USB: cdc-acm: untangle a circular dependency between callback and softint
USB: cdc-acm: downgrade message to debug
USB: cdc-acm: fix double free on probe failure
USB: cdc-acm: fix use-after-free after probe failure
usb: gadget: udc: amd5536udc_pci fix null-ptr-dereference
usb: dwc2: Fix HPRT0.PrtSusp bit setting for HiKey 960 board.
usb: dwc2: Prevent core suspend when port connection flag is 0
usb: dwc3: qcom: skip interconnect init for ACPI probe
usb: dwc3: gadget: Clear DEP flags after stop transfers in ep disable
soc: qcom-geni-se: Cleanup the code to remove proxy votes
staging: rtl8192e: Fix incorrect source in memcpy()
staging: rtl8192e: Change state information from u16 to u8
driver core: clear deferred probe reason on probe retry
drivers: video: fbcon: fix NULL dereference in fbcon_cursor()
riscv: evaluate put_user() arg before enabling user access
Revert "kernel: freezer should treat PF_IO_WORKER like PF_KTHREAD for freezing"
bpf: Use NOP_ATOMIC5 instead of emit_nops(&prog, 5) for BPF_TRAMP_F_CALL_ORIG
Linux 5.10.28
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: Ifdbbeda8de3ee22a7aa3f5d3b10becf0aba1a124
Add vendor hook to get signal for vendor-specific tuning.
Bug: 184898838
Signed-off-by: Zhuguangqing <zhuguangqing@xiaomi.com>
Change-Id: I83a28b0a6eb413976f4c57f2314d008ad792fa0d
Use the correct printk length specifier [%llx] for u64 variable.
This fixes the following warning:
arch/arm64/mm/mmu.c: In function ‘check_range_driver_managed’:
./include/linux/kern_levels.h:5:18: warning: format ‘%lx’ expects
argument of type ‘long unsigned int’, but argument 3 has type
‘u64’ {aka ‘long long unsigned int’} [-Wformat=]
[...]
arch/arm64/mm/mmu.c:1515:3: note: in expansion of macro ‘pr_err’
1515 | pr_err("%s: couldn't find memory resource for start 0x%lx\n",
| ^~~~~~
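The same class of mismatch can be reproduced in userspace. A minimal sketch,
with hypothetical function names: in the kernel u64 is unsigned long long, so
%llx matches directly, whereas userspace uint64_t may be unsigned long, which
is why the sketch casts explicitly.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Format a 64-bit start address with a specifier that matches its type. */
static int format_start(char *buf, size_t size, uint64_t start)
{
    assert(buf != NULL && size > 0);
    return snprintf(buf, size, "start 0x%llx", (unsigned long long)start);
}

/* Check that all 64 bits survive formatting. */
static int formats_full_u64(void)
{
    char buf[64];

    format_start(buf, sizeof(buf), 0x1234567890abcdefULL);
    return strcmp(buf, "start 0x1234567890abcdef") == 0;
}
```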
Bug: 183339614
Fixes: 1b4aca7d82 (ANDROID: arm64/mm: implement {populate/depopulate}_range_driver_managed)
Signed-off-by: Carlos Llamas <cmllamas@google.com>
Change-Id: I664223ef6c0c5f415e0b6465a0b589667f26e551
LLVM changed the expected function signature for
llvm_gcda_emit_function() in the clang-11 release. Users of clang-11 or
newer may have noticed their kernels producing invalid coverage
information:
$ llvm-cov gcov -a -c -u -f -b <input>.gcda -- gcno=<input>.gcno
1 <func>: checksum mismatch, \
(<lineno chksum A>, <cfg chksum B>) != (<lineno chksum A>, <cfg chksum C>)
2 Invalid .gcda File!
...
Fix up the function signatures so calling this function interprets its
parameters correctly and computes the correct cfg checksum. In
particular, in clang-11, the additional checksum is no longer optional.
Link: https://reviews.llvm.org/rG25544ce2df0daa4304c07e64b9c8b0f7df60c11d
Cc: stable@vger.kernel.org #5.4+
Reported-by: Prasad Sodagudi <psodagud@quicinc.com>
Tested-by: Prasad Sodagudi <psodagud@quicinc.com>
Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>
Reviewed-by: Nathan Chancellor <nathan@kernel.org>
(am from https://lore.kernel.org/lkml/20210407185456.41943-2-ndesaulniers@google.com/)
Bug: 182501993
Change-Id: Icd98cf11a6fca0fc55b1399e5b244dc1c81c71e8
commit 126c2fc191 ("ANDROID: dma-heap: Make the
page-pool/deferred-free libraries built-in") introduced deferred_free as
a built-in. Add it to qcom symbol list.
deferred_free caused us to need __refrigerator. Now that deferred_free
is builtin, drop __refrigerator.
Bug: 183902174
Change-Id: I362b49b176aaa418d79840890454fa43775b4611
Signed-off-by: Elliot Berman <eberman@codeaurora.org>
Group support is not implemented and this rather disturbs downstream
merges. So, drop them.
Fixes: 2fa0951b66 ("ANDROID: Initial Android 12 OWNERS for abi metafiles")
Signed-off-by: Matthias Maennich <maennich@google.com>
Change-Id: I7d1462ff87b05eb678b1c97f5e0b89b97d4e91a5
Partners may want Image.lz4, so generate it as part of aarch64 builds.
Bug: 184667897
Signed-off-by: J. Avila <elavila@google.com>
Change-Id: I434287c881eb5cc906ff205e82866ede14014528
The explicit out-fences in crtc are signaled as part of vblank event,
indicating all framebuffers present on the Atomic Commit request are
scanned out on the screen. Though the fence signal and the vblank event
notification happens at the same time, triggered by the same hardware
vsync event, the timestamps set in the two are different. With drivers
supporting precise vblank timestamp the difference between the two
timestamps would be even higher. This might have an impact on user-mode
frameworks using these fence timestamps for purposes other than simple
buffer usage. For instance, the Android framework [1] uses the
retire-fences as an alternative to vblank when frame-updates are in
progress. Set the fence timestamp during send vblank event using a new
drm_send_event_timestamp_locked variant to avoid discrepancies.
[1] https://android.googlesource.com/platform/frameworks/native/+/master/
services/surfaceflinger/Scheduler/Scheduler.cpp#397
Changes in v2:
- Use drm_send_event_timestamp_locked to update fence timestamp
- add more information to commit text
Changes in v3:
- use same backend helper function for variants of drm_send_event to
avoid code duplications
Changes in v4:
- remove WARN_ON from drm_send_event_timestamp_locked
Bug: 173434777
Signed-off-by: Veera Sundaram Sankaran <veeras@codeaurora.org>
Signed-off-by: Narendra Muppalla <NarendraM@codeaurora.org>
Reviewed-by: John Stultz <john.stultz@linaro.org>
Signed-off-by: Sumit Semwal <sumit.semwal@linaro.org>
[sumits: minor parenthesis alignment correction]
Link: https://patchwork.freedesktop.org/patch/msgid/1610757107-11892-2-git-send-email-veeras@codeaurora.org
(cherry picked from commit a78e7a51d2)
Signed-off-by: Sumit Semwal <sumit.semwal@linaro.org>
Change-Id: Iaa29508f72e2c9c7abd2e3fe7a4dc7b9665336a5
Some drivers have hardware capability to get the precise HW timestamp
of certain events based on which the fences are triggered. The delta
between the event HW timestamp & current HW reference timestamp can
be used to calculate the timestamp in kernel's CLOCK_MONOTONIC time
domain. This allows it to set accurate timestamp factoring out any
software and IRQ latencies. Add a timestamp variant of fence signal
function, dma_fence_signal_timestamp to allow drivers to update the
precise timestamp for fences.
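The delta calculation described above can be sketched as plain arithmetic.
This is an illustration only: the function name, the tick-based hardware
counter and its frequency parameter are assumptions, and a real driver would
need to take care of counter wraparound and multiplication overflow.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Convert a hardware event timestamp into the CLOCK_MONOTONIC domain:
 * take "now" in both domains, then subtract the age of the event.
 */
static int64_t event_time_monotonic_ns(uint64_t hw_event_ticks,
                                       uint64_t hw_now_ticks,
                                       uint64_t tick_hz,
                                       int64_t mono_now_ns)
{
    uint64_t delta_ticks, delta_ns;

    assert(tick_hz != 0 && hw_now_ticks >= hw_event_ticks);

    delta_ticks = hw_now_ticks - hw_event_ticks;
    /* Fine for small deltas; large ones would overflow this product. */
    delta_ns = delta_ticks * 1000000000ULL / tick_hz;

    return mono_now_ns - (int64_t)delta_ns;
}
```

The resulting value no longer includes the software and IRQ latency between
the hardware event and the moment the fence is signaled.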
Changes in v2:
- Add a new fence signal variant instead of modifying fence struct
Changes in v3:
- Add timestamp domain information to commit-text and
dma_fence_signal_timestamp documentation
Bug: 173434777
Signed-off-by: Veera Sundaram Sankaran <veeras@codeaurora.org>
Signed-off-by: Narendra Muppalla <NarendraM@codeaurora.org>
Reviewed-by: John Stultz <john.stultz@linaro.org>
Signed-off-by: Sumit Semwal <sumit.semwal@linaro.org>
[sumits: minor parenthesis alignment]
Link: https://patchwork.freedesktop.org/patch/msgid/1610757107-11892-1-git-send-email-veeras@codeaurora.org
(cherry picked from commit 5a164ac4db)
Signed-off-by: Sumit Semwal <sumit.semwal@linaro.org>
Change-Id: I994fb91264aac1f5c141415647df3467819ff1f3
commit d3dc04cd81 upstream.
This reverts commit 15b2219fac.
Before IO threads accepted signals, the freezer using fake signals to wake
up an IO thread would cause it to loop without any way to clear the
pending signal. That is no longer the case, so stop special casing
PF_IO_WORKER in the freezer.
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 285a76bb2c upstream.
The <asm/uaccess.h> header has a problem with put_user(a, ptr) if
the 'a' is not a simple variable, such as a function call. This can lead
to the compiler producing code like so:
1: enable_user_access()
2: evaluate 'a' into register 'r'
3: put 'r' to 'ptr'
4: disable_user_access()
The issue is that 'a' is now being evaluated with the user memory
protections disabled. So we try to force the evaluation by assigning
'x' to __val at the start, and hoping the compiler barriers in
enable_user_access() do the job of ordering step 2 before step 1.
This has shown up in a bug where 'a' sleeps and thus schedules out
and loses the SR_SUM flag. This isn't sufficient to fully fix, but
should reduce the window of opportunity. The first instance of this
we found is in schedule_tail() where the code does:
$ less -N kernel/sched/core.c
4263 if (current->set_child_tid)
4264 put_user(task_pid_vnr(current), current->set_child_tid);
Here, the task_pid_vnr(current) is called within the block that has
enabled the user memory access. This can be made worse with KASAN
which makes task_pid_vnr() a rather large call with plenty of
opportunity to sleep.
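The ordering problem can be modeled in userspace with two macro variants.
This is a simulation under stated assumptions, not the RISC-V code: a global
flag stands in for SR_SUM, all names are hypothetical, and it only shows
source-level evaluation order (the real fix additionally relies on compiler
barriers).

```c
#include <assert.h>
#include <stdbool.h>

static bool user_access_enabled;        /* stand-in for the SR_SUM state */
static bool evaluated_inside_window;    /* recorded by the argument below */

static void enable_user_access(void)
{
    assert(!user_access_enabled);       /* the window is not nested here */
    user_access_enabled = true;
}

static void disable_user_access(void)
{
    user_access_enabled = false;
}

/* Stand-in for an argument such as task_pid_vnr(current). */
static int value_with_side_effects(void)
{
    evaluated_inside_window = user_access_enabled;
    return 42;
}

/* Problematic shape: 'x' is evaluated after user access is enabled. */
#define put_user_late(x, ptr) do {          \
    enable_user_access();                   \
    *(ptr) = (x);                           \
    disable_user_access();                  \
} while (0)

/* Fixed shape: evaluate 'x' into a temporary before opening the window. */
#define put_user_early(x, ptr) do {         \
    __typeof__(*(ptr)) val_ = (x);          \
    enable_user_access();                   \
    *(ptr) = val_;                          \
    disable_user_access();                  \
} while (0)

static bool late_variant_evaluates_inside_window(void)
{
    int slot = 0;

    put_user_late(value_with_side_effects(), &slot);
    assert(slot == 42);
    return evaluated_inside_window;
}

static bool early_variant_evaluates_inside_window(void)
{
    int slot = 0;

    put_user_early(value_with_side_effects(), &slot);
    assert(slot == 42);
    return evaluated_inside_window;
}
```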
Signed-off-by: Ben Dooks <ben.dooks@codethink.co.uk>
Reported-by: syzbot+e74b94fe601ab9552d69@syzkaller.appspotmail.com
Suggested-by: Arnd Bergman <arnd@arndb.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
--
Changes since v1:
- fixed formatting and updated the patch description with more info
Changes since v2:
- fixed commenting on __put_user() (schwab@linux-m68k.org)
Change since v3:
- fixed RFC in patch title. Should be ready to merge.
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
commit f0acf637d6 upstream.
When retrying a deferred probe, any old defer reason string should be
discarded. Otherwise, if the probe is deferred again at a different spot,
but without setting a message, the now incorrect probe reason will remain.
This was observed with the i.MX I2C driver, which ultimately failed
to probe due to lack of the GPIO driver. The probe defer for GPIO
doesn't record a message, but a previous probe defer to clock_get did.
This had the effect that /sys/kernel/debug/devices_deferred listed
a misleading probe deferral reason.
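The stale-reason scenario can be reproduced with a small model. The struct
and function names below are hypothetical stand-ins, not the driver core API;
the sketch only shows why the reason has to be cleared when a probe is
retried.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

struct probe_state {
    char reason[64];    /* last recorded deferral reason, "" if none */
};

/* Defer a probe; a reason is recorded only if the caller provides one. */
static void defer_probe(struct probe_state *s, const char *reason)
{
    assert(s != NULL);
    if (reason)
        snprintf(s->reason, sizeof(s->reason), "%s", reason);
}

/* Retry a probe, optionally discarding the old reason first. */
static void retry_probe(struct probe_state *s, bool clear_reason)
{
    if (clear_reason)
        s->reason[0] = '\0';
}

/* Returns 1 if the first reason is still shown after a second, silent defer. */
static int stale_reason_survives(bool clear_reason)
{
    struct probe_state s = { "" };

    defer_probe(&s, "clock not ready");     /* first defer sets a message */
    retry_probe(&s, clear_reason);
    defer_probe(&s, NULL);                  /* second defer sets none */
    return strcmp(s.reason, "clock not ready") == 0;
}
```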
Cc: stable <stable@vger.kernel.org>
Fixes: d090b70ede ("driver core: add deferring probe reason to devices_deferred property")
Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com>
Reviewed-by: Andrzej Hajda <a.hajda@samsung.com>
Signed-off-by: Ahmad Fatoum <a.fatoum@pengutronix.de>
Link: https://lore.kernel.org/r/20210319110459.19966-1-a.fatoum@pengutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit e78836ae76 upstream.
The "u16 CcxRmState[2];" array field in struct "rtllib_network" has 4
bytes in total while the operations performed on this array throughout
the code base are only 2 bytes.
The "CcxRmState" field is fed only 2 bytes of data using memcpy():
(In rtllib_rx.c:1972)
memcpy(network->CcxRmState, &info_element->data[4], 2)
With "info_element->data[]" being a u8 array, if 2 bytes are written
into "CcxRmState" (whose one element is u16 size), then the 2 u8
elements from "data[]" get squashed and written into the first element
("CcxRmState[0]") while the second element ("CcxRmState[1]") is never
fed with any data.
Same in file rtllib_rx.c:2522:
memcpy(dst->CcxRmState, src->CcxRmState, 2);
The above line duplicates "src" data to "dst" but only writes 2 bytes
(and not 4, which is the actual size). Again, only 1st element gets the
value while the 2nd element remains uninitialized.
This later makes operations done with CcxRmState unpredictable in the
following lines, as the 1st element holds a squashed value while the
2nd element holds uninitialized data.
rtllib_rx.c:1973: if (network->CcxRmState[0] != 0)
rtllib_rx.c:1977: network->MBssidMask = network->CcxRmState[1] & 0x07;
network->MBssidMask is also of type u8 and not u16.
Fix this by changing the type of "CcxRmState" from u16 to u8 so that the
data written into this array and read from it make sense and are not
random values.
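The effect of the element type can be demonstrated in userspace. This is a
minimal sketch with hypothetical function names and made-up sample bytes; a
0xffff sentinel stands in for the uninitialized second element.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Sample information element payload; bytes 4 and 5 carry the state. */
static const uint8_t sample_data[6] = { 0, 0, 0, 0, 0x01, 0x05 };

/* With u16 elements, a 2-byte copy never reaches the second element. */
static int second_element_written_u16(void)
{
    uint16_t state[2] = { 0xffff, 0xffff };     /* sentinel values */

    memcpy(state, &sample_data[4], 2);
    return state[1] != 0xffff;
}

/* With u8 elements, the same copy fills both elements as intended. */
static int mbssid_mask_u8(void)
{
    uint8_t state[2] = { 0xff, 0xff };

    assert(sizeof(state) == 2);
    memcpy(state, &sample_data[4], 2);
    return state[1] & 0x07;
}
```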
NOTE: The wrong initialization of "CcxRmState" can be seen in the
following commit:
commit ecdfa44610 ("Staging: add Realtek 8192 PCI wireless driver")
The above commit created a file `rtl8192e/ieee80211.h` which used to
have the faulty line. The file has been deleted (or possibly renamed)
with the contents copied in to a new file `rtl8192e/rtllib.h` along with
additional code in the commit 94a799425e (tagged in Fixes).
Fixes: 94a799425e ("From: wlanfae <wlanfae@realtek.com> [PATCH 1/8] rtl8192e: Import new version of driver from realtek")
Cc: stable@vger.kernel.org
Signed-off-by: Atul Gopinathan <atulgopinathan@gmail.com>
Link: https://lore.kernel.org/r/20210323113413.29179-2-atulgopinathan@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 72ad25fbbb upstream.
The variable "info_element" is of the following type:
struct rtllib_info_element *info_element
defined in drivers/staging/rtl8192e/rtllib.h:
struct rtllib_info_element {
u8 id;
u8 len;
u8 data[];
} __packed;
The "len" field defines the size of the "data[]" array. The code is
supposed to check if "info_element->len" is greater than 4 and later
equal to 6. If this is satisfied then, the last two bytes (the 4th and
5th element of u8 "data[]" array) are copied into "network->CcxRmState".
Right now the code uses "memcpy()" with the source as "&info_element[4]"
which would copy in wrong and unintended information. The struct
"rtllib_info_element" has a size of 2 bytes for "id" and "len",
therefore indexing will be done in interval of 2 bytes. So,
"info_element[4]" would point to data which is beyond the memory
allocated for this pointer (that is, at x+8, while "info_element" has
been allocated only from x to x+7 (2 + 6 => 8 bytes)).
This patch rectifies this error by using "&info_element->data[4]" which
correctly copies the last two bytes of "data[]".
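The difference between the two expressions is a matter of pointer arithmetic,
which can be checked with byte offsets alone. A minimal sketch follows; the
struct mirrors the layout quoted above under a different name, and the helper
functions are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct info_element {
    uint8_t id;
    uint8_t len;
    uint8_t data[];
} __attribute__((packed));

/* Byte offset of &elem[i]: indexing steps in units of the whole struct. */
static size_t offset_of_struct_index(size_t i)
{
    return i * sizeof(struct info_element);
}

/* Byte offset of &elem->data[i]: indexing steps in single payload bytes. */
static size_t offset_of_data_index(size_t i)
{
    return offsetof(struct info_element, data) + i;
}

/* Does an nbytes copy starting at 'offset' stay inside the element? */
static int copy_in_bounds(size_t offset, size_t nbytes, size_t data_len)
{
    size_t total = offsetof(struct info_element, data) + data_len;

    assert(nbytes > 0);
    return offset + nbytes <= total;
}
```

With len equal to 6 the element occupies 8 bytes, so a 2-byte copy from
offset 8 reads past the end, while a copy from offset 6 reads exactly the
last two payload bytes.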
NOTE: The faulty line of code came from the following commit:
commit ecdfa44610 ("Staging: add Realtek 8192 PCI wireless driver")
The above commit created the file `rtl8192e/ieee80211/ieee80211_rx.c`
which had the faulty line of code. This file has been deleted (or
possibly renamed) with the contents copied in to a new file
`rtl8192e/rtllib_rx.c` along with additional code in the commit
94a799425e (tagged in Fixes).
Fixes: 94a799425e ("From: wlanfae <wlanfae@realtek.com> [PATCH 1/8] rtl8192e: Import new version of driver from realtek")
Cc: stable@vger.kernel.org
Signed-off-by: Atul Gopinathan <atulgopinathan@gmail.com>
Link: https://lore.kernel.org/r/20210323113413.29179-1-atulgopinathan@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 29d96eb261 upstream.
This reverts commit 048eb908a1 ("soc: qcom-geni-se: Add interconnect
support to fix earlycon crash")
The ICC core and platform drivers support the sync_state feature, which
ensures that the default ICC BW votes from the bootloader are not
removed until all its consumers are probed.
The proxy votes were needed because other QUP child drivers (I2C, SPI)
may probe before UART and turn off the QUP-CORE clock, which is a
shared resource for all QUP drivers; this causes unclocked access to
HW from earlycon.
Given the above support from ICC, there is no longer any need to
maintain proxy votes on the QUP-CORE ICC node from the QUP wrapper
driver for the early console use case; the default votes won't be
removed until the real console is probed.
Cc: stable@vger.kernel.org
Fixes: 266cd33b59 ("interconnect: qcom: Ensure that the floor bandwidth value is enforced")
Fixes: 7d3b0b0d81 ("interconnect: qcom: Use icc_sync_state")
Signed-off-by: Roja Rani Yarubandi <rojay@codeaurora.org>
Signed-off-by: Akash Asthana <akashast@codeaurora.org>
Reviewed-by: Matthias Kaehlcke <mka@chromium.org>
Link: https://lore.kernel.org/r/20210324101836.25272-2-rojay@codeaurora.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 5e4010e36a upstream.
The ACPI probe has been failing since commit bea46b9815 ("usb: dwc3:
qcom: Add interconnect support in dwc3 driver"), because there is no
interconnect support for ACPI, and the of_icc_get() call in
dwc3_qcom_interconnect_init() will just return -EINVAL.
Fix the problem by skipping interconnect init for ACPI probe; the
NULL icc_path_ddr will then simply shield all ICC calls.
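A minimal userspace model of the control flow described above, not the actual driver code. All "model_" names are hypothetical; the sketch assumes that an ICC call on a NULL path is a harmless no-op returning 0, which is what lets the NULL icc_path_ddr shield later calls.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

struct model_icc_path { int unused; };

struct model_qcom {
	bool acpi_probe;                      /* true when probed via ACPI */
	struct model_icc_path *icc_path_ddr;  /* stays NULL if init is skipped */
};

static struct model_icc_path model_ddr_path;

/* Stand-in for an ICC call: assumed to be a no-op on a NULL path. */
static int model_icc_set_bw(struct model_icc_path *path)
{
	if (!path)
		return 0;
	return 0; /* a real implementation would apply the bandwidth vote */
}

/* Stand-in for interconnect init: fails without DT, as of_icc_get() does. */
static int model_interconnect_init(struct model_qcom *q)
{
	if (q->acpi_probe)
		return -EINVAL;
	q->icc_path_ddr = &model_ddr_path;
	return 0;
}

/* skip_for_acpi == false models the old behaviour, true models the fix. */
static int model_probe(struct model_qcom *q, bool skip_for_acpi)
{
	int ret;

	q->icc_path_ddr = NULL;
	if (!(skip_for_acpi && q->acpi_probe)) {
		ret = model_interconnect_init(q);
		if (ret)
			return ret;
	}
	return model_icc_set_bw(q->icc_path_ddr);
}
```

In this model the old ordering fails an ACPI probe with -EINVAL, while the fixed ordering succeeds and leaves the path NULL.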
Fixes: bea46b9815 ("usb: dwc3: qcom: Add interconnect support in dwc3 driver")
Signed-off-by: Shawn Guo <shawn.guo@linaro.org>
Cc: stable <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/20210311060318.25418-1-shawn.guo@linaro.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 93f672804b upstream.
In host mode, the port connection status flag is "0" when the driver
is loaded. After the driver is loaded, the system asserts suspend,
which is handled by the "_dwc2_hcd_suspend()" function. Before
the system suspend, the port connection status is still "0". As a
result, "port_connect_status" needs to be checked: if it is "0",
skip entering suspend.
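The guard described above can be sketched as a tiny userspace model. The struct and function names are hypothetical stand-ins, and the real "_dwc2_hcd_suspend()" does considerably more; only the early-return check is illustrated here.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the controller state relevant to the check. */
struct model_hsotg {
	int port_connect_status;  /* 0 while no device is connected */
	bool suspended;
};

/* Model of the suspend handler with the added guard. */
static int model_hcd_suspend(struct model_hsotg *hsotg)
{
	/* Nothing is connected: skip entering suspend. */
	if (!hsotg->port_connect_status)
		return 0;

	hsotg->suspended = true;  /* placeholder for the real suspend path */
	return 0;
}
```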
Cc: <stable@vger.kernel.org> # 5.2
Fixes: 6f6d70597c ("usb: dwc2: bus suspend/resume for hosts with DWC2_POWER_DOWN_PARAM_NONE")
Signed-off-by: Artur Petrosyan <Arthur.Petrosyan@synopsys.com>
Link: https://lore.kernel.org/r/20210326102510.BDEDEA005D@mailhost.synopsys.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 72035f4954 upstream.
init_dma_pools() calls dma_pool_create(...dev->dev) to create the DMA
pool. However, dev->dev is actually set after init_dma_pools() is
called, which effectively makes the call dma_pool_create(..NULL) and
causes a crash.
To fix this issue, initialize DMA only after dev->dev is set.
[ 1.317993] RIP: 0010:dma_pool_create+0x83/0x290
[ 1.323257] Call Trace:
[ 1.323390] ? pci_write_config_word+0x27/0x30
[ 1.323626] init_dma_pools+0x41/0x1a0 [snps_udc_core]
[ 1.323899] udc_pci_probe+0x202/0x2b1 [amd5536udc_pci]
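The ordering problem can be illustrated with a small userspace model. All "model_" names are hypothetical, and where the real dma_pool_create() crashes on the NULL device, this sketch returns -EINVAL instead so that the difference between the two orderings can be observed safely.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

struct model_device { int id; };

struct model_udc {
	struct model_device *dev;  /* plays the role of dev->dev */
	bool pools_ready;
};

/* Stand-in for init_dma_pools(): needs udc->dev to be set already. */
static int model_init_dma_pools(struct model_udc *udc)
{
	if (!udc->dev)
		return -EINVAL;  /* the real code dereferenced NULL here */
	udc->pools_ready = true;
	return 0;
}

/* fixed_order == false models the old probe, true models the fix. */
static int model_probe(struct model_udc *udc, struct model_device *pdev,
		       bool fixed_order)
{
	int ret;

	udc->dev = NULL;
	udc->pools_ready = false;

	if (fixed_order) {
		udc->dev = pdev;                  /* set the device first */
		return model_init_dma_pools(udc); /* then create the pools */
	}

	ret = model_init_dma_pools(udc);          /* old order: dev still NULL */
	if (ret)
		return ret;
	udc->dev = pdev;
	return 0;
}
```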
Fixes: 7c51247a1f (usb: gadget: udc: Provide correct arguments for 'dma_pool_create')
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Tong Zhang <ztong0001@gmail.com>
Link: https://lore.kernel.org/r/20210317230400.357756-1-ztong0001@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>