Speculative page fault handler needs to detect concurrent pmd changes
and relies on vma seqcount for that. pmdp_collapse_flush(), set_huge_pmd() and collapse_and_free_pmd() can modify a pmd.
vm_write_{begin|end} are needed in the paths which can call these
functions for page fault handler to detect pmd changes.
Bug: 257443051
Change-Id: Ieb784b5f44901b66a594f61b9e7c91190ff97f80
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
Extend filemap_fault() to handle speculative faults.
In the speculative case, we will only be fishing existing pages out of
the page cache. The logic we use mirrors what is done in the
non-speculative case, assuming that pages are found in the page cache,
are up to date and not already locked, and that readahead is not
necessary at this time. In all other cases, the fault is aborted to be
handled non-speculatively.
Signed-off-by: Michel Lespinasse <michel@lespinasse.org>
Link: https://lore.kernel.org/all/20210407014502.24091-26-michel@lespinasse.org/
Conflicts:
mm/filemap.c
1. Added back file_ra_state variable used by SPF path.
2. Updated comment for filemap_fault to reflect SPF locking rules.
Bug: 161210518
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
Change-Id: I82eba7fcfc81876245c2e65bc5ae3d33ddfcc368
Checks of pmd during speculative page fault handling are racy because
pmd is unprotected and might be modified or cleared. This might cause
use-after-free reads from speculative path, therefore prevent such
checks. At the beginning of speculation pmd is checked to be valid and
if it's changed before page fault is handled, the change will be detected
and page fault will be retried under mmap_lock protection.
Bug: 257443051
Change-Id: I0cbd3b0b44e8296cf0d6cb298fae48c696580068
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
handle_userfault() should be protected against a concurrent
userfaultfd_release(), therefore handling a userfaults speculatively
without mmap_lock protection should be disallowed.
Bug: 257443051
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
Change-Id: Ic6ae39329c73e8849048ea15b5351a49346404d3
Speculative page fault checks pmd to be valid before starting to handle
the page fault and pte_alloc() should do nothing if pmd stays valid.
If pmd gets changed during speculative page fault, we will detect the
change later and retry with mmap_lock. Therefore pte_alloc() can be
safely skipped and this prevents the racy pmd_lock() call which can
access pmd->ptl after pmd was cleared.
Bug: 257443051
Change-Id: Iec57df5530dba6e0e0bdf9f7500f910851c3d3fd
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
When a task yields, it relinquishes the cpu and
scheduler is tasked to find another task.
However our vendor scheduler logic implementation
could return the same task leading to a loop where
the yielded task gets to run back, so add hook point
in do_sched_yield() for vendor can do some work
before task is scheduled.
Bug: 205804537
Change-Id: I6528c3f4b0ee360559ef9c97cb1eb2b2d1357870
Signed-off-by: Tengfei Fan <quic_tengfan@quicinc.com>
Signed-off-by: Sai Harshini Nimmala <quic_snimmala@quicinc.com>
Signed-off-by: Khalid Shaik <khalid.s@samsung.com>
SysMMU_SYNCs provide an invalidation-complete signal to the S2MPU
driver but the latency can be quite high. Improve this by waiting for
all the SYNCs in parallel - separate the initiation of invalidation
barrier from waiting for completion. This way we initiate invalidation
on all SYNCs first, then wait for all of them to complete.
The previously introduced exponential-backoff only kicks in if the
SYNC_COMP_COMPLETE bit is not set after the parallel invalidation.
Bug: 249161451
Signed-off-by: David Brazdil <dbrazdil@google.com>
Change-Id: I9d544bc65f8633d376c7ccd65ea23195ca432964
Add a new callback to pkvm_iommu_ops called after
host_stage2_idmap_apply on all IOMMU devices. This allows the drivers to
complete operations like invalidation in two stages.
Bug: 249161451
Signed-off-by: David Brazdil <dbrazdil@google.com>
Change-Id: I9c077fd2b18ce54ad67eb34ef16bc94428797419
On the guest teardown path, pKVM will zero the pages used to back the
guest shadow data structures before returning them to the host as they
may contain secrets (e.g. in the vCPU registers). However, the zeroing
is done using a cacheable alias, and CMOs are missing, hence giving the
host a potential opportunity to read the original content of the shadow
structs from memory.
Fix this by issuing CMOs after zeroing the pages.
Bug: 259551298
Change-Id: Id696d47d16e4c3fd870cb70b792eeb7f2282fc78
Signed-off-by: Quentin Perret <qperret@google.com>
This will allow a client program to avoid redundant actions on ashmem
buffers which it has already seen.
Bug: 244233389
Change-Id: Ica57a8842ff163eae5f9eca8141b439091ec0940
Signed-off-by: Mark Fasheh <mfasheh@google.com>
If the host issues a PSCI SYSTEM_RESET2 call requesting a warm reset
while guest pages are live in the system, then pKVM attempts to convert
this to a cold PSCI SYSTEM_RESET request to ensure the EL3 will clear
memory on the next boot. However, this logic is quite badly broken and
will instead attempt to take the 'mem_protect_lock' spinlock twice which
results in a deadlock.
Fix the repainting so that the 'host_ctxt' is updated inline and we
forward the updated request directly to EL3.
Signed-off-by: Will Deacon <will@kernel.org>
Bug: 259523340
Signed-off-by: Will Deacon <willdeacon@google.com>
Change-Id: I44719466b7f5abddf73730a3b74db13f935f92ec
Export reweight_task for vendor usage when they are trying to manipulate
task prio. After the prio changed, it will need to update its load
weight to take effect. Therefore, this function needs to be called
from vendor kernel module. It could be used with
trace_android_rvh_set_user_nice and trace_android_rvh_setscheduler.
Bug: 245675204
Change-Id: I0033518bf1cbd0a8129795743b95340f439d5fe8
Signed-off-by: Rick Yiu <rickyiu@google.com>
If block address is still alive, we should give a valid node block even after
shutdown. Otherwise, we can see zero data when reading out a file.
Bug: 257271565
Cc: stable@vger.kernel.org
Fixes: 83a3bfdb5a ("f2fs: indicate shutdown f2fs to allow unmount successfully")
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
(cherry picked from commit 6953bf65286d git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs.git dev)
Change-Id: Ifb70f6c73bd67d5112ee9fa1a5e4ad8e10ae8517
When a broken USB accessory connects to a USB host, usbcore might
keep doing enumeration retries. If the host has a watchdog mechanism,
the kernel panic will happen on the host.
This patch provides an attribute early_stop to limit the numbers of retries
for each port of a hub. If a port was marked with early_stop attribute,
unsuccessful connection attempts will fail quickly. In addition, if an
early_stop port has failed to initialize, it will ignore all future
connection events until early_stop attribute is clear.
Signed-off-by: Ray Chi <raychi@google.com>
Reviewed-by: Alan Stern <stern@rowland.harvard.edu>
Link: https://lore.kernel.org/r/20221107072754.3336357-1-raychi@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Bug: 236915598
(cherry picked from commit 430d57f53e
https: //git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb.git/ usb-next)
Change-Id: Ib85522ca38c0f26ece9807d5304991853f155669
Signed-off-by: Ray Chi <raychi@google.com>
When a protected guest shares or unshares a page with the host, we
should decrement and increment the PSCI MEM_PROTECT refcount respectively
since shared pages do not require poisoning on the reclaim path and will
therefore not be accounted for.
Bug: 258425493
Reported-by: Quentin Perret <qperret@google.com>
Signed-off-by: Will Deacon <willdeacon@google.com>
Change-Id: I80a4fad44de4313c6708a8259a1802ded379f03b
The recent change of page_cache_ra_unbounded() arguments was buggy in the
two callers, causing us to readahead the wrong pages. Move the definition
of ractl down to after the index is set correctly. This affected
performance on configurations that use fs-verity.
Link: https://lkml.kernel.org/r/20221012193419.1453558-1-willy@infradead.org
Fixes: 73bb49da50 ("mm/readahead: make page_cache_ra_unbounded take a readahead_control")
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reported-by: Jintao Yin <nicememory@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Bug: 258554362
(cherry picked from commit 4fa0e3ff21)
Change-Id: Ib5160c5c53629be328c370f5d5d464956d6a6312
Signed-off-by: Eric Biggers <ebiggers@google.com>
We shouldn't hold lru_lock to proceed blk_finish_plug.
Fixes: 89fed37332 ("ANDROID: vendor hook to control blk_plug for shrink_lruvec")
Bug: 255471591
Change-Id: Ie9d9b0e4ee76b4735e802b2a202fbb79d0ae090e
Signed-off-by: Martin Liu <liumartin@google.com>
When I/O is submitted to dm-user target, bio already
has a referance. Additional referance is not needed
in the I/O path.
Bug: 229696117
Test: OTA on Pixel
Change-Id: I8db6802e751336d7a10c6de0bc7a247a6d7f6b37
Signed-off-by: Akilesh Kailash <akailash@google.com>
[ Upstream commit 443685992b ]
Fix -Woverflow warnings for tegra irqchip driver which is a result
of moving arm64 custom MMIO accessor macros to asm-generic function
implementations giving a bonus type-checking now and uncovering these
overflow warnings.
drivers/irqchip/irq-tegra.c: In function ‘tegra_ictlr_suspend’:
drivers/irqchip/irq-tegra.c:151:18: warning: large integer implicitly truncated to unsigned type [-Woverflow]
writel_relaxed(~0ul, ictlr + ICTLR_COP_IER_CLR);
^
Suggested-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Sai Prakash Ranjan <quic_saipraka@quicinc.com>
Reviewed-by: Arnd Bergmann <arnd@arndb.de>
Cc: Marc Zyngier <maz@kernel.org>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Fixes: de3ce08049 ("irqchip: tegra: Add DT-based support for legacy interrupt controller")
Signed-off-by: Lee Jones <joneslee@google.com>
Change-Id: Iaee226d0220c9774635cd51953d577ab7e2ebe77
Signed-off-by: Lee Jones <joneslee@google.com>
[ Upstream commit 98692f52c5 ]
Fix -Woverflow warnings for drm/meson driver which is a result
of moving arm64 custom MMIO accessor macros to asm-generic function
implementations giving a bonus type-checking now and uncovering these
overflow warnings.
drivers/gpu/drm/meson/meson_viu.c: In function ‘meson_viu_init’:
drivers/gpu/drm/meson/meson_registers.h:1826:48: error: large integer implicitly truncated to unsigned type [-Werror=overflow]
#define VIU_OSD_BLEND_REORDER(dest, src) ((src) << (dest * 4))
^
drivers/gpu/drm/meson/meson_viu.c:472:18: note: in expansion of macro ‘VIU_OSD_BLEND_REORDER’
writel_relaxed(VIU_OSD_BLEND_REORDER(0, 1) |
^~~~~~~~~~~~~~~~~~~~~
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Sai Prakash Ranjan <quic_saipraka@quicinc.com>
Reviewed-by: Arnd Bergmann <arnd@arndb.de>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Neil Armstrong <narmstrong@baylibre.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Fixes: 147ae1cbaa ("drm: meson: viu: use proper macros instead of magic constants")
Signed-off-by: Lee Jones <joneslee@google.com>
Change-Id: Id3502967ec9df74ea9420a34549bc0ac3c49dfa8
[ Upstream commit a09d2d00af ]
In pxa3xx_gcu_write, a count parameter of type size_t is passed to words of
type int. Then, copy_from_user() may cause a heap overflow because it is used
as the third argument of copy_from_user().
Bug: 245928838
Signed-off-by: Hyunwoo Kim <imv4bel@gmail.com>
Signed-off-by: Helge Deller <deller@gmx.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Lee Jones <joneslee@google.com>
Change-Id: I9e21917a52e2cb78cc640a77a6eba21838aa8655
The kernel has an awfully complicated boot sequence in order to cope
with the various EL2 configurations, including those that "enhanced"
the architecture. We go from EL2 to EL1, then back to EL2, staying
at EL2 if VHE capable and otherwise go back to EL1.
Here's a paracetamol tablet for you.
The cpu_resume path follows the same logic, because coming up with
two versions of a square wheel is hard.
However, things aren't this straightforward with pKVM, as the host
resume path is always proxied by the hypervisor, which means that
the kernel is always entered at EL1. Which contradicts what the
__boot_cpu_mode[] array contains (it obviously says EL2).
This thus triggers a HVC call from EL1 to EL2 in a vain attempt
to upgrade from EL1 to EL2 VHE, which we are, funnily enough,
reluctant to grant to the host kernel. This is also completely
unexpected, and puzzles your average EL2 hacker.
Address it by fixing up the boot mode at the point the host gets
deprivileged. is_hyp_mode_available() and co already have a static
branch to deal with this, making it pretty safe.
Cc: <stable@vger.kernel.org> # 5.15+
Reported-by: Vincent Donnefort <vdonnefort@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Tested-by: Vincent Donnefort <vdonnefort@google.com>
Bug: 258157858
Link: https://lore.kernel.org/all/20221108100138.3887862-1-vdonnefort@google.com/
Change-Id: I4a2269402ececa0ec47cab88343c3c623b4b2e3d
Where commit 4ef0c5c6b5 ("kernel/sched: Fix sched_fork() access an
invalid sched_task_group") fixed a fork race vs cgroup, it opened up a
race vs syscalls by not placing the task on the runqueue before it
gets exposed through the pidhash.
Commit 13765de814 ("sched/fair: Fix fault in reweight_entity") is
trying to fix a single instance of this, instead fix the whole class
of issues, effectively reverting this commit.
Bug: 255159688
Fixes: 4ef0c5c6b5 ("kernel/sched: Fix sched_fork() access an invalid sched_task_group")
Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Tadeusz Struk <tadeusz.struk@linaro.org>
Tested-by: Zhang Qiao <zhangqiao22@huawei.com>
Tested-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
Link: https://lkml.kernel.org/r/YgoeCbwj5mbCR0qA@hirez.programming.kicks-ass.net
(cherry picked from commit b1e8206582)
Signed-off-by: Woody Lin <woodylin@google.com>
Change-Id: Ic593aafb0cc8dae5ba382cdc4ab68526973fdfca
enter_exception64() performs an MTE check, which involves dereferencing
vcpu->kvm. While vcpu has already been fixed up to be a HYP VA pointer,
kvm is still a pointer in the kernel VA space.
This only affects nVHE configurations with MTE enabled, as in other
cases, the pointer is either valid (VHE) or not dereferenced (!MTE).
Fix this by first converting kvm to a HYP VA pointer.
Fixes: ea7fc1bb1c ("KVM: arm64: Introduce MTE VM feature")
Signed-off-by: Ryan Roberts <ryan.roberts@arm.com>
Reviewed-by: Steven Price <steven.price@arm.com>
[maz: commit message tidy-up]
Signed-off-by: Marc Zyngier <maz@kernel.org>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20221027120945.29679-1-ryan.roberts@arm.com
(cherry picked from commit b6bcdc9f6b)
[willdeacon@: Fixed conflict with aosp/2038249 rework moving MTE feature
check into caller]
Signed-off-by: Will Deacon <willdeacon@google.com>
Bug: 233587962
Bug: 233588291
Change-Id: Id0aac0fc38dff2569081910af7468ecf97b6eca3
In commit 720c241924 ("ANDROID: binder: change down_write to
down_read") binder assumed the mmap read lock is sufficient to protect
alloc->vma inside binder_update_page_range(). This used to be accurate
until commit dd2283f260 ("mm: mmap: zap pages with read mmap_sem in
munmap"), which now downgrades the mmap_lock after detaching the vma
from the rbtree in munmap(). Then it proceeds to teardown and free the
vma with only the read lock held.
This means that accesses to alloc->vma in binder_update_page_range() now
will race with vm_area_free() in munmap() and can cause a UAF as shown
in the following KASAN trace:
==================================================================
BUG: KASAN: use-after-free in vm_insert_page+0x7c/0x1f0
Read of size 8 at addr ffff16204ad00600 by task server/558
CPU: 3 PID: 558 Comm: server Not tainted 5.10.150-00001-gdc8dcf942daa #1
Hardware name: linux,dummy-virt (DT)
Call trace:
dump_backtrace+0x0/0x2a0
show_stack+0x18/0x2c
dump_stack+0xf8/0x164
print_address_description.constprop.0+0x9c/0x538
kasan_report+0x120/0x200
__asan_load8+0xa0/0xc4
vm_insert_page+0x7c/0x1f0
binder_update_page_range+0x278/0x50c
binder_alloc_new_buf+0x3f0/0xba0
binder_transaction+0x64c/0x3040
binder_thread_write+0x924/0x2020
binder_ioctl+0x1610/0x2e5c
__arm64_sys_ioctl+0xd4/0x120
el0_svc_common.constprop.0+0xac/0x270
do_el0_svc+0x38/0xa0
el0_svc+0x1c/0x2c
el0_sync_handler+0xe8/0x114
el0_sync+0x180/0x1c0
Allocated by task 559:
kasan_save_stack+0x38/0x6c
__kasan_kmalloc.constprop.0+0xe4/0xf0
kasan_slab_alloc+0x18/0x2c
kmem_cache_alloc+0x1b0/0x2d0
vm_area_alloc+0x28/0x94
mmap_region+0x378/0x920
do_mmap+0x3f0/0x600
vm_mmap_pgoff+0x150/0x17c
ksys_mmap_pgoff+0x284/0x2dc
__arm64_sys_mmap+0x84/0xa4
el0_svc_common.constprop.0+0xac/0x270
do_el0_svc+0x38/0xa0
el0_svc+0x1c/0x2c
el0_sync_handler+0xe8/0x114
el0_sync+0x180/0x1c0
Freed by task 560:
kasan_save_stack+0x38/0x6c
kasan_set_track+0x28/0x40
kasan_set_free_info+0x24/0x4c
__kasan_slab_free+0x100/0x164
kasan_slab_free+0x14/0x20
kmem_cache_free+0xc4/0x34c
vm_area_free+0x1c/0x2c
remove_vma+0x7c/0x94
__do_munmap+0x358/0x710
__vm_munmap+0xbc/0x130
__arm64_sys_munmap+0x4c/0x64
el0_svc_common.constprop.0+0xac/0x270
do_el0_svc+0x38/0xa0
el0_svc+0x1c/0x2c
el0_sync_handler+0xe8/0x114
el0_sync+0x180/0x1c0
[...]
==================================================================
To prevent the race above, revert back to taking the mmap write lock
inside binder_update_page_range(). One might expect an increase of mmap
lock contention. However, binder already serializes these calls via top
level alloc->mutex. Also, there was no performance impact shown when
running the binder benchmark tests.
Note this patch is specific to stable branches 5.4 and 5.10. Since in
newer kernel releases binder no longer caches a pointer to the vma.
Instead, it has been refactored to use vma_lookup() which avoids the
issue described here. This switch was introduced in commit a43cfc87ca
("android: binder: stop saving a pointer to the VMA").
Bug: 254837884
Link: https://lore.kernel.org/all/20221104175450.306810-1-cmllamas@google.com/
Fixes: dd2283f260 ("mm: mmap: zap pages with read mmap_sem in munmap")
Reported-by: Jann Horn <jannh@google.com>
Cc: <stable@vger.kernel.org> # 5.10.x
Cc: Minchan Kim <minchan@kernel.org>
Cc: Yang Shi <yang.shi@linux.alibaba.com>
Cc: Liam Howlett <liam.howlett@oracle.com>
Signed-off-by: Carlos Llamas <cmllamas@google.com>
Change-Id: Ieabadbfa30f99812da9c226cf1ddd5e60f62c607
This is a stable-specific patch.
I botched the stable-specific rewrite of
commit b67fbebd4c ("mmu_gather: Force tlb-flush VM_PFNMAP vmas"):
As Hugh pointed out, unmap_region() actually operates on a list of VMAs,
and the variable "vma" merely points to the first VMA in that list.
So if we want to check whether any of the VMAs we're operating on is
PFNMAP or MIXEDMAP, we have to iterate through the list and check each VMA.
Bug: 245812080
Signed-off-by: Jann Horn <jannh@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit 3998dc50eb)
Signed-off-by: Lee Jones <joneslee@google.com>
Change-Id: I115183f65fc7df5d33264e6211adcd2ec531d996
[ Upstream commit ba953a9d89 ]
When namespace support was added to xfrm/afkey, it caused the
previously single-threaded call to xfrm_probe_algs to become
multi-threaded. This is buggy and needs to be fixed with a mutex.
Bug: 245674737
Reported-by: Abhishek Shah <abhishek.shah@columbia.edu>
Fixes: 283bc9f35b ("xfrm: Namespacify xfrm state/policy locks")
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Change-Id: I71fb89a999447862a6c4b1ff754378bb0452ad3a
Signed-off-by: Lee Jones <joneslee@google.com>
commit b67fbebd4c upstream.
Some drivers rely on having all VMAs through which a PFN might be
accessible listed in the rmap for correctness.
However, on X86, it was possible for a VMA with stale TLB entries
to not be listed in the rmap.
This was fixed in mainline with
commit b67fbebd4c ("mmu_gather: Force tlb-flush VM_PFNMAP vmas"),
but that commit relies on preceding refactoring in
commit 18ba064e42 ("mmu_gather: Let there be one tlb_{start,end}_vma()
implementation") and commit 1e9fdf21a4 ("mmu_gather: Remove per arch
tlb_{start,end}_vma()").
This patch provides equivalent protection without needing that
refactoring, by forcing a TLB flush between removing PTEs in
unmap_vmas() and the call to unlink_file_vma() in free_pgtables().
Bug: 245812080
[This is a stable-specific rewrite of the upstream commit!]
Signed-off-by: Jann Horn <jannh@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Lee Jones <joneslee@google.com>
Change-Id: I8f539ff0365fb9b5d10fddb84082d5995348b897
Memory donated to the hypervisor needs to be contiguous, which
might be difficult to find. To improve the odds of finding
contiguous memory, break up vcpu state donations per vcpu.
Bug: 232070947
Signed-off-by: Fuad Tabba <tabba@google.com>
Change-Id: Iff19b2e2b6ca58b1e6ef38c4b0f16c80dae34ab9
This is done as the first step towards donating memory per vcpu
in future patches without having to spend potentially too much
time in one hypercall.
Moreover, this has the nice effect of removing the need for
stashing the host vcpus in the memory donated for the pgd.
Bug: 232070947
Signed-off-by: Fuad Tabba <tabba@google.com>
Change-Id: I491c358fa29dd62ffc45347d6288696c846d5fc3
Factor out unpinning a single host vcpu from unpin_host_vcpus(),
since it will be used in a future patch in the error path.
No functional change intended.
Bug: 232070947
Signed-off-by: Fuad Tabba <tabba@google.com>
Change-Id: I321e41ae624b2daae8fc917432be0673e32235aa
Facilitates future patches that move the initialization of the
shadow vcpu to a separate hyp call.
Removed unused parameter (vcpu_array/pgd) from
init_shadow_structs().
No functional change intended.
Bug: 232070947
Signed-off-by: Fuad Tabba <tabba@google.com>
Change-Id: I5c3116e7558d958c03ea28dc5610122696a1fca2
Tidies up code and enables the reuse of this function.
No functional change intended.
Bug: 232070947
Signed-off-by: Fuad Tabba <tabba@google.com>
Change-Id: I3a93dd0284e3c177b12d0cabf5e99747dceb0fb4
We need to carry on state from zap_pte_range_tlb_start to
zap_pte_range_tlb_end.
The new param on the function stack will keep the function
trace_android_vh_zap_pte_range_tlb_start called or not and
pass the state to trace_android_vh_zap_pte_range_tlb_end.
Thus, trace_android_vh_zap_pte_range_tlb_end will know
the trace_android_vh_zap_pte_range_tlb_start was called.
If it was called, trace_android_vh_zap_pte_range_tlb_end
will do action to make pair. Otherwise, just skip it.
Bug: 238728493
Bug: 256549265
Change-Id: I95706d51da66f916ede626686483523f3b68dacb
Signed-off-by: Minchan Kim <minchan@google.com>
In aosp/1979327 we attempted to prevent tasks with pending signals and
PF_FREEZER_SKIP from being immediately rescheduled, because such tasks
would crash the kernel if run while no capable CPUs were online. This was
implemented by declining to immediately reschedule them unless various
conditions were met. However, this ended up causing signals to fail to
be delivered if the signal was received while a task is processing a
syscall, such as futex(2), that will block with PF_FREEZER_SKIP set,
as the kernel relies on a check for TIF_SIGPENDING after setting the
task state to TASK_INTERRUPTIBLE in order to deliver such a signal.
This patch is an alternative solution to the original problem that
avoids introducing the signal delivery bug. It works by changing
how freezer_should_skip() is implemented. Instead of just checking
PF_FREEZER_SKIP, we also use the on_rq field to check whether the task
is not on a runqueue. In this way we ensure that a task that will be
immediately rescheduled will not return true from freezer_should_skip(),
and the task will block the freezer unless it is actually taken off
the runqueue.
Signed-off-by: Peter Collingbourne <pcc@google.com>
Bug: 202918514
Bug: 251700836
Change-Id: I3f9b705ce9ad2ca1d2df959f43cf05bef78560f8