I recently found a case where de->name_len is 0 in f2fs_fill_dentries()
easily reproduced, and finally set the fsck flag.
Thread A Thread B
- f2fs_readdir
- f2fs_read_inline_dir
- ctx->pos = d.max
- f2fs_add_dentry
- f2fs_add_inline_entry
- do_convert_inline_dir
- f2fs_add_regular_entry
- f2fs_readdir
- f2fs_fill_dentries
- set_sbi_flag(sbi, SBI_NEED_FSCK)
Process A opens the folder, and has been reading without closing it.
During this period, Process B created a file under the folder (occupying
multiple f2fs_dir_entry, exceeding the d.max of the inline dir). After
creation, process A uses the d.max of inline dir to read it again, and
it will read that de->name_len is 0.
And Chao pointed out that w/o inline conversion, the race condition still
can happen as below:
dir_entry1: A
dir_entry2: B
dir_entry3: C
free slot: _
ctx->pos: ^
Thread A is traversing directory,
ctx-pos moves to below position after readdir() by thread A:
AAAABBBB___
^
Then thread B delete dir_entry2, and create dir_entry3.
Thread A calls readdir() to lookup dirents starting from middle
of new dirent slots as below:
AAAACCCCCC_
^
In these scenarios, the file system is not damaged, and it's hard to
avoid it. But we can bypass tagging FSCK flag if:
a) bit_pos (:= ctx->pos % d->max) is non-zero and
b) before bit_pos moves to first valid dir_entry.
Fixes: ddf06b753a ("f2fs: fix to trigger fsck if dirent.name_len is zero")
Signed-off-by: Yangtao Li <frank.li@vivo.com>
[Chao: clean up description]
Reviewed-by: Chao Yu <chao@kernel.org>
(cherry picked from commit 7b9b92ff1b)
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Change-Id: Iffb4f0e92a6c98cfafb9dcc7d7a0f5e5b1b3a1ef
Change-Id: Ic7bd8a5bb10f2770fee913f4b5faad4f2b177efb
Signed-off-by: haojianhua1 <haojianhua1@xiaomi.com>
Commit d483eed85f ("ANDROID: GKI: set vfs-only exports into their own
namespace") added a namespace for vfs functions. For some kernelci
build targets, configfs is built as a module, so add the proper
namespace marking for configfs as well to fix the reported build
problems.
Fixes: d483eed85f ("ANDROID: GKI: set vfs-only exports into their own namespace")
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: I1abc2f17d9a8f90f1fe060b314eb69cb1e6cfc5d
Add one CONFIG to control removing the macros or not. On some platform,
configureing out the macros removes the associated members from the
structs, this reduces the object size of the slabs related with the
structs, therefore reduces the total slab memory consumption of system.
Besides, this also reduces vmlinux size a bit, therefore the total
kernel memory size increses a bit.
The macros are ANDROID_KABI_RESERVE, ANDROID_VENDOR_DATA,
ANDROID_VENDOR_DATA_ARRAY, ANDROID_OEM_DATA, ANDROID_OEM_DATA_ARRAY.
Bug: 206561931
Signed-off-by: Qingqing Zhou <quic_qqzhou@quicinc.com>
Change-Id: I0868d299ccce3c4b39f42af17916828500be6cc4
If write string to /sys/module/debug_kinfo/parameters/build_info
twice, kernel will crash. fix by removing vunmap in build_info_set.
Bug: 213120696
Signed-off-by: Chunhui Li <chunhui.li@mediatek.com>
Change-Id: I683859067a31068de0006be8490efa4b0107044f
We have namespaces, so use them for all vfs-exported namespaces so that
filesystems can use them, but not anything else.
Some in-kernel drivers that do direct filesystem accesses (because they
serve up files) are also allowed access to these symbols to keep 'make
allmodconfig' builds working properly, but it is not needed for Android
kernel images.
Bug: 157965270
Bug: 210074446
Cc: Matthias Maennich <maennich@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: Iaf6140baf3a18a516ab2d5c3966235c42f3f70de
We want to add some hooks in the binder module so that we can reduce
block time until binder thread is available
Here are what new hooks do for:
1、android_vh_binder_looper_state_registered: choose a binder thread(do proc work) as a low-level thread.Only this thread has power to excute background binder transaction.
2、android_vh_binder_thread_read: let binder thread do works which come from
our list.
3、android_vh_binder_free_proc: free some pointers and variable.
4、android_vh_binder_thread_release: free the list that we create before.
5、android_vh_binder_has_work_ilocked: to check if our list has work.
6、android_vh_binder_read_done: because of we add hook in binder_has_work_ilocked,
binder_has_work_ilocked may return true, so we try to wake up low-level thread immediately.
Bug: 212483521
Change-Id: Ic40f452cc4dcf8fc85422e23e6f1a7ad77547309
Signed-off-by: Liujie Xie <xieliujie@oppo.com>
To allow process_mrelease to reap targeted mm in parallel with exit_mmap
mark the victim with MMF_OOM_VICTIM flag.
Bug: 189803002
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
Change-Id: I89cf5f8fbeeb18b93a340b9ebe7f200837ebe846
With exit_mmap holding mmap_write_lock during free_pgtables call,
process_mrelease does not need to elevate mm->mm_users in order to
prevent exit_mmap from destrying pagetables while __oom_reap_task_mm
is walking the VMA tree. The change prevents process_mrelease from
calling the last mmput, which can lead to waiting for IO completion
in exit_aio.
Fixes: 337546e83f ("mm/oom_kill.c: prevent a race between process_mrelease and exit_mmap")
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
Link: https://lore.kernel.org/all/20211124235906.14437-2-surenb@google.com/
Bug: 130172058
Bug: 189803002
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
Change-Id: I1e2728e0c477af9cc20e9e0b715ee67dee760618
oom-reaper and process_mrelease system call should protect against
races with exit_mmap which can destroy page tables while they
walk the VMA tree. oom-reaper protects from that race by setting
MMF_OOM_VICTIM and by relying on exit_mmap to set MMF_OOM_SKIP
before taking and releasing mmap_write_lock. process_mrelease has
to elevate mm->mm_users to prevent such race. Both oom-reaper and
process_mrelease hold mmap_read_lock when walking the VMA tree.
The locking rules and mechanisms could be simpler if exit_mmap takes
mmap_write_lock while executing destructive operations such as
free_pgtables.
Change exit_mmap to hold the mmap_write_lock when calling
free_pgtables. Operations like unmap_vmas() and unlock_range() are not
destructive and could run under mmap_read_lock but for simplicity we
take one mmap_write_lock during almost the entire operation. Note
also that because oom-reaper checks VM_LOCKED flag, unlock_range()
should not be allowed to race with it.
In most cases this lock should be uncontended. Previously, Kirill
reported ~4% regression caused by a similar change [1]. We reran the
same test and although the individual results are quite noisy, the
percentiles show lower regression with 1.6% being the worst case [2].
The change allows oom-reaper and process_mrelease to execute safely
under mmap_read_lock without worries that exit_mmap might destroy page
tables from under them.
[1] https://lore.kernel.org/all/20170725141723.ivukwhddk2voyhuc@node.shutemov.name/
[2] https://lore.kernel.org/all/CAJuCfpGC9-c9P40x7oy=jy5SphMcd0o0G_6U1-+JAziGKG6dGA@mail.gmail.com/
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
Link: https://lore.kernel.org/all/20211124235906.14437-1-surenb@google.com/
Bug: 130172058
Bug: 189803002
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
Change-Id: Ic87272d09a0b68a1b0e968e8f1a1510fd6fc776a
Race between process_mrelease and exit_mmap, where free_pgtables is
called while __oom_reap_task_mm is in progress, leads to kernel crash
during pte_offset_map_lock call. oom-reaper avoids this race by setting
MMF_OOM_VICTIM flag and causing exit_mmap to take and release
mmap_write_lock, blocking it until oom-reaper releases mmap_read_lock.
Reusing MMF_OOM_VICTIM for process_mrelease would be the simplest way to
fix this race, however that would be considered a hack. Fix this race
by elevating mm->mm_users and preventing exit_mmap from executing until
process_mrelease is finished. Patch slightly refactors the code to
adapt for a possible mmget_not_zero failure.
This fix has considerable negative impact on process_mrelease
performance and will likely need later optimization.
Link: https://lkml.kernel.org/r/20211022014658.263508-1-surenb@google.com
Fixes: 884a7e5964 ("mm: introduce process_mrelease system call")
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Roman Gushchin <guro@fb.com>
Cc: Rik van Riel <riel@surriel.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Christian Brauner <christian@brauner.io>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Jann Horn <jannh@google.com>
Cc: Shakeel Butt <shakeelb@google.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Christian Brauner <christian.brauner@ubuntu.com>
Cc: Florian Weimer <fweimer@redhat.com>
Cc: Jan Engelhardt <jengelh@inai.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
(cherry picked from commit 337546e83f)
Bug: 189803002
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
Change-Id: I7cf9c869faa7b746995a94ea93f6a617104385aa
In modern systems it's not unusual to have a system component monitoring
memory conditions of the system and tasked with keeping system memory
pressure under control. One way to accomplish that is to kill
non-essential processes to free up memory for more important ones.
Examples of this are Facebook's OOM killer daemon called oomd and
Android's low memory killer daemon called lmkd.
For such system component it's important to be able to free memory quickly
and efficiently. Unfortunately the time process takes to free up its
memory after receiving a SIGKILL might vary based on the state of the
process (uninterruptible sleep), size and OPP level of the core the
process is running. A mechanism to free resources of the target process
in a more predictable way would improve system's ability to control its
memory pressure.
Introduce process_mrelease system call that releases memory of a dying
process from the context of the caller. This way the memory is freed in a
more controllable way with CPU affinity and priority of the caller. The
workload of freeing the memory will also be charged to the caller. The
operation is allowed only on a dying process.
After previous discussions [1, 2, 3] the decision was made [4] to
introduce a dedicated system call to cover this use case.
The API is as follows,
int process_mrelease(int pidfd, unsigned int flags);
DESCRIPTION
The process_mrelease() system call is used to free the memory of
an exiting process.
The pidfd selects the process referred to by the PID file
descriptor.
(See pidfd_open(2) for further information)
The flags argument is reserved for future use; currently, this
argument must be specified as 0.
RETURN VALUE
On success, process_mrelease() returns 0. On error, -1 is
returned and errno is set to indicate the error.
ERRORS
EBADF pidfd is not a valid PID file descriptor.
EAGAIN Failed to release part of the address space.
EINTR The call was interrupted by a signal; see signal(7).
EINVAL flags is not 0.
EINVAL The memory of the task cannot be released because the
process is not exiting, the address space is shared
with another live process or there is a core dump in
progress.
ENOSYS This system call is not supported, for example, without
MMU support built into Linux.
ESRCH The target process does not exist (i.e., it has terminated
and been waited on).
[1] https://lore.kernel.org/lkml/20190411014353.113252-3-surenb@google.com/
[2] https://lore.kernel.org/linux-api/20201113173448.1863419-1-surenb@google.com/
[3] https://lore.kernel.org/linux-api/20201124053943.1684874-3-surenb@google.com/
[4] https://lore.kernel.org/linux-api/20201223075712.GA4719@lst.de/
Link: https://lkml.kernel.org/r/20210809185259.405936-1-surenb@google.com
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
Reviewed-by: Shakeel Butt <shakeelb@google.com>
Acked-by: David Hildenbrand <david@redhat.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Acked-by: Christian Brauner <christian.brauner@ubuntu.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Roman Gushchin <guro@fb.com>
Cc: Rik van Riel <riel@surriel.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Jann Horn <jannh@google.com>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Christian Brauner <christian.brauner@ubuntu.com>
Cc: Florian Weimer <fweimer@redhat.com>
Cc: Jan Engelhardt <jengelh@inai.de>
Cc: Tim Murray <timmurray@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
(cherry picked from commit 884a7e5964)
Bug: 189803002
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
Change-Id: I60d37051acaeff1b7eb7d10aeca23dfa1f2469a3
Add a new fips140_lab_util command 'show_invalid_inputs' which uses
AF_ALG to call some crypto algorithms with invalid parameters to show
that they fail. This is needed to meet a new requirement we've received
from the lab. This requirement is vague, but a representative sample of
algorithms and inputs appears to be acceptable.
For this to work, AF_ALG needs to be enabled in the kernel. This makes
fips140_lab_util start depending on a custom kernel build, not just on a
custom fips140 module build as was the case before. However, the lab
testing was going to need custom boot images anyway once fips140.ko is
included in the normal builds, since the production build of fips140.ko
won't have CONFIG_CRYPTO_FIPS140_MOD_EVAL_TESTING=y. AF_ALG is also
needed to do the Jitter RNG entropy analysis properly, and the
AF_ALG-enabled kernel can also be reused for ACVP testing.
Bug: 188620248
Change-Id: I69054eab5005fc3ca0ea081760877f73ea229f5b
Signed-off-by: Eric Biggers <ebiggers@google.com>
(cherry picked from commit 04e49b41be)
fips140_lab_test doesn't really do any tests per se, but rather is a
utility program that dumps some output. The actual "test" is when the
lab checks the output; we aren't allowed to check it ourselves.
We also need to add some new functionality, which would work well as
sub-commands. Also, the original idea was that this was just sample
code which the lab would modify, but that's not actually happening.
Therefore, rename fips140_lab_test to fips140_lab_util, and refactor its
functionality into sub-commands 'show_module_version' and
'show_service_indicators'. This fits better with what is needed.
Bug: 188620248
Change-Id: I7da84a139283f185f79b8d866547151169f26415
Signed-off-by: Eric Biggers <ebiggers@google.com>
(cherry picked from commit 6ed33b82ea)
A new API, rproc_set_firmware() is added to allow the remoteproc platform
drivers and remoteproc client drivers to be able to configure a custom
firmware name that is different from the default name used during
remoteproc registration. This function is being introduced to provide
a kernel-level equivalent of the current sysfs interface to remoteproc
client drivers, and can only change firmwares when the remoteproc is
offline. This allows some remoteproc drivers to choose different firmwares
at runtime based on the functionality the remote processor is providing.
The TI PRU Ethernet driver will be an example of such usage as it
requires to use different firmwares for different supported protocols.
Also, update the firmware_store() function used by the sysfs interface
to reuse this function to avoid code duplication.
Bug: 213024513
Change-Id: Ie365179ac296c43c7c5c54b46f9f9f7587d5d263
Reviewed-by: Rishabh Bhatnagar <rishabhb@codeaurora.org>
Signed-off-by: Suman Anna <s-anna@ti.com>
Link: https://lore.kernel.org/r/20201121032042.6195-1-s-anna@ti.com
Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
(cherry picked from commit 4c1ad562d3
git://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git master)
Signed-off-by: Mukesh Ojha <quic_mojha@quicinc.com>
In __arm_v7s_alloc_table function:
iommu call kmem_cache_alloc to allocate page table, this function
allocate memory may fail, when kmem_cache_alloc fails to allocate
table, call virt_to_phys will be abnomal and return unexpected phys
and goto out_free, then call kmem_cache_free to release table will
trigger KE, __get_free_pages and free_pages have similar problem,
so add error handle for page table allocation failure.
Fixes: 29859aeb8a ("iommu/io-pgtable-arm-v7s: Abort allocation when table address overflows the PTE")
Signed-off-by: Yunfei Wang <yf.wang@mediatek.com>
Cc: <stable@vger.kernel.org> # 5.10.*
Acked-by: Robin Murphy <robin.murphy@arm.com>
Link: https://lore.kernel.org/r/20211207113315.29109-1-yf.wang@mediatek.com
Signed-off-by: Will Deacon <will@kernel.org>
Bug: 210958369
(cherry picked from commit a556cfe4cahttps://git.kernel.org/pub/scm/linux/kernel/git/will/linux.git for-joerg/arm-smmu/updates)
Signed-off-by: Yunfei Wang <yf.wang@mediatek.com>
Change-Id: I6435903336d1e15b5a57d08c284b3d3d66ea985d
commit ef6c8d6ccf upstream.
When SCTP handles an INIT chunk, it calls for example:
sctp_sf_do_5_1B_init
sctp_verify_init
sctp_verify_param
sctp_process_init
sctp_process_param
handling of SCTP_PARAM_SET_PRIMARY
sctp_verify_init() wasn't doing proper size validation and neither the
later handling, allowing it to work over the chunk itself, possibly being
uninitialized memory.
Bug: 197154735
Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Aaron Ding <aaronding@google.com>
Change-Id: I032230924ead7a03dfb3101e9cd4d48e36bfc616
commit b6ffe7671b upstream.
In one of the fallbacks that SCTP has for identifying an association for an
incoming packet, it looks for AddIp chunk (from ASCONF) and take a peek.
Thing is, at this stage nothing was validating that the chunk actually had
enough content for that, allowing the peek to happen over uninitialized
memory.
Similar check already exists in actual asconf handling in
sctp_verify_asconf().
Bug: 197154735
Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Aaron Ding <aaronding@google.com>
Change-Id: Ibfe53fc724143423353ed6b2984d2508ee4fc457
[ Upstream commit 30e29a9a2b ]
In prealloc_elems_and_freelist(), the multiplication to calculate the
size passed to bpf_map_area_alloc() could lead to an integer overflow.
As a result, out-of-bounds write could occur in pcpu_freelist_populate()
as reported by KASAN:
[...]
[ 16.968613] BUG: KASAN: slab-out-of-bounds in pcpu_freelist_populate+0xd9/0x100
[ 16.969408] Write of size 8 at addr ffff888104fc6ea0 by task crash/78
[ 16.970038]
[ 16.970195] CPU: 0 PID: 78 Comm: crash Not tainted 5.15.0-rc2+ #1
[ 16.970878] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1ubuntu1.1 04/01/2014
[ 16.972026] Call Trace:
[ 16.972306] dump_stack_lvl+0x34/0x44
[ 16.972687] print_address_description.constprop.0+0x21/0x140
[ 16.973297] ? pcpu_freelist_populate+0xd9/0x100
[ 16.973777] ? pcpu_freelist_populate+0xd9/0x100
[ 16.974257] kasan_report.cold+0x7f/0x11b
[ 16.974681] ? pcpu_freelist_populate+0xd9/0x100
[ 16.975190] pcpu_freelist_populate+0xd9/0x100
[ 16.975669] stack_map_alloc+0x209/0x2a0
[ 16.976106] __sys_bpf+0xd83/0x2ce0
[...]
The possibility of this overflow was originally discussed in [0], but
was overlooked.
Fix the integer overflow by changing elem_size to u64 from u32.
[0] https://lore.kernel.org/bpf/728b238e-a481-eb50-98e9-b0f430ab01e7@gmail.com/
Bug: 202511260
Fixes: 557c0c6e7d ("bpf: convert stackmap to pre-allocation")
Signed-off-by: Tatsuhiko Yasumatsu <th.yasumatsu@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20210930135545.173698-1-th.yasumatsu@gmail.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Aaron Ding <aaronding@google.com>
Change-Id: I45de17135336ce329b539d3e9e95fdcddafb2b00
We want to use this hook to record the sleeping time due to Futex
Bug: 210947226
Signed-off-by: Liujie Xie <xieliujie@oppo.com>
Change-Id: I637f889dce42937116d10979e0c40fddf96cd1a2
Add a tracehook to allow callers into dma_alloc_contiguous() to make
use of the built-in CMA area if the caller has addressing limitations;
this provides a means of allocating from memory whose bounds are
restricted to the lower 4 GB of memory, without having to enable DMA32
(assuming the default CMA area has been restricted to the appropriate
address ranges).
Leaf changes summary: 1 artifact changed
Changed leaf types summary: 0 leaf type changed
Removed/Changed/Added functions summary: 0 Removed, 0 Changed, 0 Added function
Removed/Changed/Added variables summary: 0 Removed, 0 Changed, 1 Added variable
1 Added variable:
[A] 'tracepoint __tracepoint_android_vh_subpage_dma_contig_alloc'
Bug: 199917449
Change-Id: Ia86fb416376bca231405b06ab27b0674c8fe3e14
Signed-off-by: Chris Goldsworthy <quic_cgoldswo@quicinc.com>
Currently the standard memory allocator (snd_dma_malloc_pages*())
passes the byte size to allocate as is. Most of the backends
allocates real pages, hence the actual allocations are aligned in page
size. However, the genalloc doesn't seem assuring the size alignment,
hence it may result in the access outside the buffer when the whole
memory pages are exposed via mmap.
For avoiding such inconsistencies, this patch makes the allocation
size always to be aligned in page size.
Note that, after this change, snd_dma_buffer.bytes field contains the
aligned size, not the originally requested size. This value is also
used for releasing the pages in return.
BUG: 209931573
cherry picked from commit 5c1733e33c
Change-Id: Ib65f0e29b87d55e13006c7416793a4539d376cc8
Reviewed-by: Lars-Peter Clausen <lars@metafoo.de>
Link: https://lore.kernel.org/r/20201218145625.2045-2-tiwai@suse.de
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Denis Hsu <denis.hsu@mediatek.com>
mmu_notifier_trylock definition for CONFIG_MMU_NOTIFIER=n configuration
has not been modified from the older version. Correct that mistake.
Fixes: 6971350406 ("ANDROID: fix mmu_notifier race caused by not taking mmap_lock during SPF")
Bug: 161210518
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
Change-Id: I71b8644bd2864b6ed98a7ff9c15a99fbd4c5a6c5
When a kernel thread calls dma_buf_put() to release the last reference
to a dma-buf, fput_many() defers calling the release callback to a
workqueue. This means that if the same kernel thread later calls
dma_heap_buffer_alloc(), it has no guarantee that the memory from the
prior free is available, leading to random failures. As a short-term
workaround, call flush_delayed_fput() to ensure the free completes
synchronously.
Leaf changes summary: 1 artifact changed
Changed leaf types summary: 0 leaf type changed
Removed/Changed/Added functions summary: 0 Removed, 0 Changed, 1 Added function
Removed/Changed/Added variables summary: 0 Removed, 0 Changed, 0 Added variable
1 Added function:
[A] 'function void flush_delayed_fput()'
Bug: 210598057
Change-Id: Id936aa0bcd410b23b12f4b922b676aa61a358b4c
Signed-off-by: Patrick Daly <quic_pdaly@quicinc.com>
To prevent ABI breakage, move mm->mmu_notifier_lock into
mm->notifier_subscriptions and allocate mm->notifier_subscriptions
during mm creation in mmu_notifier_subscriptions_init. This results
in additional 176 bytes allocated for each mm, but prevents ABI breakage.
mmu_notifier_subscriptions_hdr structure is introduced at the beginning
of mmu_notifier_subscriptions to keep mmu_notifier_subscriptions hidden
and prevent its type CRC from changing when used in other structures.
Bug: 161210518
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
Change-Id: I6f435708d642b70b22e0243c8b33108c208ce5bb
percpu_rw_semaphore changes to allow calling percpu_free_rwsem in atomic
context cause ABI breakage. Introduce percpu_free_rwsem_atomic wrapper
and change percpu_rwsem_destroy to use it in order to keep
percpu_rw_semaphore struct intact and fix ABI breakage.
Bug: 161210518
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
Change-Id: I198a6381fb48059f2aaa2ec38b8c1e5e5e936bb0
When pagefaults are handled speculatively,the pair of
mmu_notifier_invalidate_range_start/mmu_notifier_invalidate_range_end
calls happen without mmap_lock being taken. This enables the following
race:
mmu_notifier_invalidate_range_start
mmap_write_lock
mmu_notifier_register
mmap_write_unlock
mmu_notifier_invalidate_range_end
In this case mmu_notifier_invalidate_range_end will see a new
subscriber not seen at the time of mmu_notifier_invalidate_range_start
and will call ops->invalidate_range_end for that subscriber without
the matching ops->invalidate_range_start, creating imbalance.
Fix this by introducing a new mm->mmu_notifier_lock percpu_rw_semaphore
to synchronize mmu_notifier_invalidate_range_start/
mmu_notifier_invalidate_range_end with mmu_notifier_register when
handling pagefaults speculatively without holding mmap_lock.
percpu_rw_semaphore is used instead of rw_semaphore to prevent cache
line bouncing in the pagefault path.
Fixes: 86ee4a531e ("FROMLIST: x86/mm: add speculative pagefault handling")
Bug: 161210518
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
Change-Id: I9c363b2348efcad19818f93b010abf956870ab55
Android OTA failed due to SBI_NEED_FSCK flag when pinning the file. Let's avoid
it since we can do in-place-updates.
Bug: 210593661
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
(cherry picked from commit 70da2736a4138b86a12873d33fefbb495e22e6f8
git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs.git dev)
Signed-off-by: Huang Jianan <huangjianan@oppo.com>
Change-Id: I3fd33c984417c10b38e23de6cec017b03d588945