Commit Graph

1149966 Commits

Author SHA1 Message Date
Zhaoyang Huang
7ae1e02abb UPSTREAM: mm: skip CMA pages when they are not available
This patch fixes unproductive reclaiming of CMA pages by skipping them
when they are not available for current context.  It arises from the below
OOM issue, which was caused by a large proportion of MIGRATE_CMA pages
among free pages.

[   36.172486] [03-19 10:05:52.172] ActivityManager: page allocation failure: order:0, mode:0xc00(GFP_NOIO), nodemask=(null),cpuset=foreground,mems_allowed=0
[   36.189447] [03-19 10:05:52.189] DMA32: 0*4kB 447*8kB (C) 217*16kB (C) 124*32kB (C) 136*64kB (C) 70*128kB (C) 22*256kB (C) 3*512kB (C) 0*1024kB 0*2048kB 0*4096kB = 35848kB
[   36.193125] [03-19 10:05:52.193] Normal: 231*4kB (UMEH) 49*8kB (MEH) 14*16kB (H) 13*32kB (H) 8*64kB (H) 2*128kB (H) 0*256kB 1*512kB (H) 0*1024kB 0*2048kB 0*4096kB = 3236kB
...
[   36.234447] [03-19 10:05:52.234] SLUB: Unable to allocate memory on node -1, gfp=0xa20(GFP_ATOMIC)
[   36.234455] [03-19 10:05:52.234] cache: ext4_io_end, object size: 64, buffer size: 64, default order: 0, min order: 0
[   36.234459] [03-19 10:05:52.234] node 0: slabs: 53,objs: 3392, free: 0

This change further decreases the chance for wrong OOMs in the presence
of a lot of CMA memory.

[david@redhat.com: changelog addition]
Link: https://lkml.kernel.org/r/1685501461-19290-1-git-send-email-zhaoyang.huang@unisoc.com
Change-Id: I84f1145c38b5ff7b825f2122b33bc55997931bd7
Signed-off-by: Zhaoyang Huang <zhaoyang.huang@unisoc.com>
Acked-by: David Hildenbrand <david@redhat.com>
Cc: ke.wang <ke.wang@unisoc.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
(cherry picked from commit 5da226dbfc)
Bug: 288383787
Bug: 291719697
Signed-off-by: Kalesh Singh <kaleshsingh@google.com>
2023-08-15 19:57:01 +00:00
Dan Carpenter
7666325265 UPSTREAM: dma-buf: fix an error pointer vs NULL bug
Smatch detected potential error pointer dereference.

    drivers/gpu/drm/drm_syncobj.c:888 drm_syncobj_transfer_to_timeline()
    error: 'fence' dereferencing possible ERR_PTR()

The error pointer comes from dma_fence_allocate_private_stub().  One
caller expected error pointers and one expected NULL pointers.  Change
it to return NULL and update the caller which expected error pointers,
drm_syncobj_assign_null_handle(), to check for NULL instead.
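
The mismatch described above can be sketched in user space. This is only an illustration under stated assumptions: ERR_PTR() and IS_ERR() are re-implemented here as stand-ins for the kernel helpers, and struct fence, alloc_stub_errptr(), alloc_stub_null() and null_check_catches_failure() are hypothetical names, not the dma-buf API.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* User-space stand-ins for the kernel's ERR_PTR()/IS_ERR() helpers. */
#define MAX_ERRNO 4095

static inline void *ERR_PTR(long error)
{
	return (void *)error;
}

static inline int IS_ERR(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

struct fence { int signaled; };	/* hypothetical, not struct dma_fence */

/* Illustrative "old" convention: failure is reported as an error pointer. */
static struct fence *alloc_stub_errptr(int fail)
{
	static struct fence f;

	return fail ? ERR_PTR(-ENOMEM) : &f;
}

/* Illustrative "new" convention: failure is reported as NULL. */
static struct fence *alloc_stub_null(int fail)
{
	static struct fence f;

	return fail ? NULL : &f;
}

/* A caller that only tests for NULL does not notice an error pointer. */
static int null_check_catches_failure(const struct fence *f)
{
	return f == NULL;
}
```

With one convention on both sides, a single NULL test is enough; with mixed conventions, the NULL-checking caller would go on to dereference the error pointer.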

Bug: 286438670
Fixes: f781f661e8 ("dma-buf: keep the signaling time of merged fences v3")
Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org>
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Sumit Semwal <sumit.semwal@linaro.org>
Signed-off-by: Sumit Semwal <sumit.semwal@linaro.org>
Link: https://patchwork.freedesktop.org/patch/msgid/b09f1996-3838-4fa2-9193-832b68262e43@moroto.mountain
(cherry picked from commit 00ae1491f9)
Change-Id: I9fe1e61543e84a0f22d8ec26e01d94b809620744
Signed-off-by: Jindong Yue <jindong.yue@nxp.com>
2023-08-15 16:58:34 +00:00
Christian König
e61d76121f UPSTREAM: dma-buf: keep the signaling time of merged fences v3
Some Android CTS tests check whether the signaling time stays consistent
during merges.

v2: use the current time if the fence is still in the signaling path and
the timestamp is not yet available.
v3: improve comment, fix one more case to use the correct timestamp

Bug: 286438670
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Luben Tuikov <luben.tuikov@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230630120041.109216-1-christian.koenig@amd.com
(cherry picked from commit f781f661e8)
Change-Id: I5cd3178213fc28ac67146f58fddf83f7d482fd76
Signed-off-by: Jindong Yue <jindong.yue@nxp.com>
2023-08-15 16:58:34 +00:00
Pablo Neira Ayuso
fda157ce15 UPSTREAM: netfilter: nf_tables: skip bound chain on rule flush
[ Upstream commit 6eaf41e87a ]

Skip bound chains when flushing table rules; the rule that owns the
chain releases these objects.

Otherwise, the following warning is triggered:

  WARNING: CPU: 2 PID: 1217 at net/netfilter/nf_tables_api.c:2013 nf_tables_chain_destroy+0x1f7/0x210 [nf_tables]
  CPU: 2 PID: 1217 Comm: chain-flush Not tainted 6.1.39 #1
  RIP: 0010:nf_tables_chain_destroy+0x1f7/0x210 [nf_tables]

Bug: 294357305
Fixes: d0e2c7de92 ("netfilter: nf_tables: add NFT_CHAIN_BINDING")
Reported-by: Kevin Rich <kevinrich1337@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
(cherry picked from commit e18922ce3e)
Signed-off-by: Lee Jones <joneslee@google.com>
Change-Id: I48f43d0ce3410efec2513479a1f4c7708a097b01
2023-08-15 16:18:17 +00:00
Pedro Tammela
110a26edd1 UPSTREAM: net/sched: sch_qfq: account for stab overhead in qfq_enqueue
[ Upstream commit 3e337087c3 ]

Lion says:
-------
In the QFQ scheduler a similar issue to CVE-2023-31436
persists.

Consider the following code in net/sched/sch_qfq.c:

static int qfq_enqueue(struct sk_buff *skb, struct Qdisc *sch,
                struct sk_buff **to_free)
{
     unsigned int len = qdisc_pkt_len(skb), gso_segs;

    // ...

     if (unlikely(cl->agg->lmax < len)) {
         pr_debug("qfq: increasing maxpkt from %u to %u for class %u",
              cl->agg->lmax, len, cl->common.classid);
         err = qfq_change_agg(sch, cl, cl->agg->class_weight, len);
         if (err) {
             cl->qstats.drops++;
             return qdisc_drop(skb, sch, to_free);
         }

    // ...

     }

Similarly to CVE-2023-31436, "lmax" is increased without any bounds
checks according to the packet length "len". Usually this would not
impose a problem because packet sizes are naturally limited.

This is however not the actual packet length, rather the
"qdisc_pkt_len(skb)" which might apply size transformations according to
"struct qdisc_size_table" as created by "qdisc_get_stab()" in
net/sched/sch_api.c if the TCA_STAB option was set when modifying the qdisc.

A user may choose virtually any size using such a table.

As a result the same issue as in CVE-2023-31436 can occur, allowing heap
out-of-bounds read / writes in the kmalloc-8192 cache.
-------

We can create the issue with the following commands:

tc qdisc add dev $DEV root handle 1: stab mtu 2048 tsize 512 mpu 0 \
overhead 999999999 linklayer ethernet qfq
tc class add dev $DEV parent 1: classid 1:1 htb rate 6mbit burst 15k
tc filter add dev $DEV parent 1: matchall classid 1:1
ping -I $DEV 1.1.1.2

This is caused by incorrectly assuming that qdisc_pkt_len() returns a
length within the range QFQ_MIN_LMAX < len < QFQ_MAX_LMAX.
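
A minimal user-space sketch of the kind of bound check this implies. The limit values, stab_adjusted_len() and grow_lmax() are assumptions made for illustration; this is not the code from net/sched/sch_qfq.c.

```c
#include <assert.h>
#include <errno.h>

/* Assumed limits for illustration; see net/sched/sch_qfq.c for the real ones. */
#define QFQ_MIN_LMAX 512u
#define QFQ_MAX_LMAX (1u << 16)

/* Hypothetical model of a size table that only adds a fixed overhead. */
static unsigned int stab_adjusted_len(unsigned int wire_len,
				      unsigned int overhead)
{
	return wire_len + overhead;
}

/* Grow lmax to fit len, but refuse lengths above the supported maximum. */
static int grow_lmax(unsigned int *lmax, unsigned int len)
{
	if (len <= *lmax)
		return 0;		/* nothing to grow */
	if (len > QFQ_MAX_LMAX)
		return -EINVAL;		/* the caller is expected to drop the packet */
	*lmax = len;
	return 0;
}
```

In this model, a packet whose adjusted length exceeds the maximum is rejected and lmax keeps its previous value.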

Bug: 292249631
Fixes: 462dbc9101 ("pkt_sched: QFQ Plus: fair-queueing service at DRR cost")
Reported-by: Lion <nnamrec@gmail.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: Pedro Tammela <pctammela@mojatatu.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
(cherry picked from commit 70feebdbfa)
Signed-off-by: Lee Jones <joneslee@google.com>
Change-Id: I69bec7b092e980fe8e0946c26ed9b5ac7c57bf3d
2023-08-15 16:15:08 +00:00
Pedro Tammela
9db1437238 UPSTREAM: net/sched: sch_qfq: refactor parsing of netlink parameters
[ Upstream commit 25369891fc ]

Two parameters can be transformed into netlink policies and
validated while parsing the netlink message.

Bug: 292249631
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: Pedro Tammela <pctammela@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Stable-dep-of: 3e337087c3 ("net/sched: sch_qfq: account for stab overhead in qfq_enqueue")
Signed-off-by: Sasha Levin <sashal@kernel.org>
(cherry picked from commit 4b33836824)
Signed-off-by: Lee Jones <joneslee@google.com>
Change-Id: Ifce65b6b0ce2f7dee2040a4c91fd90ea7b2e8f3c
2023-08-15 16:15:08 +00:00
Florian Westphal
7688102949 UPSTREAM: netfilter: nft_set_pipapo: fix improper element removal
[ Upstream commit 87b5a5c209 ]

end key should be equal to start unless NFT_SET_EXT_KEY_END is present.

It's possible to add elements that only have a start key
("{ 1.0.0.0 . 2.0.0.0 }") without an interval end.

Insertion treats this via:

if (nft_set_ext_exists(ext, NFT_SET_EXT_KEY_END))
   end = (const u8 *)nft_set_ext_key_end(ext)->data;
else
   end = start;

but the removal side always uses nft_set_ext_key_end().
This is wrong and leads to garbage remaining in the set after removal;
the next lookup/insert attempt will give:

BUG: KASAN: slab-use-after-free in pipapo_get+0x8eb/0xb90
Read of size 1 at addr ffff888100d50586 by task nft-pipapo_uaf_/1399
Call Trace:
 kasan_report+0x105/0x140
 pipapo_get+0x8eb/0xb90
 nft_pipapo_insert+0x1dc/0x1710
 nf_tables_newsetelem+0x31f5/0x4e00
 ..
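
One way to keep both paths consistent is to resolve the end key in a single helper. The sketch below is a user-space model under stated assumptions: struct elem_ext and elem_end_key() are hypothetical and do not mirror the nf_tables extension layout.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a set element's extension area. */
struct elem_ext {
	const uint8_t *key;	/* start key, always present */
	const uint8_t *key_end;	/* only meaningful when has_key_end is true */
	bool has_key_end;
};

/* One helper, so insertion and removal resolve the end key identically. */
static const uint8_t *elem_end_key(const struct elem_ext *ext)
{
	return ext->has_key_end ? ext->key_end : ext->key;
}
```

If both the insert and the remove path call the same helper, an element without an end key is looked up with end == start in both directions.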

Bug: 293587745
Fixes: 3c4287f620 ("nf_tables: Add set type for arbitrary concatenation of ranges")
Reported-by: lonial con <kongln9170@gmail.com>
Reviewed-by: Stefano Brivio <sbrivio@redhat.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
(cherry picked from commit 90c3955beb)
Signed-off-by: Lee Jones <joneslee@google.com>
Change-Id: I51a423aaa2c31c4df89776505b602aa2c1523b82
2023-08-15 11:50:35 +01:00
Yifan Hong
37f4509407 ANDROID: Add checkpatch target.
Running the following will run scripts/checkpatch.pl on a
patch of HEAD

  tools/bazel run //common:checkpatch

or a given Git SHA1:

  tools/bazel run //common:checkpatch -- --git_sha1 ...

For additional flags, see

  tools/bazel run //common:checkpatch -- --help

For details, see
  build/kernel/kleaf/docs/checkpatch.md
in your source tree.

Test: TH
Bug: 259995152
Change-Id: Iaad8fd69508cf9be11340166aafbb84930d4805c
Signed-off-by: Yifan Hong <elsk@google.com>
(cherry picked from commit 7dbf26568fcccde88470e7a25c07f0c7229e85f1)
2023-08-11 17:53:56 +00:00
Alan Stern
d7dacaa439 UPSTREAM: USB: Gadget: core: Help prevent panic during UVC unconfigure
Avichal Rakesh reported a kernel panic that occurred when the UVC
gadget driver was removed from a gadget's configuration.  The panic
involves a somewhat complicated interaction between the kernel driver
and a userspace component (as described in the Link tag below), but
the analysis did make one thing clear: The Gadget core should
accommodate gadget drivers calling usb_gadget_deactivate() as part of
their unbind procedure.

Currently this doesn't work.  gadget_unbind_driver() calls
driver->unbind() while holding the udc->connect_lock mutex, and
usb_gadget_deactivate() attempts to acquire that mutex, which will
result in a deadlock.

The simple fix is for gadget_unbind_driver() to release the mutex when
invoking the ->unbind() callback.  There is no particular reason for
it to be holding the mutex at that time, and the mutex isn't held
while the ->bind() callback is invoked.  So we'll drop the mutex
before performing the unbind callback and reacquire it afterward.
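
The locking pattern can be modelled in user space as follows. This is a sketch only: toy_mutex is a non-recursive flag that asserts on double acquisition (standing in for the deadlock), and struct udc_model, model_deactivate() and model_unbind_driver() are hypothetical names, not the Gadget core functions.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy non-recursive lock: taking it twice stands in for a deadlock. */
struct toy_mutex { bool held; };

static void toy_lock(struct toy_mutex *m)
{
	assert(!m->held);
	m->held = true;
}

static void toy_unlock(struct toy_mutex *m)
{
	assert(m->held);
	m->held = false;
}

struct udc_model {			/* hypothetical, not struct usb_udc */
	struct toy_mutex connect_lock;
	bool active;
	void (*unbind)(struct udc_model *udc);
};

/* Models a helper that takes connect_lock itself, like a deactivate call. */
static void model_deactivate(struct udc_model *udc)
{
	toy_lock(&udc->connect_lock);
	udc->active = false;
	toy_unlock(&udc->connect_lock);
}

/* Drop the lock around the callback, then retake it for the remaining work. */
static void model_unbind_driver(struct udc_model *udc)
{
	toy_lock(&udc->connect_lock);
	/* ... work that needs the lock ... */
	toy_unlock(&udc->connect_lock);

	udc->unbind(udc);		/* may call model_deactivate() safely */

	toy_lock(&udc->connect_lock);
	/* ... remaining teardown under the lock ... */
	toy_unlock(&udc->connect_lock);
}
```

If the callback were invoked with the lock still held, the assertion in toy_lock() would fire, which is the model's equivalent of the deadlock.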

We'll also add a couple of comments to usb_gadget_activate() and
usb_gadget_deactivate().  Because they run in process context they
must not be called from a gadget driver's ->disconnect() callback,
which (according to the kerneldoc for struct usb_gadget_driver in
include/linux/usb/gadget.h) may run in interrupt context.  This may
help prevent similar bugs from arising in the future.

Reported-and-tested-by: Avichal Rakesh <arakesh@google.com>
Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Fixes: 286d9975a8 ("usb: gadget: udc: core: Prevent soft_connect_store() race")
Link: https://lore.kernel.org/linux-usb/4d7aa3f4-22d9-9f5a-3d70-1bd7148ff4ba@google.com/
Cc: Badhri Jagan Sridharan <badhri@google.com>
Cc: <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/48b2f1f1-0639-46bf-bbfc-98cb05a24914@rowland.harvard.edu
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

Bug: 291976100
Change-Id: Icff01d8e88f041af4bda8726242de9cd518a247a
(cherry picked from commit 65dadb2bee)
Signed-off-by: Avichal Rakesh <arakesh@google.com>
2023-08-11 17:30:16 +00:00
Zhanyuan Hu
4dc009c3a8 ANDROID: GKI: Update symbols to symbol list
Update symbols to symbol list externed by oppo memory group.

ABI DIFFERENCES HAVE BEEN DETECTED!
1 variable symbol(s) added
  'unsigned long zero_pfn'

Bug: 292051411

Change-Id: I913c01c7671729bf33b78a218c61cfb94628fb0e
Signed-off-by: huzhanyuan <huzhanyuan@oppo.com>
2023-08-11 17:12:52 +00:00
xieliujie
fadc35923d ANDROID: vendor_hook: fix the error record position of mutex
Make sure the vendor hook trace_android_vh_record_mutex_lock_starttime works in both the fastpath unlock and the slowpath unlock.

Fixes: 57750518de ("ANDROID: vendor_hook: Avoid clearing protect-flag before waking waiters")
Bug: 286024926
Change-Id: Ib91c1b88d27aaa4ef872d44102969ffc3c9adb58
Signed-off-by: xieliujie <xieliujie@oppo.com>
2023-08-11 16:57:28 +00:00
Woogeun Lee
3fc69d3f70 ANDROID: ABI: add allowed list for galaxy
19 function symbol(s) added
  'int __fsnotify_parent(struct dentry*, __u32, const void*, int)'
  'int __traceiter_android_vh_wq_lockup_pool(void*, int, unsigned long)'
  'int cleancache_register_ops(const struct cleancache_ops*)'
  'int fsnotify(__u32, const void*, int, struct inode*, const struct qstr*, struct inode*, u32)'
  'void kernel_neon_begin()'
  'void kernel_neon_end()'
  'int kstrtos16(const char*, unsigned int, s16*)'
  'int regulator_get_current_limit(struct regulator*)'
  'int smpboot_register_percpu_thread(struct smp_hotplug_thread*)'
  'void smpboot_unregister_percpu_thread(struct smp_hotplug_thread*)'
  'int snd_soc_add_card_controls(struct snd_soc_card*, const struct snd_kcontrol_new*, int)'
  'unsigned int stack_trace_save_regs(struct pt_regs*, unsigned long*, unsigned int, unsigned int)'
  'int tcp_register_congestion_control(struct tcp_congestion_ops*)'
  'void tcp_reno_cong_avoid(struct sock*, u32, u32)'
  'u32 tcp_reno_ssthresh(struct sock*)'
  'u32 tcp_reno_undo_cwnd(struct sock*)'
  'u32 tcp_slow_start(struct tcp_sock*, u32)'
  'void tcp_unregister_congestion_control(struct tcp_congestion_ops*)'
  'int usb_set_configuration(struct usb_device*, int)'

1 variable symbol(s) added
  'struct tracepoint __tracepoint_android_vh_wq_lockup_pool'

Bug: 294125592
Change-Id: I6c2f2fb274dbe45263e39e43b4b8bc3766ef2bab
Signed-off-by: Woogeun Lee <woogeun.lee@samsung.com>
2023-08-10 19:30:21 +00:00
Jaewon Kim
a5a662187f ANDROID: gfp: add __GFP_CMA in gfpflag_names
__GFP_CMA was added but was not added to gfpflag_names. Add it so that
it shows up in %pGg printk output.

Bug: 295271520
Signed-off-by: Jaewon Kim <jaewon31.kim@samsung.com>
Change-Id: I155fdcc0e2c18db390b5166ba8d2b93c793caae6
2023-08-10 18:41:10 +00:00
Ramji Jiyani
b520b90913 ANDROID: ABI: Update to fix slab-out-of-bounds in xhci_vendor_get_ops
type 'struct xhci_hcd' changed
  member 'union { struct xhci_vendor_ops* vendor_ops; struct { u64 android_kabi_reserved1; }; union { }; }' was added
  member 'u64 android_kabi_reserved1' was removed

Bug: 293869685
Test: TH
Change-Id: I1fa551fc1b9263302d38f4e2989eed9f5f0d816a
Signed-off-by: Ramji Jiyani <ramjiyani@google.com>
2023-08-10 18:29:38 +00:00
Howard Yen
c2cbb3cc24 ANDROID: usb: host: fix slab-out-of-bounds in xhci_vendor_get_ops
A slab-out-of-bounds access happens if the xhci platform drivers don't
define extra_priv_size in their xhci_driver_overrides structure. Move
the xhci_vendor_ops structure into the main xhci structure so that
extra_priv_size no longer affects xhci_vendor_get_ops, which is what
causes the slab-out-of-bounds error.

Fixes: 90ab8e7f98 ("ANDROID: usb: host: add xhci hooks for USB offload")
Bug: 293869685
Bug: 194461020
Test: build and boot pass
Change-Id: Id17fdfbfd3e8edcc89a05c9c2f553ffab494215e
Signed-off-by: Howard Yen <howardyen@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
(cherry picked from commit 34f6c9c308)
(cherry picked from commit 00666b8e3e)
2023-08-10 18:29:38 +00:00
André Draszik
64787ee451 ANDROID: GKI: update pixel symbol list for xhci
Pixel is using these symbols in its USB driver implementation.

3 function symbol(s) added
  'int xhci_address_device(struct usb_hcd*, struct usb_device*)'
  'int xhci_bus_resume(struct usb_hcd*)'
  'int xhci_bus_suspend(struct usb_hcd*)'

Bug: 277396090
Bug: 287008367
Change-Id: Id89097ab094e0582560383793c91278c88cb078f
Signed-off-by: André Draszik <draszik@google.com>
2023-08-10 14:31:27 +01:00
Andrew Yang
b0c06048a8 FROMGIT: fs: drop_caches: draining pages before dropping caches
We expect a file page access after dropping caches should be a major
fault, but sometimes it's still a minor fault.  That's because a file page
can't be dropped if it's in a per-cpu pagevec.  Drain all pages from the
per-cpu pagevecs to the LRU list before trying to drop caches.

Link: https://lkml.kernel.org/r/20230630092203.16080-1-andrew.yang@mediatek.com
Change-Id: I9b03c53e39b87134d5ddd0c40ac9b36cf4d190cd
Signed-off-by: Andrew Yang <andrew.yang@mediatek.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
Cc: Christian Brauner <brauner@kernel.org>
Cc: Matthias Brugger <matthias.bgg@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Bug: 285794522
(cherry picked from commit a481c6fdf3e4fdf31bda91098dfbf46098037e76
 https://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm.git mm-unstable)
2023-08-10 10:47:24 +00:00
Author Name
2f76bb83b1 ANDROID: GKI: update symbol list file for xiaomi
INFO: ABI DIFFERENCES HAVE BEEN DETECTED!
INFO: 8 function symbol(s) added
  'int sock_wake_async(struct socket_wq *wq, int how, int band)'
  'void bpf_map_put(struct bpf_map *map)'
  'void bpf_map_inc(struct bpf_map *map)'
  'int __dev_direct_xmit(struct sk_buff *skb, u16 queue_id)'
  'void napi_busy_loop(unsigned int napi_id,bool (*loop_end)(void *,
                       unsigned long),void *loop_end_arg, bool
                       prefer_busy_poll, u16 budget)'
  'bool dma_need_sync(struct device *dev, dma_addr_t dma_addr)'
  'void page_pool_put_page_bulk(struct page_pool *pool, void **data,
                                int count)'
  'struct sk_buff *build_skb_around(struct sk_buff *skb,void *data,
                                    unsigned int frag_size)'
INFO: 2 variable symbol(s) added
  'DECLARE_PER_CPU(struct bpf_redirect_info, bpf_redirect_info)'
  'DEFINE_PER_CPU(struct mem_cgroup *, int_active_memcg)'

Bug: 294257769

Change-Id: I98da395227810eecb1fd978dedd20fba445757d0
Signed-off-by: dongziqi <dongziqi1@xiaomi.corp-partner.google.com>
2023-08-09 22:53:11 +00:00
Elliot Berman
8e86825eec ANDROID: uid_sys_stats: Use a single work for deferred updates
uid_sys_stats tries to acquire a lock when any task exits to do some
bookkeeping in common data structure. If the lock is contended, it
allocates and schedules a work to do the work later to avoid task exit
latency.

In a stress test which creates many tasks exiting, the workqueue can be
overwhelmed by the number of works being scheduled and allocates more
worker threads to handle queue. The growth of the number of threads is
effectively unbounded and can exhaust the process table. This causes
denial of service to userspace trying to fork().

Instead of allocating a new work each, create a linked list of the
update stats deferred work and have a single work to drain the linked
list. The linked list is implemented using an atomic_long_t.
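
A minimal C11 sketch of that pattern, assuming a lock-free push and a single drain step. struct deferred_update, pending_head, defer_update() and drain_updates() are hypothetical names, and an atomic pointer is used here in place of the atomic_long_t the commit mentions.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

struct deferred_update {		/* hypothetical payload */
	struct deferred_update *next;
	long delta;
};

/* List head shared by all producers and the single worker. */
static _Atomic(struct deferred_update *) pending_head;

/* Producer side: push one node without taking a lock. */
static void defer_update(struct deferred_update *u)
{
	struct deferred_update *old = atomic_load(&pending_head);

	do {
		u->next = old;
	} while (!atomic_compare_exchange_weak(&pending_head, &old, u));
}

/* Single worker: detach the whole list at once and walk it privately. */
static long drain_updates(void)
{
	struct deferred_update *u = atomic_exchange(&pending_head, NULL);
	long total = 0;

	for (; u; u = u->next)
		total += u->delta;
	return total;
}
```

Because the worker detaches the entire list with one exchange, any number of queued updates is handled by a single work item rather than one work item per exiting task.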

Bug: 294468796
Fixes: 5586278c0f ("ANDROID: uid_sys_stats: defer process_notifier work if uid_lock is contended")
Change-Id: I15f20f4f69ea66a452bdf815c4ef3a0da3edfd36
Signed-off-by: Elliot Berman <quic_eberman@quicinc.com>
2023-08-09 20:50:55 +00:00
Junki Min
960d9828ee ANDROID: ABI: Update symbol for Exynos SoC
Update symbols for Exynos WLBT driver.

1 function symbol(s) added
  'unsigned long __find_nth_bit(const unsigned long*, unsigned long, unsigned long)'

Bug: 294470344
Change-Id: I9f8d9d20f643b34bbc475dde468dbaa11f56e667
Signed-off-by: Junki Min <joonki.min@samsung.com>
2023-08-08 18:02:10 +00:00
Jiewen Wang
3926cc6ef8 ANDROID: GKI: Add symbols to symbol list for vivo
INFO: 1 function symbol(s) added
  'int __traceiter_android_vh_tune_scan_type(void*, enum scan_balance*)'

1 variable symbol(s) added
  'struct tracepoint __tracepoint_android_vh_tune_scan_type'

Bug: 294180281

Change-Id: I171099cdbe68c04885e286554f56290356d543d2
Signed-off-by: Jiewen Wang <jiewen.wang@vivo.com>
2023-08-07 18:11:35 +00:00
Jiewen Wang
dbb09068c1 ANDROID: vendor_hooks: Add tune scan type hook in get_scan_count()
Add a hook in get_scan_count() so that OEMs can apply a customized reclamation strategy.

Bug: 294180281
Change-Id: Ic54d35128e458661fc2b641809f5371b1d9a488e
Signed-off-by: Jiewen Wang <jiewen.wang@vivo.com>
2023-08-07 18:11:35 +00:00
Kalesh Singh
5e1d25ac2a FROMGIT: BACKPORT: Multi-gen LRU: Fix can_swap in lru_gen_look_around()
walk->can_swap might be invalid since it's not guaranteed to be
initialized for the particular lruvec.  Instead deduce it from the folio
type (anon/file).

Link: https://lkml.kernel.org/r/20230802025606.346758-3-kaleshsingh@google.com
Fixes: 018ee47f14 ("mm: multi-gen LRU: exploit locality in rmap")
Change-Id: I1ae78011d4972d87bac9f2db8c56352cdb7a9be6
Signed-off-by: Kalesh Singh <kaleshsingh@google.com>
Tested-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> [mediatek]
Tested-by: Charan Teja Kalla <quic_charante@quicinc.com>
Cc: Yu Zhao <yuzhao@google.com>
Cc: Aneesh Kumar K V <aneesh.kumar@linux.ibm.com>
Cc: Barry Song <baohua@kernel.org>
Cc: Brian Geffon <bgeffon@google.com>
Cc: Jan Alexander Steffens (heftig) <heftig@archlinux.org>
Cc: Lecopzer Chen <lecopzer.chen@mediatek.com>
Cc: Matthias Brugger <matthias.bgg@gmail.com>
Cc: Oleksandr Natalenko <oleksandr@natalenko.name>
Cc: Qi Zheng <zhengqi.arch@bytedance.com>
Cc: Steven Barrett <steven@liquorix.net>
Cc: Suleiman Souhlal <suleiman@google.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
(cherry picked from commit fdf19e8c8f1cdcee4eccf4c98a875f44f39d8b9d https://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm.git mm-unstable)
Bug: 288383787
Bug: 291719697
[ Kalesh Singh - Fix trivial conflict in lru_gen_look_around() ]
Signed-off-by: Kalesh Singh <kaleshsingh@google.com>
2023-08-03 20:45:58 +00:00
Kalesh Singh
addf1a9a65 FROMGIT: Multi-gen LRU: Avoid race in inc_min_seq()
inc_max_seq() will try to inc_min_seq() if nr_gens == MAX_NR_GENS. This
is because the generations are reused (the last oldest now empty
generation will become the next youngest generation).

inc_min_seq() is retried until successful, dropping the lru_lock
and yielding the CPU on each failure, and retaking the lock before
trying again:

        while (!inc_min_seq(lruvec, type, can_swap)) {
                spin_unlock_irq(&lruvec->lru_lock);
                cond_resched();
                spin_lock_irq(&lruvec->lru_lock);
        }

However, the initial condition that required incrementing the min_seq
(nr_gens == MAX_NR_GENS) is not retested. This can change by another
call to inc_max_seq() from run_aging() with force_scan=true from the
debugfs interface.

Since the eviction stalls when the nr_gens == MIN_NR_GENS, avoid
unnecessarily incrementing the min_seq by rechecking the number of
generations before each attempt.
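
The recheck can be illustrated with a small user-space model. Everything here is an assumption made for the sketch: struct gen_model, try_inc_min_seq(), make_room() and concurrent_advance() are hypothetical, and the on_unlock callback simply simulates what another context might do while the lock is dropped.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_NR_GENS 4	/* assumed value for this model */

struct gen_model {	/* hypothetical, not the kernel's lruvec state */
	unsigned long min_seq, max_seq;
	int failures_left;			/* attempts that fail first */
	void (*on_unlock)(struct gen_model *g);	/* runs while the lock is dropped */
};

static unsigned long nr_gens(const struct gen_model *g)
{
	return g->max_seq - g->min_seq + 1;
}

static bool try_inc_min_seq(struct gen_model *g)
{
	if (g->failures_left > 0) {
		g->failures_left--;
		return false;
	}
	g->min_seq++;
	return true;
}

/* Retest the generation count before every attempt, not only the first. */
static void make_room(struct gen_model *g)
{
	while (nr_gens(g) == MAX_NR_GENS) {
		if (try_inc_min_seq(g))
			break;
		/* lock dropped here: another context may change min_seq */
		if (g->on_unlock)
			g->on_unlock(g);
	}
}

/* Stand-in for a concurrent caller that retires the oldest generation. */
static void concurrent_advance(struct gen_model *g)
{
	g->min_seq++;
}
```

In the model, when the simulated concurrent caller has already advanced min_seq, the loop exits without incrementing it a second time.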

This issue was uncovered in previous discussion on the list by Yu Zhao
and Aneesh Kumar [1].

[1] https://lore.kernel.org/linux-mm/CAOUHufbO7CaVm=xjEb1avDhHVvnC8pJmGyKcFf2iY_dpf+zR3w@mail.gmail.com/

Link: https://lkml.kernel.org/r/20230802025606.346758-2-kaleshsingh@google.com
Fixes: d6c3af7d8a ("mm: multi-gen LRU: debugfs interface")
Change-Id: I89e84ef2927eb1b0091f1be28bd03eb04dee4c57
Signed-off-by: Kalesh Singh <kaleshsingh@google.com>
Tested-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> [mediatek]
Tested-by: Charan Teja Kalla <quic_charante@quicinc.com>
Cc: Yu Zhao <yuzhao@google.com>
Cc: Aneesh Kumar K V <aneesh.kumar@linux.ibm.com>
Cc: Barry Song <baohua@kernel.org>
Cc: Brian Geffon <bgeffon@google.com>
Cc: Jan Alexander Steffens (heftig) <heftig@archlinux.org>
Cc: Lecopzer Chen <lecopzer.chen@mediatek.com>
Cc: Matthias Brugger <matthias.bgg@gmail.com>
Cc: Oleksandr Natalenko <oleksandr@natalenko.name>
Cc: Qi Zheng <zhengqi.arch@bytedance.com>
Cc: Steven Barrett <steven@liquorix.net>
Cc: Suleiman Souhlal <suleiman@google.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
(cherry picked from commit 250dbd10306126b06415afda8adfc27b2b780428 https://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm.git mm-unstable)
Bug: 288383787
Bug: 291719697
Signed-off-by: Kalesh Singh <kaleshsingh@google.com>
2023-08-03 20:45:58 +00:00
Kalesh Singh
a7adb98897 FROMGIT: Multi-gen LRU: Fix per-zone reclaim
MGLRU has a LRU list for each zone for each type (anon/file) in each
generation:

	long nr_pages[MAX_NR_GENS][ANON_AND_FILE][MAX_NR_ZONES];

The min_seq (oldest generation) can progress independently for each
type but the max_seq (youngest generation) is shared for both anon and
file. This is to maintain a common frame of reference.

In order for eviction to advance the min_seq of a type, all the per-zone
lists in the oldest generation of that type must be empty.

The eviction logic only considers pages from eligible zones for
eviction or promotion.

    scan_folios() {
	...
	for (zone = sc->reclaim_idx; zone >= 0; zone--)  {
	    ...
	    sort_folio(); 	// Promote
	    ...
	    isolate_folio(); 	// Evict
	}
	...
    }

Consider a system that has the movable zone configured and the default 4
generations. The current state of the system is as shown below
(only illustrating one type for simplicity):

Type: ANON

	Zone    DMA32     Normal    Movable    Device

	Gen 0       0          0        4GB         0

	Gen 1       0        1GB        1MB         0

	Gen 2     1MB        4GB        1MB         0

	Gen 3     1MB        1MB        1MB         0

Now consider there is a GFP_KERNEL allocation request (eligible zone
index <= Normal), evict_folios() will return without doing any work
since there are no pages to scan in the eligible zones of the oldest
generation. Reclaim won't make progress until triggered from a ZONE_MOVABLE
allocation request; which may not happen soon if there is a lot of free
memory in the movable zone. This can lead to OOM kills, although there
is 1GB of pages in the Normal zone of Gen 1 that we have not yet tried to
reclaim.

This issue is not seen in the conventional active/inactive LRU since
there are no per-zone lists.

If there are no (not enough) folios to scan in the eligible zones, move
folios from ineligible zone (zone_index > reclaim_index) to the next
generation. This allows for the progression of min_seq and reclaiming
from the next generation (Gen 1).
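
The effect on the example table can be sketched in user space. The array holds the sizes from the table above in MB; the zone enum, has_eligible(), skip_ineligible() and gen_is_empty() are hypothetical helpers for illustration and are not the kernel implementation.

```c
#include <assert.h>
#include <stdbool.h>

enum { ZONE_DMA32, ZONE_NORMAL, ZONE_MOVABLE, ZONE_DEVICE, NR_ZONES };
#define NR_GENS 4

/* Sizes in MB for one type, copied from the table above (Gen 0 is oldest). */
static long nr_mb[NR_GENS][NR_ZONES] = {
	{ 0,    0, 4096, 0 },
	{ 0, 1024,    1, 0 },
	{ 1, 4096,    1, 0 },
	{ 1,    1,    1, 0 },
};

/* Does this generation hold anything in zones at or below reclaim_idx? */
static bool has_eligible(int gen, int reclaim_idx)
{
	for (int zone = reclaim_idx; zone >= 0; zone--)
		if (nr_mb[gen][zone] > 0)
			return true;
	return false;
}

/* Move ineligible-zone pages from gen to gen + 1 (gen must be < NR_GENS - 1). */
static void skip_ineligible(int gen, int reclaim_idx)
{
	for (int zone = reclaim_idx + 1; zone < NR_ZONES; zone++) {
		nr_mb[gen + 1][zone] += nr_mb[gen][zone];
		nr_mb[gen][zone] = 0;
	}
}

static bool gen_is_empty(int gen)
{
	for (int zone = 0; zone < NR_ZONES; zone++)
		if (nr_mb[gen][zone] > 0)
			return false;
	return true;
}
```

For a request limited to the Normal zone, Gen 0 has nothing eligible; after the 4GB of Movable pages are moved to Gen 1, Gen 0 is empty and the 1GB in the Normal zone of Gen 1 becomes reachable.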

Qualcomm, Mediatek and raspberrypi [1] discovered this issue independently.

[1] https://github.com/raspberrypi/linux/issues/5395

Link: https://lkml.kernel.org/r/20230802025606.346758-1-kaleshsingh@google.com
Fixes: ac35a49023 ("mm: multi-gen LRU: minimal implementation")
Change-Id: I5bbf44bd7ffe42f4347df4be59a75c1603c9b947
Signed-off-by: Kalesh Singh <kaleshsingh@google.com>
Reported-by: Charan Teja Kalla <quic_charante@quicinc.com>
Reported-by: Lecopzer Chen <lecopzer.chen@mediatek.com>
Tested-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> [mediatek]
Tested-by: Charan Teja Kalla <quic_charante@quicinc.com>
Cc: Yu Zhao <yuzhao@google.com>
Cc: Barry Song <baohua@kernel.org>
Cc: Brian Geffon <bgeffon@google.com>
Cc: Jan Alexander Steffens (heftig) <heftig@archlinux.org>
Cc: Matthias Brugger <matthias.bgg@gmail.com>
Cc: Oleksandr Natalenko <oleksandr@natalenko.name>
Cc: Qi Zheng <zhengqi.arch@bytedance.com>
Cc: Steven Barrett <steven@liquorix.net>
Cc: Suleiman Souhlal <suleiman@google.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Aneesh Kumar K V <aneesh.kumar@linux.ibm.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
(cherry picked from commit 1462260adc41c5974362cb54ff577c2a15b8c7b2 https://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm.git mm-unstable)
Bug: 288383787
Bug: 291719697
Signed-off-by: Kalesh Singh <kaleshsingh@google.com>
2023-08-03 20:45:58 +00:00
Jaewon Kim
03812b904e ANDROID: ABI: update symbol list for galaxy
INFO: 1 function symbol(s) added
  'int cleancache_register_ops(const struct cleancache_ops*)'

Bug: 294177078
Change-Id: Ic22ddae4e92896ed28bc876d98969c6c3e94cb9d
Signed-off-by: Jaewon Kim <jaewon31.kim@samsung.com>
2023-08-03 16:48:30 +00:00
huzhanyuan
b283f9b41f ANDROID: oplus: Update the ABI xml and symbol list
INFO: ABI DIFFERENCES HAVE BEEN DETECTED!
INFO: 4 function symbol(s) added
'int __traceiter_android_vh_check_folio_look_around_ref(void*, struct folio*,int*)'
'int __traceiter_android_vh_look_around(void*, struct page_vma_mapped_walk*,struct folio*, struct vm_area_struct*, int*)'
'int __traceiter_android_vh_look_around_migrate_folio(void*, struct folio*, struct folio*)'
'int __traceiter_android_vh_test_clear_look_around_ref(void*, struct page*)'

4 variable symbol(s) added
'struct tracepoint __tracepoint_android_vh_check_folio_look_around_ref'
'struct tracepoint __tracepoint_android_vh_look_around'
'struct tracepoint __tracepoint_android_vh_look_around_migrate_folio'
'struct tracepoint __tracepoint_android_vh_test_clear_look_around_ref'

Bug: 292051411
Change-Id: I25fff4eefc6773d3e1130bd0ff3f3cc21d6c0964
Signed-off-by: Zhanyuan Hu <huzhanyuan@oppo.com>
2023-08-02 21:57:15 +00:00
Peifeng Li
c3d26e2b5a ANDROID: vendor_hooks: Add hooks for lookaround
Add hooks to support lookaround in memory reclamation.

- android_vh_test_clear_look_around_ref
- android_vh_check_folio_look_around_ref
- android_vh_look_around_migrate_folio
- android_vh_look_around

Bug: 292051411

Signed-off-by: Peifeng Li <lipeifeng@oppo.com>
Change-Id: I9a606ae71d2f1303df3b02403b30bc8fdc9d06dd
(cherry picked from commit f50f24e781)
[huzhanyuan: changed page to folio where appropriate]
2023-08-02 21:57:15 +00:00
Giuliano Procida
29e2f3e3d1 ANDROID: ABI: Update STG ABI to format version 2
If you have trouble reading this new file format, please refresh your
prebuilt version of STG with repo sync.

Bug: 294213765
Change-Id: I4d7ee716231956c5f4da1343cc0db5170aaaa3b1
Signed-off-by: Giuliano Procida <gprocida@google.com>
2023-08-02 18:33:42 +00:00
Jindong Yue
3bd3d13701 ANDROID: ABI: Update symbol list for imx
2 function symbol(s) added
  'bool kthread_freezable_should_stop(bool*)'
  'int v4l2_enum_dv_timings_cap(struct v4l2_enum_dv_timings*, const struct v4l2_dv_timings_cap*, v4l2_check_dv_timings_fnc*, void*)'

Bug: 283014063
Change-Id: Ib4f8f9c67277501dcaa2fa5d8f2867d5fa670de3
Signed-off-by: Jindong Yue <jindong.yue@nxp.com>
2023-08-02 14:56:10 +00:00
sunshijie
ad0b008167 FROMGIT: erofs: fix wrong primary bvec selection on deduplicated extents
When handling deduplicated compressed data, there can be multiple
decompressed extents pointing to the same compressed data in one shot.

In such cases, the bvecs which belong to the longest extent will be
selected as the primary bvecs for real decompressors to decode and the
other duplicated bvecs will be directly copied from the primary bvecs.

Previously, only relative offsets of the longest extent were checked to
decompress the primary bvecs.  On rare occasions, it can be incorrect
if there are several extents with the same start relative offset.
As a result, some short bvecs could be selected for decompression and
then cause data corruption.

For example, as Shijie Sun reported off-list, considering the following
extents of a file:
117:   903345..  915250 |   11905 :     385024..    389120 |    4096
...
119:   919729..  930323 |   10594 :     385024..    389120 |    4096
...
124:   968881..  980786 |   11905 :     385024..    389120 |    4096

The start relative offset is the same: 2225, but extent 119 (919729..
930323) is shorter than the others.

Let's restrict the bvec length in addition to the start offset if bvecs
are not full.

Reported-by: Shijie Sun <sunshijie@xiaomi.com>
Fixes: 5c2a64252c ("erofs: introduce partial-referenced pclusters")
Tested-by: Shijie Sun <sunshijie@xiaomi.com>
Reviewed-by: Yue Hu <huyue2@coolpad.com>
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20230719065459.60083-1-hsiangkao@linux.alibaba.com
(cherry picked from commit 7d15c91a75aae55767f368e8abbabd7cedf4ec94
https://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs.git dev)
Bug: 293245292
Change-Id: Ic8ded9b2d3592ffd0863f4f0d2ac4ae6a1821a1b
Signed-off-by: sunshijie <sunshijie@xiaomi.corp-partner.google.com>
2023-08-01 21:50:12 +00:00
Ming Qian
126ef64cba UPSTREAM: media: Add ABGR64_12 video format
ABGR64_12 is a reversed RGB format with alpha channel last,
12 bits per component like ABGR32,
expanded to 16 bits.
Data in the 12 high bits, zeros in the 4 low bits,
arranged in little endian order.

Bug: 293213303
Change-Id: Idc4e1100c9e2134a48b594151e3398f6436b010d
(cherry picked from commit 302b988ca0)
Signed-off-by: Ming Qian <ming.qian@nxp.com>
Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab@kernel.org>
Signed-off-by: Jindong Yue <jindong.yue@nxp.com>
2023-08-01 21:45:37 +00:00
Ming Qian
86e2e8fd05 BACKPORT: media: Add BGR48_12 video format
BGR48_12 is a reversed RGB format with 12 bits per component like BGR24,
expanded to 16 bits.
Data in the 12 high bits, zeros in the 4 low bits,
arranged in little endian order.

Bug: 293213303
Change-Id: I27d14a33c8e2b4847a63ea05b285786766949ebf
(cherry picked from commit da0b7a400e)
[Jindong: Fixed conflicts in .rst file and v4l2-ioctl.c]
Signed-off-by: Ming Qian <ming.qian@nxp.com>
Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab@kernel.org>
Signed-off-by: Jindong Yue <jindong.yue@nxp.com>
2023-08-01 21:45:37 +00:00
Ming Qian
892293272c UPSTREAM: media: Add YUV48_12 video format
YUV48_12 is a YUV format with 12-bits per component like YUV24,
expanded to 16 bits.
Data in the 12 high bits, zeros in the 4 low bits,
arranged in little endian order.

[hverkuil: replaced a . by ,]

Bug: 293213303
Change-Id: I12e6f02b99918a429224320da2127d6b4d777584
(cherry picked from commit 99c9549677)
Signed-off-by: Ming Qian <ming.qian@nxp.com>
Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab@kernel.org>
Signed-off-by: Jindong Yue <jindong.yue@nxp.com>
2023-08-01 21:45:37 +00:00
Ming Qian
b2cf7e4268 UPSTREAM: media: Add Y212 v4l2 format info
Y212 is a YUV format with 12-bits per component like YUYV,
expanded to 16 bits.
Data in the 12 high bits, zeros in the 4 low bits,
arranged in little endian order.

Add the missing v4l2 format info for Y212.

Bug: 293213303
Change-Id: Ibdf9bb3a3f1eb895da9eca52d115e08b656b5153
(cherry picked from commit a178dd3bbe)
Signed-off-by: Ming Qian <ming.qian@nxp.com>
Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab@kernel.org>
Signed-off-by: Jindong Yue <jindong.yue@nxp.com>
2023-08-01 21:45:37 +00:00
Tomi Valkeinen
0f3f7a21af UPSTREAM: media: Add Y210, Y212 and Y216 formats
Add Y210, Y212 and Y216 formats.

Bug: 293213303
Change-Id: I2d580dd82481f6a1364dfcedfd918e82d25ac211
(cherry picked from commit 0dc1d7a79a)
Signed-off-by: Tomi Valkeinen <tomi.valkeinen+renesas@ideasonboard.com>
Reviewed-by: Laurent Pinchart <laurent.pinchart+renesas@ideasonboard.com>
Acked-by: Mauro Carvalho Chehab <mchehab@kernel.org>
Acked-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
Signed-off-by: Laurent Pinchart <laurent.pinchart+renesas@ideasonboard.com>
Signed-off-by: Jindong Yue <jindong.yue@nxp.com>
2023-08-01 21:45:37 +00:00
Ming Qian
ca7b45b128 UPSTREAM: media: Add Y012 video format
Y012 is a luma-only format with 12 bits per pixel,
expanded to 16 bits.
Data in the 12 high bits, zeros in the 4 low bits,
arranged in little endian order.
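
The sample layout described above (12 data bits in the high bits of a
16-bit word, 4 zero bits below, little-endian byte order) can be sketched
in plain C. This is an illustration only, not code from the patch; the
helper names put_sample12() and get_sample12() are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Store one 12-bit sample in a 16-bit container: the value occupies
 * bits 15..4, bits 3..0 are zero, and the low byte is written first
 * (little endian). Hypothetical helper, not a kernel API.
 */
void put_sample12(uint8_t out[2], uint16_t sample)
{
	assert(sample <= 0x0fff);	/* only 12 bits are meaningful */

	uint16_t word = (uint16_t)(sample << 4);

	out[0] = (uint8_t)(word & 0xff);	/* low byte first */
	out[1] = (uint8_t)(word >> 8);
}

/* Recover the 12-bit sample from its two-byte little-endian container. */
uint16_t get_sample12(const uint8_t in[2])
{
	uint16_t word = (uint16_t)(in[0] | (in[1] << 8));

	return (uint16_t)(word >> 4);
}
```

For example, the sample 0xabc is stored as the bytes 0xc0, 0xab.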

Bug: 293213303
Change-Id: I1a8f73162932e0760aabbe44525d7c74ace9f7bd
(cherry picked from commit a490ea6844)
Signed-off-by: Ming Qian <ming.qian@nxp.com>
Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab@kernel.org>
Signed-off-by: Jindong Yue <jindong.yue@nxp.com>
2023-08-01 21:45:37 +00:00
Ming Qian
343b85ecad UPSTREAM: media: Add P012 and P012M video format
P012 is a YUV format with 12-bits per component with interleaved UV,
like NV12, expanded to 16 bits.
Data in the 12 high bits, zeros in the 4 low bits,
arranged in little endian order.
P012M has two non-contiguous planes.
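
As a rough sketch of what the NV12-like layout implies for buffer sizes:
a full-resolution Y plane followed by an interleaved UV plane subsampled
2x2, with every sample stored in 2 bytes. The struct and function below
are hypothetical, and they assume tightly packed lines with no stride
padding and chroma dimensions rounded up for odd sizes; a real driver
reports its own bytesperline and sizeimage.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Plane sizes in bytes for a tightly packed P012-style frame. */
struct p012_layout {
	size_t y_bytes;		/* luma plane */
	size_t uv_bytes;	/* interleaved chroma plane */
};

/* Hypothetical helper: assumes no line padding (an assumption). */
struct p012_layout p012_tight_layout(uint32_t width, uint32_t height)
{
	struct p012_layout l;
	size_t cw, ch;

	assert(width > 0 && height > 0);

	cw = ((size_t)width + 1) / 2;	/* chroma columns, rounded up */
	ch = ((size_t)height + 1) / 2;	/* chroma rows, rounded up */

	l.y_bytes = (size_t)width * height * 2;	/* 2 bytes per Y sample */
	l.uv_bytes = cw * ch * 2 * 2;		/* U and V, 2 bytes each */
	return l;
}
```

With P012 both planes would sit in one buffer; with P012M the same two
sizes would describe two separately allocated planes.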

Bug: 293213303
Change-Id: I1fbfa7c445bc682766f479cca07eb8cb16cbb44f
(cherry picked from commit aa10804042)
Signed-off-by: Ming Qian <ming.qian@nxp.com>
Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab@kernel.org>
Signed-off-by: Jindong Yue <jindong.yue@nxp.com>
2023-08-01 21:45:37 +00:00
Ramji Jiyani
7beed73af0 ANDROID: GKI: Create symbol files in include/config
Create the input symbol files used to generate the GKI module headers
under include/config. Placing the files in this generated directory
lets the default filters that ignore certain files work without any
special handling, and keeps the files available after the build so
they can be inspected for debugging purposes.

abi_gki_protected_exports: Input for gki_module_protected_exports.h
From :- ${objtree}/abi_gki_protected_exports
To :- include/config/abi_gki_protected_exports

all_kmi_symbols: Input for gki_module_unprotected.h
- Rename to abi_gki_kmi_symbols
From :- all_kmi_symbols
To :- include/config/abi_gki_kmi_symbols

Bug: 286529877
Test: TH
Test: Manual verification of the generated files
Change-Id: Iafa10631e7712a8e1e87a2f56cfd614de6b1053a
Signed-off-by: Ramji Jiyani <ramjiyani@google.com>
2023-08-01 21:21:29 +00:00
Paul Lawrence
295e779e8f ANDROID: fuse-bpf: Use stored bpf for create_open
create_open would always take its parent directory's bpf for the created
object. Modify to use the bpf stored in fuse_dentry which is set by
lookup.

Bug: 291705489
Test: fuse_test passes, adb push file /sdcard/Android/data works
Signed-off-by: Paul Lawrence <paullawrence@google.com>
Change-Id: I0a1ea2a291a8fdf67923f1827176b2ea96bd4c2d
2023-07-31 23:09:25 +00:00
Paul Lawrence
74d9daa59a ANDROID: fuse-bpf: Add bpf to negative fuse_dentry
Store the results of a negative lookup in the fuse_dentry so later
opcodes can use them to create files

Bug: 291705489
Test: fuse_test passes
Signed-off-by: Paul Lawrence <paullawrence@google.com>
Change-Id: I725e714a1d6ce43f24431d07c24e96349ef1a55c
2023-07-31 23:09:25 +00:00
Paul Lawrence
6aef06abba ANDROID: fuse-bpf: Check inode not null
fuse_iget_backing returns an inode or NULL, not an ERR_PTR, so check
that it is not NULL.

Also make sure we put the inode if d_splice_alias fails
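
Why the distinction matters can be shown with a small userspace sketch.
The is_err() helper below imitates the kernel's IS_ERR() range check,
and lookup_stub() is a hypothetical stand-in for a function that
reports failure by returning NULL; neither is the fuse-bpf code itself.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_ERRNO 4095

/* Userspace imitation of IS_ERR(): true only for the top 4095 addresses. */
int is_err(const void *p)
{
	return (uintptr_t)p >= (uintptr_t)-MAX_ERRNO;
}

struct inode_stub { int ino; };

/* Hypothetical lookup that signals failure with NULL, never an error pointer. */
struct inode_stub *lookup_stub(int ok)
{
	static struct inode_stub inode = { 1 };

	return ok ? &inode : NULL;
}

/* An IS_ERR-style check never fires for a NULL return; a NULL check does. */
void check_lookup_result_handling(void)
{
	assert(!is_err(lookup_stub(0)));	/* failure goes unnoticed */
	assert(lookup_stub(0) == NULL);		/* failure is detected */
	assert(lookup_stub(1) != NULL);		/* success still passes */
}
```

Because NULL is not in the error-pointer range, code that only tests
for an error pointer would go on to dereference NULL.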

Bug: 293349757
Test: fuse_test runs
Signed-off-by: Paul Lawrence <paullawrence@google.com>

Change-Id: I1eadad32f80bab6730e461412b4b7ab4d6c56bf2
2023-07-31 23:09:25 +00:00
Paul Lawrence
4bbda90bd8 ANDROID: fuse-bpf: Fix flock test compile error
Bug: 293161755
Test: fuse_test compiles
Signed-off-by: Paul Lawrence <paullawrence@google.com>
Change-Id: I249672bab85966e20a26018f65f135fe15c6eff5
2023-07-31 23:09:25 +00:00
Daniel Rosenberg
84ac22a0d3 ANDROID: fuse-bpf: Add partial ioctl support
This adds passthrough only support for ioctls with fuse-bpf.
compat_ioctls will return -ENOTTY.

Bug: 279519292
Test: F2fsMiscTest#testAtomicWrite
Change-Id: Ia3052e465d87dc1d15ae13955fba8a7f93bc387b
Signed-off-by: Daniel Rosenberg <drosen@google.com>
2023-07-31 23:09:25 +00:00
xieliujie
e341d2312c ANDROID: ABI: Update oplus symbol list
3 function symbol(s) added
  'int __traceiter_android_rvh_rtmutex_force_update(void*, struct task_struct*, struct task_struct*, int*)'
  'int __traceiter_android_vh_rtmutex_waiter_prio(void*, struct task_struct*, int*)'
  'int __traceiter_android_vh_task_blocks_on_rtmutex(void*, struct rt_mutex_base*, struct rt_mutex_waiter*, struct task_struct*, struct ww_acquire_ctx*, unsigned int*)'

3 variable symbol(s) added
  'struct tracepoint __tracepoint_android_rvh_rtmutex_force_update'
  'struct tracepoint __tracepoint_android_vh_rtmutex_waiter_prio'
  'struct tracepoint __tracepoint_android_vh_task_blocks_on_rtmutex'

Bug: 290585456
Change-Id: I4af3d1c8df44822b7f5fd5d5682e65d7c6c4dcc3
Signed-off-by: xieliujie <xieliujie@oppo.com>
2023-07-31 22:47:04 +00:00
Jann Horn
f5c707dc65 UPSTREAM: mm/mempolicy: Take VMA lock before replacing policy
mbind() calls down into vma_replace_policy() without taking the per-VMA
locks, replaces the VMA's vma->vm_policy pointer, and frees the old
policy.  That's bad; a concurrent page fault might still be using the
old policy (in vma_alloc_folio()), resulting in use-after-free.

Normally this will manifest as a use-after-free read first, but it can
result in memory corruption, including because vma_alloc_folio() can
call mpol_cond_put() on the freed policy, which conditionally changes
the policy's refcount member.

This bug is specific to CONFIG_NUMA, but it does also affect non-NUMA
systems as long as the kernel was built with CONFIG_NUMA.

Signed-off-by: Jann Horn <jannh@google.com>
Reviewed-by: Suren Baghdasaryan <surenb@google.com>
Fixes: 5e31275cc9 ("mm: add per-VMA lock and helper functions to control it")
Cc: stable@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Bug: 293665307
(cherry picked from commit 6c21e066f9)
Change-Id: I2e3a4ee8bad97457ee3e127694f0609e7a240a2f
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
2023-07-29 07:25:37 +00:00
Jann Horn
890b1aabb1 BACKPORT: mm: lock_vma_under_rcu() must check vma->anon_vma under vma lock
lock_vma_under_rcu() tries to guarantee that __anon_vma_prepare() can't
be called in the VMA-locked page fault path by ensuring that
vma->anon_vma is set.

However, this check happens before the VMA is locked, which means a
concurrent move_vma() can call unlink_anon_vmas(), which
disassociates the VMA's anon_vma.

This means we can get UAF in the following scenario:

  THREAD 1                   THREAD 2
  ========                   ========
  <page fault>
    lock_vma_under_rcu()
      rcu_read_lock()
      mas_walk()
      check vma->anon_vma

                             mremap() syscall
                               move_vma()
                                vma_start_write()
                                 unlink_anon_vmas()
                             <syscall end>

    handle_mm_fault()
      __handle_mm_fault()
        handle_pte_fault()
          do_pte_missing()
            do_anonymous_page()
              anon_vma_prepare()
                __anon_vma_prepare()
                  find_mergeable_anon_vma()
                    mas_walk() [looks up VMA X]

                             munmap() syscall (deletes VMA X)

                    reusable_anon_vma() [called on freed VMA X]

This is a security bug if you can hit it, although an attacker would
have to win two races at once where the first race window is only a few
instructions wide.

This patch is based on some previous discussion with Linus Torvalds on
the security list.
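
The general shape of the fix, taking the lock first and only then
testing the field that a writer may clear, can be sketched in userspace
C. Everything here is a hypothetical stand-in: struct area plays the
role of the VMA, an atomic_flag plays the role of the per-VMA lock, and
the sketch assumes any writer clears 'anon' only while holding the same
lock.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

struct area {
	atomic_flag busy;	/* stand-in for the per-VMA lock */
	void *anon;		/* stand-in for vma->anon_vma */
};

/*
 * Returns true with the lock held only if 'anon' was seen set while
 * the lock was held. On any failure the lock is not held and the
 * caller is expected to fall back to a slower path.
 */
bool area_lock_if_prepared(struct area *a)
{
	assert(a != NULL);

	if (atomic_flag_test_and_set(&a->busy))
		return false;			/* contended */

	if (a->anon == NULL) {			/* checked under the lock */
		atomic_flag_clear(&a->busy);
		return false;
	}
	return true;
}

void area_unlock(struct area *a)
{
	atomic_flag_clear(&a->busy);
}
```

Testing 'anon' before acquiring the lock would leave a window in which
a writer could clear it between the check and the lock acquisition,
which is the ordering problem described above.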

Cc: stable@vger.kernel.org
Fixes: 5e31275cc9 ("mm: add per-VMA lock and helper functions to control it")
Signed-off-by: Jann Horn <jannh@google.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Bug: 293665307
(cherry picked from commit 657b514695)
[surenb: removed vma_is_tcp() call not present in 6.1]
Change-Id: I4bd91e1db337ff35eb7c1d436f4372944556dd7d
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
2023-07-29 06:57:25 +00:00
Lorenzo Pieralisi
d3b37a712a BACKPORT: FROMGIT: irqchip/gic-v3: Workaround for GIC-700 erratum 2941627
GIC-700 erratum 2941627 may cause the GIC-700 to miss SPI wake
requests when SPIs are deactivated while targeting a
sleeping CPU, i.e. a CPU for which the redistributor has:

GICR_WAKER.ProcessorSleep == 1

This runtime situation can happen if an SPI that has been
activated on a core is retargeted to a different core, it
becomes pending and the target core subsequently enters a
power state quiescing the respective redistributor.

When this situation is hit, the de-activation carried out
on the core that activated the SPI (through either ICC_EOIR1_EL1
or ICC_DIR_EL1 register writes) does not trigger a wake
request for the sleeping GIC redistributor even if the SPI
is pending.

Work around the erratum by de-activating the SPI using the
distributor GICD_ICACTIVER register if the runtime
conditions require it (i.e. the IRQ was retargeted between
activation and de-activation).

Bug: 292459437
Change-Id: Ide915b8c925a631a7fc9ccebca19d9175def162e
Signed-off-by: Lorenzo Pieralisi <lpieralisi@kernel.org>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20230704155034.148262-1-lpieralisi@kernel.org
(cherry picked from commit 6fe5c68ee6 https://git.kernel.org/pub/scm/linux/kernel/git/maz/arm-platforms.git irq/irqchip-fixes)
Signed-off-by: Carlos Galo <carlosgalo@google.com>
2023-07-27 19:40:08 +00:00
wangshuai12
a89e2cbbc0 ANDROID: GKI: update xiaomi symbol list
1 function symbol(s) added
  'int __blk_mq_debugfs_rq_show(struct seq_file*, struct request*)'

Bug: 290730657
Change-Id: Ib3711e9e875e3d6ccc809a87c607fae149159a58
Signed-off-by: wangshuai12 <wangshuai12@xiaomi.corp-partner.google.com>
2023-07-27 15:11:16 +00:00
Hugh Dickins
371f8d901a UPSTREAM: mm: lock newly mapped VMA with corrected ordering
Lockdep is certainly right to complain about

  (&vma->vm_lock->lock){++++}-{3:3}, at: vma_start_write+0x2d/0x3f
                 but task is already holding lock:
  (&mapping->i_mmap_rwsem){+.+.}-{3:3}, at: mmap_region+0x4dc/0x6db

Invert those to the usual ordering.

Fixes: 33313a747e ("mm: lock newly mapped VMA which can be modified after it becomes visible")
Cc: stable@vger.kernel.org
Signed-off-by: Hugh Dickins <hughd@google.com>
Tested-by: Suren Baghdasaryan <surenb@google.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
(cherry picked from commit 1c7873e336)
Change-Id: I85f9cfb6ee8f3d9fefda5518c5637a7dff64bac3
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
2023-07-27 12:19:09 +00:00