Commit Graph

988215 Commits

Author SHA1 Message Date
Minchan Kim
611d3745f3 ANDROID: mm: keep __get_user_pages_remote behavior
Originally, in the FOLL_LONGTERM case, __get_user_pages_remote
returned with __gup_longterm_locked's return value directly
but [1] broke the behavior so keep old behavior.

[1] d5d9a23576, ANDROID: mm: retry GUP with orignal gup_flags on failure
Bug: 231990030
Signed-off-by: Minchan Kim <minchan@google.com>
Change-Id: If91b01c666cfbeb11d535d282c1ee7eec5700125
2022-05-11 16:05:48 +00:00
Ray Chi
9afeef924c ANDROID: Update the ABI representation
Leaf changes summary: 2 artifacts changed
Changed leaf types summary: 0 leaf type changed
Removed/Changed/Added functions summary: 0 Removed, 0 Changed, 2 Added functions
Removed/Changed/Added variables summary: 0 Removed, 0 Changed, 0 Added variable

2 Added functions:

  [A] 'function int usb_gadget_activate(usb_gadget*)'
  [A] 'function int usb_gadget_deactivate(usb_gadget*)'

Bug: 217638743
Signed-off-by: Ray Chi <raychi@google.com>
Change-Id: I989e50b0ea6c01dc8c420bcb6e6943f6127e09b2
2022-05-11 13:01:31 +00:00
Lina Wang
ec9b4b8fff UPSTREAM: xfrm: fix tunnel model fragmentation behavior
[ Upstream commit 4ff2980b6b ]

in tunnel mode, if outer interface(ipv4) is less, it is easily to let
inner IPV6 mtu be less than 1280. If so, a Packet Too Big ICMPV6 message
is received. When send again, packets are fragmentized with 1280, they
are still rejected with ICMPV6(Packet Too Big) by xfrmi_xmit2().

According to RFC4213 Section3.2.2:
if (IPv4 path MTU - 20) is less than 1280
	if packet is larger than 1280 bytes
		Send ICMPv6 "packet too big" with MTU=1280
                Drop packet
        else
		Encapsulate but do not set the Don't Fragment
                flag in the IPv4 header.  The resulting IPv4
                packet might be fragmented by the IPv4 layer
                on the encapsulator or by some router along
                the IPv4 path.
	endif
else
	if packet is larger than (IPv4 path MTU - 20)
        	Send ICMPv6 "packet too big" with
                MTU = (IPv4 path MTU - 20).
                Drop packet.
        else
                Encapsulate and set the Don't Fragment flag
                in the IPv4 header.
        endif
endif
Packets should be fragmentized with ipv4 outer interface, so change it.

After it is fragemtized with ipv4, there will be double fragmenation.
No.48 & No.51 are ipv6 fragment packets, No.48 is double fragmentized,
then tunneled with IPv4(No.49& No.50), which obey spec. And received peer
cannot decrypt it rightly.

48              2002::10        2002::11 1296(length) IPv6 fragment (off=0 more=y ident=0xa20da5bc nxt=50)
49   0x0000 (0) 2002::10        2002::11 1304         IPv6 fragment (off=0 more=y ident=0x7448042c nxt=44)
50   0x0000 (0) 2002::10        2002::11 200          ESP (SPI=0x00035000)
51              2002::10        2002::11 180          Echo (ping) request
52   0x56dc     2002::10        2002::11 248          IPv6 fragment (off=1232 more=n ident=0xa20da5bc nxt=50)

xfrm6_noneed_fragment has fixed above issues. Finally, it acted like below:
1   0x6206 192.168.1.138   192.168.1.1 1316 Fragmented IP protocol (proto=Encap Security Payload 50, off=0, ID=6206) [Reassembled in #2]
2   0x6206 2002::10        2002::11    88   IPv6 fragment (off=0 more=y ident=0x1f440778 nxt=50)
3   0x0000 2002::10        2002::11    248  ICMPv6    Echo (ping) request

Signed-off-by: Lina Wang <lina.wang@mediatek.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Bug: 226699354
Change-Id: Ideec82bea6a1efa26352680cb3113f7c36b945ef
Signed-off-by: Lina Wang <lina.wang@mediatek.com>
2022-05-11 09:35:33 +00:00
Minchan Kim
42596c7b41 ANDROID: fix ABI breakage caused by per_cpu_pages
The patchset adds a new spin_lock field into per_cpu_pages which
breaks KMI so this patch introduces per_cpu_pages_ext and
per_cpu_pageset_ext and changes relavant functions and code
to use the _ext data structures instead of original one.

Bug: 230899966
Signed-off-by: Minchan Kim <minchan@google.com>
Change-Id: Ic5156f784223695c9716409036e2973df69ef99b
2022-05-10 14:02:02 -07:00
Minchan Kim
2eb3710ce5 ANDROID: fix ABI breakage caused by adding union type in struct page
The patchset includes two additional fields along with lru in struct page
but they were all union so it shouldn't break change the semantic.
However, ABI is broken so this patch reverts the patchset since it
doesn't change runtime behavior difference. Just lose code readability.

Bug: 230899966
Signed-off-by: Minchan Kim <minchan@google.com>
Change-Id: I4eb1a55a9ca52794e136870bfddbd04175f1134b
2022-05-10 14:02:02 -07:00
Nicolas Saenz Julienne
fc19a77b2a FROMLIST: BACKPORT: mm/page_alloc: Remotely drain per-cpu lists
Some setups, notably NOHZ_FULL CPUs, are too busy to handle the per-cpu
drain work queued by __drain_all_pages(). So introduce new a mechanism
to remotely drain the per-cpu lists. It is made possible by remotely
locking 'struct per_cpu_pages' new per-cpu spinlocks. A benefit of this
new scheme is that drain operations are now migration safe.

There was no observed performance degradation vs. the previous scheme.
Both netperf and hackbench were run in parallel to triggering the
__drain_all_pages(NULL, true) code path around ~100 times per second.
The new scheme performs a bit better (~5%), although the important point
here is there are no performance regressions vs. the previous mechanism.
Per-cpu lists draining happens only in slow paths.

Signed-off-by: Nicolas Saenz Julienne <nsaenzju@redhat.com>
Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
Link: https://lore.kernel.org/all/20220420095906.27349-7-mgorman@techsingularity.net/

Conflicts:
	mm/page_alloc.c

1. aosp doesn't need 9c25cbfcb3, skip it

Bug: 230899966
Signed-off-by: Minchan Kim <minchan@google.com>
Change-Id: I8c4120d215836b04c53d0e4950a821fce4c99075
2022-05-10 14:02:01 -07:00
Minchan Kim
b71c6184df FROMLIST: BACKPORT: mm/page_alloc: Protect PCP lists with a spinlock
Currently the PCP lists are protected by using local_lock_irqsave to
prevent migration and IRQ reentrancy but this is inconvenient. Remote
draining of the lists is impossible and a workqueue is required and
every task allocation/free must disable then enable interrupts which is
expensive.

As preparation for dealing with both of those problems, protect the
lists with a spinlock. The IRQ-unsafe version of the lock is used
because IRQs are already disabled by local_lock_irqsave. spin_trylock
is used in preparation for a time when local_lock could be used instead
of lock_lock_irqsave.

The per_cpu_pages still fits within the same number of cache lines after
this patch relative to before the series.

struct per_cpu_pages {
    spinlock_t                 lock;                 /*     0     4 */
    int                        count;                /*     4     4 */
    int                        high;                 /*     8     4 */
    int                        batch;                /*    12     4 */
    short int                  free_factor;          /*    16     2 */
    short int                  expire;               /*    18     2 */

    /* XXX 4 bytes hole, try to pack */

    struct list_head           lists[13];            /*    24   208 */

    /* size: 256, cachelines: 4, members: 7 */
    /* sum members: 228, holes: 1, sum holes: 4 */
    /* padding: 24 */
} __attribute__((__aligned__(64)));

Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
Link: https://lore.kernel.org/all/20220420095906.27349-6-mgorman@techsingularity.net/
Link: https://lore.kernel.org/all/20220429091321.GB3441@techsingularity.net/

Conflicts:
	mm/page_alloc.c

1. per_cpu_pages are updated from 44042b4498 at 5.13 so conflicted
Since we don't need to have high-order page pcp atm, skip the patch.

Bug: 230899966
Signed-off-by: Minchan Kim <minchan@google.com>
Change-Id: I03ff1c22301e7f8735947e71413376ea143e855a
2022-05-10 14:02:01 -07:00
Mel Gorman
c249c40b79 FROMLIST: BACKPORT: mm/page_alloc: Split out buddy removal code from rmqueue into separate helper
This is a preparation page to allow the buddy removal code to be reused
in a later patch.

No functional change.

Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
Link: https://lore.kernel.org/all/20220420095906.27349-4-mgorman@techsingularity.net/

Conflicts:
        mm/page_alloc.c

1. Skipped changes in __rmqueue_pcplist which are not present in 5.10 kernel.
2. [1] introduced page allocation path a lot change to support cma first
   allocation policy so needed to adapt the change.

[1] ANDROID: cma: redirect page allocation to CMA

Bug: 230899966
Signed-off-by: Minchan Kim <minchan@google.com>
Change-Id: I4584fbfdebf2637534d6a68635a44a81a176c253
2022-05-10 14:02:01 -07:00
Mel Gorman
a248d08a94 FROMLIST: BACKPORT: mm/page_alloc: Add page->buddy_list and page->pcp_list
The page allocator uses page->lru for storing pages on either buddy or
PCP lists. Create page->buddy_list and page->pcp_list as a union with
page->lru. This is simply to clarify what type of list a page is on
in the page allocator.

No functional change intended.

Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
Link: https://lore.kernel.org/all/20220420095906.27349-2-mgorman@techsingularity.net/

Bug: 230899966
Signed-off-by: Minchan Kim <minchan@google.com>
Change-Id: Ieef253fa28c2a411008da64b38716f6401a66961
2022-05-10 14:02:01 -07:00
Nicolas Saenz Julienne
e70a2e110b UPSTREAM: BACKPORT: mm/page_alloc: don't pass pfn to free_unref_page_commit()
free_unref_page_commit() doesn't make use of its pfn argument, so get
rid of it.

Link: https://lkml.kernel.org/r/20220202140451.415928-1-nsaenzju@redhat.com
Signed-off-by: Nicolas Saenz Julienne <nsaenzju@redhat.com>
Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
(cherry picked from commit 566513775d)

Bug: 230899966
Signed-off-by: Minchan Kim <minchan@google.com>
Change-Id: Ica1c200408ff1f91589cb2ed28c4fac0ce0c62f8
2022-05-10 14:02:01 -07:00
Mel Gorman
5707719280 UPSTREAM: BACKPORT: mm/page_alloc: avoid conflating IRQs disabled with zone->lock
Historically when freeing pages, free_one_page() assumed that callers had
IRQs disabled and the zone->lock could be acquired with spin_lock().  This
confuses the scope of what local_lock_irq is protecting and what
zone->lock is protecting in free_unref_page_list in particular.

This patch uses spin_lock_irqsave() for the zone->lock in free_one_page()
instead of relying on callers to have disabled IRQs.
free_unref_page_commit() is changed to only deal with PCP pages protected
by the local lock.  free_unref_page_list() then first frees isolated pages
to the buddy lists with free_one_page() and frees the rest of the pages to
the PCP via free_unref_page_commit().  The end result is that
free_one_page() is no longer depending on side-effects of local_lock to be
correct.

Note that this may incur a performance penalty while memory hot-remove is
running but that is not a common operation.

[lkp@intel.com: Ensure CMA pages get addded to correct pcp list]

Link: https://lkml.kernel.org/r/20210512095458.30632-9-mgorman@techsingularity.net
Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Chuck Lever <chuck.lever@oracle.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jesper Dangaard Brouer <brouer@redhat.com>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
(cherry picked from commit df1acc8569)

Conflicts:
    mm/page_alloc.c

1. AOSP has custom change [1] so MIGRATE_PCPTYPES conflicted.
Use MIGRATE_CMA instead of MIGRATE_PCPTYPES.

[1] ANDROID: mm: freeing MIGRATE_ISOLATE page instantly

Bug: 230899966
Signed-off-by: Minchan Kim <minchan@google.com>
Change-Id: I23598cdcd7417e06a3389216ca2230aa446a2647
2022-05-10 14:02:01 -07:00
Alexandru Elisei
49f6aaf99d UPSTREAM: Revert "usb: dwc3: core: Add shutdown callback for dwc3"
This reverts commit 568262bf54.
The commit causes the following panic when shutting down a rockpro64-v2
board:

[   41.684569] xhci-hcd xhci-hcd.2.auto: USB bus 1 deregistered
[   41.686301] Unable to handle kernel NULL pointer dereference at virtual address 00000000000000a0
[   41.687096] Mem abort info:
[   41.687345]   ESR = 0x96000004
[   41.687615]   EC = 0x25: DABT (current EL), IL = 32 bits
[   41.688082]   SET = 0, FnV = 0
[   41.688352]   EA = 0, S1PTW = 0
[   41.688628] Data abort info:
[   41.688882]   ISV = 0, ISS = 0x00000004
[   41.689219]   CM = 0, WnR = 0
[   41.689481] user pgtable: 4k pages, 48-bit VAs, pgdp=00000000073b2000
[   41.690046] [00000000000000a0] pgd=0000000000000000, p4d=0000000000000000
[   41.690654] Internal error: Oops: 96000004 [#1] PREEMPT SMP
[   41.691143] Modules linked in:
[   41.691416] CPU: 5 PID: 1 Comm: shutdown Not tainted 5.13.0-rc4 #43
[   41.691966] Hardware name: Pine64 RockPro64 v2.0 (DT)
[   41.692409] pstate: 60000005 (nZCv daif -PAN -UAO -TCO BTYPE=--)
[   41.692937] pc : down_read_interruptible+0xec/0x200
[   41.693373] lr : simple_recursive_removal+0x48/0x280
[   41.693815] sp : ffff800011fab910
[   41.694107] x29: ffff800011fab910 x28: ffff0000008fe480 x27: ffff0000008fe4d8
[   41.694736] x26: ffff800011529a90 x25: 00000000000000a0 x24: ffff800011edd030
[   41.695364] x23: 0000000000000080 x22: 0000000000000000 x21: ffff800011f23994
[   41.695992] x20: ffff800011f23998 x19: ffff0000008fe480 x18: ffffffffffffffff
[   41.696620] x17: 000c0400bb44ffff x16: 0000000000000009 x15: ffff800091faba3d
[   41.697248] x14: 0000000000000004 x13: 0000000000000000 x12: 0000000000000020
[   41.697875] x11: 0101010101010101 x10: 7f7f7f7f7f7f7f7f x9 : 6f6c746364716e62
[   41.698502] x8 : 7f7f7f7f7f7f7f7f x7 : fefefeff6364626d x6 : 0000000000000440
[   41.699130] x5 : 0000000000000000 x4 : 0000000000000000 x3 : 00000000000000a0
[   41.699758] x2 : 0000000000000001 x1 : 0000000000000000 x0 : 00000000000000a0
[   41.700386] Call trace:
[   41.700602]  down_read_interruptible+0xec/0x200
[   41.701003]  debugfs_remove+0x5c/0x80
[   41.701328]  dwc3_debugfs_exit+0x1c/0x6c
[   41.701676]  dwc3_remove+0x34/0x1a0
[   41.701988]  platform_remove+0x28/0x60
[   41.702322]  __device_release_driver+0x188/0x22c
[   41.702730]  device_release_driver+0x2c/0x44
[   41.703106]  bus_remove_device+0x124/0x130
[   41.703468]  device_del+0x16c/0x424
[   41.703777]  platform_device_del.part.0+0x1c/0x90
[   41.704193]  platform_device_unregister+0x28/0x44
[   41.704608]  of_platform_device_destroy+0xe8/0x100
[   41.705031]  device_for_each_child_reverse+0x64/0xb4
[   41.705470]  of_platform_depopulate+0x40/0x84
[   41.705853]  __dwc3_of_simple_teardown+0x20/0xd4
[   41.706260]  dwc3_of_simple_shutdown+0x14/0x20
[   41.706652]  platform_shutdown+0x28/0x40
[   41.706998]  device_shutdown+0x158/0x330
[   41.707344]  kernel_power_off+0x38/0x7c
[   41.707684]  __do_sys_reboot+0x16c/0x2a0
[   41.708029]  __arm64_sys_reboot+0x28/0x34
[   41.708383]  invoke_syscall+0x48/0x114
[   41.708716]  el0_svc_common.constprop.0+0x44/0xdc
[   41.709131]  do_el0_svc+0x28/0x90
[   41.709426]  el0_svc+0x2c/0x54
[   41.709698]  el0_sync_handler+0xa4/0x130
[   41.710045]  el0_sync+0x198/0x1c0
[   41.710342] Code: c8047c62 35ffff84 17fffe5f f9800071 (c85ffc60)
[   41.710881] ---[ end trace 406377df5178f75c ]---
[   41.711299] Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b
[   41.712084] Kernel Offset: disabled
[   41.712391] CPU features: 0x10001031,20000846
[   41.712775] Memory Limit: none
[   41.713049] ---[ end Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b ]---

As Felipe explained: "dwc3_shutdown() is just called dwc3_remove()
directly, then we end up calling debugfs_remove_recursive() twice."

Reverting the commit fixes the panic.

Fixes: 568262bf54 ("usb: dwc3: core: Add shutdown callback for dwc3")
Acked-by: Felipe Balbi <balbi@kernel.org>
Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com>
Link: https://lore.kernel.org/r/20210603151742.298243-1-alexandru.elisei@arm.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

(cherry picked from commit 8f11fe7e40)

Bug: 229811473
Signed-off-by: Ray Chi <raychi@google.com>
Change-Id: Iec76be88df9049b590b6b1592c625316d264beb4
2022-05-10 17:22:28 +00:00
Lee Jones
721fb79e0e BACKPORT: staging: ion: Prevent incorrect reference counting behavour
Supply additional check in order to prevent unexpected results.

Bug: 205573273
Fixes: b892bf75b2 ("ion: Switch ion to use dma-buf")
Suggested-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Lee Jones <lee.jones@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[Lee: Patch now applies to ion_buffer.c instead of ion.c]
Change-Id: Ia6afdd9ca502caa9cad6619d438fc6c8e8457679
(cherry picked from commit fea24b07ed)
2022-05-10 10:32:58 +00:00
Lina Wang
0f6bc2b736 FROMGIT: net: fix wrong network header length
When clatd starts with ebpf offloaing, and NETIF_F_GRO_FRAGLIST is enable,
several skbs are gathered in skb_shinfo(skb)->frag_list. The first skb's
ipv6 header will be changed to ipv4 after bpf_skb_proto_6_to_4,
network_header\transport_header\mac_header have been updated as ipv4 acts,
but other skbs in frag_list didnot update anything, just ipv6 packets.

udp_queue_rcv_skb will call skb_segment_list to traverse other skbs in
frag_list and make sure right udp payload is delivered to user space.
Unfortunately, other skbs in frag_list who are still ipv6 packets are
updated like the first skb and will have wrong transport header length.

e.g.before bpf_skb_proto_6_to_4,the first skb and other skbs in frag_list
has the same network_header(24)& transport_header(64), after
bpf_skb_proto_6_to_4, ipv6 protocol has been changed to ipv4, the first
skb's network_header is 44,transport_header is 64, other skbs in frag_list
didnot change.After skb_segment_list, the other skbs in frag_list has
different network_header(24) and transport_header(44), so there will be 20
bytes different from original,that is difference between ipv6 header and
ipv4 header. Just change transport_header to be the same with original.

Actually, there are two solutions to fix it, one is traversing all skbs
and changing every skb header in bpf_skb_proto_6_to_4, the other is
modifying frag_list skb's header in skb_segment_list. Considering
efficiency, adopt the second one--- when the first skb and other skbs in
frag_list has different network_header length, restore them to make sure
right udp payload is delivered to user space.

Signed-off-by: Lina Wang <lina.wang@mediatek.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit cf3ab8d4a7 https://git.kernel.org/pub/scm/linux/kernel/git/netdev/net.git master)
Bug: 218157620
Test: TreeHugger
Signed-off-by: Maciej Żenczykowski <maze@google.com>
Change-Id: I36f2f329ec1a56bb0742141a7fa482cafa183ad3
2022-05-09 15:41:46 +00:00
Minchan Kim
f6f08b9b18 UPSTREAM: mm: fix unexpected zeroed page mapping with zram swap
commit e914d8f003 upstream.

Two processes under CLONE_VM cloning, user process can be corrupted by
seeing zeroed page unexpectedly.

      CPU A                        CPU B

  do_swap_page                do_swap_page
  SWP_SYNCHRONOUS_IO path     SWP_SYNCHRONOUS_IO path
  swap_readpage valid data
    swap_slot_free_notify
      delete zram entry
                              swap_readpage zeroed(invalid) data
                              pte_lock
                              map the *zero data* to userspace
                              pte_unlock
  pte_lock
  if (!pte_same)
    goto out_nomap;
  pte_unlock
  return and next refault will
  read zeroed data

The swap_slot_free_notify is bogus for CLONE_VM case since it doesn't
increase the refcount of swap slot at copy_mm so it couldn't catch up
whether it's safe or not to discard data from backing device.  In the
case, only the lock it could rely on to synchronize swap slot freeing is
page table lock.  Thus, this patch gets rid of the swap_slot_free_notify
function.  With this patch, CPU A will see correct data.

      CPU A                        CPU B

  do_swap_page                do_swap_page
  SWP_SYNCHRONOUS_IO path     SWP_SYNCHRONOUS_IO path
                              swap_readpage original data
                              pte_lock
                              map the original data
                              swap_free
                                swap_range_free
                                  bd_disk->fops->swap_slot_free_notify
  swap_readpage read zeroed data
                              pte_unlock
  pte_lock
  if (!pte_same)
    goto out_nomap;
  pte_unlock
  return
  on next refault will see mapped data by CPU B

The concern of the patch would increase memory consumption since it
could keep wasted memory with compressed form in zram as well as
uncompressed form in address space.  However, most of cases of zram uses
no readahead and do_swap_page is followed by swap_free so it will free
the compressed form from in zram quickly.

Link: https://lkml.kernel.org/r/YjTVVxIAsnKAXjTd@google.com
Fixes: 0bcac06f27 ("mm, swap: skip swapcache for swapin of synchronous device")
Reported-by: Ivan Babrou <ivan@cloudflare.com>
Tested-by: Ivan Babrou <ivan@cloudflare.com>
Signed-off-by: Minchan Kim <minchan@kernel.org>
Cc: Nitin Gupta <ngupta@vflare.org>
Cc: Sergey Senozhatsky <senozhatsky@chromium.org>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: David Hildenbrand <david@redhat.com>
Cc: <stable@vger.kernel.org>	[4.14+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Minchan Kim <minchan@google.com>
(cherry picked from commit 20ed94f818)
 git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git
 linux-5.10.y)
Bug: 214353194
Change-Id: Iba642f36737c31fb5e9d8fc8a4e2a062bd28c37a
2022-05-06 20:55:10 +00:00
Fuad Tabba
c607c61848 ANDROID: KVM: arm64: Fix for do not allow memslot changes after first VM run under pKVM
Move the check for protected VMs up to ensure that we don't miss
a KVM_MR_DELETE.

Bug: 231684412
Change-Id: Ia5cecc13232e8c430f2a1747a3cebd7e7bd5e348
Signed-off-by: Fuad Tabba <tabba@google.com>
2022-05-06 11:52:48 +00:00
Fuad Tabba
b9b94e2aca ANDROID: KVM: arm64: pkvm: Ensure that TLBs and I-cache are private to each vcpu
If a different vcpu from the same vm is loaded on the same
physical CPU, we must flush the CPU context.

This patch ensures that by tracking the vcpu that was last loaded
on this CPU, and flushes if that changes. This could lead to
over-invalidation, which could affect performance but not
correctness.

Bug: 228810735
Signed-off-by: Fuad Tabba <tabba@google.com>
Change-Id: Ic623d98bfe3156591640f6b7908fb836456c2bd0
2022-05-06 08:32:29 +00:00
Sebastian Ene
392241199b ANDROID: Update the ABI representation
Leaf changes summary: 1 artifact changed
Changed leaf types summary: 0 leaf type changed
Removed/Changed/Added functions summary: 0 Removed, 0 Changed, 1 Added function
Removed/Changed/Added variables summary: 0 Removed, 0 Changed, 0 Added variable

1 Added function:

  [A] 'function void nvhe_hyp_panic_handler(u64, u64, u64, u64, u64, uintptr_t, u64, u64)'

Bug: 210011561
Change-Id: I8a129e4cb252101c17e862ad7546ccc6b07d9886
Signed-off-by: Sebastian Ene <sebastianene@google.com>
2022-05-05 08:05:44 +00:00
Sebastian Ene
cebb2c99be ANDROID: Update the ABI symbol list
Add a new symbol to KMI that will be used to simulate the hypervisor
panic. This symbol nvhe_hyp_panic_handler will be used by the
pixel_debug module to collect debug trace information.

Bug: 210011561
Signed-off-by: Sebastian Ene <sebastianene@google.com>
Change-Id: I8cb8035016f7215c71a7ba933010252eb52fed76
2022-05-05 08:05:44 +00:00
Sebastian Ene
10b114cc3c ANDROID: KVM: arm64: Export nvhe_hyp_panic_handler
Make nvhe_hyp_panic_handler available to the kernel modules.
This is required to simulate the hypervisor panic for debug trace
verification.

Bug: 210011561
Change-Id: I2618cc75db747df8183d76199b69d6e90a62b1d1
Signed-off-by: Sebastian Ene <sebastianene@google.com>
2022-05-05 08:05:44 +00:00
Prakruthi Deepak Heragu
67bef07aab FROMLIST: arm64: paravirt: Use RCU read locks to guard stolen_time
During hotplug, the stolen time data structure is unmapped and memset.
There is a possibility of the timer IRQ being triggered before memset
and stolen time is getting updated as part of this timer IRQ handler. This
causes the below crash in timer handler -

  [ 3457.473139][    C5] Unable to handle kernel paging request at virtual address ffffffc03df05148
  ...
  [ 3458.154398][    C5] Call trace:
  [ 3458.157648][    C5]  para_steal_clock+0x30/0x50
  [ 3458.162319][    C5]  irqtime_account_process_tick+0x30/0x194
  [ 3458.168148][    C5]  account_process_tick+0x3c/0x280
  [ 3458.173274][    C5]  update_process_times+0x5c/0xf4
  [ 3458.178311][    C5]  tick_sched_timer+0x180/0x384
  [ 3458.183164][    C5]  __run_hrtimer+0x160/0x57c
  [ 3458.187744][    C5]  hrtimer_interrupt+0x258/0x684
  [ 3458.192698][    C5]  arch_timer_handler_virt+0x5c/0xa0
  [ 3458.198002][    C5]  handle_percpu_devid_irq+0xdc/0x414
  [ 3458.203385][    C5]  handle_domain_irq+0xa8/0x168
  [ 3458.208241][    C5]  gic_handle_irq.34493+0x54/0x244
  [ 3458.213359][    C5]  call_on_irq_stack+0x40/0x70
  [ 3458.218125][    C5]  do_interrupt_handler+0x60/0x9c
  [ 3458.223156][    C5]  el1_interrupt+0x34/0x64
  [ 3458.227560][    C5]  el1h_64_irq_handler+0x1c/0x2c
  [ 3458.232503][    C5]  el1h_64_irq+0x7c/0x80
  [ 3458.236736][    C5]  free_vmap_area_noflush+0x108/0x39c
  [ 3458.242126][    C5]  remove_vm_area+0xbc/0x118
  [ 3458.246714][    C5]  vm_remove_mappings+0x48/0x2a4
  [ 3458.251656][    C5]  __vunmap+0x154/0x278
  [ 3458.255796][    C5]  stolen_time_cpu_down_prepare+0xc0/0xd8
  [ 3458.261542][    C5]  cpuhp_invoke_callback+0x248/0xc34
  [ 3458.266842][    C5]  cpuhp_thread_fun+0x1c4/0x248
  [ 3458.271696][    C5]  smpboot_thread_fn+0x1b0/0x400
  [ 3458.276638][    C5]  kthread+0x17c/0x1e0
  [ 3458.280691][    C5]  ret_from_fork+0x10/0x20

As a fix, introduce rcu lock to update stolen time structure.

Fixes: 75df529bec ("arm64: paravirt: Initialize steal time when cpu is online")
Cc: stable@vger.kernel.org
Acked-by: Will Deacon <will@kernel.org>
Reviewed-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Prakruthi Deepak Heragu <quic_pheragu@quicinc.com>
Signed-off-by: Elliot Berman <quic_eberman@quicinc.com>

Bug: 231271475
Link: https://lore.kernel.org/all/20220428183536.2866667-1-quic_eberman@quicinc.com/
Change-Id: Iac533d6a361a3ddacff315f8d9916af72b301383
Signed-off-by: Elliot Berman <quic_eberman@quicinc.com>
2022-05-04 17:31:05 +00:00
Yi Kong
4dce9d7a65 ANDROID: clang: update to 14.0.7
Bug: 219872481
Change-Id: I5f6cabe293431b5351dd45db632be6bbd82b34b1
Signed-off-by: Yi Kong <yikong@google.com>
2022-05-03 09:14:34 +00:00
Will Deacon
43e6093d9d FROMGIT: KVM: arm64: Handle host stage-2 faults from 32-bit EL0
When pKVM is enabled, host memory accesses are translated by an identity
mapping at stage-2, which is populated lazily in response to synchronous
exceptions from 64-bit EL1 and EL0.

Extend this handling to cover exceptions originating from 32-bit EL0 as
well. Although these are very unlikely to occur in practice, as the
kernel typically ensures that user pages are initialised before mapping
them in, drivers could still map previously untouched device pages into
userspace and expect things to work rather than panic the system.

Cc: Quentin Perret <qperret@google.com>
Cc: Marc Zyngier <maz@kernel.org>
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20220427171332.13635-1-will@kernel.org
(cherry picked from commit 2a50fc5fd0
 git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm.git fixes)
Bug: 216811181
Signed-off-by: Will Deacon <willdeacon@google.com>
Change-Id: I98ad9d9f0e2a78751ed73cc5d7c481d07a3ed1db
2022-05-03 08:30:58 +00:00
Todd Kjos
4eb197cb06 ANDROID: fix kernelci build issue for configfs module
This fixes the kernelci error:

"ERROR: modpost: module configfs uses symbol kern_path from namespace
VFS_internal_I_am_really_a_filesystem_and_am_NOT_a_driver, but does not import it."

Bug: 230913804
Fixes: 0a77fca3aa ("ANDROID: GKI: set vfs-only exports into their own namespace")
Signed-off-by: Todd Kjos <tkjos@google.com>
Change-Id: Ib4ab1b83c8c8c996b1f15c419fb8ce0549832699
2022-05-03 00:31:50 +00:00
Maciej Żenczykowski
3ed683cb94 ANDROID: gki - set CONFIG_USB_NET_AX88179_178A=y (usb gbit ethernet dongle)
Per Kconfig:
  config USB_NET_AX88179_178A
    tristate "ASIX AX88179/178A USB 3.0/2.0 to Gigabit Ethernet"
  depends on USB_USBNET
  select CRC32
  select PHYLIB
  default y
  help
    This option adds support for ASIX AX88179 based USB 3.0/2.0
    to Gigabit Ethernet adapters.
    This driver should work with at least the following devices:
      * ASIX AX88179
      * ASIX AX88178A
      * Sitcomm LN-032
    This driver creates an interface named "ethX", where X depends on
    what other networking devices you have in use.

This was already enabled on 'db845c_gki.fragment',
which suggests this hardware is reasonably common
(even though I don't have a dongle that requires it).

Test: TreeHugger
Bug: 200269356
Signed-off-by: Maciej Żenczykowski <maze@google.com>
Change-Id: I9915cfb54a324f007d508a8e3d2aad1d6fc9e5de
2022-05-02 14:13:04 -07:00
Lecopzer Chen
277827dd5b ANDROID: fix KCFLAGS override by __ANDROID_COMMON_KERNEL__
Our test build is broken by KCFLAGS overrided in build.config.comm.

Since Linux Makefile supports 'export KCFLAGS=XXX' to customize the
KCFLAGS, and we should keep this functionality.

Bug: 230818006
Fixes: 4053a1e898 ("ANDROID: Add flag to indicate compiling against ACK")
Signed-off-by: Lecopzer Chen <lecopzer.chen@mediatek.com>
Change-Id: I9425d79697bc1fe816ce82d523f91631dee6b8f4
2022-04-29 13:28:14 -07:00
Elliot Berman
4053a1e898 ANDROID: Add flag to indicate compiling against ACK
Add a flag: __ANDROID_COMMON_KERNEL__ which out-of-tree vendor drivers
can use to check if they are compiling against an Android Common Kernel.
These out-of-tree vendor drivers can use this flag +
LINUX_KERNEL_VERSION to determine if a feature has been backported.

Bug: 229953929
Change-Id: I832344d63f3639479784753edfb7ac405068312f
Signed-off-by: Elliot Berman <quic_eberman@quicinc.com>
2022-04-29 13:25:12 -07:00
Charan Teja Kalla
e78c5b621d UPSTREAM: mm: madvise: return correct bytes advised with process_madvise
Patch series "mm: madvise: return correct bytes processed with
process_madvise", v2.  With the process_madvise(), always choose to return
non zero processed bytes over an error.  This can help the user to know on
which VMA, passed in the 'struct iovec' vector list, is failed to advise
thus can take the decission of retrying/skipping on that VMA.

This patch (of 2):

The process_madvise() system call returns error even after processing some
VMA's passed in the 'struct iovec' vector list which leaves the user
confused to know where to restart the advise next.  It is also against
this syscall man page[1] documentation where it mentions that "return
value may be less than the total number of requested bytes, if an error
occurred after some iovec elements were already processed.".

Consider a user passed 10 VMA's in the 'struct iovec' vector list of which
9 are processed but one.  Then it just returns the error caused on that
failed VMA despite the first 9 VMA's processed, leaving the user confused
about on which VMA it is failed.  Returning the number of bytes processed
here can help the user to know which VMA it is failed on and thus can
retry/skip the advise on that VMA.

[1]https://man7.org/linux/man-pages/man2/process_madvise.2.html.

Link: https://lkml.kernel.org/r/cover.1647008754.git.quic_charante@quicinc.com
Link: https://lkml.kernel.org/r/125b61a0edcee5c2db8658aed9d06a43a19ccafc.1647008754.git.quic_charante@quicinc.com
Fixes: ecb8ac8b1f14("mm/madvise: introduce process_madvise() syscall: an external memory hinting API")
Signed-off-by: Charan Teja Kalla <quic_charante@quicinc.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: David Rientjes <rientjes@google.com>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Nadav Amit <nadav.amit@gmail.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

(cherry picked from commit 5bd009c7c9)

Bug: 205658049
Test: Manual, build and flash

Signed-off-by: Edgar Arriaga García <edgararriaga@google.com>
Change-Id: I03f14997e542943fed75f0988b2c37aeaed649f7
2022-04-28 19:32:47 +00:00
Marco Elver
5f9fb34d8b UPSTREAM: kfence, x86: fix preemptible warning on KPTI-enabled systems
On systems with KPTI enabled, we can currently observe the following
warning:

  BUG: using smp_processor_id() in preemptible
  caller is invalidate_user_asid+0x13/0x50
  CPU: 6 PID: 1075 Comm: dmesg Not tainted 5.12.0-rc4-gda4a2b1a5479-kfence_1+ #1
  Hardware name: Hewlett-Packard HP Pro 3500 Series/2ABF, BIOS 8.11 10/24/2012
  Call Trace:
   dump_stack+0x7f/0xad
   check_preemption_disabled+0xc8/0xd0
   invalidate_user_asid+0x13/0x50
   flush_tlb_one_kernel+0x5/0x20
   kfence_protect+0x56/0x80
   ...

While it normally makes sense to require preemption to be off, so that
the expected CPU's TLB is flushed and not another, in our case it really
is best-effort (see comments in kfence_protect_page()).

Avoid the warning by disabling preemption around flush_tlb_one_kernel().

Link: https://lore.kernel.org/lkml/YGIDBAboELGgMgXy@elver.google.com/
Link: https://lkml.kernel.org/r/20210330065737.652669-1-elver@google.com
Signed-off-by: Marco Elver <elver@google.com>
Reported-by: Tomi Sarvela <tomi.p.sarvela@intel.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Andrey Konovalov <andreyknvl@google.com>
Cc: Jann Horn <jannh@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

(cherry picked from commit 6a77d38efc)

Bug: 229863099
Signed-off-by: Colin Downs-Razouk <colindr@google.com>
Change-Id: Ia917b052ffbb267254f281f55141c34ad193c78e
2022-04-28 19:27:56 +00:00
Eric Dumazet
a0046956bf BACKPORT: net/packet: fix slab-out-of-bounds access in packet_recvmsg()
[ Upstream commit c700525fcc ]

syzbot found that when an AF_PACKET socket is using PACKET_COPY_THRESH
and mmap operations, tpacket_rcv() is queueing skbs with
garbage in skb->cb[], triggering a too big copy [1]

Presumably, users of af_packet using mmap() already gets correct
metadata from the mapped buffer, we can simply make sure
to clear 12 bytes that might be copied to user space later.

BUG: KASAN: stack-out-of-bounds in memcpy include/linux/fortify-string.h:225 [inline]
BUG: KASAN: stack-out-of-bounds in packet_recvmsg+0x56c/0x1150 net/packet/af_packet.c:3489
Write of size 165 at addr ffffc9000385fb78 by task syz-executor233/3631

CPU: 0 PID: 3631 Comm: syz-executor233 Not tainted 5.17.0-rc7-syzkaller-02396-g0b3660695e80 #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
 <TASK>
 __dump_stack lib/dump_stack.c:88 [inline]
 dump_stack_lvl+0xcd/0x134 lib/dump_stack.c:106
 print_address_description.constprop.0.cold+0xf/0x336 mm/kasan/report.c:255
 __kasan_report mm/kasan/report.c:442 [inline]
 kasan_report.cold+0x83/0xdf mm/kasan/report.c:459
 check_region_inline mm/kasan/generic.c:183 [inline]
 kasan_check_range+0x13d/0x180 mm/kasan/generic.c:189
 memcpy+0x39/0x60 mm/kasan/shadow.c:66
 memcpy include/linux/fortify-string.h:225 [inline]
 packet_recvmsg+0x56c/0x1150 net/packet/af_packet.c:3489
 sock_recvmsg_nosec net/socket.c:948 [inline]
 sock_recvmsg net/socket.c:966 [inline]
 sock_recvmsg net/socket.c:962 [inline]
 ____sys_recvmsg+0x2c4/0x600 net/socket.c:2632
 ___sys_recvmsg+0x127/0x200 net/socket.c:2674
 __sys_recvmsg+0xe2/0x1a0 net/socket.c:2704
 do_syscall_x64 arch/x86/entry/common.c:50 [inline]
 do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80
 entry_SYSCALL_64_after_hwframe+0x44/0xae
RIP: 0033:0x7fdfd5954c29
Code: 28 00 00 00 75 05 48 83 c4 28 c3 e8 41 15 00 00 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 c0 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007ffcf8e71e48 EFLAGS: 00000246 ORIG_RAX: 000000000000002f
RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 00007fdfd5954c29
RDX: 0000000000000000 RSI: 0000000020000500 RDI: 0000000000000005
RBP: 0000000000000000 R08: 000000000000000d R09: 000000000000000d
R10: 0000000000000000 R11: 0000000000000246 R12: 00007ffcf8e71e60
R13: 00000000000f4240 R14: 000000000000c1ff R15: 00007ffcf8e71e54
 </TASK>

addr ffffc9000385fb78 is located in stack of task syz-executor233/3631 at offset 32 in frame:
 ____sys_recvmsg+0x0/0x600 include/linux/uio.h:246

this frame has 1 object:
 [32, 160) 'addr'

Memory state around the buggy address:
 ffffc9000385fa80: 00 04 f3 f3 f3 f3 f3 00 00 00 00 00 00 00 00 00
 ffffc9000385fb00: 00 00 00 00 00 00 00 00 00 00 00 f1 f1 f1 f1 00
>ffffc9000385fb80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 f3
                                                                ^
 ffffc9000385fc00: f3 f3 f3 00 00 00 00 00 00 00 00 00 00 00 00 f1
 ffffc9000385fc80: f1 f1 f1 00 f2 f2 f2 00 f2 f2 f2 00 00 00 00 00
==================================================================

Bug: 224546354
Fixes: 0fb375fb9b ("[AF_PACKET]: Allow for > 8 byte hardware addresses.")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: syzbot <syzkaller@googlegroups.com>
Link: https://lore.kernel.org/r/20220312232958.3535620-1-eric.dumazet@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Lee Jones <lee.jones@linaro.org>
Change-Id: I37e4a05a8d81b2645bc65db002e644b40d1a984d
2022-04-28 11:35:01 +00:00
Jiazi Li
06bb3003c6 BACKPORT: dm: fix NULL pointer issue when free bio
dm_io_dec_pending call end_io_acct first, will dec md in-flight
pending count. If a task is swapping table at same time.
task1                             task2
do_resume
 ->do_suspend
  ->dm_wait_for_completion
                                  bio_endio
				   ->clone_endio
				    ->dm_io_dec_pending
				     ->end_io_acct
				      ->wakeup task1
 ->dm_swap_table
  ->__bind
   ->__bind_mempools
    ->bioset_exit
     ->mempool_exit
                                     ->free_io
mempool->elements is NULL, and lead to following crash:
[ 67.330330] Unable to handle kernel NULL pointer dereference at virtual
address 0000000000000000
......
[ 67.330494] pstate: 80400085 (Nzcv daIf +PAN -UAO)
[ 67.330510] pc : mempool_free+0x70/0xa0
[ 67.330515] lr : mempool_free+0x4c/0xa0
[ 67.330520] sp : ffffff8008013b20
[ 67.330524] x29: ffffff8008013b20 x28: 0000000000000004
[ 67.330530] x27: ffffffa8c2ff40a0 x26: 00000000ffff1cc8
[ 67.330535] x25: 0000000000000000 x24: ffffffdada34c800
[ 67.330541] x23: 0000000000000000 x22: ffffffdada34c800
[ 67.330547] x21: 00000000ffff1cc8 x20: ffffffd9a1304d80
[ 67.330552] x19: ffffffdada34c970 x18: 000000b312625d9c
[ 67.330558] x17: 00000000002dcfbf x16: 00000000000006dd
[ 67.330563] x15: 000000000093b41e x14: 0000000000000010
[ 67.330569] x13: 0000000000007f7a x12: 0000000034155555
[ 67.330574] x11: 0000000000000001 x10: 0000000000000001
[ 67.330579] x9 : 0000000000000000 x8 : 0000000000000000
[ 67.330585] x7 : 0000000000000000 x6 : ffffff80148b5c1a
[ 67.330590] x5 : ffffff8008013ae0 x4 : 0000000000000001
[ 67.330596] x3 : ffffff80080139c8 x2 : ffffff801083bab8
[ 67.330601] x1 : 0000000000000000 x0 : ffffffdada34c970
[ 67.330609] Call trace:
[ 67.330616] mempool_free+0x70/0xa0
[ 67.330627] bio_put+0xf8/0x110
[ 67.330638] dec_pending+0x13c/0x230
[ 67.330644] clone_endio+0x90/0x180
[ 67.330649] bio_endio+0x198/0x1b8
[ 67.330655] dec_pending+0x190/0x230
[ 67.330660] clone_endio+0x90/0x180
[ 67.330665] bio_endio+0x198/0x1b8
[ 67.330673] blk_update_request+0x214/0x428
[ 67.330683] scsi_end_request+0x2c/0x300
[ 67.330688] scsi_io_completion+0xa0/0x710
[ 67.330695] scsi_finish_command+0xd8/0x110
[ 67.330700] scsi_softirq_done+0x114/0x148
[ 67.330708] blk_done_softirq+0x74/0xd0
[ 67.330716] __do_softirq+0x18c/0x374
[ 67.330724] irq_exit+0xb4/0xb8
[ 67.330732] __handle_domain_irq+0x84/0xc0
[ 67.330737] gic_handle_irq+0x148/0x1b0
[ 67.330744] el1_irq+0xe8/0x190
[ 67.330753] lpm_cpuidle_enter+0x4f8/0x538
[ 67.330759] cpuidle_enter_state+0x1fc/0x398
[ 67.330764] cpuidle_enter+0x18/0x20
[ 67.330772] do_idle+0x1b4/0x290
[ 67.330778] cpu_startup_entry+0x20/0x28
[ 67.330786] secondary_start_kernel+0x160/0x170

Move end_io_acct after free_io to fix this issue.

Bug: 228982905
Link: https://lore.kernel.org/dm-devel/1632916768-22379-1-git-send-email-lijiazi@xiaomi.com/T/#u
[Akilesh: Resolved merge conflict in drivers/md/dm.c]
Signed-off-by: Jiazi Li <lijiazi@xiaomi.com>
Signed-off-by: Akilesh Kailash <akailash@google.com>
(cherry picked from commit d208b89401)
Change-Id: I9f122cab2af3b961c472b8cf2087399c63c28de1
2022-04-27 22:38:31 +00:00
Lee Jones
98c15b2bad ANDROID: dm-bow: Protect Ranges fetched and erased from the RB tree
Bug: 195565510
Signed-off-by: Lee Jones <lee.jones@linaro.org>
Change-Id: Ic8134eb902aa7d929e3121b2f69b1d258f570652
2022-04-27 16:07:42 +00:00
Sebastian Ene
6450df3d7e ANDROID: arm64: Auto-enroll MMIO guard on protected vms
Set the MMIO guard flag for protected vms prior to entering the guest
for the first time.

Bug: 216798684
Signed-off-by: Sebastian Ene <sebastianene@google.com>
Change-Id: I1448102ae85176d495ae7f8d6d20de4092049f0d
2022-04-27 10:05:05 +00:00
Martin Liu
3e591c63b1 ANDROID: cma: allow to use CMA in swap-in path
Now, we allow to use CMA pages for certain user space
allocations. One of them is anonymous page fault case.
To align the use case, we should also allow to use CMA
pages in swap-in cases. This could help mitigate OOM
on swap-in cases showing plenty of free CMA left.

logd.klogd invoked oom-killer: gfp_mask=0x1100cca(GFP_HIGHUSER_MOVABLE), order=0, oom_score_adj=-1000
CPU: 0 PID: 433 Comm: logd.klogd Tainted: G        W  OE     5.10.100-android13-0 #1

Call trace:
 dump_backtrace.cfi_jt+0x0/0x8
 show_stack+0x1c/0x2c
 dump_stack_lvl+0xc0/0x13c
 dump_header+0x54/0x238
 oom_kill_process+0xb0/0x158
 out_of_memory+0x17c/0x328
 __alloc_pages_slowpath+0x5c4/0x8d0
 __alloc_pages_nodemask+0x1bc/0x2e0
 __read_swap_cache_async+0xdc/0x370
 swap_vma_readahead+0x3b4/0x488
 swapin_readahead+0x3c/0x54
 do_swap_page+0x1e0/0xaa0
 handle_pte_fault+0x128/0x1e0
 handle_mm_fault+0x308/0x590
 do_page_fault+0x33c/0x478
 do_translation_fault+0x58/0x11c
 do_mem_abort+0x68/0x144
 el0_da+0x24/0x34
 el0_sync_handler+0xc4/0xec
 el0_sync+0x1c0/0x200
Mem-Info:
active_anon:0 inactive_anon:3222 isolated_anon:62
 active_file:232 inactive_file:428 isolated_file:0
 unevictable:37232 dirty:3 writeback:40
 slab_reclaimable:19943 slab_unreclaimable:281193
 mapped:37126 shmem:2815 pagetables:8981 bounce:0
 free:126007 free_pcp:223 free_cma:123062
Node 0 active_anon:16kB inactive_anon:13160kB active_file:292kB inactive_file:2000kB unevictable:148928kB isolated(anon):0kB isolated(file):0kB mapped:148308kB dirty:12kB writeback:164kB shmem:11260kB shmem_thp: 0kB shmem_pmdmapped: 0kB anon_thp: 0kB writeback_tmp:0kB kernel_stack:20528kB shadow_call_stack:5200kB all_unreclaimable? no
DMA32 free:14128kB min:7572kB low:22636kB high:37700kB reserved_highatomic:4096KB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB writepending:0kB present:1913856kB managed:1553276kB mlocked:0kB pagetables:1292kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:2520kB
lowmem_reserve[]: 0 0 0
Normal free:489888kB min:19808kB low:59220kB high:98632kB reserved_highatomic:36864KB active_anon:20kB inactive_anon:12168kB active_file:0kB inactive_file:1640kB unevictable:148928kB writepending:180kB present:4194304kB managed:4063392kB mlocked:148928kB pagetables:34632kB bounce:0kB free_pcp:1928kB local_pcp:0kB free_cma:489752kB
lowmem_reserve[]: 0 0 0
DMA32: 166*4kB (UME) 163*8kB (UMECH) 592*16kB (UMCH) 5*32kB (UC) 2*64kB (C) 2*128kB (C) 2*256kB (C) 1*512kB (C) 1*1024kB (C) 0*2048kB 0*4096kB = 14032kB
Normal: 969*4kB (C) 77*8kB (C) 40*16kB (C) 17*32kB (C) 5*64kB (C) 1*128kB (C) 2*256kB (C) 1*512kB (C) 0*1024kB 0*2048kB 118*4096kB (C) = 490476kB
40220 total pagecache pages
30 pages in swap cache
Swap cache stats: add 2634625, delete 2635304, find 160621/2963954
Free swap  = 1473788kB
Total swap = 2097148kB

Bug: 229822798
Signed-off-by: Martin Liu <liumartin@google.com>
Change-Id: Ia0bb6f72e52f77f26062e1769bfd92e831f07cab
2022-04-27 03:05:53 +00:00
Jaegeuk Kim
c56ecad172 UPSTREAM: f2fs: should not truncate blocks during roll-forward recovery
If the file preallocated blocks and fsync'ed, we should not truncate them during
roll-forward recovery which will recover i_size correctly back.

Bug: 223740163
Fixes: d4dd19ec1e ("f2fs: do not expose unwritten blocks to user by DIO")
Cc: <stable@vger.kernel.org> # 5.17+
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
(cherry picked from commit 4d8ec91208)
Change-Id: I5e974a564667115455b53a18f31902a875d86dee
2022-04-25 23:09:59 +00:00
Chen-Yu Tsai
a50ef731e0 BACKPORT: media: v4l2-mem2mem: Apply DST_QUEUE_OFF_BASE on MMAP buffers across ioctls
[ Upstream commit 8310ca9407 ]

DST_QUEUE_OFF_BASE is applied to offset/mem_offset on MMAP capture buffers
only for the VIDIOC_QUERYBUF ioctl, while the userspace fields (including
offset/mem_offset) are filled in for VIDIOC_{QUERY,PREPARE,Q,DQ}BUF
ioctls. This leads to differences in the values presented to userspace.
If userspace attempts to mmap the capture buffer directly using values
from DQBUF, it will fail.

Move the code that applies the magic offset into a helper, and call
that helper from all four ioctl entry points.

[hverkuil: drop unnecessary '= 0' in v4l2_m2m_querybuf() for ret]

Bug: 223375145
Fixes: 7f98639def ("V4L/DVB: add memory-to-memory device helper framework for videobuf")
Fixes: 908a0d7c58 ("[media] v4l: mem2mem: port to videobuf2")
Signed-off-by: Chen-Yu Tsai <wenst@chromium.org>
Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Lee Jones <lee.jones@linaro.org>
Change-Id: Ifab9933f85f4ba7e0fadddf52debaf837830a06d
(cherry picked from commit 6def3a5ed8)
2022-04-25 09:04:21 +00:00
tuhailong
0496c13ded ANDROID: GKI: build damon reclaim
To enable it, echo 1 > /sys/module/damon_reclaim/parameters/enable

Bug: 228223814
Signed-off-by: Hailong Tu <tuhailong@oppo.com>
Change-Id: Ibb0701b6dd83070c6fbc8c96fdac3bc839821377
2022-04-22 23:25:16 +00:00
Hailong Tu
b3190b539a FROMLIST: mm/damon/reclaim: Fix the timer always stays active
The timer stays active even if the reclaim mechanism is never enabled.
It is unnecessary overhead can be completely avoided by using module_param_cb() for enabled flag.

Signed-off-by: Hailong Tu <tuhailong@gmail.com>

Link: https://lore.kernel.org/all/20220421125910.1052459-1-tuhailong@gmail.com/

Bug: 228223814
Signed-off-by: Hailong Tu <tuhailong@oppo.com>
Change-Id: I77591e41ce424ce16a4a5c70e7a86cdae996a354
2022-04-22 23:24:39 +00:00
Jakub Kicinski
ca5cc6bc4c BACKPORT: treewide: Add missing includes masked by cgroup -> bpf dependency
cgroup.h (therefore swap.h, therefore half of the universe)
includes bpf.h which in turn includes module.h and slab.h.
Since we're about to get rid of that dependency we need
to clean things up.

v2: drop the cpu.h include from cacheinfo.h, it's not necessary
and it makes riscv sensitive to ordering of include files.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Acked-by: Krzysztof Wilczyński <kw@linux.com>
Acked-by: Peter Chen <peter.chen@kernel.org>
Acked-by: SeongJae Park <sj@kernel.org>
Acked-by: Jani Nikula <jani.nikula@intel.com>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Link: https://lore.kernel.org/all/20211120035253.72074-1-kuba@kernel.org/  # v1
Link: https://lore.kernel.org/all/20211120165528.197359-1-kuba@kernel.org/ # cacheinfo discussion
Link: https://lore.kernel.org/bpf/20211202203400.1208663-1-kuba@kernel.org

(cherry picked from commit 8581fd402a)

Dropped all the changes except for the ones made in mm/damon/vaddr.c

Bug: 228223814
Signed-off-by: Hailong Tu <tuhailong@oppo.com>
Change-Id: Ib64ecbe4e06f192ed576af77e03e6b49d538bac9
2022-04-22 15:19:40 -07:00
Xin Hao
891f111a14 UPSTREAM: mm/damon: modify damon_rand() macro to static inline function
damon_rand() cannot be implemented as a macro.

Example:
	damon_rand(a++, b);

The value of 'a' will be incremented twice, This is obviously
unreasonable, So there fix it.

Link: https://lkml.kernel.org/r/110ffcd4e420c86c42b41ce2bc9f0fe6a4f32cd3.1638795127.git.xhao@linux.alibaba.com
Fixes: b9a6ac4e4e ("mm/damon: adaptively adjust regions")
Signed-off-by: Xin Hao <xhao@linux.alibaba.com>
Reported-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: SeongJae Park <sj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

(cherry picked from commit 234d68732b)

Bug: 228223814
Signed-off-by: Hailong Tu <tuhailong@oppo.com>
Change-Id: Idcc316e6f582959254111fb160d04875d235c306
2022-04-22 15:19:40 -07:00
Xin Hao
284927effa UPSTREAM: mm/damon: add 'age' of region tracepoint support
In Damon, we can get age information by analyzing the nr_access change,
But short time sampling is not effective, we have to obtain enough data
for analysis through long time trace, this also means that we need to
consume more cpu resources and storage space.

Now the region add a new 'age' variable, we only need to get the change of
age value through a little time trace, for example, age has been
increasing to 141, but nr_access shows a value of 0 at the same time,
Through this,we can conclude that the region has a very low nr_access
value for a long time.

Link: https://lkml.kernel.org/r/b9def1262af95e0dc1d0caea447886434db01161.1636989871.git.xhao@linux.alibaba.com
Signed-off-by: Xin Hao <xhao@linux.alibaba.com>
Reviewed-by: SeongJae Park <sj@kernel.org>
Cc: Muchun Song <songmuchun@bytedance.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

(cherry picked from commit c46b0bb6a7)

Bug: 228223814
Signed-off-by: Hailong Tu <tuhailong@oppo.com>
Change-Id: I39aeea71756f5af6f270efe9d5653815c905af46
2022-04-22 15:19:40 -07:00
SeongJae Park
3d89e63310 UPSTREAM: mm/damon: hide kernel pointer from tracepoint event
DAMON's virtual address spaces monitoring primitive uses 'struct pid *'
of the target process as its monitoring target id.  The kernel address
is exposed as-is to the user space via the DAMON tracepoint,
'damon_aggregated'.

Though primarily only privileged users are allowed to access that, it
would be better to avoid unnecessarily exposing kernel pointers so.
Because the trace result is only required to be able to distinguish each
target, we aren't need to use the pointer as-is.

This makes the tracepoint to use the index of the target in the
context's targets list as its id in the tracepoint, to hide the kernel
space address.

Link: https://lkml.kernel.org/r/20211229131016.23641-5-sj@kernel.org
Signed-off-by: SeongJae Park <sj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

(cherry picked from commit 76fd0285b4)

Bug: 228223814
Signed-off-by: Hailong Tu <tuhailong@oppo.com>
Change-Id: Iee4a6f56cf9bf5f61fb95f438dfc3316c10198e5
2022-04-22 15:19:39 -07:00
SeongJae Park
1656aa6e49 UPSTREAM: mm/damon/vaddr: hide kernel pointer from damon_va_three_regions() failure log
The failure log message for 'damon_va_three_regions()' prints the target
id, which is a 'struct pid' pointer in the case.  To avoid exposing the
kernel pointer via the log, this makes the log to use the index of the
target in the context's targets list instead.

Link: https://lkml.kernel.org/r/20211229131016.23641-4-sj@kernel.org
Signed-off-by: SeongJae Park <sj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

(cherry picked from commit 962fe7a6b1)

Bug: 228223814
Signed-off-by: Hailong Tu <tuhailong@oppo.com>
Change-Id: I6b9b09a9b7024630f05ec41db030006d63f47ce1
2022-04-22 15:19:39 -07:00
SeongJae Park
a0220f613b UPSTREAM: mm/damon/vaddr: use pr_debug() for damon_va_three_regions() failure logging
Failure of 'damon_va_three_regions()' is logged using 'pr_err()'.  But,
the function can fail in legal situations.  To avoid making users be
surprised and to keep the kernel clean, this makes the log to be printed
using 'pr_debug()'.

Link: https://lkml.kernel.org/r/20211229131016.23641-3-sj@kernel.org
Signed-off-by: SeongJae Park <sj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

(cherry picked from commit 251403f19a)

Bug: 228223814
Signed-off-by: Hailong Tu <tuhailong@oppo.com>
Change-Id: I0a2c08d856d0067af6af7a008e89f63fa5fac381
2022-04-22 15:19:39 -07:00
SeongJae Park
6be0ebcb89 UPSTREAM: mm/damon/dbgfs: remove an unnecessary variable
Patch series "mm/damon: Hide unnecessary information disclosures".

DAMON is exposing some unnecessary information including kernel pointer
in kernel log and tracepoint.  This patchset hides such information.
The first patch is only for a trivial cleanup, though.

This patch (of 4):

This commit removes a unnecessarily used variable in
dbgfs_target_ids_write().

Link: https://lkml.kernel.org/r/20211229131016.23641-1-sj@kernel.org
Link: https://lkml.kernel.org/r/20211229131016.23641-2-sj@kernel.org
Fixes: 4bc05954d0 ("mm/damon: implement a debugfs-based user space interface")
Signed-off-by: SeongJae Park <sj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

(cherry picked from commit 70b8480812)

Bug: 228223814
Signed-off-by: Hailong Tu <tuhailong@oppo.com>
Change-Id: I774c900de92780ae9ebf09f01c0b0536eb43f822
2022-04-22 15:19:39 -07:00
Guoqing Jiang
1b9e81febe UPSTREAM: mm/damon: move the implementation of damon_insert_region to damon.h
Usually, inline function is declared static since it should sit between
storage and type.  And implement it in a header file if used by multiple
files.

And this change also fixes compile issue when backport damon to 5.10.

  mm/damon/vaddr.c: In function `damon_va_evenly_split_region':
  ./include/linux/damon.h:425:13: error: inlining failed in call to `always_inline' `damon_insert_region': function body not available
  425 | inline void damon_insert_region(struct damon_region *r,
      | ^~~~~~~~~~~~~~~~~~~
  mm/damon/vaddr.c:86:3: note: called from here
  86 | damon_insert_region(n, r, next, t);
     | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Link: https://lkml.kernel.org/r/20211223085703.6142-1-guoqing.jiang@linux.dev
Signed-off-by: Guoqing Jiang <guoqing.jiang@linux.dev>
Reviewed-by: SeongJae Park <sj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

(cherry picked from commit 2cd4b8e10c)

Bug: 228223814
Signed-off-by: Hailong Tu <tuhailong@oppo.com>
Change-Id: Iaa05318092b8e98bfbfc72a8c5df2cb6de97d224
2022-04-22 15:19:39 -07:00
Baolin Wang
196600574b UPSTREAM: mm/damon: add access checking for hugetlb pages
The process's VMAs can be mapped by hugetlb page, but now the DAMON did
not implement the access checking for hugetlb pte, so we can not get the
actual access count like below if a process VMAs were mapped by hugetlb.

  damon_aggregated: target_id=18446614368406014464 nr_regions=12 4194304-5476352: 0 545
  damon_aggregated: target_id=18446614368406014464 nr_regions=12 140662370467840-140662372970496: 0 545
  damon_aggregated: target_id=18446614368406014464 nr_regions=12 140662372970496-140662375460864: 0 545
  damon_aggregated: target_id=18446614368406014464 nr_regions=12 140662375460864-140662377951232: 0 545
  damon_aggregated: target_id=18446614368406014464 nr_regions=12 140662377951232-140662380449792: 0 545
  damon_aggregated: target_id=18446614368406014464 nr_regions=12 140662380449792-140662382944256: 0 545
  ......

Thus this patch adds hugetlb access checking support, with this patch we
can see below VMA mapped by hugetlb access count.

  damon_aggregated: target_id=18446613056935405824 nr_regions=12 140296486649856-140296489914368: 1 3
  damon_aggregated: target_id=18446613056935405824 nr_regions=12 140296489914368-140296492978176: 1 3
  damon_aggregated: target_id=18446613056935405824 nr_regions=12 140296492978176-140296495439872: 1 3
  damon_aggregated: target_id=18446613056935405824 nr_regions=12 140296495439872-140296498311168: 1 3
  damon_aggregated: target_id=18446613056935405824 nr_regions=12 140296498311168-140296501198848: 1 3
  damon_aggregated: target_id=18446613056935405824 nr_regions=12 140296501198848-140296504320000: 1 3
  damon_aggregated: target_id=18446613056935405824 nr_regions=12 140296504320000-140296507568128: 1 2
  ......

[baolin.wang@linux.alibaba.com: fix unused var warning]
  Link: https://lkml.kernel.org/r/1aaf9c11-0d8e-b92d-5c92-46e50a6e8d4e@linux.alibaba.com
[baolin.wang@linux.alibaba.com: v3]
  Link: https://lkml.kernel.org/r/486927ecaaaecf2e3a7fbe0378ec6e1c58b50747.1640852276.git.baolin.wang@linux.alibaba.com

Link: https://lkml.kernel.org/r/6afcbd1fda5f9c7c24f320d26a98188c727ceec3.1639623751.git.baolin.wang@linux.alibaba.com
Signed-off-by: Baolin Wang <baolin.wang@linux.alibaba.com>
Reviewed-by: SeongJae Park <sj@kernel.org>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

(cherry picked from commit 49f4203aae)

Bug: 228223814
Signed-off-by: Hailong Tu <tuhailong@oppo.com>
Change-Id: Id7c86f5c0344efef150b3b27f662b696263947ed
2022-04-22 15:19:39 -07:00
SeongJae Park
2d885a4902 UPSTREAM: mm/damon/dbgfs: support all DAMOS stats
Currently, DAMON debugfs interface is not supporting DAMON-based
Operation Schemes (DAMOS) stats for schemes successfully applied regions
and time/space quota limit exceeds.  This adds the support.

Link: https://lkml.kernel.org/r/20211210150016.35349-6-sj@kernel.org
Signed-off-by: SeongJae Park <sj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

(cherry picked from commit 3a619fdb8d)

Bug: 228223814
Signed-off-by: Hailong Tu <tuhailong@oppo.com>
Change-Id: Ide3cfaff1b63e053ece5a8746a463fb87ef6c607
2022-04-22 15:19:39 -07:00
SeongJae Park
4baaaded13 UPSTREAM: mm/damon/reclaim: provide reclamation statistics
This implements new DAMON_RECLAIM parameters for statistics reporting.
Those can be used for understanding how DAMON_RECLAIM is working, and
for tuning the other parameters.

Link: https://lkml.kernel.org/r/20211210150016.35349-4-sj@kernel.org
Signed-off-by: SeongJae Park <sj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

(cherry picked from commit 60e52e7c46)

Bug: 228223814
Signed-off-by: Hailong Tu <tuhailong@oppo.com>
Change-Id: I721bdacb3b2d8b41154296f5dbf8ef48c5dd0744
2022-04-22 15:19:38 -07:00
SeongJae Park
5388d0502f UPSTREAM: mm/damon/schemes: account how many times quota limit has exceeded
If the time/space quotas of a given DAMON-based operation scheme is too
small, the scheme could show unexpectedly slow progress.  However, there
is no good way to notice the case in runtime.  This commit extends the
DAMOS stat to provide how many times the quota limits exceeded so that
the users can easily notice the case and tune the scheme.

Link: https://lkml.kernel.org/r/20211210150016.35349-3-sj@kernel.org
Signed-off-by: SeongJae Park <sj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

(cherry picked from commit 6268eac34c)

Bug: 228223814
Signed-off-by: Hailong Tu <tuhailong@oppo.com>
Change-Id: I36184dc51917810c81ac8d576b144e33a4454dc1
2022-04-22 15:19:38 -07:00