Commit Graph

1072446 Commits

Author SHA1 Message Date
Carlos Llamas
3cd672870a ANDROID: binder: fix KMI-break due to alloc->lock
Wrap 'struct binder_proc' inside 'struct binder_proc_wrap' to add the
alloc->lock equivalent without breaking the KMI. Also, add convenient
apis to access/modify this new spinlock.

Without this patch, the following KMI issues show up:

  type 'struct binder_proc' changed
    byte size changed from 616 to 576

  type 'struct binder_alloc' changed
    byte size changed from 152 to 112
    member 'spinlock_t lock' was added
    member 'struct mutex mutex' was removed

Bug: 254650075
Change-Id: Ic31dc39fb82800a3e47be10a7873cd210f7b60be
Signed-off-by: Carlos Llamas <cmllamas@google.com>
2024-01-09 01:15:49 +00:00
Carlos Llamas
63f7ddea2e ANDROID: binder: fix KMI-break due to address type change
In commit ("binder: keep vma addresses type as unsigned long") the vma
address type in 'struct binder_alloc' and 'struct binder_buffer' is
changed from 'void __user *' to 'unsigned long'.

This triggers the following KMI issues:

  type 'struct binder_buffer' changed
    member changed from 'void* user_data' to 'unsigned long user_data'
      type changed from 'void*' to 'unsigned long'

  type 'struct binder_alloc' changed
    member changed from 'void* buffer' to 'unsigned long buffer'
      type changed from 'void*' to 'unsigned long'

This offending commit is being backported as part of a larger patchset
from upstream in [1]. Lets fix these issues by doing a partial revert
that restores the original types and casts to an integer type where
necessary.

Note this approach is preferred over dropping the single KMI-breaking
patch from the backport, as this would have created non-trivial merge
conflicts in the subsequent cherry-picks.

Bug: 254650075
Link: https://lore.kernel.org/all/20231201172212.1813387-1-cmllamas@google.com/ [1]
Change-Id: Ief9de717d0f34642f5954ffa2e306075a5b4e02e
Signed-off-by: Carlos Llamas <cmllamas@google.com>
2024-01-09 01:15:49 +00:00
Carlos Llamas
74ecd99c15 BACKPORT: FROMGIT: binder: switch alloc->mutex to spinlock_t
The alloc->mutex is a highly contended lock that causes performance
issues on Android devices. When a low-priority task is given this lock
and it sleeps, it becomes difficult for the task to wake up and complete
its work. This delays other tasks that are also waiting on the mutex.

The problem gets worse when there is memory pressure in the system,
because this increases the contention on the alloc->mutex while the
shrinker reclaims binder pages.

Switching to a spinlock helps to keep the waiters running and avoids the
overhead of waking up tasks. This significantly improves the transaction
latency when the problematic scenario occurs.

The performance impact of this patchset was measured by stress-testing
the binder alloc contention. In this test, several clients of different
priorities send thousands of transactions of different sizes to a single
server. In parallel, pages get reclaimed using the shinker's debugfs.

The test was run on a Pixel 8, Pixel 6 and qemu machine. The results
were similar on all three devices:

after:
  | sched  | prio | average | max       | min     |
  |--------+------+---------+-----------+---------|
  | fifo   |   99 | 0.135ms |   1.197ms | 0.022ms |
  | fifo   |   01 | 0.136ms |   5.232ms | 0.018ms |
  | other  |  -20 | 0.180ms |   7.403ms | 0.019ms |
  | other  |   19 | 0.241ms |  58.094ms | 0.018ms |

before:
  | sched  | prio | average | max       | min     |
  |--------+------+---------+-----------+---------|
  | fifo   |   99 | 0.350ms | 248.730ms | 0.020ms |
  | fifo   |   01 | 0.357ms | 248.817ms | 0.024ms |
  | other  |  -20 | 0.399ms | 249.906ms | 0.020ms |
  | other  |   19 | 0.477ms | 297.756ms | 0.022ms |

The key metrics above are the average and max latencies (wall time).
These improvements should roughly translate to p95-p99 latencies on real
workloads. The response time is up to 200x faster in these scenarios and
there is no penalty in the regular path.

Note that it is only possible to convert this lock after a series of
changes made by previous patches. These mainly include refactoring the
sections that might_sleep() and changing the locking order with the
mmap_lock amongst others.

Reviewed-by: Alice Ryhl <aliceryhl@google.com>
Signed-off-by: Carlos Llamas <cmllamas@google.com>
Link: https://lore.kernel.org/r/20231201172212.1813387-29-cmllamas@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

Bug: 254650075
(cherry picked from commit 7710e2cca32e7f3958480e8bd44f50e29d0c2509
 https://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc.git
 char-misc-next)
Change-Id: I67121be071d5f072ac0e5eb719c95c0f1dee5eb5
[cmllamas: fixed conflicts due to missing e66b77e505]
Signed-off-by: Carlos Llamas <cmllamas@google.com>
2024-01-09 01:15:49 +00:00
Carlos Llamas
5a8658eac3 BACKPORT: FROMGIT: binder: reverse locking order in shrinker callback
The locking order currently requires the alloc->mutex to be acquired
first followed by the mmap lock. However, the alloc->mutex is converted
into a spinlock in subsequent commits so the order needs to be reversed
to avoid nesting the sleeping mmap lock under the spinlock.

The shrinker's callback binder_alloc_free_page() is the only place that
needs to be reordered since other functions have been refactored and no
longer nest these locks.

Some minor cosmetic changes are also included in this patch.

Signed-off-by: Carlos Llamas <cmllamas@google.com>
Link: https://lore.kernel.org/r/20231201172212.1813387-28-cmllamas@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

Bug: 254650075
(cherry picked from commit e50f4e6cc9bfaca655d3b6a3506d27cf2caa1d40
 https://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc.git
 char-misc-next)
Change-Id: I7f7501945a477ac5571082a5dd2a7934f484b8ab
[cmllamas: fixed conflicts due to missing e66b77e505]
Signed-off-by: Carlos Llamas <cmllamas@google.com>
2024-01-09 01:15:49 +00:00
Carlos Llamas
f0667c870c FROMGIT: binder: avoid user addresses in debug logs
Prefer logging vma offsets instead of addresses or simply drop the debug
log altogether if not useful. Note this covers the instances affected by
the switch to store addresses as unsigned long. However, there are other
sections in the driver that could do the same.

Signed-off-by: Carlos Llamas <cmllamas@google.com>
Reviewed-by: Alice Ryhl <aliceryhl@google.com>
Link: https://lore.kernel.org/r/20231201172212.1813387-27-cmllamas@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

Bug: 254650075
(cherry picked from commit 162c79731448a5a052e93af7753df579dfe0bf7a
 https://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc.git
 char-misc-next)
Change-Id: I92b7f409e45d9006492d56302e911ccdd8efc950
Signed-off-by: Carlos Llamas <cmllamas@google.com>
2024-01-09 01:15:49 +00:00
Carlos Llamas
b93c9f8565 FROMGIT: binder: refactor binder_delete_free_buffer()
Skip the freelist call immediately as needed, instead of continuing the
pointless checks. Also, drop the debug logs that we don't really need.

Signed-off-by: Carlos Llamas <cmllamas@google.com>
Reviewed-by: Alice Ryhl <aliceryhl@google.com>
Link: https://lore.kernel.org/r/20231201172212.1813387-26-cmllamas@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

Bug: 254650075
(cherry picked from commit f07b83a48e944c8a1cc1e9f6703fae5e34df2ba4
 https://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc.git
 char-misc-next)
Change-Id: I035bd6cd5c06ec984cd6eb3c3b53e0958c64df4f
Signed-off-by: Carlos Llamas <cmllamas@google.com>
2024-01-09 01:15:49 +00:00
Carlos Llamas
f6b1c043ae FROMGIT: binder: collapse print_binder_buffer() into caller
The code in print_binder_buffer() is quite small so it can be collapsed
into its single caller binder_alloc_print_allocated().

No functional change in this patch.

Signed-off-by: Carlos Llamas <cmllamas@google.com>
Reviewed-by: Alice Ryhl <aliceryhl@google.com>
Link: https://lore.kernel.org/r/20231201172212.1813387-25-cmllamas@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

Bug: 254650075
(cherry picked from commit 8e905217c4543af9cf1754809846157a7dbbb261
 https://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc.git
 char-misc-next)
Change-Id: Ic3e2522b4702e60e09be3d5940f88ec8252ac793
Signed-off-by: Carlos Llamas <cmllamas@google.com>
2024-01-09 01:15:49 +00:00
Carlos Llamas
683f84a35f FROMGIT: binder: document the final page calculation
The code to determine the page range for binder_lru_freelist_del() is
quite obscure. It leverages the buffer_size calculated before doing an
oversized buffer split. This is used to figure out if the last page is
being shared with another active buffer. If so, the page gets trimmed
out of the range as it has been previously removed from the freelist.

This would be equivalent to getting the start page of the next in-use
buffer explicitly. However, the code for this is much larger as we can
see in binder_free_buf_locked() routine. Instead, lets settle on
documenting the tricky step and using better names for now.

I believe an ideal solution would be to count the binder_page->users to
determine when a page should be added or removed from the freelist.
However, this is a much bigger change than what I'm willing to risk at
this time.

Signed-off-by: Carlos Llamas <cmllamas@google.com>
Link: https://lore.kernel.org/r/20231201172212.1813387-24-cmllamas@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

Bug: 254650075
(cherry picked from commit 67dcc880780569ec40391cae4d8299adc1e7a44e
 https://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc.git
 char-misc-next)
Change-Id: Iec2466605fe7f8aa338c8313f586cdb7519a36e7
Signed-off-by: Carlos Llamas <cmllamas@google.com>
2024-01-09 01:15:49 +00:00
Carlos Llamas
4c82cbad43 BACKPORT: FROMGIT: binder: rename lru shrinker utilities
Now that the page allocation step is done separately we should rename
the binder_free_page_range() and binder_allocate_page_range() functions
to provide a more accurate description of what they do. Lets borrow the
freelist concept used in other parts of the kernel for this.

No functional change here.

Signed-off-by: Carlos Llamas <cmllamas@google.com>
Reviewed-by: Alice Ryhl <aliceryhl@google.com>
Link: https://lore.kernel.org/r/20231201172212.1813387-23-cmllamas@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

Bug: 254650075
(cherry picked from commit ea9cdbf0c7273b55e251b2ed8f85794cfadab5d5
 https://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc.git
 char-misc-next)
Change-Id: I0d0dfcc6f72d54209da310be2ad5e30f3d722652
[cmllamas: fixed trivial conflicts due to missing commits e33c267ab7
 95a542da5322e]
Signed-off-by: Carlos Llamas <cmllamas@google.com>
2024-01-09 01:15:49 +00:00
Minghao Chi
eba1fb9603 UPSTREAM: drivers/android: remove redundant ret variable
Return value from list_lru_count() directly instead
of taking this in another redundant variable.

Reported-by: Zeal Robot <zealci@zte.com.cn>
Signed-off-by: Minghao Chi <chi.minghao@zte.com.cn>
Signed-off-by: CGEL ZTE <cgel.zte@gmail.com>
Link: https://lore.kernel.org/r/20220104113500.602158-1-chi.minghao@zte.com.cn
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

Bug: 254650075
(cherry picked from commit b2fb28dedd)
Change-Id: Ibe0d4a6c93a860e8ca2a3308a7073fa2eff9e785
Signed-off-by: Carlos Llamas <cmllamas@google.com>
2024-01-09 01:15:49 +00:00
Carlos Llamas
356047fe2a FROMGIT: binder: make oversized buffer code more readable
The sections in binder_alloc_new_buf_locked() dealing with oversized
buffers are scattered which makes them difficult to read. Instead,
consolidate this code into a single block to improve readability.

No functional change here.

Signed-off-by: Carlos Llamas <cmllamas@google.com>
Reviewed-by: Alice Ryhl <aliceryhl@google.com>
Link: https://lore.kernel.org/r/20231201172212.1813387-22-cmllamas@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

Bug: 254650075
(cherry picked from commit de0e6573125f8ea7a01a9b05a45b0c73116c73b2
 https://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc.git
 char-misc-next)
Change-Id: I62c2cec7341e13d9174b4f0839a1345df7cfd808
Signed-off-by: Carlos Llamas <cmllamas@google.com>
2024-01-09 01:15:49 +00:00
Carlos Llamas
f7476dca31 FROMGIT: binder: remove redundant debug log
The debug information in this statement is already logged earlier in the
same function. We can get rid of this duplicate log.

Signed-off-by: Carlos Llamas <cmllamas@google.com>
Reviewed-by: Alice Ryhl <aliceryhl@google.com>
Link: https://lore.kernel.org/r/20231201172212.1813387-21-cmllamas@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

Bug: 254650075
(cherry picked from commit 258ce20ede33c551002705fa1488864fb287752c
 https://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc.git
 char-misc-next)
Change-Id: Ie533a55ea10b2af927004f1d0e244b386ba25360
Signed-off-by: Carlos Llamas <cmllamas@google.com>
2024-01-09 01:15:49 +00:00
Carlos Llamas
477e8e8453 BACKPORT: FROMGIT: binder: perform page installation outside of locks
Split out the insertion of pages to be outside of the alloc->mutex in a
separate binder_install_buffer_pages() routine. Since this is no longer
serialized, we must look at the full range of pages used by the buffers.
The installation is protected with mmap_sem in write mode since multiple
tasks might race to install the same page.

Besides avoiding unnecessary nested locking this helps in preparation of
switching the alloc->mutex into a spinlock_t in subsequent patches.

Signed-off-by: Carlos Llamas <cmllamas@google.com>
Reviewed-by: Alice Ryhl <aliceryhl@google.com>
Link: https://lore.kernel.org/r/20231201172212.1813387-20-cmllamas@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

Bug: 254650075
(cherry picked from commit 37ebbb4f73a0d299fa0c7dd043932a2f5fbbb779
 https://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc.git
 char-misc-next)
Change-Id: I7b0684310b8824194d7e4a51a1fd67944f8ec06a
[cmllamas: fixed conflicts due to missing e66b77e505]
Signed-off-by: Carlos Llamas <cmllamas@google.com>
2024-01-09 01:15:49 +00:00
Carlos Llamas
af71193412 FROMGIT: binder: initialize lru pages in mmap callback
Rather than repeatedly initializing some of the binder_lru_page members
during binder_alloc_new_buf(), perform this initialization just once in
binder_alloc_mmap_handler(), after the pages have been created.

Reviewed-by: Alice Ryhl <aliceryhl@google.com>
Signed-off-by: Carlos Llamas <cmllamas@google.com>
Link: https://lore.kernel.org/r/20231201172212.1813387-19-cmllamas@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

Bug: 254650075
(cherry picked from commit 68aef12d094e4c96d972790f1620415460a4f3cf
 https://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc.git
 char-misc-next)
Change-Id: I3197038683f76a5cb98a79d017d1515429df2d73
Signed-off-by: Carlos Llamas <cmllamas@google.com>
2024-01-09 01:15:49 +00:00
Carlos Llamas
ef524f4dd4 FROMGIT: binder: malloc new_buffer outside of locks
Preallocate new_buffer before acquiring the alloc->mutex and hand it
down to binder_alloc_new_buf_locked(). The new buffer will be used in
the vast majority of requests (measured at 98.2% in field data). The
buffer is discarded otherwise. This change is required in preparation
for transitioning alloc->mutex into a spinlock in subsequent commits.

Signed-off-by: Carlos Llamas <cmllamas@google.com>
Reviewed-by: Alice Ryhl <aliceryhl@google.com>
Link: https://lore.kernel.org/r/20231201172212.1813387-18-cmllamas@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

Bug: 254650075
(cherry picked from commit c7ac30fad18231a1637d38aa8a97d6b4788ed8ad
 https://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc.git
 char-misc-next)
Change-Id: Ib7c8eb3c53e8383694a118fabc776a6a22783c75
Signed-off-by: Carlos Llamas <cmllamas@google.com>
2024-01-09 01:15:49 +00:00
Carlos Llamas
b23dbdbf19 BACKPORT: FROMGIT: binder: refactor page range allocation
Instead of looping through the page range twice to first determine if
the mmap lock is required, simply do it per-page as needed. Split out
all this logic into a separate binder_install_single_page() function.

Reviewed-by: Alice Ryhl <aliceryhl@google.com>
Signed-off-by: Carlos Llamas <cmllamas@google.com>
Link: https://lore.kernel.org/r/20231201172212.1813387-17-cmllamas@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

Bug: 254650075
(cherry picked from commit ea2735ce19c1c6ce0f6011f813a1eea0272c231d
 https://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc.git
 char-misc-next)
Change-Id: Ic057e9cfaeb22754f99bdec2a51076cf58c86855
[cmllamas: fixed conflicts due to missing e66b77e505]
Signed-off-by: Carlos Llamas <cmllamas@google.com>
2024-01-09 01:15:49 +00:00
Carlos Llamas
59e0d62fc8 BACKPORT: FROMGIT: binder: relocate binder_alloc_clear_buf()
Move this function up along with binder_alloc_get_page() so that their
prototypes aren't necessary.

No functional change in this patch.

Reviewed-by: Alice Ryhl <aliceryhl@google.com>
Signed-off-by: Carlos Llamas <cmllamas@google.com>
Link: https://lore.kernel.org/r/20231201172212.1813387-16-cmllamas@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

Bug: 254650075
(cherry picked from commit cbc174a64b8d0ab542752c167dc1334b52b88624
 https://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc.git
 char-misc-next)
Change-Id: I0d3c69c9a26c7415308202c4b7868a36b83d089c
[cmllamas: fixed conflicts due to missing 26eff2d66a]
Signed-off-by: Carlos Llamas <cmllamas@google.com>
2024-01-09 01:15:49 +00:00
Carlos Llamas
081ddad216 FROMGIT: binder: relocate low space calculation
Move the low async space calculation to debug_low_async_space_locked().
This logic not only fits better here but also offloads some of the many
tasks currently done in binder_alloc_new_buf_locked().

No functional change in this patch.

Reviewed-by: Alice Ryhl <aliceryhl@google.com>
Signed-off-by: Carlos Llamas <cmllamas@google.com>
Link: https://lore.kernel.org/r/20231201172212.1813387-15-cmllamas@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

Bug: 254650075
(cherry picked from commit c13500eaabd2343aa4cbb76b54ec624cb0c0ef8d
 https://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc.git
 char-misc-next)
Change-Id: I1b396f59f2a5b6640d8664767f2d45a675af7197
Signed-off-by: Carlos Llamas <cmllamas@google.com>
2024-01-09 01:15:49 +00:00
Carlos Llamas
e1d195e94d FROMGIT: binder: separate the no-space debugging logic
Move the no-space debugging logic into a separate function. Lets also
mark this branch as unlikely in binder_alloc_new_buf_locked() as most
requests will fit without issue.

Also add a few cosmetic changes and suggestions from checkpatch.

Reviewed-by: Alice Ryhl <aliceryhl@google.com>
Signed-off-by: Carlos Llamas <cmllamas@google.com>
Link: https://lore.kernel.org/r/20231201172212.1813387-14-cmllamas@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

Bug: 254650075
(cherry picked from commit 9409af24e4503d14093b27db9425f7c99e64fef4
 https://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc.git
 char-misc-next)
Change-Id: I4ff8ced5728a63815f7d47df9eb9ac85aa0a362d
Signed-off-by: Carlos Llamas <cmllamas@google.com>
2024-01-09 01:15:49 +00:00
Carlos Llamas
26d06d9349 FROMGIT: binder: remove pid param in binder_alloc_new_buf()
Binder attributes the buffer allocation to the current->tgid everytime.
There is no need to pass this as a parameter so drop it.

Also add a few touchups to follow the coding guidelines. No functional
changes are introduced in this patch.

Reviewed-by: Alice Ryhl <aliceryhl@google.com>
Signed-off-by: Carlos Llamas <cmllamas@google.com>
Link: https://lore.kernel.org/r/20231201172212.1813387-13-cmllamas@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

Bug: 254650075
(cherry picked from commit 89f71743bf42217dd4092fda703a8e4f6f4e55ac
 https://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc.git
 char-misc-next)
Change-Id: Ib21fdc5afd7eeb4723b08913ba40ded762421b0b
Signed-off-by: Carlos Llamas <cmllamas@google.com>
2024-01-09 01:15:49 +00:00
Carlos Llamas
d5c44f9065 FROMGIT: binder: do unlocked work in binder_alloc_new_buf()
Extract non-critical sections from binder_alloc_new_buf_locked() that
don't require holding the alloc->mutex. While we are here, consolidate
the checks for size overflow and zero-sized padding into a separate
sanitized_size() helper function.

Also add a few touchups to follow the coding guidelines.

Signed-off-by: Carlos Llamas <cmllamas@google.com>
Reviewed-by: Alice Ryhl <aliceryhl@google.com>
Link: https://lore.kernel.org/r/20231201172212.1813387-12-cmllamas@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

Bug: 254650075
(cherry picked from commit 377e1684db7a1e23261f3c3ebf76523c0554d512
 https://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc.git
 char-misc-next)
Change-Id: I8fc18c06563ad2c26536633034fb3e94b0aaf510
Signed-off-by: Carlos Llamas <cmllamas@google.com>
2024-01-09 01:15:49 +00:00
Carlos Llamas
0b24368fff FROMGIT: binder: split up binder_update_page_range()
The binder_update_page_range() function performs both allocation and
freeing of binder pages. However, these two operations are unrelated and
have no common logic. In fact, when a free operation is requested, the
allocation logic is skipped entirely. This behavior makes the error path
unnecessarily complex. To improve readability of the code, this patch
splits the allocation and freeing operations into separate functions.

No functional changes are introduced by this patch.

Reviewed-by: Alice Ryhl <aliceryhl@google.com>
Signed-off-by: Carlos Llamas <cmllamas@google.com>
Link: https://lore.kernel.org/r/20231201172212.1813387-11-cmllamas@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

Bug: 254650075
(cherry picked from commit 0d35bf3bf2da8d43fd12fea7699dc936999bf96e
 https://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc.git
 char-misc-next)
Change-Id: Iaf64f94564d2017c4633f2421c15b0bdee914738
Signed-off-by: Carlos Llamas <cmllamas@google.com>
2024-01-09 01:15:49 +00:00
Carlos Llamas
c38a89805f FROMGIT: binder: keep vma addresses type as unsigned long
The vma addresses in binder are currently stored as void __user *. This
requires casting back and forth between the mm/ api which uses unsigned
long. Since we also do internal arithmetic on these addresses we end up
having to cast them _again_ to an integer type.

Lets stop all the unnecessary casting which kills code readability and
store the virtual addresses as the native unsigned long from mm/. Note
that this approach is preferred over uintptr_t as Linus explains in [1].

Opportunistically add a few cosmetic touchups.

Link: https://lore.kernel.org/all/CAHk-=wj2OHy-5e+srG1fy+ZU00TmZ1NFp6kFLbVLMXHe7A1d-g@mail.gmail.com/ [1]
Signed-off-by: Carlos Llamas <cmllamas@google.com>
Reviewed-by: Alice Ryhl <aliceryhl@google.com>
Link: https://lore.kernel.org/r/20231201172212.1813387-10-cmllamas@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

Bug: 254650075
(cherry picked from commit df9aabead791d7a3d59938abe288720f5c1367f7
 https://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc.git
 char-misc-next)
Change-Id: Ib2fbaf0ad881973eb77957863f079f986fe0d926
Signed-off-by: Carlos Llamas <cmllamas@google.com>
2024-01-09 01:15:49 +00:00
Carlos Llamas
ca5c7be9e0 FROMGIT: binder: remove extern from function prototypes
The kernel coding style does not require 'extern' in function prototypes
in .h files, so remove them from drivers/android/binder_alloc.h as they
are not needed.

No functional changes in this patch.

Reviewed-by: Alice Ryhl <aliceryhl@google.com>
Signed-off-by: Carlos Llamas <cmllamas@google.com>
Link: https://lore.kernel.org/r/20231201172212.1813387-9-cmllamas@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

Bug: 254650075
(cherry picked from commit da483f8b390546fbe36abd72f58d612a8032e2a8
 https://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc.git
 char-misc-next)
Change-Id: I75e4ee9cf08fada7378f448bc5992d125174132f
Signed-off-by: Carlos Llamas <cmllamas@google.com>
2024-01-09 01:15:49 +00:00
Carlos Llamas
2a250a1528 FROMGIT: binder: fix comment on binder_alloc_new_buf() return value
Update the comments of binder_alloc_new_buf() to reflect that the return
value of the function is now ERR_PTR(-errno) on failure.

No functional changes in this patch.

Cc: stable@vger.kernel.org
Fixes: 57ada2fb22 ("binder: add log information for binder transaction failures")
Reviewed-by: Alice Ryhl <aliceryhl@google.com>
Signed-off-by: Carlos Llamas <cmllamas@google.com>
Link: https://lore.kernel.org/r/20231201172212.1813387-8-cmllamas@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

Bug: 254650075
(cherry picked from commit e1090371e02b601cbfcea175c2a6cc7c955fa830
 https://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc.git
 char-misc-next)
Change-Id: If382b3c0fb4fdd52e37c7bb44e58f1f9f28ae7b4
Signed-off-by: Carlos Llamas <cmllamas@google.com>
2024-01-09 01:15:49 +00:00
Carlos Llamas
26f0c01348 FROMGIT: binder: fix trivial typo of binder_free_buf_locked()
Fix minor misspelling of the function in the comment section.

No functional changes in this patch.

Cc: stable@vger.kernel.org
Fixes: 0f966cba95 ("binder: add flag to clear buffer on txn complete")
Reviewed-by: Alice Ryhl <aliceryhl@google.com>
Signed-off-by: Carlos Llamas <cmllamas@google.com>
Link: https://lore.kernel.org/r/20231201172212.1813387-7-cmllamas@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

Bug: 254650075
(cherry picked from commit 122a3c1cb0ff304c2b8934584fcfea4edb2fe5e3
 https://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc.git
 char-misc-next)
Change-Id: I1e0525464f9f048196ccfe093464a37784ac03bc
Signed-off-by: Carlos Llamas <cmllamas@google.com>
2024-01-09 01:15:49 +00:00
Carlos Llamas
11ca07657c FROMGIT: binder: fix unused alloc->free_async_space
Each transaction is associated with a 'struct binder_buffer' that stores
the metadata about its buffer area. Since commit 74310e06be ("android:
binder: Move buffer out of area shared with user space") this struct is
no longer embedded within the buffer itself but is instead allocated on
the heap to prevent userspace access to this driver-exclusive info.

Unfortunately, the space of this struct is still being accounted for in
the total buffer size calculation, specifically for async transactions.
This results in an additional 104 bytes added to every async buffer
request, and this area is never used.

This wasted space can be substantial. If we consider the maximum mmap
buffer space of SZ_4M, the driver will reserve half of it for async
transactions, or 0x200000. This area should, in theory, accommodate up
to 262,144 buffers of the minimum 8-byte size. However, after adding
the extra 'sizeof(struct binder_buffer)', the total number of buffers
drops to only 18,724, which is a sad 7.14% of the actual capacity.

This patch fixes the buffer size calculation to enable the utilization
of the entire async buffer space. This is expected to reduce the number
of -ENOSPC errors that are seen on the field.

Fixes: 74310e06be ("android: binder: Move buffer out of area shared with user space")
Signed-off-by: Carlos Llamas <cmllamas@google.com>
Reviewed-by: Alice Ryhl <aliceryhl@google.com>
Link: https://lore.kernel.org/r/20231201172212.1813387-6-cmllamas@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

Bug: 260584709
(cherry picked from commit c6d05e0762ab276102246d24affd1e116a46aa0c
 https://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc.git
 char-misc-next)
Change-Id: Ibea00de6a09bc583f648c1ee802c81332611d406
Signed-off-by: Carlos Llamas <cmllamas@google.com>
2024-01-09 01:15:49 +00:00
Carlos Llamas
65cf1585ea FROMGIT: binder: fix async space check for 0-sized buffers
Move the padding of 0-sized buffers to an earlier stage to account for
this round up during the alloc->free_async_space check.

Fixes: 74310e06be ("android: binder: Move buffer out of area shared with user space")
Reviewed-by: Alice Ryhl <aliceryhl@google.com>
Signed-off-by: Carlos Llamas <cmllamas@google.com>
Link: https://lore.kernel.org/r/20231201172212.1813387-5-cmllamas@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

Bug: 254650075
(cherry picked from commit 3091c21d3e9322428691ce0b7a0cfa9c0b239eeb
 https://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc.git
 char-misc-next)
Change-Id: I0c3a2378fb7e9c534ffe015dee849196f83aa252
Signed-off-by: Carlos Llamas <cmllamas@google.com>
2024-01-09 01:15:49 +00:00
Carlos Llamas
1787dddd97 FROMGIT: binder: fix race between mmput() and do_exit()
Task A calls binder_update_page_range() to allocate and insert pages on
a remote address space from Task B. For this, Task A pins the remote mm
via mmget_not_zero() first. This can race with Task B do_exit() and the
final mmput() refcount decrement will come from Task A.

  Task A            | Task B
  ------------------+------------------
  mmget_not_zero()  |
                    |  do_exit()
                    |    exit_mm()
                    |      mmput()
  mmput()           |
    exit_mmap()     |
      remove_vma()  |
        fput()      |

In this case, the work of ____fput() from Task B is queued up in Task A
as TWA_RESUME. So in theory, Task A returns to userspace and the cleanup
work gets executed. However, Task A instead sleep, waiting for a reply
from Task B that never comes (it's dead).

This means the binder_deferred_release() is blocked until an unrelated
binder event forces Task A to go back to userspace. All the associated
death notifications will also be delayed until then.

In order to fix this use mmput_async() that will schedule the work in
the corresponding mm->async_put_work WQ instead of Task A.

Fixes: 457b9a6f09 ("Staging: android: add binder driver")
Reviewed-by: Alice Ryhl <aliceryhl@google.com>
Signed-off-by: Carlos Llamas <cmllamas@google.com>
Link: https://lore.kernel.org/r/20231201172212.1813387-4-cmllamas@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

Bug: 293845143
(cherry picked from commit 9a9ab0d963621d9d12199df9817e66982582d5a5
 https://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc.git
 char-misc-next)
Change-Id: I2ec43b375e115c0daf21df3893da634dbefeed3e
Signed-off-by: Carlos Llamas <cmllamas@google.com>
2024-01-09 01:15:49 +00:00
Carlos Llamas
8dce2880bc FROMGIT: binder: fix use-after-free in shinker's callback
The mmap read lock is used during the shrinker's callback, which means
that using alloc->vma pointer isn't safe as it can race with munmap().
As of commit dd2283f260 ("mm: mmap: zap pages with read mmap_sem in
munmap") the mmap lock is downgraded after the vma has been isolated.

I was able to reproduce this issue by manually adding some delays and
triggering page reclaiming through the shrinker's debug sysfs. The
following KASAN report confirms the UAF:

  ==================================================================
  BUG: KASAN: slab-use-after-free in zap_page_range_single+0x470/0x4b8
  Read of size 8 at addr ffff356ed50e50f0 by task bash/478

  CPU: 1 PID: 478 Comm: bash Not tainted 6.6.0-rc5-00055-g1c8b86a3799f-dirty #70
  Hardware name: linux,dummy-virt (DT)
  Call trace:
   zap_page_range_single+0x470/0x4b8
   binder_alloc_free_page+0x608/0xadc
   __list_lru_walk_one+0x130/0x3b0
   list_lru_walk_node+0xc4/0x22c
   binder_shrink_scan+0x108/0x1dc
   shrinker_debugfs_scan_write+0x2b4/0x500
   full_proxy_write+0xd4/0x140
   vfs_write+0x1ac/0x758
   ksys_write+0xf0/0x1dc
   __arm64_sys_write+0x6c/0x9c

  Allocated by task 492:
   kmem_cache_alloc+0x130/0x368
   vm_area_alloc+0x2c/0x190
   mmap_region+0x258/0x18bc
   do_mmap+0x694/0xa60
   vm_mmap_pgoff+0x170/0x29c
   ksys_mmap_pgoff+0x290/0x3a0
   __arm64_sys_mmap+0xcc/0x144

  Freed by task 491:
   kmem_cache_free+0x17c/0x3c8
   vm_area_free_rcu_cb+0x74/0x98
   rcu_core+0xa38/0x26d4
   rcu_core_si+0x10/0x1c
   __do_softirq+0x2fc/0xd24

  Last potentially related work creation:
   __call_rcu_common.constprop.0+0x6c/0xba0
   call_rcu+0x10/0x1c
   vm_area_free+0x18/0x24
   remove_vma+0xe4/0x118
   do_vmi_align_munmap.isra.0+0x718/0xb5c
   do_vmi_munmap+0xdc/0x1fc
   __vm_munmap+0x10c/0x278
   __arm64_sys_munmap+0x58/0x7c

Fix this issue by performing instead a vma_lookup() which will fail to
find the vma that was isolated before the mmap lock downgrade. Note that
this option has better performance than upgrading to a mmap write lock
which would increase contention. Plus, mmap_write_trylock() has been
recently removed anyway.

Fixes: dd2283f260 ("mm: mmap: zap pages with read mmap_sem in munmap")
Cc: stable@vger.kernel.org
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Minchan Kim <minchan@kernel.org>
Reviewed-by: Alice Ryhl <aliceryhl@google.com>
Signed-off-by: Carlos Llamas <cmllamas@google.com>
Link: https://lore.kernel.org/r/20231201172212.1813387-3-cmllamas@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

Bug: 304651042
(cherry picked from commit 3f489c2067c5824528212b0fc18b28d51332d906
 https://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc.git
 char-misc-next)
Change-Id: I206096ab47666eaee1651a4e102a01e6b7b4e5fb
Signed-off-by: Carlos Llamas <cmllamas@google.com>
2024-01-09 01:15:49 +00:00
Carlos Llamas
3c4732563e FROMGIT: binder: use EPOLLERR from eventpoll.h
Use EPOLLERR instead of POLLERR to make sure it is cast to the correct
__poll_t type. This fixes the following sparse issue:

  drivers/android/binder.c:5030:24: warning: incorrect type in return expression (different base types)
  drivers/android/binder.c:5030:24:    expected restricted __poll_t
  drivers/android/binder.c:5030:24:    got int

Fixes: f88982679f ("binder: check for binder_thread allocation failure in binder_poll()")
Cc: stable@vger.kernel.org
Cc: Eric Biggers <ebiggers@google.com>
Reviewed-by: Alice Ryhl <aliceryhl@google.com>
Signed-off-by: Carlos Llamas <cmllamas@google.com>
Link: https://lore.kernel.org/r/20231201172212.1813387-2-cmllamas@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

Bug: 254650075
(cherry picked from commit 6ac061db9c58ca5b9270b1b3940d2464fb3ff183
 https://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc.git
 char-misc-next)
Change-Id: Ib77d9a6e9059ceab2539c7ef6c93ad74f1eabed8
Signed-off-by: Carlos Llamas <cmllamas@google.com>
2024-01-09 01:15:49 +00:00
qinglin.li
486b17a096 ANDROID: GKI: Update symbol list for Amlogic
2 function symbol(s) added
  'struct drm_private_state* drm_atomic_get_new_private_obj_state(struct drm_atomic_state*, struct drm_private_obj*)'
  'uint64_t drm_format_info_min_pitch(const struct drm_format_info*, int, unsigned int)'

Bug: 319070243
Change-Id: Ia4e314bcb98c7355b208be60e9ee9e5d92ceca21
Signed-off-by: Qinglin Li <qinglin.li@amlogic.com>
2024-01-08 19:44:23 +00:00
Jiri Olsa
bfefe25dfa UPSTREAM: bpf: Fix prog_array_map_poke_run map poke update
commit 4b7de801606e504e69689df71475d27e35336fb3 upstream.

Lee pointed out issue found by syscaller [0] hitting BUG in prog array
map poke update in prog_array_map_poke_run function due to error value
returned from bpf_arch_text_poke function.

There's race window where bpf_arch_text_poke can fail due to missing
bpf program kallsym symbols, which is accounted for with check for
-EINVAL in that BUG_ON call.

The problem is that in such case we won't update the tail call jump
and cause imbalance for the next tail call update check which will
fail with -EBUSY in bpf_arch_text_poke.

I'm hitting following race during the program load:

  CPU 0                             CPU 1

  bpf_prog_load
    bpf_check
      do_misc_fixups
        prog_array_map_poke_track

                                    map_update_elem
                                      bpf_fd_array_map_update_elem
                                        prog_array_map_poke_run

                                          bpf_arch_text_poke returns -EINVAL

    bpf_prog_kallsyms_add

After bpf_arch_text_poke (CPU 1) fails to update the tail call jump, the next
poke update fails on expected jump instruction check in bpf_arch_text_poke
with -EBUSY and triggers the BUG_ON in prog_array_map_poke_run.

Similar race exists on the program unload.

Fixing this by moving the update to bpf_arch_poke_desc_update function which
makes sure we call __bpf_arch_text_poke that skips the bpf address check.

Each architecture has slightly different approach wrt looking up bpf address
in bpf_arch_text_poke, so instead of splitting the function or adding new
'checkip' argument in previous version, it seems best to move the whole
map_poke_run update as arch specific code.

  [0] https://syzkaller.appspot.com/bug?extid=97a4fe20470e9bc30810

Bug: 309551558
Fixes: ebf7d1f508 ("bpf, x64: rework pro/epilogue and tailcall handling in JIT")
Reported-by: syzbot+97a4fe20470e9bc30810@syzkaller.appspotmail.com
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Yonghong Song <yonghong.song@linux.dev>
Cc: Lee Jones <lee@kernel.org>
Cc: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Link: https://lore.kernel.org/bpf/20231206083041.1306660-2-jolsa@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit 13578b4ea4)
Signed-off-by: Lee Jones <joneslee@google.com>
Change-Id: I1291f0589e84f627ee44d07acb24196fab166c29
2024-01-08 16:58:17 +00:00
Aran Dalton
274748592e ANDROID: gki_defconfig: Set CONFIG_IDLE_INJECT and CONFIG_CPU_IDLE_THERMAL into y
Under certain circumstances a SoC can reach a critical temperaturelimit
and is unable to stabilize the temperature around a temperaturecontrol.
The system may ask for a specific power budget butbecause of the OPP
density, we can only choose an OPP with a powerbudget lower than the
requested one and under-utilize the CPU, thuslosing performance. In
other words, one OPP under-utilizes the CPUwith a power less than the
requested power budget and the next OPPexceeds the power budget. The
cpu idle cooling can solve this problem.

Bug: 299411923
Signed-off-by: Aran Dalton <arda@allwinnertech.com>
Change-Id: I1c17b340617e88be075097dc47f30ce94be2a4d7
2024-01-08 16:46:15 +00:00
Sebastian Ene
597362d44f ANDROID: KVM: arm64: Don't prepopulate MMIO regions for host stage-2
As we reserve only 1GB of memory for the MMIO region don't prepopulate
the entire remaining address space with MMIO as this is prone to failure.
Instead, let the MMIO regions to be created lazily on the fault path and
keep only the RAM regions prepopulated.

Bug: 307805059
Test: Boot pKVM with CONFIG_ARM64_16K_PAGES
Change-Id: I6327f42eb17c6588335a1e04736393c9032114ab
Signed-off-by: Sebastian Ene <sebastianene@google.com>
2024-01-05 11:58:05 +00:00
Vincent Donnefort
45d542adb4 ANDROID: KVM: arm64: Fix host_smc print typo
From pKVM point of view, unknown SMCs are simply forwarded, we can't
consider them invalid or not. This was probably a typo following a copy
of the host_hcall event.

Bug: 299430621
Change-Id: Ieb53f985a5187a8b5a9feb4a95982b15cdc1b04a
Signed-off-by: Vincent Donnefort <vdonnefort@google.com>
(cherry picked from commit 717d1f8f91)
2024-01-04 15:43:26 +00:00
Vincent Donnefort
38bb85f2fb ANDROID: KVM: arm64: Fix hyp event alignment
The structures that define hyp events must be packed so they match
their format definitions in the tracefs file
hyp/events/hyp/<event>/format.

Bug: 299430621
Change-Id: Ia7e1a686744d5c9c3f8a21881f03228c8acecade
Signed-off-by: Vincent Donnefort <vdonnefort@google.com>
2024-01-04 10:02:37 +00:00
Vincent Donnefort
c09c7c05d0 ANDROID: KVM: arm64: Document module_change_host_prot_range
When this pKVM module ops has been introduced, the documentation has
been omitted.

Bug: 308373293
Change-Id: I9e471414e72a1ee04c132de4ed95d77e815ae8c9
Signed-off-by: Vincent Donnefort <vdonnefort@google.com>
2024-01-03 18:19:04 +00:00
Roy Luo
80d91f64ba BACKPORT: USB: gadget: core: adjust uevent timing on gadget unbind
The KOBJ_CHANGE uevent is sent before gadget unbind is actually
executed, resulting in inaccurate uevent emitted at incorrect timing
(the uevent would have USB_UDC_DRIVER variable set while it would
soon be removed).
Move the KOBJ_CHANGE uevent to the end of the unbind function so that
uevent is sent only after the change has been made.

Fixes: 2ccea03a8f ("usb: gadget: introduce UDC Class")
Cc: stable@vger.kernel.org
Signed-off-by: Roy Luo <royluo@google.com>
Link: https://lore.kernel.org/r/20231128221756.2591158-1-royluo@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

Bug: 312543856
Change-Id: Ida7fa7e1cfae3d1b3f3348512a67fe91065f25af
(cherry picked from commit 73ea73affe8622bdf292de898da869d441da6a9d)
[royluo: resolved conflicts in drivers/usb/gadget/udc/core.c]
Signed-off-by: Roy Luo <royluo@google.com>
2024-01-02 21:34:11 +00:00
James Tai
19a4494b2b ANDROID: GKI: Update RTK STB KMI symbol list
15 function symbol(s) added
  'int __devm_iio_trigger_register(struct device*, struct iio_trigger*, struct module*)'
  'void __dynamic_netdev_dbg(struct _ddebug*, const struct net_device*, const char*, ...)'
  'struct iio_trigger* devm_iio_trigger_alloc(struct device*, const char*, ...)'
  'irqreturn_t iio_pollfunc_store_time(int, void*)'
  'void netdev_stats_to_stats64(struct rtnl_link_stats64*, const struct net_device_stats*)'
  'int phy_connect_direct(struct net_device*, struct phy_device*, void(*)(struct net_device*), phy_interface_t)'
  'int phy_driver_register(struct phy_driver*, struct module*)'
  'void phy_get_pause(struct phy_device*, bool*, bool*)'
  'int phy_restart_aneg(struct phy_device*)'
  'int phy_resume(struct phy_device*)'
  'void phy_set_asym_pause(struct phy_device*, bool, bool)'
  'int phy_set_max_speed(struct phy_device*, u32)'
  'int phy_speed_down(struct phy_device*, bool)'
  'int phy_speed_up(struct phy_device*)'
  'void phy_support_asym_pause(struct phy_device*)'

Bug: 318044500
Change-Id: Ia377ae3c1a1ad1d56a526ab6b90021995f04110d
Signed-off-by: James Tai <james.tai@realtek.com>
2023-12-29 17:34:31 +08:00
Wu Bo
55f6a96975 UPSTREAM: dm verity: don't perform FEC for failed readahead IO
We found an issue under Android OTA scenario that many BIOs have to do
FEC where the data under dm-verity is 100% complete and no corruption.

Android OTA has many dm-block layers, from upper to lower:
dm-verity
dm-snapshot
dm-origin & dm-cow
dm-linear
ufs

DM tables have to change 2 times during Android OTA merging process.
When doing table change, the dm-snapshot will be suspended for a while.
During this interval, many readahead IOs are submitted to dm_verity
from filesystem. Then the kverity works are busy doing FEC process
which cost too much time to finish dm-verity IO. This causes needless
delay which feels like system is hung.

After adding debugging it was found that each readahead IO needed
around 10s to finish when this situation occurred. This is due to IO
amplification:

dm-snapshot suspend
erofs_readahead     // 300+ io is submitted
	dm_submit_bio (dm_verity)
		dm_submit_bio (dm_snapshot)
		bio return EIO
		bio got nothing, it's empty
	verity_end_io
	verity_verify_io
	forloop range(0, io->n_blocks)    // each io->nblocks ~= 20
		verity_fec_decode
		fec_decode_rsb
		fec_read_bufs
		forloop range(0, v->fec->rsn) // v->fec->rsn = 253
			new_read
			submit_bio (dm_snapshot)
		end loop
	end loop
dm-snapshot resume

Readahead BIOs get nothing while dm-snapshot is suspended, so all of
them will cause verity's FEC.
Each readahead BIO needs to verify ~20 (io->nblocks) blocks.
Each block needs to do FEC, and every block needs to do 253
(v->fec->rsn) reads.
So during the suspend interval(~200ms), 300 readahead BIOs trigger
~1518000 (300*20*253) IOs to dm-snapshot.

As readahead IO is not required by userspace, and to fix this issue,
it is best to pass readahead errors to upper layer to handle it.

Cc: stable@vger.kernel.org
Fixes: a739ff3f54 ("dm verity: add support for forward error correction")
Bug: 316972624
Link: https://lore.kernel.org/dm-devel/b84fb49-bf63-3442-8c99-d565e134f2@redhat.com
Signed-off-by: Wu Bo <bo.wu@vivo.com>
Reviewed-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@kernel.org>
Signed-off-by: Akilesh Kailash <akailash@google.com>
(cherry picked from commit 0193e3966ceeeef69e235975918b287ab093082b)
Change-Id: I73560e5660cebdc1997e1f9926cbb8888789eb46
2023-12-21 22:45:44 +00:00
Zhengchao Shao
781393c0a2 UPSTREAM: ipv4: igmp: fix refcnt uaf issue when receiving igmp query packet
[ Upstream commit e2b706c691905fe78468c361aaabc719d0a496f1 ]

When I perform the following test operations:
1.ip link add br0 type bridge
2.brctl addif br0 eth0
3.ip addr add 239.0.0.1/32 dev eth0
4.ip addr add 239.0.0.1/32 dev br0
5.ip addr add 224.0.0.1/32 dev br0
6.while ((1))
    do
        ifconfig br0 up
        ifconfig br0 down
    done
7.send IGMPv2 query packets to port eth0 continuously. For example,
./mausezahn ethX -c 0 "01 00 5e 00 00 01 00 72 19 88 aa 02 08 00 45 00 00
1c 00 01 00 00 01 02 0e 7f c0 a8 0a b7 e0 00 00 01 11 64 ee 9b 00 00 00 00"

The preceding tests may trigger the refcnt uaf issue of the mc list. The
stack is as follows:
	refcount_t: addition on 0; use-after-free.
	WARNING: CPU: 21 PID: 144 at lib/refcount.c:25 refcount_warn_saturate (lib/refcount.c:25)
	CPU: 21 PID: 144 Comm: ksoftirqd/21 Kdump: loaded Not tainted 6.7.0-rc1-next-20231117-dirty #80
	Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
	RIP: 0010:refcount_warn_saturate (lib/refcount.c:25)
	RSP: 0018:ffffb68f00657910 EFLAGS: 00010286
	RAX: 0000000000000000 RBX: ffff8a00c3bf96c0 RCX: ffff8a07b6160908
	RDX: 00000000ffffffd8 RSI: 0000000000000027 RDI: ffff8a07b6160900
	RBP: ffff8a00cba36862 R08: 0000000000000000 R09: 00000000ffff7fff
	R10: ffffb68f006577c0 R11: ffffffffb0fdcdc8 R12: ffff8a00c3bf9680
	R13: ffff8a00c3bf96f0 R14: 0000000000000000 R15: ffff8a00d8766e00
	FS:  0000000000000000(0000) GS:ffff8a07b6140000(0000) knlGS:0000000000000000
	CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
	CR2: 000055f10b520b28 CR3: 000000039741a000 CR4: 00000000000006f0
	Call Trace:
	<TASK>
	igmp_heard_query (net/ipv4/igmp.c:1068)
	igmp_rcv (net/ipv4/igmp.c:1132)
	ip_protocol_deliver_rcu (net/ipv4/ip_input.c:205)
	ip_local_deliver_finish (net/ipv4/ip_input.c:234)
	__netif_receive_skb_one_core (net/core/dev.c:5529)
	netif_receive_skb_internal (net/core/dev.c:5729)
	netif_receive_skb (net/core/dev.c:5788)
	br_handle_frame_finish (net/bridge/br_input.c:216)
	nf_hook_bridge_pre (net/bridge/br_input.c:294)
	__netif_receive_skb_core (net/core/dev.c:5423)
	__netif_receive_skb_list_core (net/core/dev.c:5606)
	__netif_receive_skb_list (net/core/dev.c:5674)
	netif_receive_skb_list_internal (net/core/dev.c:5764)
	napi_gro_receive (net/core/gro.c:609)
	e1000_clean_rx_irq (drivers/net/ethernet/intel/e1000/e1000_main.c:4467)
	e1000_clean (drivers/net/ethernet/intel/e1000/e1000_main.c:3805)
	__napi_poll (net/core/dev.c:6533)
	net_rx_action (net/core/dev.c:6735)
	__do_softirq (kernel/softirq.c:554)
	run_ksoftirqd (kernel/softirq.c:913)
	smpboot_thread_fn (kernel/smpboot.c:164)
	kthread (kernel/kthread.c:388)
	ret_from_fork (arch/x86/kernel/process.c:153)
	ret_from_fork_asm (arch/x86/entry/entry_64.S:250)
	</TASK>

The root causes are as follows:
Thread A					Thread B
...						netif_receive_skb
br_dev_stop					...
    br_multicast_leave_snoopers			...
        __ip_mc_dec_group			...
            __igmp_group_dropped		igmp_rcv
                igmp_stop_timer			    igmp_heard_query         //ref = 1
                ip_ma_put			        igmp_mod_timer
                    refcount_dec_and_test	            igmp_start_timer //ref = 0
			...                                     refcount_inc //ref increases from 0
When the device receives an IGMPv2 Query message, it starts the timer
immediately, regardless of whether the device is running. If the device is
down and has left the multicast group, it will cause the mc list refcount
uaf issue.

Bug: 316932391
Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Signed-off-by: Zhengchao Shao <shaozhengchao@huawei.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Hangbin Liu <liuhangbin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
(cherry picked from commit 94445d9583)
Signed-off-by: Lee Jones <joneslee@google.com>
Change-Id: I277be2304e564994e05b981ccd6cd8cbb9dc85be
2023-12-21 12:59:34 +00:00
Florian Westphal
97c69470fe UPSTREAM: netfilter: nft_set_pipapo: skip inactive elements during set walk
commit 317eb9685095678f2c9f5a8189de698c5354316a upstream.

Otherwise set elements can be deactivated twice which will cause a crash.

Bug: 316310313
Reported-by: Xingyuan Mo <hdthky0@gmail.com>
Fixes: 3c4287f620 ("nf_tables: Add set type for arbitrary concatenation of ranges")
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit 189c2a8293)
Signed-off-by: Lee Jones <joneslee@google.com>
Change-Id: I27fb6ee806642e23ca02700763a387341dd463e6
2023-12-21 11:16:28 +00:00
Paul Lawrence
16ea59408c ANDROID: fuse-bpf: Follow mounts in lookups
Bug: 292925770
Test: fuse_test run. The following steps on Android also now pass:

	Create /data/123 and /data/media/0/Android/data/45 directories
	Mount /data/123 directory to /data/media/0/Android/data/45 directory
	Create 1.txt under the /data/123 directory

	File 1.txt should appear in /storage/emulated/0/Android/data/45
Signed-off-by: Paul Lawrence <paullawrence@google.com>
(cherry picked from https://android-review.googlesource.com/q/commit:9323938705b42cb4dd863d5cf8022ba8f2282952)
Merged-In: I1fe27d743ca2981e624a9aa87d9ab6deb313aadc
Change-Id: I1fe27d743ca2981e624a9aa87d9ab6deb313aadc
2023-12-21 00:48:14 +00:00
Lee Jones
abea1cb16e ANDROID: Snapshot Mainline's version of checkpatch.pl
Nothing fancy here.  Keeping full history is not required.

  `git checkout mainline/master -- scripts/checkpatch.pl`

This may need to be done periodically.

Bug: 316492624
Signed-off-by: Lee Jones <joneslee@google.com>
Change-Id: I4c90b50197ca7277c59e96bf332ecf795c4f3d12
2023-12-15 20:40:26 +00:00
qinglin.li
9f6f0c1de5 ANDROID: GKI: Update symbol list for Amlogic
Update amlogic abi symbol list

Bug: 316433326
Change-Id: If7464a27ea3feb18f0b68178ae839424ba77142e
Signed-off-by: Qinglin Li <qinglin.li@amlogic.com>
2023-12-15 18:03:14 +00:00
Keir Fraser
a5123cff8d ANDROID: KVM: arm64: Skip prefaulting ptes which will be modified later
Block mappings can be split as part of a page table update. When
prefaulting entries during the split, it is pointless to install
valid ptes which will later be modified by the same walk.

At the same time, push the check for pte_is_counted into the
prefault handler, where it logically belongs.

Bug: 308373293
Change-Id: If4599b2860aa62d82ce8db019a8410c2d883de71
Signed-off-by: Keir Fraser <keirf@google.com>
2023-12-15 08:46:17 +00:00
Keir Fraser
97bbbf4497 ANDROID: KVM: arm64: Introduce module_change_host_prot_range
This allows protection attributes to be changed for a range of
pages via a single module API call.

The original API call modifying a single page is now implemented
as a shim on top of the new range-based call.

The ABI STG is also fixed up:

type 'struct pkvm_module_ops' changed
  member 'union { int(* host_stage2_mod_prot_range)(u64, enum kvm_pgtable_prot, u64); struct { u64 android_kabi_reserved1; }; union { }; }' was added
  member 'u64 android_kabi_reserved1' was removed

Bug: 308373293
Change-Id: I6fbb2e0b325aa972148f48746565dcc10d74edaf
Signed-off-by: Keir Fraser <keirf@google.com>
2023-12-15 08:46:17 +00:00
Keir Fraser
1dc2dbbb57 ANDROID: KVM: arm64: Relax checks in module_change_host_page_prot
Modules can only relax permissions to RWX. This seems rather arbitrary.
Instead, allow any valid permissions to be set, as long as the page is
a pristine host page, or already module owned.

Bug: 308373293
Change-Id: I905786fad6543f47a00bd9b9f07e17dd660d457c
Signed-off-by: Keir Fraser <keirf@google.com>
2023-12-15 08:46:17 +00:00
Keir Fraser
b43baa770b ANDROID: KVM: arm64: Optimise module_change_host_page_prot
Merge the relaxation and restriction paths to both only need to adjust
permissions. This avoids un-map + re-map on the restriction path; and
avoids installing an annotated entry on the relaxation path (which
will cause a translation fault on first access by the host).

Bug: 308373293
Change-Id: I9c7a6ac149aad64b19a5ce7808334188475b27cc
Signed-off-by: Keir Fraser <keirf@google.com>
2023-12-15 08:46:17 +00:00