Commit Graph

1165241 Commits

Author SHA1 Message Date
Carlos Llamas
7e855d1492 Reapply "ANDROID: Add vendor hooks for binder perf tuning"
This reverts commit eeb899e9f54bef5286fd5044db481ecc01e417b4.

Change-Id: I810727a6872c16ccb484023bfbc587daca8a2515
Signed-off-by: Carlos Llamas <cmllamas@google.com>
2024-05-29 20:51:37 +00:00
Carlos Llamas
a091f9709e UPSTREAM: binder: switch alloc->mutex to spinlock_t
The alloc->mutex is a highly contended lock that causes performance
issues on Android devices. When a low-priority task is given this lock
and it sleeps, it becomes difficult for the task to wake up and complete
its work. This delays other tasks that are also waiting on the mutex.

The problem gets worse when there is memory pressure in the system,
because this increases the contention on the alloc->mutex while the
shrinker reclaims binder pages.

Switching to a spinlock helps to keep the waiters running and avoids the
overhead of waking up tasks. This significantly improves the transaction
latency when the problematic scenario occurs.

The performance impact of this patchset was measured by stress-testing
the binder alloc contention. In this test, several clients of different
priorities send thousands of transactions of different sizes to a single
server. In parallel, pages get reclaimed using the shinker's debugfs.

The test was run on a Pixel 8, Pixel 6 and qemu machine. The results
were similar on all three devices:

after:
  | sched  | prio | average | max       | min     |
  |--------+------+---------+-----------+---------|
  | fifo   |   99 | 0.135ms |   1.197ms | 0.022ms |
  | fifo   |   01 | 0.136ms |   5.232ms | 0.018ms |
  | other  |  -20 | 0.180ms |   7.403ms | 0.019ms |
  | other  |   19 | 0.241ms |  58.094ms | 0.018ms |

before:
  | sched  | prio | average | max       | min     |
  |--------+------+---------+-----------+---------|
  | fifo   |   99 | 0.350ms | 248.730ms | 0.020ms |
  | fifo   |   01 | 0.357ms | 248.817ms | 0.024ms |
  | other  |  -20 | 0.399ms | 249.906ms | 0.020ms |
  | other  |   19 | 0.477ms | 297.756ms | 0.022ms |

The key metrics above are the average and max latencies (wall time).
These improvements should roughly translate to p95-p99 latencies on real
workloads. The response time is up to 200x faster in these scenarios and
there is no penalty in the regular path.

Note that it is only possible to convert this lock after a series of
changes made by previous patches. These mainly include refactoring the
sections that might_sleep() and changing the locking order with the
mmap_lock amongst others.

Reviewed-by: Alice Ryhl <aliceryhl@google.com>
Signed-off-by: Carlos Llamas <cmllamas@google.com>
Link: https://lore.kernel.org/r/20231201172212.1813387-29-cmllamas@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

Bug: 254650075
(cherry picked from commit 7710e2cca32e7f3958480e8bd44f50e29d0c2509)
Change-Id: I67121be071d5f072ac0e5eb719c95c0f1dee5eb5
Signed-off-by: Carlos Llamas <cmllamas@google.com>
2024-05-29 20:51:37 +00:00
Carlos Llamas
a4abaacdab UPSTREAM: binder: reverse locking order in shrinker callback
The locking order currently requires the alloc->mutex to be acquired
first followed by the mmap lock. However, the alloc->mutex is converted
into a spinlock in subsequent commits so the order needs to be reversed
to avoid nesting the sleeping mmap lock under the spinlock.

The shrinker's callback binder_alloc_free_page() is the only place that
needs to be reordered since other functions have been refactored and no
longer nest these locks.

Some minor cosmetic changes are also included in this patch.

Signed-off-by: Carlos Llamas <cmllamas@google.com>
Link: https://lore.kernel.org/r/20231201172212.1813387-28-cmllamas@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

Bug: 254650075
(cherry picked from commit e50f4e6cc9bfaca655d3b6a3506d27cf2caa1d40)
Change-Id: I7f7501945a477ac5571082a5dd2a7934f484b8ab
Signed-off-by: Carlos Llamas <cmllamas@google.com>
2024-05-29 20:51:37 +00:00
Carlos Llamas
112ca28d26 UPSTREAM: binder: avoid user addresses in debug logs
Prefer logging vma offsets instead of addresses or simply drop the debug
log altogether if not useful. Note this covers the instances affected by
the switch to store addresses as unsigned long. However, there are other
sections in the driver that could do the same.

Signed-off-by: Carlos Llamas <cmllamas@google.com>
Reviewed-by: Alice Ryhl <aliceryhl@google.com>
Link: https://lore.kernel.org/r/20231201172212.1813387-27-cmllamas@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

Bug: 254650075
(cherry picked from commit 162c79731448a5a052e93af7753df579dfe0bf7a)
Change-Id: I92b7f409e45d9006492d56302e911ccdd8efc950
Signed-off-by: Carlos Llamas <cmllamas@google.com>
2024-05-29 20:51:37 +00:00
Carlos Llamas
0402339efb UPSTREAM: binder: refactor binder_delete_free_buffer()
Skip the freelist call immediately as needed, instead of continuing the
pointless checks. Also, drop the debug logs that we don't really need.

Signed-off-by: Carlos Llamas <cmllamas@google.com>
Reviewed-by: Alice Ryhl <aliceryhl@google.com>
Link: https://lore.kernel.org/r/20231201172212.1813387-26-cmllamas@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

Bug: 254650075
(cherry picked from commit f07b83a48e944c8a1cc1e9f6703fae5e34df2ba4)
Change-Id: I035bd6cd5c06ec984cd6eb3c3b53e0958c64df4f
Signed-off-by: Carlos Llamas <cmllamas@google.com>
2024-05-29 20:51:37 +00:00
Carlos Llamas
aacaa36eaa UPSTREAM: binder: collapse print_binder_buffer() into caller
The code in print_binder_buffer() is quite small so it can be collapsed
into its single caller binder_alloc_print_allocated().

No functional change in this patch.

Signed-off-by: Carlos Llamas <cmllamas@google.com>
Reviewed-by: Alice Ryhl <aliceryhl@google.com>
Link: https://lore.kernel.org/r/20231201172212.1813387-25-cmllamas@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

Bug: 254650075
(cherry picked from commit 8e905217c4543af9cf1754809846157a7dbbb261)
Change-Id: Ic3e2522b4702e60e09be3d5940f88ec8252ac793
Signed-off-by: Carlos Llamas <cmllamas@google.com>
2024-05-29 20:51:37 +00:00
Carlos Llamas
baef4637fc UPSTREAM: binder: document the final page calculation
The code to determine the page range for binder_lru_freelist_del() is
quite obscure. It leverages the buffer_size calculated before doing an
oversized buffer split. This is used to figure out if the last page is
being shared with another active buffer. If so, the page gets trimmed
out of the range as it has been previously removed from the freelist.

This would be equivalent to getting the start page of the next in-use
buffer explicitly. However, the code for this is much larger as we can
see in binder_free_buf_locked() routine. Instead, lets settle on
documenting the tricky step and using better names for now.

I believe an ideal solution would be to count the binder_page->users to
determine when a page should be added or removed from the freelist.
However, this is a much bigger change than what I'm willing to risk at
this time.

Signed-off-by: Carlos Llamas <cmllamas@google.com>
Link: https://lore.kernel.org/r/20231201172212.1813387-24-cmllamas@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

Bug: 254650075
(cherry picked from commit 67dcc880780569ec40391cae4d8299adc1e7a44e)
Change-Id: Iec2466605fe7f8aa338c8313f586cdb7519a36e7
Signed-off-by: Carlos Llamas <cmllamas@google.com>
2024-05-29 20:51:37 +00:00
Carlos Llamas
19d966c1c6 BACKPORT: UPSTREAM: binder: rename lru shrinker utilities
Now that the page allocation step is done separately we should rename
the binder_free_page_range() and binder_allocate_page_range() functions
to provide a more accurate description of what they do. Lets borrow the
freelist concept used in other parts of the kernel for this.

No functional change here.

Signed-off-by: Carlos Llamas <cmllamas@google.com>
Reviewed-by: Alice Ryhl <aliceryhl@google.com>
Link: https://lore.kernel.org/r/20231201172212.1813387-23-cmllamas@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

Bug: 254650075
(cherry picked from commit ea9cdbf0c7273b55e251b2ed8f85794cfadab5d5)
Change-Id: I0d0dfcc6f72d54209da310be2ad5e30f3d722652
[cmllamas: fixed trivial conflicts due to missing commits e33c267ab7
 95a542da5322e]
Signed-off-by: Carlos Llamas <cmllamas@google.com>
2024-05-29 20:51:37 +00:00
Carlos Llamas
aac2b2c731 UPSTREAM: binder: make oversized buffer code more readable
The sections in binder_alloc_new_buf_locked() dealing with oversized
buffers are scattered which makes them difficult to read. Instead,
consolidate this code into a single block to improve readability.

No functional change here.

Signed-off-by: Carlos Llamas <cmllamas@google.com>
Reviewed-by: Alice Ryhl <aliceryhl@google.com>
Link: https://lore.kernel.org/r/20231201172212.1813387-22-cmllamas@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

Bug: 254650075
(cherry picked from commit de0e6573125f8ea7a01a9b05a45b0c73116c73b2)
Change-Id: I62c2cec7341e13d9174b4f0839a1345df7cfd808
Signed-off-by: Carlos Llamas <cmllamas@google.com>
2024-05-29 20:51:37 +00:00
Carlos Llamas
de86cd1e00 UPSTREAM: binder: remove redundant debug log
The debug information in this statement is already logged earlier in the
same function. We can get rid of this duplicate log.

Signed-off-by: Carlos Llamas <cmllamas@google.com>
Reviewed-by: Alice Ryhl <aliceryhl@google.com>
Link: https://lore.kernel.org/r/20231201172212.1813387-21-cmllamas@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

Bug: 254650075
(cherry picked from commit 258ce20ede33c551002705fa1488864fb287752c)
Change-Id: Ie533a55ea10b2af927004f1d0e244b386ba25360
Signed-off-by: Carlos Llamas <cmllamas@google.com>
2024-05-29 20:51:37 +00:00
Carlos Llamas
78dfa69547 UPSTREAM: binder: perform page installation outside of locks
Split out the insertion of pages to be outside of the alloc->mutex in a
separate binder_install_buffer_pages() routine. Since this is no longer
serialized, we must look at the full range of pages used by the buffers.
The installation is protected with mmap_sem in write mode since multiple
tasks might race to install the same page.

Besides avoiding unnecessary nested locking this helps in preparation of
switching the alloc->mutex into a spinlock_t in subsequent patches.

Signed-off-by: Carlos Llamas <cmllamas@google.com>
Reviewed-by: Alice Ryhl <aliceryhl@google.com>
Link: https://lore.kernel.org/r/20231201172212.1813387-20-cmllamas@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

Bug: 254650075
(cherry picked from commit 37ebbb4f73a0d299fa0c7dd043932a2f5fbbb779)
Change-Id: I7b0684310b8824194d7e4a51a1fd67944f8ec06a
Signed-off-by: Carlos Llamas <cmllamas@google.com>
2024-05-29 20:51:37 +00:00
Carlos Llamas
2b6af2f338 UPSTREAM: binder: initialize lru pages in mmap callback
Rather than repeatedly initializing some of the binder_lru_page members
during binder_alloc_new_buf(), perform this initialization just once in
binder_alloc_mmap_handler(), after the pages have been created.

Reviewed-by: Alice Ryhl <aliceryhl@google.com>
Signed-off-by: Carlos Llamas <cmllamas@google.com>
Link: https://lore.kernel.org/r/20231201172212.1813387-19-cmllamas@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

Bug: 254650075
(cherry picked from commit 68aef12d094e4c96d972790f1620415460a4f3cf)
Change-Id: I3197038683f76a5cb98a79d017d1515429df2d73
Signed-off-by: Carlos Llamas <cmllamas@google.com>
2024-05-29 20:51:37 +00:00
Carlos Llamas
48554214a7 UPSTREAM: binder: malloc new_buffer outside of locks
Preallocate new_buffer before acquiring the alloc->mutex and hand it
down to binder_alloc_new_buf_locked(). The new buffer will be used in
the vast majority of requests (measured at 98.2% in field data). The
buffer is discarded otherwise. This change is required in preparation
for transitioning alloc->mutex into a spinlock in subsequent commits.

Signed-off-by: Carlos Llamas <cmllamas@google.com>
Reviewed-by: Alice Ryhl <aliceryhl@google.com>
Link: https://lore.kernel.org/r/20231201172212.1813387-18-cmllamas@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

Bug: 254650075
(cherry picked from commit c7ac30fad18231a1637d38aa8a97d6b4788ed8ad)
Change-Id: Ib7c8eb3c53e8383694a118fabc776a6a22783c75
Signed-off-by: Carlos Llamas <cmllamas@google.com>
2024-05-29 20:51:37 +00:00
Carlos Llamas
796a1cae7b UPSTREAM: binder: refactor page range allocation
Instead of looping through the page range twice to first determine if
the mmap lock is required, simply do it per-page as needed. Split out
all this logic into a separate binder_install_single_page() function.

Reviewed-by: Alice Ryhl <aliceryhl@google.com>
Signed-off-by: Carlos Llamas <cmllamas@google.com>
Link: https://lore.kernel.org/r/20231201172212.1813387-17-cmllamas@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

Bug: 254650075
(cherry picked from commit ea2735ce19c1c6ce0f6011f813a1eea0272c231d)
Change-Id: Ic057e9cfaeb22754f99bdec2a51076cf58c86855
Signed-off-by: Carlos Llamas <cmllamas@google.com>
2024-05-29 20:51:37 +00:00
Carlos Llamas
568a63be9a UPSTREAM: binder: relocate binder_alloc_clear_buf()
Move this function up along with binder_alloc_get_page() so that their
prototypes aren't necessary.

No functional change in this patch.

Reviewed-by: Alice Ryhl <aliceryhl@google.com>
Signed-off-by: Carlos Llamas <cmllamas@google.com>
Link: https://lore.kernel.org/r/20231201172212.1813387-16-cmllamas@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

Bug: 254650075
(cherry picked from commit cbc174a64b8d0ab542752c167dc1334b52b88624)
Change-Id: I0d3c69c9a26c7415308202c4b7868a36b83d089c
Signed-off-by: Carlos Llamas <cmllamas@google.com>
2024-05-29 20:51:37 +00:00
Carlos Llamas
e4ee73a682 UPSTREAM: binder: relocate low space calculation
Move the low async space calculation to debug_low_async_space_locked().
This logic not only fits better here but also offloads some of the many
tasks currently done in binder_alloc_new_buf_locked().

No functional change in this patch.

Reviewed-by: Alice Ryhl <aliceryhl@google.com>
Signed-off-by: Carlos Llamas <cmllamas@google.com>
Link: https://lore.kernel.org/r/20231201172212.1813387-15-cmllamas@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

Bug: 254650075
(cherry picked from commit c13500eaabd2343aa4cbb76b54ec624cb0c0ef8d)
Change-Id: I1b396f59f2a5b6640d8664767f2d45a675af7197
Signed-off-by: Carlos Llamas <cmllamas@google.com>
2024-05-29 20:51:37 +00:00
Carlos Llamas
919daae2b6 UPSTREAM: binder: separate the no-space debugging logic
Move the no-space debugging logic into a separate function. Lets also
mark this branch as unlikely in binder_alloc_new_buf_locked() as most
requests will fit without issue.

Also add a few cosmetic changes and suggestions from checkpatch.

Reviewed-by: Alice Ryhl <aliceryhl@google.com>
Signed-off-by: Carlos Llamas <cmllamas@google.com>
Link: https://lore.kernel.org/r/20231201172212.1813387-14-cmllamas@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

Bug: 254650075
(cherry picked from commit 9409af24e4503d14093b27db9425f7c99e64fef4)
Change-Id: I4ff8ced5728a63815f7d47df9eb9ac85aa0a362d
Signed-off-by: Carlos Llamas <cmllamas@google.com>
2024-05-29 20:51:37 +00:00
Carlos Llamas
a37caf8d4c UPSTREAM: binder: remove pid param in binder_alloc_new_buf()
Binder attributes the buffer allocation to the current->tgid everytime.
There is no need to pass this as a parameter so drop it.

Also add a few touchups to follow the coding guidelines. No functional
changes are introduced in this patch.

Reviewed-by: Alice Ryhl <aliceryhl@google.com>
Signed-off-by: Carlos Llamas <cmllamas@google.com>
Link: https://lore.kernel.org/r/20231201172212.1813387-13-cmllamas@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

Bug: 254650075
(cherry picked from commit 89f71743bf42217dd4092fda703a8e4f6f4e55ac)
Change-Id: Ib21fdc5afd7eeb4723b08913ba40ded762421b0b
Signed-off-by: Carlos Llamas <cmllamas@google.com>
2024-05-29 20:51:37 +00:00
Carlos Llamas
a880c450b7 UPSTREAM: binder: do unlocked work in binder_alloc_new_buf()
Extract non-critical sections from binder_alloc_new_buf_locked() that
don't require holding the alloc->mutex. While we are here, consolidate
the checks for size overflow and zero-sized padding into a separate
sanitized_size() helper function.

Also add a few touchups to follow the coding guidelines.

Signed-off-by: Carlos Llamas <cmllamas@google.com>
Reviewed-by: Alice Ryhl <aliceryhl@google.com>
Link: https://lore.kernel.org/r/20231201172212.1813387-12-cmllamas@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

Bug: 254650075
(cherry picked from commit 377e1684db7a1e23261f3c3ebf76523c0554d512)
Change-Id: I8fc18c06563ad2c26536633034fb3e94b0aaf510
Signed-off-by: Carlos Llamas <cmllamas@google.com>
2024-05-29 20:51:37 +00:00
Carlos Llamas
e5fae62ffb UPSTREAM: binder: split up binder_update_page_range()
The binder_update_page_range() function performs both allocation and
freeing of binder pages. However, these two operations are unrelated and
have no common logic. In fact, when a free operation is requested, the
allocation logic is skipped entirely. This behavior makes the error path
unnecessarily complex. To improve readability of the code, this patch
splits the allocation and freeing operations into separate functions.

No functional changes are introduced by this patch.

Reviewed-by: Alice Ryhl <aliceryhl@google.com>
Signed-off-by: Carlos Llamas <cmllamas@google.com>
Link: https://lore.kernel.org/r/20231201172212.1813387-11-cmllamas@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

Bug: 254650075
(cherry picked from commit 0d35bf3bf2da8d43fd12fea7699dc936999bf96e)
Change-Id: Iaf64f94564d2017c4633f2421c15b0bdee914738
Signed-off-by: Carlos Llamas <cmllamas@google.com>
2024-05-29 20:51:37 +00:00
Carlos Llamas
b66dacff3d UPSTREAM: binder: keep vma addresses type as unsigned long
The vma addresses in binder are currently stored as void __user *. This
requires casting back and forth between the mm/ api which uses unsigned
long. Since we also do internal arithmetic on these addresses we end up
having to cast them _again_ to an integer type.

Lets stop all the unnecessary casting which kills code readability and
store the virtual addresses as the native unsigned long from mm/. Note
that this approach is preferred over uintptr_t as Linus explains in [1].

Opportunistically add a few cosmetic touchups.

Link: https://lore.kernel.org/all/CAHk-=wj2OHy-5e+srG1fy+ZU00TmZ1NFp6kFLbVLMXHe7A1d-g@mail.gmail.com/ [1]
Signed-off-by: Carlos Llamas <cmllamas@google.com>
Reviewed-by: Alice Ryhl <aliceryhl@google.com>
Link: https://lore.kernel.org/r/20231201172212.1813387-10-cmllamas@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

Bug: 254650075
(cherry picked from commit df9aabead791d7a3d59938abe288720f5c1367f7)
Change-Id: Ib2fbaf0ad881973eb77957863f079f986fe0d926
Signed-off-by: Carlos Llamas <cmllamas@google.com>
2024-05-29 20:51:37 +00:00
Carlos Llamas
8b2c4f7ab3 UPSTREAM: binder: remove extern from function prototypes
The kernel coding style does not require 'extern' in function prototypes
in .h files, so remove them from drivers/android/binder_alloc.h as they
are not needed.

No functional changes in this patch.

Reviewed-by: Alice Ryhl <aliceryhl@google.com>
Signed-off-by: Carlos Llamas <cmllamas@google.com>
Link: https://lore.kernel.org/r/20231201172212.1813387-9-cmllamas@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

Bug: 254650075
(cherry picked from commit da483f8b390546fbe36abd72f58d612a8032e2a8)
Change-Id: I75e4ee9cf08fada7378f448bc5992d125174132f
Signed-off-by: Carlos Llamas <cmllamas@google.com>
2024-05-29 20:51:37 +00:00
Carlos Llamas
6a29f5fb4b Revert "ANDROID: Add vendor hooks for binder perf tuning"
This reverts commit 17fff41db8.

The alloc->mutex to spinlock_t patches from [1] are being backported
into this branch. The vendor hooks will be reapplied on top of these
backports in a way that matches the new structure of the code.

Link: https://lore.kernel.org/all/20231201172212.1813387-1-cmllamas@google.com/ [1]
Change-Id: Ic1acdd3401f985614d2d7383bdaabd6d71bb0c44
Signed-off-by: Carlos Llamas <cmllamas@google.com>
2024-05-29 20:51:37 +00:00
Carlos Llamas
9e2c184da1 Revert "ANDROID: vendor_hooks: Add hook for binder_detect_low_async_space_locked"
This reverts commit 7ce117301e.

The alloc->mutex to spinlock_t patches from [1] are being backported
into this branch. The vendor hooks will be reapplied on top of these
backports in a way that matches the new structure of the code.

Link: https://lore.kernel.org/all/20231201172212.1813387-1-cmllamas@google.com/ [1]
Change-Id: I7f4aaab31b4462a40881c596abdcbef835a32e4a
Signed-off-by: Carlos Llamas <cmllamas@google.com>
2024-05-29 20:51:37 +00:00
Carlos Llamas
6c17e786e7 Revert "ANDROID: vendor_hook: rename the the name of hooks"
This reverts commit db91c5d31a.

The alloc->mutex to spinlock_t patches from [1] are being backported
into this branch. The vendor hooks will be reapplied on top of these
backports in a way that matches the new structure of the code.

Link: https://lore.kernel.org/all/20231201172212.1813387-1-cmllamas@google.com/ [1]
Change-Id: I39dd50bb58a08f39942322ee014dd08ebbd83168
Signed-off-by: Carlos Llamas <cmllamas@google.com>
2024-05-29 20:51:37 +00:00
Bian Jin chen
dd875b6366 ANDROID: GKI: Update rockchip symbols for some usb wifi bt.
3 function symbol(s) added
  'int usb_anchor_empty(struct usb_anchor*)'
  'void usb_disable_autosuspend(struct usb_device*)'
  'void usb_reset_endpoint(struct usb_device*, unsigned int)'

Bug: 300024866
Signed-off-by: Bian Jin chen <kenjc.bian@rock-chips.com>
Change-Id: Ib1c613e2aca4ab7f4c29f044829505efd4544ef3
2024-05-29 17:22:31 +00:00
John Stultz
d3c340f987 UPSTREAM: selftests: timers: Fix valid-adjtimex signed left-shift undefined behavior
[ Upstream commit 076361362122a6d8a4c45f172ced5576b2d4a50d ]

The struct adjtimex freq field takes a signed value who's units are in
shifted (<<16) parts-per-million.

Unfortunately for negative adjustments, the straightforward use of:

  freq = ppm << 16 trips undefined behavior warnings with clang:

valid-adjtimex.c:66:6: warning: shifting a negative signed value is undefined [-Wshift-negative-value]
        -499<<16,
        ~~~~^
valid-adjtimex.c:67:6: warning: shifting a negative signed value is undefined [-Wshift-negative-value]
        -450<<16,
        ~~~~^
..

Fix it by using a multiply by (1 << 16) instead of shifting negative values
in the valid-adjtimex test case. Align the values for better readability.

Bug: 339526723
Reported-by: Lee Jones <joneslee@google.com>
Reported-by: Muhammad Usama Anjum <usama.anjum@collabora.com>
Change-Id: Ied611c13a802acf9c7a2427f0a61eb358b571a3d
Signed-off-by: John Stultz <jstultz@google.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Muhammad Usama Anjum <usama.anjum@collabora.com>
Link: https://lore.kernel.org/r/20240409202222.2830476-1-jstultz@google.com
Link: https://lore.kernel.org/lkml/0c6d4f0d-2064-4444-986b-1d1ed782135f@collabora.com/
Signed-off-by: Sasha Levin <sashal@kernel.org>
(cherry picked from commit 1f3484dec9)
Signed-off-by: Edward Liaw <edliaw@google.com>
2024-05-29 15:50:33 +00:00
Paul Lawrence
e302f3a21b ANDROID: incremental-fs: Make work with 16k pages
Bug: 260919895
Test: incfs_test passes
Signed-off-by: Paul Lawrence <paullawrence@google.com>
Change-Id: Ia4fbb6011930b085bc00a36851e9b0e8559d3dc5
(cherry picked from commit 5ac10739bcf2dae9220a7a39392aa41235bc64c2)
2024-05-29 13:21:53 +00:00
Yifan Hong
3f13972470 Revert "BACKPORT: FROMGIT: module: allow UNUSED_KSYMS_WHITELIST ..."
Revert submission 3101887-android14-ksyms-wl

Reason for revert: Restore green in release builds

Reverted changes: /q/submissionid:3101887-android14-ksyms-wl

Change-Id: If86a1a6c7875bace543381575544590823cd092c
2024-05-28 17:13:04 +00:00
Yifan Hong
29f2af3ce7 BACKPORT: FROMGIT: module: allow UNUSED_KSYMS_WHITELIST to be relative against objtree.
If UNUSED_KSYMS_WHITELIST is a file generated
before Kbuild runs, and the source tree is in
a read-only filesystem, the developer must put
the file somewhere and specify an absolute
path to UNUSED_KSYMS_WHITELIST. This worked,
but if IKCONFIG=y, an absolute path is embedded
into .config and eventually into vmlinux, causing
the build to be less reproducible when building
on a different machine.

This patch makes the handling of
UNUSED_KSYMS_WHITELIST to be similar to
MODULE_SIG_KEY.

First, check if UNUSED_KSYMS_WHITELIST is an
absolute path, just as before this patch. If so,
use the path as is.

If it is a relative path, use wildcard to check
the existence of the file below objtree first.
If it does not exist, fall back to the original
behavior of adding $(srctree)/ before the value.

After this patch, the developer can put the generated
file in objtree, then use a relative path against
objtree in .config, eradicating any absolute paths
that may be evaluated differently on different machines.

Signed-off-by: Yifan Hong <elsk@google.com>
Reviewed-by: Elliot Berman <quic_eberman@quicinc.com>
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>

(cherry picked from commit a2e3c811938b4902725e259c03b2d6c539613992
 https://git.kernel.org/pub/scm/linux/kernel/git/mcgrof/linux.git modules-next)
Bug: 333769605

Change-Id: I0696ac8f686329795034ada5a4587af4ecbb774f
[elsk: apply change to gen_autoksyms.sh instead because
 CONFIG_UNUSED_KSYMS_WHITELIST is parsed there. Revert change
 to Makefile.modpost.]
Bug: 342390208
Signed-off-by: Yifan Hong <elsk@google.com>
2024-05-28 16:18:05 +00:00
Matthias Maennich
6820762b5e FROMLIST: kheaders: explicitly define file modes for archived headers
Build environments might be running with different umask settings
resulting in indeterministic file modes for the files contained in
kheaders.tar.xz. The file itself is served with 444, i.e. world
readable. Archive the files explicitly with 744,a+X to improve
reproducibility across build environments.

--mode=0444 is not suitable as directories need to be executable. Also,
444 makes it hard to delete all the readonly files after extraction.

Cc: <stable@vger.kernel.org>
Cc: <linux-kbuild@vger.kernel.org>
Cc: Masahiro Yamada <masahiroy@kernel.org>
Cc: Joel Fernandes <joel@joelfernandes.org>
Link: https://lore.kernel.org/r/20240528113243.827490-2-maennich@google.com
Bug: 342094847
Bug: 342393806
Change-Id: Ib206a6e0abfacf8132bfad8c43a62982062175fa
Signed-off-by: Matthias Maennich <maennich@google.com>
2024-05-28 14:08:23 +00:00
Giuliano Procida
47a00e599b ANDROID: pahole -J -j1 for reproducible BTF
Versions of pahole from 1.22 support multi-threaded operation with
separate CUs being processed independently. This results in
non-deterministic and effectively non-reproducible output for kernel
objects. Later versions of pahole aim to support determinism by
retiring CUs in order.

We regain determinism by restricting parallelism to 1 at the cost of
some performance.

The default parallelism of `pahole -J` is the number of online
processors * 1.1. Experiments on a workstation with 36 cores reveal
that performance is actually worse for `vmlinux` at `-j` (8.9s) than
at `-j3` (7.8s) and the optimum is around `-j9` (4.9s). No parallelism
is slowest (18.8s), but still acceptable for GKI.

Bug: 342094847
Change-Id: Ibd72ac638faa1826f6655b336cc7001591ea70f1
Signed-off-by: Giuliano Procida <gprocida@google.com>
2024-05-28 12:54:49 +00:00
Greg Kroah-Hartman
88690811da Linux 6.1.92
Link: https://lore.kernel.org/r/20240523130332.496202557@linuxfoundation.org
Tested-by: SeongJae Park <sj@kernel.org>
Tested-by: Mark Brown <broonie@kernel.org>
Tested-by: Florian Fainelli <florian.fainelli@broadcom.com>
Tested-by: Linux Kernel Functional Testing <lkft@linaro.org>
Tested-by: Pavel Machek (CIP) <pavel@denx.de>
Tested-by: Shuah Khan <skhan@linuxfoundation.org>
Tested-by: Jon Hunter <jonathanh@nvidia.com>
Tested-by: Salvatore Bonaccorso <carnil@debian.org>
Tested-by: Mateusz Jończyk <mat.jonczyk@o2.pl>
Tested-by: Ron Economos <re@w6rz.net>
Tested-by: Kelsey Steele <kelseysteele@linux.microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-05-25 16:21:36 +02:00
Akira Yokosawa
b1c74dad43 docs: kernel_include.py: Cope with docutils 0.21
commit d43ddd5c91802a46354fa4c4381416ef760676e2 upstream.

Running "make htmldocs" on a newly installed Sphinx 7.3.7 ends up in
a build error:

    Sphinx parallel build error:
    AttributeError: module 'docutils.nodes' has no attribute 'reprunicode'

docutils 0.21 has removed nodes.reprunicode, quote from release note [1]:

  * Removed objects:

    docutils.nodes.reprunicode, docutils.nodes.ensure_str()
        Python 2 compatibility hacks

Sphinx 7.3.0 supports docutils 0.21 [2]:

kernel_include.py, whose origin is misc.py of docutils, uses reprunicode.

Upstream docutils removed the offending line from the corresponding file
(docutils/docutils/parsers/rst/directives/misc.py) in January 2022.
Quoting the changelog [3]:

    Deprecate `nodes.reprunicode` and `nodes.ensure_str()`.

    Drop uses of the deprecated constructs (not required with Python 3).

Do the same for kernel_include.py.

Tested against:
  - Sphinx 2.4.5 (docutils 0.17.1)
  - Sphinx 3.4.3 (docutils 0.17.1)
  - Sphinx 5.3.0 (docutils 0.18.1)
  - Sphinx 6.2.1 (docutils 0.19)
  - Sphinx 7.2.6 (docutils 0.20.1)
  - Sphinx 7.3.7 (docutils 0.21.2)

Link: http://www.docutils.org/RELEASE-NOTES.html#release-0-21-2024-04-09 [1]
Link: https://www.sphinx-doc.org/en/master/changes.html#release-7-3-0-released-apr-16-2024 [2]
Link: https://github.com/docutils/docutils/commit/c8471ce47a24 [3]
Signed-off-by: Akira Yokosawa <akiyks@gmail.com>
Cc: stable@vger.kernel.org
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Link: https://lore.kernel.org/r/faf5fa45-2a9d-4573-9d2e-3930bdc1ed65@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-05-25 16:21:35 +02:00
Thomas Weißschuh
cd82e9620e admin-guide/hw-vuln/core-scheduling: fix return type of PR_SCHED_CORE_GET
commit 8af2d1ab78f2342f8c4c3740ca02d86f0ebfac5a upstream.

sched_core_share_pid() copies the cookie to userspace with
put_user(id, (u64 __user *)uaddr), expecting 64 bits of space.
The "unsigned long" datatype that is documented in core-scheduling.rst
however is only 32 bits large on 32 bit architectures.

Document "unsigned long long" as the correct data type that is always
64bits large.

This matches what the selftest cs_prctl_test.c has been doing all along.

Fixes: 0159bb020c ("Documentation: Add usecases, design and interface for core scheduling")
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/util-linux/df7a25a0-7923-4f8b-a527-5e6f0064074d@t-8ch.de/
Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
Reviewed-by: Chris Hyser <chris.hyser@oracle.com>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Link: https://lore.kernel.org/r/20240423-core-scheduling-cookie-v1-1-5753a35f8dfc@weissschuh.net
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-05-25 16:21:35 +02:00
Jarkko Sakkinen
681935009f KEYS: trusted: Do not use WARN when encode fails
commit 050bf3c793a07f96bd1e2fd62e1447f731ed733b upstream.

When asn1_encode_sequence() fails, WARN is not the correct solution.

1. asn1_encode_sequence() is not an internal function (located
   in lib/asn1_encode.c).
2. Location is known, which makes the stack trace useless.
3. Results a crash if panic_on_warn is set.

It is also noteworthy that the use of WARN is undocumented, and it
should be avoided unless there is a carefully considered rationale to
use it.

Replace WARN with pr_err, and print the return value instead, which is
only useful piece of information.

Cc: stable@vger.kernel.org # v5.13+
Fixes: f221974525 ("security: keys: trusted: use ASN.1 TPM2 key format for the blobs")
Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-05-25 16:21:35 +02:00
AngeloGioacchino Del Regno
1d9e2de245 remoteproc: mediatek: Make sure IPI buffer fits in L2TCM
commit 331f91d86f71d0bb89a44217cc0b2a22810bbd42 upstream.

The IPI buffer location is read from the firmware that we load to the
System Companion Processor, and it's not granted that both the SRAM
(L2TCM) size that is defined in the devicetree node is large enough
for that, and while this is especially true for multi-core SCP, it's
still useful to check on single-core variants as well.

Failing to perform this check may make this driver perform R/W
operations out of the L2TCM boundary, resulting (at best) in a
kernel panic.

To fix that, check that the IPI buffer fits, otherwise return a
failure and refuse to boot the relevant SCP core (or the SCP at
all, if this is single core).

Fixes: 3efa0ea743 ("remoteproc/mediatek: read IPI buffer offset from FW")
Signed-off-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20240321084614.45253-2-angelogioacchino.delregno@collabora.com
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-05-25 16:21:35 +02:00
Daniel Thompson
a6b9c5de4a serial: kgdboc: Fix NMI-safety problems from keyboard reset code
commit b2aba15ad6f908d1a620fd97f6af5620c3639742 upstream.

Currently, when kdb is compiled with keyboard support, then we will use
schedule_work() to provoke reset of the keyboard status.  Unfortunately
schedule_work() gets called from the kgdboc post-debug-exception
handler.  That risks deadlock since schedule_work() is not NMI-safe and,
even on platforms where the NMI is not directly used for debugging, the
debug trap can have NMI-like behaviour depending on where breakpoints
are placed.

Fix this by using the irq work system, which is NMI-safe, to defer the
call to schedule_work() to a point when it is safe to call.

Reported-by: Liuye <liu.yeC@h3c.com>
Closes: https://lore.kernel.org/all/20240228025602.3087748-1-liu.yeC@h3c.com/
Cc: stable@vger.kernel.org
Reviewed-by: Douglas Anderson <dianders@chromium.org>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Link: https://lore.kernel.org/r/20240424-kgdboc_fix_schedule_work-v2-1-50f5a490aec5@linaro.org
Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-05-25 16:21:35 +02:00
Javier Carrasco
3f4be9dbef usb: typec: tipd: fix event checking for tps6598x
commit 409c1cfb5a803f3cf2d17aeaf75c25c4be951b07 upstream.

The current interrupt service routine of the tps6598x only reads the
first 64 bits of the INT_EVENT1 and INT_EVENT2 registers, which means
that any event above that range will be ignored, leaving interrupts
unattended. Moreover, those events will not be cleared, and the device
will keep the interrupt enabled.

This issue has been observed while attempting to load patches, and the
'ReadyForPatch' field (bit 81) of INT_EVENT1 was set.

Given that older versions of the tps6598x (1, 2 and 6) provide 8-byte
registers, a mechanism based on the upper byte of the version register
(0x0F) has been included. The manufacturer has confirmed [1] that this
byte is always 0 for older versions, and either 0xF7 (DH parts) or 0xF9
(DK parts) is returned in newer versions (7 and 8).

Read the complete INT_EVENT registers to handle all interrupts generated
by the device and account for the hardware version to select the
register size.

Link: https://e2e.ti.com/support/power-management-group/power-management/f/power-management-forum/1346521/tps65987d-register-command-to-distinguish-between-tps6591-2-6-and-tps65987-8 [1]
Fixes: 0a4c005bd1 ("usb: typec: driver for TI TPS6598x USB Power Delivery controllers")
Cc: stable@vger.kernel.org
Signed-off-by: Javier Carrasco <javier.carrasco@wolfvision.net>
Link: https://lore.kernel.org/r/20240429-tps6598x_fix_event_handling-v3-2-4e8e58dce489@wolfvision.net
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-05-25 16:21:35 +02:00
Heikki Krogerus
f099b8127d usb: typec: ucsi: displayport: Fix potential deadlock
commit b791a67f68121d69108640d4a3e591d210ffe850 upstream.

The function ucsi_displayport_work() does not access the
connector, so it also must not acquire the connector lock.

This fixes a potential deadlock scenario:

ucsi_displayport_work() -> lock(&con->lock)
typec_altmode_vdm()
dp_altmode_vdm()
dp_altmode_work()
typec_altmode_enter()
ucsi_displayport_enter() -> lock(&con->lock)

Reported-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Fixes: af8622f6a5 ("usb: typec: ucsi: Support for DisplayPort alt mode")
Cc: stable@vger.kernel.org
Signed-off-by: Heikki Krogerus <heikki.krogerus@linux.intel.com>
Link: https://lore.kernel.org/r/20240507134316.161999-1-heikki.krogerus@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-05-25 16:21:35 +02:00
Jose Ignacio Tornos Martinez
17466488ae net: usb: ax88179_178a: fix link status when link is set to down/up
commit ecf848eb934b03959918f5269f64c0e52bc23998 upstream.

The idea was to keep only one reset at initialization stage in order to
reduce the total delay, or the reset from usbnet_probe or the reset from
usbnet_open.

I have seen that restarting from usbnet_probe is necessary to avoid doing
too complex things. But when the link is set to down/up (for example to
configure a different mac address) the link is not correctly recovered
unless a reset is commanded from usbnet_open.

So, detect the initialization stage (first call) to not reset from
usbnet_open after the reset from usbnet_probe and after this stage, always
reset from usbnet_open too (when the link needs to be rechecked).

Apply to all the possible devices, the behavior now is going to be the same.

cc: stable@vger.kernel.org # 6.6+
Fixes: 56f78615bcb1 ("net: usb: ax88179_178a: avoid writing the mac address before first reading")
Reported-by: Isaac Ganoung <inventor500@vivaldi.net>
Reported-by: Yongqin Liu <yongqin.liu@linaro.org>
Signed-off-by: Jose Ignacio Tornos Martinez <jtornosm@redhat.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://lore.kernel.org/r/20240510090846.328201-1-jtornosm@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-05-25 16:21:34 +02:00
Prashanth K
341eb08dbc usb: dwc3: Wait unconditionally after issuing EndXfer command
commit 1d26ba0944d398f88aaf997bda3544646cf21945 upstream.

Currently all controller IP/revisions except DWC3_usb3 >= 310a
wait 1ms unconditionally for ENDXFER completion when IOC is not
set. This is because DWC_usb3 controller revisions >= 3.10a
supports GUCTL2[14: Rst_actbitlater] bit which allows polling
CMDACT bit to know whether ENDXFER command is completed.

Consider a case where an IN request was queued, and parallelly
soft_disconnect was called (due to ffs_epfile_release). This
eventually calls stop_active_transfer with IOC cleared, hence
send_gadget_ep_cmd() skips waiting for CMDACT cleared during
EndXfer. For DWC3 controllers with revisions >= 310a, we don't
forcefully wait for 1ms either, and we proceed by unmapping the
requests. If ENDXFER didn't complete by this time, it leads to
SMMU faults since the controller would still be accessing those
requests.

Fix this by ensuring ENDXFER completion by adding 1ms delay in
__dwc3_stop_active_transfer() unconditionally.

Cc: stable@vger.kernel.org
Fixes: b353eb6dc2 ("usb: dwc3: gadget: Skip waiting for CMDACT cleared during endxfer")
Signed-off-by: Prashanth K <quic_prashk@quicinc.com>
Acked-by: Thinh Nguyen <Thinh.Nguyen@synopsys.com>
Link: https://lore.kernel.org/r/20240502044103.1066350-1-quic_prashk@quicinc.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-05-25 16:21:34 +02:00
Carlos Llamas
e78531e8ca binder: fix max_thread type inconsistency
commit 42316941335644a98335f209daafa4c122f28983 upstream.

The type defined for the BINDER_SET_MAX_THREADS ioctl was changed from
size_t to __u32 in order to avoid incompatibility issues between 32 and
64-bit kernels. However, the internal types used to copy from user and
store the value were never updated. Use u32 to fix the inconsistency.

Fixes: a9350fc859 ("staging: android: binder: fix BINDER_SET_MAX_THREADS declaration")
Reported-by: Arve Hjønnevåg <arve@android.com>
Cc: stable@vger.kernel.org
Signed-off-by: Carlos Llamas <cmllamas@google.com>
Reviewed-by: Alice Ryhl <aliceryhl@google.com>
Link: https://lore.kernel.org/r/20240421173750.3117808-1-cmllamas@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-05-25 16:21:34 +02:00
Srinivasan Shanmugam
92cb363d16 drm/amdgpu: Fix possible NULL dereference in amdgpu_ras_query_error_status_helper()
commit b8d55a90fd55b767c25687747e2b24abd1ef8680 upstream.

Return invalid error code -EINVAL for invalid block id.

Fixes the below:

drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c:1183 amdgpu_ras_query_error_status_helper() error: we previously assumed 'info' could be null (see line 1176)

Suggested-by: Hawking Zhang <Hawking.Zhang@amd.com>
Cc: Tao Zhou <tao.zhou1@amd.com>
Cc: Hawking Zhang <Hawking.Zhang@amd.com>
Cc: Christian König <christian.koenig@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
[Ajay: applied AMDGPU_RAS_BLOCK_COUNT condition to amdgpu_ras_query_error_status()
       as amdgpu_ras_query_error_status_helper() not present in v6.6, v6.1
       amdgpu_ras_query_error_status_helper() was introduced in 8cc0f5669eb6]
Signed-off-by: Ajay Kaher <ajay.kaher@broadcom.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-05-25 16:21:34 +02:00
Mark Rutland
a94cf76604 arm64: atomics: lse: remove stale dependency on JUMP_LABEL
commit 657eef0a54 upstream.

Currently CONFIG_ARM64_USE_LSE_ATOMICS depends upon CONFIG_JUMP_LABEL,
as the inline atomics were indirected with a static branch.

However, since commit:

  21fb26bfb0 ("arm64: alternatives: add alternative_has_feature_*()")

... we use an alternative_branch (which is always available) rather than
a static branch, and hence the dependency is unnecessary.

Remove the stale dependency, along with the stale include. This will
allow the use of LSE atomics in kernels built with CONFIG_JUMP_LABEL=n,
and reduces the risk of circular header dependencies via <asm/lse.h>.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20221114125424.2998268-1-mark.rutland@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Oleksandr Tymoshenko <ovt@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-05-25 16:21:34 +02:00
Eric Sandeen
d9a85a8d82 xfs: short circuit xfs_growfs_data_private() if delta is zero
[ Upstream commit 84712492e6dab803bf595fb8494d11098b74a652 ]

Although xfs_growfs_data() doesn't call xfs_growfs_data_private()
if in->newblocks == mp->m_sb.sb_dblocks, xfs_growfs_data_private()
further massages the new block count so that we don't i.e. try
to create a too-small new AG.

This may lead to a delta of "0" in xfs_growfs_data_private(), so
we end up in the shrink case and emit the EXPERIMENTAL warning
even if we're not changing anything at all.

Fix this by returning straightaway if the block delta is zero.

(nb: in older kernels, the result of entering the shrink case
with delta == 0 may actually let an -ENOSPC escape to userspace,
which is confusing for users.)

Fixes: fb2fc17201 ("xfs: support shrinking unused space in the last AG")
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Reviewed-by: "Darrick J. Wong" <djwong@kernel.org>
Signed-off-by: Chandan Babu R <chandanbabu@kernel.org>
Signed-off-by: Leah Rumancik <leah.rumancik@gmail.com>
Acked-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-05-25 16:21:34 +02:00
Hironori Shiina
fbdf080691 xfs: get root inode correctly at bulkstat
[ Upstream commit 817644fa45 ]

The root inode number should be set to `breq->startino` for getting stat
information of the root when XFS_BULK_IREQ_SPECIAL_ROOT is used.
Otherwise, the inode search is started from 1
(XFS_BULK_IREQ_SPECIAL_ROOT) and the inode with the lowest number in a
filesystem is returned.

Fixes: bf3cb39447 ("xfs: allow single bulkstat of special inodes")
Signed-off-by: Hironori Shiina <shiina.hironori@fujitsu.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Leah Rumancik <leah.rumancik@gmail.com>
Acked-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-05-25 16:21:34 +02:00
Darrick J. Wong
7430ff84c2 xfs: fix log recovery when unknown rocompat bits are set
[ Upstream commit 74ad4693b6 ]

Log recovery has always run on read only mounts, even where the primary
superblock advertises unknown rocompat bits.  Due to a misunderstanding
between Eric and Darrick back in 2018, we accidentally changed the
superblock write verifier to shutdown the fs over that exact scenario.
As a result, the log cleaning that occurs at the end of the mounting
process fails if there are unknown rocompat bits set.

As we now allow writing of the superblock if there are unknown rocompat
bits set on a RO mount, we no longer want to turn off RO state to allow
log recovery to succeed on a RO mount.  Hence we also remove all the
(now unnecessary) RO state toggling from the log recovery path.

Fixes: 9e037cb797 ("xfs: check for unknown v5 feature bits in superblock write verifier"
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Leah Rumancik <leah.rumancik@gmail.com>
Acked-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-05-25 16:21:33 +02:00
Darrick J. Wong
4db0e08ef9 xfs: allow inode inactivation during a ro mount log recovery
[ Upstream commit 76e589013f ]

In the next patch, we're going to prohibit log recovery if the primary
superblock contains an unrecognized rocompat feature bit even on
readonly mounts.  This requires removing all the code in the log
mounting process that temporarily disables the readonly state.

Unfortunately, inode inactivation disables itself on readonly mounts.
Clearing the iunlinked lists after log recovery needs inactivation to
run to free the unreferenced inodes, which (AFAICT) is the only reason
why log mounting plays games with the readonly state in the first place.

Therefore, change the inactivation predicates to allow inactivation
during log recovery of a readonly mount.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Leah Rumancik <leah.rumancik@gmail.com>
Acked-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-05-25 16:21:33 +02:00
Darrick J. Wong
2cc027623e xfs: invalidate xfs_bufs when allocating cow extents
[ Upstream commit ddfdd530e4 ]

While investigating test failures in xfs/17[1-3] in alwayscow mode, I
noticed through code inspection that xfs_bmap_alloc_userdata isn't
setting XFS_ALLOC_USERDATA when allocating extents for a file's CoW
fork.  COW staging extents should be flagged as USERDATA, since user
data are persisted to these blocks before being remapped into a file.

This mis-classification has a few impacts on the behavior of the system.
First, the filestreams allocator is supposed to keep allocating from a
chosen AG until it runs out of space in that AG.  However, it only does
that for USERDATA allocations, which means that COW allocations aren't
tied to the filestreams AG.  Fortunately, few people use filestreams, so
nobody's noticed.

A more serious problem is that xfs_alloc_ag_vextent_small looks for a
buffer to invalidate *if* the USERDATA flag is set and the AG is so full
that the allocation had to come from the AGFL because the cntbt is
empty.  The consequences of not invalidating the buffer are severe --
if the AIL incorrectly checkpoints a buffer that is now being used to
store user data, that action will clobber the user's written data.

Fix filestreams and yet another data corruption vector by flagging COW
allocations as USERDATA.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Leah Rumancik <leah.rumancik@gmail.com>
Acked-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-05-25 16:21:33 +02:00