Commit Graph

1155932 Commits

Author SHA1 Message Date
Gao Xiang
a18efa4e4a FROMGIT: erofs: fix ztailpacking for subpage compressed blocks
`pageofs_in` should be the compressed data offset of the page rather
than of the block.

Acked-by: Chao Yu <chao@kernel.org>
Reviewed-by: Yue Hu <huyue2@coolpad.com>
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20231214161337.753049-1-hsiangkao@linux.alibaba.com

Bug: 318378021
Change-Id: I0997a69b22b0f42c327c810359f55f5fa6a76275
(cherry picked from commit e5aba911dee5e20fa82efbe13e0af8f38ea459e7
 https://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs.git dev)
Signed-off-by: Sandeep Dhavale <dhavale@google.com>
2024-01-03 18:37:43 +00:00
Gao Xiang
0c6a18c75b BACKPORT: FROMGIT: erofs: fix up compacted indexes for block size < 4096
Previously, the block size always equaled to PAGE_SIZE, therefore
`lclusterbits` couldn't be less than 12.

Since sub-page compressed blocks are now considered, `lobits` for
a lcluster in each pack cannot always be `lclusterbits` as before.
Otherwise, there is no enough room for the special value
`Z_EROFS_VLE_DI_D0_CBLKCNT`.

To support smaller block sizes, `lobits` for each compacted lcluster is
now calculated as:
   lobits = max(lclusterbits, ilog2(Z_EROFS_VLE_DI_D0_CBLKCNT) + 1)

Reviewed-by: Yue Hu <huyue2@coolpad.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20231206091057.87027-4-hsiangkao@linux.alibaba.com

Bug: 318378021
Change-Id: Iacd89e2b33ddf39ea40b90e88a2bf99bb5a83b31
(cherry picked from commit 8d2517aaeea3ab8651bb517bca8f3c8664d318ea
 https: //git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs.git dev)
[dhavale: resolved conflicts in zmap.c due to older naming of constants
and updated commit message also to use the older names]
Signed-off-by: Sandeep Dhavale <dhavale@google.com>
2024-01-03 18:37:43 +00:00
Gao Xiang
d7bb85f1cb FROMGIT: erofs: record pclustersize in bytes instead of pages
Currently, compressed sizes are recorded in pages using `pclusterpages`,
However, for tailpacking pclusters, `tailpacking_size` is used instead.

This approach doesn't work when dealing with sub-page blocks. To address
this, let's switch them to the unified `pclustersize` in bytes.

Reviewed-by: Yue Hu <huyue2@coolpad.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20231206091057.87027-3-hsiangkao@linux.alibaba.com

Bug: 318378021
Change-Id: Ia8c50a7b4adcf6cd161b1d6f8bfc5a7fd3371079
(cherry picked from commit 54ed3fdd66055d073cb1cd2c6c65bbc0683c40cf
 https: //git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs.git dev)
Signed-off-by: Sandeep Dhavale <dhavale@google.com>
2024-01-03 18:37:43 +00:00
Gao Xiang
9d259220ac FROMGIT: erofs: support I/O submission for sub-page compressed blocks
Add a basic I/O submission path first to support sub-page blocks:

 - Temporary short-lived pages will be used entirely;

 - In-place I/O pages can be used partially, but compressed pages need
   to be able to be mapped in contiguous virtual memory.

As a start, currently cache decompression is explicitly disabled for
sub-page blocks, which will be supported in the future.

Reviewed-by: Yue Hu <huyue2@coolpad.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20231206091057.87027-2-hsiangkao@linux.alibaba.com

Bug: 318378021
Change-Id: Ib2cb6120805ab479a450580fc8774af131271791
(cherry picked from commit 192351616a9dde686492bcb9d1e4895a1411a527
 https: //git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs.git dev)
Signed-off-by: Sandeep Dhavale <dhavale@google.com>
2024-01-03 18:37:43 +00:00
Gao Xiang
8a49ea9441 FROMGIT: erofs: fix lz4 inplace decompression
Currently EROFS can map another compressed buffer for inplace
decompression, that was used to handle the cases that some pages of
compressed data are actually not in-place I/O.

However, like most simple LZ77 algorithms, LZ4 expects the compressed
data is arranged at the end of the decompressed buffer and it
explicitly uses memmove() to handle overlapping:
  __________________________________________________________
 |_ direction of decompression --> ____ |_ compressed data _|

Although EROFS arranges compressed data like this, it typically maps two
individual virtual buffers so the relative order is uncertain.
Previously, it was hardly observed since LZ4 only uses memmove() for
short overlapped literals and x86/arm64 memmove implementations seem to
completely cover it up and they don't have this issue.  Juhyung reported
that EROFS data corruption can be found on a new Intel x86 processor.
After some analysis, it seems that recent x86 processors with the new
FSRM feature expose this issue with "rep movsb".

Let's strictly use the decompressed buffer for lz4 inplace
decompression for now.  Later, as an useful improvement, we could try
to tie up these two buffers together in the correct order.

Reported-and-tested-by: Juhyung Park <qkrwngud825@gmail.com>
Closes: https://lore.kernel.org/r/CAD14+f2AVKf8Fa2OO1aAUdDNTDsVzzR6ctU_oJSmTyd6zSYR2Q@mail.gmail.com
Fixes: 0ffd71bcc3 ("staging: erofs: introduce LZ4 decompression inplace")
Fixes: 598162d050 ("erofs: support decompress big pcluster for lz4 backend")
Cc: stable <stable@vger.kernel.org> # 5.4+
Tested-by: Yifan Zhao <zhaoyifan@sjtu.edu.cn>
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20231206045534.3920847-1-hsiangkao@linux.alibaba.com

Bug: 318378021
Change-Id: Ifd2981320f9f79b27bc7484d8906501a2fa05359
(cherry picked from commit 3c12466b6b7bf1e56f9b32c366a3d83d87afb4de
 https://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs.git dev)
Signed-off-by: Sandeep Dhavale <dhavale@google.com>
2024-01-03 18:37:43 +00:00
Gao Xiang
bdc5d268ba FROMGIT: erofs: fix memory leak on short-lived bounced pages
Both MicroLZMA and DEFLATE algorithms can use short-lived pages on
demand for the overlapped inplace I/O decompression.

However, those short-lived pages are actually added to
`be->compressed_pages`.  Thus, it should be checked instead of
`pcl->compressed_bvecs`.

The LZ4 algorithm doesn't work like this, so it won't be impacted.

Fixes: 67139e36d9 ("erofs: introduce `z_erofs_parse_in_bvecs'")
Reviewed-by: Yue Hu <huyue2@coolpad.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20231128180431.4116991-1-hsiangkao@linux.alibaba.com

Bug: 318378021
Change-Id: Ia1f602e9944b884022a3e20db12af568304fd80c
(cherry picked from commit 93d6fda7f926451a0fa1121b9558d75ca47e861e
https://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs.git dev)
Signed-off-by: Sandeep Dhavale <dhavale@google.com>
2024-01-03 18:37:43 +00:00
Gao Xiang
0d329bbe5c BACKPORT: erofs: tidy up z_erofs_do_read_page()
- Fix a typo: spiltted => split;

 - Move !EROFS_MAP_MAPPED and EROFS_MAP_FRAGMENT upwards;

 - Increase `split` in advance to avoid unnecessary repeats.

Reviewed-by: Yue Hu <huyue2@coolpad.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20230817082813.81180-4-hsiangkao@linux.alibaba.com

Bug: 318378021
Change-Id: I465fd33c7cbbe91d5da4b4ee2343a7b319534148
(cherry picked from commit e4c1cf523d)
[dhavale: resolved small conflict in zdata.c in z_erofs_do_read_page()]
Signed-off-by: Sandeep Dhavale <dhavale@google.com>
2024-01-03 18:37:43 +00:00
Gao Xiang
dc94c3cc6b UPSTREAM: erofs: move preparation logic into z_erofs_pcluster_begin()
Some preparation logic should be part of z_erofs_pcluster_begin()
instead of z_erofs_do_read_page().  Let's move now.

Reviewed-by: Yue Hu <huyue2@coolpad.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20230817082813.81180-3-hsiangkao@linux.alibaba.com

Bug: 318378021
(cherry picked from commit aeebae9d77)
Change-Id: I4bf438d719742a18a6f3065a78bf027de5dae293
Signed-off-by: Sandeep Dhavale <dhavale@google.com>
2024-01-03 18:37:43 +00:00
Gao Xiang
7751567a71 BACKPORT: erofs: avoid obsolete {collector,collection} terms
{collector,collection} were once reserved in order to indicate different
runtime logical extent instance of multi-reference pclusters.

However, de-duplicated decompression has been landed in a more flexable
way, thus `struct z_erofs_collection` was formally removed in commit
87ca34a706 ("erofs: get rid of `struct z_erofs_collection'").

Let's handle the remaining leftovers, for example:
    `z_erofs_collector_begin` => `z_erofs_pcluster_begin`
    `z_erofs_collector_end` => `z_erofs_pcluster_end`

as well as some comments.  No logic changes.

Reviewed-by: Yue Hu <huyue2@coolpad.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20230817082813.81180-2-hsiangkao@linux.alibaba.com

Bug: 318378021
Change-Id: I61b812b5ae3dd564e52012d082415b1fc198383d
(cherry picked from commit dcba1b232e)
[dhavale: fixed minor conflict zdata.c in z_erofs_do_read_page()]
Signed-off-by: Sandeep Dhavale <dhavale@google.com>
2024-01-03 18:37:43 +00:00
Gao Xiang
d0dbf74792 BACKPORT: erofs: simplify z_erofs_read_fragment()
A trivial cleanup to make the fragment handling logic more clear.

Reviewed-by: Yue Hu <huyue2@coolpad.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20230817082813.81180-1-hsiangkao@linux.alibaba.com

Bug: 318378021
Change-Id: I50c09c65b7d3da5022cfc2ede27aa31a1b331d29
(cherry picked from commit 8b00be163f)
[dhavale: resolved conflict around erofs_bread() in zdata.c]
Signed-off-by: Sandeep Dhavale <dhavale@google.com>
2024-01-03 18:37:43 +00:00
Gao Xiang
4067dd9969 UPSTREAM: erofs: get rid of the remaining kmap_atomic()
It's unnecessary to use kmap_atomic() compared with kmap_local_page().
In addition, kmap_atomic() is deprecated now.

Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Reviewed-by: Yue Hu <huyue2@coolpad.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Link: https://lore.kernel.org/r/20230627161240.331-1-hsiangkao@linux.alibaba.com
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>

Bug: 318378021
(cherry picked from commit 123ec246eb)
Change-Id: I7efee861bb4f079fe6b79123d554be2e1867d13b
Signed-off-by: Sandeep Dhavale <dhavale@google.com>
2024-01-03 18:37:43 +00:00
Gao Xiang
365ca16da2 UPSTREAM: erofs: simplify z_erofs_transform_plain()
Use memcpy_to_page() instead of open-coding them.

In addition, add a missing flush_dcache_page() even though almost all
modern architectures clear `PG_dcache_clean` flag for new file cache
pages so that it doesn't change anything in practice.

Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Reviewed-by: Yue Hu <huyue2@coolpad.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Link: https://lore.kernel.org/r/20230627161240.331-2-hsiangkao@linux.alibaba.com
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>

Bug: 318378021
(cherry picked from commit c5539762f3)
Change-Id: I4cb665b592936502ca95e2aee20e1c3a56103ff5
Signed-off-by: Sandeep Dhavale <dhavale@google.com>
2024-01-03 18:37:43 +00:00
Gao Xiang
187d034575 BACKPORT: erofs: adapt managed inode operations into folios
This patch gets rid of erofs_try_to_free_cached_page() and fold it
into .release_folio().

It also moves managed inode operations into zdata.c, which simplifies
the code a bit.  No logic changes.

Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Reviewed-by: Yue Hu <huyue2@coolpad.com>
Link: https://lore.kernel.org/r/20230526201459.128169-5-hsiangkao@linux.alibaba.com

Bug: 318378021
Change-Id: I5cb1e44769f68edce788cb4f8084bb3d45b594b3
(cherry picked from commit 7b4e372c36)
[dhavale: changes to internal.h applied manually]
Signed-off-by: Sandeep Dhavale <dhavale@google.com>
2024-01-03 18:37:43 +00:00
Gao Xiang
3d93182661 UPSTREAM: erofs: avoid on-stack pagepool directly passed by arguments
On-stack pagepool is used so that short-lived temporary pages could be
shared within a single I/O request (e.g. among multiple pclusters).

Moving the remaining frontend-related uses into
z_erofs_decompress_frontend to avoid too many arguments.

Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Reviewed-by: Yue Hu <huyue2@coolpad.com>
Link: https://lore.kernel.org/r/20230526201459.128169-3-hsiangkao@linux.alibaba.com

Bug: 318378021
(cherry picked from commit 6ab5eed600)
Change-Id: I57d3ba6087904bb40c55b780aca50c16bfba2c0f
Signed-off-by: Sandeep Dhavale <dhavale@google.com>
2024-01-03 18:37:43 +00:00
Gao Xiang
5c1827383a UPSTREAM: erofs: allocate extra bvec pages directly instead of retrying
If non-bootstrap bvecs cannot be kept in place (very rarely), an extra
short-lived page is allocated.

Let's just allocate it immediately rather than do unnecessary -EAGAIN
return first and retry as a cleanup.  Also it's unnecessary to use
__GFP_NOFAIL here since we could gracefully fail out this case instead.

Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Reviewed-by: Yue Hu <huyue2@coolpad.com>
Link: https://lore.kernel.org/r/20230526201459.128169-2-hsiangkao@linux.alibaba.com

Bug: 318378021
(cherry picked from commit 05b63d2beb)
Change-Id: I2ac45a943060406bcbb741c5f7aa1094f783f906
Signed-off-by: Sandeep Dhavale <dhavale@google.com>
2024-01-03 18:37:43 +00:00
Yue Hu
bed20ed1d3 UPSTREAM: erofs: clean up z_erofs_pcluster_readmore()
`end` parameter is no needed since it's pointless for !backmost, we can
handle it with backmost internally.  And we only expand the trailing
edge, so the newstart can be replaced with ->headoffset.

Also, remove linux/prefetch.h inclusion since that is not used anymore
after commit 386292919c ("erofs: introduce readmore decompression
strategy").

Signed-off-by: Yue Hu <huyue2@coolpad.com>
Reviewed-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20230525072605.17857-1-zbestahu@gmail.com
[ Gao Xiang: update commit description. ]
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>

Bug: 318378021
(cherry picked from commit 796e9149a2)
Change-Id: I9412c4111800077c876a43c4256ce9760a7d902e
Signed-off-by: Sandeep Dhavale <dhavale@google.com>
2024-01-03 18:37:43 +00:00
Yue Hu
5e861fa97e UPSTREAM: erofs: remove the member readahead from struct z_erofs_decompress_frontend
The struct member is only used to add REQ_RAHEAD during I/O submission.
So it is cleaner to pass it as a parameter than keep it in the struct.

Also, rename function z_erofs_get_sync_decompress_policy() to
z_erofs_is_sync_decompress() for better clarity and conciseness.

Signed-off-by: Yue Hu <huyue2@coolpad.com>
Reviewed-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20230524063944.1655-1-zbestahu@gmail.com
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>

Bug: 318378021
(cherry picked from commit ef4b4b46c6)
Change-Id: I59cc13e7499968a1e93e13df1cb43a5123d510d9
Signed-off-by: Sandeep Dhavale <dhavale@google.com>
2024-01-03 18:37:43 +00:00
Yue Hu
66595bb17c UPSTREAM: erofs: fold in z_erofs_decompress()
No need this helper since it's just a simple wrapper for decompress
method and only one caller.  So, let's fold in directly instead.

Signed-off-by: Yue Hu <huyue2@coolpad.com>
Reviewed-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20230426084449.12781-1-zbestahu@gmail.com
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>

Bug: 318378021
(cherry picked from commit 597e2953ae)
Change-Id: I849360f088016cf97542858e8a5a9cee671a2f61
Signed-off-by: Sandeep Dhavale <dhavale@google.com>
2024-01-03 18:37:43 +00:00
Jingbo Xu
88a1939504 UPSTREAM: erofs: enable large folios for iomap mode
Enable large folios for iomap mode.  Then the readahead routine will
pass down large folios containing multiple pages.

Let's enable this for non-compressed format for now, until the
compression part supports large folios later.

When large folios supported, the iomap routine will allocate iomap_page
for each large folio and thus we need iomap_release_folio() and
iomap_invalidate_folio() to free iomap_page when these folios get
reclaimed or invalidated.

Signed-off-by: Jingbo Xu <jefflexu@linux.alibaba.com>
Reviewed-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Link: https://lore.kernel.org/r/20221130060455.44532-1-jefflexu@linux.alibaba.com
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>

Bug: 318378021
Change-Id: Iedbb9a2daf132399b7a1b5ea6905977ba123ba3c
(cherry picked from commit ce529cc25b)
Signed-off-by: Sandeep Dhavale <dhavale@google.com>
2024-01-03 18:37:43 +00:00
leonardian
2c085909e7 ANDROID: Update the ABI symbol list
Adding the following symbols:
  - _dev_alert

Bug: 311337219
Change-Id: Iaf6710842c45921ccfbacd1361e0b57401cf65d9
Signed-off-by: leonardian <leonardian@google.com>
2024-01-03 11:28:59 +00:00
Roy Luo
d16a15fde5 UPSTREAM: USB: gadget: core: adjust uevent timing on gadget unbind
The KOBJ_CHANGE uevent is sent before gadget unbind is actually
executed, resulting in inaccurate uevent emitted at incorrect timing
(the uevent would have USB_UDC_DRIVER variable set while it would
soon be removed).
Move the KOBJ_CHANGE uevent to the end of the unbind function so that
uevent is sent only after the change has been made.

Fixes: 2ccea03a8f ("usb: gadget: introduce UDC Class")
Cc: stable@vger.kernel.org
Signed-off-by: Roy Luo <royluo@google.com>
Link: https://lore.kernel.org/r/20231128221756.2591158-1-royluo@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

Bug: 312543856
Change-Id: Ida7fa7e1cfae3d1b3f3348512a67fe91065f25af
(cherry picked from commit 73ea73affe8622bdf292de898da869d441da6a9d)
Signed-off-by: Roy Luo <royluo@google.com>
2024-01-02 21:26:12 +00:00
xieliujie
d3006fb944 ANDROID: ABI: Update oplus symbol list
1 function symbol(s) added
  'int __traceiter_android_vh_rt_mutex_steal(void*, int, int, bool*)'

1 variable symbol(s) added
  'struct tracepoint __tracepoint_android_vh_rt_mutex_steal'

Bug: 317670024
Change-Id: I28f0379adaec041400e49cbd1e497b2f8c5c893d
Signed-off-by: xeiliujie <xieliujie@oppo.com>
2023-12-25 15:22:53 +08:00
xieliujie
bc97d5019a ANDROID: vendor_hooks: Add hooks for rt_mutex steal
Add hooks at rt_mutex_steal function so that oems can decide
whether tasks with the same priority steal the rt_mutex or
not. We did experiments and found that rt_mutex throughput
can benefit a lot when threads with the same priority can
steal the rt_mutex lock.

Bug: 317670024
Change-Id: Id60a7a41c6c77a67808982d3667946cabe4acc8f
Signed-off-by: xeiliujie <xieliujie@oppo.com>
2023-12-25 15:22:46 +08:00
Wu Bo
401a2769d9 UPSTREAM: dm verity: don't perform FEC for failed readahead IO
We found an issue under Android OTA scenario that many BIOs have to do
FEC where the data under dm-verity is 100% complete and no corruption.

Android OTA has many dm-block layers, from upper to lower:
dm-verity
dm-snapshot
dm-origin & dm-cow
dm-linear
ufs

DM tables have to change 2 times during Android OTA merging process.
When doing table change, the dm-snapshot will be suspended for a while.
During this interval, many readahead IOs are submitted to dm_verity
from filesystem. Then the kverity works are busy doing FEC process
which cost too much time to finish dm-verity IO. This causes needless
delay which feels like system is hung.

After adding debugging it was found that each readahead IO needed
around 10s to finish when this situation occurred. This is due to IO
amplification:

dm-snapshot suspend
erofs_readahead     // 300+ io is submitted
	dm_submit_bio (dm_verity)
		dm_submit_bio (dm_snapshot)
		bio return EIO
		bio got nothing, it's empty
	verity_end_io
	verity_verify_io
	forloop range(0, io->n_blocks)    // each io->nblocks ~= 20
		verity_fec_decode
		fec_decode_rsb
		fec_read_bufs
		forloop range(0, v->fec->rsn) // v->fec->rsn = 253
			new_read
			submit_bio (dm_snapshot)
		end loop
	end loop
dm-snapshot resume

Readahead BIOs get nothing while dm-snapshot is suspended, so all of
them will cause verity's FEC.
Each readahead BIO needs to verify ~20 (io->nblocks) blocks.
Each block needs to do FEC, and every block needs to do 253
(v->fec->rsn) reads.
So during the suspend interval(~200ms), 300 readahead BIOs trigger
~1518000 (300*20*253) IOs to dm-snapshot.

As readahead IO is not required by userspace, and to fix this issue,
it is best to pass readahead errors to upper layer to handle it.

Cc: stable@vger.kernel.org
Fixes: a739ff3f54 ("dm verity: add support for forward error correction")
Bug: 316972624
Link: https://lore.kernel.org/dm-devel/b84fb49-bf63-3442-8c99-d565e134f2@redhat.com
Signed-off-by: Wu Bo <bo.wu@vivo.com>
Reviewed-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@kernel.org>
Signed-off-by: Akilesh Kailash <akailash@google.com>
(cherry picked from commit 0193e3966ceeeef69e235975918b287ab093082b)
Change-Id: I73560e5660cebdc1997e1f9926cbb8888789eb46
2023-12-21 22:46:28 +00:00
Florian Westphal
30bca9e278 UPSTREAM: netfilter: nft_set_pipapo: skip inactive elements during set walk
commit 317eb9685095678f2c9f5a8189de698c5354316a upstream.

Otherwise set elements can be deactivated twice which will cause a crash.

Bug: 316310313
Reported-by: Xingyuan Mo <hdthky0@gmail.com>
Fixes: 3c4287f620 ("nf_tables: Add set type for arbitrary concatenation of ranges")
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit 189c2a8293)
Signed-off-by: Lee Jones <joneslee@google.com>
Change-Id: I27fb6ee806642e23ca02700763a387341dd463e6
2023-12-21 11:15:42 +00:00
Charan Teja Kalla
44702d8fa1 FROMLIST: mm: migrate high-order folios in swap cache correctly
Large folios occupy N consecutive entries in the swap cache instead of
using multi-index entries like the page cache.  However, if a large folio
is re-added to the LRU list, it can be migrated.  The migration code was
not aware of the difference between the swap cache and the page cache and
assumed that a single xas_store() would be sufficient.

This leaves potentially many stale pointers to the now-migrated folio in
the swap cache, which can lead to almost arbitrary data corruption in the
future.  This can also manifest as infinite loops with the RCU read lock
held.

Bug: 315281107
Change-Id: I455f964a9f21c13089890073777388236b6669d7
[willy@infradead.org: modifications to the changelog & tweaked the fix]
Fixes: 3417013e0d ("mm/migrate: Add folio_migrate_mapping()")
Link: https://lkml.kernel.org/r/20231214045841.961776-1-willy@infradead.org
Link: https://lore.kernel.org/linux-mm/20231214045841.961776-1-willy@infradead.org/
Signed-off-by: Charan Teja Kalla <quic_charante@quicinc.com>
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reported-by: Charan Teja Kalla <quic_charante@quicinc.com>
Closes: https://lkml.kernel.org/r/1700569840-17327-1-git-send-email-quic_charante@quicinc.com
Cc: David Hildenbrand <david@redhat.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: Shakeel Butt <shakeelb@google.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Charan Teja Kalla <quic_charante@quicinc.com>
2023-12-21 00:41:24 +00:00
Paul Lawrence
613d8368e3 ANDROID: fuse-bpf: Follow mounts in lookups
Bug: 292925770
Test: fuse_test run. The following steps on Android also now pass:

	Create /data/123 and /data/media/0/Android/data/45 directories
	Mount /data/123 directory to /data/media/0/Android/data/45 directory
	Create 1.txt under the /data/123 directory

	File 1.txt should appear in /storage/emulated/0/Android/data/45
Change-Id: I1fe27d743ca2981e624a9aa87d9ab6deb313aadc
Signed-off-by: Paul Lawrence <paullawrence@google.com>
2023-12-20 23:12:56 +00:00
Kever Yang
07775f9683 ANDROID: GKI: Add symbols for rockchip sata
INFO: 24 function symbol(s) added
  'size_t __scsi_format_command(char*, size_t, const unsigned char*, size_t)'
  'int attribute_container_register(struct attribute_container*)'
  'int attribute_container_unregister(struct attribute_container*)'
  'void pci_intx(struct pci_dev*, int)'
  'int pcim_iomap_regions_request_all(struct pci_dev*, int, const char*)'
  'void pcim_pin_device(struct pci_dev*)'
  'int reset_control_rearm(struct reset_control*)'
  'enum scsi_disposition scsi_check_sense(struct scsi_cmnd*)'
  'int scsi_device_set_state(struct scsi_device*, enum scsi_device_state)'
  'void scsi_eh_finish_cmd(struct scsi_cmnd*, struct list_head*)'
  'void scsi_eh_flush_done_q(struct list_head*)'
  'int scsi_rescan_device(struct scsi_device*)'
  'void scsi_schedule_eh(struct Scsi_Host*)'
  'const u8* scsi_sense_desc_find(const u8*, int, int)'
  'int scsi_set_sense_field_pointer(u8*, int, u16, u8, bool)'
  'void sdev_evt_send_simple(struct scsi_device*, enum scsi_device_event, gfp_t)'
  'bool system_entering_hibernation()'
  'int transport_add_device(struct device*)'
  'int transport_class_register(struct transport_class*)'
  'void transport_class_unregister(struct transport_class*)'
  'void transport_configure_device(struct device*)'
  'void transport_destroy_device(struct device*)'
  'void transport_remove_device(struct device*)'
  'void transport_setup_device(struct device*)'

Bug: 300024866
Change-Id: I6a505d48d0d199a710b0d93b6a8df189735a7b89
Signed-off-by: Kever Yang <kever.yang@rock-chips.com>
2023-12-19 18:44:06 +00:00
Rick Yiu
f44d373b32 ANDROID: sched: Add trace_android_rvh_setscheduler
Sync to android13-5.10. This vendor hook is declared already.

Bug: 245675204
Change-Id: Ib081b52542380d22317f225a50b553cda5f2634c
Signed-off-by: Rick Yiu <rickyiu@google.com>
(cherry picked from commit f9688670ca)
2023-12-19 09:04:16 +00:00
John Scheible
efa8f34b5a ANDROID: Update the ABI symbol list
Adding the following symbols:
  - dma_fence_enable_sw_signaling
  - dma_fence_unwrap_first
  - __dma_fence_unwrap_merge
  - dma_fence_unwrap_next

3 function symbol(s) added
  'struct dma_fence* __dma_fence_unwrap_merge(unsigned int, struct dma_fence**, struct dma_fence_unwrap*)'
  'struct dma_fence* dma_fence_unwrap_first(struct dma_fence*, struct dma_fence_unwrap*)'
  'struct dma_fence* dma_fence_unwrap_next(struct dma_fence_unwrap*)'

Bug: 316212868
Change-Id: I41a4d906e98c983c4b612f65127bd7ef7ac5cb85
Signed-off-by: John Scheible <johnscheible@google.com>
2023-12-19 03:09:43 +00:00
cuiyangpei
cee8ebf7c5 ANDROID: GKI: build damon for monitoring virtual address spaces
Enable damon related configs in gki_defconfig.

Bug: 300502883
Change-Id: Ie00a923464d2f1fff8f12a8804cbac040f0cacdf
Signed-off-by: cuiyangpei <cuiyangpei@xiaomi.com>
2023-12-16 01:38:42 +00:00
SeongJae Park
31c59d59c7 UPSTREAM: mm/damon/sysfs-schemes: handle tried region directory allocation failure
DAMON sysfs interface's before_damos_apply callback
(damon_sysfs_before_damos_apply()), which creates the DAMOS tried regions
for each DAMOS action applied region, is not handling the allocation
failure for the sysfs directory data.  As a result, NULL pointer
derefeence is possible.  Fix it by handling the case.

Link: https://lkml.kernel.org/r/20231106233408.51159-4-sj@kernel.org
Fixes: f1d13cacab ("mm/damon/sysfs: implement DAMOS tried regions update command")
Signed-off-by: SeongJae Park <sj@kernel.org>
Cc: <stable@vger.kernel.org>	[6.2+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

(cherry picked from commit ae636ae2bbfd9279f5681dbf320d1da817e52b68)

Bug: 300502883
Change-Id: I98568f4b0cee9fea82f4fe6d3e7a505370c3c304
Signed-off-by: cui yangpei <cuiyangpei@xiaomi.com>
2023-12-16 01:38:42 +00:00
SeongJae Park
1cedfc05e9 UPSTREAM: mm/damon/sysfs-schemes: handle tried regions sysfs directory allocation failure
DAMOS tried regions sysfs directory allocation function
(damon_sysfs_scheme_regions_alloc()) is not handling the memory allocation
failure.  In the case, the code will dereference NULL pointer.  Handle the
failure to avoid such invalid access.

Link: https://lkml.kernel.org/r/20231106233408.51159-3-sj@kernel.org
Fixes: 9277d0367b ("mm/damon/sysfs-schemes: implement scheme region directory")
Signed-off-by: SeongJae Park <sj@kernel.org>
Cc: <stable@vger.kernel.org>	[6.2+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

(cherry picked from commit 84055688b6bc075c92a88e2d6c3ad26ab93919f9)

Bug: 300502883
Change-Id: I86ecb2f3cf1604199b5567576b1fa583914f7f36
Signed-off-by: cui yangpei <cuiyangpei@xiaomi.com>
2023-12-16 01:38:42 +00:00
SeongJae Park
7fbeab3c65 UPSTREAM: mm/damon/sysfs: check error from damon_sysfs_update_target()
Patch series "mm/damon/sysfs: fix unhandled return values".

Some of DAMON sysfs interface code is not handling return values from some
functions.  As a result, confusing user input handling or NULL-dereference
is possible.  Check those properly.

This patch (of 3):

damon_sysfs_update_target() returns error code for failures, but its
caller, damon_sysfs_set_targets() is ignoring that.  The update function
seems making no critical change in case of such failures, but the behavior
will look like DAMON sysfs is silently ignoring or only partially
accepting the user input.  Fix it.

Link: https://lkml.kernel.org/r/20231106233408.51159-1-sj@kernel.org
Link: https://lkml.kernel.org/r/20231106233408.51159-2-sj@kernel.org
Fixes: 19467a950b49 ("mm/damon/sysfs: remove requested targets when online-commit inputs")
Signed-off-by: SeongJae Park <sj@kernel.org>
Cc: <stable@vger.kernel.org>	[5.19+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

(cherry picked from commit b4936b544b08ed44949055b92bd25f77759ebafc)

Bug: 300502883
Change-Id: I9bfea66f76ad094ed73defee5ff3fdb3794e8162
Signed-off-by: cui yangpei <cuiyangpei@xiaomi.com>
2023-12-16 01:38:42 +00:00
Dan Carpenter
606444fd06 UPSTREAM: mm/damon/sysfs: eliminate potential uninitialized variable warning
The "err" variable is not initialized if damon_target_has_pid(ctx) is
false and sys_target->regions->nr is zero.

Link: https://lkml.kernel.org/r/739e6aaf-a634-4e33-98a8-16546379ec9f@moroto.mountain
Fixes: 0bcd216c4741 ("mm/damon/sysfs: update monitoring target regions for online input commit")
Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org>
Reviewed-by: SeongJae Park <sj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

(cherry picked from commit 85c2ceaafbd306814a3a4740bf4d95ac26a8b36a)

Bug: 300502883
Change-Id: I235ea1bfc9d8bf0fef426dbc21881d755e3a5d67
Signed-off-by: cui yangpei <cuiyangpei@xiaomi.com>
2023-12-16 01:38:42 +00:00
SeongJae Park
c132d077eb UPSTREAM: mm/damon/sysfs: update monitoring target regions for online input commit
When user input is committed online, DAMON sysfs interface is ignoring the
user input for the monitoring target regions.  Such request is valid and
useful for fixed monitoring target regions-based monitoring ops like
'paddr' or 'fvaddr'.

Update the region boundaries as user specified, too.  Note that the
monitoring results of the regions that overlap between the latest
monitoring target regions and the new target regions are preserved.

Treat empty monitoring target regions user request as a request to just
make no change to the monitoring target regions.  Otherwise, users should
set the monitoring target regions same to current one for every online
input commit, and it could be challenging for dynamic monitoring target
regions update DAMON ops like 'vaddr'.  If the user really need to remove
all monitoring target regions, they can simply remove the target and then
create the target again with empty target regions.

Link: https://lkml.kernel.org/r/20231031170131.46972-1-sj@kernel.org
Fixes: da87878010 ("mm/damon/sysfs: support online inputs update")
Signed-off-by: SeongJae Park <sj@kernel.org>
Cc: <stable@vger.kernel.org>	[5.19+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

(cherry picked from commit 9732336006764e2ee61225387e3c70eae9139035)

Bug: 300502883
Change-Id: I6857482470951382c9be36f2099da76e9b71d502
Signed-off-by: cui yangpei <cuiyangpei@xiaomi.com>
2023-12-16 01:38:42 +00:00
SeongJae Park
6b7c4cc262 UPSTREAM: mm/damon/sysfs: remove requested targets when online-commit inputs
damon_sysfs_set_targets(), which updates the targets of the context for
online commitment, do not remove targets that removed from the
corresponding sysfs files.  As a result, more than intended targets of the
context can exist and hence consume memory and monitoring CPU resource
more than expected.

Fix it by removing all targets of the context and fill up again using the
user input.  This could cause unnecessary memory dealloc and realloc
operations, but this is not a hot code path.  Also, note that damon_target
is stateless, and hence no data is lost.

[sj@kernel.org: fix unnecessary monitoring results removal]
  Link: https://lkml.kernel.org/r/20231028213353.45397-1-sj@kernel.org
Link: https://lkml.kernel.org/r/20231022210735.46409-2-sj@kernel.org
Fixes: da87878010 ("mm/damon/sysfs: support online inputs update")
Signed-off-by: SeongJae Park <sj@kernel.org>
Cc: Brendan Higgins <brendanhiggins@google.com>
Cc: <stable@vger.kernel.org>	[5.19.x]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

(cherry picked from commit 19467a950b49432a84bf6dbadbbb17bdf89418b7)

Bug: 300502883
Change-Id: Icf094f138e6810182d23d2c412fbabe3ecd960fe
Signed-off-by: cui yangpei <cuiyangpei@xiaomi.com>
2023-12-16 01:38:42 +00:00
SeongJae Park
1e19db10e7 UPSTREAM: mm/damon/sysfs: avoid empty scheme tried regions for large apply interval
DAMON_SYSFS assumes all schemes will be applied for at least one DAMON
monitoring results snapshot within one aggregation interval, or makes no
sense to wait for it while DAMON is deactivated by the watermarks.  That
for deactivated status still makes sense, but the aggregation interval
based assumption is invalid now because each scheme can has its own apply
interval.  For schemes having larger than the aggregation or watermarks
check interval, DAMOS tried regions update request can be finished without
the update.  Avoid the case by explicitly checking the status of the
schemes tried regions update and watermarks based DAMON deactivation.

Link: https://lkml.kernel.org/r/20231012192256.33556-3-sj@kernel.org
Signed-off-by: SeongJae Park <sj@kernel.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

(cherry picked from commit 76126332c7606ba25a4ae5db37145fd526985b45)

Bug: 300502883
Change-Id: I8283709a023123d7a89fd37a1d4a834888c15c7e
Signed-off-by: cui yangpei <cuiyangpei@xiaomi.com>
2023-12-16 01:38:42 +00:00
SeongJae Park
c194e597cb UPSTREAM: mm/damon/sysfs-schemes: do not update tried regions more than one DAMON snapshot
Patch series "mm/damon/sysfs-schemes: Do DAMOS tried regions update for
only one apply interval".

DAMOS tried regions update feature of DAMON sysfs interface is doing the
update for one aggregation interval after the request is made.  Since the
per-scheme apply interval is supported, that behavior makes no much sense.
That is, the tried regions directory will have regions from multiple
DAMON monitoring results snapshots, or no region for apply intervals that
much shorter than, or longer than the aggregation interval, respectively.
Update the behavior to update the regions for each scheme for only its
apply interval, and update the document.

Since DAMOS apply interval is the aggregation by default, this change
makes no visible behavioral difference to old users who don't explicitly
set the apply intervals.

Patches Sequence
----------------

The first two patches makes schemes of apply intervals that much shorter
or longer than the aggregation interval to keep the maximum and minimum
times for continuing the update.  After the two patches, the update aligns
with the each scheme's apply interval.

Finally, the third patch updates the document to reflect the behavior.

This patch (of 3):

DAMON_SYSFS exposes every DAMON-found region that eligible for applying
the scheme action for one aggregation interval.  However, each DAMON-based
operation scheme has its own apply interval.  Hence, for a scheme that
having its apply interval much smaller than the aggregation interval,
DAMON_SYSFS will expose the scheme regions that applied to more than one
DAMON monitoring results snapshots.  Since the purpose of DAMON tried
regions is exposing single snapshot, this makes no much sense.  Track
progress of each scheme's tried regions update and avoid the case.

Link: https://lkml.kernel.org/r/20231012192256.33556-1-sj@kernel.org
Link: https://lkml.kernel.org/r/20231012192256.33556-2-sj@kernel.org
Signed-off-by: SeongJae Park <sj@kernel.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

(cherry picked from commit 4d4e41b682990b1dc5bba2bc313800340bf5c2d4)

Bug: 300502883
Change-Id: I78602a6810a9b4d8d131c3ace69f255ac1349d13
Signed-off-by: cui yangpei <cuiyangpei@xiaomi.com>
2023-12-16 01:38:42 +00:00
SeongJae Park
f5a0a8bc43 UPSTREAM: mm/damon/sysfs: check DAMOS regions update progress from before_terminate()
DAMON_SYSFS can receive DAMOS tried regions update request while kdamond
is already out of the main loop and before_terminate callback
(damon_sysfs_before_terminate() in this case) is not yet called.  And
damon_sysfs_handle_cmd() can further be finished before the callback is
invoked.  Then, damon_sysfs_before_terminate() unlocks damon_sysfs_lock,
which is not locked by anyone.  This happens because the callback function
assumes damon_sysfs_cmd_request_callback() should be called before it.
Check if the assumption was true before doing the unlock, to avoid this
problem.

Link: https://lkml.kernel.org/r/20231007200432.3110-1-sj@kernel.org
Fixes: f1d13cacab ("mm/damon/sysfs: implement DAMOS tried regions update command")
Signed-off-by: SeongJae Park <sj@kernel.org>
Cc: <stable@vger.kernel.org>	[6.2.x]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

(cherry picked from commit 76b7069bcc)

Bug: 300502883
Change-Id: I7cd5e00c0d0226dc8d7856d103f88a26307cafce
Signed-off-by: cui yangpei <cuiyangpei@xiaomi.com>
2023-12-16 01:38:42 +00:00
SeongJae Park
b46391e092 UPSTREAM: mm/damon/sysfs: implement a command for updating only schemes tried total bytes
Using tried_regions/total_bytes file, users can efficiently retrieve the
total size of memory regions having specific access pattern.  However,
DAMON sysfs interface in kernel still populates all the infomration on the
tried_regions subdirectories.  That means the kernel part overhead for the
construction of tried regions directories still exists.  To remove the
overhead, implement yet another command input for 'state' DAMON sysfs
file.  Writing the input to the file makes DAMON sysfs interface to update
only the total_bytes file.

Link: https://lkml.kernel.org/r/20230802213222.109841-3-sj@kernel.org
Signed-off-by: SeongJae Park <sj@kernel.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

(cherry picked from commit 6ad243b83b)

Bug: 300502883
Change-Id: Id0bdf13858d6a92de0eeef22f59a65ee884e9d20
Signed-off-by: cui yangpei <cuiyangpei@xiaomi.com>
2023-12-16 01:38:42 +00:00
SeongJae Park
7d48e19f74 UPSTREAM: mm/damon/sysfs-schemes: implement DAMOS tried total bytes file
Patch series "mm/damon/sysfs-schemes: implement DAMOS tried total bytes
file".

The tried_regions directory of DAMON sysfs interface is useful for
retrieving monitoring results snapshot or DAMOS debugging.  However, for
common use case that need to monitor only the total size of the scheme
tried regions (e.g., monitoring working set size), the kernel overhead for
directory construction and user overhead for reading the content could be
high if the number of monitoring region is not small.  This patchset
implements DAMON sysfs files for efficient support of the use case.

The first patch implements the sysfs file to reduce the user space
overhead, and the second patch implements a command for reducing the
kernel space overhead.

The third patch adds a selftest for the new file, and following two
patches update documents.

[1] https://lore.kernel.org/damon/20230728201817.70602-1-sj@kernel.org/

This patch (of 5):

The tried_regions directory can be used for retrieving the monitoring
results snapshot for regions of specific access pattern, by setting the
scheme's action as 'stat' and the access pattern as required.  While the
interface provides every detail of the monitoring results, some use cases
including working set size monitoring requires only the total size of the
regions.  For such cases, users should read all the information and
calculate the total size of the regions.  However, it could incur high
overhead if the number of regions is high.  Add a file for retrieving only
the information, namely 'total_bytes' file.  It allows users to get the
total size by reading only the file.

Link: https://lkml.kernel.org/r/20230802213222.109841-1-sj@kernel.org
Link: https://lkml.kernel.org/r/20230802213222.109841-2-sj@kernel.org
Signed-off-by: SeongJae Park <sj@kernel.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

(cherry picked from commit b69f92a741)

Bug: 300502883
Change-Id: I49c225d15ba09a9b896341da14cc9f2b45578da7
Signed-off-by: cui yangpei <cuiyangpei@xiaomi.com>
2023-12-16 01:38:42 +00:00
Ryan Roberts
a548d90994 UPSTREAM: mm/damon/ops-common: refactor to use {pte|pmd}p_clear_young_notify()
With the fix in place to atomically test and clear young on ptes and pmds,
simplify the code to handle the clearing for both the primary mmu and the
mmu notifier with a single API call.

Link: https://lkml.kernel.org/r/20230602092949.545577-4-ryan.roberts@arm.com
Signed-off-by: Ryan Roberts <ryan.roberts@arm.com>
Acked-by: Yu Zhao <yuzhao@google.com>
Reviewed-by: SeongJae Park <sj@kernel.org>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Lorenzo Stoakes <lstoakes@gmail.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Mike Rapoport (IBM) <rppt@kernel.org>
Cc: Uladzislau Rezki (Sony) <urezki@gmail.com>
Cc: Zi Yan <ziy@nvidia.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

(cherry picked from commit fa8c919dac)

Bug: 300502883
Change-Id: I4414604788996e338ac638c3eb3ec1ef7959223e
Signed-off-by: cui yangpei <cuiyangpei@xiaomi.com>
2023-12-16 01:38:42 +00:00
Huaisheng Ye
ea215c9a10 UPSTREAM: mm/damon/core: skip apply schemes if empty
Sometimes there is no scheme in damon's context, for example just use damo
record to monitor workload's data access pattern.

If current damon context doesn't have any scheme in the list, kdamond has
no need to iterate over list of all targets and regions but do nothing.

So, skip apply schemes when ctx->schemes is empty.

Link: https://lkml.kernel.org/r/20230116062347.1148553-1-huaisheng.ye@intel.com
Signed-off-by: Huaisheng Ye <huaisheng.ye@intel.com>
Reviewed-by: SeongJae Park <sj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

(cherry picked from commit 64517d6e12)

Bug: 300502883
Change-Id: Ic76ca90c85dbb24205b17dd914f91a8dd4cf7345
Signed-off-by: cui yangpei <cuiyangpei@xiaomi.com>
2023-12-16 01:38:42 +00:00
Christophe JAILLET
3ca21ef5fa UPSTREAM: mm/damon: use kstrtobool() instead of strtobool()
strtobool() is the same as kstrtobool().  However, the latter is more used
within the kernel.

In order to remove strtobool() and slightly simplify kstrtox.h, switch to
the other function name.

While at it, include the corresponding header file (<linux/kstrtox.h>)

Link: https://lkml.kernel.org/r/ed2b46489a513988688decb53850339cc228940c.1667336095.git.christophe.jaillet@wanadoo.fr
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Reviewed-by: SeongJae Park <sj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

(cherry picked from commit e6aff38b2e)

Bug: 300502883
Change-Id: I21df914f9ba754921bdc00d8e9a33e77b2606360
Signed-off-by: cui yangpei <cuiyangpei@xiaomi.com>
2023-12-16 01:38:42 +00:00
SeongJae Park
5bf7b56860 UPSTREAM: mm/damon/sysfs-schemes: implement DAMOS-tried regions clear command
When there are huge number of DAMON regions that specific scheme actions
are tried to be applied, directories and files under 'tried_regions'
scheme directory could waste some memory.  Add another special input
keyword ('clear_schemes_tried_regions') for 'state' file of each kdamond
sysfs directory that can be used for cleanup of the 'tried_regions'
sub-directories.

[sj@kernel.org: skip regions clearing if the scheme directory was removed]
  Link: https://lkml.kernel.org/r/20221114182954.4745-3-sj@kernel.org
Link: https://lkml.kernel.org/r/20221101220328.95765-6-sj@kernel.org
Signed-off-by: SeongJae Park <sj@kernel.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

(cherry picked from commit 772c15e5ad)

Bug: 300502883
Change-Id: I969e05ce1fa4599bae50454633b61b5320eaa67d
Signed-off-by: cui yangpei <cuiyangpei@xiaomi.com>
2023-12-16 01:38:42 +00:00
SeongJae Park
80ccab9b0e UPSTREAM: mm/damon/sysfs: implement DAMOS tried regions update command
Implement the code for filling the data of 'tried_regions' DAMON sysfs
directory.  With this commit, DAMON sysfs interface users can write a
special keyword, 'update_schemes_tried_regions' to the corresponding
'state' file of the kdamond.  Then, DAMON sysfs interface will collect the
tried regions information using the 'before_damos_apply()' callback for
one aggregation interval and populate scheme region directories with the
values.

[sj@kernel.org: skip tried regions update if the scheme directory was removed]
  Link: https://lkml.kernel.org/r/20221114182954.4745-2-sj@kernel.org
Link: https://lkml.kernel.org/r/20221101220328.95765-5-sj@kernel.org
Signed-off-by: SeongJae Park <sj@kernel.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

(cherry picked from commit f1d13cacab)

Bug: 300502883
Change-Id: I6749b8dc75023a9a3f3dc64902196b07fa523267
Signed-off-by: cui yangpei <cuiyangpei@xiaomi.com>
2023-12-16 01:38:42 +00:00
SeongJae Park
3421250b35 UPSTREAM: mm/damon/sysfs-schemes: implement scheme region directory
Implement region directories under 'tried_regions' directory of each
scheme DAMON sysfs directory.  This directory will provide the address
range, the monitored access frequency ('nr_accesses'), and the age of each
DAMON region that corresponding DAMON-based operation scheme has tried to
be applied.  Note that this commit doesn't implement the code for filling
the data but only the sysfs directory.

Link: https://lkml.kernel.org/r/20221101220328.95765-4-sj@kernel.org
Signed-off-by: SeongJae Park <sj@kernel.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

(cherry picked from commit 9277d0367b)

Bug: 300502883
Change-Id: I69c9010a8fce2fa61a1d27f2964ac7bc7b85dd44
Signed-off-by: cui yangpei <cuiyangpei@xiaomi.com>
2023-12-16 01:38:42 +00:00
SeongJae Park
b4c34cc168 UPSTREAM: mm/damon/sysfs-schemes: implement schemes/tried_regions directory
For efficient and simple query-like DAMON monitoring results readings and
deep level investigations of DAMOS, DAMON kernel API
(include/linux/damon.h) users can use 'before_damos_apply' DAMON callback.
However, DAMON sysfs interface users don't have such option.

Add a directory, namely 'tried_regions', under each scheme directory to
use it as the interface for the purpose.  Note that this commit is
implementing only the directory but the data filling.

After the data filling change is made, users will be able to signal DAMON
to fill the directory with the regions that corresponding scheme has tried
to be applied.  By setting the access pattern of the scheme, users could
do the efficient query-like monitoring.

Link: https://lkml.kernel.org/r/20221101220328.95765-3-sj@kernel.org
Signed-off-by: SeongJae Park <sj@kernel.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

(cherry picked from commit 5181b75f43)

Bug: 300502883
Change-Id: Idc7a1fca201b90f8fea62899f1e6b500bb8e14e1
Signed-off-by: cui yangpei <cuiyangpei@xiaomi.com>
2023-12-16 01:38:42 +00:00
SeongJae Park
b5d1f3576b UPSTREAM: mm/damon/core: add a callback for scheme target regions check
Patch series "efficiently expose damos action tried regions information".

DAMON users can retrieve the monitoring results via 'after_aggregation'
callbacks if the user is using the kernel API, or 'damon_aggregated'
tracepoint if the user is in the user space.  Those are useful if full
monitoring results are necessary.  However, if the user has interest in
only a snapshot of the results for some regions having specific access
pattern, the interfaces could be inefficient.  For example, some users
only want to know which memory regions are not accessed for more than a
specific time at the moment.

Also, some DAMOS users would want to know exactly to what memory regions
the schemes' actions tried to be applied, for a debugging or a tuning.  As
DAMOS has its internal mechanism for quota and regions prioritization, the
users would need to simulate DAMOS' mechanism against the monitoring
results.  That's unnecessarily complex.

This patchset implements DAMON kernel API callbacks and sysfs directory
for efficient exposure of the information for the use cases.  The new
callback will be called for each region when a DAMOS action is gonna tried
to be applied to it.  The sysfs directory will be called 'tried_regions'
and placed under each scheme sysfs directory.  Users can write a special
keyworkd, 'update_schemes_regions', to the 'state' file of a kdamond sysfs
directory.  Then, DAMON sysfs interface will fill the directory with the
information of regions that corresponding scheme action was tried to be
applied for next one aggregation interval.

Patches Sequence
----------------

The first one (patch 1) implements the callback for the kernel space
users.  Following two patches (patches 2 and 3) implements sysfs
directories for the information and its sub directories.  Two patches
(patches 4 and 5) for implementing the special keywords for filling the
data to and cleaning up the directories follow.  Patch 6 adds a selftest
for the new sysfs directory.  Finally, two patches (patches 7 and 8)
document the new feature in the administrator guide and the ABI document.

This patch (of 8):

Getting DAMON monitoring results of only specific access pattern (e.g.,
getting address ranges of memory that not accessed at all for two minutes)
can be useful for efficient monitoring of the system.  The information can
also be helpful for deep level investigation of DAMON-based operation
schemes.

For that, users need to record (in case of the user space users) or
iterate (in case of the kernel space users) full monitoring results and
filter it out for the specific access pattern.  In case of the DAMOS
investigation, users will even need to simulate DAMOS' quota and
prioritization mechanisms.  It's inefficient and complex.

Add a new DAMON callback that will be called before each scheme is applied
to each region.  DAMON kernel API users will be able to do the query-like
monitoring results collection, or DAMOS investigation in an efficient and
simple way using it.

Commits for providing the capability to the user space users will follow.

Link: https://lkml.kernel.org/r/20221101220328.95765-1-sj@kernel.org
Link: https://lkml.kernel.org/r/20221101220328.95765-2-sj@kernel.org
Signed-off-by: SeongJae Park <sj@kernel.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

(cherry picked from commit 44467bbb7e)

Bug: 300502883
Change-Id: I21ff3c9cf6c30e113f78883e5063bcb898506b41
Signed-off-by: cui yangpei <cuiyangpei@xiaomi.com>
2023-12-16 01:38:42 +00:00