Patch series "DAMON: Support Physical Memory Address Space Monitoring".
DAMON currently supports only virtual address spaces monitoring. It can
be easily extended for various use cases and address spaces by
configuring its monitoring primitives layer to use appropriate
primitives implementations, though. This patchset implements monitoring
primitives for the physical address space monitoring using the
structure.
The first 3 patches allow user space users to manually set the
monitoring regions. The 1st patch implements the feature in
'damon-dbgfs'. Then, patches adding unit tests (the 2nd patch) and
updating the documentation (the 3rd patch) follow.
The following 4 patches implement the physical address space monitoring
primitives. The 4th patch makes some primitive functions for the
virtual address spaces primitives reusable. The 5th patch implements
the physical address space monitoring primitives. The 6th patch links
the primitives to 'damon-dbgfs'. Finally, the 7th patch documents this
new feature.
This patch (of 7):
Some 'damon-dbgfs' users would want to monitor only a part of the entire
virtual memory address space. The program interface users in the kernel
space could use '->before_start()' callback or set the regions inside
the context struct as they want, but 'damon-dbgfs' users cannot.
For that reason, this introduces a new debugfs file called
'init_region'. 'damon-dbgfs' users can specify which initial monitoring
target address regions they want by writing special input to the file.
The input should describe one region per line, in the form below:
    <pid> <start address> <end address>
Note that the regions will be updated to cover entire memory mapped
regions after a 'regions update interval' is passed. If you want the
regions to not be updated after the initial setting, you could set the
interval as a very long time, say, a few decades.
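As an illustration of the input form above, here is a minimal user-space
sketch that builds one such line. The helper name format_region_line()
is hypothetical, and decimal output is an assumption, since this message
does not state which number bases the file accepts.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Format one region as "<pid> <start address> <end address>\n".
 * Returns the number of characters written (excluding the NUL), or -1
 * if the region is empty or inverted, or if the buffer is too small.
 * Decimal output is an assumption made for this sketch.
 */
int format_region_line(char *buf, size_t len, int pid,
                       unsigned long start, unsigned long end)
{
    int n;

    assert(buf != NULL);
    if (start >= end)
        return -1;
    n = snprintf(buf, len, "%d %lu %lu\n", pid, start, end);
    if (n < 0 || (size_t)n >= len)
        return -1;
    return n;
}
```

A caller would write the resulting lines, one region per line, to the
debugfs file before starting the monitoring.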
Link: https://lkml.kernel.org/r/20211012205711.29216-1-sj@kernel.org
Link: https://lkml.kernel.org/r/20211012205711.29216-2-sj@kernel.org
Signed-off-by: SeongJae Park <sj@kernel.org>
Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Cc: Amit Shah <amit@kernel.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: David Hildenbrand <david@redhat.com>
Cc: David Woodhouse <dwmw@amazon.com>
Cc: Marco Elver <elver@google.com>
Cc: Leonard Foerster <foersleo@amazon.de>
Cc: Greg Thelen <gthelen@google.com>
Cc: Markus Boehme <markubo@amazon.de>
Cc: David Rienjes <rientjes@google.com>
Cc: Shakeel Butt <shakeelb@google.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Brendan Higgins <brendanhiggins@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
(cherry picked from commit 90bebce9fc)
Bug: 228223814
Signed-off-by: zhijun wan <wanzhijun@oppo.com>
Change-Id: Idb8961fbe0d851f9b4a1da6b42dfff291d86eae2
To tune the DAMON-based operation schemes, knowing how many and how
large regions are affected by each of the schemes will be helpful. Those
stats could be used for not only the tuning, but also monitoring of the
working set size and the number of regions, if the scheme does not
change the program behavior too much.
For that reason, this implements the statistics for the schemes. The
total number and size of the regions to which each scheme is applied are
exported to users via '->stat_count' and '->stat_sz' of 'struct damos'.
Admins can also check the numbers by reading the 'schemes' debugfs file.
The last two integers now represent the stats. To allow collecting the
stats without changing the program behavior, this also adds a new scheme
action, 'DAMOS_STAT'. Note that 'DAMOS_STAT' not only performs no memory
operation, but also does not reset the age of regions.
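The accounting described above can be sketched in user-space C as
follows. Only the field names stat_count and stat_sz come from this
message; the sketch_* types and function are hypothetical and are not
the kernel implementation.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical actions; SKETCH_STAT stands in for 'DAMOS_STAT'. */
enum sketch_action { SKETCH_PAGEOUT, SKETCH_STAT };

struct sketch_region {
    unsigned long start, end;   /* [start, end) address range */
    unsigned int age;
};

struct sketch_scheme {
    enum sketch_action action;
    unsigned long stat_count;   /* number of regions the scheme was applied to */
    unsigned long stat_sz;      /* total size of those regions */
};

/* Account one application of the scheme to the region. */
void sketch_apply(struct sketch_scheme *s, struct sketch_region *r)
{
    assert(s != NULL && r != NULL && r->end > r->start);
    s->stat_count++;
    s->stat_sz += r->end - r->start;
    /* A stat-only scheme leaves the region's age untouched. */
    if (s->action != SKETCH_STAT)
        r->age = 0;
}
```

Keeping the age intact for the stat-only action is what lets the stats
be collected without disturbing other schemes' age conditions.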
Link: https://lkml.kernel.org/r/20211001125604.29660-6-sj@kernel.org
Signed-off-by: SeongJae Park <sj@kernel.org>
Cc: Amit Shah <amit@kernel.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: David Hildenbrand <david@redhat.com>
Cc: David Rienjes <rientjes@google.com>
Cc: David Woodhouse <dwmw@amazon.com>
Cc: Greg Thelen <gthelen@google.com>
Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Leonard Foerster <foersleo@amazon.de>
Cc: Marco Elver <elver@google.com>
Cc: Markus Boehme <markubo@amazon.de>
Cc: Shakeel Butt <shakeelb@google.com>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
(cherry picked from commit 2f0b548c9f)
Bug: 228223814
Signed-off-by: zhijun wan <wanzhijun@oppo.com>
Change-Id: Id485ee13922bd769075a77e7263380db32a15544
In many cases, users might use DAMON for simple data access aware memory
management optimizations such as applying an operation scheme to a
memory region of a specific size having a specific access frequency for
a specific time. For example, "page out a memory region larger than 100
MiB but having a low access frequency for more than 10 minutes", or "Use
THP for a memory region larger than 2 MiB having a high access frequency
for more than 2 seconds".
The simplest form of the solution would be doing offline data access
pattern profiling using DAMON and modifying the application source code
or system configuration based on the profiling results. Alternatively,
one could develop a daemon built from two modules (one for access
monitoring and the other for applying memory management actions via
mlock(), madvise(), sysctl, etc).
To save users from spending time implementing such simple data access
monitoring-based operation schemes, this makes DAMON handle such schemes
directly. With this change, users can simply specify their desired
schemes to DAMON. Then, DAMON will automatically apply the schemes to
the user-specified target processes.
Each of the schemes is composed of conditions for filtering the target
memory regions and the desired memory management action for the target.
Specifically, the format is::
    <min/max size> <min/max access frequency> <min/max age> <action>
The filtering conditions are the size of the memory region, the number
of accesses to the region monitored by DAMON, and the age of the region.
The age of a region is incremented periodically but reset when its
addresses or access frequency have significantly changed or the action
of a scheme was applied. For the action, the current implementation
supports a few madvise()-like hints: ``WILLNEED``, ``COLD``,
``PAGEOUT``, ``HUGEPAGE``, and ``NOHUGEPAGE``.
Because DAMON supports various address spaces and application of the
actions to a monitoring target region depends on the type of the
target address space, the application code should be implemented by each
primitives implementation and registered to the framework. Note that
this only implements the framework part. The following commit will
implement the action applications for the virtual address spaces
primitives.
Link: https://lkml.kernel.org/r/20211001125604.29660-3-sj@kernel.org
Signed-off-by: SeongJae Park <sj@kernel.org>
Cc: Amit Shah <amit@kernel.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: David Hildenbrand <david@redhat.com>
Cc: David Rienjes <rientjes@google.com>
Cc: David Woodhouse <dwmw@amazon.com>
Cc: Greg Thelen <gthelen@google.com>
Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Leonard Foerster <foersleo@amazon.de>
Cc: Marco Elver <elver@google.com>
Cc: Markus Boehme <markubo@amazon.de>
Cc: Shakeel Butt <shakeelb@google.com>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
(cherry picked from commit 1f366e421c)
Bug: 228223814
Signed-off-by: zhijun wan <wanzhijun@oppo.com>
Change-Id: Iae8c0d0ade588de0720140fcf6f97a1873f896a0
Patch series "Implement Data Access Monitoring-based Memory Operation Schemes".
Introduction
============
DAMON[1] can be used as a primitive for data access aware memory
management optimizations. For that, users who want such optimizations
should run DAMON, read the monitoring results, analyze them, plan a new
memory management scheme, and apply the new scheme by themselves. Such
efforts will be inevitable for some complicated optimizations.
However, in many other cases, the users would simply want the system to
apply a memory management action to a memory region of a specific size
having a specific access frequency for a specific time. For example,
"page out a memory region larger than 100 MiB with only rare accesses
for more than 2 minutes", or "Do not use THP for a memory region larger
than 2 MiB rarely accessed for more than 1 second".
To make such work easier and non-redundant, this patchset implements a
new feature of DAMON, which is called Data Access Monitoring-based
Operation Schemes (DAMOS). Using the feature, users can describe the
normal schemes in a simple way and ask DAMON to execute those on its
own.
[1] https://damonitor.github.io
Evaluations
===========
DAMOS is accurate and useful for memory management optimizations. An
experimental DAMON-based operation scheme for THP, 'ethp', removes
76.15% of THP memory overheads while preserving 51.25% of THP speedup.
Another experimental DAMON-based 'proactive reclamation' implementation,
'prcl', reduces 93.38% of resident sets and 23.63% of system memory
footprint while incurring only 1.22% runtime overhead in the best case
(parsec3/freqmine).
NOTE that the experimental THP optimization and proactive reclamation
are not for production but only for proof of concept.
Please refer to the showcase web site's evaluation document[1] for
detailed evaluation setup and results.
[1] https://damonitor.github.io/doc/html/v34/vm/damon/eval.html
Long-term Support Trees
-----------------------
For people who want to test DAMON on LTS kernels, there are two more
trees, based on the two latest LTS kernels respectively, containing the
'damon/master' backports.
- For v5.4.y: https://git.kernel.org/sj/h/damon/for-v5.4.y
- For v5.10.y: https://git.kernel.org/sj/h/damon/for-v5.10.y
Sequence Of Patches
===================
The 1st patch accounts the age of each region. The 2nd patch implements
the core of the DAMON-based operation schemes feature. The 3rd patch
makes the default monitoring primitives for virtual address spaces
support the schemes. From this point, the kernel space users can use
DAMOS. The 4th patch exports the feature to the user space via the
debugfs interface. The 5th patch implements a schemes statistics feature
for easier tuning of the schemes and runtime access pattern analysis,
and the 6th patch adds selftests for these changes. Finally, the 7th
patch documents this new feature.
This patch (of 7):
DAMON can be used for data access pattern aware memory management
optimizations. For that, users should run DAMON, read the monitoring
results, analyze them, plan a new memory management scheme, and apply
the new scheme by themselves. That would not be too hard, but still
requires some level of effort. For complicated cases, this effort is
inevitable.
That said, in many cases, users would simply want to apply an action to
a memory region of a specific size having a specific access frequency
for a specific time. For example, "page out a memory region larger than
100 MiB but having a low access frequency for more than 10 minutes", or
"Use THP for a memory region larger than 2 MiB having a high access
frequency for more than 2 seconds".
For such optimizations, users will need to first account the age of each
region themselves. To reduce such efforts, this implements simple age
accounting for each region in DAMON. For each aggregation step, DAMON
compares the access frequency with that from the last aggregation and
resets the age of the region if the change is significant. Otherwise,
the age is incremented. Also, when regions are merged, the region
size-weighted average of the ages is set as the age of the new merged
region.
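The age accounting described above can be sketched in user-space C as
follows. The sketch_* names are hypothetical, and the 'threshold'
parameter is an assumption standing in for what this message calls a
significant change; the kernel's actual criterion is not stated here.

```c
#include <assert.h>
#include <stddef.h>

struct sketch_region {
    unsigned long start, end;       /* [start, end) address range */
    unsigned int nr_accesses;       /* access frequency of this aggregation */
    unsigned int last_nr_accesses;  /* access frequency of the last one */
    unsigned int age;
};

/* Run once per aggregation step: reset or increment the region's age. */
void sketch_update_age(struct sketch_region *r, unsigned int threshold)
{
    unsigned int diff;

    assert(r != NULL);
    diff = r->nr_accesses > r->last_nr_accesses ?
           r->nr_accesses - r->last_nr_accesses :
           r->last_nr_accesses - r->nr_accesses;
    if (diff > threshold)
        r->age = 0;     /* access pattern changed significantly */
    else
        r->age++;
    r->last_nr_accesses = r->nr_accesses;
}

/* Merge adjacent region 'r' into 'l', using the size-weighted average age. */
void sketch_merge(struct sketch_region *l, const struct sketch_region *r)
{
    unsigned long sz_l, sz_r;

    assert(l != NULL && r != NULL && l->end == r->start);
    sz_l = l->end - l->start;
    sz_r = r->end - r->start;
    assert(sz_l + sz_r > 0);
    l->age = (unsigned int)((l->age * sz_l + r->age * sz_r) / (sz_l + sz_r));
    l->end = r->end;
}
```

For example, merging a 3000-byte region of age 10 with an adjacent
1000-byte region of age 2 gives (10 * 3000 + 2 * 1000) / 4000 = 8.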
Link: https://lkml.kernel.org/r/20211001125604.29660-1-sj@kernel.org
Link: https://lkml.kernel.org/r/20211001125604.29660-2-sj@kernel.org
Signed-off-by: SeongJae Park <sj@kernel.org>
Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Cc: Amit Shah <amit@kernel.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: David Hildenbrand <david@redhat.com>
Cc: David Woodhouse <dwmw@amazon.com>
Cc: Marco Elver <elver@google.com>
Cc: Leonard Foerster <foersleo@amazon.de>
Cc: Greg Thelen <gthelen@google.com>
Cc: Markus Boehme <markubo@amazon.de>
Cc: David Rienjes <rientjes@google.com>
Cc: Shakeel Butt <shakeelb@google.com>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
(cherry picked from commit fda504fade)
Bug: 228223814
Signed-off-by: zhijun wan <wanzhijun@oppo.com>
Change-Id: Ia5ddb3b5ce9c0d14e098a0af55dabf4b6a609aaa
This vendor hook lets us initialize the payload of the request.
Bug: 188749221
Change-Id: I51d6a3010ac0ab36066dbe1368158592832112b7
Signed-off-by: Yang Yang <yang.yang@vivo.com>
(cherry picked from commit 2faed77792)
This vendor hook lets us attach OEM data as a payload to the request.
The payload is used by the OEM driver for debugging purposes.
Bug: 188749221
Change-Id: Iac598bd9cce836dac0efe9198a3e7752928f351a
Signed-off-by: Yang Yang <yang.yang@vivo.com>
(cherry picked from commit eecc725a8e)
Provide a vendor hook to skip adding CMA pages to the pcplist in
free_unref_page_commit.
This patch is related to skipping drain_all_pages in alloc_contig_range;
the relevant hook is android_vh_cma_drain_all_pages_bypass, which
avoids the delay of draining pcp pages in drain_all_pages.
In most cases, pcp->high is small, so free pages of other mt_types
can still fill the pcplist.
Bug: 224732340
Bug: 234405962
Signed-off-by: Peifeng Li <lipeifeng@oppo.com>
Change-Id: Ifdeeed9f8934d87671ec3fa6787a02675b993082
The condition introduced by a patch adding a vendor hook to skip
drain_all_pages is invalid and changes the default behavior for CMA
allocations. Fix the condition to restore default behavior.
Fixes: a2485b8abd ("ANDROID: vendor_hooks: Add hooks to for alloc_contig_range")
Bug: 232357688
Bug: 234405962
Reported-by: Yong-Taek Lee <ytk.lee@samsung.com>
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
Change-Id: I686ad9dff57f604557f79cf4dc12cde55474e533
Provide a vendor hook to allow drain_all_pages to be skipped
during alloc_contig_range in some cases to avoid delays caused by
it in cases when the benefits of draining pcp lists are known
to be small.
Bug: 224732340
Bug: 234405962
Signed-off-by: Peifeng Li <lipeifeng@oppo.com>
Change-Id: I0a82f668cf985ad5344d666c0c6372a7e61c3798
Export shrink_slab so that modules can perform memory-shrinking actions.
Bug: 221768451
Bug: 234405962
Signed-off-by: Peifeng Li <lipeifeng@oppo.com>
Change-Id: I5abe9ad419d64999b714d879c228625a243e90d1
Provide a vendor hook to allow drain_all_pages to be skipped
during direct reclaim in some cases to avoid delays caused by
it in cases when the benefits of draining pcp lists are known
to be small.
Bug: 220811627
Bug: 234405962
Signed-off-by: Liujie Xie <xieliujie@oppo.com>
Change-Id: I0805241f81e0a94afcf62c98e97cff125d4061e2
Provide a vendor hook to allow page_referenced to be skipped
during shrink_active_list to avoid the heavy CPU load caused by
it.
Bug: 220878851
Bug: 234405962
Signed-off-by: Liujie Xie <xieliujie@oppo.com>
Signed-off-by: Peifeng Li <lipeifeng@oppo.com>
Change-Id: Ie0e369f8f8739fea59a95470af20ab0e976869d1
Commit e201260081 ("arm64: perf: Add userspace counter access disable
switch") introduced a new 'perf_user_access' sysctl file to enable and
disable direct userspace access to the PMU counters. Sadly, Geert
reports that on his big.LITTLE SoC ('Renesas Salvator-XS w/ R-Car H3'),
the file is created for each PMU type probed, resulting in a splat
during boot:
| hw perfevents: enabled with armv8_cortex_a53 PMU driver, 7 counters available
| sysctl duplicate entry: /kernel//perf_user_access
| CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.16.0-rc3-arm64-renesas-00003-ge2012600810c #1420
| Hardware name: Renesas Salvator-X 2nd version board based on r8a77951 (DT)
| Call trace:
| dump_backtrace+0x0/0x190
| show_stack+0x14/0x20
| dump_stack_lvl+0x88/0xb0
| dump_stack+0x14/0x2c
| __register_sysctl_table+0x384/0x818
| register_sysctl+0x20/0x28
| armv8_pmu_init.constprop.0+0x118/0x150
| armv8_a57_pmu_init+0x1c/0x28
| arm_pmu_device_probe+0x1b4/0x558
| armv8_pmu_device_probe+0x18/0x20
| platform_probe+0x64/0xd0
| hw perfevents: enabled with armv8_cortex_a57 PMU driver, 7 counters available
Introduce a state variable to track creation of the sysctl file and
ensure that it is only created once.
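The idea of the fix can be sketched in user-space C as follows. All
sketch_* names are hypothetical stand-ins (the real code registers a
sysctl table), and the sketch ignores concurrency, which is an
assumption made for brevity.

```c
#include <assert.h>
#include <stdbool.h>

/* State variable: set once the (stand-in) sysctl file has been created. */
static bool sketch_sysctl_registered;
/* Counts stand-in registrations so that the effect can be observed. */
static int sketch_nr_registrations;

/* Hypothetical stand-in for the real sysctl registration call. */
static void sketch_register_sysctl(void)
{
    sketch_nr_registrations++;
}

/* Called once per probed PMU type; creates the file only the first time. */
void sketch_pmu_init(void)
{
    if (!sketch_sysctl_registered) {
        sketch_register_sysctl();
        sketch_sysctl_registered = true;
    }
    /* However many PMU types are probed, the file exists exactly once. */
    assert(sketch_nr_registrations == 1);
}

int sketch_registration_count(void)
{
    return sketch_nr_registrations;
}
```

On a big.LITTLE system this init path runs once for each PMU type, so
without the state variable the second call would attempt a duplicate
registration.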
Reported-by: Geert Uytterhoeven <geert@linux-m68k.org>
Fixes: e201260081 ("arm64: perf: Add userspace counter access disable switch")
Link: https://lore.kernel.org/r/CAMuHMdVcDxR9sGzc5pcnORiotonERBgc6dsXZXMd6wTvLGA9iw@mail.gmail.com
Signed-off-by: Will Deacon <will@kernel.org>
Bug: 230559577
(cherry picked from commit 3da4390bcd)
Change-Id: Ib958eb1ca2e992d5120b476a5dcfec5094dbf148
Signed-off-by: Srinivasarao Pathipati <quic_spathi@quicinc.com>
Allow the vendor module to know the target cpu for better decisions on
whether to enforce __ttwu_queue_wakelist() based wakeup.
Bug: 234483895
Change-Id: Ic27054a5f6adc040fa3cadbd57d37608bf353c5f
Signed-off-by: Abhijeet Dharmapurikar <quic_adharmap@quicinc.com>
The kernel-doc comment is formatted badly, resulting
in a warning:
include/net/cfg80211.h:1188: warning: bad line: [...]
Fix that.
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Bug: 233160029
(cherry picked from commit ee0e2f51e2)
Change-Id: I4b8d264913489fb0345ce444200953c8494a77c5
Signed-off-by: Veerendranath Jakkam <quic_vjakkam@quicinc.com>
NL80211_ATTR_HE_BSS_COLOR attribute can be included in both
NL80211_CMD_START_AP and NL80211_CMD_SET_BEACON commands.
Move he_bss_color from cfg80211_ap_settings to cfg80211_beacon_data
and parse NL80211_ATTR_HE_BSS_COLOR as a part of nl80211_parse_beacon()
to have bss color settings parsed for both start ap and set beacon
commands.
Add a new flag he_bss_color_valid to indicate whether
NL80211_ATTR_HE_BSS_COLOR attribute is included.
Signed-off-by: Rameshkumar Sundaram <quic_ramess@quicinc.com>
Link: https://lore.kernel.org/r/1649867295-7204-2-git-send-email-quic_ramess@quicinc.com
[fix build ...]
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Bug: 233160029
(cherry picked from commit 3d48cb7481)
Change-Id: Iceef7d7927fa3bbb49ced1583461a87b151f20e4
Signed-off-by: Veerendranath Jakkam <quic_vjakkam@quicinc.com>
Since internal_flags is only 8 bits, we can only have one
more internal flag. However, we can obviously never use all
of the possible combinations, in fact, we only use 14 of
them (including no flags).
Since we want more flags for MLO (multi-link operation) in
the future, refactor the code to use a flags selector, so
wrap all of the .internal_flags assignments in an IFLAGS()
macro which selects the combination according to the
pre-defined list of combinations.
When we need a new combination, we'll have to add it, but
again we will never use all possible combinations.
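The selector idea can be sketched in user-space C as follows. The SK_*
flag names and the table contents are hypothetical, and the real
IFLAGS() macro resolves its selector at build time, whereas this sketch
does a run-time lookup for clarity.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical internal flags; the last one would not fit in 8 bits. */
#define SK_NEED_WIPHY   (1u << 0)
#define SK_NEED_NETDEV  (1u << 1)
#define SK_NEED_RTNL    (1u << 2)
#define SK_MLO_FUTURE   (1u << 8)

/* Pre-defined list of the flag combinations that are actually used. */
static const uint32_t sk_combinations[] = {
    0,
    SK_NEED_WIPHY,
    SK_NEED_NETDEV | SK_NEED_RTNL,
    SK_NEED_WIPHY | SK_MLO_FUTURE,
};

#define SK_NR_COMBINATIONS \
    (sizeof(sk_combinations) / sizeof(sk_combinations[0]))

/* Map a flag combination to its 8-bit selector, or -1 if it is not listed. */
int sk_select(uint32_t flags)
{
    size_t i;

    for (i = 0; i < SK_NR_COMBINATIONS; i++)
        if (sk_combinations[i] == flags)
            return (int)i;
    return -1;
}

/* Expand a stored selector back into the full flag combination. */
uint32_t sk_flags(uint8_t selector)
{
    assert(selector < SK_NR_COMBINATIONS);
    return sk_combinations[selector];
}
```

An 8-bit selector can index up to 256 listed combinations, regardless of
how many individual flag bits those combinations use.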
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Link: https://lore.kernel.org/r/20220414140402.70ddf8af3eb0.I2cc38cb6a10bb4c3863ec9ee97edbcc70a07aa4b@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Bug: 233160029
(cherry picked from commit 2182db91e0)
Change-Id: I6ca31b633ce0af9829d70a377906115d23d1c4ad
Signed-off-by: Veerendranath Jakkam <quic_vjakkam@quicinc.com>
Add the show_mem symbol, which will be used by the hard-lockup
debugging module, to the debug_symbols driver.
Bug: 199478662
Signed-off-by: Woody Lin <woodylin@google.com>
Change-Id: I479700e9f1428b4e1192881b4e3b67c9e43afbeb
Some of the irq migration paths call chip set affinity after the
current CPU is marked offline in cpu_online_mask. These
chip set affinity calls do not invoke vendor trace hooks.
So, convert the gic_v3_set_affinity() vendor hook to a restricted
hook, to allow the trace hook to be called from these irq migration
paths.
Bug: 187161770
Change-Id: I8f45536deb1ba1dc6be861ca4fc2b32306a5c50a
Signed-off-by: Neeraj Upadhyay <neeraju@codeaurora.org>
(cherry picked from commit 3bd9ad7eb4)
Add ANDROID_OEM_DATA to block_device_operations which allows a new
vendor specific function call.
Bug: 193106408
Signed-off-by: Jaewon Kim <jaewon31.kim@samsung.com>
Change-Id: I472f1cc25698c841841822908c4827545b8593df
Building CONFIG_UAPI_HEADER_TEST=y with a Bionic (Android's libc) based
sysroot produces the following warning:
In file included from <built-in>:1:
./usr/include/linux/icmp.h:100:3: warning: declaration does not declare
anything [-Wmissing-declarations]
__be16 __unused;
^~~~~~
This is because Bionic defines __unused to expand to
__attribute__((__unused__)). Bionic pre-processes kernel headers and
redefines __unused to __linux_unused.
Do so here to avoid issues that only appear for Bionic based sysroot
UAPI header tests.
Link: 4ebdeebef7/libc/include/sys/cdefs.h (95)
Link: 4ebdeebef7/libc/kernel/tools/defaults.py (70)
Bug: 190019968
Bug: 234125788
Reported-by: Matthias Männich <maennich@google.com>
Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>
Change-Id: I2341953cbfce8e28b982c34df2df4b3b364d63a6
The usb_ep_disable() and usb_ep_enable() routines are being widely
used in atomic/interrupt context by function drivers. Hence, the
statement about it being able to only run in process context may
not be true. Add an explicit comment mentioning that it can be used
in atomic context.
Signed-off-by: Wesley Cheng <quic_wcheng@quicinc.com>
Acked-by: Alan Stern <stern@rowland.harvard.edu>
Bug: 204343836
(cherry picked from commit b0d5d2a716)
Change-Id: I1adb5d074fe2f9e33ebfdb30d335283c56bc7b39
Signed-off-by: Wesley Cheng <quic_wcheng@quicinc.com>
db845c is not a mixed build yet, so we need to add GKI
modules to its module_outs for kleaf builds to
resolve hard failures in the kleaf build for module copy.
Bug: 230519159
Test: tools/bazel run //common:db845c_dist
Signed-off-by: Ramji Jiyani <ramjiyani@google.com>
Change-Id: If3ce64a2b5f6b2f019a393f4674de30ac7437069
Enable zram and zsmalloc (a dependency of zram) as
unprotected modules for aarch64. These are already
being used as modules by the vendor, so they
need to be unprotected.
Bug: 230519159
Test: TH
Signed-off-by: Ramji Jiyani <ramjiyani@google.com>
Change-Id: I7c617c1a24f6e083301cbed67d0d323388cbd622