Huan Yang db3b80e22e UPSTREAM: mm/memcg: move mem_cgroup_init() ahead of cgroup_init()
Patch series "Use kmem_cache for memcg alloc", v3.

(willy tldr: "you've gone from allocating 8 objects per 32KiB to
allocating 13 objects per 32KiB, a 62% improvement in memory consumption"
[1])

The mem_cgroup_alloc function creates the mem_cgroup struct and its
associated structures, including mem_cgroup_per_node.  Through detailed
analysis on our test machine (Arm64, 16GB RAM, 6.6 kernel, 1 NUMA node,
memcgv2 with nokmem,nosocket,cgroup_disable=pressure), we can observe the
memory allocation for these structures using the following shell commands:

  # Enable tracing
  echo 1 > /sys/kernel/tracing/events/kmem/kmalloc/enable
  echo 1 > /sys/kernel/tracing/tracing_on
  cat /sys/kernel/tracing/trace_pipe | grep kmalloc | grep mem_cgroup

  # Trigger allocation if the cgroup subtree does not already enable memcg
  echo +memory > /sys/fs/cgroup/cgroup.subtree_control

Ftrace Output:

  # mem_cgroup struct allocation
  sh-6312    [000] ..... 58015.698365: kmalloc:
    call_site=mem_cgroup_css_alloc+0xd8/0x5b4
    ptr=000000003e4c3799 bytes_req=2312 bytes_alloc=4096
    gfp_flags=GFP_KERNEL|__GFP_ZERO node=-1 accounted=false

  # mem_cgroup_per_node allocation
  sh-6312    [000] ..... 58015.698389: kmalloc:
    call_site=mem_cgroup_css_alloc+0x1d8/0x5b4
    ptr=00000000d798700c bytes_req=2896 bytes_alloc=4096
    gfp_flags=GFP_KERNEL|__GFP_ZERO node=0 accounted=false

Key Observations:

  1. Both structures use kmalloc with requested sizes between 2KB and 4KB
  2. Allocation alignment forces 4KB slab usage due to pre-defined sizes
     (64B, 128B,..., 2KB, 4KB, 8KB)
  3. Memory waste per memcg instance:
      Base struct: 4096 - 2312 = 1784 bytes
      Per-node struct: 4096 - 2896 = 1200 bytes
      Total waste: 2984 bytes (1-node system)
      NUMA scaling: (1200 + 8) * nr_node_ids bytes

So each memcg wastes close to 3KB on a single-node system.
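
The waste figures above can be reproduced with a short calculation. This is a minimal sketch that assumes kmalloc rounds each request up to the next power-of-two size class, which matches the trace for these two sizes; `kmalloc_bucket` is a hypothetical helper name, not a kernel function, and only the single-node case from the trace is covered.

```python
def kmalloc_bucket(size: int) -> int:
    """Round a request up to the next power-of-two size class (assumed model)."""
    bucket = 64  # smallest class listed in the observations above
    while bucket < size:
        bucket *= 2
    return bucket


# bytes_req values taken from the ftrace output above (1 NUMA node)
MEM_CGROUP_REQ = 2312
PER_NODE_REQ = 2896

base_waste = kmalloc_bucket(MEM_CGROUP_REQ) - MEM_CGROUP_REQ
per_node_waste = kmalloc_bucket(PER_NODE_REQ) - PER_NODE_REQ
total_waste = base_waste + per_node_waste

print(base_waste, per_node_waste, total_waste)
```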

This patchset introduces dedicated kmem_cache:
  Patch2 - mem_cgroup kmem_cache - memcg_cachep
  Patch3 - mem_cgroup_per_node kmem_cache - memcg_pn_cachep

The benefits of this change can be observed with the following tracing
commands:

  # Enable tracing
  echo 1 > /sys/kernel/tracing/events/kmem/kmem_cache_alloc/enable
  echo 1 > /sys/kernel/tracing/tracing_on
  cat /sys/kernel/tracing/trace_pipe | grep kmem_cache_alloc | grep mem_cgroup
  # In another terminal:
  echo +memory > /sys/fs/cgroup/cgroup.subtree_control

The output might now look like this:

  # mem_cgroup struct allocation
  sh-9827     [000] .....   289.513598: kmem_cache_alloc:
    call_site=mem_cgroup_css_alloc+0xbc/0x5d4 ptr=00000000695c1806
    bytes_req=2312 bytes_alloc=2368 gfp_flags=GFP_KERNEL|__GFP_ZERO node=-1
    accounted=false
  # mem_cgroup_per_node allocation
  sh-9827     [000] .....   289.513602: kmem_cache_alloc:
    call_site=mem_cgroup_css_alloc+0x1b8/0x5d4 ptr=000000002989e63a
    bytes_req=2896 bytes_alloc=2944 gfp_flags=GFP_KERNEL|__GFP_ZERO node=0
    accounted=false

This indicates that the `mem_cgroup` struct now requests 2312 bytes and is
allocated 2368 bytes, while `mem_cgroup_per_node` requests 2896 bytes and
is allocated 2944 bytes.  The slight increase in allocated size is due to
`SLAB_HWCACHE_ALIGN` in the `kmem_cache`.
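
The rounded sizes, and the "objects per 32KiB" figures quoted at the top of the cover letter, can be checked with a few lines of arithmetic. This sketch assumes a 64-byte cache line on the test machine and a 32KiB slab with no per-slab metadata; both are assumptions made for illustration, and `align_up` and `objects_per_slab` are hypothetical helper names.

```python
CACHE_LINE = 64         # assumed cache-line size on the Arm64 test machine
SLAB_BYTES = 32 * 1024  # slab size taken from the "per 32KiB" remark


def align_up(size: int, align: int = CACHE_LINE) -> int:
    """Round size up to the next multiple of align."""
    return (size + align - 1) // align * align


def objects_per_slab(object_size: int, slab_bytes: int = SLAB_BYTES) -> int:
    """Whole objects that fit in one slab, ignoring any slab metadata."""
    return slab_bytes // object_size


memcg_size = align_up(2312)      # mem_cgroup, bytes_req from the trace
per_node_size = align_up(2896)   # mem_cgroup_per_node, bytes_req from the trace

print(memcg_size, per_node_size)
print(objects_per_slab(4096), objects_per_slab(memcg_size))
```

Under these assumptions the aligned sizes match the `bytes_alloc` values in the trace, and the mem_cgroup cache fits 13 objects in 32KiB where the 4096-byte kmalloc class fits 8.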

Without `SLAB_HWCACHE_ALIGN`, the allocation might appear as:

  # mem_cgroup struct allocation
  sh-9269     [003] .....    80.396366: kmem_cache_alloc:
    call_site=mem_cgroup_css_alloc+0xbc/0x5d4 ptr=000000005b12b475
    bytes_req=2312 bytes_alloc=2312 gfp_flags=GFP_KERNEL|__GFP_ZERO node=-1
    accounted=false

  # mem_cgroup_per_node allocation
  sh-9269     [003] .....    80.396411: kmem_cache_alloc:
    call_site=mem_cgroup_css_alloc+0x1b8/0x5d4 ptr=00000000f347adc6
    bytes_req=2896 bytes_alloc=2896 gfp_flags=GFP_KERNEL|__GFP_ZERO node=0
    accounted=false

While the `bytes_alloc` now matches the `bytes_req`, this patchset
defaults to using `SLAB_HWCACHE_ALIGN` as it is generally considered more
beneficial for performance.  Please let me know if there are any issues or
if I've misunderstood anything.

This patchset also moves mem_cgroup_init() ahead of cgroup_init().
cgroup_init() allocates root_mem_cgroup, but initcalls run only after
cgroup_init(), so a kmem_cache created from an initcall would not be ready
in time and every user would have to test it for NULL before use.

This patch (of 3):

When cgroup_init() creates root_mem_cgroup through css_alloc callback,
some critical resources might not be fully initialized, forcing later
operations to perform conditional checks for resource availability.

This patch moves mem_cgroup_init() so that it is invoked before
cgroup_init().  Unlike a subsys_initcall, it can then prepare key
resources before root_mem_cgroup is allocated.
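
The ordering problem can be illustrated with a toy model. This is only a sketch and not kernel code: the two inner functions stand in for the real mem_cgroup_init() and cgroup_init(), and `boot` and the `state` dictionary are hypothetical names introduced for the illustration.

```python
def boot(steps):
    """Run init steps in order; report whether the cache existed at root alloc."""
    state = {"cache_ready": False, "root_saw_cache": None}

    def mem_cgroup_init():
        # Stands in for creating the memcg kmem_caches.
        state["cache_ready"] = True

    def cgroup_init():
        # Stands in for css_alloc creating root_mem_cgroup.
        state["root_saw_cache"] = state["cache_ready"]

    handlers = {"mem_cgroup_init": mem_cgroup_init, "cgroup_init": cgroup_init}
    for name in steps:
        handlers[name]()
    return state["root_saw_cache"]


# mem_cgroup_init as a subsys_initcall runs after cgroup_init
old_order = ["cgroup_init", "mem_cgroup_init"]
# with this patch it is called before cgroup_init
new_order = ["mem_cgroup_init", "cgroup_init"]

print(boot(old_order), boot(new_order))
```

In the old order the root allocation runs before the cache exists, which is the case that would otherwise need a NULL check.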

Link: https://lkml.kernel.org/r/aAsRCj-niMMTtmK8@casper.infradead.org [1]
Link: https://lkml.kernel.org/r/20250425031935.76411-1-link@vivo.com
Link: https://lkml.kernel.org/r/20250425031935.76411-2-link@vivo.com
Signed-off-by: Huan Yang <link@vivo.com>
Suggested-by: Shakeel Butt <shakeel.butt@linux.dev>
Acked-by: Shakeel Butt <shakeel.butt@linux.dev>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: Francesco Valla <francesco@valla.it>
Cc: guoweikang <guoweikang.kernel@gmail.com>
Cc: Huang Shijie <shijie@os.amperecomputing.com>
Cc: KP Singh <kpsingh@kernel.org>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Muchun Song <muchun.song@linux.dev>
Cc: "Paul E . McKenney" <paulmck@kernel.org>
Cc: Petr Mladek <pmladek@suse.com>
Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Cc: Raul E Rangel <rrangel@chromium.org>
Cc: Roman Gushchin <roman.gushchin@linux.dev>
Cc: "Uladzislau Rezki (Sony)" <urezki@gmail.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Matthew Wilcox <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
(cherry picked from commit bc9817bb7a21f64fbca2c4b83811d943036ec870)
Change-Id: I71005c1ad3d826b99fa698358bcf357ff7924c8c
Bug: 417296244
Signed-off-by: T.J. Mercier <tjmercier@google.com>
2025-06-02 13:00:45 -07:00

How do I submit patches to Android Common Kernels

  1. BEST: Make all of your changes to upstream Linux. If appropriate, backport to the stable releases. These patches will be merged automatically in the corresponding common kernels. If the patch is already in upstream Linux, post a backport of the patch that conforms to the patch requirements below.

    • Do not send patches upstream that contain only symbol exports. To be considered for upstream Linux, additions of EXPORT_SYMBOL_GPL() require an in-tree modular driver that uses the symbol -- so include the new driver or changes to an existing driver in the same patchset as the export.
    • When sending patches upstream, the commit message must contain a clear case for why the patch is needed and beneficial to the community. Enabling out-of-tree drivers or functionality is not a persuasive case.
  2. LESS GOOD: Develop your patches out-of-tree (from an upstream Linux point-of-view). Unless these are fixing an Android-specific bug, these are very unlikely to be accepted unless they have been coordinated with kernel-team@android.com. If you want to proceed, post a patch that conforms to the patch requirements below.

Common Kernel patch requirements

  • All patches must conform to the Linux kernel coding standards and pass scripts/checkpatch.pl
  • Patches shall not break gki_defconfig or allmodconfig builds for arm, arm64, x86, x86_64 architectures (see https://source.android.com/setup/build/building-kernels)
  • If the patch is not merged from an upstream branch, the subject must be tagged with the type of patch: UPSTREAM:, BACKPORT:, FROMGIT:, FROMLIST:, or ANDROID:.
  • All patches must have a Change-Id: tag (see https://gerrit-review.googlesource.com/Documentation/user-changeid.html)
  • If an Android bug has been assigned, there must be a Bug: tag.
  • All patches must have a Signed-off-by: tag by the author and the submitter
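
The tag requirements above lend themselves to a quick pre-upload check. The following is a minimal sketch, not an official tool: `check_common` is a hypothetical helper, it assumes the patch is not merged from an upstream branch (so a subject tag is required), and it only verifies that at least one Signed-off-by: line is present because it cannot tell author from submitter.

```python
import re

SUBJECT_TAGS = ("UPSTREAM:", "BACKPORT:", "FROMGIT:", "FROMLIST:", "ANDROID:")


def check_common(message: str) -> list[str]:
    """Return a list of problems found in a commit message (empty if none)."""
    lines = [line.strip() for line in message.strip().splitlines()]
    subject = lines[0] if lines else ""
    problems = []
    if not subject.startswith(SUBJECT_TAGS):
        problems.append("subject lacks a patch-type tag")
    if not any(re.match(r"Change-Id: I[0-9a-f]+$", line) for line in lines):
        problems.append("missing Change-Id: tag")
    if not any(line.startswith("Signed-off-by: ") for line in lines):
        problems.append("missing Signed-off-by: tag")
    return problems


example = """UPSTREAM: important patch from upstream

This is the detailed description of the important patch

Signed-off-by: Fred Jones <fred.jones@foo.org>

Bug: 135791357
Change-Id: I4caaaa566ea080fa148c5e768bb1a0b6f7201c01
(cherry picked from commit c31e73121f4c1ec41143423ac6ce3ce6dafdcec1)
Signed-off-by: Joe Smith <joe.smith@foo.org>
"""

print(check_common(example))
```

Coding-style and build checks are outside the scope of this sketch; scripts/checkpatch.pl and the defconfig builds still have to be run separately.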

Additional requirements are listed below based on patch type

Requirements for backports from mainline Linux: UPSTREAM:, BACKPORT:

  • If the patch is a cherry-pick from Linux mainline with no changes at all
    • tag the patch subject with UPSTREAM:.
    • add upstream commit information with a (cherry picked from commit ...) line
    • Example:
      • if the upstream commit message is
        important patch from upstream

        This is the detailed description of the important patch

        Signed-off-by: Fred Jones <fred.jones@foo.org>
      • then Joe Smith would upload the patch for the common kernel as
        UPSTREAM: important patch from upstream

        This is the detailed description of the important patch

        Signed-off-by: Fred Jones <fred.jones@foo.org>

        Bug: 135791357
        Change-Id: I4caaaa566ea080fa148c5e768bb1a0b6f7201c01
        (cherry picked from commit c31e73121f4c1ec41143423ac6ce3ce6dafdcec1)
        Signed-off-by: Joe Smith <joe.smith@foo.org>
  • If the patch requires any changes from the upstream version, tag the patch with BACKPORT: instead of UPSTREAM:.
    • use the same tags as UPSTREAM:
    • add comments about the changes under the (cherry picked from commit ...) line
    • Example:
        BACKPORT: important patch from upstream

        This is the detailed description of the important patch

        Signed-off-by: Fred Jones <fred.jones@foo.org>

        Bug: 135791357
        Change-Id: I4caaaa566ea080fa148c5e768bb1a0b6f7201c01
        (cherry picked from commit c31e73121f4c1ec41143423ac6ce3ce6dafdcec1)
        [joe: Resolved minor conflict in drivers/foo/bar.c ]
        Signed-off-by: Joe Smith <joe.smith@foo.org>
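
The two mainline cases can also be checked mechanically. This sketch is an illustration only: `check_mainline_backport` is a hypothetical helper, it assumes the full 40-character commit hash is used as in the examples, and it assumes the BACKPORT: comment is written in square brackets directly under the cherry-pick line, which the example shows but the rules do not strictly require.

```python
import re

CHERRY_PICK = re.compile(r"^\(cherry picked from commit [0-9a-f]{40}\)$")


def check_mainline_backport(message: str) -> list[str]:
    """Check the UPSTREAM:/BACKPORT: specific parts of a commit message."""
    lines = [line.strip() for line in message.strip().splitlines()]
    subject = lines[0] if lines else ""
    problems = []
    picks = [i for i, line in enumerate(lines) if CHERRY_PICK.match(line)]
    if not picks:
        problems.append("missing (cherry picked from commit ...) line")
    elif subject.startswith("BACKPORT:"):
        # Assumed convention: a bracketed note right after the cherry-pick line.
        nxt = picks[0] + 1
        note = lines[nxt] if nxt < len(lines) else ""
        if not note.startswith("["):
            problems.append("BACKPORT: needs a note about the changes")
    return problems


backport = """BACKPORT: important patch from upstream

This is the detailed description of the important patch

Signed-off-by: Fred Jones <fred.jones@foo.org>

Bug: 135791357
Change-Id: I4caaaa566ea080fa148c5e768bb1a0b6f7201c01
(cherry picked from commit c31e73121f4c1ec41143423ac6ce3ce6dafdcec1)
[joe: Resolved minor conflict in drivers/foo/bar.c ]
Signed-off-by: Joe Smith <joe.smith@foo.org>
"""

print(check_mainline_backport(backport))
```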

Requirements for other backports: FROMGIT:, FROMLIST:

  • If the patch has been merged into an upstream maintainer tree, but has not yet been merged into Linux mainline
    • tag the patch subject with FROMGIT:
    • add info on where the patch came from as (cherry picked from commit <sha1> <repo> <branch>). This must be a stable maintainer branch (not rebased, so don't use linux-next for example).
    • if changes were required, use BACKPORT: FROMGIT:
    • Example:
      • if the commit message in the maintainer tree is
        important patch from upstream

        This is the detailed description of the important patch

        Signed-off-by: Fred Jones <fred.jones@foo.org>
      • then Joe Smith would upload the patch for the common kernel as
        FROMGIT: important patch from upstream

        This is the detailed description of the important patch

        Signed-off-by: Fred Jones <fred.jones@foo.org>

        Bug: 135791357
        (cherry picked from commit 878a2fd9de10b03d11d2f622250285c7e63deace
         https://git.kernel.org/pub/scm/linux/kernel/git/foo/bar.git test-branch)
        Change-Id: I4caaaa566ea080fa148c5e768bb1a0b6f7201c01
        Signed-off-by: Joe Smith <joe.smith@foo.org>
  • If the patch has been submitted to LKML, but not accepted into any maintainer tree
    • tag the patch subject with FROMLIST:
    • add a Link: tag with a link to the submittal on lore.kernel.org
    • add a Bug: tag with the Android bug (required for patches not accepted into a maintainer tree)
    • if changes were required, use BACKPORT: FROMLIST:
    • Example:
        FROMLIST: important patch from upstream

        This is the detailed description of the important patch

        Signed-off-by: Fred Jones <fred.jones@foo.org>

        Bug: 135791357
        Link: https://lore.kernel.org/lkml/20190619171517.GA17557@someone.com/
        Change-Id: I4caaaa566ea080fa148c5e768bb1a0b6f7201c01
        Signed-off-by: Joe Smith <joe.smith@foo.org>
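
A similar check can cover the FROMGIT: and FROMLIST: rules. Again this is a hedged sketch rather than an official tool: `check_other_backport` is a hypothetical helper, and the regular expressions simply mirror the layout of the examples above (a 40-character hash followed by repository and branch, a lore.kernel.org link, a numeric bug ID).

```python
import re

FROMGIT_PICK = re.compile(
    r"\(cherry picked from commit [0-9a-f]{40}\s+\S+\s+\S+\)"
)


def check_other_backport(message: str) -> list[str]:
    """Check the FROMGIT:/FROMLIST: specific parts of a commit message."""
    lines = [line.strip() for line in message.strip().splitlines()]
    subject = lines[0] if lines else ""
    body = "\n".join(lines[1:])
    problems = []
    if "FROMGIT:" in subject:
        # The hash, repo URL and branch may wrap over two lines.
        if not FROMGIT_PICK.search(body):
            problems.append("missing cherry-pick line with <sha1> <repo> <branch>")
    elif "FROMLIST:" in subject:
        if not re.search(r"^Link: https://lore\.kernel\.org/\S+$", body, re.M):
            problems.append("missing Link: to lore.kernel.org")
        if not re.search(r"^Bug: \d+$", body, re.M):
            problems.append("missing Bug: tag")
    return problems


fromgit = """FROMGIT: important patch from upstream

Bug: 135791357
(cherry picked from commit 878a2fd9de10b03d11d2f622250285c7e63deace
 https://git.kernel.org/pub/scm/linux/kernel/git/foo/bar.git test-branch)
Change-Id: I4caaaa566ea080fa148c5e768bb1a0b6f7201c01
Signed-off-by: Joe Smith <joe.smith@foo.org>
"""

print(check_other_backport(fromgit))
```

Matching on `in subject` rather than a prefix lets the same check accept the combined BACKPORT: FROMGIT: and BACKPORT: FROMLIST: forms.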

Requirements for Android-specific patches: ANDROID:

  • If the patch is fixing a bug in Android-specific code
    • tag the patch subject with ANDROID:
    • add a Fixes: tag that cites the patch with the bug
    • Example:
        ANDROID: fix android-specific bug in foobar.c

        This is the detailed description of the important fix

        Fixes: 1234abcd2468 ("foobar: add cool feature")
        Change-Id: I4caaaa566ea080fa148c5e768bb1a0b6f7201c01
        Signed-off-by: Joe Smith <joe.smith@foo.org>
  • If the patch is a new feature
    • tag the patch subject with ANDROID:
    • add a Bug: tag with the Android bug (required for android-specific features)