UPSTREAM: mm/mglru: only clear kswapd_failures if reclaimable

lru_gen_shrink_node() unconditionally clears kswapd_failures, which can
prevent kswapd from sleeping and cause 100% kswapd cpu usage even when
kswapd repeatedly fails to make progress in reclaim.

Only clear kswap_failures in lru_gen_shrink_node() if reclaim makes some
progress, similar to shrink_node().

I happened to run into this problem in one of my tests recently.  It
requires a combination of several conditions: The allocator needs to
allocate a right amount of pages such that it can wake up kswapd
without itself being OOM killed; there is no memory for kswapd to
reclaim (My test disables swap and cleans page cache first); no other
process frees enough memory at the same time.

Bug: 254441685
Link: https://lkml.kernel.org/r/20241014221211.832591-1-weixugc@google.com
Fixes: e4dde56cd2 ("mm: multi-gen LRU: per-node lru_gen_folio lists")
Signed-off-by: Wei Xu <weixugc@google.com>
Cc: Axel Rasmussen <axelrasmussen@google.com>
Cc: Brian Geffon <bgeffon@google.com>
Cc: Jan Alexander Steffens <heftig@archlinux.org>
Cc: Suleiman Souhlal <suleiman@google.com>
Cc: Yu Zhao <yuzhao@google.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
(cherry picked from commit b130ba4a6259f6b64d8af15e9e7ab1e912bcb7ad)
Signed-off-by: Lee Jones <joneslee@google.com>
Change-Id: Ia2b4a0d71096d1e6cd0ee6054df3544724d4b665
This commit is contained in:
Wei Xu
2024-10-14 22:12:11 +00:00
committed by Treehugger Robot
parent 98cb57aeb3
commit 34f1eb9985

View File

@@ -5654,8 +5654,8 @@ static void lru_gen_shrink_node(struct pglist_data *pgdat, struct scan_control *
blk_finish_plug(&plug);
done:
/* kswapd should never fail */
pgdat->kswapd_failures = 0;
if (sc->nr_reclaimed > reclaimed)
pgdat->kswapd_failures = 0;
}
/******************************************************************************