linux

mirror of https://github.com/hardkernel/linux.git synced 2026-06-07 19:30:30 +09:00

Author	SHA1	Message	Date
Andrew Davis	bc9ca01ef8	ARM: dts: dra7: Fix DRA7 L3 NoC node register size [ Upstream commit 1e5caee2ba8f1426e8098afb4ca38dc40a0ca71b ] This node can access any part of the L3 configuration registers space, including CLK1 and CLK2 which are 0x800000 offset. Restore this area size to include these areas. Fixes: `7f2659ce65` ("ARM: dts: Move dra7 l3 noc to a separate node") Signed-off-by: Andrew Davis <afd@ti.com> Message-ID: <20231113181604.546444-1-afd@ti.com> Signed-off-by: Tony Lindgren <tony@atomide.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-01-01 12:42:25 +00:00
Chukun Pan	b352ebe373	arm64: dts: allwinner: h616: update emac for Orange Pi Zero 3 [ Upstream commit b9622937d95809ef89904583191571a9fa326402 ] The current emac setting is not suitable for Orange Pi Zero 3, move it back to Orange Pi Zero 2 DT. Also update phy mode and delay values for emac on Orange Pi Zero 3. With these changes, Ethernet now looks stable. Fixes: `322bf10320` ("arm64: dts: allwinner: h616: Split Orange Pi Zero 2 DT") Signed-off-by: Chukun Pan <amadeus@jmu.edu.cn> Reviewed-by: Jernej Skrabec <jernej.skrabec@gmail.com> Link: https://lore.kernel.org/r/20231029074009.7820-2-amadeus@jmu.edu.cn Signed-off-by: Jernej Skrabec <jernej.skrabec@gmail.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-01-01 12:42:25 +00:00
Benjamin Bigler	a8b655ac35	spi: spi-imx: correctly configure burst length when using dma [ Upstream commit e9b220aeacf109684cce36a94fc24ed37be92b05 ] If DMA is used, burst length should be set to the bus width of the DMA. Otherwise, the SPI hardware will transmit/receive one word per DMA request. Since this issue affects both transmission and reception, it cannot be detected with a loopback test. Replace magic numbers 512 and 0xfff with MX51_ECSPI_CTRL_MAX_BURST. Reported-by Stefan Bigler <linux@bigler.io> Signed-off-by: Benjamin Bigler <benjamin@bigler.one> Fixes: `15a6af94a2` ("spi: Increase imx51 ecspi burst length based on transfer length") Link: https://lore.kernel.org/r/8a415902c751cdbb4b20ce76569216ed@mail.infomaniak.com Link: https://lore.kernel.org/r/20231209222338.5564-1-benjamin@bigler.one Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-01-01 12:42:25 +00:00
Lingkai Dong	c9b26d9e43	drm: Fix FD ownership check in drm_master_check_perm() [ Upstream commit 5a6c9a05e55cb2972396cc991af9d74c8c15029a ] The DRM subsystem keeps a record of the owner of a DRM device file descriptor using thread group ID (TGID) instead of process ID (PID), to ensures all threads within the same userspace process are considered the owner. However, the DRM master ownership check compares the current thread's PID against the record, so the thread is incorrectly considered to be not the FD owner if the PID is not equal to the TGID. This causes DRM ioctls to be denied master privileges, even if the same thread that opened the FD performs an ioctl. Fix this by checking TGID. Fixes: `4230cea89c` ("drm: Track clients by tgid and not tid") Signed-off-by: Lingkai Dong <lingkai.dong@arm.com> Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: <stable@vger.kernel.org> # v6.4+ Link: https://patchwork.freedesktop.org/patch/msgid/PA6PR08MB107665920BE9A96658CDA04CE8884A@PA6PR08MB10766.eurprd08.prod.outlook.com Signed-off-by: Christian König <christian.koenig@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-01-01 12:42:25 +00:00
Tvrtko Ursulin	031ddd2800	drm: Update file owner during use [ Upstream commit 1c7a387ffef894b1ab3942f0482dac7a6e0a909c ] With the typical model where the display server opens the file descriptor and then hands it over to the client(), we were showing stale data in debugfs. Fix it by updating the drm_file->pid on ioctl access from a different process. The field is also made RCU protected to allow for lockless readers. Update side is protected with dev->filelist_mutex. Before: $ cat /sys/kernel/debug/dri/0/clients command pid dev master a uid magic Xorg 2344 0 y y 0 0 Xorg 2344 0 n y 0 2 Xorg 2344 0 n y 0 3 Xorg 2344 0 n y 0 4 After: $ cat /sys/kernel/debug/dri/0/clients command tgid dev master a uid magic Xorg 830 0 y y 0 0 xfce4-session 880 0 n y 0 1 xfwm4 943 0 n y 0 2 neverball 1095 0 n y 0 3 ) More detailed and historically accurate description of various handover implementation kindly provided by Emil Velikov: """ The traditional model, the server was the orchestrator managing the primary device node. From the fd, to the master status and authentication. But looking at the fd alone, this has varied across the years. IIRC in the DRI1 days, Xorg (libdrm really) would have a list of open fd(s) and reuse those whenever needed, DRI2 the client was responsible for open() themselves and with DRI3 the fd was passed to the client. Around the inception of DRI3 and systemd-logind, the latter became another possible orchestrator. Whereby Xorg and Wayland compositors could ask it for the fd. For various reasons (hysterical and genuine ones) Xorg has a fallback path going the open(), whereas Wayland compositors are moving to solely relying on logind... some never had fallback even. Over the past few years, more projects have emerged which provide functionality similar (be that on API level, Dbus, or otherwise) to systemd-logind. """ v2: * Fixed typo in commit text and added a fine historical explanation from Emil. Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: "Christian König" <christian.koenig@amd.com> Cc: Daniel Vetter <daniel@ffwll.ch> Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Rob Clark <robdclark@gmail.com> Tested-by: Rob Clark <robdclark@gmail.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230621094824.2348732-1-tvrtko.ursulin@linux.intel.com Signed-off-by: Christian König <christian.koenig@amd.com> Stable-dep-of: 5a6c9a05e55c ("drm: Fix FD ownership check in drm_master_check_perm()") Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-01-01 12:42:25 +00:00
Jani Nikula	a2abe532ec	drm/i915/edp: don't write to DP_LINK_BW_SET when using rate select [ Upstream commit e6861d8264cd43c5eb20196e53df36fd71ec5698 ] The eDP 1.5 spec adds a clarification for eDP 1.4x: > For eDP v1.4x, if the Source device chooses the Main-Link rate by way > of DPCD 00100h, the Sink device shall ignore DPCD 00115h[2:0]. We write 0 to DP_LINK_BW_SET (DPCD 100h) even when using DP_LINK_RATE_SET (DPCD 114h). Stop doing that, as it can cause the panel to ignore the rate set method. Moreover, 0 is a reserved value for DP_LINK_BW_SET, and should not be used. v2: Improve the comments (Ville) Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/9081 Tested-by: Animesh Manna <animesh.manna@intel.com> Reviewed-by: Uma Shankar <uma.shankar@intel.com> Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Jani Nikula <jani.nikula@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20231205180551.2476228-1-jani.nikula@intel.com (cherry picked from commit 23b392b94acb0499f69706c5808c099f590ebcf4) Cc: stable@vger.kernel.org Signed-off-by: Jani Nikula <jani.nikula@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-01-01 12:42:25 +00:00
Ville Syrjälä	e4c16db9da	drm/i915: Introduce crtc_state->enhanced_framing [ Upstream commit 3072a24c778a7102d70692af5556e47363114c67 ] Track DP enhanced framing properly in the crtc state instead of relying just on the cached DPCD everywhere, and hook it up into the state check and dump. v2: Actually set enhanced_framing in .compute_config() Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230503113659.16305-1-ville.syrjala@linux.intel.com Reviewed-by: Luca Coelho <luciano.coelho@intel.com> Stable-dep-of: e6861d8264cd ("drm/i915/edp: don't write to DP_LINK_BW_SET when using rate select") Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-01-01 12:42:24 +00:00
Ville Syrjälä	1f17934815	drm/i915: Fix FEC state dump [ Upstream commit 3dfeb80b308882cc6e1f5f6c36fd9a7f4cae5fc6 ] Stop dumping state while reading it out. We have a proper place for that stuff. Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230502143906.2401-7-ville.syrjala@linux.intel.com Reviewed-by: Luca Coelho <luciano.coelho@intel.com> Stable-dep-of: e6861d8264cd ("drm/i915/edp: don't write to DP_LINK_BW_SET when using rate select") Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-01-01 12:42:24 +00:00
Hamza Mahfooz	cd8b639700	drm/amd/display: fix hw rotated modes when PSR-SU is enabled [ Upstream commit f528ee145bd0076cd0ed7e7b2d435893e6329e98 ] We currently don't support dirty rectangles on hardware rotated modes. So, if a user is using hardware rotated modes with PSR-SU enabled, use PSR-SU FFU for all rotated planes (including cursor planes). Cc: stable@vger.kernel.org Fixes: `30ebe41582` ("drm/amd/display: add FB_DAMAGE_CLIPS support") Reported-by: Kai-Heng Feng <kai.heng.feng@canonical.com> Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/2952 Tested-by: Kai-Heng Feng <kai.heng.feng@canonical.com> Tested-by: Bin Li <binli@gnome.org> Reviewed-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Hamza Mahfooz <hamza.mahfooz@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-01-01 12:42:24 +00:00
Boris Burkov	624bc6f62c	btrfs: free qgroup pertrans reserve on transaction abort [ Upstream commit b321a52cce062ec7ed385333a33905d22159ce36 ] If we abort a transaction, we never run the code that frees the pertrans qgroup reservation. This results in warnings on unmount as that reservation has been leaked. The leak isn't a huge issue since the fs is read-only, but it's better to clean it up when we know we can/should. Do it during the cleanup_transaction step of aborting. CC: stable@vger.kernel.org # 5.15+ Reviewed-by: Qu Wenruo <wqu@suse.com> Signed-off-by: Boris Burkov <boris@bur.io> Signed-off-by: David Sterba <dsterba@suse.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-01-01 12:42:24 +00:00
Qu Wenruo	1a80999ba3	btrfs: qgroup: use qgroup_iterator in qgroup_convert_meta() [ Upstream commit 0913445082496c2b29668ee26521401b273838b8 ] With the new qgroup_iterator_add() and qgroup_iterator_clean(), we can get rid of the ulist and its GFP_ATOMIC memory allocation. Reviewed-by: Boris Burkov <boris@bur.io> Signed-off-by: Qu Wenruo <wqu@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com> Stable-dep-of: b321a52cce06 ("btrfs: free qgroup pertrans reserve on transaction abort") Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-01-01 12:42:24 +00:00
Qu Wenruo	1c9a5c4950	btrfs: qgroup: iterate qgroups without memory allocation for qgroup_reserve() [ Upstream commit 686c4a5a42635e0d2889e3eb461c554fd0b616b4 ] Qgroup heavily relies on ulist to go through all the involved qgroups, but since we're using ulist inside fs_info->qgroup_lock spinlock, this means we're doing a lot of GFP_ATOMIC allocations. This patch reduces the GFP_ATOMIC usage for qgroup_reserve() by eliminating the memory allocation completely. This is done by moving the needed memory to btrfs_qgroup::iterator list_head, so that we can put all the involved qgroup into a on-stack list, thus eliminating the need to allocate memory while holding spinlock. The only cost is the slightly higher memory usage, but considering the reduce GFP_ATOMIC during a hot path, it should still be acceptable. Function qgroup_reserve() is the perfect start point for this conversion. Reviewed-by: Boris Burkov <boris@bur.io> Signed-off-by: Qu Wenruo <wqu@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com> Stable-dep-of: b321a52cce06 ("btrfs: free qgroup pertrans reserve on transaction abort") Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-01-01 12:42:24 +00:00
SeongJae Park	e93bcaebda	mm/damon/core: make damon_start() waits until kdamond_fn() starts [ Upstream commit 6376a824595607e99d032a39ba3394988b4fce96 ] The cleanup tasks of kdamond threads including reset of corresponding DAMON context's ->kdamond field and decrease of global nr_running_ctxs counter is supposed to be executed by kdamond_fn(). However, commit `0f91d13366` ("mm/damon: simplify stop mechanism") made neither damon_start() nor damon_stop() ensure the corresponding kdamond has started the execution of kdamond_fn(). As a result, the cleanup can be skipped if damon_stop() is called fast enough after the previous damon_start(). Especially the skipped reset of ->kdamond could cause a use-after-free. Fix it by waiting for start of kdamond_fn() execution from damon_start(). Link: https://lkml.kernel.org/r/20231208175018.63880-1-sj@kernel.org Fixes: `0f91d13366` ("mm/damon: simplify stop mechanism") Signed-off-by: SeongJae Park <sj@kernel.org> Reported-by: Jakub Acs <acsjakub@amazon.de> Cc: Changbin Du <changbin.du@intel.com> Cc: Jakub Acs <acsjakub@amazon.de> Cc: <stable@vger.kernel.org> # 5.15.x Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-01-01 12:42:23 +00:00
SeongJae Park	c708a5e51b	mm/damon/core: use number of passed access sampling as a timer [ Upstream commit 4472edf63d6630e6cf65e205b4fc8c3c94d0afe5 ] DAMON sleeps for sampling interval after each sampling, and check if the aggregation interval and the ops update interval have passed using ktime_get_coarse_ts64() and baseline timestamps for the intervals. That design is for making the operations occur at deterministic timing regardless of the time that spend for each work. However, it turned out it is not that useful, and incur not-that-intuitive results. After all, timer functions, and especially sleep functions that DAMON uses to wait for specific timing, are not necessarily strictly accurate. It is legal design, so no problem. However, depending on such inaccuracies, the nr_accesses can be larger than aggregation interval divided by sampling interval. For example, with the default setting (5 ms sampling interval and 100 ms aggregation interval) we frequently show regions having nr_accesses larger than 20. Also, if the execution of a DAMOS scheme takes a long time, next aggregation could happen before enough number of samples are collected. This is not what usual users would intuitively expect. Since access check sampling is the smallest unit work of DAMON, using the number of passed sampling intervals as the DAMON-internal timer can easily avoid these problems. That is, convert aggregation and ops update intervals to numbers of sampling intervals that need to be passed before those operations be executed, count the number of passed sampling intervals, and invoke the operations as soon as the specific amount of sampling intervals passed. Make the change. Note that this could make a behavioral change to settings that using intervals that not aligned by the sampling interval. For example, if the sampling interval is 5 ms and the aggregation interval is 12 ms, DAMON effectively uses 15 ms as its aggregation interval, because it checks whether the aggregation interval after sleeping the sampling interval. This change will make DAMON to effectively use 10 ms as aggregation interval, since it uses 'aggregation interval / sampling interval * sampling interval' as the effective aggregation interval, and we don't use floating point types. Usual users would have used aligned intervals, so this behavioral change is not expected to make any meaningful impact, so just make this change. Link: https://lkml.kernel.org/r/20230914021523.60649-1-sj@kernel.org Signed-off-by: SeongJae Park <sj@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Stable-dep-of: 6376a8245956 ("mm/damon/core: make damon_start() waits until kdamond_fn() starts") Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-01-01 12:42:23 +00:00
Jiri Olsa	f64b2dc8a4	bpf: Fix prog_array_map_poke_run map poke update commit 4b7de801606e504e69689df71475d27e35336fb3 upstream. Lee pointed out issue found by syscaller [0] hitting BUG in prog array map poke update in prog_array_map_poke_run function due to error value returned from bpf_arch_text_poke function. There's race window where bpf_arch_text_poke can fail due to missing bpf program kallsym symbols, which is accounted for with check for -EINVAL in that BUG_ON call. The problem is that in such case we won't update the tail call jump and cause imbalance for the next tail call update check which will fail with -EBUSY in bpf_arch_text_poke. I'm hitting following race during the program load: CPU 0 CPU 1 bpf_prog_load bpf_check do_misc_fixups prog_array_map_poke_track map_update_elem bpf_fd_array_map_update_elem prog_array_map_poke_run bpf_arch_text_poke returns -EINVAL bpf_prog_kallsyms_add After bpf_arch_text_poke (CPU 1) fails to update the tail call jump, the next poke update fails on expected jump instruction check in bpf_arch_text_poke with -EBUSY and triggers the BUG_ON in prog_array_map_poke_run. Similar race exists on the program unload. Fixing this by moving the update to bpf_arch_poke_desc_update function which makes sure we call __bpf_arch_text_poke that skips the bpf address check. Each architecture has slightly different approach wrt looking up bpf address in bpf_arch_text_poke, so instead of splitting the function or adding new 'checkip' argument in previous version, it seems best to move the whole map_poke_run update as arch specific code. [0] https://syzkaller.appspot.com/bug?extid=97a4fe20470e9bc30810 Fixes: `ebf7d1f508` ("bpf, x64: rework pro/epilogue and tailcall handling in JIT") Reported-by: syzbot+97a4fe20470e9bc30810@syzkaller.appspotmail.com Signed-off-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Yonghong Song <yonghong.song@linux.dev> Cc: Lee Jones <lee@kernel.org> Cc: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Link: https://lore.kernel.org/bpf/20231206083041.1306660-2-jolsa@kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2024-01-01 12:42:23 +00:00
Greg Kroah-Hartman	4c9646a796	Linux 6.6.8 Link: https://lore.kernel.org/r/20231218135104.927894164@linuxfoundation.org Tested-by: Ronald Warsow <rwarsow@gmx.de> Tested-by: SeongJae Park <sj@kernel.org> Tested-by: Conor Dooley <conor.dooley@microchip.com> Tested-by: Ricardo B. Marliere <ricardo@marliere.net> Tested-by: Salvatore Bonaccorso <carnil@debian.org> Tested-by: Takeshi Ogasawara <takeshi.ogasawara@futuring-girl.com> Tested-by: Justin M. Forbes <jforbes@fedoraproject.org> Tested-by: Shuah Khan <skhan@linuxfoundation.org> Tested-by: Kelsey Steele <kelseysteele@linux.microsoft.com> Tested-by: Bagas Sanjaya <bagasdotme@gmail.com> Tested-by: Linux Kernel Functional Testing <lkft@linaro.org> Tested-by: Ron Economos <re@w6rz.net> Tested-by: Florian Fainelli <florian.fainelli@broadcom.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-12-20 17:02:07 +01:00
Patrisious Haddad	885faf3c7e	RDMA/mlx5: Change the key being sent for MPV device affiliation commit 02e7d139e5e24abb5fde91934fc9dc0344ac1926 upstream. Change the key that we send from IB driver to EN driver regarding the MPV device affiliation, since at that stage the IB device is not yet initialized, so its index would be zero for different IB devices and cause wrong associations between unrelated master and slave devices. Instead use a unique value from inside the core device which is already initialized at this stage. Fixes: 0d293714ac32 ("RDMA/mlx5: Send events from IB driver about device affiliation state") Signed-off-by: Patrisious Haddad <phaddad@nvidia.com> Link: https://lore.kernel.org/r/ac7e66357d963fc68d7a419515180212c96d137d.1697705185.git.leon@kernel.org Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-12-20 17:02:06 +01:00
Fangrui Song	06f61af802	x86/speculation, objtool: Use absolute relocations for annotations commit b8ec60e1186cdcfce41e7db4c827cb107e459002 upstream. .discard.retpoline_safe sections do not have the SHF_ALLOC flag. These sections referencing text sections' STT_SECTION symbols with PC-relative relocations like R_386_PC32 [0] is conceptually not suitable. Newer LLD will report warnings for REL relocations even for relocatable links [1]: ld.lld: warning: vmlinux.a(drivers/i2c/busses/i2c-i801.o):(.discard.retpoline_safe+0x120): has non-ABS relocation R_386_PC32 against symbol '' Switch to absolute relocations instead, which indicate link-time addresses. In a relocatable link, these addresses are also output section offsets, used by checks in tools/objtool/check.c. When linking vmlinux, these .discard.* sections will be discarded, therefore it is not a problem that R_X86_64_32 cannot represent a kernel address. Alternatively, we could set the SHF_ALLOC flag for .discard.* sections, but I think non-SHF_ALLOC for sections to be discarded makes more sense. Note: if we decide to never support REL architectures (e.g. arm, i386), we can utilize R__NONE relocations (.reloc ., BFD_RELOC_NONE, sym), making .discard. sections zero-sized. That said, the section content waste is 4 bytes per entry, much smaller than sizeof(Elf{32,64}_Rel). [0] commit `1c0c1faf56` ("objtool: Use relative pointers for annotations") [1] https://github.com/ClangBuiltLinux/linux/issues/1937 Signed-off-by: Fangrui Song <maskray@google.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Link: https://lore.kernel.org/r/20230920001728.1439947-1-maskray@google.com Cc: Nathan Chancellor <nathan@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-12-20 17:02:06 +01:00
Steven Rostedt (Google)	3432f9686a	ring-buffer: Have rb_time_cmpxchg() set the msb counter too commit 0aa0e5289cfe984a8a9fdd79ccf46ccf080151f7 upstream. The rb_time_cmpxchg() on 32-bit architectures requires setting three 32-bit words to represent the 64-bit timestamp, with some salt for synchronization. Those are: msb, top, and bottom The issue is, the rb_time_cmpxchg() did not properly salt the msb portion, and the msb that was written was stale. Link: https://lore.kernel.org/linux-trace-kernel/20231215084114.20899342@rorschach.local.home Cc: stable@vger.kernel.org Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Fixes: `f03f2abce4` ("ring-buffer: Have 32 bit time stamps use all 64 bits") Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-12-20 17:02:06 +01:00
Steven Rostedt (Google)	b3778a2fa4	ring-buffer: Do not try to put back write_stamp commit dd939425707898da992e59ab0fcfae4652546910 upstream. If an update to an event is interrupted by another event between the time the initial event allocated its buffer and where it wrote to the write_stamp, the code try to reset the write stamp back to the what it had just overwritten. It knows that it was overwritten via checking the before_stamp, and if it didn't match what it wrote to the before_stamp before it allocated its space, it knows it was overwritten. To put back the write_stamp, it uses the before_stamp it read. The problem here is that by writing the before_stamp to the write_stamp it makes the two equal again, which means that the write_stamp can be considered valid as the last timestamp written to the ring buffer. But this is not necessarily true. The event that interrupted the event could have been interrupted in a way that it was interrupted as well, and can end up leaving with an invalid write_stamp. But if this happens and returns to this context that uses the before_stamp to update the write_stamp again, it can possibly incorrectly make it valid, causing later events to have in correct time stamps. As it is OK to leave this function with an invalid write_stamp (one that doesn't match the before_stamp), there's no reason to try to make it valid again in this case. If this race happens, then just leave with the invalid write_stamp and the next event to come along will just add a absolute timestamp and validate everything again. Bonus points: This gets rid of another cmpxchg64! Link: https://lore.kernel.org/linux-trace-kernel/20231214222921.193037a7@gandalf.local.home Cc: stable@vger.kernel.org Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Joel Fernandes <joel@joelfernandes.org> Cc: Vincent Donnefort <vdonnefort@google.com> Fixes: `a389d86f7f` ("ring-buffer: Have nested events still record running time stamp") Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-12-20 17:02:06 +01:00
Steven Rostedt (Google)	bc17bc9643	ring-buffer: Fix a race in rb_time_cmpxchg() for 32 bit archs commit fff88fa0fbc7067ba46dde570912d63da42c59a9 upstream. Mathieu Desnoyers pointed out an issue in the rb_time_cmpxchg() for 32 bit architectures. That is: static bool rb_time_cmpxchg(rb_time_t t, u64 expect, u64 set) { unsigned long cnt, top, bottom, msb; unsigned long cnt2, top2, bottom2, msb2; u64 val; / The cmpxchg always fails if it interrupted an update / if (!__rb_time_read(t, &val, &cnt2)) return false; if (val != expect) return false; <<<< interrupted here! cnt = local_read(&t->cnt); The problem is that the synchronization counter in the rb_time_t is read after* the value of the timestamp is read. That means if an interrupt were to come in between the value being read and the counter being read, it can change the value and the counter and the interrupted process would be clueless about it! The counter needs to be read first and then the value. That way it is easy to tell if the value is stale or not. If the counter hasn't been updated, then the value is still good. Link: https://lore.kernel.org/linux-trace-kernel/20231211201324.652870-1-mathieu.desnoyers@efficios.com/ Link: https://lore.kernel.org/linux-trace-kernel/20231212115301.7a9c9a64@gandalf.local.home Cc: stable@vger.kernel.org Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Fixes: `10464b4aa6` ("ring-buffer: Add rb_time_t 64 bit operations for speeding up 32 bit") Reported-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Reviewed-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-12-20 17:02:06 +01:00
Steven Rostedt (Google)	ae76d9bdf1	ring-buffer: Fix writing to the buffer with max_data_size commit b3ae7b67b87fed771fa5bf95389df06b0433603e upstream. The maximum ring buffer data size is the maximum size of data that can be recorded on the ring buffer. Events must be smaller than the sub buffer data size minus any meta data. This size is checked before trying to allocate from the ring buffer because the allocation assumes that the size will fit on the sub buffer. The maximum size was calculated as the size of a sub buffer page (which is currently PAGE_SIZE minus the sub buffer header) minus the size of the meta data of an individual event. But it missed the possible adding of a time stamp for events that are added long enough apart that the event meta data can't hold the time delta. When an event is added that is greater than the current BUF_MAX_DATA_SIZE minus the size of a time stamp, but still less than or equal to BUF_MAX_DATA_SIZE, the ring buffer would go into an infinite loop, looking for a page that can hold the event. Luckily, there's a check for this loop and after 1000 iterations and a warning is emitted and the ring buffer is disabled. But this should never happen. This can happen when a large event is added first, or after a long period where an absolute timestamp is prefixed to the event, increasing its size by 8 bytes. This passes the check and then goes into the algorithm that causes the infinite loop. For events that are the first event on the sub-buffer, it does not need to add a timestamp, because the sub-buffer itself contains an absolute timestamp, and adding one is redundant. The fix is to check if the event is to be the first event on the sub-buffer, and if it is, then do not add a timestamp. This also fixes 32 bit adding a timestamp when a read of before_stamp or write_stamp is interrupted. There's still no need to add that timestamp if the event is going to be the first event on the sub buffer. Also, if the buffer has "time_stamp_abs" set, then also check if the length plus the timestamp is greater than the BUF_MAX_DATA_SIZE. Link: https://lore.kernel.org/all/20231212104549.58863438@gandalf.local.home/ Link: https://lore.kernel.org/linux-trace-kernel/20231212071837.5fdd6c13@gandalf.local.home Link: https://lore.kernel.org/linux-trace-kernel/20231212111617.39e02849@gandalf.local.home Cc: stable@vger.kernel.org Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Fixes: `a4543a2fa9` ("ring-buffer: Get timestamp after event is allocated") Fixes: `58fbc3c632` ("ring-buffer: Consolidate add_timestamp to remove some branches") Reported-by: Kent Overstreet <kent.overstreet@linux.dev> # (on IRC) Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-12-20 17:02:06 +01:00
Steven Rostedt (Google)	307ed139d7	ring-buffer: Have saved event hold the entire event commit b049525855fdd0024881c9b14b8fbec61c3f53d3 upstream. For the ring buffer iterator (non-consuming read), the event needs to be copied into the iterator buffer to make sure that a writer does not overwrite it while the user is reading it. If a write happens during the copy, the buffer is simply discarded. But the temp buffer itself was not big enough. The allocation of the buffer was only BUF_MAX_DATA_SIZE, which is the maximum data size that can be passed into the ring buffer and saved. But the temp buffer needs to hold the meta data as well. That would be BUF_PAGE_SIZE and not BUF_MAX_DATA_SIZE. Link: https://lore.kernel.org/linux-trace-kernel/20231212072558.61f76493@gandalf.local.home Cc: stable@vger.kernel.org Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Fixes: `785888c544` ("ring-buffer: Have rb_iter_head_event() handle concurrent writer") Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-12-20 17:02:06 +01:00
Steven Rostedt (Google)	5e58483677	ring-buffer: Do not update before stamp when switching sub-buffers commit 9e45e39dc249c970d99d2681f6bcb55736fd725c upstream. The ring buffer timestamps are synchronized by two timestamp placeholders. One is the "before_stamp" and the other is the "write_stamp" (sometimes referred to as the "after stamp" but only in the comments. These two stamps are key to knowing how to handle nested events coming in with a lockless system. When moving across sub-buffers, the before stamp is updated but the write stamp is not. There's an effort to put back the before stamp to something that seems logical in case there's nested events. But as the current event is about to cross sub-buffers, and so will any new nested event that happens, updating the before stamp is useless, and could even introduce new race conditions. The first event on a sub-buffer simply uses the sub-buffer's timestamp and keeps a "delta" of zero. The "before_stamp" and "write_stamp" are not used in the algorithm in this case. There's no reason to try to fix the before_stamp when this happens. As a bonus, it removes a cmpxchg() when crossing sub-buffers! Link: https://lore.kernel.org/linux-trace-kernel/20231211114420.36dde01b@gandalf.local.home Cc: stable@vger.kernel.org Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Fixes: `a389d86f7f` ("ring-buffer: Have nested events still record running time stamp") Reviewed-by: Masami Hiramatsu (Google) <mhiramat@kernel.org> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-12-20 17:02:05 +01:00
Steven Rostedt (Google)	5062b8c5ae	tracing: Update snapshot buffer on resize if it is allocated commit d06aff1cb13d2a0d52b48e605462518149c98c81 upstream. The snapshot buffer is to mimic the main buffer so that when a snapshot is needed, the snapshot and main buffer are swapped. When the snapshot buffer is allocated, it is set to the minimal size that the ring buffer may be at and still functional. When it is allocated it becomes the same size as the main ring buffer, and when the main ring buffer changes in size, it should do. Currently, the resize only updates the snapshot buffer if it's used by the current tracer (ie. the preemptirqsoff tracer). But it needs to be updated anytime it is allocated. When changing the size of the main buffer, instead of looking to see if the current tracer is utilizing the snapshot buffer, just check if it is allocated to know if it should be updated or not. Also fix typo in comment just above the code change. Link: https://lore.kernel.org/linux-trace-kernel/20231210225447.48476a6a@rorschach.local.home Cc: stable@vger.kernel.org Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Fixes: `ad909e21bb` ("tracing: Add internal tracing_snapshot() functions") Reviewed-by: Masami Hiramatsu (Google) <mhiramat@kernel.org> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-12-20 17:02:05 +01:00
Steven Rostedt (Google)	b02bf0d952	ring-buffer: Fix memory leak of free page commit 17d801758157bec93f26faaf5ff1a8b9a552d67a upstream. Reading the ring buffer does a swap of a sub-buffer within the ring buffer with a empty sub-buffer. This allows the reader to have full access to the content of the sub-buffer that was swapped out without having to worry about contention with the writer. The readers call ring_buffer_alloc_read_page() to allocate a page that will be used to swap with the ring buffer. When the code is finished with the reader page, it calls ring_buffer_free_read_page(). Instead of freeing the page, it stores it as a spare. Then next call to ring_buffer_alloc_read_page() will return this spare instead of calling into the memory management system to allocate a new page. Unfortunately, on freeing of the ring buffer, this spare page is not freed, and causes a memory leak. Link: https://lore.kernel.org/linux-trace-kernel/20231210221250.7b9cc83c@rorschach.local.home Cc: stable@vger.kernel.org Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Fixes: `73a757e631` ("ring-buffer: Return reader page back into existing ring buffer") Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-12-20 17:02:05 +01:00
Paulo Alcantara	e72ed491bc	smb: client: fix OOB in smb2_query_reparse_point() commit 3a42709fa909e22b0be4bb1e2795aa04ada732a3 upstream. Validate @ioctl_rsp->OutputOffset and @ioctl_rsp->OutputCount so that their sum does not wrap to a number that is smaller than @reparse_buf and we end up with a wild pointer as follows: BUG: unable to handle page fault for address: ffff88809c5cd45f #PF: supervisor read access in kernel mode #PF: error_code(0x0000) - not-present page PGD 4a01067 P4D 4a01067 PUD 0 Oops: 0000 [#1] PREEMPT SMP NOPTI CPU: 2 PID: 1260 Comm: mount.cifs Not tainted 6.7.0-rc4 #2 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.16.2-3-gd478f380-rebuilt.opensuse.org 04/01/2014 RIP: 0010:smb2_query_reparse_point+0x3e0/0x4c0 [cifs] Code: ff ff e8 f3 51 fe ff 41 89 c6 58 5a 45 85 f6 0f 85 14 fe ff ff 49 8b 57 48 8b 42 60 44 8b 42 64 42 8d 0c 00 49 39 4f 50 72 40 <8b> 04 02 48 8b 9d f0 fe ff ff 49 8b 57 50 89 03 48 8b 9d e8 fe ff RSP: 0018:ffffc90000347a90 EFLAGS: 00010212 RAX: 000000008000001f RBX: ffff88800ae11000 RCX: 00000000000000ec RDX: ffff88801c5cd440 RSI: 0000000000000000 RDI: ffffffff82004aa4 RBP: ffffc90000347bb0 R08: 00000000800000cd R09: 0000000000000001 R10: 0000000000000000 R11: 0000000000000024 R12: ffff8880114d4100 R13: ffff8880114d4198 R14: 0000000000000000 R15: ffff8880114d4000 FS: 00007f02c07babc0(0000) GS:ffff88806ba00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: ffff88809c5cd45f CR3: 0000000011750000 CR4: 0000000000750ef0 PKRU: 55555554 Call Trace: <TASK> ? __die+0x23/0x70 ? page_fault_oops+0x181/0x480 ? search_module_extables+0x19/0x60 ? srso_alias_return_thunk+0x5/0xfbef5 ? exc_page_fault+0x1b6/0x1c0 ? asm_exc_page_fault+0x26/0x30 ? _raw_spin_unlock_irqrestore+0x44/0x60 ? smb2_query_reparse_point+0x3e0/0x4c0 [cifs] cifs_get_fattr+0x16e/0xa50 [cifs] ? srso_alias_return_thunk+0x5/0xfbef5 ? lock_acquire+0xbf/0x2b0 cifs_root_iget+0x163/0x5f0 [cifs] cifs_smb3_do_mount+0x5bd/0x780 [cifs] smb3_get_tree+0xd9/0x290 [cifs] vfs_get_tree+0x2c/0x100 ? capable+0x37/0x70 path_mount+0x2d7/0xb80 ? srso_alias_return_thunk+0x5/0xfbef5 ? _raw_spin_unlock_irqrestore+0x44/0x60 __x64_sys_mount+0x11a/0x150 do_syscall_64+0x47/0xf0 entry_SYSCALL_64_after_hwframe+0x6f/0x77 RIP: 0033:0x7f02c08d5b1e Fixes: `2e4564b31b` ("smb3: add support for stat of WSL reparse points for special file types") Cc: stable@vger.kernel.org Reported-by: Robert Morris <rtm@csail.mit.edu> Signed-off-by: Paulo Alcantara (SUSE) <pc@manguebit.com> Signed-off-by: Steve French <stfrench@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-12-20 17:02:05 +01:00
Paulo Alcantara	ef748d4a62	smb: client: fix NULL deref in asn1_ber_decoder() commit 90d025c2e953c11974e76637977c473200593a46 upstream. If server replied SMB2_NEGOTIATE with a zero SecurityBufferOffset, smb2_get_data_area() sets @len to non-zero but return NULL, so decode_negTokeninit() ends up being called with a NULL @security_blob: BUG: kernel NULL pointer dereference, address: 0000000000000000 #PF: supervisor read access in kernel mode #PF: error_code(0x0000) - not-present page PGD 0 P4D 0 Oops: 0000 [#1] PREEMPT SMP NOPTI CPU: 2 PID: 871 Comm: mount.cifs Not tainted 6.7.0-rc4 #2 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.16.2-3-gd478f380-rebuilt.opensuse.org 04/01/2014 RIP: 0010:asn1_ber_decoder+0x173/0xc80 Code: 01 4c 39 2c 24 75 09 45 84 c9 0f 85 2f 03 00 00 48 8b 14 24 4c 29 ea 48 83 fa 01 0f 86 1e 07 00 00 48 8b 74 24 28 4d 8d 5d 01 <42> 0f b6 3c 2e 89 fa 40 88 7c 24 5c f7 d2 83 e2 1f 0f 84 3d 07 00 RSP: 0018:ffffc9000063f950 EFLAGS: 00010202 RAX: 0000000000000002 RBX: 0000000000000000 RCX: 000000000000004a RDX: 000000000000004a RSI: 0000000000000000 RDI: 0000000000000000 RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000002 R11: 0000000000000001 R12: 0000000000000000 R13: 0000000000000000 R14: 000000000000004d R15: 0000000000000000 FS: 00007fce52b0fbc0(0000) GS:ffff88806ba00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000000 CR3: 000000001ae64000 CR4: 0000000000750ef0 PKRU: 55555554 Call Trace: <TASK> ? __die+0x23/0x70 ? page_fault_oops+0x181/0x480 ? __stack_depot_save+0x1e6/0x480 ? exc_page_fault+0x6f/0x1c0 ? asm_exc_page_fault+0x26/0x30 ? asn1_ber_decoder+0x173/0xc80 ? check_object+0x40/0x340 decode_negTokenInit+0x1e/0x30 [cifs] SMB2_negotiate+0xc99/0x17c0 [cifs] ? smb2_negotiate+0x46/0x60 [cifs] ? srso_alias_return_thunk+0x5/0xfbef5 smb2_negotiate+0x46/0x60 [cifs] cifs_negotiate_protocol+0xae/0x130 [cifs] cifs_get_smb_ses+0x517/0x1040 [cifs] ? srso_alias_return_thunk+0x5/0xfbef5 ? srso_alias_return_thunk+0x5/0xfbef5 ? queue_delayed_work_on+0x5d/0x90 cifs_mount_get_session+0x78/0x200 [cifs] dfs_mount_share+0x13a/0x9f0 [cifs] ? srso_alias_return_thunk+0x5/0xfbef5 ? lock_acquire+0xbf/0x2b0 ? find_nls+0x16/0x80 ? srso_alias_return_thunk+0x5/0xfbef5 cifs_mount+0x7e/0x350 [cifs] cifs_smb3_do_mount+0x128/0x780 [cifs] smb3_get_tree+0xd9/0x290 [cifs] vfs_get_tree+0x2c/0x100 ? capable+0x37/0x70 path_mount+0x2d7/0xb80 ? srso_alias_return_thunk+0x5/0xfbef5 ? _raw_spin_unlock_irqrestore+0x44/0x60 __x64_sys_mount+0x11a/0x150 do_syscall_64+0x47/0xf0 entry_SYSCALL_64_after_hwframe+0x6f/0x77 RIP: 0033:0x7fce52c2ab1e Fix this by setting @len to zero when @off == 0 so callers won't attempt to dereference non-existing data areas. Reported-by: Robert Morris <rtm@csail.mit.edu> Cc: stable@vger.kernel.org Signed-off-by: Paulo Alcantara (SUSE) <pc@manguebit.com> Signed-off-by: Steve French <stfrench@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-12-20 17:02:05 +01:00
Paulo Alcantara	17a0f64cc0	smb: client: fix potential OOBs in smb2_parse_contexts() commit af1689a9b7701d9907dfc84d2a4b57c4bc907144 upstream. Validate offsets and lengths before dereferencing create contexts in smb2_parse_contexts(). This fixes following oops when accessing invalid create contexts from server: BUG: unable to handle page fault for address: ffff8881178d8cc3 #PF: supervisor read access in kernel mode #PF: error_code(0x0000) - not-present page PGD 4a01067 P4D 4a01067 PUD 0 Oops: 0000 [#1] PREEMPT SMP NOPTI CPU: 3 PID: 1736 Comm: mount.cifs Not tainted 6.7.0-rc4 #1 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.16.2-3-gd478f380-rebuilt.opensuse.org 04/01/2014 RIP: 0010:smb2_parse_contexts+0xa0/0x3a0 [cifs] Code: f8 10 75 13 48 b8 93 ad 25 50 9c b4 11 e7 49 39 06 0f 84 d2 00 00 00 8b 45 00 85 c0 74 61 41 29 c5 48 01 c5 41 83 fd 0f 76 55 <0f> b7 7d 04 0f b7 45 06 4c 8d 74 3d 00 66 83 f8 04 75 bc ba 04 00 RSP: 0018:ffffc900007939e0 EFLAGS: 00010216 RAX: ffffc90000793c78 RBX: ffff8880180cc000 RCX: ffffc90000793c90 RDX: ffffc90000793cc0 RSI: ffff8880178d8cc0 RDI: ffff8880180cc000 RBP: ffff8881178d8cbf R08: ffffc90000793c22 R09: 0000000000000000 R10: ffff8880180cc000 R11: 0000000000000024 R12: 0000000000000000 R13: 0000000000000020 R14: 0000000000000000 R15: ffffc90000793c22 FS: 00007f873753cbc0(0000) GS:ffff88806bc00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: ffff8881178d8cc3 CR3: 00000000181ca000 CR4: 0000000000750ef0 PKRU: 55555554 Call Trace: <TASK> ? __die+0x23/0x70 ? page_fault_oops+0x181/0x480 ? search_module_extables+0x19/0x60 ? srso_alias_return_thunk+0x5/0xfbef5 ? exc_page_fault+0x1b6/0x1c0 ? asm_exc_page_fault+0x26/0x30 ? smb2_parse_contexts+0xa0/0x3a0 [cifs] SMB2_open+0x38d/0x5f0 [cifs] ? smb2_is_path_accessible+0x138/0x260 [cifs] smb2_is_path_accessible+0x138/0x260 [cifs] cifs_is_path_remote+0x8d/0x230 [cifs] cifs_mount+0x7e/0x350 [cifs] cifs_smb3_do_mount+0x128/0x780 [cifs] smb3_get_tree+0xd9/0x290 [cifs] vfs_get_tree+0x2c/0x100 ? capable+0x37/0x70 path_mount+0x2d7/0xb80 ? srso_alias_return_thunk+0x5/0xfbef5 ? _raw_spin_unlock_irqrestore+0x44/0x60 __x64_sys_mount+0x11a/0x150 do_syscall_64+0x47/0xf0 entry_SYSCALL_64_after_hwframe+0x6f/0x77 RIP: 0033:0x7f8737657b1e Reported-by: Robert Morris <rtm@csail.mit.edu> Cc: stable@vger.kernel.org Signed-off-by: Paulo Alcantara (SUSE) <pc@manguebit.com> Signed-off-by: Steve French <stfrench@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-12-20 17:02:05 +01:00
Paulo Alcantara	534733397d	smb: client: fix OOB in receive_encrypted_standard() commit eec04ea119691e65227a97ce53c0da6b9b74b0b7 upstream. Fix potential OOB in receive_encrypted_standard() if server returned a large shdr->NextCommand that would end up writing off the end of @next_buffer. Fixes: `b24df3e30c` ("cifs: update receive_encrypted_standard to handle compounded responses") Cc: stable@vger.kernel.org Reported-by: Robert Morris <rtm@csail.mit.edu> Signed-off-by: Paulo Alcantara (SUSE) <pc@manguebit.com> Signed-off-by: Steve French <stfrench@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-12-20 17:02:05 +01:00
Ville Syrjälä	4029b025be	drm/i915: Fix remapped stride with CCS on ADL+ commit 0ccd963fe555451b1f84e6d14d2b3ef03dd5c947 upstream. On ADL+ the hardware automagically calculates the CCS AUX surface stride from the main surface stride, so when remapping we can't really play a lot of tricks with the main surface stride, or else the AUX surface stride would get miscalculated and no longer match the actual data layout in memory. Supposedly we could remap in 256 main surface tile units (AUX page(4096)/cachline(64)*4(4x1 main surface tiles per AUX cacheline)=256 main surface tiles), but the extra complexity is probably not worth the hassle. So let's just make sure our mapping stride is calculated from the full framebuffer stride (instead of the framebuffer width). This way the stride we program into PLANE_STRIDE will be the original framebuffer stride, and thus there will be no change to the AUX stride/layout. Cc: stable@vger.kernel.org Cc: Imre Deak <imre.deak@intel.com> Cc: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com> Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20231205180308.7505-1-ville.syrjala@linux.intel.com Reviewed-by: Imre Deak <imre.deak@intel.com> (cherry picked from commit 2c12eb36f849256f5eb00ffaee9bf99396fd3814) Signed-off-by: Jani Nikula <jani.nikula@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-12-20 17:02:04 +01:00
Ville Syrjälä	a9d951b007	drm/i915: Fix intel_atomic_setup_scalers() plane_state handling commit c3070f080f9ba18dea92eaa21730f7ab85b5c8f4 upstream. Since the plane_state variable is declared outside the scaler_users loop in intel_atomic_setup_scalers(), and it's never reset back to NULL inside the loop we may end up calling intel_atomic_setup_scaler() with a non-NULL plane state for the pipe scaling case. That is bad because intel_atomic_setup_scaler() determines whether we are doing plane scaling or pipe scaling based on plane_state!=NULL. The end result is that we may miscalculate the scaler mode for pipe scaling. The hardware becomes somewhat upset if we end up in this situation when scanning out a planar format on a SDR plane. We end up programming the pipe scaler into planar mode as well, and the result is a screenfull of garbage. Fix the situation by making sure we pass the correct plane_state==NULL when calculating the scaler mode for pipe scaling. Cc: stable@vger.kernel.org Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20231207193441.20206-2-ville.syrjala@linux.intel.com Reviewed-by: Jani Nikula <jani.nikula@intel.com> (cherry picked from commit e81144106e21271c619f0c722a09e27ccb8c043d) Signed-off-by: Jani Nikula <jani.nikula@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-12-20 17:02:04 +01:00
Ville Syrjälä	b6295a167f	drm/i915: Fix ADL+ tiled plane stride when the POT stride is smaller than the original commit 324b70e997aab0a7deab8cb90711faccda4e98c8 upstream. plane_view_scanout_stride() currently assumes that we had to pad the mapping stride with dummy pages in order to align it. But that is not the case if the original fb stride exceeds the aligned stride used to populate the remapped view, which is calculated from the user specified framebuffer width rather than the user specified framebuffer stride. Ignore the original fb stride in this case and just stick to the POT aligned stride. Getting this wrong will cause the plane to fetch the wrong data, and can lead to fault errors if the page tables at the bogus location aren't even populated. TODO: figure out if this is OK for CCS, or if we should instead increase the width of the view to cover the entire user specified fb stride instead... Cc: Imre Deak <imre.deak@intel.com> Cc: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com> Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20231204202443.31247-1-ville.syrjala@linux.intel.com Reviewed-by: Imre Deak <imre.deak@intel.com> Reviewed-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com> (cherry picked from commit 01a39f1c4f1220a4e6a25729fae87ff5794cbc52) Cc: stable@vger.kernel.org Signed-off-by: Jani Nikula <jani.nikula@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-12-20 17:02:04 +01:00
Mario Limonciello	a8f922ad2f	drm/amd/display: Disable PSR-SU on Parade 0803 TCON again commit e7ab758741672acb21c5d841a9f0309d30e48a06 upstream. When screen brightness is rapidly changed and PSR-SU is enabled the display hangs on panels with this TCON even on the latest DCN 3.1.4 microcode (0x8002a81 at this time). This was disabled previously as commit `072030b178` ("drm/amd: Disable PSR-SU on Parade 0803 TCON") but reverted as commit `1e66a17ce5` ("Revert "drm/amd: Disable PSR-SU on Parade 0803 TCON"") in favor of testing for a new enough microcode (commit `cd2e31a9ab` ("drm/amd/display: Set minimum requirement for using PSR-SU on Phoenix")). As hangs are still happening specifically with this TCON, disable PSR-SU again for it until it can be root caused. Cc: stable@vger.kernel.org Cc: aaron.ma@canonical.com Cc: binli@gnome.org Cc: Marc Rossi <Marc.Rossi@amd.com> Cc: Hamza Mahfooz <Hamza.Mahfooz@amd.com> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Link: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/2046131 Acked-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Harry Wentland <harry.wentland@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-12-20 17:02:04 +01:00
Mario Limonciello	c7f6e836e6	drm/amd/display: Restore guard against default backlight value < 1 nit commit b96ab339ee50470d13a1faa6ad94d2218a7cd49f upstream. Mark reports that brightness is not restored after Xorg dpms screen blank. This behavior was introduced by commit d9e865826c20 ("drm/amd/display: Simplify brightness initialization") which dropped the cached backlight value in display code, but also removed code for when the default value read back was less than 1 nit. Restore this code so that the backlight brightness is restored to the correct default value in this circumstance. Reported-by: Mark Herbert <mark.herbert42@gmail.com> Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3031 Cc: stable@vger.kernel.org Cc: Camille Cho <camille.cho@amd.com> Cc: Krunoslav Kovac <krunoslav.kovac@amd.com> Cc: Hamza Mahfooz <hamza.mahfooz@amd.com> Fixes: d9e865826c20 ("drm/amd/display: Simplify brightness initialization") Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-12-20 17:02:04 +01:00
Jani Nikula	a511e851d4	drm/edid: also call add modes in EDID connector update fallback commit 759f14e20891de72e676d9d738eb2c573aa15f52 upstream. When the separate add modes call was added back in commit `c533b5167c` ("drm/edid: add separate drm_edid_connector_add_modes()"), it failed to address drm_edid_override_connector_update(). Also call add modes there. Reported-by: bbaa <bbaa@bbaa.fun> Closes: https://lore.kernel.org/r/930E9B4C7D91FDFF+29b34d89-8658-4910-966a-c772f320ea03@bbaa.fun Fixes: `c533b5167c` ("drm/edid: add separate drm_edid_connector_add_modes()") Cc: <stable@vger.kernel.org> # v6.3+ Signed-off-by: Jani Nikula <jani.nikula@intel.com> Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20231207093821.2654267-1-jani.nikula@intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-12-20 17:02:04 +01:00
Christian König	d50670681d	drm/amdgpu: fix tear down order in amdgpu_vm_pt_free commit ceb9a321e7639700844aa3bf234a4e0884f13b77 upstream. When freeing PD/PT with shadows it can happen that the shadow destruction races with detaching the PD/PT from the VM causing a NULL pointer dereference in the invalidation code. Fix this by detaching the the PD/PT from the VM first and then freeing the shadow instead. Signed-off-by: Christian König <christian.koenig@amd.com> Fixes: https://gitlab.freedesktop.org/drm/amd/-/issues/2867 Cc: <stable@vger.kernel.org> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-12-20 17:02:04 +01:00
Boris Burkov	d3cf024353	btrfs: don't clear qgroup reserved bit in release_folio commit a86805504b88f636a6458520d85afdf0634e3c6b upstream. The EXTENT_QGROUP_RESERVED bit is used to "lock" regions of the file for duplicate reservations. That is two writes to that range in one transaction shouldn't create two reservations, as the reservation will only be freed once when the write finally goes down. Therefore, it is never OK to clear that bit without freeing the associated qgroup reserve. At this point, we don't want to be freeing the reserve, so mask off the bit. CC: stable@vger.kernel.org # 5.15+ Reviewed-by: Qu Wenruo <wqu@suse.com> Signed-off-by: Boris Burkov <boris@bur.io> Signed-off-by: David Sterba <dsterba@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-12-20 17:02:03 +01:00
Boris Burkov	1823491513	btrfs: fix qgroup_free_reserved_data int overflow commit 9e65bfca24cf1d77e4a5c7a170db5867377b3fe7 upstream. The reserved data counter and input parameter is a u64, but we inadvertently accumulate it in an int. Overflowing that int results in freeing the wrong amount of data and breaking reserve accounting. Unfortunately, this overflow rot spreads from there, as the qgroup release/free functions rely on returning an int to take advantage of negative values for error codes. Therefore, the full fix is to return the "released" or "freed" amount by a u64 argument and to return 0 or negative error code via the return value. Most of the call sites simply ignore the return value, though some of them handle the error and count the returned bytes. Change all of them accordingly. CC: stable@vger.kernel.org # 6.1+ Reviewed-by: Qu Wenruo <wqu@suse.com> Signed-off-by: Boris Burkov <boris@bur.io> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-12-20 17:02:03 +01:00
Boris Burkov	14570dfa17	btrfs: free qgroup reserve when ORDERED_IOERR is set commit f63e1164b90b385cd832ff0fdfcfa76c3cc15436 upstream. An ordered extent completing is a critical moment in qgroup reserve handling, as the ownership of the reservation is handed off from the ordered extent to the delayed ref. In the happy path we release (unlock) but do not free (decrement counter) the reservation, and the delayed ref drives the free. However, on an error, we don't create a delayed ref, since there is no ref to add. Therefore, free on the error path. CC: stable@vger.kernel.org # 6.1+ Reviewed-by: Qu Wenruo <wqu@suse.com> Signed-off-by: Boris Burkov <boris@bur.io> Signed-off-by: David Sterba <dsterba@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-12-20 17:02:03 +01:00
Ignat Korchagin	37b561d559	kexec: drop dependency on ARCH_SUPPORTS_KEXEC from CRASH_DUMP commit c41bd2514184d75db087fe4c1221237fb7922875 upstream. In commit f8ff23429c62 ("kernel/Kconfig.kexec: drop select of KEXEC for CRASH_DUMP") we tried to fix a config regression, where CONFIG_CRASH_DUMP required CONFIG_KEXEC. However, it was not enough at least for arm64 platforms. While further testing the patch with our arm64 config I noticed that CONFIG_CRASH_DUMP is unavailable in menuconfig. This is because CONFIG_CRASH_DUMP still depends on the new CONFIG_ARCH_SUPPORTS_KEXEC introduced in commit `91506f7e5d` ("arm64/kexec: refactor for kernel/Kconfig.kexec") and on arm64 CONFIG_ARCH_SUPPORTS_KEXEC requires CONFIG_PM_SLEEP_SMP=y, which in turn requires either CONFIG_SUSPEND=y or CONFIG_HIBERNATION=y neither of which are set in our config. Given that we already established that CONFIG_KEXEC (which is a switch for kexec system call itself) is not required for CONFIG_CRASH_DUMP drop CONFIG_ARCH_SUPPORTS_KEXEC dependency as well. The arm64 kernel builds just fine with CONFIG_CRASH_DUMP=y and with both CONFIG_KEXEC=n and CONFIG_KEXEC_FILE=n after f8ff23429c62 ("kernel/Kconfig.kexec: drop select of KEXEC for CRASH_DUMP") and this patch are applied given that the necessary shared bits are included via CONFIG_KEXEC_CORE dependency. [bhe@redhat.com: don't export some symbols when CONFIG_MMU=n] Link: https://lkml.kernel.org/r/ZW03ODUKGGhP1ZGU@MiWiFi-R3L-srv [bhe@redhat.com: riscv, kexec: fix dependency of two items] Link: https://lkml.kernel.org/r/ZW04G/SKnhbE5mnX@MiWiFi-R3L-srv Link: https://lkml.kernel.org/r/20231129220409.55006-1-ignat@cloudflare.com Fixes: `91506f7e5d` ("arm64/kexec: refactor for kernel/Kconfig.kexec") Signed-off-by: Ignat Korchagin <ignat@cloudflare.com> Signed-off-by: Baoquan He <bhe@redhat.com> Acked-by: Baoquan He <bhe@redhat.com> Cc: Alexander Gordeev <agordeev@linux.ibm.com> Cc: <stable@vger.kernel.org> # 6.6+: f8ff234: kernel/Kconfig.kexec: drop select of KEXEC for CRASH_DUMP Cc: <stable@vger.kernel.org> # 6.6+ Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-12-20 17:02:03 +01:00
David Stevens	7a4ae7acd2	mm/shmem: fix race in shmem_undo_range w/THP commit 55ac8bbe358bdd2f3c044c12f249fd22d48fe015 upstream. Split folios during the second loop of shmem_undo_range. It's not sufficient to only split folios when dealing with partial pages, since it's possible for a THP to be faulted in after that point. Calling truncate_inode_folio in that situation can result in throwing away data outside of the range being targeted. [akpm@linux-foundation.org: tidy up comment layout] Link: https://lkml.kernel.org/r/20230418084031.3439795-1-stevensd@google.com Fixes: `b9a8a4195c` ("truncate,shmem: Handle truncates that split large folios") Signed-off-by: David Stevens <stevensd@chromium.org> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Suleiman Souhlal <suleiman@google.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-12-20 17:02:03 +01:00
Yu Zhao	a107d6a132	mm/mglru: reclaim offlined memcgs harder commit 4376807bf2d5371c3e00080c972be568c3f8a7d1 upstream. In the effort to reduce zombie memcgs [1], it was discovered that the memcg LRU doesn't apply enough pressure on offlined memcgs. Specifically, instead of rotating them to the tail of the current generation (MEMCG_LRU_TAIL) for a second attempt, it moves them to the next generation (MEMCG_LRU_YOUNG) after the first attempt. Not applying enough pressure on offlined memcgs can cause them to build up, and this can be particularly harmful to memory-constrained systems. On Pixel 8 Pro, launching apps for 50 cycles: Before After Change Zombie memcgs 45 35 -22% [1] https://lore.kernel.org/CABdmKX2M6koq4Q0Cmp_-=wbP0Qa190HdEGGaHfxNS05gAkUtPA@mail.gmail.com/ Link: https://lkml.kernel.org/r/20231208061407.2125867-4-yuzhao@google.com Fixes: `e4dde56cd2` ("mm: multi-gen LRU: per-node lru_gen_folio lists") Signed-off-by: Yu Zhao <yuzhao@google.com> Reported-by: T.J. Mercier <tjmercier@google.com> Tested-by: T.J. Mercier <tjmercier@google.com> Cc: Charan Teja Kalla <quic_charante@quicinc.com> Cc: Hillf Danton <hdanton@sina.com> Cc: Jaroslav Pulchart <jaroslav.pulchart@gooddata.com> Cc: Kairui Song <ryncsn@gmail.com> Cc: Kalesh Singh <kaleshsingh@google.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-12-20 17:02:03 +01:00
Yu Zhao	6b131c2a28	mm/mglru: respect min_ttl_ms with memcgs commit 8aa420617918d12d1f5d55030a503c9418e73c2c upstream. While investigating kswapd "consuming 100% CPU" [1] (also see "mm/mglru: try to stop at high watermarks"), it was discovered that the memcg LRU can breach the thrashing protection imposed by min_ttl_ms. Before the memcg LRU: kswapd() shrink_node_memcgs() mem_cgroup_iter() inc_max_seq() // always hit a different memcg lru_gen_age_node() mem_cgroup_iter() check the timestamp of the oldest generation After the memcg LRU: kswapd() shrink_many() restart: iterate the memcg LRU: inc_max_seq() // occasionally hit the same memcg if raced with lru_gen_rotate_memcg(): goto restart lru_gen_age_node() mem_cgroup_iter() check the timestamp of the oldest generation Specifically, when the restart happens in shrink_many(), it needs to stick with the (memcg LRU) generation it began with. In other words, it should neither re-read memcg_lru->seq nor age an lruvec of a different generation. Otherwise it can hit the same memcg multiple times without giving lru_gen_age_node() a chance to check the timestamp of that memcg's oldest generation (against min_ttl_ms). [1] https://lore.kernel.org/CAK8fFZ4DY+GtBA40Pm7Nn5xCHy+51w3sfxPqkqpqakSXYyX+Wg@mail.gmail.com/ Link: https://lkml.kernel.org/r/20231208061407.2125867-3-yuzhao@google.com Fixes: `e4dde56cd2` ("mm: multi-gen LRU: per-node lru_gen_folio lists") Signed-off-by: Yu Zhao <yuzhao@google.com> Tested-by: T.J. Mercier <tjmercier@google.com> Cc: Charan Teja Kalla <quic_charante@quicinc.com> Cc: Hillf Danton <hdanton@sina.com> Cc: Jaroslav Pulchart <jaroslav.pulchart@gooddata.com> Cc: Kairui Song <ryncsn@gmail.com> Cc: Kalesh Singh <kaleshsingh@google.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-12-20 17:02:03 +01:00
Yu Zhao	c5f67b7e84	mm/mglru: try to stop at high watermarks commit 5095a2b23987d3c3c47dd16b3d4080e2733b8bb9 upstream. The initial MGLRU patchset didn't include the memcg LRU support, and it relied on should_abort_scan(), added by commit `f76c833788` ("mm: multi-gen LRU: optimize multiple memcgs"), to "backoff to avoid overshooting their aggregate reclaim target by too much". Later on when the memcg LRU was added, should_abort_scan() was deemed unnecessary, and the test results [1] showed no side effects after it was removed by commit `a579086c99` ("mm: multi-gen LRU: remove eviction fairness safeguard"). However, that test used memory.reclaim, which sets nr_to_reclaim to SWAP_CLUSTER_MAX. So it can overshoot only by SWAP_CLUSTER_MAX-1 pages, i.e., from nr_reclaimed=nr_to_reclaim-1 to nr_reclaimed=nr_to_reclaim+SWAP_CLUSTER_MAX-1. Compared with the batch size kswapd sets to nr_to_reclaim, SWAP_CLUSTER_MAX is tiny. Therefore that test isn't able to reproduce the worst case scenario, i.e., kswapd overshooting GBs on large systems and "consuming 100% CPU" (see the Closes tag). Bring back a simplified version of should_abort_scan() on top of the memcg LRU, so that kswapd stops when all eligible zones are above their respective high watermarks plus a small delta to lower the chance of KSWAPD_HIGH_WMARK_HIT_QUICKLY. Note that this only applies to order-0 reclaim, meaning compaction-induced reclaim can still run wild (which is a different problem). On Android, launching 55 apps sequentially: Before After Change pgpgin 838377172 802955040 -4% pgpgout 38037080 34336300 -10% [1] https://lore.kernel.org/20221222041905.2431096-1-yuzhao@google.com/ Link: https://lkml.kernel.org/r/20231208061407.2125867-2-yuzhao@google.com Fixes: `a579086c99` ("mm: multi-gen LRU: remove eviction fairness safeguard") Signed-off-by: Yu Zhao <yuzhao@google.com> Reported-by: Charan Teja Kalla <quic_charante@quicinc.com> Reported-by: Jaroslav Pulchart <jaroslav.pulchart@gooddata.com> Closes: https://lore.kernel.org/CAK8fFZ4DY+GtBA40Pm7Nn5xCHy+51w3sfxPqkqpqakSXYyX+Wg@mail.gmail.com/ Tested-by: Jaroslav Pulchart <jaroslav.pulchart@gooddata.com> Tested-by: Kalesh Singh <kaleshsingh@google.com> Cc: Hillf Danton <hdanton@sina.com> Cc: Kairui Song <ryncsn@gmail.com> Cc: T.J. Mercier <tjmercier@google.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-12-20 17:02:02 +01:00
Yu Zhao	b2ce691b45	mm/mglru: fix underprotected page cache commit 081488051d28d32569ebb7c7a23572778b2e7d57 upstream. Unmapped folios accessed through file descriptors can be underprotected. Those folios are added to the oldest generation based on: 1. The fact that they are less costly to reclaim (no need to walk the rmap and flush the TLB) and have less impact on performance (don't cause major PFs and can be non-blocking if needed again). 2. The observation that they are likely to be single-use. E.g., for client use cases like Android, its apps parse configuration files and store the data in heap (anon); for server use cases like MySQL, it reads from InnoDB files and holds the cached data for tables in buffer pools (anon). However, the oldest generation can be very short lived, and if so, it doesn't provide the PID controller with enough time to respond to a surge of refaults. (Note that the PID controller uses weighted refaults and those from evicted generations only take a half of the whole weight.) In other words, for a short lived generation, the moving average smooths out the spike quickly. To fix the problem: 1. For folios that are already on LRU, if they can be beyond the tracking range of tiers, i.e., five accesses through file descriptors, move them to the second oldest generation to give them more time to age. (Note that tiers are used by the PID controller to statistically determine whether folios accessed multiple times through file descriptors are worth protecting.) 2. When adding unmapped folios to LRU, adjust the placement of them so that they are not too close to the tail. The effect of this is similar to the above. On Android, launching 55 apps sequentially: Before After Change workingset_refault_anon 25641024 25598972 0% workingset_refault_file 115016834 106178438 -8% Link: https://lkml.kernel.org/r/20231208061407.2125867-1-yuzhao@google.com Fixes: `ac35a49023` ("mm: multi-gen LRU: minimal implementation") Signed-off-by: Yu Zhao <yuzhao@google.com> Reported-by: Charan Teja Kalla <quic_charante@quicinc.com> Tested-by: Kalesh Singh <kaleshsingh@google.com> Cc: T.J. Mercier <tjmercier@google.com> Cc: Kairui Song <ryncsn@gmail.com> Cc: Hillf Danton <hdanton@sina.com> Cc: Jaroslav Pulchart <jaroslav.pulchart@gooddata.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-12-20 17:02:02 +01:00
Frank Li	ed50e07d6a	dmaengine: fsl-edma: fix DMA channel leak in eDMAv4 commit 4ee632c82d2dbb9e2dcc816890ef182a151cbd99 upstream. Allocate channel count consistently increases due to a missing source ID (srcid) cleanup in the fsl_edma_free_chan_resources() function at imx93 eDMAv4. Reset 'srcid' at fsl_edma_free_chan_resources(). Cc: stable@vger.kernel.org Fixes: `72f5801a4e` ("dmaengine: fsl-edma: integrate v3 support") Signed-off-by: Frank Li <Frank.Li@nxp.com> Link: https://lore.kernel.org/r/20231127214325.2477247-1-Frank.Li@nxp.com Signed-off-by: Vinod Koul <vkoul@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-12-20 17:02:02 +01:00
Amelie Delaunay	9127515bf9	dmaengine: stm32-dma: avoid bitfield overflow assertion commit 54bed6bafa0f38daf9697af50e3aff5ff1354fe1 upstream. stm32_dma_get_burst() returns a negative error for invalid input, which gets turned into a large u32 value in stm32_dma_prep_dma_memcpy() that in turn triggers an assertion because it does not fit into a two-bit field: drivers/dma/stm32-dma.c: In function 'stm32_dma_prep_dma_memcpy': include/linux/compiler_types.h:354:38: error: call to '__compiletime_assert_282' declared with attribute error: FIELD_PREP: value too large for the field _compiletime_assert(condition, msg, __compiletime_assert_, __COUNTER__) ^ include/linux/compiler_types.h:335:4: note: in definition of macro '__compiletime_assert' prefix ## suffix(); \ ^~~~~~ include/linux/compiler_types.h:354:2: note: in expansion of macro '_compiletime_assert' _compiletime_assert(condition, msg, __compiletime_assert_, __COUNTER__) ^~~~~~~~~~~~~~~~~~~ include/linux/build_bug.h:39:37: note: in expansion of macro 'compiletime_assert' #define BUILD_BUG_ON_MSG(cond, msg) compiletime_assert(!(cond), msg) ^~~~~~~~~~~~~~~~~~ include/linux/bitfield.h:68:3: note: in expansion of macro 'BUILD_BUG_ON_MSG' BUILD_BUG_ON_MSG(__builtin_constant_p(_val) ? \ ^~~~~~~~~~~~~~~~ include/linux/bitfield.h:114:3: note: in expansion of macro '__BF_FIELD_CHECK' __BF_FIELD_CHECK(_mask, 0ULL, _val, "FIELD_PREP: "); \ ^~~~~~~~~~~~~~~~ drivers/dma/stm32-dma.c:1237:4: note: in expansion of macro 'FIELD_PREP' FIELD_PREP(STM32_DMA_SCR_PBURST_MASK, dma_burst) \| ^~~~~~~~~~ As an easy workaround, assume the error can happen, so try to handle this by failing stm32_dma_prep_dma_memcpy() before the assertion. It replicates what is done in stm32_dma_set_xfer_param() where stm32_dma_get_burst() is also used. Fixes: `1c32d6c37c` ("dmaengine: stm32-dma: use bitfield helpers") Fixes: `a2b6103b7a` ("dmaengine: stm32-dma: Improve memory burst management") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Amelie Delaunay <amelie.delaunay@foss.st.com> Cc: stable@vger.kernel.org Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202311060135.Q9eMnpCL-lkp@intel.com/ Link: https://lore.kernel.org/r/20231106134832.1470305-1-amelie.delaunay@foss.st.com Signed-off-by: Vinod Koul <vkoul@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-12-20 17:02:02 +01:00
Stuart Lee	03e63e497a	drm/mediatek: Fix access violation in mtk_drm_crtc_dma_dev_get commit b6961d187fcd138981b8707dac87b9fcdbfe75d1 upstream. Add error handling to check NULL input in mtk_drm_crtc_dma_dev_get function. While display path is not configured correctly, none of crtc is established. So the caller of mtk_drm_crtc_dma_dev_get may pass input parameter *crtc as NULL, Which may cause coredump when we try to get the container of NULL pointer. Fixes: `cb1d6bcca5` ("drm/mediatek: Add dma dev get function") Signed-off-by: Stuart Lee <stuart.lee@mediatek.com> Cc: stable@vger.kernel.org Reviewed-by: AngeloGioacchino DEl Regno <angelogioacchino.delregno@collabora.com> Tested-by: Macpaul Lin <macpaul.lin@mediatek.com> Link: https://patchwork.kernel.org/project/dri-devel/patch/20231110012914.14884-2-stuart.lee@mediatek.com/ Signed-off-by: Chun-Kuang Hu <chunkuang.hu@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-12-20 17:02:02 +01:00
Alex Deucher	3aae4ef4d7	drm/amdgpu/sdma5.2: add begin/end_use ring callbacks commit ab4750332dbe535243def5dcebc24ca00c1f98ac upstream. Add begin/end_use ring callbacks to disallow GFXOFF when SDMA work is submitted and allow it again afterward. This should avoid corner cases where GFXOFF is erroneously entered when SDMA is still active. For now just allow/disallow GFXOFF in the begin and end helpers until we root cause the issue. This should not impact power as SDMA usage is pretty minimal and GFXOSS should not be active when SDMA is active anyway, this just makes it explicit. v2: move everything into sdma5.2 code. No reason for this to be generic at this point. v3: Add comments in new code Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2220 Reviewed-by: Mario Limonciello <mario.limonciello@amd.com> (v1) Tested-by: Mario Limonciello <mario.limonciello@amd.com> (v1) Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org # 5.15+ Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-12-20 17:02:02 +01:00

1 2 3 4 5 ...

1219082 Commits