Commit Graph

1157751 Commits

Author SHA1 Message Date
Greg Kroah-Hartman
8968561242 ANDROID: fix crc error in put_cmsg caused in 6.1.68
In commit f2f57f51b5 ("io_uring/af_unix: disable sending io_uring over
sockets") a new .h file was added to the include list, which broke the
crc generation checks with the following error:

function symbol 'int put_cmsg(struct msghdr*, int, int, int, void*)' changed
  CRC changed from 0x31108fe3 to 0xd66fe827

Fix this by only including the .h file if the crc checker is not being
run.

Bug: 161946584
Fixes: f2f57f51b5 ("io_uring/af_unix: disable sending io_uring over sockets")
Change-Id: Ie7a6d5627f169a0fea3eac2b43024cff977b8360
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
2024-01-08 07:26:19 +00:00
Rick Yiu
ac90f08292 ANDROID: Update the ABI symbol list
Adding the following symbols:
  - sysctl_sched_idle_min_granularity
  - sysctl_sched_min_granularity

Bug: 316276520
Change-Id: I8e33c3105a3ca62d168a6289ceafc31404757453
Signed-off-by: Rick Yiu <rickyiu@google.com>
2024-01-05 18:06:32 +00:00
Rick Yiu
ef67750d99 ANDROID: sched: Export symbols for vendor modules
Export sysctl_sched_min_granularity and
sysctl_sched_idle_min_granularity. In the vendor module, it will use
several static function in GKI, while we do not want to export these
static functions, which will need to make them not static, we copied
them to the vendor module, so we need the export the symbols used in
those static functions. For example, sysctl_sched_min_granularity
and sysctl_sched_idle_min_granularity are referred in sched_slice(),
and they are only used as read-only.

Bug: 316276520
Change-Id: I976d0a1f3a70e8e60099e55fdd3cc99a90053fbb
Signed-off-by: Rick Yiu <rickyiu@google.com>
2024-01-05 18:06:32 +00:00
Stanley Chang
934a40576e UPSTREAM: usb: dwc3: core: add support for disabling High-speed park mode
Setting the PARKMODE_DISABLE_HS bit in the DWC3_USB3_GUCTL1.
When this bit is set to '1' all HS bus instances in park mode are disabled

For some USB wifi devices, if enable this feature it will reduce the
performance. Therefore, add an option for disabling HS park mode by
device-tree.

In Synopsys's dwc3 data book:
In a few high speed devices when an IN request is sent within 900ns of the
ACK of the previous packet, these devices send a NAK. When connected to
these devices, if required, the software can disable the park mode if you
see performance drop in your system. When park mode is disabled,
pipelining of multiple packet is disabled and instead one packet at a time
is requested by the scheduler. This allows up to 12 NAKs in a micro-frame
and improves performance of these slow devices.

Bug: 300024866
Acked-by: Thinh Nguyen <Thinh.Nguyen@synopsys.com>
Signed-off-by: Stanley Chang <stanley_chang@realtek.com>
Link: https://lore.kernel.org/r/20230419020044.15475-1-stanley_chang@realtek.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: William Wu <william.wu@rock-chips.com>
(cherry picked from commit d21a797a3e)
Change-Id: I43ee416e54779a073a0ba4057edf4be8bd7886de
Signed-off-by: Kever Yang <kever.yang@rock-chips.com>
2024-01-05 18:03:39 +00:00
Greg Kroah-Hartman
c077094653 Revert "hrtimers: Push pending hrtimers away from outgoing CPU earlier"
This reverts commit 75b5016ce3 which is
commit 5c0930ccaad5a74d74e8b18b648c5eb21ed2fe94 upstream.

It breaks the Android kernel abi and can be brought back in the future
in an abi-safe way if it is really needed.

Bug: 161946584
Change-Id: Ifdc5474de6e7488eb24959df1697b61b2e3e7880
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
2024-01-05 13:54:44 +00:00
Greg Kroah-Hartman
e0690152b8 Revert "drop_monitor: Require 'CAP_SYS_ADMIN' when joining "events" group"
This reverts commit b5ca945612 which is
commit e03781879a0d524ce3126678d50a80484a513c4b upstream.

It breaks the Android kernel abi and can be brought back in the future
in an abi-safe way if it is really needed.

Bug: 161946584
Change-Id: Iecbd6b6537bd4cd2d178d0afbdc7557e521429c5
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
2024-01-05 13:53:07 +00:00
Sebastian Ene
8a597e7a2d ANDROID: KVM: arm64: Don't prepopulate MMIO regions for host stage-2
As we reserve only 1GB of memory for the MMIO region don't prepopulate
the entire remaining address space with MMIO as this is prone to failure.
Instead, let the MMIO regions to be created lazily on the fault path and
keep only the RAM regions prepopulated.

Bug: 307805059
Test: Boot pKVM with CONFIG_ARM64_16K_PAGES
Change-Id: I6327f42eb17c6588335a1e04736393c9032114ab
Signed-off-by: Sebastian Ene <sebastianene@google.com>
2024-01-05 12:43:55 +00:00
Greg Kroah-Hartman
c9b484c69d Merge 6.1.68 into android14-6.1-lts
Changes in 6.1.68
	vdpa/mlx5: preserve CVQ vringh index
	hrtimers: Push pending hrtimers away from outgoing CPU earlier
	i2c: designware: Fix corrupted memory seen in the ISR
	netfilter: ipset: fix race condition between swap/destroy and kernel side add/del/test
	zstd: Fix array-index-out-of-bounds UBSAN warning
	tg3: Move the [rt]x_dropped counters to tg3_napi
	tg3: Increment tx_dropped in tg3_tso_bug()
	kconfig: fix memory leak from range properties
	drm/amdgpu: correct chunk_ptr to a pointer to chunk.
	x86: Introduce ia32_enabled()
	x86/coco: Disable 32-bit emulation by default on TDX and SEV
	x86/entry: Convert INT 0x80 emulation to IDTENTRY
	x86/entry: Do not allow external 0x80 interrupts
	x86/tdx: Allow 32-bit emulation by default
	dt: dt-extract-compatibles: Handle cfile arguments in generator function
	dt: dt-extract-compatibles: Don't follow symlinks when walking tree
	platform/x86: asus-wmi: Move i8042 filter install to shared asus-wmi code
	of: dynamic: Fix of_reconfig_get_state_change() return value documentation
	platform/x86: wmi: Skip blocks with zero instances
	ipv6: fix potential NULL deref in fib6_add()
	octeontx2-pf: Add missing mutex lock in otx2_get_pauseparam
	octeontx2-af: Check return value of nix_get_nixlf before using nixlf
	hv_netvsc: rndis_filter needs to select NLS
	r8152: Rename RTL8152_UNPLUG to RTL8152_INACCESSIBLE
	r8152: Add RTL8152_INACCESSIBLE checks to more loops
	r8152: Add RTL8152_INACCESSIBLE to r8156b_wait_loading_flash()
	r8152: Add RTL8152_INACCESSIBLE to r8153_pre_firmware_1()
	r8152: Add RTL8152_INACCESSIBLE to r8153_aldps_en()
	mlxbf-bootctl: correctly identify secure boot with development keys
	platform/mellanox: Add null pointer checks for devm_kasprintf()
	platform/mellanox: Check devm_hwmon_device_register_with_groups() return value
	arcnet: restoring support for multiple Sohard Arcnet cards
	octeontx2-pf: consider both Rx and Tx packet stats for adaptive interrupt coalescing
	net: stmmac: fix FPE events losing
	xsk: Skip polling event check for unbound socket
	octeontx2-af: fix a use-after-free in rvu_npa_register_reporters
	i40e: Fix unexpected MFS warning message
	iavf: validate tx_coalesce_usecs even if rx_coalesce_usecs is zero
	net: bnxt: fix a potential use-after-free in bnxt_init_tc
	tcp: fix mid stream window clamp.
	ionic: fix snprintf format length warning
	ionic: Fix dim work handling in split interrupt mode
	ipv4: ip_gre: Avoid skb_pull() failure in ipgre_xmit()
	net: atlantic: Fix NULL dereference of skb pointer in
	net: hns: fix wrong head when modify the tx feature when sending packets
	net: hns: fix fake link up on xge port
	octeontx2-af: Adjust Tx credits when MCS external bypass is disabled
	octeontx2-af: Fix mcs sa cam entries size
	octeontx2-af: Fix mcs stats register address
	octeontx2-af: Add missing mcs flr handler call
	octeontx2-af: Update Tx link register range
	dt-bindings: interrupt-controller: Allow #power-domain-cells
	netfilter: nft_exthdr: add boolean DCCP option matching
	netfilter: nf_tables: fix 'exist' matching on bigendian arches
	netfilter: nf_tables: bail out on mismatching dynset and set expressions
	netfilter: nf_tables: validate family when identifying table via handle
	netfilter: xt_owner: Fix for unsafe access of sk->sk_socket
	tcp: do not accept ACK of bytes we never sent
	bpf: sockmap, updating the sg structure should also update curr
	psample: Require 'CAP_NET_ADMIN' when joining "packets" group
	drop_monitor: Require 'CAP_SYS_ADMIN' when joining "events" group
	mm/damon/sysfs: eliminate potential uninitialized variable warning
	tee: optee: Fix supplicant based device enumeration
	RDMA/hns: Fix unnecessary err return when using invalid congest control algorithm
	RDMA/irdma: Do not modify to SQD on error
	RDMA/irdma: Add wait for suspend on SQD
	arm64: dts: rockchip: Expand reg size of vdec node for RK3328
	arm64: dts: rockchip: Expand reg size of vdec node for RK3399
	ASoC: fsl_sai: Fix no frame sync clock issue on i.MX8MP
	RDMA/rtrs-srv: Do not unconditionally enable irq
	RDMA/rtrs-clt: Start hb after path_up
	RDMA/rtrs-srv: Check return values while processing info request
	RDMA/rtrs-srv: Free srv_mr iu only when always_invalidate is true
	RDMA/rtrs-srv: Destroy path files after making sure no IOs in-flight
	RDMA/rtrs-clt: Fix the max_send_wr setting
	RDMA/rtrs-clt: Remove the warnings for req in_use check
	RDMA/bnxt_re: Correct module description string
	RDMA/irdma: Refactor error handling in create CQP
	RDMA/irdma: Fix UAF in irdma_sc_ccq_get_cqe_info()
	hwmon: (acpi_power_meter) Fix 4.29 MW bug
	ASoC: codecs: lpass-tx-macro: set active_decimator correct default value
	hwmon: (nzxt-kraken2) Fix error handling path in kraken2_probe()
	ASoC: wm_adsp: fix memleak in wm_adsp_buffer_populate
	RDMA/core: Fix umem iterator when PAGE_SIZE is greater then HCA pgsz
	RDMA/irdma: Avoid free the non-cqp_request scratch
	drm/bridge: tc358768: select CONFIG_VIDEOMODE_HELPERS
	arm64: dts: imx8mq: drop usb3-resume-missing-cas from usb
	arm64: dts: imx8mp: imx8mq: Add parkmode-disable-ss-quirk on DWC3
	ARM: dts: imx6ul-pico: Describe the Ethernet PHY clock
	tracing: Fix a warning when allocating buffered events fails
	scsi: be2iscsi: Fix a memleak in beiscsi_init_wrb_handle()
	ARM: imx: Check return value of devm_kasprintf in imx_mmdc_perf_init
	ARM: dts: imx7: Declare timers compatible with fsl,imx6dl-gpt
	ARM: dts: imx28-xea: Pass the 'model' property
	riscv: fix misaligned access handling of C.SWSP and C.SDSP
	md: introduce md_ro_state
	md: don't leave 'MD_RECOVERY_FROZEN' in error path of md_set_readonly()
	iommu: Avoid more races around device probe
	rethook: Use __rcu pointer for rethook::handler
	kprobes: consistent rcu api usage for kretprobe holder
	ASoC: amd: yc: Fix non-functional mic on ASUS E1504FA
	io_uring/af_unix: disable sending io_uring over sockets
	nvme-pci: Add sleep quirk for Kingston drives
	io_uring: fix mutex_unlock with unreferenced ctx
	ALSA: usb-audio: Add Pioneer DJM-450 mixer controls
	ALSA: pcm: fix out-of-bounds in snd_pcm_state_names
	ALSA: hda/realtek: Enable headset on Lenovo M90 Gen5
	ALSA: hda/realtek: add new Framework laptop to quirks
	ALSA: hda/realtek: Add Framework laptop 16 to quirks
	ring-buffer: Test last update in 32bit version of __rb_time_read()
	nilfs2: fix missing error check for sb_set_blocksize call
	nilfs2: prevent WARNING in nilfs_sufile_set_segment_usage()
	cgroup_freezer: cgroup_freezing: Check if not frozen
	checkstack: fix printed address
	tracing: Always update snapshot buffer size
	tracing: Disable snapshot buffer when stopping instance tracers
	tracing: Fix incomplete locking when disabling buffered events
	tracing: Fix a possible race when disabling buffered events
	packet: Move reference count in packet_sock to atomic_long_t
	r8169: fix rtl8125b PAUSE frames blasting when suspended
	regmap: fix bogus error on regcache_sync success
	platform/surface: aggregator: fix recv_buf() return value
	hugetlb: fix null-ptr-deref in hugetlb_vma_lock_write
	mm: fix oops when filemap_map_pmd() without prealloc_pte
	powercap: DTPM: Fix missing cpufreq_cpu_put() calls
	md/raid6: use valid sector values to determine if an I/O should wait on the reshape
	arm64: dts: mediatek: mt7622: fix memory node warning check
	arm64: dts: mediatek: mt8183-kukui-jacuzzi: fix dsi unnecessary cells properties
	arm64: dts: mediatek: cherry: Fix interrupt cells for MT6360 on I2C7
	arm64: dts: mediatek: mt8173-evb: Fix regulator-fixed node names
	arm64: dts: mediatek: mt8195: Fix PM suspend/resume with venc clocks
	arm64: dts: mediatek: mt8183: Fix unit address for scp reserved memory
	arm64: dts: mediatek: mt8183: Move thermal-zones to the root node
	arm64: dts: mediatek: mt8183-evb: Fix unit_address_vs_reg warning on ntc
	binder: fix memory leaks of spam and pending work
	coresight: etm4x: Make etm4_remove_dev() return void
	coresight: etm4x: Remove bogous __exit annotation for some functions
	hwtracing: hisi_ptt: Add dummy callback pmu::read()
	misc: mei: client.c: return negative error code in mei_cl_write
	misc: mei: client.c: fix problem of return '-EOVERFLOW' in mei_cl_write
	LoongArch: BPF: Don't sign extend memory load operand
	LoongArch: BPF: Don't sign extend function return value
	ring-buffer: Force absolute timestamp on discard of event
	tracing: Set actual size after ring buffer resize
	tracing: Stop current tracer when resizing buffer
	parisc: Reduce size of the bug_table on 64-bit kernel by half
	parisc: Fix asm operand number out of range build error in bug table
	arm64: dts: mediatek: add missing space before {
	arm64: dts: mt8183: kukui: Fix underscores in node names
	perf: Fix perf_event_validate_size()
	x86/sev: Fix kernel crash due to late update to read-only ghcb_version
	gpiolib: sysfs: Fix error handling on failed export
	drm/amdgpu: fix memory overflow in the IB test
	drm/amd/amdgpu: Fix warnings in amdgpu/amdgpu_display.c
	drm/amdgpu: correct the amdgpu runtime dereference usage count
	drm/amdgpu: Update ras eeprom support for smu v13_0_0 and v13_0_10
	drm/amdgpu: Add EEPROM I2C address support for ip discovery
	drm/amdgpu: Remove redundant I2C EEPROM address
	drm/amdgpu: Decouple RAS EEPROM addresses from chips
	drm/amdgpu: Add support for RAS table at 0x40000
	drm/amdgpu: Remove second moot switch to set EEPROM I2C address
	drm/amdgpu: Return from switch early for EEPROM I2C address
	drm/amdgpu: simplify amdgpu_ras_eeprom.c
	drm/amdgpu: Add I2C EEPROM support on smu v13_0_6
	drm/amdgpu: Update EEPROM I2C address for smu v13_0_0
	usb: gadget: f_hid: fix report descriptor allocation
	serial: 8250_dw: Add ACPI ID for Granite Rapids-D UART
	parport: Add support for Brainboxes IX/UC/PX parallel cards
	cifs: Fix non-availability of dedup breaking generic/304
	Revert "xhci: Loosen RPM as default policy to cover for AMD xHC 1.1"
	smb: client: fix potential NULL deref in parse_dfs_referrals()
	usb: typec: class: fix typec_altmode_put_partner to put plugs
	ARM: PL011: Fix DMA support
	serial: sc16is7xx: address RX timeout interrupt errata
	serial: 8250: 8250_omap: Clear UART_HAS_RHR_IT_DIS bit
	serial: 8250: 8250_omap: Do not start RX DMA on THRI interrupt
	serial: 8250_omap: Add earlycon support for the AM654 UART controller
	devcoredump: Send uevent once devcd is ready
	x86/CPU/AMD: Check vendor in the AMD microcode callback
	USB: gadget: core: adjust uevent timing on gadget unbind
	cifs: Fix flushing, invalidation and file size with copy_file_range()
	cifs: Fix flushing, invalidation and file size with FICLONE
	MIPS: kernel: Clear FPU states when setting up kernel threads
	KVM: s390/mm: Properly reset no-dat
	KVM: SVM: Update EFER software model on CR0 trap for SEV-ES
	MIPS: Loongson64: Reserve vgabios memory on boot
	MIPS: Loongson64: Handle more memory types passed from firmware
	MIPS: Loongson64: Enable DMA noncoherent support
	netfilter: nft_set_pipapo: skip inactive elements during set walk
	riscv: Kconfig: Add select ARM_AMBA to SOC_STARFIVE
	drm/i915/display: Drop check for doublescan mode in modevalid
	drm/i915/lvds: Use REG_BIT() & co.
	drm/i915/sdvo: stop caching has_hdmi_monitor in struct intel_sdvo
	drm/i915: Skip some timing checks on BXT/GLK DSI transcoders
	Linux 6.1.68

Change-Id: I0a824071a80b24dc4a2e0077f305b7cac42235b8
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
2024-01-05 08:40:52 +00:00
Greg Kroah-Hartman
2af1386be0 Merge "Merge 6.1.67 into android14-6.1-lts" into android14-6.1-lts 2024-01-05 08:34:29 +00:00
Peng Zhang
ed9b660cd1 BACKPORT: FROMGIT fork: use __mt_dup() to duplicate maple tree in dup_mmap()
In dup_mmap(), using __mt_dup() to duplicate the old maple tree and then
directly replacing the entries of VMAs in the new maple tree can result in
better performance.  __mt_dup() uses DFS pre-order to duplicate the maple
tree, so it is efficient.

The average time complexity of __mt_dup() is O(n), where n is the number
of VMAs.  The proof of the time complexity is provided in the commit log
that introduces __mt_dup().  After duplicating the maple tree, each
element is traversed and replaced (ignoring the cases of deletion, which
are rare).  Since it is only a replacement operation for each element,
this process is also O(n).

Analyzing the exact time complexity of the previous algorithm is
challenging because each insertion can involve appending to a node,
pushing data to adjacent nodes, or even splitting nodes.  The frequency of
each action is difficult to calculate.  The worst-case scenario for a
single insertion is when the tree undergoes splitting at every level.  If
we consider each insertion as the worst-case scenario, we can determine
that the upper bound of the time complexity is O(n*log(n)), although this
is a loose upper bound.  However, based on the test data, it appears that
the actual time complexity is likely to be O(n).

As the entire maple tree is duplicated using __mt_dup(), if dup_mmap()
fails, there will be a portion of VMAs that have not been duplicated in
the maple tree.  To handle this, we mark the failure point with
XA_ZERO_ENTRY.  In exit_mmap(), if this marker is encountered, stop
releasing VMAs that have not been duplicated after this point.

There is a "spawn" in byte-unixbench[1], which can be used to test the
performance of fork().  I modified it slightly to make it work with
different number of VMAs.

Below are the test results.  The first row shows the number of VMAs.  The
second and third rows show the number of fork() calls per ten seconds,
corresponding to next-20231006 and the this patchset, respectively.  The
test results were obtained with CPU binding to avoid scheduler load
balancing that could cause unstable results.  There are still some
fluctuations in the test results, but at least they are better than the
original performance.

21     121   221    421    821    1621   3221   6421   12821  25621  51221
112100 76261 54227  34035  20195  11112  6017   3161   1606   802    393
114558 83067 65008  45824  28751  16072  8922   4747   2436   1233   599
2.19%  8.92% 19.88% 34.64% 42.37% 44.64% 48.28% 50.17% 51.68% 53.74% 52.42%

[1] https://github.com/kdlucas/byte-unixbench/tree/master

Link: https://lkml.kernel.org/r/20231027033845.90608-11-zhangpeng.00@bytedance.com
Signed-off-by: Peng Zhang <zhangpeng.00@bytedance.com>
Suggested-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Cc: Christian Brauner <brauner@kernel.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Mateusz Guzik <mjguzik@gmail.com>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Mike Christie <michael.christie@oracle.com>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

(cherry picked from commit d2406291483775ecddaee929231a39c70c08fda2
https://git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm mm-unstable)

[surenb: open-coded vma_iter_clear_gfp(), vma_iter_bulk_store();
replaced vma_next() with mas_find()]

Bug: 308042511
Change-Id: I42d6620e8ce6a0b16211c231a9b72ba16ba9c0d2
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
2024-01-04 22:44:38 +00:00
Peng Zhang
3743b40f65 FROMGIT: maple_tree: preserve the tree attributes when destroying maple tree
When destroying maple tree, preserve its attributes and then turn it into
an empty tree.  This allows it to be reused without needing to be
reinitialized.

Link: https://lkml.kernel.org/r/20231027033845.90608-10-zhangpeng.00@bytedance.com
Signed-off-by: Peng Zhang <zhangpeng.00@bytedance.com>
Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Cc: Christian Brauner <brauner@kernel.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Mateusz Guzik <mjguzik@gmail.com>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Mike Christie <michael.christie@oracle.com>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

(cherry picked from commit 8e50d32c7a89bde896945e4e572ef28ccd87bbf8
https://git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm mm-unstable)

Bug: 308042511
Change-Id: If1725d5a37dcd26bec23e6bffe95d877903dfab1
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
2024-01-04 22:44:38 +00:00
Peng Zhang
1bec2dd52e FROMGIT: maple_tree: update check_forking() and bench_forking()
Updated check_forking() and bench_forking() to use __mt_dup() to duplicate
maple tree.

Link: https://lkml.kernel.org/r/20231027033845.90608-9-zhangpeng.00@bytedance.com
Signed-off-by: Peng Zhang <zhangpeng.00@bytedance.com>
Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Cc: Christian Brauner <brauner@kernel.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Mateusz Guzik <mjguzik@gmail.com>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Mike Christie <michael.christie@oracle.com>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

(cherry picked from commit 446e1867e6df3cbdd19af6be8f8f4ed56176adb4
https://git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm mm-unstable)

Bug: 308042511
Change-Id: I3b64ad1cb5ae40b10dc86ee55501f12522c7e2f5
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
2024-01-04 22:44:38 +00:00
Peng Zhang
e57d333531 FROMGIT: maple_tree: skip other tests when BENCH is enabled
Skip other tests when BENCH is enabled so that performance can be measured
in user space.

Link: https://lkml.kernel.org/r/20231027033845.90608-8-zhangpeng.00@bytedance.com
Signed-off-by: Peng Zhang <zhangpeng.00@bytedance.com>
Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Cc: Christian Brauner <brauner@kernel.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Mateusz Guzik <mjguzik@gmail.com>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Mike Christie <michael.christie@oracle.com>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

(cherry picked from commit f670fa1caadb4ea532a89012c5451e4c6789bfcc
https://git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm mm-unstable)

Bug: 308042511
Change-Id: I0a761a4b6211b19ec80c97d5aef80f3979523bcb
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
2024-01-04 22:44:38 +00:00
Peng Zhang
c79ca61edc FROMGIT: maple_tree: update the documentation of maple tree
Introduce the new interface mtree_dup() in the documentation.

Link: https://lkml.kernel.org/r/20231027033845.90608-7-zhangpeng.00@bytedance.com
Signed-off-by: Peng Zhang <zhangpeng.00@bytedance.com>
Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Cc: Christian Brauner <brauner@kernel.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Mateusz Guzik <mjguzik@gmail.com>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Mike Christie <michael.christie@oracle.com>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

(cherry picked from commit 9bc1d3cdb904170214456bca96c4924f28522ab8
https://git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm mm-unstable)

Bug: 308042511
Change-Id: I3eb330f0be49f7e8d8b37ecb64de3b7ef349c05b
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
2024-01-04 22:44:38 +00:00
Peng Zhang
7befa7bbc9 FROMGIT: maple_tree: add test for mtree_dup()
Add test for mtree_dup().

Test by duplicating different maple trees and then comparing the two
trees.  Includes tests for duplicating full trees and memory allocation
failures on different nodes.

Link: https://lkml.kernel.org/r/20231027033845.90608-6-zhangpeng.00@bytedance.com
Signed-off-by: Peng Zhang <zhangpeng.00@bytedance.com>
Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Cc: Christian Brauner <brauner@kernel.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Mateusz Guzik <mjguzik@gmail.com>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Mike Christie <michael.christie@oracle.com>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

(cherry picked from commit a2587a7e8d37885dc063255f5400a66299b42e48
https://git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm mm-unstable)

Bug: 308042511
Change-Id: I7501db5735b1dfd15240ef2946b26d63ffe1d8e0
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
2024-01-04 22:44:38 +00:00
Peng Zhang
f73f881af4 FROMGIT: radix tree test suite: align kmem_cache_alloc_bulk() with kernel behavior.
When kmem_cache_alloc_bulk() fails to allocate, leave the freed pointers
in the array.  This enables a more accurate simulation of the kernel's
behavior and allows for testing potential double-free scenarios.

Link: https://lkml.kernel.org/r/20231027033845.90608-5-zhangpeng.00@bytedance.com
Signed-off-by: Peng Zhang <zhangpeng.00@bytedance.com>
Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Cc: Christian Brauner <brauner@kernel.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Mateusz Guzik <mjguzik@gmail.com>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Mike Christie <michael.christie@oracle.com>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

(cherry picked from commit 46c99e26f2f86260fed226cab217d0b3ca8dca56
https://git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm mm-unstable)

Bug: 308042511
Change-Id: If822e9d219066e1573b7c044ef9a7344f652e365
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
2024-01-04 22:44:38 +00:00
Peng Zhang
eb5048ea90 FROMGIT: maple_tree: introduce interfaces __mt_dup() and mtree_dup()
Introduce interfaces __mt_dup() and mtree_dup(), which are used to
duplicate a maple tree.  They duplicate a maple tree in Depth-First Search
(DFS) pre-order traversal.  It uses memcopy() to copy nodes in the source
tree and allocate new child nodes in non-leaf nodes.  The new node is
exactly the same as the source node except for all the addresses stored in
it.  It will be faster than traversing all elements in the source tree and
inserting them one by one into the new tree.  The time complexity of these
two functions is O(n).

The difference between __mt_dup() and mtree_dup() is that mtree_dup()
handles locks internally.

Analysis of the average time complexity of this algorithm:

For simplicity, let's assume that the maximum branching factor of all
non-leaf nodes is 16 (in allocation mode, it is 10), and the tree is a
full tree.

Under the given conditions, if there is a maple tree with n elements, the
number of its leaves is n/16.  From bottom to top, the number of nodes in
each level is 1/16 of the number of nodes in the level below.  So the
total number of nodes in the entire tree is given by the sum of n/16 +
n/16^2 + n/16^3 + ...  + 1.  This is a geometric series, and it has log(n)
terms with base 16.  According to the formula for the sum of a geometric
series, the sum of this series can be calculated as (n-1)/15.  Each node
has only one parent node pointer, which can be considered as an edge.  In
total, there are (n-1)/15-1 edges.

This algorithm consists of two operations:

1. Traversing all nodes in DFS order.
2. For each node, making a copy and performing necessary modifications
   to create a new node.

For the first part, DFS traversal will visit each edge twice.  Let
T(ascend) represent the cost of taking one step downwards, and T(descend)
represent the cost of taking one step upwards.  And both of them are
constants (although mas_ascend() may not be, as it contains a loop, but
here we ignore it and treat it as a constant).  So the time spent on the
first part can be represented as ((n-1)/15-1) * (T(ascend) + T(descend)).

For the second part, each node will be copied, and the cost of copying a
node is denoted as T(copy_node).  For each non-leaf node, it is necessary
to reallocate all child nodes, and the cost of this operation is denoted
as T(dup_alloc).  The behavior behind memory allocation is complex and not
specific to the maple tree operation.  Here, we assume that the time
required for a single allocation is constant.  Since the size of a node is
fixed, both of these symbols are also constants.  We can calculate that
the time spent on the second part is ((n-1)/15) * T(copy_node) + ((n-1)/15
- n/16) * T(dup_alloc).

Adding both parts together, the total time spent by the algorithm can be
represented as:

((n-1)/15) * (T(ascend) + T(descend) + T(copy_node) + T(dup_alloc)) -
n/16 * T(dup_alloc) - (T(ascend) + T(descend))

Let C1 = T(ascend) + T(descend) + T(copy_node) + T(dup_alloc)
Let C2 = T(dup_alloc)
Let C3 = T(ascend) + T(descend)

Finally, the expression can be simplified as:
((16 * C1 - 15 * C2) / (15 * 16)) * n - (C1 / 15 + C3).

This is a linear function, so the average time complexity is O(n).

Link: https://lkml.kernel.org/r/20231027033845.90608-4-zhangpeng.00@bytedance.com
Signed-off-by: Peng Zhang <zhangpeng.00@bytedance.com>
Suggested-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Cc: Christian Brauner <brauner@kernel.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Mateusz Guzik <mjguzik@gmail.com>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Mike Christie <michael.christie@oracle.com>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

(cherry picked from commit fd32e4e9b7646510ee9010e0d5f8b8857d48a6f7
https://git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm mm-unstable)

Bug: 308042511
Change-Id: I385759a1184a202498e086458b572c203616b9b4
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
2024-01-04 22:44:38 +00:00
Peng Zhang
dc9323545b FROMGIT: maple_tree: introduce {mtree,mas}_lock_nested()
In some cases, nested locks may be needed, so {mtree,mas}_lock_nested is
introduced.  For example, when duplicating maple tree, we need to hold the
locks of two trees, in which case nested locks are needed.

At the same time, add the definition of spin_lock_nested() in tools for
testing.

Link: https://lkml.kernel.org/r/20231027033845.90608-3-zhangpeng.00@bytedance.com
Signed-off-by: Peng Zhang <zhangpeng.00@bytedance.com>
Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Cc: Christian Brauner <brauner@kernel.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Mateusz Guzik <mjguzik@gmail.com>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Mike Christie <michael.christie@oracle.com>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

(cherry picked from commit b2472efe4316b2687c153919c1513a098bd82c17
https://git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm mm-unstable)

Bug: 308042511
Change-Id: I06f0eb0a32a2f39b7842de08a0e5ce59895345c5
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
2024-01-04 22:44:38 +00:00
Peng Zhang
4ddcdc519b FROMGIT: maple_tree: add mt_free_one() and mt_attr() helpers
Patch series "Introduce __mt_dup() to improve the performance of fork()", v7.

This series introduces __mt_dup() to improve the performance of fork().
During the duplication process of mmap, all VMAs are traversed and
inserted one by one into the new maple tree, causing the maple tree to be
rebalanced multiple times.  Balancing the maple tree is a costly
operation.  To duplicate VMAs more efficiently, mtree_dup() and __mt_dup()
are introduced for the maple tree.  They can efficiently duplicate a maple
tree.

Here are some algorithmic details about {mtree,__mt}_dup().  We perform a
DFS pre-order traversal of all nodes in the source maple tree.  During
this process, we fully copy the nodes from the source tree to the new
tree.  This involves memory allocation, and when encountering a new node,
if it is a non-leaf node, all its child nodes are allocated at once.

This idea was originally from Liam R.  Howlett's Maple Tree Work email,
and I added some of my own ideas to implement it.  Some previous
discussions can be found in [1].  For a more detailed analysis of the
algorithm, please refer to the logs for patch [3/10] and patch [10/10].

There is a "spawn" in byte-unixbench[2], which can be used to test the
performance of fork().  I modified it slightly to make it work with
different number of VMAs.

Below are the test results.  The first row shows the number of VMAs.  The
second and third rows show the number of fork() calls per ten seconds,
corresponding to next-20231006 and the this patchset, respectively.  The
test results were obtained with CPU binding to avoid scheduler load
balancing that could cause unstable results.  There are still some
fluctuations in the test results, but at least they are better than the
original performance.

21     121   221    421    821    1621   3221   6421   12821  25621  51221
112100 76261 54227  34035  20195  11112  6017   3161   1606   802    393
114558 83067 65008  45824  28751  16072  8922   4747   2436   1233   599
2.19%  8.92% 19.88% 34.64% 42.37% 44.64% 48.28% 50.17% 51.68% 53.74% 52.42%

Thanks to Liam and Matthew for the review.

This patch (of 10):

Add two helpers:
1. mt_free_one(), used to free a maple node.
2. mt_attr(), used to obtain the attributes of maple tree.

Link: https://lkml.kernel.org/r/20231027033845.90608-1-zhangpeng.00@bytedance.com
Link: https://lkml.kernel.org/r/20231027033845.90608-2-zhangpeng.00@bytedance.com
Signed-off-by: Peng Zhang <zhangpeng.00@bytedance.com>
Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Cc: Christian Brauner <brauner@kernel.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Mateusz Guzik <mjguzik@gmail.com>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Mike Christie <michael.christie@oracle.com>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

(cherry picked from commit 4f2267b58a22d972be98edef8e6b3c7a67c9fb91
https://git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm mm-unstable)

Bug: 308042511
Change-Id: Ib9b13dee357ac4c85668901c20a3c370fbdd08da
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
2024-01-04 22:44:38 +00:00
Liam R. Howlett
c52d48818b UPSTREAM: maple_tree: introduce __mas_set_range()
mas_set_range() resets the node to MAS_START, which will cause a re-walk
of the tree to the range.  This is unnecessary when the maple state is
already at the correct location of the write.  Add a function that only
sets the range to avoid unnecessary re-walking of the tree.

Link: https://lkml.kernel.org/r/20230724183157.3939892-6-Liam.Howlett@oracle.com
Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Cc: Peng Zhang <zhangpeng.00@bytedance.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

(cherry picked from commit c1297987cc)

Bug: 308042511
Change-Id: I9e026d0f103e3aa24b47998be6b83e28e7928540
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
2024-01-04 22:44:38 +00:00
Kever Yang
066d57de87 ANDROID: GKI: Enable symbols for v4l2 in async and fwnode
INFO: 14 function symbol(s) added
  'struct v4l2_async_subdev* __v4l2_async_nf_add_fwnode(struct v4l2_async_notifier*, struct fwnode_handle*, unsigned int)'
  'struct v4l2_async_subdev* __v4l2_async_nf_add_fwnode_remote(struct v4l2_async_notifier*, struct fwnode_handle*, unsigned int)'
  'void v4l2_async_nf_cleanup(struct v4l2_async_notifier*)'
  'void v4l2_async_nf_init(struct v4l2_async_notifier*)'
  'int v4l2_async_nf_parse_fwnode_endpoints(struct device*, struct v4l2_async_notifier*, size_t, parse_endpoint_func)'
  'int v4l2_async_nf_register(struct v4l2_device*, struct v4l2_async_notifier*)'
  'void v4l2_async_nf_unregister(struct v4l2_async_notifier*)'
  'int v4l2_async_register_subdev(struct v4l2_subdev*)'
  'int v4l2_async_register_subdev_sensor(struct v4l2_subdev*)'
  'int v4l2_async_subdev_nf_register(struct v4l2_subdev*, struct v4l2_async_notifier*)'
  'void v4l2_async_unregister_subdev(struct v4l2_subdev*)'
  'int v4l2_fwnode_endpoint_alloc_parse(struct fwnode_handle*, struct v4l2_fwnode_endpoint*)'
  'void v4l2_fwnode_endpoint_free(struct v4l2_fwnode_endpoint*)'
  'int v4l2_fwnode_endpoint_parse(struct fwnode_handle*, struct v4l2_fwnode_endpoint*)'

Bug: 300024866
Change-Id: I7e4c2faac5c8341a19ea3fed694190d38679dc5b
Signed-off-by: Kever Yang <kever.yang@rock-chips.com>
2024-01-04 22:25:09 +00:00
Greg Kroah-Hartman
44affaea1e Revert "mmc: core: add helpers mmc_regulator_enable/disable_vqmmc"
This reverts commit 38d3216032 which is
commit 8d91f3f8ae upstream.

It breaks the Android kernel abi and can be brought back in the future
in an abi-safe way if it is really needed.

Bug: 161946584
Change-Id: I67705a9f4b04cf051f9b0f1a3494284ae0122da2
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
2024-01-04 21:47:02 +00:00
Greg Kroah-Hartman
c49b4a744f Revert "mmc: sdhci-sprd: Fix vqmmc not shutting down after the card was pulled"
This reverts commit 8b01195be4 which is
commit 477865af60b2117ceaa1d558e03559108c15c78c upstream.

It breaks the Android kernel abi and can be brought back in the future
in an abi-safe way if it is really needed.

Bug: 161946584
Change-Id: I3e7316f074393f1b84e8d6a7a845060c268366b0
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
2024-01-04 21:47:02 +00:00
Treehugger Robot
59f96234bf Merge "Merge 6.1.66 into android14-6.1-lts" into android14-6.1-lts 2024-01-04 21:47:02 +00:00
Ken Huang
e74417834e ANDROID: Update the ABI symbol list
Adding the following symbols:
  - __drmm_crtc_alloc_with_planes

Bug: 275278929
Change-Id: I41b6069612d44214f474ed82ee2a4b07ca739302
Signed-off-by: Ken Huang <kenbshuang@google.com>
2024-01-04 23:18:14 +08:00
Vincent Donnefort
15a93de464 ANDROID: KVM: arm64: Fix hyp event alignment
The structures that define hyp events must be packed so they match
their format definitions in the tracefs file
hyp/events/hyp/<event>/format.

Bug: 299430621
Change-Id: Ia7e1a686744d5c9c3f8a21881f03228c8acecade
Signed-off-by: Vincent Donnefort <vdonnefort@google.com>
2024-01-04 14:09:44 +00:00
Vincent Donnefort
717d1f8f91 ANDROID: KVM: arm64: Fix host_smc print typo
From pKVM point of view, unknown SMCs are simply forwarded, we can't
consider them invalid or not. This was probably a typo following a copy
of the host_hcall event.

Bug: 299430621
Change-Id: Ieb53f985a5187a8b5a9feb4a95982b15cdc1b04a
Signed-off-by: Vincent Donnefort <vdonnefort@google.com>
2024-01-04 12:45:10 +00:00
Greg Kroah-Hartman
0292321d13 Merge 6.1.67 into android14-6.1-lts
Changes in 6.1.67
	Revert "wifi: cfg80211: fix CQM for non-range use"
	Linux 6.1.67

Change-Id: I8cb03bcbe527a884eaec064c23758de919700692
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
2024-01-04 12:12:49 +00:00
Jaegeuk Kim
8fc25d7862 FROMGIT: f2fs: do not return EFSCORRUPTED, but try to run online repair
If we return the error, there's no way to recover the status as of now, since
fsck does not fix the xattr boundary issue.

Bug: 305658663
Cc: stable@vger.kernel.org
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
(cherry picked from commit 50a472bbc79ff9d5a88be8019a60e936cadf9f13
 https://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs.git dev)
Change-Id: I55060a4eede3f5f85066aba22a6ab7155517e5c4
(cherry picked from commit 70113b9d489050d3e7a6f28e0cd6e43f104fc132)
(cherry picked from commit 2c1f3789d609bd549f14c019b6c7b311bfd2fa64)
2024-01-04 10:39:11 +00:00
Vincent Donnefort
99288e911a ANDROID: KVM: arm64: Document module_change_host_prot_range
When this pKVM module ops has been introduced, the documentation has
been omitted.

Bug: 308373293
Change-Id: I9e471414e72a1ee04c132de4ed95d77e815ae8c9
Signed-off-by: Vincent Donnefort <vdonnefort@google.com>
2024-01-04 09:56:20 +00:00
Greg Kroah-Hartman
c539451364 Revert "mmc: core: add helpers mmc_regulator_enable/disable_vqmmc"
This reverts commit 38d3216032 which is
commit 8d91f3f8ae upstream.

It breaks the Android kernel abi and can be brought back in the future
in an abi-safe way if it is really needed.

Bug: 161946584
Change-Id: I67705a9f4b04cf051f9b0f1a3494284ae0122da2
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
2024-01-04 09:51:28 +00:00
Greg Kroah-Hartman
975d5f2ae9 Revert "mmc: sdhci-sprd: Fix vqmmc not shutting down after the card was pulled"
This reverts commit 8b01195be4 which is
commit 477865af60b2117ceaa1d558e03559108c15c78c upstream.

It breaks the Android kernel abi and can be brought back in the future
in an abi-safe way if it is really needed.

Bug: 161946584
Change-Id: I3e7316f074393f1b84e8d6a7a845060c268366b0
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
2024-01-04 09:51:07 +00:00
Mukesh Ojha
4d99e41ce1 FROMGIT: PM / devfreq: Synchronize devfreq_monitor_[start/stop]
There is a chance if a frequent switch of the governor
done in a loop result in timer list corruption where
timer cancel being done from two place one from
cancel_delayed_work_sync() and followed by expire_timers()
can be seen from the traces[1].

while true
do
        echo "simple_ondemand" > /sys/class/devfreq/1d84000.ufshc/governor
        echo "performance" > /sys/class/devfreq/1d84000.ufshc/governor
done

It looks to be issue with devfreq driver where
device_monitor_[start/stop] need to synchronized so that
delayed work should get corrupted while it is either
being queued or running or being cancelled.

Let's use polling flag and devfreq lock to synchronize the
queueing the timer instance twice and work data being
corrupted.

[1]
...
..
<idle>-0    [003]   9436.209662:  timer_cancel   timer=0xffffff80444f0428
<idle>-0    [003]   9436.209664:  timer_expire_entry   timer=0xffffff80444f0428  now=0x10022da1c  function=__typeid__ZTSFvP10timer_listE_global_addr  baseclk=0x10022da1c
<idle>-0    [003]   9436.209718:  timer_expire_exit   timer=0xffffff80444f0428
kworker/u16:6-14217    [003]   9436.209863:  timer_start   timer=0xffffff80444f0428  function=__typeid__ZTSFvP10timer_listE_global_addr  expires=0x10022da2b  now=0x10022da1c  flags=182452227
vendor.xxxyyy.ha-1593    [004]   9436.209888:  timer_cancel   timer=0xffffff80444f0428
vendor.xxxyyy.ha-1593    [004]   9436.216390:  timer_init   timer=0xffffff80444f0428
vendor.xxxyyy.ha-1593    [004]   9436.216392:  timer_start   timer=0xffffff80444f0428  function=__typeid__ZTSFvP10timer_listE_global_addr  expires=0x10022da2c  now=0x10022da1d  flags=186646532
vendor.xxxyyy.ha-1593    [005]   9436.220992:  timer_cancel   timer=0xffffff80444f0428
xxxyyyTraceManag-7795    [004]   9436.261641:  timer_cancel   timer=0xffffff80444f0428

[2]

 9436.261653][    C4] Unable to handle kernel paging request at virtual address dead00000000012a
[ 9436.261664][    C4] Mem abort info:
[ 9436.261666][    C4]   ESR = 0x96000044
[ 9436.261669][    C4]   EC = 0x25: DABT (current EL), IL = 32 bits
[ 9436.261671][    C4]   SET = 0, FnV = 0
[ 9436.261673][    C4]   EA = 0, S1PTW = 0
[ 9436.261675][    C4] Data abort info:
[ 9436.261677][    C4]   ISV = 0, ISS = 0x00000044
[ 9436.261680][    C4]   CM = 0, WnR = 1
[ 9436.261682][    C4] [dead00000000012a] address between user and kernel address ranges
[ 9436.261685][    C4] Internal error: Oops: 96000044 [#1] PREEMPT SMP
[ 9436.261701][    C4] Skip md ftrace buffer dump for: 0x3a982d0
...

[ 9436.262138][    C4] CPU: 4 PID: 7795 Comm: TraceManag Tainted: G S      W  O      5.10.149-android12-9-o-g17f915d29d0c #1
[ 9436.262141][    C4] Hardware name: Qualcomm Technologies, Inc.  (DT)
[ 9436.262144][    C4] pstate: 22400085 (nzCv daIf +PAN -UAO +TCO BTYPE=--)
[ 9436.262161][    C4] pc : expire_timers+0x9c/0x438
[ 9436.262164][    C4] lr : expire_timers+0x2a4/0x438
[ 9436.262168][    C4] sp : ffffffc010023dd0
[ 9436.262171][    C4] x29: ffffffc010023df0 x28: ffffffd0636fdc18
[ 9436.262178][    C4] x27: ffffffd063569dd0 x26: ffffffd063536008
[ 9436.262182][    C4] x25: 0000000000000001 x24: ffffff88f7c69280
[ 9436.262185][    C4] x23: 00000000000000e0 x22: dead000000000122
[ 9436.262188][    C4] x21: 000000010022da29 x20: ffffff8af72b4e80
[ 9436.262191][    C4] x19: ffffffc010023e50 x18: ffffffc010025038
[ 9436.262195][    C4] x17: 0000000000000240 x16: 0000000000000201
[ 9436.262199][    C4] x15: ffffffffffffffff x14: ffffff889f3c3100
[ 9436.262203][    C4] x13: ffffff889f3c3100 x12: 00000000049f56b8
[ 9436.262207][    C4] x11: 00000000049f56b8 x10: 00000000ffffffff
[ 9436.262212][    C4] x9 : ffffffc010023e50 x8 : dead000000000122
[ 9436.262216][    C4] x7 : ffffffffffffffff x6 : ffffffc0100239d8
[ 9436.262220][    C4] x5 : 0000000000000000 x4 : 0000000000000101
[ 9436.262223][    C4] x3 : 0000000000000080 x2 : ffffff889edc155c
[ 9436.262227][    C4] x1 : ffffff8001005200 x0 : ffffff80444f0428
[ 9436.262232][    C4] Call trace:
[ 9436.262236][    C4]  expire_timers+0x9c/0x438
[ 9436.262240][    C4]  __run_timers+0x1f0/0x330
[ 9436.262245][    C4]  run_timer_softirq+0x28/0x58
[ 9436.262255][    C4]  efi_header_end+0x168/0x5ec
[ 9436.262265][    C4]  __irq_exit_rcu+0x108/0x124
[ 9436.262274][    C4]  __handle_domain_irq+0x118/0x1e4
[ 9436.262282][    C4]  gic_handle_irq.30369+0x6c/0x2bc
[ 9436.262286][    C4]  el0_irq_naked+0x60/0x6c

Bug: 317188938
Change-Id: I9a22325f6abbf28217c8f37b093cf77509b0139a
Link: https://lore.kernel.org/all/1700860318-4025-1-git-send-email-quic_mojha@quicinc.com/
Reported-by: Joyyoung Huang <huangzaiyang@oppo.com>
Acked-by: MyungJoo Ham <myungjoo.ham@samsung.com>
Signed-off-by: Mukesh Ojha <quic_mojha@quicinc.com>
Signed-off-by: Chanwoo Choi <cw00.choi@samsung.com>
(cherry picked from commit aed5ed595960c6d301dcd4ed31aeaa7a8054c0c6
 https://git.kernel.org/pub/scm/linux/kernel/git/chanwoo/linux.git devfreq-next)
Signed-off-by: Srinivasarao Pathipati <quic_c_spathi@quicinc.com>
2024-01-03 23:14:47 +00:00
Suren Baghdasaryan
6c8f710857 FROMGIT: arch/mm/fault: fix major fault accounting when retrying under per-VMA lock
A test [1] in Android test suite started failing after [2] was merged.  It
turns out that after handling a major fault under per-VMA lock, the
process major fault counter does not register that fault as major.  Before
[2] read faults would be done under mmap_lock, in which case
FAULT_FLAG_TRIED flag is set before retrying.  That in turn causes
mm_account_fault() to account the fault as major once retry completes.
With per-VMA locks we often retry because a fault can't be handled without
locking the whole mm using mmap_lock.  Therefore such retries do not set
FAULT_FLAG_TRIED flag.  This logic does not work after [2] because we can
now handle read major faults under per-VMA lock and upon retry the fact
there was a major fault gets lost.  Fix this by setting FAULT_FLAG_TRIED
after retrying under per-VMA lock if VM_FAULT_MAJOR was returned.  Ideally
we would use an additional VM_FAULT bit to indicate the reason for the
retry (could not handle under per-VMA lock vs other reason) but this
simpler solution seems to work, so keeping it simple.

[1] https://cs.android.com/android/platform/superproject/+/master:test/vts-testcase/kernel/api/drop_caches_prop/drop_caches_test.cpp
[2] https://lore.kernel.org/all/20231006195318.4087158-6-willy@infradead.org/

Link: https://lkml.kernel.org/r/20231226214610.109282-1-surenb@google.com
Fixes: 12214eba1992 ("mm: handle read faults under the VMA lock")
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Alexander Gordeev <agordeev@linux.ibm.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Gerald Schaefer <gerald.schaefer@linux.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

(cherry picked from commit 46e714c729c8d1d8110bc0545d7ffe8a759c9dc0
 https://git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm mm-hotfixes-stable)

Bug: 317385399
Change-Id: Ic7e97bf610dcabb7d3ac2306b2f1213be0ddd269
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
2024-01-03 20:45:51 +00:00
Matthew Wilcox (Oracle)
4a518d8633 UPSTREAM: mm: handle write faults to RO pages under the VMA lock
I think this is a pretty rare occurrence, but for consistency handle
faults with the VMA lock held the same way that we handle other faults
with the VMA lock held.

Link: https://lkml.kernel.org/r/20231006195318.4087158-7-willy@infradead.org
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Suren Baghdasaryan <surenb@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
(cherry picked from commit 4a68fef16df9d88d528094116f8bbd2dbfa62089)

Bug: 293665307
Change-Id: I69cec218c8a1fe14df3268722e6b1be6dffe7978
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
2024-01-03 20:45:51 +00:00
Matthew Wilcox (Oracle)
c1da94fa44 UPSTREAM: mm: handle read faults under the VMA lock
Most file-backed faults are already handled through ->map_pages(), but if
we need to do I/O we'll come this way.  Since filemap_fault() is now safe
to be called under the VMA lock, we can handle these faults under the VMA
lock now.

Link: https://lkml.kernel.org/r/20231006195318.4087158-6-willy@infradead.org
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Suren Baghdasaryan <surenb@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
(cherry picked from commit 12214eba1992642eee5813a9cc9f626e5b2d1815)

Bug: 293665307
Change-Id: Iee48af98b866d88d88ec01143eb26389ab373b6b
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
2024-01-03 20:45:51 +00:00
Matthew Wilcox (Oracle)
6541fffd92 UPSTREAM: mm: handle COW faults under the VMA lock
If the page is not currently present in the page tables, we need to call
the page fault handler to find out which page we're supposed to COW, so we
need to both check that there is already an anon_vma and that the fault
handler doesn't need the mmap_lock.

Link: https://lkml.kernel.org/r/20231006195318.4087158-5-willy@infradead.org
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Suren Baghdasaryan <surenb@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
(cherry picked from commit 4de8c93a4751e10737b6af65db42c743228c67a6)

Bug: 293665307
Change-Id: If749a6f8fcf69d83bbf872c1d45865d1b1b77ea0
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
2024-01-03 20:45:51 +00:00
Matthew Wilcox (Oracle)
c7fa581a79 UPSTREAM: mm: handle shared faults under the VMA lock
There are many implementations of ->fault and some of them depend on
mmap_lock being held.  All vm_ops that implement ->map_pages() end up
calling filemap_fault(), which I have audited to be sure it does not rely
on mmap_lock.  So (for now) key off ->map_pages existing as a flag to
indicate that it's safe to call ->fault while only holding the vma lock.

Link: https://lkml.kernel.org/r/20231006195318.4087158-4-willy@infradead.org
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Suren Baghdasaryan <surenb@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
(cherry picked from commit 4ed4379881aa62588aba6442a9f362a8cf7624e6)

Bug: 293665307
Change-Id: Ifb5ab3df5d05fb182d0cb52820fa24e28e2d6496
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
2024-01-03 20:45:51 +00:00
Matthew Wilcox (Oracle)
95af8a80bb BACKPORT: mm: call wp_page_copy() under the VMA lock
It is usually safe to call wp_page_copy() under the VMA lock.  The only
unsafe situation is when no anon_vma has been allocated for this VMA, and
we have to look at adjacent VMAs to determine if their anon_vma can be
shared.  Since this happens only for the first COW of a page in this VMA,
the majority of calls to wp_page_copy() do not need to fall back to the
mmap_sem.

Add vmf_anon_prepare() as an alternative to anon_vma_prepare() which will
return RETRY if we currently hold the VMA lock and need to allocate an
anon_vma.  This lets us drop the check in do_wp_page().

Link: https://lkml.kernel.org/r/20231006195318.4087158-3-willy@infradead.org
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Suren Baghdasaryan <surenb@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
(cherry picked from commit 164b06f238b986317131e6b61b2f22aabcbc2cc0)
[surenb: resolved merge conflicts due to folio/page differences]

Bug: 293665307
Change-Id: I39bdc247b375bd3dae8078b52c60fd4ce12e1850
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
2024-01-03 20:45:51 +00:00
Matthew Wilcox (Oracle)
b43b26b4cd UPSTREAM: mm: make lock_folio_maybe_drop_mmap() VMA lock aware
Patch series "Handle more faults under the VMA lock", v2.

At this point, we're handling the majority of file-backed page faults
under the VMA lock, using the ->map_pages entry point.  This patch set
attempts to expand that for the following siutations:

 - We have to do a read.  This could be because we've hit the point in
   the readahead window where we need to kick off the next readahead,
   or because the page is simply not present in cache.
 - We're handling a write fault.  Most applications don't do I/O by writes
   to shared mmaps for very good reasons, but some do, and it'd be nice
   to not make that slow unnecessarily.
 - We're doing a COW of a private mapping (both PTE already present
   and PTE not-present).  These are two different codepaths and I handle
   both of them in this patch set.

There is no support in this patch set for drivers to mark themselves as
being VMA lock friendly; they could implement the ->map_pages
vm_operation, but if they do, they would be the first.  This is probably
something we want to change at some point in the future, and I've marked
where to make that change in the code.

There is very little performance change in the benchmarks we've run;
mostly because the vast majority of page faults are handled through the
other paths.  I still think this patch series is useful for workloads that
may take these paths more often, and just for cleaning up the fault path
in general (it's now clearer why we have to retry in these cases).

This patch (of 6):

Drop the VMA lock instead of the mmap_lock if that's the one which
is held.

Link: https://lkml.kernel.org/r/20231006195318.4087158-1-willy@infradead.org
Link: https://lkml.kernel.org/r/20231006195318.4087158-2-willy@infradead.org
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Suren Baghdasaryan <surenb@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
(cherry picked from commit 5d74b2ab2c15d596c470bae6626f345d5575a9d0)

Bug: 293665307
Change-Id: Ife2d11ab12fb428868cd44751784cf731fbffe62
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
2024-01-03 20:45:51 +00:00
Matthew Wilcox
9c4bc457ab UPSTREAM: mm/memory.c: fix mismerge
Fix a build issue.

Link: https://lkml.kernel.org/r/ZNerqcNS4EBJA/2v@casper.infradead.org
Fixes: 4aaa60dad4d1 ("mm: allow per-VMA locks on file-backed VMAs")
Signed-off-by: Matthew Wilcox <willy@infradead.org>
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202308121909.XNYBtqNI-lkp@intel.com/
Cc: Suren Baghdasaryan <surenb@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
(cherry picked from commit 08dff2810e)

Bug: 293665307
Change-Id: I07ce19f29c44831cdcf709fe1ce122d1963f0be2
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
2024-01-03 20:45:51 +00:00
Suren Baghdasaryan
7d50253c27 ANDROID: Export functions to be used with dma_map_ops in modules
For modules to reuse default dma_map_ops implementations they need to be
exported. Export the following functions:

dma_direct_alloc
dma_direct_free
dma_common_mmap
dma_common_get_sgtable
dma_direct_get_required_mask

Bug: 151050914
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
Change-Id: Ia77b797fcd909fce01da7431bfbde282dc70b3b3
(cherry picked from commit fd31496dae939c7bf2ef874e08d4bf8c6ab738b3)
Signed-off-by: Qian-Hao Huang <qhhuang@google.com>
(cherry picked from commit cdc9f6ef94)
2024-01-03 20:45:29 +00:00
Gao Xiang
37e0a5b868 BACKPORT: FROMGIT: erofs: enable sub-page compressed block support
Let's just disable cached decompression and inplace I/Os for partial
pages as the first step in order to enable sub-page block initial
support.  In other words, currently it works primarily based on
temporary short-lived pages.  Don't expect too much in terms of
performance.

Reviewed-by: Yue Hu <huyue2@coolpad.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20231206091057.87027-6-hsiangkao@linux.alibaba.com

Bug: 318378021
Change-Id: I00238aa437f20c46d015bbe5ab7b706b80b8cfd7
(cherry picked from commit 0ee3a0d59e007320167a2e9f4b8bf1304ada7771
 https://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs.git dev)
[dhavale: resolved conflicts in inode.c in erofs_fill_inode()]
Signed-off-by: Sandeep Dhavale <dhavale@google.com>
2024-01-03 18:37:43 +00:00
Gao Xiang
f466d52164 FROMGIT: erofs: refine z_erofs_transform_plain() for sub-page block support
Sub-page block support is still unusable even with previous commits if
interlaced PLAIN pclusters exist.  Such pclusters can be found if the
fragment feature is enabled.

This commit tries to handle "the head part" of interlaced PLAIN
pclusters first: it was once explained in commit fdffc091e6 ("erofs:
support interlaced uncompressed data for compressed files").

It uses a unique way for both shifted and interlaced PLAIN pclusters.
As an added bonus, PLAIN pclusters larger than the block size is also
supported now for the upcoming large lclusters.

Reviewed-by: Yue Hu <huyue2@coolpad.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20231206091057.87027-5-hsiangkao@linux.alibaba.com

Bug: 318378021
Change-Id: I3d50132664f8754f56d62744420060108ed0da4f
(cherry picked from commit 192351616a9dde686492bcb9d1e4895a1411a527
https: //git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs.git dev)
Signed-off-by: Sandeep Dhavale <dhavale@google.com>
2024-01-03 18:37:43 +00:00
Gao Xiang
a18efa4e4a FROMGIT: erofs: fix ztailpacking for subpage compressed blocks
`pageofs_in` should be the compressed data offset of the page rather
than of the block.

Acked-by: Chao Yu <chao@kernel.org>
Reviewed-by: Yue Hu <huyue2@coolpad.com>
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20231214161337.753049-1-hsiangkao@linux.alibaba.com

Bug: 318378021
Change-Id: I0997a69b22b0f42c327c810359f55f5fa6a76275
(cherry picked from commit e5aba911dee5e20fa82efbe13e0af8f38ea459e7
 https://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs.git dev)
Signed-off-by: Sandeep Dhavale <dhavale@google.com>
2024-01-03 18:37:43 +00:00
Gao Xiang
0c6a18c75b BACKPORT: FROMGIT: erofs: fix up compacted indexes for block size < 4096
Previously, the block size always equaled to PAGE_SIZE, therefore
`lclusterbits` couldn't be less than 12.

Since sub-page compressed blocks are now considered, `lobits` for
a lcluster in each pack cannot always be `lclusterbits` as before.
Otherwise, there is no enough room for the special value
`Z_EROFS_VLE_DI_D0_CBLKCNT`.

To support smaller block sizes, `lobits` for each compacted lcluster is
now calculated as:
   lobits = max(lclusterbits, ilog2(Z_EROFS_VLE_DI_D0_CBLKCNT) + 1)

Reviewed-by: Yue Hu <huyue2@coolpad.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20231206091057.87027-4-hsiangkao@linux.alibaba.com

Bug: 318378021
Change-Id: Iacd89e2b33ddf39ea40b90e88a2bf99bb5a83b31
(cherry picked from commit 8d2517aaeea3ab8651bb517bca8f3c8664d318ea
 https: //git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs.git dev)
[dhavale: resolved conflicts in zmap.c due to older naming of constants
and updated commit message also to use the older names]
Signed-off-by: Sandeep Dhavale <dhavale@google.com>
2024-01-03 18:37:43 +00:00
Gao Xiang
d7bb85f1cb FROMGIT: erofs: record pclustersize in bytes instead of pages
Currently, compressed sizes are recorded in pages using `pclusterpages`,
However, for tailpacking pclusters, `tailpacking_size` is used instead.

This approach doesn't work when dealing with sub-page blocks. To address
this, let's switch them to the unified `pclustersize` in bytes.

Reviewed-by: Yue Hu <huyue2@coolpad.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20231206091057.87027-3-hsiangkao@linux.alibaba.com

Bug: 318378021
Change-Id: Ia8c50a7b4adcf6cd161b1d6f8bfc5a7fd3371079
(cherry picked from commit 54ed3fdd66055d073cb1cd2c6c65bbc0683c40cf
 https: //git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs.git dev)
Signed-off-by: Sandeep Dhavale <dhavale@google.com>
2024-01-03 18:37:43 +00:00
Gao Xiang
9d259220ac FROMGIT: erofs: support I/O submission for sub-page compressed blocks
Add a basic I/O submission path first to support sub-page blocks:

 - Temporary short-lived pages will be used entirely;

 - In-place I/O pages can be used partially, but compressed pages need
   to be able to be mapped in contiguous virtual memory.

As a start, currently cache decompression is explicitly disabled for
sub-page blocks, which will be supported in the future.

Reviewed-by: Yue Hu <huyue2@coolpad.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20231206091057.87027-2-hsiangkao@linux.alibaba.com

Bug: 318378021
Change-Id: Ib2cb6120805ab479a450580fc8774af131271791
(cherry picked from commit 192351616a9dde686492bcb9d1e4895a1411a527
 https: //git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs.git dev)
Signed-off-by: Sandeep Dhavale <dhavale@google.com>
2024-01-03 18:37:43 +00:00
Gao Xiang
8a49ea9441 FROMGIT: erofs: fix lz4 inplace decompression
Currently EROFS can map another compressed buffer for inplace
decompression, that was used to handle the cases that some pages of
compressed data are actually not in-place I/O.

However, like most simple LZ77 algorithms, LZ4 expects the compressed
data is arranged at the end of the decompressed buffer and it
explicitly uses memmove() to handle overlapping:
  __________________________________________________________
 |_ direction of decompression --> ____ |_ compressed data _|

Although EROFS arranges compressed data like this, it typically maps two
individual virtual buffers so the relative order is uncertain.
Previously, it was hardly observed since LZ4 only uses memmove() for
short overlapped literals and x86/arm64 memmove implementations seem to
completely cover it up and they don't have this issue.  Juhyung reported
that EROFS data corruption can be found on a new Intel x86 processor.
After some analysis, it seems that recent x86 processors with the new
FSRM feature expose this issue with "rep movsb".

Let's strictly use the decompressed buffer for lz4 inplace
decompression for now.  Later, as an useful improvement, we could try
to tie up these two buffers together in the correct order.

Reported-and-tested-by: Juhyung Park <qkrwngud825@gmail.com>
Closes: https://lore.kernel.org/r/CAD14+f2AVKf8Fa2OO1aAUdDNTDsVzzR6ctU_oJSmTyd6zSYR2Q@mail.gmail.com
Fixes: 0ffd71bcc3 ("staging: erofs: introduce LZ4 decompression inplace")
Fixes: 598162d050 ("erofs: support decompress big pcluster for lz4 backend")
Cc: stable <stable@vger.kernel.org> # 5.4+
Tested-by: Yifan Zhao <zhaoyifan@sjtu.edu.cn>
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20231206045534.3920847-1-hsiangkao@linux.alibaba.com

Bug: 318378021
Change-Id: Ifd2981320f9f79b27bc7484d8906501a2fa05359
(cherry picked from commit 3c12466b6b7bf1e56f9b32c366a3d83d87afb4de
 https://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs.git dev)
Signed-off-by: Sandeep Dhavale <dhavale@google.com>
2024-01-03 18:37:43 +00:00
Gao Xiang
bdc5d268ba FROMGIT: erofs: fix memory leak on short-lived bounced pages
Both MicroLZMA and DEFLATE algorithms can use short-lived pages on
demand for the overlapped inplace I/O decompression.

However, those short-lived pages are actually added to
`be->compressed_pages`.  Thus, it should be checked instead of
`pcl->compressed_bvecs`.

The LZ4 algorithm doesn't work like this, so it won't be impacted.

Fixes: 67139e36d9 ("erofs: introduce `z_erofs_parse_in_bvecs'")
Reviewed-by: Yue Hu <huyue2@coolpad.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20231128180431.4116991-1-hsiangkao@linux.alibaba.com

Bug: 318378021
Change-Id: Ia1f602e9944b884022a3e20db12af568304fd80c
(cherry picked from commit 93d6fda7f926451a0fa1121b9558d75ca47e861e
https://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs.git dev)
Signed-off-by: Sandeep Dhavale <dhavale@google.com>
2024-01-03 18:37:43 +00:00