Commit Graph

1065949 Commits

Author SHA1 Message Date
Paul Lawrence
14a5cd6ae3 ANDROID: fuse-bpf: Do not change bpf program in lookups
If a lookup finds an existing inode, it must not change the existing bpf
program since it may be in use.

Bug: 267095363
Test: fuse_test, atest CtsScopedStorageHostTest
Change-Id: Icb00681fbcd51fdd4b0764906509093d98caeec4
Signed-off-by: Paul Lawrence <paullawrence@google.com>
2023-02-21 12:56:49 -08:00
Alessandro Astone
dff91fc664 UPSTREAM: binder: Gracefully handle BINDER_TYPE_FDA objects with num_fds=0
Some android userspace is sending BINDER_TYPE_FDA objects with
num_fds=0. Like the previous patch, this is reproducible when
playing a video.

Before commit 09184ae9b5 BINDER_TYPE_FDA objects with num_fds=0
were 'correctly handled', as in no fixup was performed.

After commit 09184ae9b5 we aggregate fixup and skip regions in
binder_ptr_fixup structs and distinguish between the two by using
the skip_size field: if it's 0, then it's a fixup, otherwise skip.
When processing BINDER_TYPE_FDA objects with num_fds=0 we add a
skip region of skip_size=0, and this causes issues because now
binder_do_deferred_txn_copies will think this was a fixup region.

To address that, return early from binder_translate_fd_array to
avoid adding an empty skip region.
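
The ambiguity described above can be modelled in a few lines of C. This is
only a sketch: struct ptr_fixup, struct fixup_list, is_fixup() and
add_fd_array_skip() are hypothetical stand-ins, not the kernel's
binder_ptr_fixup or binder_translate_fd_array(), and the 4-byte fd size is
an assumption of the sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for the kernel's binder_ptr_fixup. */
struct ptr_fixup {
	size_t offset;     /* where in the buffer the region starts */
	size_t skip_size;  /* 0 means "fixup", non-zero means "skip" */
	uint64_t value;    /* only meaningful for fixups */
};

#define MAX_FIXUPS 8

struct fixup_list {
	struct ptr_fixup items[MAX_FIXUPS];
	size_t count;
};

/* The two kinds of region are told apart by skip_size alone. */
static int is_fixup(const struct ptr_fixup *pf)
{
	return pf->skip_size == 0;
}

/*
 * Record a skip region covering an fd array of num_fds entries.
 * Returns 0 on success, -1 if the list is full.
 */
static int add_fd_array_skip(struct fixup_list *list, size_t offset,
			     size_t num_fds)
{
	struct ptr_fixup *pf;

	/*
	 * An empty array would produce skip_size == 0, which is_fixup()
	 * would misread as a fixup region, so return before adding it.
	 */
	if (num_fds == 0)
		return 0;
	if (list->count == MAX_FIXUPS)
		return -1;

	pf = &list->items[list->count++];
	pf->offset = offset;
	pf->skip_size = num_fds * sizeof(uint32_t);
	pf->value = 0;
	return 0;
}
```

Without the early return, the list would gain an entry that the copy loop
cannot distinguish from a real fixup.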

Fixes: 09184ae9b5 ("binder: defer copies of pre-patched txn data")
Acked-by: Todd Kjos <tkjos@google.com>
Cc: stable <stable@kernel.org>
Signed-off-by: Alessandro Astone <ales.astone@gmail.com>
Link: https://lore.kernel.org/r/20220415120015.52684-1-ales.astone@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

Bug: 257685302
(cherry picked from commit ef38de9217)
Change-Id: I34fab41c0c1beee366a5df4724b263e4385ad13b
Signed-off-by: Carlos Llamas <cmllamas@google.com>
Signed-off-by: Lee Jones <joneslee@google.com>
2023-02-21 20:31:36 +00:00
Alessandro Astone
404504ef6e UPSTREAM: binder: Address corner cases in deferred copy and fixup
When handling BINDER_TYPE_FDA object we are pushing a parent fixup
with a certain skip_size but no scatter-gather copy object, since
the copy is handled standalone.
If BINDER_TYPE_FDA is the last child, the scatter-gather copy
loop will never stop to skip it, thus we are left with an item in
the parent fixup list. This will trigger the BUG_ON().

This is reproducible in android when playing a video.
We receive a transaction that looks like this:
    obj[0] BINDER_TYPE_PTR, parent
    obj[1] BINDER_TYPE_PTR, child
    obj[2] BINDER_TYPE_PTR, child
    obj[3] BINDER_TYPE_FDA, child

Fixes: 09184ae9b5 ("binder: defer copies of pre-patched txn data")
Acked-by: Todd Kjos <tkjos@google.com>
Cc: stable <stable@kernel.org>
Signed-off-by: Alessandro Astone <ales.astone@gmail.com>
Link: https://lore.kernel.org/r/20220415120015.52684-2-ales.astone@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

Bug: 257685302
(cherry picked from commit 2d1746e3fd)
Change-Id: I3963a98dfc48b01d7bb8166aaa90341818bf6416
Signed-off-by: Carlos Llamas <cmllamas@google.com>
Signed-off-by: Lee Jones <joneslee@google.com>
2023-02-21 20:31:36 +00:00
Arnd Bergmann
dc88c3e2b7 UPSTREAM: binder: fix pointer cast warning
binder_uintptr_t is not the same as uintptr_t, so converting it into a
pointer requires a second cast:

drivers/android/binder.c: In function 'binder_translate_fd_array':
drivers/android/binder.c:2511:28: error: cast to pointer from integer of different size [-Werror=int-to-pointer-cast]
 2511 |         sender_ufda_base = (void __user *)sender_uparent->buffer + fda->parent_offset;
      |                            ^
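
A minimal userspace sketch of the double cast, assuming binder_uintptr_t is
a fixed 64-bit integer as it is in the default binder UAPI. The helper
user_ptr_at() is hypothetical and only illustrates the idiom; it is not the
kernel code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption for this sketch: a fixed 64-bit type, whatever the pointer size. */
typedef uint64_t binder_uintptr_t;

/*
 * Convert a binder_uintptr_t to a byte pointer and add an offset.
 * The inner cast to uintptr_t narrows (or keeps) the integer at pointer
 * width first, so the outer integer-to-pointer cast never sees an integer
 * of a different size, which is what -Wint-to-pointer-cast complains about.
 */
static unsigned char *user_ptr_at(binder_uintptr_t base, size_t offset)
{
	return (unsigned char *)(uintptr_t)base + offset;
}
```

On a 64-bit build both casts are no-ops; the intermediate cast only matters
where pointers are narrower than binder_uintptr_t.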

Fixes: 656e01f3ab ("binder: read pre-translated fds from sender buffer")
Acked-by: Todd Kjos <tkjos@google.com>
Acked-by: Randy Dunlap <rdunlap@infradead.org> # build-tested
Acked-by: Christian Brauner <christian.brauner@ubuntu.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Link: https://lore.kernel.org/r/20211207122448.1185769-1-arnd@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

Bug: 257685302
(cherry picked from commit 9a0a930fe2)
Change-Id: I1c9b86a90bcf2be81012e59e0c472869f551e61a
Signed-off-by: Carlos Llamas <cmllamas@google.com>
Signed-off-by: Lee Jones <joneslee@google.com>
2023-02-21 20:31:36 +00:00
Todd Kjos
d54b5252db UPSTREAM: binder: defer copies of pre-patched txn data
BINDER_TYPE_PTR objects point to memory areas in the
source process to be copied into the target buffer
as part of a transaction. This implements a scatter-
gather model where non-contiguous memory in a source
process is "gathered" into a contiguous region in
the target buffer.

The data can include pointers that must be fixed up
to correctly point to the copied data. To avoid making
source process pointers visible to the target process,
this patch defers the copy until the fixups are known
and then copies and fixups are done together.

There is a special case of BINDER_TYPE_FDA which applies
the fixup later in the target process context. In this
case the user data is skipped (so no untranslated fds
become visible to the target).
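
The deferred copy can be pictured as one pass over the source data that
copies the plain bytes, writes translated values at fixup offsets, and
leaves skip regions untouched. The following is a simplified userspace
sketch under stated assumptions: struct fixup and copy_with_fixups() are
hypothetical names, fixups are assumed sorted by offset and non-overlapping,
and every fixup is assumed to be 8 bytes wide. It is not the kernel's
binder_do_deferred_txn_copies().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical region descriptor: skip_size == 0 marks a fixup. */
struct fixup {
	size_t offset;
	size_t skip_size;
	uint64_t value;  /* translated value written for a fixup */
};

/*
 * Copy len bytes from src to dst. At each fixup, write the translated
 * value instead of the sender's bytes; at each skip region, write
 * nothing, so the sender's bytes never reach dst.
 */
static void copy_with_fixups(unsigned char *dst, const unsigned char *src,
			     size_t len, const struct fixup *fixups,
			     size_t nfixups)
{
	size_t pos = 0;

	for (size_t i = 0; i < nfixups; i++) {
		const struct fixup *f = &fixups[i];
		size_t hole = f->skip_size ? f->skip_size : sizeof(f->value);

		assert(f->offset >= pos && f->offset + hole <= len);
		/* Plain data between the previous region and this one. */
		memcpy(dst + pos, src + pos, f->offset - pos);
		if (f->skip_size == 0)
			memcpy(dst + f->offset, &f->value, sizeof(f->value));
		pos = f->offset + hole;
	}
	/* Remaining plain data after the last region. */
	memcpy(dst + pos, src + pos, len - pos);
}
```

Because the untranslated bytes in fixup and skip regions are never copied,
the destination only ever holds either plain data or translated values.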

Reviewed-by: Martijn Coenen <maco@android.com>
Signed-off-by: Todd Kjos <tkjos@google.com>
Link: https://lore.kernel.org/r/20211130185152.437403-5-tkjos@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

Bug: 137131904
Bug: 257685302
(cherry picked from commit 09184ae9b5)
[cmllamas: fix trivial merge conflict]
Change-Id: I6de75b192d1e3b2cc73c8d91077d97b608e8c5a9
Signed-off-by: Carlos Llamas <cmllamas@google.com>
Signed-off-by: Lee Jones <joneslee@google.com>
2023-02-21 20:31:36 +00:00
Todd Kjos
2eca125266 UPSTREAM: binder: read pre-translated fds from sender buffer
This patch is to prepare for an upcoming patch where we read
pre-translated fds from the sender buffer and translate them before
copying them to the target.  It does not change runtime behavior.

The patch adds two new parameters to binder_translate_fd_array() to
hold the sender buffer and sender buffer parent.  These parameters let
us call copy_from_user() directly from the sender instead of using
binder_alloc_copy_from_buffer() to copy from the target.  Also the patch
adds some new alignment checks.  Previously the alignment checks would
have been done in a different place, but this lets us print more
useful error messages.

Reviewed-by: Martijn Coenen <maco@android.com>
Acked-by: Christian Brauner <christian.brauner@ubuntu.com>
Signed-off-by: Todd Kjos <tkjos@google.com>
Link: https://lore.kernel.org/r/20211130185152.437403-4-tkjos@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

Bug: 137131904
Bug: 257685302
(cherry picked from commit 656e01f3ab)
Change-Id: Ib786020e49bd33e35aec88d43965f9d98021fa53
Signed-off-by: Carlos Llamas <cmllamas@google.com>
Signed-off-by: Lee Jones <joneslee@google.com>
2023-02-21 16:06:55 +00:00
Mostafa Saleh
0f00f01625 ANDROID: KVM: arm64: iommu: Add arg to finalize to pass state
Add an argument to the finalize HVC/function, to be used by the EL1
driver.
The argument holds a standard error code. In case of any error, pKVM
will erase pvmfw.

Bug: 268607700
Change-Id: I9f6a6bfc89d3381ab88938586d3b73dd5d94102a
Signed-off-by: Mostafa Saleh <smostafa@google.com>
2023-02-21 13:24:18 +00:00
Mostafa Saleh
3e8a2f0f1a ANDROID: KVM: arm64: Add function to report misconfigurations to pKVM.
Add function pkvm_handle_system_misconfiguration that is used to
report misconfigurations to pKVM that can undermine its security,
so pKVM can take the proper action.
This patch only adds one event, NO_DMA_ISOLATION, to indicate that DMA
is not isolated and can access the hypervisor.
The patch adds type pkvm_system_misconfiguration to identify the event
instead of having a void function with only one action, as in the
future different events can have different responses.

Bug: 268607700
Change-Id: I9f0d2aeee25bd6bed622d327d6cbb36119c54c58
Signed-off-by: Mostafa Saleh <smostafa@google.com>
2023-02-21 13:24:18 +00:00
Peter Collingbourne
f72703a9c1 BACKPORT: FROMLIST: arm64: Reset KASAN tag in copy_highpage with HW tags only
During page migration, the copy_highpage function is used to copy the
page data to the target page. If the source page is a userspace page
with MTE tags, the KASAN tag of the target page must have the match-all
tag in order to avoid tag check faults during subsequent accesses to the
page by the kernel. However, the target page may have been allocated in
a number of ways, some of which will use the KASAN allocator and will
therefore end up setting the KASAN tag to a non-match-all tag. Therefore,
update the target page's KASAN tag to match the source page.

We ended up unintentionally fixing this issue as a result of a bad
merge conflict resolution between commit e059853d14 ("arm64: mte:
Fix/clarify the PG_mte_tagged semantics") and commit 20794545c1 ("arm64:
kasan: Revert "arm64: mte: reset the page tag in page->flags""), which
preserved a tag reset for PG_mte_tagged pages which was considered to be
unnecessary at the time. Because SW tags KASAN uses separate tag storage,
update the code to only reset the tags when HW tags KASAN is enabled.

Signed-off-by: Peter Collingbourne <pcc@google.com>
Link: https://linux-review.googlesource.com/id/If303d8a709438d3ff5af5fd85706505830f52e0c
Reported-by: "Kuan-Ying Lee (李冠穎)" <Kuan-Ying.Lee@mediatek.com>
Cc: <stable@vger.kernel.org> # 6.1
Fixes: 20794545c1 ("arm64: kasan: Revert "arm64: mte: reset the page tag in page->flags"")
Link: https://lore.kernel.org/all/20230215050911.1433132-1-pcc@google.com/
[pcc@google.com: applied merge resolution given in link]
Bug: 265863271
Change-Id: If303d8a709438d3ff5af5fd85706505830f52e0c
2023-02-17 21:48:13 +00:00
Catalin Marinas
7dc7a6cd90 BACKPORT: arm64: mte: Fix/clarify the PG_mte_tagged semantics
Currently the PG_mte_tagged page flag mostly means the page contains
valid tags and it should be set after the tags have been cleared or
restored. However, in mte_sync_tags() it is set before setting the tags
to avoid, in theory, a race with concurrent mprotect(PROT_MTE) for
shared pages. However, a concurrent mprotect(PROT_MTE) with a copy on
write in another thread can cause the new page to have stale tags.
Similarly, tag reading via ptrace() can read stale tags if the
PG_mte_tagged flag is set before actually clearing/restoring the tags.

Fix the PG_mte_tagged semantics so that it is only set after the tags
have been cleared or restored. This is safe for swap restoring into a
MAP_SHARED or CoW page since the core code takes the page lock. Add two
functions to test and set the PG_mte_tagged flag with acquire and
release semantics. The downside is that concurrent mprotect(PROT_MTE) on
a MAP_SHARED page may cause tag loss. This is already the case for KVM
guests if a VMM changes the page protection while the guest triggers a
user_mem_abort().
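
The ordering rule (write the tags first, publish the flag afterwards) can be
sketched with C11 atomics. This is a userspace model under assumptions:
struct page_model, set_tagged(), is_tagged() and restore_tags() are
hypothetical names, and the release store and acquire load stand in for the
acquire/release test and set helpers mentioned above.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical model of a page with tag storage and a "tagged" flag. */
struct page_model {
	unsigned char tags[16];
	atomic_bool tagged;
};

/* Release: all earlier tag writes are visible before the flag is. */
static void set_tagged(struct page_model *p)
{
	atomic_store_explicit(&p->tagged, true, memory_order_release);
}

/* Acquire: a reader that sees the flag also sees the tag writes. */
static bool is_tagged(struct page_model *p)
{
	return atomic_load_explicit(&p->tagged, memory_order_acquire);
}

/* Restore the tags first and set the flag only afterwards. */
static void restore_tags(struct page_model *p, const unsigned char *saved)
{
	memcpy(p->tags, saved, sizeof(p->tags));
	set_tagged(p);
}
```

A reader that checks is_tagged() before reading tags[] therefore cannot
observe the flag set while the tags are still stale.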

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
[pcc@google.com: fix build with CONFIG_ARM64_MTE disabled]
Signed-off-by: Peter Collingbourne <pcc@google.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Reviewed-by: Steven Price <steven.price@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Peter Collingbourne <pcc@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20221104011041.290951-3-pcc@google.com
(cherry picked from commit e059853d14)
[pcc@google.com: resolved conflict in arch/arm64/include/asm/pgtable.h]
Bug: 265863271
Change-Id: Iff1bfa26982c16eac47120ee48a68b3fe60a5743
2023-02-17 21:42:20 +00:00
fengqi
2d2faad7be FROMLIST: input: Add KEY_CAMERA_FOCUS event in HID
Our HID device needs the KEY_CAMERA_FOCUS event to control the camera,
but this event does not exist in the current HID driver.
So we add this event in hid-input.c.

Bug: 263846073
Link: https://lore.kernel.org/linux-input/Y+4YcnbPwWAnhrPt@kroah.com/
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: fengqi <fengqi@xiaomi.com>
Change-Id: I500881ea8b6b4e31099f2120e2c492f2793bf086
(cherry picked from commit af8dfb011fd0e434de7f0287e561a67757fb9346)
2023-02-17 15:13:43 +00:00
Vincent Donnefort
6742bc7218 ANDROID: KVM: arm64: Support missing pKVM module sections
pKVM modules being rather small, it is expected for some basic sections
to be missing or empty (especially rodata and data). Make those optional
in the loader.

Bug: 269245057
Change-Id: I874050230de5cb4b3b29d316663400bb221e2021
Signed-off-by: Vincent Donnefort <vdonnefort@google.com>
Signed-off-by: Will Deacon <willdeacon@google.com>
2023-02-17 11:01:17 +00:00
Martin Liu
9c3705e41c FROMGIT: of: reserved-mem: print out reserved-mem details during boot
It's important to know reserved-mem information in the mobile world
since reserved memory via device tree keeps increasing on platforms
(e.g., 45% in our platform). Therefore, it's crucial to know the
breakdown of reserved memory sizes for memory accounting.

This patch prints out reserved memory details during boot to make
them visible.

Below is an example output:

[    0.000000] OF: reserved mem: 0x00000009f9400000..0x00000009fb3fffff ( 32768 KB ) map reusable test1
[    0.000000] OF: reserved mem: 0x00000000ffdf0000..0x00000000ffffffff ( 2112 KB ) map non-reusable test2
[    0.000000] OF: reserved mem: 0x0000000091000000..0x00000000912fffff ( 3072 KB ) nomap non-reusable test3

Bug: 269588564
Change-Id: Idf77b3a9de70ed13c806d3b03d1886b5ae89da62
Signed-off-by: Martin Liu <liumartin@google.com>
Link: https://lore.kernel.org/r/20230209160954.1471909-1-liumartin@google.com
Signed-off-by: Rob Herring <robh@kernel.org>
(cherry picked from commit aeb9267eb6
git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux.git for-next)
2023-02-17 00:04:55 +00:00
Nathan Chancellor
eaee7326d3 ANDROID: GKI: Fix copying of protected_exports
When building gki_defconfig outside of the Android build system, copying
protected_exports fails:

  $ make -skj"$(nproc)" LLVM=1 O=build gki_defconfig all
  cp: cannot create regular file '/protected_exports': Permission denied
  ...

OUT_DIR is an Android build.sh specific variable, so it will not be
defined when using just kbuild. Use objtree instead, which is guaranteed
to be available through kbuild directly; OUT_DIR is passed to make via
O, which is used to ultimately define objtree, so there is no functional
change.

Bug: 268678245
Change-Id: I235cef7c848a7cf9df9d7d5343af33d95b501a15
Fixes: 9f3f9a2634e02 ("ANDROID: GKI: Do not modify protected exports source list")
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
2023-02-16 21:56:59 +00:00
Jaegeuk Kim
9ccb90319b ANDROID: scsi: ufs: add zoned device sysfs entries
This adds zufs entries.

Bug: 197782466
Bug: 269471019
Signed-off-by: Jaegeuk Kim <jaegeuk@google.com>
Change-Id: I02b737005b8fa8db1c578c9794f646f35f85d6ac
2023-02-16 18:07:49 +00:00
Jaegeuk Kim
96eb96c648 FROMLIST: scsi: ufs: support IO traces for zoned block device
Let's support WRITE_16, READ_16, ZBC_IN, ZBC_OUT.

Bug: 197782466
Link: https://lore.kernel.org/lkml/20230215190448.1687786-1-jaegeuk@kernel.org/T/#u
Signed-off-by: Jaegeuk Kim <jaegeuk@google.com>
Change-Id: I7bd9b9cddf2074c550f36ac692519b1d1eb617dd
2023-02-16 18:07:49 +00:00
Jaegeuk Kim
b50e6a61bb ANDROID: Revert "f2fs: ensure only power of 2 zone sizes are allowed"
This reverts commit f634322eac.

Bug: 197782466
Bug: 269471019
Signed-off-by: Jaegeuk Kim <jaegeuk@google.com>
Change-Id: I39390fa3d145ca7761ec2b4a0f405fc84ce81206
2023-02-16 18:07:49 +00:00
Aleksei Vetrov
d34386a49d ANDROID: GKI: enable KMI enforcement
Add android/abi_gki_aarch64.stg as initial ABI representation of the
KMI, enable trimming of symbols outside KMI and start enforcing KMI.

While this is hard enforcement in the code base, we still allow
controlled changes to the ABI until KMI freeze.

Test: TH
Bug: 269323432
Change-Id: I016fe12aff4d781640340e16a2ae278e6bf5cd84
Signed-off-by: Aleksei Vetrov <vvvvvv@google.com>
2023-02-16 10:57:37 +00:00
Ramji Jiyani
2f173b83a0 ANDROID: GKI: Update db845c symbol list
Updated with: bazel run //common:db845c_abi_update_symbol_lis

Update to add missing symbols required at runtime for unsigned
modules. This fixes the failing build-time check for these symbols
introduced with aosp/2410926, if it re-lands.

Bug: 269240239
Test: TH
Change-Id: I43ce14dd0549fb7974e3e635e3ee1c02194c8b3a
Signed-off-by: Ramji Jiyani <ramjiyani@google.com>
2023-02-15 22:20:55 +00:00
Bart Van Assche
d6204f1e8d ANDROID: scsi/sd_zbc: Support npo2 zone sizes
Remove the restriction that the zone size must be a power of two. This
patch has been tested with the following test script:

. tests/zbd/rc
. common/null_blk
. common/scsi_debug

DESCRIPTION="test npo2 zone size support"
QUICK=1

requires() {
	_have_fio
	_have_driver f2fs
	_have_module_param scsi_debug zone_size_mb
	_have_scsi_debug
}

test() {
	echo "Running ${TEST_NAME}"

	local scsi_debug_params=(
		delay=0
		dev_size_mb=1024
		sector_size=4096
		zbc=host-managed
		zone_nr_conv=0
		zone_size_mb=3
	)
	_init_scsi_debug "${scsi_debug_params[@]}" &&
	local zdev="/dev/${SCSI_DEBUG_DEVICES[0]}" fail &&
	ls -ld "${zdev}" >>"${FULL}" &&
	local fio_args=(
		--direct=1
		--file="${zdev}"
		--gtod_reduce=1
		--iodepth=64
		--iodepth_batch=16
		--ioengine=io_uring
		--ioscheduler=none
		--name=npo2zs
		--runtime=10
		--size=1M
		--time_based=1
		--zonemode=zbd
	) &&
	_run_fio_verify_io "${fio_args[@]}" >>"${FULL}" 2>&1 ||
	fail=true

	_exit_scsi_debug

	if [ -z "$fail" ]; then
		echo "Test complete"
	else
		echo "Test failed"
		return 1
	fi
}

Bug: 197782466
Bug: 269471019
Change-Id: I70b498ab8920b4e1a13e04b753fe176a632552b2
Signed-off-by: Bart Van Assche <bvanassche@google.com>
2023-02-15 18:44:00 +00:00
Bart Van Assche
7dadc51d98 ANDROID: scsi: scsi_debug: Support npo2 zone sizes
Remove the restriction that the zone size must be a power of two.

Bug: 197782466
Bug: 269471019
Change-Id: I7bd9c8f19ec601b82e0e1c271c3e362ddaf9a0ed
Signed-off-by: Bart Van Assche <bvanassche@google.com>
2023-02-15 18:44:00 +00:00
Bart Van Assche
3236bf4f10 ANDROID: nvmet: Use the bdev_is_zone_start() function
Add support for zone sizes that are not a power of two in nvmet.

Bug: 197782466
Bug: 269471019
Change-Id: I7ec207985799d3c7a82d26033b87e37b51935640
Signed-off-by: Bart Van Assche <bvanassche@google.com>
2023-02-15 18:44:00 +00:00
Pankaj Raghav
730dba4d11 FROMLIST: dm: call dm_zone_endio after the target endio callback for zoned devices
dm_zone_endio() updates the bi_sector of the orig bio for zoned devices
that use either native append or append emulation, and it is called
before the endio of the target. But the target endio can still update
the clone bio after dm_zone_endio is called, so the orig bio no longer
contains the updated information.

Currently, this is not a problem as the targets that support zoned devices
such as dm-zoned, dm-linear, and dm-crypt do not have an endio function,
and even if they do (such as dm-flakey), they don't modify the
bio->bi_iter.bi_sector of the cloned bio that is used to update the
orig_bio's bi_sector in dm_zone_endio function.

This is a prep patch for the new dm-po2zoned target as it modifies
bi_sector in the endio callback.

Call dm_zone_endio for zoned devices after calling the target's endio
function.

Reviewed-by: Mike Snitzer <snitzer@kernel.org>
Signed-off-by: Pankaj Raghav <p.raghav@samsung.com>
Bug: 197782466
Bug: 269471019
Link: https://lore.kernel.org/linux-block/20220923173618.6899-12-p.raghav@samsung.com/
Change-Id: Ia7a96aac805a040f8ab109e6cfdf50ad9895e2ee
Signed-off-by: Bart Van Assche <bvanassche@google.com>
2023-02-15 18:44:00 +00:00
Pankaj Raghav
79d615bb54 FROMLIST: dm-table: allow zoned devices with non power-of-2 zone sizes
Allow dm to support zoned devices with non power-of-2(po2) zone sizes as
the block layer now supports it.

Signed-off-by: Pankaj Raghav <p.raghav@samsung.com>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Reviewed-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Reviewed-by: Mike Snitzer <snitzer@kernel.org>
Bug: 197782466
Bug: 269471019
Link: https://lore.kernel.org/linux-block/20220923173618.6899-11-p.raghav@samsung.com/
Change-Id: I837811b17aacecc74fc2fd9d4009ad7e66917fb8
Signed-off-by: Bart Van Assche <bvanassche@google.com>
2023-02-15 18:44:00 +00:00
Pankaj Raghav
d3f3571556 FROMLIST: dm-zone: use generic helpers to calculate offset from zone start
Use the bdev_offset_from_zone_start() helper function to calculate
the offset from zone start instead of using power of 2 based
calculation.

Signed-off-by: Pankaj Raghav <p.raghav@samsung.com>
Reviewed-by: Luis Chamberlain <mcgrof@kernel.org>
Reviewed-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Reviewed-by: Mike Snitzer <snitzer@kernel.org>
Bug: 197782466
Bug: 269471019
Link: https://lore.kernel.org/linux-block/20220923173618.6899-10-p.raghav@samsung.com/
Change-Id: If04c016235c87e23b2b544f947ae8bdfc3df4c48
Signed-off-by: Bart Van Assche <bvanassche@google.com>
2023-02-15 18:44:00 +00:00
Luis Chamberlain
3ce9fbc5c2 FROMLIST: dm-zoned: ensure only power of 2 zone sizes are allowed
dm-zoned relies on the assumption that the zone size is a
power-of-2(po2) and the zone capacity is the same as the zone size.

Ensure only po2 devices can be used as dm-zoned target until a native
support for zoned devices with non-po2 zone size is added.

Reviewed-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Reviewed-by: Mike Snitzer <snitzer@kernel.org>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
Signed-off-by: Pankaj Raghav <p.raghav@samsung.com>
Link: https://lore.kernel.org/linux-block/20220923173618.6899-9-p.raghav@samsung.com/
Bug: 197782466
Bug: 269471019
Change-Id: I7b043036a9b95de779bc296465ccf4d9a0f222a2
Signed-off-by: Bart Van Assche <bvanassche@google.com>
2023-02-15 18:44:00 +00:00
Pankaj Raghav
5498feea6c FROMLIST: null_blk: allow zoned devices with non power-of-2 zone sizes
Convert the power-of-2(po2) based calculation with zone size to be generic
in null_zone_no with optimization for po2 zone sizes.

The nr_zones calculation in null_init_zoned_dev has been replaced with a
division without special handling for po2 zone sizes as this function is
called only during the initialization and will not be invoked in the hot
path.

Reviewed-by: Luis Chamberlain <mcgrof@kernel.org>
Reviewed-by: Adam Manzanares <a.manzanares@samsung.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Pankaj Raghav <p.raghav@samsung.com>
Bug: 197782466
Bug: 269471019
Link: https://lore.kernel.org/linux-block/20220923173618.6899-7-p.raghav@samsung.com/
Change-Id: I8d1a915e6e09b04095acdf964d31837c4206bc49
Signed-off-by: Bart Van Assche <bvanassche@google.com>
2023-02-15 18:44:00 +00:00
Pankaj Raghav
941c952c34 FROMLIST: nvme: zns: Allow ZNS drives that have non-power_of_2 zone size
Remove the condition that prevents ZNS drives with a non-power_of_2
zone size from being updated, and use a generic method to calculate the
number of zones instead of relying on log and shift based calculation
on zone size.

The power_of_2 calculation has been replaced directly with generic
calculation without special handling. Both modified functions are not
used in hot paths, they are only used during initialization &
revalidation of the ZNS device.

As rounddown macro from math.h does not work for 32 bit architectures,
round down operation is open coded.

Reviewed-by: Luis Chamberlain <mcgrof@kernel.org>
Reviewed-by: Adam Manzanares <a.manzanares@samsung.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
Signed-off-by: Pankaj Raghav <p.raghav@samsung.com>
Bug: 197782466
Bug: 269471019
Link: https://lore.kernel.org/linux-block/20220923173618.6899-6-p.raghav@samsung.com/
Change-Id: Id15b9b6f68498477f3d1c6159c5a459749f856a9
Signed-off-by: Bart Van Assche <bvanassche@google.com>
2023-02-15 18:44:00 +00:00
Pankaj Raghav
b37ecc177c BACKPORT: FROMLIST: block: allow blk-zoned devices to have non-power-of-2 zone size
Checking if a given sector is aligned to a zone is a common
operation that is performed for zoned devices. Add
bdev_is_zone_start helper to check for this instead of opencoding it
everywhere.

Convert the calculations on zone size to be generic instead of relying on
power-of-2(po2) based arithmetic in the block layer using the helpers
wherever possible.

The only hot path affected by this change for zoned devices with po2
zone size is in blk_check_zone_append() but bdev_is_zone_start() helper is
used to optimize the calculation for po2 zone sizes.
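
A userspace sketch of the zone-start check with a fast path for po2 zone
sizes. The function names mirror the helpers named in this series, but the
signatures are simplified assumptions: they take a raw sector and zone size
instead of a block device, so this is an illustration, not the kernel code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

static bool is_power_of_2(uint64_t n)
{
	return n != 0 && (n & (n - 1)) == 0;
}

/* Offset of a sector from the start of the zone that contains it. */
static uint64_t offset_from_zone_start(uint64_t sector, uint64_t zone_sectors)
{
	assert(zone_sectors != 0);
	/* po2 zone size: a mask is enough, no division needed. */
	if (is_power_of_2(zone_sectors))
		return sector & (zone_sectors - 1);
	/* Generic zone size: fall back to a remainder. */
	return sector % zone_sectors;
}

/* A sector starts a zone when its offset within the zone is zero. */
static bool is_zone_start(uint64_t sector, uint64_t zone_sectors)
{
	return offset_from_zone_start(sector, zone_sectors) == 0;
}
```

For example, with a 3 MiB zone (6144 sectors of 512 bytes), sector 12288
starts a zone while sector 8192 lies 2048 sectors into the second zone.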

Finally, allow zoned devices with non po2 zone sizes provided that their
zone capacity and zone size are equal. The main motivation to allow zoned
devices with non po2 zone size is to remove the unmapped LBA between
zone capacity and zone size for devices that cannot have a po2 zone
capacity.

Reviewed-by: Luis Chamberlain <mcgrof@kernel.org>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Pankaj Raghav <p.raghav@samsung.com>
Bug: 197782466
Bug: 269471019
Link: https://lore.kernel.org/linux-block/20220923173618.6899-4-p.raghav@samsung.com/
Change-Id: I2ecc186d7b14f5508b6abfe9821526d39a21d7e4
Signed-off-by: Bart Van Assche <bvanassche@google.com>
2023-02-15 18:44:00 +00:00
Pankaj Raghav
39ae1728d8 BACKPORT: FROMLIST: block: make bdev_nr_zones and disk_zone_no generic for npo2 zone size
Adapt bdev_nr_zones and disk_zone_no functions so that they can
also work for non-power-of-2 zone sizes.

As the existing deployments assume that a device zone size is a power of
2 number of sectors, power-of-2 optimized calculation is used for those
devices.

There are no direct hot paths modified and the changes just
introduce one new branch per call.
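
The "one new branch per call" can be illustrated as follows. This is a
sketch with simplified, hypothetical signatures (zone_no(), nr_zones() and
ilog2_u64() take plain integers rather than a disk or block device), and
counting a partial last zone as a full zone is an assumption of the sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

static bool is_power_of_2(uint64_t n)
{
	return n != 0 && (n & (n - 1)) == 0;
}

/* Integer log2 for a non-zero value. */
static unsigned int ilog2_u64(uint64_t n)
{
	unsigned int r = 0;

	while ((n >>= 1) != 0)
		r++;
	return r;
}

/* Zone number of a sector: shift for po2 zone sizes, divide otherwise. */
static uint64_t zone_no(uint64_t sector, uint64_t zone_sectors)
{
	assert(zone_sectors != 0);
	if (is_power_of_2(zone_sectors))
		return sector >> ilog2_u64(zone_sectors);
	return sector / zone_sectors;
}

/* Number of zones for a capacity, rounding a partial last zone up. */
static uint64_t nr_zones(uint64_t capacity, uint64_t zone_sectors)
{
	return zone_no(capacity + zone_sectors - 1, zone_sectors);
}
```

Devices with a po2 zone size keep the shift-based calculation and pay only
for the extra is_power_of_2() test.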

Reviewed-by: Luis Chamberlain <mcgrof@kernel.org>
Reviewed-by: Adam Manzanares <a.manzanares@samsung.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Pankaj Raghav <p.raghav@samsung.com>
Bug: 197782466
Bug: 269471019
Link: https://lore.kernel.org/linux-block/20220923173618.6899-2-p.raghav@samsung.com/
Change-Id: I1695f25f55579a342c44c6994fd43055d7356c81
Signed-off-by: Bart Van Assche <bvanassche@google.com>
2023-02-15 18:44:00 +00:00
Pankaj Raghav
c5c1ebb7cf BACKPORT: FROMLIST: block: introduce bdev_zone_no helper
Add a generic bdev_zone_no() helper to calculate zone number for a
given sector in a block device. This helper internally uses disk_zone_no()
to find the zone number.

Use the helper bdev_zone_no() to calculate nr of zones. This lets us
make modifications to the math if needed in one place.

Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Reviewed-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Signed-off-by: Pankaj Raghav <p.raghav@samsung.com>
Bug: 197782466
Bug: 269471019
Link: https://lore.kernel.org/linux-block/20230110143635.77300-4-p.raghav@samsung.com/
Change-Id: I86d97c0a64db5ebe1be725710accdaf6e8346d9e
Signed-off-by: Bart Van Assche <bvanassche@google.com>
2023-02-15 18:44:00 +00:00
Pankaj Raghav
610058c970 BACKPORT: FROMLIST: block: add a new helper bdev_{is_zone_start, offset_from_zone_start}
Instead of open coding to check for zone start, add a helper to improve
readability and store the logic in one place.

Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Pankaj Raghav <p.raghav@samsung.com>
Bug: 197782466
Bug: 269471019
Link: https://lore.kernel.org/linux-block/20230110143635.77300-3-p.raghav@samsung.com/
Change-Id: Ieb3ec909589fe087f49a61c48ba85f0e612b6d1d
Signed-off-by: Bart Van Assche <bvanassche@google.com>
2023-02-15 18:44:00 +00:00
Pankaj Raghav
5d1853d5f4 FROMLIST: block: remove superfluous check for request queue in bdev_is_zoned()
Remove the superfluous request queue check in bdev_is_zoned() as
bdev_get_queue() can never return NULL.

Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Pankaj Raghav <p.raghav@samsung.com>
Bug: 197782466
Bug: 269471019
Link: https://lore.kernel.org/linux-block/20230110143635.77300-2-p.raghav@samsung.com/
Change-Id: I016ba8e1462d97f9b9a25e5f12371854294511fb
Signed-off-by: Bart Van Assche <bvanassche@google.com>
2023-02-15 18:44:00 +00:00
Ramji Jiyani
19ce16cc96 ANDROID: GKI: Do not modify protected exports source list
The header generation script uses the protected exports list
as a source to generate the header file during the kernel build.
The script preprocesses the symbols in place before using them to
generate an array of symbols for protected exports, causing the source
file to change. This may force kleaf to build the kernel again even
though there are no real changes in terms of symbols.

Use a copy as a temp file for processing, leaving the source file
unaffected.

The unprotected symbol list is already a temp file, so it doesn't
affect that target.

Bug: 268678245
Test: TH
Change-Id: Ifb551639451d1c7bd935ff732bd1959647c014d7
Signed-off-by: Ramji Jiyani <ramjiyani@google.com>
2023-02-15 18:15:17 +00:00
Greg Kroah-Hartman
6bdcdef440 Merge 5.15.94 into android14-5.15
Changes in 5.15.94
	mm/migration: return errno when isolate_huge_page failed
	migrate: hugetlb: check for hugetlb shared PMD in node migration
	btrfs: limit device extents to the device size
	btrfs: zlib: zero-initialize zlib workspace
	ALSA: hda/realtek: Add Positivo N14KP6-TG
	ALSA: emux: Avoid potential array out-of-bound in snd_emux_xg_control()
	ALSA: hda/realtek: Fix the speaker output on Samsung Galaxy Book2 Pro 360
	ALSA: hda/realtek: Enable mute/micmute LEDs on HP Elitebook, 645 G9
	tracing: Fix poll() and select() do not work on per_cpu trace_pipe and trace_pipe_raw
	of/address: Return an error when no valid dma-ranges are found
	can: j1939: do not wait 250 ms if the same addr was already claimed
	xfrm: compat: change expression for switch in xfrm_xlate64
	IB/hfi1: Restore allocated resources on failed copyout
	xfrm/compat: prevent potential spectre v1 gadget in xfrm_xlate32_attr()
	IB/IPoIB: Fix legacy IPoIB due to wrong number of queues
	RDMA/irdma: Fix potential NULL-ptr-dereference
	RDMA/usnic: use iommu_map_atomic() under spin_lock()
	xfrm: fix bug with DSCP copy to v6 from v4 tunnel
	net: phylink: move phy_device_free() to correctly release phy device
	bonding: fix error checking in bond_debug_reregister()
	net: phy: meson-gxl: use MMD access dummy stubs for GXL, internal PHY
	ionic: clean interrupt before enabling queue to avoid credit race
	uapi: add missing ip/ipv6 header dependencies for linux/stddef.h
	ice: Do not use WQ_MEM_RECLAIM flag for workqueue
	net: dsa: mt7530: don't change PVC_EG_TAG when CPU port becomes VLAN-aware
	net: mscc: ocelot: fix VCAP filters not matching on MAC with "protocol 802.1Q"
	net/mlx5e: Move repeating clear_bit in mlx5e_rx_reporter_err_rq_cqe_recover
	net/mlx5e: Introduce the mlx5e_flush_rq function
	net/mlx5e: Update rx ring hw mtu upon each rx-fcs flag change
	net/mlx5: Bridge, fix ageing of peer FDB entries
	net/mlx5e: IPoIB, Show unknown speed instead of error
	net/mlx5: fw_tracer, Clear load bit when freeing string DBs buffers
	net/mlx5: fw_tracer, Zero consumer index when reloading the tracer
	net/mlx5: Serialize module cleanup with reload and remove
	igc: Add ndo_tx_timeout support
	rds: rds_rm_zerocopy_callback() use list_first_entry()
	selftests: forwarding: lib: quote the sysctl values
	ALSA: pci: lx6464es: fix a debug loop
	riscv: stacktrace: Fix missing the first frame
	ASoC: topology: Return -ENOMEM on memory allocation failure
	pinctrl: mediatek: Fix the drive register definition of some Pins
	pinctrl: aspeed: Fix confusing types in return value
	pinctrl: single: fix potential NULL dereference
	spi: dw: Fix wrong FIFO level setting for long xfers
	pinctrl: intel: Restore the pins that used to be in Direct IRQ mode
	cifs: Fix use-after-free in rdata->read_into_pages()
	net: USB: Fix wrong-direction WARNING in plusb.c
	mptcp: be careful on subflow status propagation on errors
	btrfs: free device in btrfs_close_devices for a single device filesystem
	usb: core: add quirk for Alcor Link AK9563 smartcard reader
	usb: typec: altmodes/displayport: Fix probe pin assign check
	clk: ingenic: jz4760: Update M/N/OD calculation algorithm
	ceph: flush cap releases when the session is flushed
	riscv: Fixup race condition on PG_dcache_clean in flush_icache_pte
	powerpc/64s/interrupt: Fix interrupt exit race with security mitigation switch
	rtmutex: Ensure that the top waiter is always woken up
	arm64: dts: meson-gx: Make mmc host controller interrupts level-sensitive
	arm64: dts: meson-g12-common: Make mmc host controller interrupts level-sensitive
	arm64: dts: meson-axg: Make mmc host controller interrupts level-sensitive
	Fix page corruption caused by racy check in __free_pages
	drm/amdgpu/fence: Fix oops due to non-matching drm_sched init/fini
	drm/i915: Initialize the obj flags for shmem objects
	drm/i915: Fix VBT DSI DVO port handling
	x86/speculation: Identify processors vulnerable to SMT RSB predictions
	KVM: x86: Mitigate the cross-thread return address predictions bug
	Documentation/hw-vuln: Add documentation for Cross-Thread Return Predictions
	Linux 5.15.94

Change-Id: I46aca6bfb09ef8e68122a41734968906982b2a5f
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
2023-02-15 15:06:07 +00:00
Yifan Hong
583c2675f1 ANDROID: Move NDK_TRIPLE to build.config.constants.
... so that they can be loaded by Kleaf extensions
and read during the loading phase.

Going forward, we should remove build configs and express
constants in .bzl files. For now, however, we must keep this
file until kernel_build has been migrated to use the defined
cc_toolchain.

Test: Treehugger
Bug: 228238975
Change-Id: Id9628663785970c460470382e1ae162e1112203d
Signed-off-by: Yifan Hong <elsk@google.com>
2023-02-15 03:56:09 +00:00
Matthias Maennich
43daf6bbe2 ANDROID: remove stale symbol list entries
Remove the symbol list entries that are not actually symbols (any more)
to allow strict mode to be enabled.

Bug: 269346251
Change-Id: I32d93e0a3f46c01ccabd4251805066d627518ea0
Signed-off-by: Matthias Maennich <maennich@google.com>
2023-02-14 23:58:39 +00:00
Sophia Wang
40483474ad ANDROID: power: Add vendor hook for suspend
The purpose of this vendor hook is to calculate
the total resume latency for device, CPU,
console, etc. The current vendor hooks only support
individual resume latencies for each device, each individual
CPU, etc., but lack tracing of the total resume latency.

Bug: 232541623
Signed-off-by: Sophia Wang <yodagump@google.com>
Change-Id: Idd7c999dcd822cc0f7747baa11ec200eed5f5172
2023-02-14 19:14:09 +00:00
Greg Kroah-Hartman
e2c1a934fd Linux 5.15.94
Link: https://lore.kernel.org/r/20230213144732.336342050@linuxfoundation.org
Tested-by: Florian Fainelli <f.fainelli@gmail.com>
Tested-by: Allen Pais <apais@linux.microsoft.com>
Tested-by: Shuah Khan <skhan@linuxfoundation.org>
Tested-by: Bagas Sanjaya <bagasdotme@gmail.com>
Tested-by: Linux Kernel Functional Testing <lkft@linaro.org>
Tested-by: Sudip Mukherjee <sudip.mukherjee@codethink.co.uk>
Tested-by: Ron Economos <re@w6rz.net>
Tested-by: Jon Hunter <jonathanh@nvidia.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-02-14 19:18:05 +01:00
Tom Lendacky
17170acdc7 Documentation/hw-vuln: Add documentation for Cross-Thread Return Predictions
commit 493a2c2d23 upstream.

Add the admin guide for the Cross-Thread Return Predictions vulnerability.

Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Message-Id: <60f9c0b4396956ce70499ae180cb548720b25c7e.1675956146.git.thomas.lendacky@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-02-14 19:18:05 +01:00
Tom Lendacky
5122e0e443 KVM: x86: Mitigate the cross-thread return address predictions bug
commit 6f0f2d5ef8 upstream.

By default, KVM/SVM will intercept attempts by the guest to transition
out of C0. However, the KVM_CAP_X86_DISABLE_EXITS capability can be used
by a VMM to change this behavior. To mitigate the cross-thread return
address predictions bug (X86_BUG_SMT_RSB), a VMM must not be allowed to
override the default behavior to intercept C0 transitions.

Use a module parameter to control the mitigation on processors that are
vulnerable to X86_BUG_SMT_RSB. If the processor is vulnerable to the
X86_BUG_SMT_RSB bug and the module parameter is set to mitigate the bug,
KVM will not allow the disabling of the HLT, MWAIT and CSTATE exits.
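
As a rough illustration of the policy described above, the sketch below
filters a VMM's request to disable exits. Every name and flag value here
(the DEMO_* macros and demo_* functions) is hypothetical and chosen for the
example; it is not the KVM code or its uapi constants.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag values for illustration; not the kernel's uapi constants. */
#define DEMO_DISABLE_EXITS_MWAIT  (1u << 0)
#define DEMO_DISABLE_EXITS_HLT    (1u << 1)
#define DEMO_DISABLE_EXITS_CSTATE (1u << 2)
#define DEMO_DISABLE_EXITS_OTHER  (1u << 3) /* stand-in for any exit not named above */

#define DEMO_C0_EXITS \
	(DEMO_DISABLE_EXITS_MWAIT | DEMO_DISABLE_EXITS_HLT | DEMO_DISABLE_EXITS_CSTATE)

/*
 * Return the set of exits a VMM may disable. When the CPU is vulnerable
 * and the (hypothetical) module parameter asks for mitigation, the exits
 * tied to leaving C0 are withheld.
 */
uint32_t demo_allowed_disable_exits(bool cpu_vulnerable, bool mitigate_param)
{
	uint32_t allowed = DEMO_C0_EXITS | DEMO_DISABLE_EXITS_OTHER;

	if (cpu_vulnerable && mitigate_param)
		allowed &= ~DEMO_C0_EXITS;

	return allowed;
}

/* Usage: mask a request against the allowed set. */
void demo_kvm_example(void)
{
	uint32_t request = DEMO_DISABLE_EXITS_HLT | DEMO_DISABLE_EXITS_OTHER;

	/* Vulnerable CPU with mitigation on: the HLT bit is dropped. */
	assert((request & demo_allowed_disable_exits(true, true)) ==
	       DEMO_DISABLE_EXITS_OTHER);
	/* Mitigation off, or CPU not vulnerable: the request passes through. */
	assert((request & demo_allowed_disable_exits(true, false)) == request);
	assert((request & demo_allowed_disable_exits(false, true)) == request);
}
```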

Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Message-Id: <4019348b5e07148eb4d593380a5f6713b93c9a16.1675956146.git.thomas.lendacky@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-02-14 19:18:05 +01:00
Tom Lendacky
8f12dcab90 x86/speculation: Identify processors vulnerable to SMT RSB predictions
commit be8de49bea upstream.

Certain AMD processors are vulnerable to a cross-thread return address
predictions bug. When running in SMT mode and one of the sibling threads
transitions out of C0 state, the other sibling thread could use return
target predictions from the sibling thread that transitioned out of C0.

The Spectre v2 mitigations cover the Linux kernel, as it fills the RSB
when context switching to the idle thread. However, KVM allows a VMM to
prevent exiting guest mode when transitioning out of C0. A guest could
act maliciously in this situation, so create a new x86 BUG that can be
used to detect if the processor is vulnerable.

Reviewed-by: Borislav Petkov (AMD) <bp@alien8.de>
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Message-Id: <91cec885656ca1fcd4f0185ce403a53dd9edecb7.1675956146.git.thomas.lendacky@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-02-14 19:18:05 +01:00
Ville Syrjälä
e63c434de8 drm/i915: Fix VBT DSI DVO port handling
commit 6a7ff131f1 upstream.

Turns out modern (icl+) VBTs still declare their DSI ports
as MIPI-A and MIPI-C despite the PHYs now being A and B.
Remap appropriately to allow the panels declared as MIPI-C
to work.
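
The remapping can be pictured with the small sketch below. The enum and
function names are hypothetical, and the pre-icl mapping of MIPI-C to PHY C
is an assumption made for the example rather than something stated in this
message.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical types for illustration; not the i915 definitions. */
enum demo_vbt_dsi_port { DEMO_VBT_MIPI_A, DEMO_VBT_MIPI_C };
enum demo_phy { DEMO_PHY_A, DEMO_PHY_B, DEMO_PHY_C };

/* Translate the DSI port declared in the VBT into the PHY that drives it. */
enum demo_phy demo_dsi_phy(enum demo_vbt_dsi_port vbt_port, bool icl_or_newer)
{
	switch (vbt_port) {
	case DEMO_VBT_MIPI_A:
		return DEMO_PHY_A;
	case DEMO_VBT_MIPI_C:
		/* icl+ VBTs still say MIPI-C, but the second DSI PHY is B. */
		return icl_or_newer ? DEMO_PHY_B : DEMO_PHY_C; /* pre-icl value assumed */
	}
	return DEMO_PHY_A;
}

/* Usage: a panel declared on MIPI-C lands on PHY B on icl+. */
void demo_dsi_example(void)
{
	assert(demo_dsi_phy(DEMO_VBT_MIPI_A, true) == DEMO_PHY_A);
	assert(demo_dsi_phy(DEMO_VBT_MIPI_C, true) == DEMO_PHY_B);
}
```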

Cc: stable@vger.kernel.org
Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/8016
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230207064337.18697-2-ville.syrjala@linux.intel.com
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
(cherry picked from commit 118b5c136c)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-02-14 19:18:05 +01:00
Aravind Iddamsetty
fc88c68381 drm/i915: Initialize the obj flags for shmem objects
commit 44e4c5684f upstream.

Obj flags for shmem objects are not being set correctly. This fixes the
setting of the BO_ALLOC_USER flag, which applies to shmem objs as well.

v2: Add fixes tag (Tvrtko, Matt A)

Fixes: 13d29c8237 ("drm/i915/ehl: unconditionally flush the pages on acquire")
Cc: <stable@vger.kernel.org> # v5.15+
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Aravind Iddamsetty <aravind.iddamsetty@intel.com>
Reviewed-by: Andrzej Hajda <andrzej.hajda@intel.com>
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
[tursulin: Grouped all tags together.]
Link: https://patchwork.freedesktop.org/patch/msgid/20230203135205.4051149-1-aravind.iddamsetty@intel.com
(cherry picked from commit bca0d1d3ce)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-02-14 19:18:04 +01:00
Guilherme G. Piccoli
2e557c8ca2 drm/amdgpu/fence: Fix oops due to non-matching drm_sched init/fini
commit 5ad7bbf3db upstream.

Currently amdgpu calls drm_sched_fini() from the fence driver sw fini
routine - such function is expected to be called only after the
respective init function - drm_sched_init() - was executed successfully.

We recently hit a driver probe failure on the Steam Deck, and
drm_sched_fini() was called even though its counterpart had never
been called, causing the following oops:

amdgpu: probe of 0000:04:00.0 failed with error -110
BUG: kernel NULL pointer dereference, address: 0000000000000090
PGD 0 P4D 0
Oops: 0002 [#1] PREEMPT SMP NOPTI
CPU: 0 PID: 609 Comm: systemd-udevd Not tainted 6.2.0-rc3-gpiccoli #338
Hardware name: Valve Jupiter/Jupiter, BIOS F7A0113 11/04/2022
RIP: 0010:drm_sched_fini+0x84/0xa0 [gpu_sched]
[...]
Call Trace:
 <TASK>
 amdgpu_fence_driver_sw_fini+0xc8/0xd0 [amdgpu]
 amdgpu_device_fini_sw+0x2b/0x3b0 [amdgpu]
 amdgpu_driver_release_kms+0x16/0x30 [amdgpu]
 devm_drm_dev_init_release+0x49/0x70
 [...]

To prevent that, check if the drm_sched was properly initialized for a
given ring before calling its fini counter-part.

Ideally we'd use sched.ready for that, since that field is set as the last
thing in drm_sched_init(). But amdgpu seems to "override" the meaning of that
field: in the above oops, for example, it was a GFX ring causing the crash, and
the sched.ready field was set to true in the ring init routine, regardless of
the state of the DRM scheduler. Hence, we ended up using sched.ops as per
Christian's suggestion [0], and also removed the no_scheduler check [1].

[0] https://lore.kernel.org/amd-gfx/984ee981-2906-0eaf-ccec-9f80975cb136@amd.com/
[1] https://lore.kernel.org/amd-gfx/cd0e2994-f85f-d837-609f-7056d5fb7231@amd.com/
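
A minimal user-space model of the guard described above, assuming a
scheduler whose ops pointer is only set by its init routine. All demo_*
names are hypothetical; this is a sketch of the idea, not the amdgpu or
drm_sched code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the scheduler and ring structures. */
struct demo_sched_ops { int unused; };

struct demo_sched {
	const struct demo_sched_ops *ops; /* set only by demo_sched_init() */
	bool ready;                       /* may be set by the driver on its own */
	int fini_calls;                   /* counts teardown calls for the example */
};

struct demo_ring { struct demo_sched sched; };

static const struct demo_sched_ops demo_ops = { 0 };

void demo_sched_init(struct demo_sched *s)
{
	s->ops = &demo_ops;
	s->ready = true;
}

void demo_sched_fini(struct demo_sched *s)
{
	s->fini_calls++;
}

/* Teardown: only finalize a scheduler whose init actually ran. */
void demo_ring_fini(struct demo_ring *ring)
{
	if (ring->sched.ops)
		demo_sched_fini(&ring->sched);
}

/* Usage: a ring marked ready without scheduler init is skipped. */
void demo_ring_example(void)
{
	struct demo_ring probed_ok = { 0 };
	struct demo_ring probe_failed = { 0 };

	demo_sched_init(&probed_ok.sched);
	probe_failed.sched.ready = true; /* ready set, but init never ran */

	demo_ring_fini(&probed_ok);
	demo_ring_fini(&probe_failed);

	assert(probed_ok.sched.fini_calls == 1);
	assert(probe_failed.sched.fini_calls == 0);
}
```

Testing ops rather than ready is what keeps the second ring out of the
teardown path in this model, since ready says nothing about whether init ran.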

Fixes: 067f44c8b4 ("drm/amdgpu: avoid over-handle of fence driver fini in s3 test (v2)")
Suggested-by: Christian König <christian.koenig@amd.com>
Cc: Guchun Chen <guchun.chen@amd.com>
Cc: Luben Tuikov <luben.tuikov@amd.com>
Cc: Mario Limonciello <mario.limonciello@amd.com>
Reviewed-by: Luben Tuikov <luben.tuikov@amd.com>
Signed-off-by: Guilherme G. Piccoli <gpiccoli@igalia.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-02-14 19:18:04 +01:00
David Chen
3af734f3ea Fix page corruption caused by racy check in __free_pages
commit 462a8e08e0 upstream.

When we upgraded our kernel, we started seeing some page corruption like
the following consistently:

  BUG: Bad page state in process ganesha.nfsd  pfn:1304ca
  page:0000000022261c55 refcount:0 mapcount:-128 mapping:0000000000000000 index:0x0 pfn:0x1304ca
  flags: 0x17ffffc0000000()
  raw: 0017ffffc0000000 ffff8a513ffd4c98 ffffeee24b35ec08 0000000000000000
  raw: 0000000000000000 0000000000000001 00000000ffffff7f 0000000000000000
  page dumped because: nonzero mapcount
  CPU: 0 PID: 15567 Comm: ganesha.nfsd Kdump: loaded Tainted: P    B      O      5.10.158-1.nutanix.20221209.el7.x86_64 #1
  Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 04/05/2016
  Call Trace:
   dump_stack+0x74/0x96
   bad_page.cold+0x63/0x94
   check_new_page_bad+0x6d/0x80
   rmqueue+0x46e/0x970
   get_page_from_freelist+0xcb/0x3f0
   ? _cond_resched+0x19/0x40
   __alloc_pages_nodemask+0x164/0x300
   alloc_pages_current+0x87/0xf0
   skb_page_frag_refill+0x84/0x110
   ...

Sometimes, it would also show up as corruption in the free list pointer
and cause crashes.

After bisecting the issue, we found the issue started from commit
e320d3012d ("mm/page_alloc.c: fix freeing non-compound pages"):

	if (put_page_testzero(page))
		free_the_page(page, order);
	else if (!PageHead(page))
		while (order-- > 0)
			free_the_page(page + (1 << order), order);

So the problem is that the PageHead check is racy, because at this point
we have already dropped our reference to the page.  Even if we came in
with a compound page, the page can already have been freed, PageHead can
return false, and we end up freeing all the tail pages, causing a double free.
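
The ordering problem can be modelled in user space as below. This is a
single-threaded sketch with hypothetical demo_* names: the racing_free
argument imitates another holder freeing the compound page right after our
reference is dropped. One way to avoid the race, shown in the second
function, is to sample the head flag while the reference is still held.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of a page; not struct page. */
struct demo_page {
	int refcount;
	bool head; /* true while this is the head of a compound page */
};

/*
 * Drop one reference and report whether it was the last one. With
 * racing_free set, imitate another holder dropping the final reference
 * and freeing the page immediately afterwards, which clears the flag.
 */
static bool demo_put_testzero(struct demo_page *p, bool racing_free)
{
	if (--p->refcount == 0)
		return true;
	if (racing_free) {
		p->refcount = 0;
		p->head = false;
	}
	return false;
}

/* Racy order: the flag is read after our reference is gone. */
int demo_free_pages_racy(struct demo_page *p, unsigned int order, bool racing_free)
{
	int tail_frees = 0;

	if (demo_put_testzero(p, racing_free))
		return 0;
	if (!p->head)
		while (order-- > 0)
			tail_frees++; /* each one would be a double free here */
	return tail_frees;
}

/* Safer order: read the flag first, then drop the reference. */
int demo_free_pages_fixed(struct demo_page *p, unsigned int order, bool racing_free)
{
	bool head = p->head;
	int tail_frees = 0;

	if (demo_put_testzero(p, racing_free))
		return 0;
	if (!head)
		while (order-- > 0)
			tail_frees++;
	return tail_frees;
}

/* Usage: a compound page with two holders, freed concurrently. */
void demo_free_pages_example(void)
{
	struct demo_page a = { .refcount = 2, .head = true };
	struct demo_page b = { .refcount = 2, .head = true };

	assert(demo_free_pages_racy(&a, 2, true) == 2);  /* tails freed twice */
	assert(demo_free_pages_fixed(&b, 2, true) == 0); /* tails left alone */
}
```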

Fixes: e320d3012d ("mm/page_alloc.c: fix freeing non-compound pages")
Link: https://lore.kernel.org/lkml/BYAPR02MB448855960A9656EEA81141FC94D99@BYAPR02MB4488.namprd02.prod.outlook.com/
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: stable@vger.kernel.org
Signed-off-by: Chunwei Chen <david.chen@nutanix.com>
Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-02-14 19:18:04 +01:00
Heiner Kallweit
c94ce5ea68 arm64: dts: meson-axg: Make mmc host controller interrupts level-sensitive
commit d182bcf300 upstream.

The usage of edge-triggered interrupts leads to lost interrupts under load,
see [0]. This was confirmed to be fixed by using level-triggered
interrupts.
The report was about SDIO. However, as the host controller is the same
for SD and MMC, apply the change to all mmc controller instances.

[0] https://www.spinics.net/lists/linux-mmc/msg73991.html

Fixes: 221cf34bac ("ARM64: dts: meson-axg: enable the eMMC controller")
Reported-by: Peter Suti <peter.suti@streamunlimited.com>
Tested-by: Vyacheslav Bocharov <adeep@lexina.in>
Tested-by: Peter Suti <peter.suti@streamunlimited.com>
Cc: stable@vger.kernel.org
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Acked-by: Neil Armstrong <neil.armstrong@linaro.org>
Link: https://lore.kernel.org/r/c00655d3-02f8-6f5f-4239-ca2412420cad@gmail.com
Signed-off-by: Neil Armstrong <neil.armstrong@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-02-14 19:18:04 +01:00
Heiner Kallweit
b796c02df3 arm64: dts: meson-g12-common: Make mmc host controller interrupts level-sensitive
commit ac8db4ccee upstream.

The usage of edge-triggered interrupts leads to lost interrupts under load,
see [0]. This was confirmed to be fixed by using level-triggered
interrupts.
The report was about SDIO. However, as the host controller is the same
for SD and MMC, apply the change to all mmc controller instances.

[0] https://www.spinics.net/lists/linux-mmc/msg73991.html

Fixes: 4759fd87b9 ("arm64: dts: meson: g12a: add mmc nodes")
Tested-by: FUKAUMI Naoki <naoki@radxa.com>
Tested-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com>
Tested-by: Jerome Brunet <jbrunet@baylibre.com>
Cc: stable@vger.kernel.org
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Acked-by: Neil Armstrong <neil.armstrong@linaro.org>
Link: https://lore.kernel.org/r/27d89baa-b8fa-baca-541b-ef17a97cde3c@gmail.com
Signed-off-by: Neil Armstrong <neil.armstrong@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-02-14 19:18:04 +01:00
Heiner Kallweit
5d9b771f53 arm64: dts: meson-gx: Make mmc host controller interrupts level-sensitive
commit 66e45351f7 upstream.

The usage of edge-triggered interrupts leads to lost interrupts under load,
see [0]. This was confirmed to be fixed by using level-triggered
interrupts.
The report was about SDIO. However, as the host controller is the same
for SD and MMC, apply the change to all mmc controller instances.

[0] https://www.spinics.net/lists/linux-mmc/msg73991.html

Fixes: ef8d2ffedf ("ARM64: dts: meson-gxbb: add MMC support")
Cc: stable@vger.kernel.org
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Acked-by: Neil Armstrong <neil.armstrong@linaro.org>
Link: https://lore.kernel.org/r/76e042e0-a610-5ed5-209f-c4d7f879df44@gmail.com
Signed-off-by: Neil Armstrong <neil.armstrong@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-02-14 19:18:04 +01:00
Wander Lairson Costa
ac39dce119 rtmutex: Ensure that the top waiter is always woken up
commit db370a8b9f upstream.

Let L1 and L2 be two spinlocks.

Let T1 be a task holding L1 and blocked on L2. T1, currently, is the top
waiter of L2.

Let T2 be the task holding L2.

Let T3 be a task trying to acquire L1.

The following events will lead to a state in which the wait queue of L2
isn't empty, but no task actually holds the lock.

T1                T2                                  T3
==                ==                                  ==

                                                      spin_lock(L1)
                                                      | raw_spin_lock(L1->wait_lock)
                                                      | rtlock_slowlock_locked(L1)
                                                      | | task_blocks_on_rt_mutex(L1, T3)
                                                      | | | orig_waiter->lock = L1
                                                      | | | orig_waiter->task = T3
                                                      | | | raw_spin_unlock(L1->wait_lock)
                                                      | | | rt_mutex_adjust_prio_chain(T1, L1, L2, orig_waiter, T3)
                  spin_unlock(L2)                     | | | |
                  | rt_mutex_slowunlock(L2)           | | | |
                  | | raw_spin_lock(L2->wait_lock)    | | | |
                  | | wakeup(T1)                      | | | |
                  | | raw_spin_unlock(L2->wait_lock)  | | | |
                                                      | | | | waiter = T1->pi_blocked_on
                                                      | | | | waiter == rt_mutex_top_waiter(L2)
                                                      | | | | waiter->task == T1
                                                      | | | | raw_spin_lock(L2->wait_lock)
                                                      | | | | dequeue(L2, waiter)
                                                      | | | | update_prio(waiter, T1)
                                                      | | | | enqueue(L2, waiter)
                                                      | | | | waiter != rt_mutex_top_waiter(L2)
                                                      | | | | L2->owner == NULL
                                                      | | | | wakeup(T1)
                                                      | | | | raw_spin_unlock(L2->wait_lock)
T1 wakes up
T1 != top_waiter(L2)
schedule_rtlock()

If the deadline of T1 is updated before the call to update_prio(), and the
new deadline is greater than the deadline of the second top waiter, then
after the requeue, T1 is no longer the top waiter, and the wrong task is
woken up which will then go back to sleep because it is not the top waiter.

This can be reproduced in PREEMPT_RT with stress-ng:

while true; do
    stress-ng --sched deadline --sched-period 1000000000 \
    	    --sched-runtime 800000000 --sched-deadline \
    	    1000000000 --mmapfork 23 -t 20
done

A similar issue was pointed out by Thomas versus the cases where the top
waiter drops out early due to a signal or timeout, which is a general issue
for all regular rtmutex use cases, e.g. futex.

The problematic code is in rt_mutex_adjust_prio_chain():

	// Save the top waiter before dequeue/enqueue
	prerequeue_top_waiter = rt_mutex_top_waiter(lock);

	rt_mutex_dequeue(lock, waiter);
	waiter_update_prio(waiter, task);
	rt_mutex_enqueue(lock, waiter);

	// Lock has no owner?
	if (!rt_mutex_owner(lock)) {
	   	// Top waiter changed
  ---->		if (prerequeue_top_waiter != rt_mutex_top_waiter(lock))
  ---->			wake_up_state(waiter->task, waiter->wake_state);

This only takes the case into account where @waiter is the new top waiter
due to the requeue operation.

But it fails to handle the case where @waiter is no longer the top
waiter due to the requeue operation.

Ensure that the new top waiter is woken up in all cases so it can take
over the ownerless lock.
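
The difference between the two behaviours can be sketched with a toy wait
queue ordered by deadline. The demo_* names are hypothetical, the lock is
assumed to have no owner, and the array scan stands in for the real
priority-sorted tree; this is an illustration, not the rtmutex code.

```c
#include <assert.h>

/* Hypothetical waiter: the earliest deadline is the top waiter. */
struct demo_waiter {
	int id;
	long deadline;
};

/* Index of the top waiter in a queue of n >= 1 waiters. */
static int demo_top(const struct demo_waiter *q, int n)
{
	int top = 0;

	for (int i = 1; i < n; i++)
		if (q[i].deadline < q[top].deadline)
			top = i;
	return top;
}

/* Old behaviour: if the top waiter changed, wake the requeued waiter. */
int demo_requeue_old(struct demo_waiter *q, int n, int idx, long new_deadline)
{
	int before = demo_top(q, n);

	q[idx].deadline = new_deadline; /* dequeue, update prio, enqueue */
	if (before != demo_top(q, n))
		return q[idx].id;
	return -1; /* nobody to wake */
}

/* Intended behaviour: if the top waiter changed, wake the new top waiter. */
int demo_requeue_fixed(struct demo_waiter *q, int n, int idx, long new_deadline)
{
	int before = demo_top(q, n);
	int after;

	q[idx].deadline = new_deadline;
	after = demo_top(q, n);
	if (before != after)
		return q[after].id;
	return -1;
}

/* Usage: task 1 is the top waiter, then its deadline moves past task 2's. */
void demo_rtmutex_example(void)
{
	struct demo_waiter a[] = { { 1, 100 }, { 2, 200 } };
	struct demo_waiter b[] = { { 1, 100 }, { 2, 200 } };

	assert(demo_requeue_old(a, 2, 0, 300) == 1);   /* wakes a non-top waiter */
	assert(demo_requeue_fixed(b, 2, 0, 300) == 2); /* wakes the new top waiter */
}
```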

[ tglx: Amend changelog, add Fixes tag ]

Fixes: c014ef69b3 ("locking/rtmutex: Add wake_state to rt_mutex_waiter")
Signed-off-by: Wander Lairson Costa <wander@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20230117172649.52465-1-wander@redhat.com
Link: https://lore.kernel.org/r/20230202123020.14844-1-wander@redhat.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-02-14 19:18:04 +01:00