If you have trouble reading this new file format, please refresh your
prebuilt version of STG with repo sync.
Bug: 294213765
Change-Id: I4d7ee716231956c5f4da1343cc0db5170aaaa3b1
Signed-off-by: Giuliano Procida <gprocida@google.com>
In struct usb_phy, the reserved slot 0 is reserved for a notify_port_status
callback addition for ABI freeze. Now this api is accepted on upstream.
Therefore, use ANDROID_KABI_USE to apply this api.
Bug: 286930662
Change-Id: Iae894f9dfff77fd1f23bb48fefdb9b682c54de57
Signed-off-by: Stanley Chang <stanley_chang@realtek.com>
In Realtek SoC, the parameter of usb phy is designed to can dynamic
tuning base on port status. Therefore, add a notify callback of phy
driver when usb port status change.
The Realtek phy driver is designed to dynamically adjust disconnection
level and calibrate phy parameters. When the device connected bit changes
and when the disconnected bit changes, do port status change notification:
Check if portstatus is USB_PORT_STAT_CONNECTION and portchange is
USB_PORT_STAT_C_CONNECTION.
1. The device is connected, the driver lowers the disconnection level and
calibrates the phy parameters.
2. The device disconnects, the driver increases the disconnect level and
calibrates the phy parameters.
When controller to notify connect that device is already ready. If we
adjust the disconnection level in notify_connect, the disconnect may have
been triggered at this stage. So we need to change that as early as
possible. The status change of connection is before port reset.
Therefore, we add an api to notify phy the port status changes. In this
stage, the device is not port enable, and it will not trigger
disconnection.
Signed-off-by: Stanley Chang <stanley_chang@realtek.com>
Link: https://lore.kernel.org/r/20230725033318.8361-1-stanley_chang@realtek.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Bug: 286930662
(cherry picked from commit a08799cf17
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb.git usb-next)
Change-Id: Ie7d903d00896aba02675642a1e168ea8d426024a
Signed-off-by: Stanley Chang <stanley_chang@realtek.com>
Create input symbol files to generate GKI modules header
under include/config. By placing files in this generated
directory, the default filters that ignore certain files
will work without any special handling required, and they
will also be available to inspect after the build to inspect
for the debugging purposes.
abi_gki_protected_exports: Input for gki_module_protected_exports.h
From :- ${objtree}/abi_gki_protected_exports
To :- include/config/abi_gki_protected_exports
all_kmi_symbols: Input for gki_module_unprotected.h
- Rename to abi_gki_kmi_symbols
From :- all_kmi_symbols
To :- include/config/abi_gki_kmi_symbols
Bug: 286529877
Test: TH
Test: Manual verification of the generated files
Change-Id: Iafa10631e7712a8e1e87a2f56cfd614de6b1053a
Signed-off-by: Ramji Jiyani <ramjiyani@google.com>
create_open would always take its parent directory's bpf for the created
object. Modify to use the bpf stored in fuse_dentry which is set by
lookup.
Bug: 291705489
Test: fuse_test passes, adb push file /sdcard/Android/data works
Signed-off-by: Paul Lawrence <paullawrence@google.com>
Change-Id: I0a1ea2a291a8fdf67923f1827176b2ea96bd4c2d
Store the results of a negative lookup in the fuse_dentry so later
opcodes can use them to create files
Bug: 291705489
Test: fuse_test passes
Signed-off-by: Paul Lawrence <paullawrence@google.com>
Change-Id: I725e714a1d6ce43f24431d07c24e96349ef1a55c
fuse_iget_backing returns an inode or null, not a ERR_PTR. So check it's
not NULL
Also make sure we put the inode if d_splice_alias fails
Bug: 293349757
Test: fuse_test runs
Signed_off_by: Paul Lawrence <paullawrence@google.com>
Change-Id: I1eadad32f80bab6730e461412b4b7ab4d6c56bf2
This adds passthrough support for flock on fuse-bpf files. It does not
give any control via a bpf filter. The flock will act as though it was
taken on the lower file.
Bug: 289882899
Test: fuse_test -t32 (flock_test)
Change-Id: Iba0b9630766cedbd3195532c5e929891593cfe30
Signed-off-by: Daniel Rosenberg <drosen@google.com>
commit 035641b01e upstream.
Just calling wait_for_device_probe() is not enough to ensure that
asynchronously probed block devices are available (E.G. mmc, usb), so
add a "dm-mod.waitfor=<device1>[,..,<deviceN>]" parameter to get
dm-init to explicitly wait for specific block devices before
initializing the tables with logic similar to the rootwait logic that
was introduced with commit cc1ed7542c ("init: wait for
asynchronously scanned block devices").
E.G. with dm-verity on mmc using:
dm-mod.waitfor="PARTLABEL=hash-a,PARTLABEL=root-a"
[ 0.671671] device-mapper: init: waiting for all devices to be available before creating mapped devices
[ 0.671679] device-mapper: init: waiting for device PARTLABEL=hash-a ...
[ 0.710695] mmc0: new HS200 MMC card at address 0001
[ 0.711158] mmcblk0: mmc0:0001 004GA0 3.69 GiB
[ 0.715954] mmcblk0boot0: mmc0:0001 004GA0 partition 1 2.00 MiB
[ 0.722085] mmcblk0boot1: mmc0:0001 004GA0 partition 2 2.00 MiB
[ 0.728093] mmcblk0rpmb: mmc0:0001 004GA0 partition 3 512 KiB, chardev (249:0)
[ 0.738274] mmcblk0: p1 p2 p3 p4 p5 p6 p7
[ 0.751282] device-mapper: init: waiting for device PARTLABEL=root-a ...
[ 0.751306] device-mapper: init: all devices available
[ 0.751683] device-mapper: verity: sha256 using implementation "sha256-generic"
[ 0.759344] device-mapper: ioctl: dm-0 (vroot) is ready
[ 0.766540] VFS: Mounted root (squashfs filesystem) readonly on device 254:0.
Change-Id: I6aa87e1164f9d4857074bafc9194d4085cdfcddc
Signed-off-by: Peter Korsgaard <peter@korsgaard.com>
Signed-off-by: Mike Snitzer <snitzer@kernel.org>
Cc: Mark-PK Tsai <mark-pk.tsai@mediatek.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Bug: 293688634
(cherry picked from commit 866bf37b7c)
Signed-off-by: Mark-PK Tsai <mark-pk.tsai@mediatek.com>
There are some corner cases where we do worse in power because the
threshold is too low. Until these cases are better understood and
addressed upstream, provide a function for vendors to override this
value with something more suitable in their modules.
Bug: 289293494
Signed-off-by: Qais Yousef <qyousef@google.com>
Change-Id: I95dd36718a317f3fcb2a9f4bc87dd3390a4f7d7d
Adding the following symbols:
- drm_edid_is_valid
Bug: 293142328
Change-Id: I307bdd226542fe8ffe4c657a9e1987c193a9d064
Signed-off-by: Petri Gynther <pgynther@google.com>
When a device-mapper device is passing through the inline encryption
support of an underlying device, calls to blk_crypto_evict_key() take
the blk_crypto_profile::lock of the device-mapper device, then take the
blk_crypto_profile::lock of the underlying device (nested). This isn't
a real deadlock, but it causes a lockdep report because there is only
one lock class for all instances of this lock.
Lockdep subclasses don't really work here because the hierarchy of block
devices is dynamic and could have more than 2 levels.
Instead, register a dynamic lock class for each blk_crypto_profile, and
associate that with the lock.
This avoids false-positive lockdep reports like the following:
============================================
WARNING: possible recursive locking detected
6.4.0-rc5 #2 Not tainted
--------------------------------------------
fscryptctl/1421 is trying to acquire lock:
ffffff80829ca418 (&profile->lock){++++}-{3:3}, at: __blk_crypto_evict_key+0x44/0x1c0
but task is already holding lock:
ffffff8086b68ca8 (&profile->lock){++++}-{3:3}, at: __blk_crypto_evict_key+0xc8/0x1c0
other info that might help us debug this:
Possible unsafe locking scenario:
CPU0
----
lock(&profile->lock);
lock(&profile->lock);
*** DEADLOCK ***
May be due to missing lock nesting notation
Fixes: 1b26283970 ("block: Keyslot Manager for Inline Encryption")
Reported-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Eric Biggers <ebiggers@google.com>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Link: https://lore.kernel.org/r/20230610061139.212085-1-ebiggers@kernel.org
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Bug: 286427075
(cherry picked from commit 2fb48d88e7)
(added '#ifdef CONFIG_LOCKDEP' to keep the KMI tooling happy)
Change-Id: I21c0f941a36663c956a5c89324813bbaac0633ef
Signed-off-by: Eric Biggers <ebiggers@google.com>
Registration of module callbacks for the pKVM hypervisor is lockless
thanks to the use of a cmpxchg.
Problem, a CPU can speculatively execute an indirect branch and
speculatively read variables used in that branch. We then need to order
the memory access between variables potentially set in the driver init
(before the callback registration happen) and the call to that
registered callback.
e.g. in the case of the serial.
CPU0: CPU1:
driver_init(): hyp_serial_enabled()
base_addr = 0xdeadbeef; enabled = __hyp_putc
barrier(); barrier();
ops->register_serial_driver(putc); if (enabled)
__hyp_putc(); /* read base_addr */
This is the same for the SMC and PSCI handler callbacks. The abort and
fault callbacks are not impacted: the driver init can only happen before
the kernel is deprivileged i.e. before the host stage-2 is in place and
then before any of those callbacks can be triggered.
Instead of a full barrier, we can use the acquire/release semantics:
relaxing cmpxchg to cmpxchg_release in the registration path and use a
load_acquire in hyp_serial_enabled().
Bug: 292470326
Change-Id: I4b5fe3713fe40cc5ab42ea0e9cdf54e8315dfb44
Signed-off-by: Vincent Donnefort <vdonnefort@google.com>
[ Upstream commit 0323bce598 ]
In the event of a failure in tcf_change_indev(), fw_set_parms() will
immediately return an error after incrementing or decrementing
reference counter in tcf_bind_filter(). If attacker can control
reference counter to zero and make reference freed, leading to
use after free.
In order to prevent this, move the point of possible failure above the
point where the TC_FW_CLASSID is handled.
Bug: 292252062
Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Reported-by: M A Ramdhan <ramdhan@starlabs.sg>
Signed-off-by: M A Ramdhan <ramdhan@starlabs.sg>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Reviewed-by: Pedro Tammela <pctammela@mojatatu.com>
Message-ID: <20230705161530.52003-1-ramdhan@starlabs.sg>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
(cherry picked from commit c91fb29bb0)
Signed-off-by: Lee Jones <joneslee@google.com>
Change-Id: I9bf6f540b4eb23ea5641fb3efe6f3e621d7b6151
[ Upstream commit 4bedf9eee0 ]
Add bound flag to rule and chain transactions as in 6a0a8d10a3
("netfilter: nf_tables: use-after-free in failing rule with bound set")
to skip them in case that the chain is already bound from the abort
path.
This patch fixes an imbalance in the chain use refcnt that triggers a
WARN_ON on the table and chain destroy path.
This patch also disallows nested chain bindings, which is not
supported from userspace.
The logic to deal with chain binding in nft_data_hold() and
nft_data_release() is not correct. The NFT_TRANS_PREPARE state needs a
special handling in case a chain is bound but next expressions in the
same rule fail to initialize as described by 1240eb93f0 ("netfilter:
nf_tables: incorrect error path handling with NFT_MSG_NEWRULE").
The chain is left bound if rule construction fails, so the objects
stored in this chain (and the chain itself) are released by the
transaction records from the abort path, follow up patch ("netfilter:
nf_tables: add NFT_TRANS_PREPARE_ERROR to deal with bound set/chain")
completes this error handling.
When deleting an existing rule, chain bound flag is set off so the
rule expression .destroy path releases the objects.
Bug: 292097846
Fixes: d0e2c7de92 ("netfilter: nf_tables: add NFT_CHAIN_BINDING")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
(cherry picked from commit 891cd2eddd)
Signed-off-by: Lee Jones <joneslee@google.com>
Change-Id: I8a8cf012e9e6fd0d0081f3f7616c9cf31ea02989
commit 0e8235d28f upstream.
Added new functions index_hdr_check and index_buf_check.
Now we check all stuff for correctness while reading from disk.
Also fixed bug with stale nfs data.
Bug: 286390611
Reported-by: van fantasy <g1042620637@gmail.com>
Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
Fixes: 82cae269cf ("fs/ntfs3: Add initialization of super block")
Signed-off-by: Lee Jones <lee@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit 000a9a72ef)
Signed-off-by: Lee Jones <joneslee@google.com>
Change-Id: I2b17511acdef8617aea3fecb45d2f11e49145097
Change build time generated GKI module headers location
From :- kernel/module/gki_module_*.h
To :- include/generated/gki_module_*.h
This prevents the kernel source from being contaminated.
By placing the header files in a generated directory,
the default filters that ignore certain files will work
without any special handling required.
Bug: 286529877
Test: Manual verification & TH
Change-Id: Ie247d1c132ddae54906de2e2850e95d7ae9edd50
Signed-off-by: Ramji Jiyani <ramjiyani@google.com>
(cherry picked from commit e9cba885543fc50a5b59ff7234d02b74a380573c)
On KMI frozen branches, symbols may no longer be removed
from KMI symbol lists.
This change sets kmi_symbol_list_add_only=true for Kleaf builds.
Test: Treehugger
Bug: 292106238
Change-Id: I74cf98ebad2705b92468c996e9b3b472447e8203
Signed-off-by: Yifan Hong <elsk@google.com>
abi_gki_protected_exports is a prepped list of symbols protected
from being exported by unsigned modules; and input to generate
gki_module_protected_exports.h during kernel build.
Delete it once header is generated so it is not lingering in the
source when kernel sournce is being built in-place i.e. OBJ is
not set during the build.
Bug: 286529877
Test: Manual verification & TH
Change-Id: Ia06db62da03289b8f90917bcd302c81c8a4d31d2
Signed-off-by: Ramji Jiyani <ramjiyani@google.com>
The recent fix for DPCM locking also covered the loop in
dpcm_be_disconnect() with the FE stream lock. This caused an
unexpected side effect, thought: calling debugfs_remove_recursive() in
the spinlock may lead to lockdep splats as the code there assumes the
SOFTIRQ-safe context.
For avoiding the problem, this patch changes the disconnection
procedure to two phases: at first, the matching entries are removed
from the linked list, then the resources are freed outside the lock.
Bug: 291825551
Fixes: b7898396f4 ("ASoC: soc-pcm: Fix and cleanup DPCM locking")
Reported-and-tested-by: Marek Szyprowski <m.szyprowski@samsung.com>
Change-Id: I7a793b029cfbddb6082afe001c08890c54c67045
Signed-off-by: Takashi Iwai <tiwai@suse.de>
(cherry picked from commit 9f620684c1)
Signed-off-by: Isaac J. Manjarres <isaacmanjarres@google.com>
The recent change for DPCM locking caused spurious lockdep warnings.
Actually the warnings are false-positive, as those are triggered due
to the nested stream locks for FE and BE. Since both locks belong to
the same lock class, lockdep sees it as if a deadlock.
For fixing this, we need to take PCM stream locks for BE with the
nested lock primitives. Since currently snd_pcm_stream_lock*() helper
assumes only the top-level single locking, a new helper function
snd_pcm_stream_lock_irqsave_nested() is defined for a single-depth
nested lock, which is now used in the BE DAI trigger that is always
performed inside a FE stream lock.
Bug: 291825551
Fixes: b2ae806630 ("ASoC: soc-pcm: serialize BE triggers")
Reported-and-tested-by: Hans de Goede <hdegoede@redhat.com>
Reported-and-tested-by: Marek Szyprowski <m.szyprowski@samsung.com>
Link: https://lore.kernel.org/r/73018f3c-9769-72ea-0325-b3f8e2381e30@redhat.com
Link: https://lore.kernel.org/alsa-devel/9a0abddd-49e9-872d-2f00-a1697340f786@samsung.com
Change-Id: I163307c958c1e86f8d15c637a4d8739286b6d062
Signed-off-by: Takashi Iwai <tiwai@suse.de>
(cherry picked from commit 3c75c0ea5d)
Signed-off-by: Isaac J. Manjarres <isaacmanjarres@google.com>
Before commit 93e72b3c61 ("squashfs: migrate from ll_rw_block
usage to BIO"), compressed blocks read by squashfs were cached in the page
cache, but that is not the case after that commit. That has lead to
squashfs having to re-read a lot of sectors from disk/flash.
For example, the first sectors of every metadata block need to be read
twice from the disk. Once partially to read the length, and a second time
to read the block itself. Also, in linear reads of large files, the last
sectors of one data block are re-read from disk when reading the next data
block, since the compressed blocks are of variable sizes and not aligned
to device blocks. This extra I/O results in a degrade in read performance
of, for example, ~16% in one scenario on my ARM platform using squashfs
with dm-verity and NAND.
Since the decompressed data is cached in the page cache or squashfs'
internal metadata and fragment caches, caching _all_ compressed pages
would lead to a lot of double caching and is undesirable. But make the
code cache any disk blocks which were only partially requested, since
these are the ones likely to include data which is needed by other file
system blocks. This restores read performance in my test scenario.
The compressed block caching is only applied when the disk block size is
equal to the page size, to avoid having to deal with caching sub-page
reads.
[akpm@linux-foundation.org: fs/squashfs/block.c needs linux/pagemap.h]
[vincent.whitchurch@axis.com: fix page update race]
Link: https://lkml.kernel.org/r/20230526-squashfs-cache-fixup-v1-1-d54a7fa23e7b@axis.com
[vincent.whitchurch@axis.com: fix page indices]
Link: https://lkml.kernel.org/r/20230526-squashfs-cache-fixup-v1-2-d54a7fa23e7b@axis.com
[akpm@linux-foundation.org: fix layout, per hch]
Link: https://lkml.kernel.org/r/20230510-squashfs-cache-v4-1-3bd394e1ee71@axis.com
Signed-off-by: Vincent Whitchurch <vincent.whitchurch@axis.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Phillip Lougher <phillip@squashfs.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Bug: 290900323
[block.c: bio_alloc_clone() -> bio_clone_fast()]
(cherry picked from commit e994f5b677)
Change-Id: I34f6478501c0da6b8f67af4e6d173040a486f5a5
Signed-off-by: Lecopzer Chen <lecopzer.chen@mediatek.com>
Patch series "Squashfs: handle missing pages decompressing into page
cache".
This patchset enables Squashfs to handle missing pages when directly
decompressing datablocks into the page cache.
Previously if the full set of pages needed was not available, Squashfs
would have to fall back to using an intermediate buffer (the older
method), which is slower, involving a memcopy, and it introduces
contention on a shared buffer.
The first patch extends the "page actor" code to handle missing pages.
The second patch updates Squashfs_readpage_block() to use the new
functionality, and removes the code that falls back to using an
intermediate buffer.
This patchset is independent of the readahead work, and it is standalone.
It can be merged on its own.
But the readahead patch for efficiency also needs this patch-set.
This patch (of 2):
This patch extends the "page actor" code to handle missing pages.
Previously if the full set of pages needed to decompress a Squashfs
datablock was unavailable, this would cause decompression to fail on the
missing pages.
In this case direct decompression into the page cache could not be
achieved and the code would fall back to using the older intermediate
buffer method.
With this patch, direct decompression into the page cache can be achieved
with missing pages.
For "multi-shot" decompressors (zlib, xz, zstd), the page actor will
allocate a temporary buffer which is passed to the decompressor, and then
freed by the page actor.
For "single shot" decompressors (lz4, lzo) which decompress into a
contiguous "bounce buffer", and which is then copied into the page cache,
it would be pointless to allocate a temporary buffer, memcpy into it, and
then free it. For these decompressors -ENOMEM is returned, which
signifies that the memcpy for that page should be skipped.
This also happens if the data block is uncompressed.
Link: https://lkml.kernel.org/r/20220611032133.5743-1-phillip@squashfs.org.uk
Link: https://lkml.kernel.org/r/20220611032133.5743-2-phillip@squashfs.org.uk
Signed-off-by: Phillip Lougher <phillip@squashfs.org.uk>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Hsin-Yi Wang <hsinyi@chromium.org>
Cc: Xiongwei Song <Xiongwei.Song@windriver.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Bug: 290900323
(cherry picked from commit f268eedddf)
Change-Id: I08906140517a56660e0c487cef49b2efb9c098be
Signed-off-by: Lecopzer Chen <lecopzer.chen@mediatek.com>
[ Upstream commit 504a10d9e4 ]
On corrupt gfs2 file systems the evict code can try to reference the
journal descriptor structure, jdesc, after it has been freed and set to
NULL. The sequence of events is:
init_journal()
...
fail_jindex:
gfs2_jindex_free(sdp); <------frees journals, sets jdesc = NULL
if (gfs2_holder_initialized(&ji_gh))
gfs2_glock_dq_uninit(&ji_gh);
fail:
iput(sdp->sd_jindex); <--references jdesc in evict_linked_inode
evict()
gfs2_evict_inode()
evict_linked_inode()
ret = gfs2_trans_begin(sdp, 0, sdp->sd_jdesc->jd_blocks);
<------references the now freed/zeroed sd_jdesc pointer.
The call to gfs2_trans_begin is done because the truncate_inode_pages
call can cause gfs2 events that require a transaction, such as removing
journaled data (jdata) blocks from the journal.
This patch fixes the problem by adding a check for sdp->sd_jdesc to
function gfs2_evict_inode. In theory, this should only happen to corrupt
gfs2 file systems, when gfs2 detects the problem, reports it, then tries
to evict all the system inodes it has read in up to that point.
Bug: 289870854
Reported-by: Yang Lan <lanyang0908@gmail.com>
Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
(cherry picked from commit 5ae4a618a1)
Signed-off-by: Lee Jones <joneslee@google.com>
Change-Id: I501e8631e1b60479023f5e6ad957540f9e10bcd5
Presently the data buffer used to return the per-UID timeout description
is created based on information provided by the user. It is expected
that the user populates a variable called 'timeouts_array_size' which is
heavily scrutinised to ensure the value provided is appropriate i.e.
smaller than the largest possible value but large enough to contain all
of the data we wish to pass back.
The issue is that the aforementioned scrutiny is imposed on a different
variable to the one expected. Contrary to expectation, the data buffer
is actually being allocated to the size specified in a variable named
'timeouts_array_size_out'. A variable originally designed to only
contain the output information i.e. the size of the data actually copied
to the user for consumption. This value is also user provided and is
not given the same level of scrutiny as the former.
The fix in this case is simple. Ignore 'timeouts_array_size_out' until
it is time to populate (over-write) it ourselves and use
'timeouts_array_size' to shape the buffer as intended.
Bug: 281547360
Change-Id: I95e12879a33a2355f9e4bc0ce2bfc3f229141aa8
Signed-off-by: Lee Jones <joneslee@google.com>
(cherry picked from commit 5a4d20a3eb4e651f88ed2f1f08cee066639ca801)
The 'mmu' parameter to enter_vmid_context() represents the target MMU
to switch to, so we should stash away the current MMU for restoration
by exit_vmid_context() rather than the one we're about to switch to!
Bug: 291568386
Fixes: e815dfc6c6 ("ANDROID: KVM: arm64: Support TLB invalidation in guest context")
Tested-by: Mostafa Saleh <smostafa@google.com>
Reported-by: Mostafa Saleh <smostafa@google.com>
Signed-off-by: Will Deacon <willdeacon@google.com>
Change-Id: I5d76c159424e32a6d70c598d0007f98ea80c1db4
KASAN suppresses reports for bad accesses done by the KASAN reporting
code. The reporting code might access poisoned memory for reporting
purposes.
Software KASAN modes do this by suppressing reports during reporting via
current->kasan_depth, the same way they suppress reports during accesses
to poisoned slab metadata.
Hardware Tag-Based KASAN does not use current->kasan_depth, and instead
resets pointer tags for accesses to poisoned memory done by the reporting
code.
Despite that, a recursive report can still happen:
1. On hardware with faulty MTE support. This was observed by Weizhao
Ouyang on a faulty hardware that caused memory tags to randomly change
from time to time.
2. Theoretically, due to a previous MTE-undetected memory corruption.
A recursive report can happen via:
1. Accessing a pointer with a non-reset tag in the reporting code, e.g.
slab->slab_cache, which is what Weizhao Ouyang observed.
2. Theoretically, via external non-annotated routines, e.g. stackdepot.
To resolve this issue, resetting tags for all of the pointers in the
reporting code and all the used external routines would be impractical.
Instead, disable tag checking done by the CPU for the duration of KASAN
reporting for Hardware Tag-Based KASAN.
Without this fix, Hardware Tag-Based KASAN reporting code might deadlock.
[andreyknvl@google.com: disable preemption instead of migration, fix comment typo]
Link: https://lkml.kernel.org/r/d14417c8bc5eea7589e99381203432f15c0f9138.1680114854.git.andreyknvl@google.com
Link: https://lkml.kernel.org/r/59f433e00f7fa985e8bf9f7caf78574db16b67ab.1678491668.git.andreyknvl@google.com
Fixes: 2e903b9147 ("kasan, arm64: implement HW_TAGS runtime")
Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
Reported-by: Weizhao Ouyang <ouyangweizhao@zeku.com>
Reviewed-by: Marco Elver <elver@google.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Evgenii Stepanov <eugenis@google.com>
Cc: Peter Collingbourne <pcc@google.com>
Cc: Vincenzo Frascino <vincenzo.frascino@arm.com>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Bug: 254721825
(cherry picked from commit c6a690e0c9)
Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
Change-Id: Ifc5daf66f57dd16e85de73257cc0966565836269