After waking up a suspended VM, the kernel prints the following trace
for virtio drivers which do not directly call virtio_device_ready() in
the .restore:
PM: suspend exit
irq 22: nobody cared (try booting with the "irqpoll" option)
Call Trace:
<IRQ>
dump_stack_lvl+0x38/0x49
dump_stack+0x10/0x12
__report_bad_irq+0x3a/0xaf
note_interrupt.cold+0xb/0x60
handle_irq_event+0x71/0x80
handle_fasteoi_irq+0x95/0x1e0
__common_interrupt+0x6b/0x110
common_interrupt+0x63/0xe0
asm_common_interrupt+0x1e/0x40
? __do_softirq+0x75/0x2f3
irq_exit_rcu+0x93/0xe0
sysvec_apic_timer_interrupt+0xac/0xd0
</IRQ>
<TASK>
asm_sysvec_apic_timer_interrupt+0x12/0x20
arch_cpu_idle+0x12/0x20
default_idle_call+0x39/0xf0
do_idle+0x1b5/0x210
cpu_startup_entry+0x20/0x30
start_secondary+0xf3/0x100
secondary_startup_64_no_verify+0xc3/0xcb
</TASK>
handlers:
[<000000008f9bac49>] vp_interrupt
[<000000008f9bac49>] vp_interrupt
Disabling IRQ #22
This happens because we don't invoke .enable_cbs callback in
virtio_device_restore(). That callback is used by some transports
(e.g. virtio-pci) to enable interrupts.
Fix it by calling virtio_device_ready() as we do in
virtio_dev_probe(). This function calls the .enable_cbs callback and sets
the DRIVER_OK status bit.
This fix also avoids setting DRIVER_OK twice for those drivers that
call virtio_device_ready() in the .restore.
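The intended ordering can be sketched in plain C. This is a simplified
user-space model, not the kernel code: struct mock_vdev,
mock_device_ready() and mock_device_restore() are hypothetical stand-ins
for the real virtio structures and functions. Only the DRIVER_OK bit value
(4) is taken from the virtio specification.

```c
#include <stdbool.h>
#include <stddef.h>

#define MOCK_S_DRIVER_OK 4u /* same value as VIRTIO_CONFIG_S_DRIVER_OK */

/* Hypothetical, heavily reduced model of a virtio device. */
struct mock_vdev {
	unsigned int status;
	bool cbs_enabled;     /* set when the transport's enable_cbs hook ran */
	int driver_ok_writes; /* how many times DRIVER_OK was set */
	int (*restore)(struct mock_vdev *vdev); /* driver .restore, may be NULL */
};

/* Models virtio_device_ready(): enable callbacks, then set DRIVER_OK. */
void mock_device_ready(struct mock_vdev *vdev)
{
	vdev->cbs_enabled = true;
	vdev->status |= MOCK_S_DRIVER_OK;
	vdev->driver_ok_writes++;
}

/* Models the tail of the restore path after the fix. */
int mock_device_restore(struct mock_vdev *vdev)
{
	if (vdev->restore) {
		int ret = vdev->restore(vdev);

		if (ret)
			return ret;
	}

	/* Only finish the handshake if the driver did not already do so. */
	if (!(vdev->status & MOCK_S_DRIVER_OK))
		mock_device_ready(vdev);

	return 0;
}

/* A driver whose .restore calls the ready helper itself. */
int mock_restore_calls_ready(struct mock_vdev *vdev)
{
	mock_device_ready(vdev);
	return 0;
}
```

In this model, a driver without a .restore that calls the ready helper
still ends up with callbacks enabled, and a driver that does call it sees
DRIVER_OK written only once.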
Bug: 254441685
Fixes: d50497eb4e ("virtio_config: introduce a new .enable_cbs method")
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Link: https://lore.kernel.org/r/20220322114313.116516-1-sgarzare@redhat.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit 8d65bc9a5b)
Signed-off-by: Lee Jones <joneslee@google.com>
Change-Id: I1a16d4e905ed3929ecdd87c3a7852c0906611ff3
Changes in 5.15.84
x86/vdso: Conditionally export __vdso_sgx_enter_enclave()
vfs: fix copy_file_range() averts filesystem freeze protection
nfp: fix use-after-free in area_cache_get()
ASoC: fsl_micfil: explicitly clear software reset bit
ASoC: fsl_micfil: explicitly clear CHnF flags
ASoC: ops: Check bounds for second channel in snd_soc_put_volsw_sx()
libbpf: Use page size as max_entries when probing ring buffer map
pinctrl: meditatek: Startup with the IRQs disabled
can: sja1000: fix size of OCR_MODE_MASK define
can: mcba_usb: Fix termination command argument
net: fec: don't reset irq coalesce settings to defaults on "ip link up"
ASoC: cs42l51: Correct PGA Volume minimum value
perf: Fix perf_pending_task() UaF
nvme-pci: clear the prp2 field when not used
ASoC: ops: Correct bounds check for second channel on SX controls
net: fec: properly guard irq coalesce setup
Linux 5.15.84
Change-Id: I34ef5e73fca9da9a77c89b1f0c7ad4af37b63a79
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Not all .S files include asm/assembler.h, however the SYM_FUNC_*
definitions invoke the 'bti' macro. Include asm/assembler.h in
asm/linkage.h.
Bug: 254441685
Fixes: 9be34be87c ("arm64: Add macro version of the BTI instruction")
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
(cherry picked from commit dd73d18e7f)
Signed-off-by: Lee Jones <joneslee@google.com>
Change-Id: I5dc6693315e56c36bd5c597a3b0de1655e11c7ba
Only enforce export protection if there are symbols in the
unprotected list for the Kernel Module Interface (KMI).
This is only relevant for targets like arm64 that have
defined ABI symbol lists. This allows non-GKI targets
like arm and x86 to continue using GKI source code
without disabling the feature for those targets.
Bug: 232430739
Test: TH
Fixes: fd1e768866 ("ANDROID: GKI: Protect exports of protected GKI modules")
Change-Id: Ie89e8f63eda99d9b7aacd1bb76d036b3ff4ba37c
Signed-off-by: Ramji Jiyani <ramjiyani@google.com>
Update protected export symbols list with exports
from list of protected modules at
android/gki_protected_modules.
It includes symbols from every GKI module except
zram & zsmalloc, and serves as a baseline.
Bug: 232430739
Test: TH
Change-Id: Iec33dfe093b4e9e0281b910b2b3bf998cef55394
Signed-off-by: Ramji Jiyani <ramjiyani@google.com>
This reverts commit eb57c31115.
This branch looks clean of WERROR warnings. Let's try to re-enable it.
Fixes: eb57c31115 ("ANDROID: allmodconfig: disable WERROR")
Signed-off-by: Lee Jones <joneslee@google.com>
Change-Id: I0106dcd43d7e4b4e20ac768f3faac40285bc837b
Signed-off-by: Lee Jones <joneslee@google.com>
commit 7e6303567c upstream.
Prior to the Fixes: commit, the initialization code went through the
same fec_enet_set_coalesce() function as used by ethtool, and that
function correctly checks whether the current variant has support for
irq coalescing.
Now that the initialization code instead calls fec_enet_itr_coal_set()
directly, that call needs to be guarded by a check for the
FEC_QUIRK_HAS_COALESCE bit.
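A minimal sketch of the guard, written as a user-space model rather than
the driver code. struct mock_fec, mock_itr_coal_set() and mock_restart()
are hypothetical names, and the bit position chosen for
MOCK_QUIRK_HAS_COALESCE is an assumption made for illustration.

```c
#include <stdbool.h>

#define MOCK_QUIRK_HAS_COALESCE (1u << 13) /* assumed bit position */

/* Hypothetical reduced view of the driver's private data. */
struct mock_fec {
	unsigned int quirks;
	unsigned int rx_time_itr; /* stored coalesce time, in us */
	unsigned int rx_pkts_itr; /* stored coalesce frame count */
	bool hw_programmed;       /* set when the registers were written */
};

/* Stand-in for fec_enet_itr_coal_set(): apply the stored settings. */
void mock_itr_coal_set(struct mock_fec *fep)
{
	fep->hw_programmed = true;
}

/* Stand-in for the restart path: only touch coalescing if supported. */
void mock_restart(struct mock_fec *fep)
{
	if (fep->quirks & MOCK_QUIRK_HAS_COALESCE)
		mock_itr_coal_set(fep);
}
```

Variants without the quirk bit never reach the register write, which is
the behaviour the ethtool path already had.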
Fixes: df727d4547 (net: fec: don't reset irq coalesce settings to defaults on "ip link up")
Reported-by: Greg Ungerer <gregungerer@westnet.com.au>
Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Link: https://lore.kernel.org/r/20221205204604.869853-1-linux@rasmusvillemoes.dk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit a56ea6147f ]
If the prp2 field is not filled in nvme_setup_prp_simple(), the prp2
field is garbage data. According to the NVMe spec, prp2 is reserved if
the data transfer does not cross a memory page boundary, so clear it to
zero if it is not used.
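The rule can be illustrated with a small stand-alone sketch. It is not the
driver code: struct mock_rw_cmd and mock_setup_prp_simple() are
hypothetical, endianness conversion is left out, and page_size is assumed
to be a power of two.

```c
#include <stdint.h>

/* Hypothetical reduced command: only the two PRP entries. */
struct mock_rw_cmd {
	uint64_t prp1;
	uint64_t prp2;
};

/*
 * Fill prp1/prp2 for a transfer of len bytes starting at dma_addr.
 * prp2 is only meaningful when the transfer crosses a page boundary;
 * otherwise it is cleared instead of being left uninitialized.
 */
void mock_setup_prp_simple(struct mock_rw_cmd *cmd, uint64_t dma_addr,
			   uint32_t len, uint32_t page_size)
{
	uint32_t offset = (uint32_t)(dma_addr & (page_size - 1));
	uint32_t first_prp_len = page_size - offset;

	cmd->prp1 = dma_addr;
	if (len > first_prp_len)
		cmd->prp2 = dma_addr + first_prp_len;
	else
		cmd->prp2 = 0;
}
```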
Signed-off-by: Lei Rao <lei.rao@intel.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 517e6a301f ]
Per syzbot it is possible for perf_pending_task() to run after the
event is free()'d. There are two related but distinct cases:
- the task_work was already queued before destroying the event;
- destroying the event itself queues the task_work.
The first cannot be solved using task_work_cancel() since
perf_release() itself might be called from a task_work (____fput),
which means the current->task_works list is already empty and
task_work_cancel() won't be able to find the perf_pending_task()
entry.
The simplest alternative is extending the perf_event lifetime to cover
the task_work.
The second is just silly, queueing a task_work while you know the
event is going away makes no sense and is easily avoided by
re-arranging how the event is marked STATE_DEAD and ensuring it goes
through STATE_OFF on the way down.
Reported-by: syzbot+9228d6098455bb209ec8@syzkaller.appspotmail.com
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Marco Elver <elver@google.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit df727d4547 ]
Currently, when a FEC device is brought up, the irq coalesce settings
are reset to their default values (1000us, 200 frames). That's
unexpected, and breaks for example use of an appropriate .link file to
make systemd-udev apply the desired
settings (https://www.freedesktop.org/software/systemd/man/systemd.link.html),
or any other method that would do a one-time setup during early boot.
Refactor the code so that fec_restart() instead uses
fec_enet_itr_coal_set(), which simply applies the settings that are
stored in the private data, and initialize that private data with the
default values.
Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 1a8e3bd25f ]
Microchip USB Analyzer can activate the internal termination resistors
by setting the "termination" option ON, or OFF to deactivate them.
As I've observed, both with my oscilloscope and captured USB packets
below, you must send "0" to turn it ON, and "1" to turn it OFF.
From the schematics in the user's guide, I can confirm that you must
drive the CAN_RES signal LOW "0" to activate the resistors.
Reverse the argument value of usb_msg.termination to fix this.
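As a rough illustration of the corrected mapping, assuming only the
observation above that the device expects 0 for ON and 1 for OFF. The
enum and the helper name are hypothetical and do not come from the driver.

```c
#include <stdbool.h>
#include <stdint.h>

/* Wire values as observed on the bus: the signal is active low. */
enum {
	MOCK_TERM_WIRE_ON = 0,
	MOCK_TERM_WIRE_OFF = 1,
};

/* Translate the user's request into the byte sent to the device. */
uint8_t mock_termination_wire_value(bool enable)
{
	return enable ? MOCK_TERM_WIRE_ON : MOCK_TERM_WIRE_OFF;
}
```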
These are the two command sequences, ON then OFF.
> No. Time Source Destination Protocol Length Info
> 1 0.000000 host 1.3.1 USB 46 URB_BULK out
>
> Frame 1: 46 bytes on wire (368 bits), 46 bytes captured (368 bits)
> USB URB
> Leftover Capture Data: a80000000000000000000000000000000000a8
>
> No. Time Source Destination Protocol Length Info
> 2 4.372547 host 1.3.1 USB 46 URB_BULK out
>
> Frame 2: 46 bytes on wire (368 bits), 46 bytes captured (368 bits)
> USB URB
> Leftover Capture Data: a80100000000000000000000000000000000a9
Signed-off-by: Yasushi SHOJI <yashi@spacecubics.com>
Link: https://lore.kernel.org/all/20221124152504.125994-1-yashi@spacecubics.com
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 689eb2f1ba ]
Use page size as max_entries when probing the ring buffer map; otherwise
the probe may fail on a host with a 64KB page size (e.g., an ARM64 host).
After the fix, the output of "bpftool feature" on above host will be
correct.
Before :
eBPF map_type ringbuf is NOT available
eBPF map_type user_ringbuf is NOT available
After :
eBPF map_type ringbuf is available
eBPF map_type user_ringbuf is available
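The failure follows from the size constraint on ring buffer maps: the
kernel only accepts a max_entries value that is a power of two and a
multiple of the page size. A small sketch of that check, assuming the
probe previously used a fixed size of 4096; mock_ringbuf_size_ok() is a
hypothetical helper, not a libbpf function.

```c
#include <stdbool.h>
#include <stddef.h>

/* True if max_entries would pass the ring buffer size constraint. */
bool mock_ringbuf_size_ok(size_t max_entries, size_t page_size)
{
	bool pow2;

	if (!max_entries || !page_size)
		return false;

	pow2 = (max_entries & (max_entries - 1)) == 0;

	return pow2 && (max_entries % page_size) == 0;
}
```

With a 4KB page size, 4096 passes. With a 64KB page size, 4096 is rejected
while 65536 passes, which is why probing with the page size (for example
from sysconf(_SC_PAGESIZE)) works on both kinds of host.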
Signed-off-by: Hou Tao <houtao1@huawei.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20221116072351.1168938-2-houtao@huaweicloud.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
commit 02e1a114fd upstream.
area_cache_get() is used to distribute cache->area and set cache->id,
and if cache->id is not 0 and cache->area->kref refcount is 0, it will
release the cache->area by nfp_cpp_area_release(). area_cache_get()
set cache->id before cpp->op->area_init() and nfp_cpp_area_acquire().
But if area_init() or nfp_cpp_area_acquire() fails, the cache->id
is already set but the refcount is not increased as expected. At this
time, calling the nfp_cpp_area_release() will cause use-after-free.
To avoid the use-after-free, set cache->id after area_init() and
nfp_cpp_area_acquire() complete successfully.
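The ordering change can be sketched as follows. This is a reduced model
under stated assumptions: struct mock_cache, mock_acquire() and
mock_cache_get() are hypothetical, and the fail parameter merely simulates
an error from area_init() or nfp_cpp_area_acquire().

```c
#include <stdint.h>

/* Hypothetical reduced cache entry. */
struct mock_cache {
	uint32_t id;  /* non-zero means "area is held" */
	int refcount; /* references taken on the area */
};

/* Stand-in for the init/acquire steps, which may fail. */
int mock_acquire(struct mock_cache *cache, int fail)
{
	if (fail)
		return -1;

	cache->refcount++;
	return 0;
}

/* Publish the id only once the reference has really been taken. */
int mock_cache_get(struct mock_cache *cache, uint32_t new_id, int fail)
{
	int err = mock_acquire(cache, fail);

	if (err)
		return err; /* id untouched, so no later release is attempted */

	cache->id = new_id;
	return 0;
}
```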
Note: This vulnerability is triggerable by providing emulated device
equipped with specified configuration.
BUG: KASAN: use-after-free in nfp6000_area_init (drivers/net/ethernet/netronome/nfp/nfpcore/nfp6000_pcie.c:760)
Write of size 4 at addr ffff888005b7f4a0 by task swapper/0/1
Call Trace:
<TASK>
nfp6000_area_init (drivers/net/ethernet/netronome/nfp/nfpcore/nfp6000_pcie.c:760)
area_cache_get.constprop.8 (drivers/net/ethernet/netronome/nfp/nfpcore/nfp_cppcore.c:884)
Allocated by task 1:
nfp_cpp_area_alloc_with_name (drivers/net/ethernet/netronome/nfp/nfpcore/nfp_cppcore.c:303)
nfp_cpp_area_cache_add (drivers/net/ethernet/netronome/nfp/nfpcore/nfp_cppcore.c:802)
nfp6000_init (drivers/net/ethernet/netronome/nfp/nfpcore/nfp6000_pcie.c:1230)
nfp_cpp_from_operations (drivers/net/ethernet/netronome/nfp/nfpcore/nfp_cppcore.c:1215)
nfp_pci_probe (drivers/net/ethernet/netronome/nfp/nfp_main.c:744)
Freed by task 1:
kfree (mm/slub.c:4562)
area_cache_get.constprop.8 (drivers/net/ethernet/netronome/nfp/nfpcore/nfp_cppcore.c:873)
nfp_cpp_read (drivers/net/ethernet/netronome/nfp/nfpcore/nfp_cppcore.c:924 drivers/net/ethernet/netronome/nfp/nfpcore/nfp_cppcore.c:973)
nfp_cpp_readl (drivers/net/ethernet/netronome/nfp/nfpcore/nfp_cpplib.c:48)
Signed-off-by: Jialiang Wang <wangjialiang0806@163.com>
Reviewed-by: Yinjun Zhang <yinjun.zhang@corigine.com>
Acked-by: Simon Horman <simon.horman@corigine.com>
Link: https://lore.kernel.org/r/20220810073057.4032-1-wangjialiang0806@163.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 10bc8e4af6 upstream.
Commit 868f9f2f8e ("vfs: fix copy_file_range() regression in cross-fs
copies") removed fallback to generic_copy_file_range() for cross-fs
cases inside vfs_copy_file_range().
To preserve behavior of nfsd and ksmbd server-side-copy, the fallback to
generic_copy_file_range() was added in nfsd and ksmbd code, but that
call is missing sb_start_write(), fsnotify hooks and more.
Ideally, nfsd and ksmbd would pass a flag to vfs_copy_file_range() that
will take care of the fallback, but that code would be subtle and we got
vfs_copy_file_range() logic wrong too many times already.
Instead, add a flag to explicitly request vfs_copy_file_range() to
perform only generic_copy_file_range() and let nfsd and ksmbd use this
flag only in the fallback path.
This choice keeps the logic changes to minimum in the non-nfsd/ksmbd code
paths to reduce the risk of further regressions.
Fixes: 868f9f2f8e ("vfs: fix copy_file_range() regression in cross-fs copies")
Tested-by: Namjae Jeon <linkinjeon@kernel.org>
Tested-by: Luis Henriques <lhenriques@suse.de>
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
[backport comments for v5.15: - sb_write_started() is missing - assert was dropped ]
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 45be2ad007 upstream.
Recently, ld.lld moved from '--undefined-version' to
'--no-undefined-version' as the default, which breaks building the vDSO
when CONFIG_X86_SGX is not set:
ld.lld: error: version script assignment of 'LINUX_2.6' to symbol '__vdso_sgx_enter_enclave' failed: symbol not defined
__vdso_sgx_enter_enclave is only included in the vDSO when
CONFIG_X86_SGX is set. Only export it if it will be present in the final
object, which clears up the error.
Fixes: 8466436952 ("x86/vdso: Implement a vDSO for Intel SGX enclave call")
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
Link: https://github.com/ClangBuiltLinux/linux/issues/1756
Link: https://lore.kernel.org/r/20221109000306.1407357-1-nathan@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
When CONFIG_CRYPTO_MANAGER_DISABLE_TESTS is set, the code in algboss.c
that handles CRYPTO_MSG_ALG_REGISTER is unnecessary, so make it be
compiled out.
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Bug: 256875295
(cherry picked from commit 441cb1b730)
Change-Id: I11ebf60e1915ad5d13bd16a26d6c2c0944b4c401
Signed-off-by: Eric Biggers <ebiggers@google.com>
The crypto_boot_test_finished static key is unnecessary when self-tests
are disabled in the kconfig, so optimize it out accordingly, along with
the entirety of crypto_start_tests(). This mainly avoids the overhead
of an unnecessary static_branch_enable() on every boot.
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Bug: 256875295
(cherry picked from commit 06bd9c967e)
Change-Id: I68eff9772dc219a8786bf410cb4e946052ea7811
Signed-off-by: Eric Biggers <ebiggers@google.com>
Since algboss always skips testing of algorithms with the
CRYPTO_ALG_INTERNAL flag, there is no need to go through the dance of
creating the test kthread, which creates a lot of overhead. Instead, we
can just directly finish the algorithm registration, like is now done
when self-tests are disabled entirely.
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Bug: 256875295
(cherry picked from commit 9cadd73ade)
Change-Id: I10f814cd6903d41265f69297d8568b43ec30012e
Signed-off-by: Eric Biggers <ebiggers@google.com>
Currently, registering an algorithm with the crypto API always causes a
notification to be posted to the "cryptomgr", which then creates a
kthread to self-test the algorithm. However, if self-tests are disabled
in the kconfig (as is the default option), then this kthread just
notifies waiters that the algorithm has been tested, then exits.
This causes a significant amount of overhead, especially in the kthread
creation and destruction, which is not necessary at all. For example,
in a quick test I found that booting a "minimum" x86_64 kernel with all
the crypto options enabled (except for the self-tests) takes about 400ms
until PID 1 can start. Of that, a full 13ms is spent just doing this
pointless dance, involving a kthread being created, run, and destroyed
over 200 times. That's over 3% of the entire kernel start time.
Fix this by just skipping the creation of the test larval and the
posting of the registration notification entirely, when self-tests are
disabled.
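A minimal sketch of the shortcut, written as a user-space model. The flag
MOCK_ALG_TESTED, the structures and mock_register_alg() are hypothetical.
The counter stands in for the notification and kthread creation described
above.

```c
#include <stdbool.h>

#define MOCK_ALG_TESTED (1u << 0) /* hypothetical "tested" flag */

struct mock_alg {
	unsigned int flags;
};

/* Counts how many test threads a registration would have spawned. */
struct mock_stats {
	int test_threads;
};

void mock_register_alg(struct mock_alg *alg, bool selftests_enabled,
		       struct mock_stats *stats)
{
	if (!selftests_enabled) {
		/* No larval, no notification: mark it tested right away. */
		alg->flags |= MOCK_ALG_TESTED;
		return;
	}

	/* Stand-in for posting the notification that starts a test thread. */
	stats->test_threads++;
}
```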
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Bug: 256875295
(cherry picked from commit a7008584ab)
(Resolved trivial conflict due to missing upstream commit d6097b8d5d)
Change-Id: Ia6be068618e9286c1be01415a6766ba2fa94fc0d
Signed-off-by: Eric Biggers <ebiggers@google.com>
The delayed boot-time testing patch created a dependency loop
between api.c and algapi.c because it added a crypto_alg_tested
call to the former when the crypto manager is disabled.
We could instead avoid creating the test larvals if the crypto
manager is disabled. This avoids the dependency loop as well
as saving some unnecessary work, albeit in a very unlikely case.
Reported-by: Nathan Chancellor <nathan@kernel.org>
Reported-by: Naresh Kamboju <naresh.kamboju@linaro.org>
Reported-by: kernel test robot <lkp@intel.com>
Fixes: adad556efc ("crypto: api - Fix built-in testing dependency failures")
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Bug: 256875295
(cherry picked from commit cad439fc04)
Change-Id: I4e0e0b2022dc060fc1d84744e04beae411165ad0
Signed-off-by: Eric Biggers <ebiggers@google.com>
We need to export crypto_boot_test_finished in case api.c is
built-in while algapi.c is built as a module.
Fixes: adad556efc ("crypto: api - Fix built-in testing dependency failures")
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Tested-by: Stephen Rothwell <sfr@canb.auug.org.au> # ppc32 build
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Bug: 256875295
(cherry picked from commit e42dff467e)
Change-Id: Iefc190f29539084e7c84e23120e861de2e0b9351
Signed-off-by: Eric Biggers <ebiggers@google.com>
When complex algorithms that depend on other algorithms are built
into the kernel, they must be registered in an order such that
the underlying algorithms are ready before the ones on top are
registered. Otherwise they would fail the self-test
that is required during registration.
In the past we have used subsystem initialisation ordering to
guarantee this. The number of such precedence levels is limited
and they may cause ripple effects in other subsystems.
This patch solves this problem by delaying all self-tests during
boot-up for built-in algorithms. They will be tested either when
something else in the kernel requests them, or when we have
finished registering all built-in algorithms, whichever comes
earlier.
Reported-by: Vladis Dronov <vdronov@redhat.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Bug: 256875295
(cherry picked from commit adad556efc)
Change-Id: I9cb048ffe0ce7e471cc6e71904f1b2c462b57be4
Signed-off-by: Eric Biggers <ebiggers@google.com>
android/gki_protected_modules serves as a running
list of protected GKI modules. This list is being
used as an input to generate the list of protected
GKI module exports at android/abi_gki_protected_exports.
All GKI modules are protected except zram.ko & zsmalloc.ko
as baseline in this list.
Bug: 232430739
Test: TH
Change-Id: I0c993769b9d07543755fd056199b0e4d10d27f77
Signed-off-by: Ramji Jiyani <ramjiyani@google.com>
Implement support for protecting the exported symbols of
protected GKI modules.
Only signed GKI modules are permitted to export symbols
listed in the android/abi_gki_protected_exports file.
Attempting to export these symbols from an unsigned module
will result in the module failing to load, with a
'Permission denied' error message.
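The check can be sketched as below. This is a simplified model: the symbol
names in mock_protected[] are made up, mock_is_protected() uses a plain
linear search, and mock_check_export() is a hypothetical helper. EACCES is
the errno whose usual message text is "Permission denied".

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Made-up sample; the real list comes from android/abi_gki_protected_exports. */
static const char *const mock_protected[] = { "sym_a", "sym_b" };

bool mock_is_protected(const char *name)
{
	size_t n = sizeof(mock_protected) / sizeof(mock_protected[0]);
	size_t i;

	for (i = 0; i < n; i++) {
		if (strcmp(mock_protected[i], name) == 0)
			return true;
	}

	return false;
}

/* Reject a protected export unless the module is a signed GKI module. */
int mock_check_export(const char *name, bool module_is_signed)
{
	if (!module_is_signed && mock_is_protected(name))
		return -EACCES;

	return 0;
}
```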
Bug: 232430739
Test: TH
Change-Id: I3e8b330938e116bb2e022d356ac0d55108a84a01
Signed-off-by: Ramji Jiyani <ramjiyani@google.com>
Hypervisor vendor modules may need to create non-cacheable mappings in
the hypervisor stage-1 for interacting with devices such as IOMMUs.
Add support for this memory type to the KVM pgtable API and implement
it for both stage-1 and stage-2.
Bug: 244373730
Signed-off-by: Will Deacon <willdeacon@google.com>
Change-Id: I2f88db7fe47e16366018e3e48f30d09b299ae6e4
The merge of 5.15.61 into this branch incorrectly deleted the test
vectors that were added by the following commits:
commit 0035442093 ("UPSTREAM: crypto: xctr - Add XCTR support")
commit e3efa8253b ("UPSTREAM: crypto: polyval - Add POLYVAL support")
commit d672bb9c20 ("UPSTREAM: crypto: hctr2 - Add HCTR2 support")
This causes a build error when CONFIG_CRYPTO_MANAGER_DISABLE_TESTS is
not set. Fix this by adding back the test vectors.
Bug: 233652475
Fixes: 47c7e57022 ("Merge 5.15.61 into android14-5.15")
Change-Id: I7dce7570d51a97b88ae751046443df6f0a9038b2
Signed-off-by: Eric Biggers <ebiggers@google.com>
If the filesystem being watched supports d_canonical_path,
notify the lower filesystem of the open as well.
Fixes: f37e05049b ("ANDROID: vfs: d_canonical_path for stacked FS")
Test: atest CtsOsTestCases:android.os.cts.FileObserverTest
Bug: 70706497
Signed-off-by: Daniel Rosenberg <drosen@google.com>
Signed-off-by: Paul Lawrence <paullawrence@google.com>
Signed-off-by: Alessio Balsini <balsini@google.com>
Change-Id: I7c9d210e8e6ee99928ad9db0b41ffc3ac3371dc0
* aosp/upstream-f2fs-stable-linux-5.15.y:
f2fs: reset wait_ms to default if any of the victims have been selected
f2fs: fix some format WARNING in debug.c and sysfs.c
f2fs: don't call f2fs_issue_discard_timeout() when discard_cmd_cnt is 0 in f2fs_put_super()
f2fs: fix iostat parameter for discard
f2fs: Fix spelling mistake in label: free_bio_enrty_cache -> free_bio_entry_cache
f2fs: add block_age-based extent cache
f2fs: allocate the extent_cache by default
f2fs: refactor extent_cache to support for read and more
f2fs: remove unnecessary __init_extent_tree
f2fs: move internal functions into extent_cache.c
f2fs: specify extent cache for read explicitly
f2fs: introduce f2fs_is_readonly() for readability
f2fs: remove F2FS_SET_FEATURE() and F2FS_CLEAR_FEATURE() macro
f2fs: do some cleanup for f2fs module init
MAINTAINERS: Add f2fs bug tracker link
f2fs: remove the unused flush argument to change_curseg
f2fs: open code allocate_segment_by_default
f2fs: remove struct segment_allocation default_salloc_ops
f2fs: introduce discard_urgent_util sysfs node
f2fs: define MIN_DISCARD_GRANULARITY macro
f2fs: init discard policy after thread wakeup
f2fs: avoid victim selection from previous victim section
f2fs: truncate blocks in batch in __complete_revoke_list()
f2fs: make __queue_discard_cmd() return void
f2fs: fix description about discard_granularity node
f2fs: move set_file_temperature into f2fs_new_inode
f2fs: fix to enable compress for newly created file if extension matches
f2fs: change type for 'sbi->readdir_ra'
f2fs: cleanup for 'f2fs_tuning_parameters' function
f2fs: fix to alloc_mode changed after remount on a small volume device
f2fs: remove submit label in __submit_discard_cmd()
f2fs: fix to do sanity check on i_extra_isize in is_alive()
f2fs: introduce F2FS_IOC_START_ATOMIC_REPLACE
f2fs: fix to set flush_merge opt and show noflush_merge
f2fs: initialize locks earlier in f2fs_fill_super()
f2fs: optimize iteration over sparse directories
f2fs: fix to avoid accessing uninitialized spinlock
f2fs: correct i_size change for atomic writes
f2fs: add proc entry to show discard_plist info
f2fs: allow to read node block after shutdown
f2fs: replace ternary operator with max()
f2fs: replace gc_urgent_high_remaining with gc_remaining_trials
f2fs: add missing bracket in doc
f2fs: use sysfs_emit instead of sprintf
f2fs: introduce gc_mode sysfs node
f2fs: fix to destroy sbi->post_read_wq in error path of f2fs_fill_super()
f2fs: fix return val in f2fs_start_ckpt_thread()
f2fs: fix the msg data type
f2fs: fix the assign logic of iocb
f2fs: Fix typo in comments
f2fs: introduce max_ordered_discard sysfs node
f2fs: allow to set compression for inlined file
f2fs: add barrier mount option
f2fs: fix normal discard process
f2fs: cleanup in f2fs_create_flush_cmd_control()
f2fs: fix gc mode when gc_urgent_high_remaining is 1
f2fs: remove batched_trim_sections node
f2fs: support fault injection for f2fs_is_valid_blkaddr()
f2fs: fix to invalidate dcc->f2fs_issue_discard in error path
f2fs: Fix the race condition of resize flag between resizefs
f2fs: let's avoid to get cp_rwsem twice by f2fs_evict_inode by d_invalidate
f2fs: should put a page when checking the summary info
fscrypt: fix keyring memory leak on mount failure
Bug: 256243893
Signed-off-by: Jaegeuk Kim <jaegeuk@google.com>
Change-Id: I1755d4a31521e16602673d1327e2494cb0b84fdf
If a KVM_FUNC_MMIO_GUARD_MAP hypercall from a protected guest fails at
EL2 due to running out of page-table memory, the call is forwarded to
the host so that additional memory can be donated using the vCPU's
memcache.
Unfortunately, the host filters out these calls and the hypervisor will
replay the guest's HVC instruction forever, making no progress because
it will fail each time.
Avoid filtering out KVM_FUNC_MMIO_GUARD_MAP, in the same way as we
handle the SHARE and UNSHARE hypercalls.
Bug: 262700476
Cc: Keir Fraser <keirf@google.com>
Signed-off-by: Will Deacon <willdeacon@google.com>
Change-Id: Idd14c6bc08a4232939676e3566b79cbc7c927a3a
This optimization allows us to re-create higher order block mappings in
the host stage2 pagetables after we tear down a guest VM.
When the host reclaims ownership during guest teardown, the page table
walker drops the refcount of the counted entries and clears out
unreferenced entries (refcount == 1). Clearing out the entry installs a
zero PTE. When the host stage2 receives a data abort because there is no
mapping associated, it will try to create the largest possible block
mapping from the found leaf entry.
With the current patch, we increase the chances of finding a leaf entry
that has level < 3 if the requested region comes from the memory of a
reclaimed, torn-down VM. This has the advantage of reducing the TLB pressure at
host stage2.
To increase the coalescing chances, we modify the way we refcount page
table descriptors for host stage2:
- non-zero invalid PTEs
- any of the reserved-high bits (58-55) toggled
- non-default attribute mappings
- page table descriptors
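The list above can be read as a predicate on a descriptor. The sketch
below is one possible reading, not the patch itself:
mock_host_pte_is_counted() is hypothetical, and the attribute mask and
default attribute value are passed in as parameters because their real
values are not given here. The valid bit (bit 0), the table type bit
(bit 1 at levels below 3) and the software bits 58:55 follow the arm64
descriptor layout.

```c
#include <stdbool.h>
#include <stdint.h>

#define MOCK_PTE_VALID   (1ULL << 0)
#define MOCK_PTE_TABLE   (1ULL << 1)    /* table descriptor at levels < 3 */
#define MOCK_PTE_SW_HIGH (0xfULL << 55) /* bits 58:55 */

/* Decide whether a host stage2 descriptor holds a reference. */
bool mock_host_pte_is_counted(uint64_t pte, unsigned int level,
			      uint64_t attr_mask, uint64_t default_attrs)
{
	/* Invalid entries count only if they carry some annotation. */
	if (!(pte & MOCK_PTE_VALID))
		return pte != 0;

	/* Table descriptors always count. */
	if (level < 3 && (pte & MOCK_PTE_TABLE))
		return true;

	/* Any of the reserved-high software bits set. */
	if (pte & MOCK_PTE_SW_HIGH)
		return true;

	/* Leaf mappings count only with non-default attributes. */
	return (pte & attr_mask) != default_attrs;
}
```

Under this reading, a valid leaf with default attributes and no software
bits is not counted, so a table full of such entries can be dropped and
later replaced by a single block mapping.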
Bug: 222044487
Test: dump the host stage2 pagetables and view the mapping
Signed-off-by: Sebastian Ene <sebastianene@google.com>
Change-Id: I90ff4ec2185e9a76d7ad17e77ef9bdd8ce3e8698
In preparation for the coalescing algorithm implementation, move the
function that checks whether a page table entry is a table to the common
header.
Bug: 222044487
Change-Id: I4124b7727e91f61b8f0a7e44cd91403d09d83c3c
Signed-off-by: Sebastian Ene <sebastianene@google.com>
Move the host specific code for PTE reference counting out of the
pagetable code and define a new structure that wraps all the PTE
manipulation callbacks. This structure will be passed during the
pagetable code initialization and allows registering different
callbacks for [guest|host].
Bug: 222044487
Signed-off-by: Sebastian Ene <sebastianene@google.com>
Change-Id: I116e8322935762df2f2be6e8d51a3f0c140b3d36
Make PTE attribute definitions available from kvm_pgtable.h and take
them out of the pagetable code. These attributes will be used later in
mem_protect.c to construct different masks during the PTE manipulation
callbacks.
Bug: 222044487
Signed-off-by: Sebastian Ene <sebastianene@google.com>
Change-Id: I2f7108815ef0fa536e7f3314762a412119400fe9
Refactor the code and add stage2_clear_pte(..) which removes the PTE
without dropping the refcount for an entry.
Bug: 222044487
Signed-off-by: Sebastian Ene <sebastianene@google.com>
Change-Id: Ia2cb47f2ffad6faa5c6b4ec8a37bcbe61be0bc2f
Extend the scope of the stage2_freewalker by passing the pgt instead of
the mm_ops callbacks. This will later be used by the stage2_pte_is_counted
function.
Bug: 222044487
Signed-off-by: Sebastian Ene <sebastianene@google.com>
Change-Id: I390661eb106cbdb863cbb1832e39ec155c439091