commit 1ac2a41049 upstream.
Currently we only support idmapped mounts for filesystems mounted
without an idmapping. This was a conscious decision mentioned in
multiple places (cf. e.g. [1]).
As explained at length in [3] it is perfectly fine to extend support for
idmapped mounts to filesystem's mounted with an idmapping should the
need arise. The need has been there for some time now. Various container
projects in userspace need this to run unprivileged and nested
unprivileged containers (cf. [2]).
Before we can port any filesystem that is mountable with an idmapping to
support idmapped mounts we need to first extend the mapping helpers to
account for the filesystem's idmapping. This again, is explained at
length in our documentation at [3] but I'll give an overview here again.
Currently, the low-level mapping helpers implement the remapping
algorithms described in [3] in a simplified manner. Because we could
rely on the fact that all filesystems supporting idmapped mounts are
mounted without an idmapping the translation step from or into the
filesystem idmapping could be skipped.
In order to support idmapped mounts of filesystem's mountable with an
idmapping the translation step we were able to skip before cannot be
skipped anymore. A filesystem mounted with an idmapping is very likely
to not use an identity mapping and will instead use a non-identity
mapping. So the translation step from or into the filesystem's idmapping
in the remapping algorithm cannot be skipped for such filesystems. More
details with examples can be found in [3].
This patch adds a few new and prepares some already existing low-level
mapping helpers to perform the full translation algorithm explained in
[3]. The low-level helpers can be written in a way that they only
perform the additional translation step when the filesystem is indeed
mounted with an idmapping.
If the low-level helpers detect that they are not dealing with an
idmapped mount they can simply return the relevant k{g,u}id unchanged;
no remapping needs to be performed at all. The no_idmapping() helper
detects whether the shortcut can be used.
If the low-level helpers detected that they are dealing with an idmapped
mount but the underlying filesystem is mounted without an idmapping we
can rely on the previous shorcut and can continue to skip the
translation step from or into the filesystem's idmapping.
These checks guarantee that only the minimal amount of work is
performed. As before, if idmapped mounts aren't used the low-level
helpers are idempotent and no work is performed at all.
This patch adds the helpers mapped_k{g,u}id_fs() and
mapped_k{g,u}id_user(). Following patches will port all places to
replace the old k{g,u}id_into_mnt() and k{g,u}id_from_mnt() with these
two new helpers. After the conversion is done k{g,u}id_into_mnt() and
k{g,u}id_from_mnt() will be removed. This also concludes the renaming of
the mapping helpers we started in [4]. Now, all mapping helpers will
started with the "mapped_" prefix making everything nice and consistent.
The mapped_k{g,u}id_fs() helpers replace the k{g,u}id_into_mnt()
helpers. They are to be used when k{g,u}ids are to be mapped from the
vfs, e.g. from from struct inode's i_{g,u}id. Conversely, the
mapped_k{g,u}id_user() helpers replace the k{g,u}id_from_mnt() helpers.
They are to be used when k{g,u}ids are to be written to disk, e.g. when
entering from a system call to change ownership of a file.
This patch only introduces the helpers. It doesn't yet convert the
relevant places to account for filesystem mounted with an idmapping.
[1]: commit 2ca4dcc490 ("fs/mount_setattr: tighten permission checks")
[2]: https://github.com/containers/podman/issues/10374
[3]: Documentations/filesystems/idmappings.rst
[4]: commit a65e58e791 ("fs: document and rename fsid helpers")
Link: https://lore.kernel.org/r/20211123114227.3124056-5-brauner@kernel.org (v1)
Link: https://lore.kernel.org/r/20211130121032.3753852-5-brauner@kernel.org (v2)
Link: https://lore.kernel.org/r/20211203111707.3901969-5-brauner@kernel.org
Cc: Seth Forshee <sforshee@digitalocean.com>
Cc: Amir Goldstein <amir73il@gmail.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Al Viro <viro@zeniv.linux.org.uk>
CC: linux-fsdevel@vger.kernel.org
Reviewed-by: Seth Forshee <sforshee@digitalocean.com>
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The hypervisor memory pool is sized to allow mapping up to 1GiB of data
in the 'private' range of the hypervisor. However, this is currently
not enforced in any way, which might become a problem as private range
mappings are used more and more (e.g. from pKVM modules).
Enforce the 1GiB limit at allocation time, and while at it, rename
__io_map_base to __private_range_base for consistency.
Bug: 244543039
Change-Id: I32c9145ba331309b49428ff461a41c94ea0c1512
Signed-off-by: Quentin Perret <qperret@google.com>
Parse the devicetree during pKVM init to find nodes with the
"pkvm,protected-region" compatible string. These nodes specify a
physical address range in reg that must alway be mapped as invalid in
the host stage-2 page table when running under pKVM.
Example DT:
pkvm_prot_reg: pkvm_prot_reg@80000000 {
compatible = "pkvm,protected-region";
reg = <0x00 0x80000000 0x00 0x200000>;
};
Bug: 244543039
Bug: 244373730
Change-Id: I102cd16c91d96e5283cdd1a4fa58836cc4834eac
Signed-off-by: Quentin Perret <qperret@google.com>
The pKVM memory pool is currently sized to allow page-granularity
mapping in the host stage-2 page-table of all the memory as well as up
to 1GiB of MMIO range. Indeed, pKVM currently assumes that MMIO regions
are completely and solely owned by the host for the entire lifetime of
the system. As such, the pages used to map MMIO regions can always be
recycled to allow forward progress if the memory pool ran out of
pages -- pKVM can unmap MMIO ranges at stage-2 without fearing to loose
important information about the state of the underlying page, and those
mappings can always be reconstructed later.
In order to allow transitioning the ownership of non-memory regions,
introduce a concept of pkvm 'moveable' regions, which represents regions
of the physical address space which can be 'moved' from an ownership
perspective. These moveable regions are used to size the hyp memory
pool. In a first step, the list of moveable regions is equal to the
memblock list, but it will be extended in subsequent changes.
No functional changes intended.
Bug: 244543039
Bug: 244373730
Change-Id: I7f451924b1eed9579868e6ff8c7adc7b4a5a0ae1
Signed-off-by: Quentin Perret <qperret@google.com>
The host_get_page_state() logic has currently a baked in assumption that
it will only be used on memory, and checks against the default memory
permssions to flag pages as having a RESTRICTED_PROT state.
Add support for correctly flagging non-memory pages to prepare the
ground for future patches.
Bug: 244543039
Bug: 244373730
Change-Id: Idaaef96cb98c147c8b793059438064cf770af525
Signed-off-by: Quentin Perret <qperret@google.com>
pKVM uses different default permissions for memory and non-memory
regions of the PA space. To avoid scattering this logic around,
introduce a default_host_prot() helper function.
Non functional changes intended.
Bug: 244543039
Bug: 244373730
Change-Id: I36cdbb26a2cb0d54b5641f945f6ede4ffe371045
Signed-off-by: Quentin Perret <qperret@google.com>
pKVM modules may need to be notified in case of unexpected same-level
EL2 exceptions, which result in a hyp panic. To do so, introduce a new
notifier on the hyp_panic path.
Bug: 244373730
Change-Id: I144609a933d648ddf2aebcd950e64d6035bf8be3
Signed-off-by: Quentin Perret <qperret@google.com>
pKVM modules may need to temporarily map large-ish physically contiguous
regions of memory when bootstrapping themselves. In order to support
this use-case, introduce two new APIs in the module_ops struct allowing
to map and unmap pages in pKVM's linear map range. Since pKVM's page
ownership infrastructure relies on linear map PTEs, this needs to be
done with special care. To avoid any problem, let's count the number of
pages mapped by modules and unsure they have been unmapped before
reaching the point of deprivilege.
Bug: 244373730
Change-Id: I4aecb93f5c9ba08d9f830d1f0976704688b98509
Signed-off-by: Quentin Perret <qperret@google.com>
The rmap locks(i_mmap_rwsem and anon_vma->root->rwsem) could be contended
under memory pressure if processes keep working on their vmas(e.g., fork,
mmap, munmap). It makes reclaim path stuck. In our real workload traces,
we see kswapd is waiting the lock for 300ms+(worst case, a sec) and it
makes other processes entering direct reclaim, which were also stuck on
the lock.
This patch makes lru aging path try_lock mode like shink_page_list so the
reclaim context will keep working with next lru pages without being stuck.
if it found the rmap lock contended, it rotates the page back to head of
lru in both active/inactive lrus to make them consistent behavior, which
is basic starting point rather than adding more heristic.
Since this patch introduces a new "contended" field as out-param along
with try_lock in-param in rmap_walk_control, it's not immutable any longer
if the try_lock is set so remove const keywords on rmap related functions.
Since rmap walking is already expensive operation, I doubt the const
would help sizable benefit( And we didn't have it until 5.17).
In a heavy app workload in Android, trace shows following statistics. It
almost removes rmap lock contention from reclaim path.
Martin Liu reported:
Before:
max_dur(ms) min_dur(ms) max-min(dur)ms avg_dur(ms) sum_dur(ms) count blocked_function
1632 0 1631 151.542173 31672 209 page_lock_anon_vma_read
601 0 601 145.544681 28817 198 rmap_walk_file
After:
max_dur(ms) min_dur(ms) max-min(dur)ms avg_dur(ms) sum_dur(ms) count blocked_function
NaN NaN NaN NaN NaN 0.0 NaN
0 0 0 0.127645 1 12 rmap_walk_file
[minchan@kernel.org: add comment, per Matthew]
Link: https://lkml.kernel.org/r/YnNqeB5tUf6LZ57b@google.com
Link: https://lkml.kernel.org/r/20220510215423.164547-1-minchan@kernel.org
Signed-off-by: Minchan Kim <minchan@kernel.org>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: John Dias <joaodias@google.com>
Cc: Tim Murray <timmurray@google.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Vladimir Davydov <vdavydov.dev@gmail.com>
Cc: Martin Liu <liumartin@google.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Matthew Wilcox <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Conflicts:
folio->page
Conflicts:
mm/huge_memory.c was refactored by
commit c5b5a3dd2c mm: thp: refactor NUMA fault handling
(cherry picked from commit 6d4675e601)
Bug: 239681156
Signed-off-by: Minchan Kim <minchan@google.com>
Change-Id: I0c63e0291120c8a1b5f2d83b8a7b210cb56c27a2
Signed-off-by: Richard Chang <richardycc@google.com>
(cherry picked from commit b8762fa265)
If page_pinner is configured with page_pinner_enabled=false and
failure_tracking=true, pp_buffer will be accessed without being
initialized. Prevent this by adding page_pinner_inited checks in
functions that access it.
Fixes: 898cfbf094a2 ("ANDROID: mm: introduce page_pinner")
Bug: 259024332
Bug: 260179017
Change-Id: I8f612cae3e74d36e8a4eee5edec25281246cbe5e
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
Signed-off-by: Richard Chang <richardycc@google.com>
(cherry picked from commit 23fb3111f63e5fe239a769668275c20493a5849c)
The below is one path where race between page_ext and offline of the
respective memory blocks will cause use-after-free on the access of
page_ext structure.
process1 process2
--------- ---------
a)doing /proc/page_owner doing memory offline
through offline_pages.
b)PageBuddy check is failed
thus proceed to get the
page_owner information
through page_ext access.
page_ext = lookup_page_ext(page);
migrate_pages();
.................
Since all pages are successfully
migrated as part of the offline
operation,send MEM_OFFLINE notification
where for page_ext it calls:
offline_page_ext()-->
__free_page_ext()-->
free_page_ext()-->
vfree(ms->page_ext)
mem_section->page_ext = NULL
c) Check for the PAGE_EXT flags
in the page_ext->flags access
results into the use-after-free(leading
to the translation faults).
As mentioned above, there is really no synchronization between page_ext
access and its freeing in the memory_offline.
The memory offline steps(roughly) on a memory block is as below:
1) Isolate all the pages
2) while(1)
try free the pages to buddy.(->free_list[MIGRATE_ISOLATE])
3) delete the pages from this buddy list.
4) Then free page_ext.(Note: The struct page is still alive as it is
freed only during hot remove of the memory which frees the memmap, which
steps the user might not perform).
This design leads to the state where struct page is alive but the struct
page_ext is freed, where the later is ideally part of the former which
just representing the page_flags (check [3] for why this design is
chosen).
The above mentioned race is just one example __but the problem persists
in the other paths too involving page_ext->flags access(eg:
page_is_idle())__.
Fix all the paths where offline races with page_ext access by
maintaining synchronization with rcu lock and is achieved in 3 steps:
1) Invalidate all the page_ext's of the sections of a memory block by
storing a flag in the LSB of mem_section->page_ext.
2) Wait till all the existing readers to finish working with the
->page_ext's with synchronize_rcu(). Any parallel process that starts
after this call will not get page_ext, through lookup_page_ext(), for
the block parallel offline operation is being performed.
3) Now safely free all sections ->page_ext's of the block on which
offline operation is being performed.
Note: If synchronize_rcu() takes time then optimizations can be done in
this path through call_rcu()[2].
Thanks to David Hildenbrand for his views/suggestions on the initial
discussion[1] and Pavan kondeti for various inputs on this patch.
[1] https://lore.kernel.org/linux-mm/59edde13-4167-8550-86f0-11fc67882107@quicinc.com/
[2] https://lore.kernel.org/all/a26ce299-aed1-b8ad-711e-a49e82bdd180@quicinc.com/T/#u
[3] https://lore.kernel.org/all/6fa6b7aa-731e-891c-3efb-a03d6a700efa@redhat.com/
Bug: 236222283
Bug: 240196534
Link: https://lore.kernel.org/all/1661496993-11473-1-git-send-email-quic_charante@quicinc.com/
Change-Id: Ib439ae19c61a557a5c70ea90e3c4b35a5583ba0d
Suggested-by: David Hildenbrand <david@redhat.com>
Suggested-by: Michal Hocko <mhocko@suse.com>
Signed-off-by: Charan Teja Kalla <quic_charante@quicinc.com>
Signed-off-by: Minchan Kim <minchan@google.com>
(fixed merge conflicts and still exported lookup_page_ext)
(minchan: fixed page_pinner with new page_ext scheme)
For CMA allocation, it's really critical to migrate a page but
sometimes it fails. One of the reasons is some driver holds a
page refcount for a long time so VM couldn't migrate the page
at that time.
The concern here is there is no way to find the who hold the
refcount of the page effectively. This patch introduces feature
to keep tracking page's pinner. All get_page sites are vulnerable
to pin a page for a long time but the cost to keep track it would
be significat since get_page is the most frequent kernel operation.
Furthermore, the page could be not user page but kernel page which
is not related to the page migration failure.
Thus, this patch keeps tracks of only migration failed pages to
reduce runtime cost. Once page migration fails in CMA allocation
path, those pages are marked as "migration failure" and every
put_page operation against those pages, callstack of the put
are recorded into page_pinner buffer. Later, admin can see
what pages were failed and who released the refcount since the
failure. It really helps effectively to find out longtime refcount
holder to prevent the page migration.
note: page_pinner doesn't guarantee attributing/unattributing are
atomic if they happen at the same time. It's just best effort so
false-positive could happen.
Bug: 183414571
BUg: 240196534
Signed-off-by: Minchan Kim <minchan@kernel.org>
Signed-off-by: Minchan Kim <minchan@google.com>
Change-Id: I603d0c0122734c377db6b1eb95848a6f734173a0
(cherry picked from commit 898cfbf094a2fc13c67fab5b5d3c916f0139833a)
The hardware-wrapped key support in this branch is based on my patch
"[RFC PATCH v3 3/3] fscrypt: add support for hardware-wrapped keys"
(https://lore.kernel.org/r/20211021181608.54127-4-ebiggers@kernel.org)
I've since made several updates to that patch and it is now at v7.
This commit brings in the updates from v3 to v7, to the extent possible
while retaining compatibility with the UAPI and on-disk format used for
this feature in Android. This mainly includes some improved log
messages, and compatibility with the blk-crypto updates.
Bug: 160883801
Link: https://lore.kernel.org/all/20221216203636.81491-5-ebiggers@kernel.org
Change-Id: I1c43ca55ec7e95dd06f8f7944100ffd14771d3a7
Signed-off-by: Eric Biggers <ebiggers@google.com>
Update this code to be compatible with the updated version of
"block: add basic hardware-wrapped key support".
Bug: 160883801
Change-Id: Ic6991ad163035870ace3cd468f53b21a824c5359
Signed-off-by: Eric Biggers <ebiggers@google.com>
The hardware-wrapped key support in this branch is based on my patch
"[RFC PATCH v3 1/3] block: add basic hardware-wrapped key support"
(https://lore.kernel.org/all/20211021181608.54127-2-ebiggers@kernel.org).
I've since made several updates to that patch and it is now at v7.
This commit brings in the updates from v3 to v7. The main change is
making blk_crypto_derive_sw_secret() operate on a struct block_device,
and adding blk_crypto_hw_wrapped_keys_compatible(). This aligns with
changes upstream in v6.1 and v6.2 that removed block-layer internal
structures from the API that blk-crypto exposes to upper layers.
There's also a slight change in prototype for ->derive_sw_secret, so a
couple out-of-tree drivers will need to be updated, but people
maintaining out-of-tree drivers know what they are dealing with anyway.
Bug: 160883801
Link: https://lore.kernel.org/r/20221216203636.81491-2-ebiggers@kernel.org
Change-Id: I0f285c11c2764064cd4a9d6eac0089099a9601ed
Signed-off-by: Eric Biggers <ebiggers@google.com>
The prototypes of blk_crypto_evict_key() and
blk_crypto_start_using_key() changed, so update the callers in
dm-default-key which is not upstream.
Bug: 160885805
Change-Id: Ie39a298d8aca77c042f11bbfa25fd9bf50593c52
Signed-off-by: Eric Biggers <ebiggers@google.com>
blk_crypto_get_keyslot, blk_crypto_put_keyslot, __blk_crypto_evict_key
and __blk_crypto_cfg_supported are only used internally by the
blk-crypto code, so move the out of blk-crypto-profile.h, which is
included by drivers that supply blk-crypto functionality.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Eric Biggers <ebiggers@google.com>
Link: https://lore.kernel.org/r/20221114042944.1009870-4-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
(cherry picked from commit 3569788c08)
Change-Id: I80b07a1c3b6e6f41ffe48adbdb27a3ca4480ff75
Signed-off-by: Eric Biggers <ebiggers@google.com>
Add a blk_crypto_config_supported_natively helper that wraps
__blk_crypto_cfg_supported to retrieve the crypto_profile from the
request queue. With this fscrypt can stop including
blk-crypto-profile.h and rely on the public consumer interface in
blk-crypto.h.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Eric Biggers <ebiggers@google.com>
Link: https://lore.kernel.org/r/20221114042944.1009870-3-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
(cherry picked from commit 6715c98b6c)
(resolved conflicts in blk_crypto_config_supported() and
__blk_crypto_bio_prep())
Change-Id: I40c4ab6bd9a108661c40c837227b6aed64685ae7
Signed-off-by: Eric Biggers <ebiggers@google.com>
Switch all public blk-crypto interfaces to use struct block_device
arguments to specify the device they operate on instead of th
request_queue, which is a block layer implementation detail.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Eric Biggers <ebiggers@google.com>
Link: https://lore.kernel.org/r/20221114042944.1009870-2-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
(cherry picked from commit fce3caea0f)
(resolved conflict in blk_crypto_config_supported())
Change-Id: Ifde7cf1c8a2a5ddfb2fde4e5fb118269a3bfcdb0
Signed-off-by: Eric Biggers <ebiggers@google.com>
After recent fixes [1], speculative page fault walks are performed with
disabled interrupts, therefore do not depend on ALLOC_SPLIT_PTLOCKS
which would affect them if performed under RCU protection. Remove
unnecessary config dependency.
[1] 5fcb50b0559a ("ANDROID: mm: fix speculative walk which is unsafe under RCU")
Bug: 253557903
Change-Id: Ia1c835c7b08419f8fce61fa4f7e6842fbf786229
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
Since Android has pcp list for MIGRATE_CMA[1], it could cause
CMA allocation latency due to not freeing the MIGRATE_ISOLATE
page immediately.
Originally, MIGRATE_ISOLATED page is supposed to go buddy list
with skipping pcp list. Otherwise, the page could be reallocated
from pcp list or staying on the pcp list until the pcp is drained
so that CMA keeps retrying since it couldn't find the freed page
from buddy list. That worked before since the CMA pfnblocks changed
only from MIGRATE_CMA to MIGRATE_ISOLATE and free function logic
in page allocator has checked MIGRATE_ISOLATEness on every CMA
pages using below.
free_unref_page_commit
if (migratetype >= MIGRATE_PCPTYPES)
if(is_migrate_isolate(migratetype))
free_one_page(page);
It worked since enum MIGRATE_CMA was bigger than enum
MIGRATE_PCPTYPES but since [1], the enum MIGRATE_CMA is less than
MIGRATE_PCPTYPES so the logic above doesn't work any more.
It could cause following race
CPU 0 CPU 1
free_unref_page
migratetype = get_pfnblock_migratetype()
set_pcppage_migratetype(MIGRATE_CMA)
cma_alloc
alloc_contig_range
set_migrate_isolate(MIGRATE_ISOLATE)
add the page into pcp list
the page could be reallocated
This patch couldn't fix the race completely due to missing zone->lock
in order-0 page free(for performance reason). However, it's not a new
problem so we need to deal with the issue separately.
[1] ANDROID: mm: add cma pcp list
Bug: 218731671
Signed-off-by: Minchan Kim <minchan@google.com>
Change-Id: Ibea20085ce5bfb4b74b83b041f9bda9a380120f9
Signed-off-by: Richard Chang <richardycc@google.com>
(cherry picked from commit d9e4b67784)
build.config.gki sources a GKI_BUILD_CONFIG_FRAGMENT before all of
the variables that are considered as part of a GKI kernel build are
declared. This reduces the effectiveness of a
GKI_BUILD_CONFIG_FRAGMENT, as it is only able to modify a subset of
the build variables.
Thus, move the logic to source GKI_BUILD_CONFIG_FRAGMENT to the end
of the GKI build config files to provide more flexibility for a
GKI_BUILD_CONFIG_FRAGMENT.
Bug: 262930113
Change-Id: I74abb45f9043acce04cb0052f54fded4340a9366
[isaacmanjarres: Modified build.config.gki.aarch64.fips140, which
did not exist on android13-5.15.]
Signed-off-by: Isaac J. Manjarres <isaacmanjarres@google.com>
(cherry picked from commit 69fefbb3db711e543ff0676526b7d285a4d10a14)
Introduce a new default trap handler for the host that can be set
from modules.
Bug: 244543039
Bug: 245034629
Change-Id: Iaabfa44f5f2c41af51f36ed4eec8762e7c951c01
Signed-off-by: Quentin Perret <qperret@google.com>
Introduce a notifier allowing a pKVM module to be notified for major
PSCI events: {CPU,SYSTEM}_SUSPEND, as well as on the resume path.
Bug: 244543039
Bug: 245034629
Change-Id: Ia82923445214925fc77e321457c8eab31f9d42e8
Signed-off-by: Quentin Perret <qperret@google.com>
Introduce a new handler allowing to notify pKVM modules when pKVM
detects an illegal access from the host.
Bug: 244543039
Bug: 245034629
Change-Id: I62133a8d967d91437e5216b307e449f8c83dfab6
Signed-off-by: Quentin Perret <qperret@google.com>
Introduce a new default SMC handler for the host that can be set from
modules.
Bug: 244543039
Bug: 245034629
Change-Id: I8481bfb1926a3cb433b15de5c1a99e3550710689
Signed-off-by: Quentin Perret <qperret@google.com>
Instead of looking up the algorithm by name in hash_algo_name[] to get
its hash_algo ID, just store the hash_algo ID in the fsverity_hash_alg
struct. Verify at boot time that every fsverity_hash_alg has a valid
hash_algo ID with matching digest size.
Remove an unnecessary memset() of the whole digest array to 0 before the
digest is copied into it.
Finally, remove the pr_debug statement. There is already a pr_debug for
the fsverity digest when the file is opened.
Signed-off-by: Eric Biggers <ebiggers@google.com>
Reviewed-by: Mimi Zohar <zohar@linux.ibm.com>
Link: https://lore.kernel.org/r/20221129045139.69803-1-ebiggers@kernel.org
As a step towards freeing the PG_error flag for other uses, change ext4
and f2fs to stop using PG_error to track verity errors. Instead, if a
verity error occurs, just mark the whole bio as failed. The coarser
granularity isn't really a problem since it isn't any worse than what
the block layer provides, and errors from a multi-page readahead aren't
reported to applications unless a single-page read fails too.
f2fs supports compression, which makes the f2fs changes a bit more
complicated than desired, but the basic premise still works.
Note: there are still a few uses of PageError in f2fs, but they are on
the write path, so they are unrelated and this patch doesn't touch them.
Reviewed-by: Chao Yu <chao@kernel.org>
Acked-by: Jaegeuk Kim <jaegeuk@kernel.org>
Signed-off-by: Eric Biggers <ebiggers@google.com>
Link: https://lore.kernel.org/r/20221129070401.156114-1-ebiggers@kernel.org
__kunmap_ {local,atomic}() currently take pointers to void. However, this
is semantically incorrect, since these functions do not change the memory
their arguments point to.
Therefore, make this semantics explicit by modifying the
__kunmap_{local,atomic}() prototypes to take pointers to const void.
As a side effect, compilers may produce more efficient code.
Acked-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Helge Deller <deller@gmx.de> # parisc
Suggested-by: David Sterba <dsterba@suse.cz>
Suggested-by: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Fabio M. De Francesco <fmdefrancesco@gmail.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>