Commit Graph

1153836 Commits

Author SHA1 Message Date
Jiri Kosina
91e9b02185 Merge branch 'for-6.2/hyperv' into for-linus
- functionally equivalent code cleanups for hyperv driver (Paulo Miguel Almeida)
2022-12-13 14:32:07 +01:00
Jiri Kosina
8d437f11ee Merge branch 'for-6.2/ft260' into for-linus
- fixes and performance improvements to the hid-ft260 driver (Michael Zaidman)
2022-12-13 14:30:33 +01:00
Masahiro Yamada
875ef1a57f kbuild: use .NOTINTERMEDIATE for future GNU Make versions
In Kbuild, some files are generated by chains of pattern/implicit rules.
For example, *.dtb.o files in drivers/of/unittest-data/Makefile are
generated by the chain of 3 pattern rules, like this:

  %.dts  ->  %.dtb  ->  %.dtb.S  ->  %.dtb.o

Here, %.dts is the real source, %.dtb.o is the final target.
%.dtb and %.dtb.S are called "intermediate files".

As GNU Make manual [1] says, intermediate files are treated differently
in two ways:

 (a) The first difference is what happens if the intermediate file does
   not exist. If an ordinary file 'b' does not exist, and make considers
   a target that depends on 'b', it invariably creates 'b' and then
   updates the target from 'b'. But if 'b' is an intermediate file, then
   make can leave well enough alone: it won't create 'b' unless one of
   its prerequisites is out of date. This means the target depending
   on 'b' won't be rebuilt either, unless there is some other reason
   to update that target: for example the target doesn't exist or a
   different prerequisite is newer than the target.

 (b) The second difference is that if make does create 'b' in order to
   update something else, it deletes 'b' later on after it is no longer
   needed. Therefore, an intermediate file which did not exist before
   make also does not exist after make. make reports the deletion to
   you by printing a 'rm' command showing which file it is deleting.

The combination of these is problematic for Kbuild because most of the
build rules depend on FORCE and the if_changed* macros really determine
if the target should be updated. So, all missing files, whether they
are intermediate or not, are always rebuilt.

To see the problem, delete ".SECONDARY:" from scripts/Kbuild.include,
and repeat this command:

  $ make allmodconfig drivers/of/unittest-data/

The intermediate files will be deleted, which results in rebuilding
intermediate and final objects in the next run of make.

In the old days, people suppressed (b) in inconsistent ways.
As commit 54a702f705 ("kbuild: mark $(targets) as .SECONDARY and
remove .PRECIOUS markers") noted, you should not use .PRECIOUS because
.PRECIOUS has the following behavior (c), which is not likely what you
want.

 (c) If make is killed or interrupted during the execution of their
   recipes, the target is not deleted. Also, the target is not deleted
   on error even if .DELETE_ON_ERROR is specified.

.SECONDARY is a much better way to disable (b), but a small problem
is that .SECONDARY enables (a), which gives a side-effect to $?;
prerequisites marked as .SECONDARY do not appear in $?. This is a
drawback for Kbuild.

I thought it was a bug and opened a bug report. As Paul, the GNU Make
maintainer, concluded in [2], this is not a bug.

A good news is that, GNU Make 4.4 added the perfect solution,
.NOTINTERMEDIATE, which cancels both (a) and (b).

For clarificaton, my understanding of .INTERMEDIATE, .SECONDARY,
.PRECIOUS and .NOTINTERMEDIATE are as follows:

                        (a)         (b)         (c)
  .INTERMEDIATE        enable      enable      disable
  .SECONDARY           enable      disable     disable
  .PRECIOUS            disable     disable     enable
  .NOTINTERMEDIATE     disable     disable     disable

However, GNU Make 4.4 has a bug for the global .NOTINTERMEDIATE. [3]
It was fixed by commit 6164608900ad ("[SV 63417] Ensure global
.NOTINTERMEDIATE disables all intermediates"), and will be available
in the next release of GNU Make.

The following is the gain for .NOTINTERMEDIATE:

  [Current Make]

      $ make allnoconfig vmlinux
          [ full build ]
      $ rm include/linux/device.h
      $ make vmlinux
        CALL    scripts/checksyscalls.sh

  Make does not notice the removal of <linux/device.h>.

  [Future Make]

      $ make-latest allnoconfig vmlinux
          [ full build ]
      $ rm include/linux/device.h
      $ make-latest vmlinux
        CC      arch/x86/kernel/asm-offsets.s
      In file included from ./include/linux/writeback.h:13,
                       from ./include/linux/memcontrol.h:22,
                       from ./include/linux/swap.h:9,
                       from ./include/linux/suspend.h:5,
                       from arch/x86/kernel/asm-offsets.c:13:
      ./include/linux/blk_types.h:11:10: fatal error: linux/device.h: No such file or directory
         11 | #include <linux/device.h>
            |          ^~~~~~~~~~~~~~~~
      compilation terminated.
      make-latest[1]: *** [scripts/Makefile.build:114: arch/x86/kernel/asm-offsets.s] Error 1
      make-latest: *** [Makefile:1282: prepare0] Error 2

  Make notices the removal of <linux/device.h>, and rebuilds objects
  that depended on <linux/device.h>. There exists a source file that
  includes <linux/device.h>, and it raises an error.

To see detailed background information, refer to commit 2d3b1b8f0d
("kbuild: drop $(wildcard $^) check in if_changed* for faster rebuild").

[1]: https://www.gnu.org/software/make/manual/make.html#Chained-Rules
[2]: https://savannah.gnu.org/bugs/?55532
[3]: https://savannah.gnu.org/bugs/?63417

Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
2022-12-13 22:29:10 +09:00
Masahiro Yamada
3122c84409 kconfig: refactor Makefile to reduce process forks
Refactor Makefile and use read-file macro. For Make >= 4.2, it can read
out a file by using the built-in function.

Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Reviewed-by: Nicolas Schier <nicolas@fjasle.eu>
2022-12-13 22:29:10 +09:00
Masahiro Yamada
6768fa4bcb kbuild: add read-file macro
Since GNU Make 4.2, $(file ...) supports the read operater '<', which
is useful to read a file without forking a new process. No warning is
shown even if the input file is missing.

For older Make versions, it falls back to the cat command.

Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Reviewed-by: Nicolas Schier <nicolas@fjasle.eu>
Reviewed-by: Alexander Lobakin <alexandr.lobakin@intel.com>
Tested-by: Alexander Lobakin <alexandr.lobakin@intel.com>
2022-12-13 22:29:10 +09:00
Masahiro Yamada
a5db80c65d kbuild: do not sort after reading modules.order
modules.order lists modules in the deterministic order (that is why
"modules order"), and there is no duplication in the list.

$(sort ) is pointless.

Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Reviewed-by: Nicolas Schier <nicolas@fjasle.eu>
2022-12-13 22:28:58 +09:00
Jiri Kosina
ab970ae1d6 Merge branch 'for-6.2/default-remove-cleanup' into for-linus
- removal of superfluous hid_hw_stop() calls for drivers with default
  .remove callback (Marcus Folkesson)
2022-12-13 14:28:47 +01:00
Jiri Kosina
cfd1f6c16f Merge branch 'for-6.2/apple' into for-linus
- new quirks for select Apple keyboards (Kerem Karabay, Aditya Garg)
2022-12-13 14:27:16 +01:00
Masahiro Yamada
fccb3d3eda kbuild: add test-{ge,gt,le,lt} macros
GNU Make 4.4 introduced $(intcmp ...), which is useful to compare two
integers without forking a new process.

Add test-{ge,gt,le,lt} macros, which work more efficiently with GNU
Make >= 4.4. For older Make versions, they fall back to the 'test'
shell command.

The first two parameters to $(intcmp ...) must not be empty. To avoid
the syntax error, I appended '0' to them. Fortunately, '00' is treated
as '0'. This is needed because CONFIG options may expand to an empty
string when the kernel configuration is not included.

Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Acked-by: Palmer Dabbelt <palmer@rivosinc.com> # RISC-V
Reviewed-by: Nathan Chancellor <nathan@kernel.org>
Reviewed-by: Nicolas Schier <nicolas@fjasle.eu>
2022-12-13 22:21:14 +09:00
Masahiro Yamada
e441273947 Documentation: raise minimum supported version of binutils to 2.25
Binutils 2.23 was released in 2012. Almost 10 years old.

We already require GCC 5.1, released in 2015.

Bump the binutils version to 2.25, which was released some months
before GCC 5.1.

With this applied, some subsystems can start to clean up code.
Examples:
  arch/arm/Kconfig.assembler
  arch/mips/vdso/Kconfig
  arch/powerpc/Makefile
  arch/x86/Kconfig.assembler

Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
2022-12-13 22:21:14 +09:00
Christian Brauner
2c05bf3aa0 mnt_idmapping: move ima-only helpers to ima
The vfs{g,u}id_{gt,lt}_* helpers are currently not needed outside of
ima and we shouldn't incentivize people to use them by placing them into
the header. Let's just define them locally in the one file in ima where
they are used.

Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org>
2022-12-13 12:28:51 +01:00
Sriram Yagnaraman
f9645abe42 netfilter: conntrack: document sctp timeouts
Exposed through sysctl, update documentation to describe sctp states and
their default timeouts.

Signed-off-by: Sriram Yagnaraman <sriram.yagnaraman@est.tech>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2022-12-13 12:25:45 +01:00
Li Qiong
ba57ee0944 ipvs: add a 'default' case in do_ip_vs_set_ctl()
It is better to return the default switch case with
'-EINVAL', in case new commands are added. otherwise,
return a uninitialized value of ret.

Signed-off-by: Li Qiong <liqiong@nfschina.com>
Reviewed-by: Simon Horman <horms@verge.net.au>
Acked-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2022-12-13 12:23:18 +01:00
Huacai Chen
1a34e7f2fc Merge tags 'acpi-6.2-rc1' and 'irq-core-2022-12-10' into loongarch-next
LoongArch architecture changes for 6.2 depend on the acpi and irqchip
changes to work, so merge them to create a base.
2022-12-13 19:19:41 +08:00
Yuezhang Mo
36955d368d exfat: reuse exfat_find_location() to simplify exfat_get_dentry_set()
In exfat_get_dentry_set(), part of the code is the same as
exfat_find_location(), reuse exfat_find_location() to simplify
exfat_get_dentry_set().

Code refinement, no functional changes.

Signed-off-by: Yuezhang Mo <Yuezhang.Mo@sony.com>
Reviewed-by: Andy Wu <Andy.Wu@sony.com>
Reviewed-by: Aoyama Wataru <wataru.aoyama@sony.com>
Reviewed-by: Sungjong Seo <sj1557.seo@samsung.com>
Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
2022-12-13 20:17:16 +09:00
Yuezhang Mo
40306b4d1b exfat: fix overflow in sector and cluster conversion
According to the exFAT specification, there are at most 2^32-11
clusters in a volume. so using 'int' is not enough for cluster
index, the return value type of exfat_sector_to_cluster() should
be 'unsigned int'.

Signed-off-by: Yuezhang Mo <Yuezhang.Mo@sony.com>
Reviewed-by: Sungjong Seo <sj1557.seo@samsung.com>
Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
2022-12-13 20:17:11 +09:00
Jakub Kicinski
7c4a6309e2 ipvs: fix type warning in do_div() on 32 bit
32 bit platforms without 64bit div generate the following warning:

net/netfilter/ipvs/ip_vs_est.c: In function 'ip_vs_est_calc_limits':
include/asm-generic/div64.h:222:35: warning: comparison of distinct pointer types lacks a cast
  222 |         (void)(((typeof((n)) *)0) == ((uint64_t *)0));  \
      |                                   ^~
net/netfilter/ipvs/ip_vs_est.c:694:17: note: in expansion of macro 'do_div'
  694 |                 do_div(val, loops);
      |                 ^~~~~~
include/asm-generic/div64.h:222:35: warning: comparison of distinct pointer types lacks a cast
  222 |         (void)(((typeof((n)) *)0) == ((uint64_t *)0));  \
      |                                   ^~
net/netfilter/ipvs/ip_vs_est.c:700:33: note: in expansion of macro 'do_div'
  700 |                                 do_div(val, min_est);
      |                                 ^~~~~~

first argument of do_div() should be unsigned. We can't just cast
as do_div() updates it as well, so we need an lval.
Make val unsigned in the first place, all paths check that the value
they assign to this variables are non-negative already.

Fixes: 705dd34440 ("ipvs: use kthreads for stats estimation")
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Link: https://lore.kernel.org/r/20221213032037.844517-1-kuba@kernel.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-12-13 10:44:02 +01:00
Paolo Abeni
b11919e1bb Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Merge in the left-over fixes before the net-next pull-request.

net/mptcp/subflow.c
  d3295fee3c ("mptcp: use proper req destructor for IPv6")
  36b122baf6 ("mptcp: add subflow_v(4,6)_send_synack()")

Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-12-13 09:49:29 +01:00
Christian König
58377de46e drm/i915: stop using ttm_bo_wait
TTM is just wrapping core DMA functionality here, remove the mid-layer.
No functional change.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221125102137.1801-7-christian.koenig@amd.com
2022-12-13 09:46:02 +01:00
Alexandre Ghiti
71fc3621ef riscv: Fix P4D_SHIFT definition for 3-level page table mode
RISC-V kernels support 3,4,5-level page tables at runtime by folding
upper levels.

In case of a 3-level page table, PGDIR is folded into P4D which in turn
is folded into PUD: PGDIR_SHIFT value is correctly set to the same value
as PUD_SHIFT, but P4D_SHIFT is not, then any use of P4D_SHIFT will access
invalid address bits (all set to 1).

Fix this by dynamically defining P4D_SHIFT value, like we already do for
PGDIR_SHIFT.

Fixes: d10efa21a9 ("riscv: mm: Control p4d's folding by pgtable_l5_enabled")
Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Reviewed-by: Palmer Dabbelt <palmer@rivosinc.com>
Link: https://lore.kernel.org/r/20221201135128.1482189-2-alexghiti@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-12-12 22:38:18 -08:00
Andrew Jones
e923f4625e riscv: Apply a static assert to riscv_isa_ext_id
Add a static assert to ensure a RISCV_ISA_EXT_* enum is never
created with a value >= RISCV_ISA_EXT_MAX. We can do this by
putting RISCV_ISA_EXT_ID_MAX to more work. Before it was
redundant with RISCV_ISA_EXT_MAX and hence only used to
document the limit. Now it grows with the enum and is used to
check the limit.

Signed-off-by: Andrew Jones <ajones@ventanamicro.com>
Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
Link: https://lore.kernel.org/r/20221201113750.18021-1-ajones@ventanamicro.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-12-12 22:21:25 -08:00
Linus Torvalds
764822972d Merge tag 'nfsd-6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux
Pull nfsd updates from Chuck Lever:
 "This release introduces support for the CB_RECALL_ANY operation. NFSD
  can send this operation to request that clients return any delegations
  they choose. The server uses this operation to handle low memory
  scenarios or indicate to a client when that client has reached the
  maximum number of delegations the server supports.

  The NFSv4.2 READ_PLUS operation has been simplified temporarily whilst
  support for sparse files in local filesystems and the VFS is improved.

  Two major data structure fixes appear in this release:

   - The nfs4_file hash table is replaced with a resizable hash table to
     reduce the latency of NFSv4 OPEN operations.

   - Reference counting in the NFSD filecache has been hardened against
     races.

  In furtherance of removing support for NFSv2 in a subsequent kernel
  release, a new Kconfig option enables server-side support for NFSv2 to
  be left out of a kernel build.

  MAINTAINERS has been updated to indicate that changes to fs/exportfs
  should go through the NFSD tree"

* tag 'nfsd-6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux: (49 commits)
  NFSD: Avoid clashing function prototypes
  SUNRPC: Fix crasher in unwrap_integ_data()
  SUNRPC: Make the svc_authenticate tracepoint conditional
  NFSD: Use only RQ_DROPME to signal the need to drop a reply
  SUNRPC: Clean up xdr_write_pages()
  SUNRPC: Don't leak netobj memory when gss_read_proxy_verf() fails
  NFSD: add CB_RECALL_ANY tracepoints
  NFSD: add delegation reaper to react to low memory condition
  NFSD: add support for sending CB_RECALL_ANY
  NFSD: refactoring courtesy_client_reaper to a generic low memory shrinker
  trace: Relocate event helper files
  NFSD: pass range end to vfs_fsync_range() instead of count
  lockd: fix file selection in nlmsvc_cancel_blocked
  lockd: ensure we use the correct file descriptor when unlocking
  lockd: set missing fl_flags field when retrieving args
  NFSD: Use struct_size() helper in alloc_session()
  nfsd: return error if nfs4_setacl fails
  lockd: set other missing fields when unlocking files
  NFSD: Add an nfsd_file_fsync tracepoint
  sunrpc: svc: Remove an unused static function svc_ungetu32()
  ...
2022-12-12 20:54:39 -08:00
Linus Torvalds
149c51f876 Merge tag 'for-6.2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux
Pull btrfs updates from David Sterba:
 "This round there are a lot of cleanups and moved code so the diffstat
  looks huge, otherwise there are some nice performance improvements and
  an update to raid56 reliability.

  User visible features:

   - raid56 reliability vs performance trade off:
      - fix destructive RMW for raid5 data (raid6 still needs work): do
        full checksum verification for all data during RMW cycle, this
        should prevent rewriting potentially corrupted data without
        notice
      - stripes are cached in memory which should reduce the performance
        impact but still can hurt some workloads
      - checksums are verified after repair again
      - this is the last option without introducing additional features
        (write intent bitmap, journal, another tree), the extra checksum
        read/verification was supposed to be avoided by the original
        implementation exactly for performance reasons but that caused
        all the reliability problems

   - discard=async by default for devices that support it

   - implement emergency flush reserve to avoid almost all unnecessary
     transaction aborts due to ENOSPC in cases where there are too many
     delayed refs or delayed allocation

   - skip block group synchronization if there's no change in used
     bytes, can reduce transaction commit count for some workloads

  Performance improvements:

   - fiemap and lseek:
      - overall speedup due to skipping unnecessary or duplicate
        searches (-40% run time)
      - cache some data structures and sharedness of extents (-30% run
        time)

   - send:
      - faster backref resolution when finding clones
      - cached leaf to root mapping for faster backref walking
      - improved clone/sharing detection
      - overall run time improvements (-70%)

  Core:

   - module initialization converted to a table of function pointers run
     in a sequence

   - preparation for fscrypt, extend passing file names across calls,
     dir item can store encryption status

   - raid56 updates:
      - more accurate error tracking of sectors within stripe
      - simplify recovery path and remove dedicated endio worker kthread
      - simplify scrub call paths
      - refactoring to support the extra data checksum verification
        during RMW cycle

   - tree block parentness checks consolidated and done at metadata read
     time

   - improved error handling

   - cleanups:
      - move a lot of code for better synchronization between kernel and
        user space sources, split big files
      - enum cleanups
      - GFP flag cleanups
      - header file cleanups, prototypes, dependencies
      - redundant parameter cleanups
      - inline extent handling simplifications
      - inode parameter conversion
      - data structure cleanups, reductions, renames, merges"

* tag 'for-6.2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux: (249 commits)
  btrfs: print transaction aborted messages with an error level
  btrfs: sync some cleanups from progs into uapi/btrfs.h
  btrfs: do not BUG_ON() on ENOMEM when dropping extent items for a range
  btrfs: fix extent map use-after-free when handling missing device in read_one_chunk
  btrfs: remove outdated logic from overwrite_item() and add assertion
  btrfs: unify overwrite_item() and do_overwrite_item()
  btrfs: replace strncpy() with strscpy()
  btrfs: fix uninitialized variable in find_first_clear_extent_bit
  btrfs: fix uninitialized parent in insert_state
  btrfs: add might_sleep() annotations
  btrfs: add stack helpers for a few btrfs items
  btrfs: add nr_global_roots to the super block definition
  btrfs: remove BTRFS_LEAF_DATA_OFFSET
  btrfs: add helpers for manipulating leaf items and data
  btrfs: add eb to btrfs_node_key_ptr_offset
  btrfs: pass the extent buffer for the btrfs_item_nr helpers
  btrfs: move the csum helpers into ctree.h
  btrfs: move eb offset helpers into extent_io.h
  btrfs: move file_extent_item helpers into file-item.h
  btrfs: move leaf_data_end into ctree.c
  ...
2022-12-12 20:47:51 -08:00
Linus Torvalds
97971df811 Merge tag 'dlm-6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/teigland/linux-dlm
Pull dlm updates from David Teigland:
 "These patches include the usual cleanups and minor fixes, the removal
  of code that is no longer needed due to recent improvements, and
  improvements to processing large volumes of messages during heavy
  locking activity.

  Summary:

   - Misc code cleanup

   - Fix a couple of socket handling bugs: a double release on an error
     path and a data-ready race in an accept loop

   - Remove code for resending dir-remove messages. This code is no
     longer needed since the midcomms layer now ensures the messages are
     resent if needed

   - Add tracepoints for dlm messages

   - Improve callback queueing by replacing the fixed array with a list

   - Simplify the handling of a remove message followed by a lookup
     message by sending both without releasing a spinlock in between

   - Improve the concurrency of sending and receiving messages by
     holding locks for a shorter time, and changing how workqueues are
     used

   - Remove old code for shutting down sockets, which is no longer
     needed with the reliable connection handling that was recently
     added"

* tag 'dlm-6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/teigland/linux-dlm: (37 commits)
  fs: dlm: fix building without lockdep
  fs: dlm: parallelize lowcomms socket handling
  fs: dlm: don't init error value
  fs: dlm: use saved sk_error_report()
  fs: dlm: use sock2con without checking null
  fs: dlm: remove dlm_node_addrs lookup list
  fs: dlm: don't put dlm_local_addrs on heap
  fs: dlm: cleanup listen sock handling
  fs: dlm: remove socket shutdown handling
  fs: dlm: use listen sock as dlm running indicator
  fs: dlm: use list_first_entry_or_null
  fs: dlm: remove twice INIT_WORK
  fs: dlm: add midcomms init/start functions
  fs: dlm: add dst nodeid for msg tracing
  fs: dlm: rename seq to h_seq for msg tracing
  fs: dlm: rename DLM_IFL_NEED_SCHED to DLM_IFL_CB_PENDING
  fs: dlm: ast do WARN_ON_ONCE() on hotpath
  fs: dlm: drop lkb ref in bug case
  fs: dlm: avoid false-positive checker warning
  fs: dlm: use WARN_ON_ONCE() instead of WARN_ON()
  ...
2022-12-12 20:41:50 -08:00
Linus Torvalds
56c003e4db Merge tag 'jfs-6.2' of https://github.com/kleikamp/linux-shaggy
Pull jfs updates from David Kleikamp:
 "Assorted JFS fixes for 6.2"

* tag 'jfs-6.2' of https://github.com/kleikamp/linux-shaggy:
  jfs: makes diUnmount/diMount in jfs_mount_rw atomic
  jfs: Fix a typo in function jfs_umount
  fs: jfs: fix shift-out-of-bounds in dbDiscardAG
  jfs: Fix fortify moan in symlink
  jfs: remove redundant assignments to ipaimap and ipaimap2
  jfs: remove unused declarations for jfs
  fs/jfs/jfs_xattr.h: Fix spelling typo in comment
  MAINTAINERS: git://github -> https://github.com for kleikamp
  fs/jfs: replace ternary operator with min_t()
  fs: jfs: fix shift-out-of-bounds in dbAllocAG
2022-12-12 20:38:28 -08:00
Linus Torvalds
cda6a60acc Merge tag 'fixes_for_v6.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs
Pull udf and ext2 fixes from Jan Kara:

 - a couple of smaller cleanups and fixes for ext2

 - fixes of a data corruption issues in udf when handling holes and
   preallocation extents

 - fixes and cleanups of several smaller issues in udf

 - add maintainer entry for isofs

* tag 'fixes_for_v6.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs:
  udf: Fix extending file within last block
  udf: Discard preallocation before extending file with a hole
  udf: Do not bother looking for prealloc extents if i_lenExtents matches i_size
  udf: Fix preallocation discarding at indirect extent boundary
  udf: Increase UDF_MAX_READ_VERSION to 0x0260
  fs/ext2: Fix code indentation
  ext2: unbugger ext2_empty_dir()
  udf: remove ->writepage
  ext2: remove ->writepage
  ext2: Don't flush page immediately for DIRSYNC directories
  ext2: Fix some kernel-doc warnings
  maintainers: Add ISOFS entry
  udf: Avoid double brelse() in udf_rename()
  fs: udf: Optimize udf_free_in_core_inode and udf_find_fileset function
2022-12-12 20:32:50 -08:00
Linus Torvalds
07d7a4d696 Merge tag 'fs.xattr.simple.noaudit.v6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/idmapping
Pull xattr audit fix from Seth Forshee:
 "This is a single patch to remove auditing of the capability check in
  simple_xattr_list().

  This check is done to check whether trusted xattrs should be included
  by listxattr(2). SELinux will normally log a denial when capable() is
  called and the task's SELinux context doesn't have the corresponding
  capability permission allowed, which can end up spamming the log.

  Since a failed check here cannot be used to infer malicious intent,
  auditing is of no real value, and it makes sense to stop auditing the
  capability check"

* tag 'fs.xattr.simple.noaudit.v6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/idmapping:
  fs: don't audit the capability check in simple_xattr_list()
2022-12-12 20:29:45 -08:00
Linus Torvalds
6e8948a063 Merge tag 'fs.idmapped.squashfs.v6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/idmapping
Pull squashfs update from Seth Forshee:
 "This is a simple patch to enable idmapped mounts for squashfs.

  All functionality squashfs needs to support idmapped mounts is already
  implemented in generic VFS code, so all that is needed is to set
  FS_ALLOW_IDMAP in fs_flags"

* tag 'fs.idmapped.squashfs.v6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/idmapping:
  squashfs: enable idmapped mounts
2022-12-12 20:24:51 -08:00
Linus Torvalds
043930b1c8 Merge tag 'fuse-update-6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse
Pull fuse update from Miklos Szeredi:

 - Allow some write requests to proceed in parallel

 - Fix a performance problem with allow_sys_admin_access

 - Add a special kind of invalidation that doesn't immediately purge
   submounts

 - On revalidation treat the target of rename(RENAME_NOREPLACE) the same
   as open(O_EXCL)

 - Use type safe helpers for some mnt_userns transformations

* tag 'fuse-update-6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse:
  fuse: Rearrange fuse_allow_current_process checks
  fuse: allow non-extending parallel direct writes on the same file
  fuse: remove the unneeded result variable
  fuse: port to vfs{g,u}id_t and associated helpers
  fuse: Remove user_ns check for FUSE_DEV_IOC_CLONE
  fuse: always revalidate rename target dentry
  fuse: add "expire only" mode to FUSE_NOTIFY_INVAL_ENTRY
  fs/fuse: Replace kmap() with kmap_local_page()
2022-12-12 20:22:09 -08:00
Linus Torvalds
6df7cc2268 Merge tag 'ovl-update-6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs
Pull overlayfs update from Miklos Szeredi:

 - Fix a couple of bugs found by syzbot

 - Don't ingore some open flags set by fcntl(F_SETFL)

 - Fix failure to create a hard link in certain cases

 - Use type safe helpers for some mnt_userns transformations

 - Improve performance of mount

 - Misc cleanups

* tag 'ovl-update-6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs:
  ovl: Kconfig: Fix spelling mistake "undelying" -> "underlying"
  ovl: use inode instead of dentry where possible
  ovl: Add comment on upperredirect reassignment
  ovl: use plain list filler in indexdir and workdir cleanup
  ovl: do not reconnect upper index records in ovl_indexdir_cleanup()
  ovl: fix comment typos
  ovl: port to vfs{g,u}id_t and associated helpers
  ovl: Use ovl mounter's fsuid and fsgid in ovl_link()
  ovl: Use "buf" flexible array for memcpy() destination
  ovl: update ->f_iocb_flags when ovl_change_flags() modifies ->f_flags
  ovl: fix use inode directly in rcu-walk mode
2022-12-12 20:18:26 -08:00
Linus Torvalds
4a6bff1187 Merge tag 'erofs-for-6.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs
Pull erofs updates from Gao Xiang:
 "In this cycle, large folios are now enabled in the iomap/fscache mode
  for uncompressed files first. In order to do that, we've also cleaned
  up better interfaces between erofs and fscache, which are acked by
  fscache/netfs folks and included in this pull request.

  Other than that, there are random fixes around erofs over fscache and
  crafted images by syzbot, minor cleanups and documentation updates.

  Summary:

   - Enable large folios for iomap/fscache mode

   - Avoid sysfs warning due to mounting twice with the same fsid and
     domain_id in fscache mode

   - Refine fscache interface among erofs, fscache, and cachefiles

   - Use kmap_local_page() only for metabuf

   - Fixes around crafted images found by syzbot

   - Minor cleanups and documentation updates"

* tag 'erofs-for-6.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs:
  erofs: validate the extent length for uncompressed pclusters
  erofs: fix missing unmap if z_erofs_get_extent_compressedlen() fails
  erofs: Fix pcluster memleak when its block address is zero
  erofs: use kmap_local_page() only for erofs_bread()
  erofs: enable large folios for fscache mode
  erofs: support large folios for fscache mode
  erofs: switch to prepare_ondemand_read() in fscache mode
  fscache,cachefiles: add prepare_ondemand_read() callback
  erofs: clean up cached I/O strategies
  erofs: update documentation
  erofs: check the uniqueness of fsid in shared domain in advance
  erofs: enable large folios for iomap mode
2022-12-12 20:14:04 -08:00
Linus Torvalds
ad0d9da164 Merge tag 'fsverity-for-linus' of git://git.kernel.org/pub/scm/fs/fscrypt/fscrypt
Pull fsverity updates from Eric Biggers:
 "The main change this cycle is to stop using the PG_error flag to track
  verity failures, and instead just track failures at the bio level.
  This follows a similar fscrypt change that went into 6.1, and it is a
  step towards freeing up PG_error for other uses.

  There's also one other small cleanup"

* tag 'fsverity-for-linus' of git://git.kernel.org/pub/scm/fs/fscrypt/fscrypt:
  fsverity: simplify fsverity_get_digest()
  fsverity: stop using PG_error to track error status
2022-12-12 20:06:35 -08:00
Linus Torvalds
8129bac60f Merge tag 'fscrypt-for-linus' of git://git.kernel.org/pub/scm/fs/fscrypt/fscrypt
Pull fscrypt updates from Eric Biggers:
 "This release adds SM4 encryption support, contributed by Tianjia
  Zhang. SM4 is a Chinese block cipher that is an alternative to AES.

  I recommend against using SM4, but (according to Tianjia) some people
  are being required to use it. Since SM4 has been turning up in many
  other places (crypto API, wireless, TLS, OpenSSL, ARMv8 CPUs, etc.),
  it hasn't been very controversial, and some people have to use it, I
  don't think it would be fair for me to reject this optional feature.

  Besides the above, there are a couple cleanups"

* tag 'fscrypt-for-linus' of git://git.kernel.org/pub/scm/fs/fscrypt/fscrypt:
  fscrypt: add additional documentation for SM4 support
  fscrypt: remove unused Speck definitions
  fscrypt: Add SM4 XTS/CTS symmetric algorithm support
  blk-crypto: Add support for SM4-XTS blk crypto mode
  fscrypt: add comment for fscrypt_valid_enc_modes_v1()
  fscrypt: pass super_block to fscrypt_put_master_key_activeref()
2022-12-12 20:03:50 -08:00
Dominique Martinet
1a4f69ef15 9p/client: fix data race on req->status
KCSAN reported a race between writing req->status in p9_client_cb and
accessing it in p9_client_rpc's wait_event.

Accesses to req itself is protected by the data barrier (writing req
fields, write barrier, writing status // reading status, read barrier,
reading other req fields), but status accesses themselves apparently
also must be annotated properly with WRITE_ONCE/READ_ONCE when we
access it without locks.

Follows:
 - error paths writing status in various threads all can notify
p9_client_rpc, so these all also need WRITE_ONCE
 - there's a similar read loop in trans_virtio for zc case that also
needs READ_ONCE
 - other reads in trans_fd should be protected by the trans_fd lock and
lists state machine, as corresponding writers all are within trans_fd
and should be under the same lock. If KCSAN complains on them we likely
will have something else to fix as well, so it's better to leave them
unmarked and look again if required.

Link: https://lkml.kernel.org/r/20221205124756.426350-1-asmadeus@codewreck.org
Reported-by: Naresh Kamboju <naresh.kamboju@linaro.org>
Suggested-by: Marco Elver <elver@google.com>
Acked-by: Marco Elver <elver@google.com>
Reviewed-by: Christian Schoenebeck <linux_oss@crudebyte.com>
Signed-off-by: Dominique Martinet <asmadeus@codewreck.org>
2022-12-13 13:02:15 +09:00
Linus Torvalds
deb9acc122 Merge tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4
Pull ext4 updates from Ted Ts'o:
 "A large number of cleanups and bug fixes, with many of the bug fixes
  found by Syzbot and fuzzing. (Many of the bug fixes involve less-used
  ext4 features such as fast_commit, inline_data and bigalloc)

  In addition, remove the writepage function for ext4, since the
  medium-term plan is to remove ->writepage() entirely. (The VM doesn't
  need or want writepage() for writeback, since it is fine with
  ->writepages() so long as ->migrate_folio() is implemented)"

* tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: (58 commits)
  ext4: fix reserved cluster accounting in __es_remove_extent()
  ext4: fix inode leak in ext4_xattr_inode_create() on an error path
  ext4: allocate extended attribute value in vmalloc area
  ext4: avoid unaccounted block allocation when expanding inode
  ext4: initialize quota before expanding inode in setproject ioctl
  ext4: stop providing .writepage hook
  mm: export buffer_migrate_folio_norefs()
  ext4: switch to using write_cache_pages() for data=journal writeout
  jbd2: switch jbd2_submit_inode_data() to use fs-provided hook for data writeout
  ext4: switch to using ext4_do_writepages() for ordered data writeout
  ext4: move percpu_rwsem protection into ext4_writepages()
  ext4: provide ext4_do_writepages()
  ext4: add support for writepages calls that cannot map blocks
  ext4: drop pointless IO submission from ext4_bio_write_page()
  ext4: remove nr_submitted from ext4_bio_write_page()
  ext4: move keep_towrite handling to ext4_bio_write_page()
  ext4: handle redirtying in ext4_bio_write_page()
  ext4: fix kernel BUG in 'ext4_write_inline_data_end()'
  ext4: make ext4_mb_initialize_context return void
  ext4: fix deadlock due to mbcache entry corruption
  ...
2022-12-12 19:56:37 -08:00
Christophe JAILLET
d1c722867f net: lan966x: Remove a useless test in lan966x_ptp_add_trap()
vcap_alloc_rule() can't return NULL.

So remove some dead-code

Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Link: https://lore.kernel.org/r/27992ffcee47fc865ce87274d6dfcffe7a1e69e0.1670873784.git.christophe.jaillet@wanadoo.fr
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-12-12 19:39:24 -08:00
Linus Torvalds
9b93f5069f Merge tag 'fs.idmapped.mnt_idmap.v6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/idmapping
Pull idmapping updates from Christian Brauner:
 "Last cycle we've already made the interaction with idmapped mounts
  more robust and type safe by introducing the vfs{g,u}id_t type. This
  cycle we concluded the conversion and removed the legacy helpers.

  Currently we still pass around the plain namespace that was attached
  to a mount. This is in general pretty convenient but it makes it easy
  to conflate namespaces that are relevant on the filesystem - with
  namespaces that are relevent on the mount level. Especially for
  filesystem developers without detailed knowledge in this area this can
  be a potential source for bugs.

  Instead of passing the plain namespace we introduce a dedicated type
  struct mnt_idmap and replace the pointer with a pointer to a struct
  mnt_idmap. There are no semantic or size changes for the mount struct
  caused by this.

  We then start converting all places aware of idmapped mounts to rely
  on struct mnt_idmap. Once the conversion is done all helpers down to
  the really low-level make_vfs{g,u}id() and from_vfs{g,u}id() will take
  a struct mnt_idmap argument instead of two namespace arguments. This
  way it becomes impossible to conflate the two removing and thus
  eliminating the possibility of any bugs. Fwiw, I fixed some issues in
  that area a while ago in ntfs3 and ksmbd in the past. Afterwards only
  low-level code can ultimately use the associated namespace for any
  permission checks. Even most of the vfs can be completely obivious
  about this ultimately and filesystems will never interact with it in
  any form in the future.

  A struct mnt_idmap currently encompasses a simple refcount and pointer
  to the relevant namespace the mount is idmapped to. If a mount isn't
  idmapped then it will point to a static nop_mnt_idmap and if it
  doesn't that it is idmapped. As usual there are no allocations or
  anything happening for non-idmapped mounts. Everthing is carefully
  written to be a nop for non-idmapped mounts as has always been the
  case.

  If an idmapped mount is created a struct mnt_idmap is allocated and a
  reference taken on the relevant namespace. Each mount that gets
  idmapped or inherits the idmap simply bumps the reference count on
  struct mnt_idmap. Just a reminder that we only allow a mount to change
  it's idmapping a single time and only if it hasn't already been
  attached to the filesystems and has no active writers.

  The actual changes are fairly straightforward but this will have huge
  benefits for maintenance and security in the long run even if it
  causes some churn.

  Note that this also makes it possible to extend struct mount_idmap in
  the future. For example, it would be possible to place the namespace
  pointer in an anonymous union together with an idmapping struct. This
  would allow us to expose an api to userspace that would let it specify
  idmappings directly instead of having to go through the detour of
  setting up namespaces at all"

* tag 'fs.idmapped.mnt_idmap.v6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/idmapping:
  acl: conver higher-level helpers to rely on mnt_idmap
  fs: introduce dedicated idmap type for mounts
2022-12-12 19:30:18 -08:00
Linus Torvalds
e1212e9b6f Merge tag 'fs.vfsuid.conversion.v6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/idmapping
Pull vfsuid updates from Christian Brauner:
 "Last cycle we introduced the vfs{g,u}id_t types and associated helpers
  to gain type safety when dealing with idmapped mounts. That initial
  work already converted a lot of places over but there were still some
  left,

  This converts all remaining places that still make use of non-type
  safe idmapping helpers to rely on the new type safe vfs{g,u}id based
  helpers.

  Afterwards it removes all the old non-type safe helpers"

* tag 'fs.vfsuid.conversion.v6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/idmapping:
  fs: remove unused idmapping helpers
  ovl: port to vfs{g,u}id_t and associated helpers
  fuse: port to vfs{g,u}id_t and associated helpers
  ima: use type safe idmapping helpers
  apparmor: use type safe idmapping helpers
  caps: use type safe idmapping helpers
  fs: use type safe idmapping helpers
  mnt_idmapping: add missing helpers
2022-12-12 19:20:05 -08:00
Linus Torvalds
cf619f8919 Merge tag 'fs.ovl.setgid.v6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/idmapping
Pull setgid inheritance updates from Christian Brauner:
 "This contains the work to make setgid inheritance consistent between
  modifying a file and when changing ownership or mode as this has been
  a repeated source of very subtle bugs. The gist is that we perform the
  same permission checks in the write path as we do in the ownership and
  mode changing paths after this series where we're currently doing
  different things.

  We've already made setgid inheritance a lot more consistent and
  reliable in the last releases by moving setgid stripping from the
  individual filesystems up into the vfs. This aims to make the logic
  even more consistent and easier to understand and also to fix
  long-standing overlayfs setgid inheritance bugs. Miklos was nice
  enough to just let me carry the trivial overlayfs patches from Amir
  too.

  Below is a more detailed explanation how the current difference in
  setgid handling lead to very subtle bugs exemplified via overlayfs
  which is a victim of the current rules. I hope this explains why I
  think taking the regression risk here is worth it.

  A long while ago I found a few setgid inheritance bugs in overlayfs in
  the write path in certain conditions. Amir recently picked this back
  up in [1] and I jumped on board to fix this more generally.

  On the surface all that overlayfs would need to fix setgid inheritance
  would be to call file_remove_privs() or file_modified() but actually
  that isn't enough because the setgid inheritance api is wildly
  inconsistent in that area.

  Before this pr setgid stripping in file_remove_privs()'s old
  should_remove_suid() helper was inconsistent with other parts of the
  vfs. Specifically, it only raises ATTR_KILL_SGID if the inode is
  S_ISGID and S_IXGRP but not if the inode isn't in the caller's groups
  and the caller isn't privileged over the inode although we require
  this already in setattr_prepare() and setattr_copy() and so all
  filesystem implement this requirement implicitly because they have to
  use setattr_{prepare,copy}() anyway.

  But the inconsistency shows up in setgid stripping bugs for overlayfs
  in xfstests (e.g., generic/673, generic/683, generic/685, generic/686,
  generic/687). For example, we test whether suid and setgid stripping
  works correctly when performing various write-like operations as an
  unprivileged user (fallocate, reflink, write, etc.):

      echo "Test 1 - qa_user, non-exec file $verb"
      setup_testfile
      chmod a+rws $junk_file
      commit_and_check "$qa_user" "$verb" 64k 64k

  The test basically creates a file with 6666 permissions. While the
  file has the S_ISUID and S_ISGID bits set it does not have the S_IXGRP
  set.

  On a regular filesystem like xfs what will happen is:

      sys_fallocate()
      -> vfs_fallocate()
         -> xfs_file_fallocate()
            -> file_modified()
               -> __file_remove_privs()
                  -> dentry_needs_remove_privs()
                     -> should_remove_suid()
                  -> __remove_privs()
                     newattrs.ia_valid = ATTR_FORCE | kill;
                     -> notify_change()
                        -> setattr_copy()

  In should_remove_suid() we can see that ATTR_KILL_SUID is raised
  unconditionally because the file in the test has S_ISUID set.

  But we also see that ATTR_KILL_SGID won't be set because while the
  file is S_ISGID it is not S_IXGRP (see above) which is a condition for
  ATTR_KILL_SGID being raised.

  So by the time we call notify_change() we have attr->ia_valid set to
  ATTR_KILL_SUID | ATTR_FORCE.

  Now notify_change() sees that ATTR_KILL_SUID is set and does:

      ia_valid      = attr->ia_valid |= ATTR_MODE
      attr->ia_mode = (inode->i_mode & ~S_ISUID);

  which means that when we call setattr_copy() later we will definitely
  update inode->i_mode. Note that attr->ia_mode still contains S_ISGID.

  Now we call into the filesystem's ->setattr() inode operation which
  will end up calling setattr_copy(). Since ATTR_MODE is set we will
  hit:

      if (ia_valid & ATTR_MODE) {
              umode_t mode = attr->ia_mode;
              vfsgid_t vfsgid = i_gid_into_vfsgid(mnt_userns, inode);
              if (!vfsgid_in_group_p(vfsgid) &&
                  !capable_wrt_inode_uidgid(mnt_userns, inode, CAP_FSETID))
                      mode &= ~S_ISGID;
              inode->i_mode = mode;
      }

  and since the caller in the test is neither capable nor in the group
  of the inode the S_ISGID bit is stripped.

  But assume the file isn't suid then ATTR_KILL_SUID won't be raised
  which has the consequence that neither the setgid nor the suid bits
  are stripped even though it should be stripped because the inode isn't
  in the caller's groups and the caller isn't privileged over the inode.

  If overlayfs is in the mix things become a bit more complicated and
  the bug shows up more clearly.

  When e.g., ovl_setattr() is hit from ovl_fallocate()'s call to
  file_remove_privs() then ATTR_KILL_SUID and ATTR_KILL_SGID might be
  raised but because the check in notify_change() is questioning the
  ATTR_KILL_SGID flag again by requiring S_IXGRP for it to be stripped
  the S_ISGID bit isn't removed even though it should be stripped:

      sys_fallocate()
      -> vfs_fallocate()
         -> ovl_fallocate()
            -> file_remove_privs()
               -> dentry_needs_remove_privs()
                  -> should_remove_suid()
               -> __remove_privs()
                  newattrs.ia_valid = ATTR_FORCE | kill;
                  -> notify_change()
                     -> ovl_setattr()
                        /* TAKE ON MOUNTER'S CREDS */
                        -> ovl_do_notify_change()
                           -> notify_change()
                        /* GIVE UP MOUNTER'S CREDS */
           /* TAKE ON MOUNTER'S CREDS */
           -> vfs_fallocate()
              -> xfs_file_fallocate()
                 -> file_modified()
                    -> __file_remove_privs()
                       -> dentry_needs_remove_privs()
                          -> should_remove_suid()
                       -> __remove_privs()
                          newattrs.ia_valid = attr_force | kill;
                          -> notify_change()

  The fix for all of this is to make file_remove_privs()'s
  should_remove_suid() helper perform the same checks as we already
  require in setattr_prepare() and setattr_copy() and have
  notify_change() not pointlessly requiring S_IXGRP again. It doesn't
  make any sense in the first place because the caller must calculate
  the flags via should_remove_suid() anyway which would raise
  ATTR_KILL_SGID

  Note that some xfstests will now fail as these patches will cause the
  setgid bit to be lost in certain conditions for unprivileged users
  modifying a setgid file when they would've been kept otherwise. I
  think this risk is worth taking and I explained and mentioned this
  multiple times on the list [2].

  Enforcing the rules consistently across write operations and
  chmod/chown will lead to losing the setgid bit in cases were it
  might've been retained before.

  While I've mentioned this a few times but it's worth repeating just to
  make sure that this is understood. For the sake of maintainability,
  consistency, and security this is a risk worth taking.

  If we really see regressions for workloads the fix is to have special
  setgid handling in the write path again with different semantics from
  chmod/chown and possibly additional duct tape for overlayfs. I'll
  update the relevant xfstests with if you should decide to merge this
  second setgid cleanup.

  Before that people should be aware that there might be failures for
  fstests where unprivileged users modify a setgid file"

Link: https://lore.kernel.org/linux-fsdevel/20221003123040.900827-1-amir73il@gmail.com [1]
Link: https://lore.kernel.org/linux-fsdevel/20221122142010.zchf2jz2oymx55qi@wittgenstein [2]

* tag 'fs.ovl.setgid.v6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/idmapping:
  fs: use consistent setgid checks in is_sxid()
  ovl: remove privs in ovl_fallocate()
  ovl: remove privs in ovl_copyfile()
  attr: use consistent sgid stripping checks
  attr: add setattr_should_drop_sgid()
  fs: move should_remove_suid()
  attr: add in_group_or_capable()
2022-12-12 19:03:10 -08:00
Linus Torvalds
6a518afcc2 Merge tag 'fs.acl.rework.v6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/idmapping
Pull VFS acl updates from Christian Brauner:
 "This contains the work that builds a dedicated vfs posix acl api.

  The origins of this work trace back to v5.19 but it took quite a while
  to understand the various filesystem specific implementations in
  sufficient detail and also come up with an acceptable solution.

  As we discussed and seen multiple times the current state of how posix
  acls are handled isn't nice and comes with a lot of problems: The
  current way of handling posix acls via the generic xattr api is error
  prone, hard to maintain, and type unsafe for the vfs until we call
  into the filesystem's dedicated get and set inode operations.

  It is already the case that posix acls are special-cased to death all
  the way through the vfs. There are an uncounted number of hacks that
  operate on the uapi posix acl struct instead of the dedicated vfs
  struct posix_acl. And the vfs must be involved in order to interpret
  and fixup posix acls before storing them to the backing store, caching
  them, reporting them to userspace, or for permission checking.

  Currently a range of hacks and duct tape exist to make this work. As
  with most things this is really no ones fault it's just something that
  happened over time. But the code is hard to understand and difficult
  to maintain and one is constantly at risk of introducing bugs and
  regressions when having to touch it.

  Instead of continuing to hack posix acls through the xattr handlers
  this series builds a dedicated posix acl api solely around the get and
  set inode operations.

  Going forward, the vfs_get_acl(), vfs_remove_acl(), and vfs_set_acl()
  helpers must be used in order to interact with posix acls. They
  operate directly on the vfs internal struct posix_acl instead of
  abusing the uapi posix acl struct as we currently do. In the end this
  removes all of the hackiness, makes the codepaths easier to maintain,
  and gets us type safety.

  This series passes the LTP and xfstests suites without any
  regressions. For xfstests the following combinations were tested:
   - xfs
   - ext4
   - btrfs
   - overlayfs
   - overlayfs on top of idmapped mounts
   - orangefs
   - (limited) cifs

  There's more simplifications for posix acls that we can make in the
  future if the basic api has made it.

  A few implementation details:

   - The series makes sure to retain exactly the same security and
     integrity module permission checks. Especially for the integrity
     modules this api is a win because right now they convert the uapi
     posix acl struct passed to them via a void pointer into the vfs
     struct posix_acl format to perform permission checking on the mode.

     There's a new dedicated security hook for setting posix acls which
     passes the vfs struct posix_acl not a void pointer. Basing checking
     on the posix acl stored in the uapi format is really unreliable.
     The vfs currently hacks around directly in the uapi struct storing
     values that frankly the security and integrity modules can't
     correctly interpret as evidenced by bugs we reported and fixed in
     this area. It's not necessarily even their fault it's just that the
     format we provide to them is sub optimal.

   - Some filesystems like 9p and cifs need access to the dentry in
     order to get and set posix acls which is why they either only
     partially or not even at all implement get and set inode
     operations. For example, cifs allows setxattr() and getxattr()
     operations but doesn't allow permission checking based on posix
     acls because it can't implement a get acl inode operation.

     Thus, this patch series updates the set acl inode operation to take
     a dentry instead of an inode argument. However, for the get acl
     inode operation we can't do this as the old get acl method is
     called in e.g., generic_permission() and inode_permission(). These
     helpers in turn are called in various filesystem's permission inode
     operation. So passing a dentry argument to the old get acl inode
     operation would amount to passing a dentry to the permission inode
     operation which we shouldn't and probably can't do.

     So instead of extending the existing inode operation Christoph
     suggested to add a new one. He also requested to ensure that the
     get and set acl inode operation taking a dentry are consistently
     named. So for this version the old get acl operation is renamed to
     ->get_inode_acl() and a new ->get_acl() inode operation taking a
     dentry is added. With this we can give both 9p and cifs get and set
     acl inode operations and in turn remove their complex custom posix
     xattr handlers.

     In the future I hope to get rid of the inode method duplication but
     it isn't like we have never had this situation. Readdir is just one
     example. And frankly, the overall gain in type safety and the more
     pleasant api wise are simply too big of a benefit to not accept
     this duplication for a while.

   - We've done a full audit of every codepaths using variant of the
     current generic xattr api to get and set posix acls and
     surprisingly it isn't that many places. There's of course always a
     chance that we might have missed some and if so I'm sure we'll find
     them soon enough.

     The crucial codepaths to be converted are obviously stacking
     filesystems such as ecryptfs and overlayfs.

     For a list of all callers currently using generic xattr api helpers
     see [2] including comments whether they support posix acls or not.

   - The old vfs generic posix acl infrastructure doesn't obey the
     create and replace semantics promised on the setxattr(2) manpage.
     This patch series doesn't address this. It really is something we
     should revisit later though.

  The patches are roughly organized as follows:

   (1) Change existing set acl inode operation to take a dentry
       argument (Intended to be a non-functional change)

   (2) Rename existing get acl method (Intended to be a non-functional
       change)

   (3) Implement get and set acl inode operations for filesystems that
       couldn't implement one before because of the missing dentry.
       That's mostly 9p and cifs (Intended to be a non-functional
       change)

   (4) Build posix acl api, i.e., add vfs_get_acl(), vfs_remove_acl(),
       and vfs_set_acl() including security and integrity hooks
       (Intended to be a non-functional change)

   (5) Implement get and set acl inode operations for stacking
       filesystems (Intended to be a non-functional change)

   (6) Switch posix acl handling in stacking filesystems to new posix
       acl api now that all filesystems it can stack upon support it.

   (7) Switch vfs to new posix acl api (semantical change)

   (8) Remove all now unused helpers

   (9) Additional regression fixes reported after we merged this into
       linux-next

  Thanks to Seth for a lot of good discussion around this and
  encouragement and input from Christoph"

* tag 'fs.acl.rework.v6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/idmapping: (36 commits)
  posix_acl: Fix the type of sentinel in get_acl
  orangefs: fix mode handling
  ovl: call posix_acl_release() after error checking
  evm: remove dead code in evm_inode_set_acl()
  cifs: check whether acl is valid early
  acl: make vfs_posix_acl_to_xattr() static
  acl: remove a slew of now unused helpers
  9p: use stub posix acl handlers
  cifs: use stub posix acl handlers
  ovl: use stub posix acl handlers
  ecryptfs: use stub posix acl handlers
  evm: remove evm_xattr_acl_change()
  xattr: use posix acl api
  ovl: use posix acl api
  ovl: implement set acl method
  ovl: implement get acl method
  ecryptfs: implement set acl method
  ecryptfs: implement get acl method
  ksmbd: use vfs_remove_acl()
  acl: add vfs_remove_acl()
  ...
2022-12-12 18:46:39 -08:00
Linus Torvalds
bd90741318 Merge tag 'pull-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull misc vfs updates from Al Viro:
 "misc pile"

* tag 'pull-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  fs: sysv: Fix sysv_nblocks() returns wrong value
  get rid of INT_LIMIT, use type_max() instead
  btrfs: replace INT_LIMIT(loff_t) with OFFSET_MAX
  fs: simplify vfs_get_super
  fs: drop useless condition from inode_needs_update_time
2022-12-12 18:38:47 -08:00
Linus Torvalds
13c574fec8 Merge tag 'pull-namespace' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull namespace fix from Al Viro:
 "Fix weird corner case in copy_mnt_ns()"

* tag 'pull-namespace' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  copy_mnt_ns(): handle a corner case (overmounted mntns bindings) saner
2022-12-12 18:36:12 -08:00
Zhen Lei
4f1354d5c6 livepatch: Call klp_match_callback() in klp_find_callback() to avoid code duplication
The implementation of function klp_match_callback() is identical to the
partial implementation of function klp_find_callback(). So call function
klp_match_callback() in function klp_find_callback() instead of the
duplicated code.

Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
Acked-by: Song Liu <song@kernel.org>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Suggested-by: Petr Mladek <pmladek@suse.com>
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
2022-12-12 18:30:58 -08:00
Linus Torvalds
75f4d9af8b Merge tag 'pull-iov_iter' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull iov_iter updates from Al Viro:
 "iov_iter work; most of that is about getting rid of direction
  misannotations and (hopefully) preventing more of the same for the
  future"

* tag 'pull-iov_iter' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  use less confusing names for iov_iter direction initializers
  iov_iter: saner checks for attempt to copy to/from iterator
  [xen] fix "direction" argument of iov_iter_kvec()
  [vhost] fix 'direction' argument of iov_iter_{init,bvec}()
  [target] fix iov_iter_bvec() "direction" argument
  [s390] memcpy_real(): WRITE is "data source", not destination...
  [s390] zcore: WRITE is "data source", not destination...
  [infiniband] READ is "data destination", not source...
  [fsi] WRITE is "data source", not destination...
  [s390] copy_oldmem_kernel() - WRITE is "data source", not destination
  csum_and_copy_to_iter(): handle ITER_DISCARD
  get rid of unlikely() on page_copy_sane() calls
2022-12-12 18:29:54 -08:00
Linus Torvalds
268369b171 Merge tag 'pull-alpha' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull alpha updates from Al Viro:
 "Alpha architecture cleanups and fixes.

  One thing *not* included is lazy FPU switching stuff - this pile is
  just the straightforward stuff"

* tag 'pull-alpha' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  alpha: ret_from_fork can go straight to ret_to_user
  alpha: syscall exit cleanup
  alpha: fix handling of a3 on straced syscalls
  alpha: fix syscall entry in !AUDUT_SYSCALL case
  alpha: _TIF_ALLWORK_MASK is unused
  alpha: fix TIF_NOTIFY_SIGNAL handling
2022-12-12 18:27:06 -08:00
Linus Torvalds
405b2fc663 Merge tag 'pull-elfcore' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull elf coredumping updates from Al Viro:
 "Unification of regset and non-regset sides of ELF coredump handling.

  Collecting per-thread register values is the only thing that needs to
  be ifdefed there..."

* tag 'pull-elfcore' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  [elf] get rid of get_note_info_size()
  [elf] unify regset and non-regset cases
  [elf][non-regset] use elf_core_copy_task_regs() for dumper as well
  [elf][non-regset] uninline elf_core_copy_task_fpregs() (and lose pt_regs argument)
  elf_core_copy_task_regs(): task_pt_regs is defined everywhere
  [elf][regset] simplify thread list handling in fill_note_info()
  [elf][regset] clean fill_note_info() a bit
  kill extern of vsyscall32_sysctl
  kill coredump_params->regs
  kill signal_pt_regs()
2022-12-12 18:18:34 -08:00
Linus Torvalds
8702f2c611 Merge tag 'mm-nonmm-stable-2022-12-12' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull non-MM updates from Andrew Morton:

 - A ptrace API cleanup series from Sergey Shtylyov

 - Fixes and cleanups for kexec from ye xingchen

 - nilfs2 updates from Ryusuke Konishi

 - squashfs feature work from Xiaoming Ni: permit configuration of the
   filesystem's compression concurrency from the mount command line

 - A series from Akinobu Mita which addresses bound checking errors when
   writing to debugfs files

 - A series from Yang Yingliang to address rapidio memory leaks

 - A series from Zheng Yejian to address possible overflow errors in
   encode_comp_t()

 - And a whole shower of singleton patches all over the place

* tag 'mm-nonmm-stable-2022-12-12' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (79 commits)
  ipc: fix memory leak in init_mqueue_fs()
  hfsplus: fix bug causing custom uid and gid being unable to be assigned with mount
  rapidio: devices: fix missing put_device in mport_cdev_open
  kcov: fix spelling typos in comments
  hfs: Fix OOB Write in hfs_asc2mac
  hfs: fix OOB Read in __hfs_brec_find
  relay: fix type mismatch when allocating memory in relay_create_buf()
  ocfs2: always read both high and low parts of dinode link count
  io-mapping: move some code within the include guarded section
  kernel: kcsan: kcsan_test: build without structleak plugin
  mailmap: update email for Iskren Chernev
  eventfd: change int to __u64 in eventfd_signal() ifndef CONFIG_EVENTFD
  rapidio: fix possible UAF when kfifo_alloc() fails
  relay: use strscpy() is more robust and safer
  cpumask: limit visibility of FORCE_NR_CPUS
  acct: fix potential integer overflow in encode_comp_t()
  acct: fix accuracy loss for input value of encode_comp_t()
  linux/init.h: include <linux/build_bug.h> and <linux/stringify.h>
  rapidio: rio: fix possible name leak in rio_register_mport()
  rapidio: fix possible name leaks when rio_add_device() fails
  ...
2022-12-12 17:28:58 -08:00
Linus Torvalds
a7cacfb068 Merge tag 'docs-6.2' of git://git.lwn.net/linux
Pull documentation updates from Jonathan Corbet:
 "This was a not-too-busy cycle for documentation; highlights include:

   - The beginnings of a set of translations into Spanish, headed up by
     Carlos Bilbao

   - More Chinese translations

   - A change to the Sphinx "alabaster" theme by default for HTML
     generation.

     Unlike the previous default (Read the Docs), alabaster is shipped
     with Sphinx by default, reducing the number of other dependencies
     that need to be installed. It also (IMO) produces a cleaner and
     more readable result.

   - The ability to render the documentation into the texinfo format
     (something Sphinx could always do, we just never wired it up until
     now)

  Plus the usual collection of typo fixes, build-warning fixes, and
  minor updates"

* tag 'docs-6.2' of git://git.lwn.net/linux: (67 commits)
  Documentation/features: Use loongarch instead of loong
  Documentation/features-refresh.sh: Only sed the beginning "arch" of ARCH_DIR
  docs/zh_CN: Fix '.. only::' directive's expression
  docs/sp_SP: Add memory-barriers.txt Spanish translation
  docs/zh_CN/LoongArch: Update links of LoongArch ISA Vol1 and ELF psABI
  docs/LoongArch: Update links of LoongArch ISA Vol1 and ELF psABI
  Documentation/features: Update feature lists for 6.1
  Documentation: Fixed a typo in bootconfig.rst
  docs/sp_SP: Add process coding-style translation
  docs/sp_SP: Add kernel-docs.rst Spanish translation
  docs: Create translations/sp_SP/process/, move submitting-patches.rst
  docs: Add book to process/kernel-docs.rst
  docs: Retire old resources from kernel-docs.rst
  docs: Update maintainer of kernel-docs.rst
  Documentation: riscv: Document the sv57 VM layout
  Documentation: USB: correct possessive "its" usage
  math64: fix kernel-doc return value warnings
  math64: add kernel-doc for DIV64_U64_ROUND_UP
  math64: favor kernel-doc from header files
  doc: add texinfodocs and infodocs targets
  ...
2022-12-12 17:18:50 -08:00
Linus Torvalds
96f4263568 Merge tag 'rust-6.2' of https://github.com/Rust-for-Linux/linux
Pull rust updates from Miguel Ojeda:
 "The first set of changes after the merge, the major ones being:

   - String and formatting: new types 'CString', 'CStr', 'BStr' and
     'Formatter'; new macros 'c_str!', 'b_str!' and 'fmt!'.

   - Errors: the rest of the error codes from 'errno-base.h', as well as
     some 'From' trait implementations for the 'Error' type.

   - Printing: the rest of the 'pr_*!' levels and the continuation one
     'pr_cont!', as well as a new sample.

   - 'alloc' crate: new constructors 'try_with_capacity()' and
     'try_with_capacity_in()' for 'RawVec' and 'Vec'.

   - Procedural macros: new macros '#[vtable]' and 'concat_idents!', as
     well as better ergonomics for 'module!' users.

   - Asserting: new macros 'static_assert!', 'build_error!' and
     'build_assert!', as well as a new crate 'build_error' to support
     them.

   - Vocabulary types: new types 'Opaque' and 'Either'.

   - Debugging: new macro 'dbg!'"

* tag 'rust-6.2' of https://github.com/Rust-for-Linux/linux: (28 commits)
  rust: types: add `Opaque` type
  rust: types: add `Either` type
  rust: build_assert: add `build_{error,assert}!` macros
  rust: add `build_error` crate
  rust: static_assert: add `static_assert!` macro
  rust: std_vendor: add `dbg!` macro based on `std`'s one
  rust: str: add `fmt!` macro
  rust: str: add `CString` type
  rust: str: add `Formatter` type
  rust: str: add `c_str!` macro
  rust: str: add `CStr` unit tests
  rust: str: implement several traits for `CStr`
  rust: str: add `CStr` type
  rust: str: add `b_str!` macro
  rust: str: add `BStr` type
  rust: alloc: add `Vec::try_with_capacity{,_in}()` constructors
  rust: alloc: add `RawVec::try_with_capacity_in()` constructor
  rust: prelude: add `error::code::*` constant items
  rust: error: add `From` implementations for `Error`
  rust: error: add codes from `errno-base.h`
  ...
2022-12-12 16:59:00 -08:00
Linus Torvalds
eb45115381 Merge tag 'trace-tools-6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace
Pull tracing tools updates from Steven Rostedt:

 - New tool "rv" for starting and stopping runtime verification.
   Example:

      ./rv mon wip -r printk -v

   Enables the wake-in-preempt monitor and the printk reactor in verbose
   mode

 - Fix exit status of rtla usage() calls

* tag 'trace-tools-6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
  Documentation/rv: Add verification/rv man pages
  tools/rv: Add in-kernel monitor interface
  rv: Add rv tool
  rtla: Fix exit status when returning from calls to usage()
2022-12-12 16:48:48 -08:00