Commit Graph

1199812 Commits

Author SHA1 Message Date
Dmitry Torokhov
bf4ed21778 Merge branch 'next' into for-linus
Prepare input updates for 6.5 merge window.
2023-06-26 15:18:13 -07:00
Linus Torvalds
8c69e7afe9 Merge tag 'x86_alternatives_for_v6.5' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 instruction alternatives updates from Borislav Petkov:

 - Up until now the Fast Short Rep Mov optimizations implied the
   presence of the ERMS CPUID flag. AMD decoupled them with a BIOS
   setting so decouple that dependency in the kernel code too

 - Teach the alternatives machinery to handle relocations

 - Make debug_alternative accept flags in order to see only that set of
   patching done one is interested in

 - Other fixes, cleanups and optimizations to the patching code

* tag 'x86_alternatives_for_v6.5' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/alternative: PAUSE is not a NOP
  x86/alternatives: Add cond_resched() to text_poke_bp_batch()
  x86/nospec: Shorten RESET_CALL_DEPTH
  x86/alternatives: Add longer 64-bit NOPs
  x86/alternatives: Fix section mismatch warnings
  x86/alternative: Optimize returns patching
  x86/alternative: Complicate optimize_nops() some more
  x86/alternative: Rewrite optimize_nops() some
  x86/lib/memmove: Decouple ERMS from FSRM
  x86/alternative: Support relocations in alternatives
  x86/alternative: Make debug-alternative selective
2023-06-26 15:14:55 -07:00
Kemeng Shi
11b6890be0 ext4: get block from bh in ext4_free_blocks for fast commit replay
ext4_free_blocks will retrieve block from bh if block parameter is zero.
Retrieve block before ext4_free_blocks_simple to avoid potentially
passing wrong block to ext4_free_blocks_simple.

Signed-off-by: Kemeng Shi <shikemeng@huaweicloud.com>
Cc: stable@kernel.org
Reviewed-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>
Link: https://lore.kernel.org/r/20230603150327.3596033-9-shikemeng@huaweicloud.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2023-06-26 18:13:05 -04:00
Linus Torvalds
aa35a4835e Merge tag 'ras_core_for_v6.5' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull RAS updates from Borislav Petkov:

 - Add initial support for RAS hardware found on AMD server GPUs (MI200).

   Those GPUs and CPUs are connected together through the coherent
   fabric and the GPU memory controllers report errors through x86's MCA
   so EDAC needs to support them. The amd64_edac driver supports now HBM
   (High Bandwidth Memory) and thus such heterogeneous memory controller
   systems

 - Other small cleanups and improvements

* tag 'ras_core_for_v6.5' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  EDAC/amd64: Cache and use GPU node map
  EDAC/amd64: Add support for AMD heterogeneous Family 19h Model 30h-3Fh
  EDAC/amd64: Document heterogeneous system enumeration
  x86/MCE/AMD, EDAC/mce_amd: Decode UMC_V2 ECC errors
  x86/amd_nb: Re-sort and re-indent PCI defines
  x86/amd_nb: Add MI200 PCI IDs
  ras/debugfs: Fix error checking for debugfs_create_dir()
  x86/MCE: Check a hw error's address to determine proper recovery action
2023-06-26 15:09:18 -07:00
Linus Torvalds
e5ce2f196f Merge tag 'edac_updates_for_v6.5' of git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras
Pull EDAC updates from Borislav Petkov:

 - amd64_edac: Add support for Zen4 client hardware

 - amd64_edac: Remove the version string as it is useless and actively
   confusing when looking at backported versions of the driver

 - Add a driver for the Nuvoton NPCM memory controller

 - A debugfs error checking cleanup

* tag 'edac_updates_for_v6.5' of git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras:
  EDAC/npcm: Add NPCM memory controller driver
  dt-bindings: memory-controllers: nuvoton: Add NPCM memory controller
  EDAC/thunderx: Check debugfs file creation retval properly
  EDAC/amd64: Add support for ECC on family 19h model 60h-7Fh
  EDAC/amd64: Remove module version string
2023-06-26 15:06:42 -07:00
Stephen Boyd
7ed1cefbf1 Merge tag 'qcom-clk-for-6.5-2' of https://git.kernel.org/pub/scm/linux/kernel/git/qcom/linux into clk-qcom
Pull Qualcomm clk driver updates from Bjorn Andersson:

This introduces Global Clock Controller for SDX75, LPASS clock
controllers for SC8280XP, video clock controller for SM8350, SM8450 and
SM8550, GPU clock controller for SM8450 and SM8550, RPMH clock support
for SDX75 and IPQ9574 support in APSS IPQ PLL driver.

Support for branch2 clocks with inverted off-bit is introduced and a
couple of fixes to Alpha PLLs handling of TEST_CTL updates.

The handling of active-only clocks in SMD RPM is improved, to ensure
votes are appropriately placed.

SC7180 camera GDSCs are made children of the titan_top GDSC.

A couple of fixes to the display clocks on QCM2290 and shared RCGs in
GCC are marked as such.

SDCC clocks for IPQ6018 and IPQ5332 are corrected to use floor ops, and
network-related resets on IPQ6018 are updated to cover all bits of each
reset.

Crypto clocks are added to IPQ9574 global clock controller, together
with a few cleanups.

Runtime PM is enabeld for SC8280XP GCC and GPUCC, and SM6375 GPUCC.

A few fixes for MSM8974 multi-media clock controller.

Support for some RCG clocks to be automatically controlled by downstream
branches, and added to SM8450 GCC clocks.

Further Kconfig depdenencies are introduce to avoid building Qualcomm
clock drivers on unrelated architectures.

Lastly, related DeviceTree binding updates are made.

The tail of this is not bisectable, due to the missing DeviceTree
binding include files. Rebase at this point in time is not desirable.

* tag 'qcom-clk-for-6.5-2' of https://git.kernel.org/pub/scm/linux/kernel/git/qcom/linux: (63 commits)
  clk: qcom: gcc-sc8280xp: Add runtime PM
  clk: qcom: gpucc-sc8280xp: Add runtime PM
  clk: qcom: mmcc-msm8974: fix MDSS_GDSC power flags
  clk: qcom: gpucc-sm6375: Enable runtime pm
  dt-bindings: clock: sm6375-gpucc: Add VDD_GX
  clk: qcom: gcc-sm6115: Add missing PLL config properties
  clk: qcom: clk-alpha-pll: Add a way to update some bits of test_ctl(_hi)
  clk: qcom: gcc-ipq6018: remove duplicate initializers
  clk: qcom: gcc-ipq9574: Enable crypto clocks
  dt-bindings: clock: Add crypto clock and reset definitions
  clk: qcom: Add lpass audio clock controller driver for SC8280XP
  clk: qcom: Add lpass clock controller driver for SC8280XP
  dt-bindings: clock: Add LPASS AUDIOCC and reset controller for SC8280XP
  dt-bindings: clock: Add LPASSCC and reset controller for SC8280XP
  dt-bindings: clock: qcom,mmcc: define clocks/clock-names for MSM8226
  clk: qcom: gpucc-sm8550: Add support for graphics clock controller
  clk: qcom: Add support for SM8450 GPUCC
  clk: qcom: gcc-sm8450: Enable hw_clk_ctrl
  clk: qcom: rcg2: Make hw_clk_ctrl toggleable
  dt-bindings: clock: qcom: Add SM8550 graphics clock controller
  ...
2023-06-26 14:56:06 -07:00
Linus Torvalds
88afbb21d4 Merge tag 'x86-core-2023-06-26' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 core updates from Thomas Gleixner:
 "A set of fixes for kexec(), reboot and shutdown issues:

   - Ensure that the WBINVD in stop_this_cpu() has been completed before
     the control CPU proceedes.

     stop_this_cpu() is used for kexec(), reboot and shutdown to park
     the APs in a HLT loop.

     The control CPU sends an IPI to the APs and waits for their CPU
     online bits to be cleared. Once they all are marked "offline" it
     proceeds.

     But stop_this_cpu() clears the CPU online bit before issuing
     WBINVD, which means there is no guarantee that the AP has reached
     the HLT loop.

     This was reported to cause intermittent reboot/shutdown failures
     due to some dubious interaction with the firmware.

     This is not only a problem of WBINVD. The code to actually "stop"
     the CPU which runs between clearing the online bit and reaching the
     HLT loop can cause large enough delays on its own (think
     virtualization). That's especially dangerous for kexec() as kexec()
     expects that all APs are in a safe state and not executing code
     while the boot CPU jumps to the new kernel. There are more issues
     vs kexec() which are addressed separately.

     Cure this by implementing an explicit synchronization point right
     before the AP reaches HLT. This guarantees that the AP has
     completed the full stop proceedure.

   - Fix the condition for WBINVD in stop_this_cpu().

     The WBINVD in stop_this_cpu() is required for ensuring that when
     switching to or from memory encryption no dirty data is left in the
     cache lines which might cause a write back in the wrong more later.

     This checks CPUID directly because the feature bit might have been
     cleared due to a command line option.

     But that CPUID check accesses leaf 0x8000001f::EAX unconditionally.
     Intel CPUs return the content of the highest supported leaf when a
     non-existing leaf is read, while AMD CPUs return all zeros for
     unsupported leafs.

     So the result of the test on Intel CPUs is lottery and on AMD its
     just correct by chance.

     While harmless it's incorrect and causes the conditional wbinvd()
     to be issued where not required, which caused the above issue to be
     unearthed.

   - Make kexec() robust against AP code execution

     Ashok observed triple faults when doing kexec() on a system which
     had been booted with "nosmt".

     It turned out that the SMT siblings which had been brought up
     partially are parked in mwait_play_dead() to enable power savings.

     mwait_play_dead() is monitoring the thread flags of the AP's idle
     task, which has been chosen as it's unlikely to be written to.

     But kexec() can overwrite the previous kernel text and data
     including page tables etc. When it overwrites the cache lines
     monitored by an AP that AP resumes execution after the MWAIT on
     eventually overwritten text, stack and page tables, which obviously
     might end up in a triple fault easily.

     Make this more robust in several steps:

      1) Use an explicit per CPU cache line for monitoring.

      2) Write a command to these cache lines to kick APs out of MWAIT
         before proceeding with kexec(), shutdown or reboot.

         The APs confirm the wakeup by writing status back and then
         enter a HLT loop.

      3) If the system uses INIT/INIT/STARTUP for AP bringup, park the
         APs in INIT state.

         HLT is not a guarantee that an AP won't wake up and resume
         execution. HLT is woken up by NMI and SMI. SMI puts the CPU
         back into HLT (+/- firmware bugs), but NMI is delivered to the
         CPU which executes the NMI handler. Same issue as the MWAIT
         scenario described above.

         Sending an INIT/INIT sequence to the APs puts them into wait
         for STARTUP state, which is safe against NMI.

     There is still an issue remaining which can't be fixed: #MCE

     If the AP sits in HLT and receives a broadcast #MCE it will try to
     handle it with the obvious consequences.

     INIT/INIT clears CR4.MCE in the AP which will cause a broadcast
     #MCE to shut down the machine.

     So there is a choice between fire (HLT) and frying pan (INIT).
     Frying pan has been chosen as it's at least preventing the NMI
     issue.

     On systems which are not using INIT/INIT/STARTUP there is not much
     which can be done right now, but at least the obvious and easy to
     trigger MWAIT issue has been addressed"

* tag 'x86-core-2023-06-26' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/smp: Put CPUs into INIT on shutdown if possible
  x86/smp: Split sending INIT IPI out into a helper function
  x86/smp: Cure kexec() vs. mwait_play_dead() breakage
  x86/smp: Use dedicated cache-line for mwait_play_dead()
  x86/smp: Remove pointless wmb()s from native_stop_other_cpus()
  x86/smp: Dont access non-existing CPUID leaf
  x86/smp: Make stop_other_cpus() more robust
2023-06-26 14:45:53 -07:00
Linus Torvalds
cd336f6562 Merge tag 'timers-core-2023-06-26' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull timer updates from Thomas Gleixner:
 "Time, timekeeping and related device driver updates:

  Core:

   - A set of fixes, cleanups and enhancements to the posix timer code:

       - Prevent another possible live lock scenario in the exit() path,
         which affects POSIX_CPU_TIMERS_TASK_WORK enabled architectures.

       - Fix a loop termination issue which was reported syzcaller/KSAN
         in the posix timer ID allocation code.

         That triggered a deeper look into the posix-timer code which
         unearthed more small issues.

       - Add missing READ/WRITE_ONCE() annotations

       - Fix or remove completely outdated comments

       - Document places which are subtle and completely undocumented.

   - Add missing hrtimer modes to the trace event decoder

   - Small cleanups and enhancements all over the place

  Drivers:

   - Rework the Hyper-V clocksource and sched clock setup code

   - Remove a deprecated clocksource driver

   - Small fixes and enhancements all over the place"

* tag 'timers-core-2023-06-26' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/tip/tip: (39 commits)
  clocksource/drivers/cadence-ttc: Fix memory leak in ttc_timer_probe
  dt-bindings: timers: Add Ralink SoCs timer
  clocksource/drivers/hyper-v: Rework clocksource and sched clock setup
  dt-bindings: timer: brcm,kona-timer: convert to YAML
  clocksource/drivers/imx-gpt: Fold <soc/imx/timer.h> into its only user
  clk: imx: Drop inclusion of unused header <soc/imx/timer.h>
  hrtimer: Add missing sparse annotations to hrtimer locking
  clocksource/drivers/imx-gpt: Use only a single name for functions
  clocksource/drivers/loongson1: Move PWM timer to clocksource framework
  dt-bindings: timer: Add Loongson-1 clocksource
  MIPS: Loongson32: Remove deprecated PWM timer clocksource
  clocksource/drivers/ingenic-timer: Use pm_sleep_ptr() macro
  tracing/timer: Add missing hrtimer modes to decode_hrtimer_mode().
  posix-timers: Add sys_ni_posix_timers() prototype
  tick/rcu: Fix bogus ratelimit condition
  alarmtimer: Remove unnecessary (void *) cast
  alarmtimer: Remove unnecessary initialization of variable 'ret'
  posix-timers: Refer properly to CONFIG_HIGH_RES_TIMERS
  posix-timers: Polish coding style in a few places
  posix-timers: Remove pointless comments
  ...
2023-06-26 14:10:45 -07:00
Linus Torvalds
9244724fbf Merge tag 'smp-core-2023-06-26' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull SMP updates from Thomas Gleixner:
 "A large update for SMP management:

   - Parallel CPU bringup

     The reason why people are interested in parallel bringup is to
     shorten the (kexec) reboot time of cloud servers to reduce the
     downtime of the VM tenants.

     The current fully serialized bringup does the following per AP:

       1) Prepare callbacks (allocate, intialize, create threads)
       2) Kick the AP alive (e.g. INIT/SIPI on x86)
       3) Wait for the AP to report alive state
       4) Let the AP continue through the atomic bringup
       5) Let the AP run the threaded bringup to full online state

     There are two significant delays:

       #3 The time for an AP to report alive state in start_secondary()
          on x86 has been measured in the range between 350us and 3.5ms
          depending on vendor and CPU type, BIOS microcode size etc.

       #4 The atomic bringup does the microcode update. This has been
          measured to take up to ~8ms on the primary threads depending
          on the microcode patch size to apply.

     On a two socket SKL server with 56 cores (112 threads) the boot CPU
     spends on current mainline about 800ms busy waiting for the APs to
     come up and apply microcode. That's more than 80% of the actual
     onlining procedure.

     This can be reduced significantly by splitting the bringup
     mechanism into two parts:

       1) Run the prepare callbacks and kick the AP alive for each AP
          which needs to be brought up.

          The APs wake up, do their firmware initialization and run the
          low level kernel startup code including microcode loading in
          parallel up to the first synchronization point. (#1 and #2
          above)

       2) Run the rest of the bringup code strictly serialized per CPU
          (#3 - #5 above) as it's done today.

          Parallelizing that stage of the CPU bringup might be possible
          in theory, but it's questionable whether required surgery
          would be justified for a pretty small gain.

     If the system is large enough the first AP is already waiting at
     the first synchronization point when the boot CPU finished the
     wake-up of the last AP. That reduces the AP bringup time on that
     SKL from ~800ms to ~80ms, i.e. by a factor ~10x.

     The actual gain varies wildly depending on the system, CPU,
     microcode patch size and other factors. There are some
     opportunities to reduce the overhead further, but that needs some
     deep surgery in the x86 CPU bringup code.

     For now this is only enabled on x86, but the core functionality
     obviously works for all SMP capable architectures.

   - Enhancements for SMP function call tracing so it is possible to
     locate the scheduling and the actual execution points. That allows
     to measure IPI delivery time precisely"

* tag 'smp-core-2023-06-26' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/tip/tip: (45 commits)
  trace,smp: Add tracepoints for scheduling remotelly called functions
  trace,smp: Add tracepoints around remotelly called functions
  MAINTAINERS: Add CPU HOTPLUG entry
  x86/smpboot: Fix the parallel bringup decision
  x86/realmode: Make stack lock work in trampoline_compat()
  x86/smp: Initialize cpu_primary_thread_mask late
  cpu/hotplug: Fix off by one in cpuhp_bringup_mask()
  x86/apic: Fix use of X{,2}APIC_ENABLE in asm with older binutils
  x86/smpboot/64: Implement arch_cpuhp_init_parallel_bringup() and enable it
  x86/smpboot: Support parallel startup of secondary CPUs
  x86/smpboot: Implement a bit spinlock to protect the realmode stack
  x86/apic: Save the APIC virtual base address
  cpu/hotplug: Allow "parallel" bringup up to CPUHP_BP_KICK_AP_STATE
  x86/apic: Provide cpu_primary_thread mask
  x86/smpboot: Enable split CPU startup
  cpu/hotplug: Provide a split up CPUHP_BRINGUP mechanism
  cpu/hotplug: Reset task stack state in _cpu_up()
  cpu/hotplug: Remove unused state functions
  riscv: Switch to hotplug core state synchronization
  parisc: Switch to hotplug core state synchronization
  ...
2023-06-26 13:59:56 -07:00
Linus Torvalds
7cffdbe360 Merge tag 'x86-boot-2023-06-26' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 boot updates from Thomas Gleixner:
 "Initialize FPU late.

  Right now FPU is initialized very early during boot. There is no real
  requirement to do so. The only requirement is to have it done before
  alternatives are patched.

  That's done in check_bugs() which does way more than what the function
  name suggests.

  So first rename check_bugs() to arch_cpu_finalize_init() which makes
  it clear what this is about.

  Move the invocation of arch_cpu_finalize_init() earlier in
  start_kernel() as it has to be done before fork_init() which needs to
  know the FPU register buffer size.

  With those prerequisites the FPU initialization can be moved into
  arch_cpu_finalize_init(), which removes it from the early and fragile
  part of the x86 bringup"

* tag 'x86-boot-2023-06-26' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/mem_encrypt: Unbreak the AMD_MEM_ENCRYPT=n build
  x86/fpu: Move FPU initialization into arch_cpu_finalize_init()
  x86/fpu: Mark init functions __init
  x86/fpu: Remove cpuinfo argument from init functions
  x86/init: Initialize signal frame size late
  init, x86: Move mem_encrypt_init() into arch_cpu_finalize_init()
  init: Invoke arch_cpu_finalize_init() earlier
  init: Remove check_bugs() leftovers
  um/cpu: Switch to arch_cpu_finalize_init()
  sparc/cpu: Switch to arch_cpu_finalize_init()
  sh/cpu: Switch to arch_cpu_finalize_init()
  mips/cpu: Switch to arch_cpu_finalize_init()
  m68k/cpu: Switch to arch_cpu_finalize_init()
  loongarch/cpu: Switch to arch_cpu_finalize_init()
  ia64/cpu: Switch to arch_cpu_finalize_init()
  ARM: cpu: Switch to arch_cpu_finalize_init()
  x86/cpu: Switch to arch_cpu_finalize_init()
  init: Provide arch_cpu_finalize_init()
2023-06-26 13:39:10 -07:00
Linus Torvalds
0017387938 Merge tag 'irq-core-2023-06-26' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull irq updates from Thomas Gleixner:
 "Updates for the interrupt subsystem:

  Core:

   - Convert the interrupt descriptor storage to a maple tree to
     overcome the limitations of the radixtree + fixed size bitmap.

     This allows us to handle very large servers with a huge number of
     guests without imposing a huge memory overhead on everyone

   - Implement optional retriggering of interrupts which utilize the
     fasteoi handler to work around a GICv3 architecture issue

  Drivers:

   - A set of fixes and updates for the Loongson/Loongarch related
     drivers

   - Workaound for an ASR8601 integration hickup which ends up with CPU
     numbering which can't be represented in the GIC implementation

   - The usual set of boring fixes and updates all over the place"

* tag 'irq-core-2023-06-26' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/tip/tip: (27 commits)
  Revert "irqchip/mxs: Include linux/irqchip/mxs.h"
  irqchip/jcore-aic: Fix missing allocation of IRQ descriptors
  irqchip/stm32-exti: Fix warning on initialized field overwritten
  irqchip/stm32-exti: Add STM32MP15xx IWDG2 EXTI to GIC map
  irqchip/gicv3: Add a iort_pmsi_get_dev_id() prototype
  irqchip/mxs: Include linux/irqchip/mxs.h
  irqchip/clps711x: Remove unused clps711x_intc_init() function
  irqchip/mmp: Remove non-DT codepath
  irqchip/ftintc010: Mark all function static
  irqdomain: Include internals.h for function prototypes
  irqchip/loongson-eiointc: Add DT init support
  dt-bindings: interrupt-controller: Add Loongson EIOINTC
  irqchip/loongson-eiointc: Fix irq affinity setting during resume
  irqchip/loongson-liointc: Add IRQCHIP_SKIP_SET_WAKE flag
  irqchip/loongson-liointc: Fix IRQ trigger polarity
  irqchip/loongson-pch-pic: Fix potential incorrect hwirq assignment
  irqchip/loongson-pch-pic: Fix initialization of HT vector register
  irqchip/gic-v3-its: Enable RESEND_WHEN_IN_PROGRESS for LPIs
  genirq: Allow fasteoi handler to resend interrupts on concurrent handling
  genirq: Expand doc for PENDING and REPLAY flags
  ...
2023-06-26 13:34:39 -07:00
Linus Torvalds
cef2dd7653 Merge tag 'core-debugobjects-2023-06-26' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull debugobjects update from Thomas Gleixner:
 "A single update for debug objects:

   - Recheck whether debug objects is enabled before reporting a problem
     to avoid spamming the logs with messages which are caused by a
     concurrent OOM"

* tag 'core-debugobjects-2023-06-26' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  debugobjects: Recheck debug_objects_enabled before reporting
2023-06-26 13:33:00 -07:00
Jakub Kicinski
61dc651cdf Merge tag 'nf-next-23-06-26' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf-next
Pablo Neira Ayuso says:

====================
Netfilter/IPVS updates for net-next

1) Allow slightly larger IPVS connection table size from Kconfig for
   64-bit arch, from Abhijeet Rastogi.

2) Since IPVS connection table might be larger than 2^20 after previous
   patch, allow to limit it depending on the available memory.
   Moreover, use kvmalloc. From Julian Anastasov.

3) Do not rebuild VLAN header in nft_payload when matching source and
   destination MAC address.

4) Remove nested rcu read lock side in ip_set_test(), from Florian Westphal.

5) Allow to update set size, also from Florian.

6) Improve NAT tuple selection when connection is closing,
   from Florian Westphal.

7) Support for resetting set element stateful expression, from Phil Sutter.

8) Use NLA_POLICY_MAX to narrow down maximum attribute value in nf_tables,
   from Florian Westphal.

* tag 'nf-next-23-06-26' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf-next:
  netfilter: nf_tables: limit allowed range via nla_policy
  netfilter: nf_tables: Introduce NFT_MSG_GETSETELEM_RESET
  netfilter: snat: evict closing tcp entries on reply tuple collision
  netfilter: nf_tables: permit update of set size
  netfilter: ipset: remove rcu_read_lock_bh pair from ip_set_test
  netfilter: nft_payload: rebuild vlan header when needed
  ipvs: dynamically limit the connection hash table
  ipvs: increase ip_vs_conn_tab_bits range for 64BIT
====================

Link: https://lore.kernel.org/r/20230626064749.75525-1-pablo@netfilter.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-26 12:59:18 -07:00
Linus Torvalds
a0433f8cae Merge tag 'for-6.5/block-2023-06-23' of git://git.kernel.dk/linux
Pull block updates from Jens Axboe:

 - NVMe pull request via Keith:
      - Various cleanups all around (Irvin, Chaitanya, Christophe)
      - Better struct packing (Christophe JAILLET)
      - Reduce controller error logs for optional commands (Keith)
      - Support for >=64KiB block sizes (Daniel Gomez)
      - Fabrics fixes and code organization (Max, Chaitanya, Daniel
        Wagner)

 - bcache updates via Coly:
      - Fix a race at init time (Mingzhe Zou)
      - Misc fixes and cleanups (Andrea, Thomas, Zheng, Ye)

 - use page pinning in the block layer for dio (David)

 - convert old block dio code to page pinning (David, Christoph)

 - cleanups for pktcdvd (Andy)

 - cleanups for rnbd (Guoqing)

 - use the unchecked __bio_add_page() for the initial single page
   additions (Johannes)

 - fix overflows in the Amiga partition handling code (Michael)

 - improve mq-deadline zoned device support (Bart)

 - keep passthrough requests out of the IO schedulers (Christoph, Ming)

 - improve support for flush requests, making them less special to deal
   with (Christoph)

 - add bdev holder ops and shutdown methods (Christoph)

 - fix the name_to_dev_t() situation and use cases (Christoph)

 - decouple the block open flags from fmode_t (Christoph)

 - ublk updates and cleanups, including adding user copy support (Ming)

 - BFQ sanity checking (Bart)

 - convert brd from radix to xarray (Pankaj)

 - constify various structures (Thomas, Ivan)

 - more fine grained persistent reservation ioctl capability checks
   (Jingbo)

 - misc fixes and cleanups (Arnd, Azeem, Demi, Ed, Hengqi, Hou, Jan,
   Jordy, Li, Min, Yu, Zhong, Waiman)

* tag 'for-6.5/block-2023-06-23' of git://git.kernel.dk/linux: (266 commits)
  scsi/sg: don't grab scsi host module reference
  ext4: Fix warning in blkdev_put()
  block: don't return -EINVAL for not found names in devt_from_devname
  cdrom: Fix spectre-v1 gadget
  block: Improve kernel-doc headers
  blk-mq: don't insert passthrough request into sw queue
  bsg: make bsg_class a static const structure
  ublk: make ublk_chr_class a static const structure
  aoe: make aoe_class a static const structure
  block/rnbd: make all 'class' structures const
  block: fix the exclusive open mask in disk_scan_partitions
  block: add overflow checks for Amiga partition support
  block: change all __u32 annotations to __be32 in affs_hardblocks.h
  block: fix signed int overflow in Amiga partition support
  block: add capacity validation in bdev_add_partition()
  block: fine-granular CAP_SYS_ADMIN for Persistent Reservation
  block: disallow Persistent Reservation on partitions
  reiserfs: fix blkdev_put() warning from release_journal_dev()
  block: fix wrong mode for blkdev_get_by_dev() from disk_scan_partitions()
  block: document the holder argument to blkdev_get_by_path
  ...
2023-06-26 12:47:20 -07:00
Linus Torvalds
0aa69d53ac Merge tag 'for-6.5/io_uring-2023-06-23' of git://git.kernel.dk/linux
Pull io_uring updates from Jens Axboe:
 "Nothing major in this release, just a bunch of cleanups and some
  optimizations around networking mostly.

   - clean up file request flags handling (Christoph)

   - clean up request freeing and CQ locking (Pavel)

   - support for using pre-registering the io_uring fd at setup time
     (Josh)

   - Add support for user allocated ring memory, rather than having the
     kernel allocate it. Mostly for packing rings into a huge page (me)

   - avoid an unnecessary double retry on receive (me)

   - maintain ordering for task_work, which also improves performance
     (me)

   - misc cleanups/fixes (Pavel, me)"

* tag 'for-6.5/io_uring-2023-06-23' of git://git.kernel.dk/linux: (39 commits)
  io_uring: merge conditional unlock flush helpers
  io_uring: make io_cq_unlock_post static
  io_uring: inline __io_cq_unlock
  io_uring: fix acquire/release annotations
  io_uring: kill io_cq_unlock()
  io_uring: remove IOU_F_TWQ_FORCE_NORMAL
  io_uring: don't batch task put on reqs free
  io_uring: move io_clean_op()
  io_uring: inline io_dismantle_req()
  io_uring: remove io_free_req_tw
  io_uring: open code io_put_req_find_next
  io_uring: add helpers to decode the fixed file file_ptr
  io_uring: use io_file_from_index in io_msg_grab_file
  io_uring: use io_file_from_index in __io_sync_cancel
  io_uring: return REQ_F_ flags from io_file_get_flags
  io_uring: remove io_req_ffs_set
  io_uring: remove a confusing comment above io_file_get_flags
  io_uring: remove the mode variable in io_file_get_flags
  io_uring: remove __io_file_supports_nowait
  io_uring: wait interruptibly for request completions on exit
  ...
2023-06-26 12:30:26 -07:00
Linus Torvalds
3eccc0c886 Merge tag 'for-6.5/splice-2023-06-23' of git://git.kernel.dk/linux
Pull splice updates from Jens Axboe:
 "This kills off ITER_PIPE to avoid a race between truncate,
  iov_iter_revert() on the pipe and an as-yet incomplete DMA to a bio
  with unpinned/unref'ed pages from an O_DIRECT splice read. This causes
  memory corruption.

  Instead, we either use (a) filemap_splice_read(), which invokes the
  buffered file reading code and splices from the pagecache into the
  pipe; (b) copy_splice_read(), which bulk-allocates a buffer, reads
  into it and then pushes the filled pages into the pipe; or (c) handle
  it in filesystem-specific code.

  Summary:

   - Rename direct_splice_read() to copy_splice_read()

   - Simplify the calculations for the number of pages to be reclaimed
     in copy_splice_read()

   - Turn do_splice_to() into a helper, vfs_splice_read(), so that it
     can be used by overlayfs and coda to perform the checks on the
     lower fs

   - Make vfs_splice_read() jump to copy_splice_read() to handle
     direct-I/O and DAX

   - Provide shmem with its own splice_read to handle non-existent pages
     in the pagecache. We don't want a ->read_folio() as we don't want
     to populate holes, but filemap_get_pages() requires it

   - Provide overlayfs with its own splice_read to call down to a lower
     layer as overlayfs doesn't provide ->read_folio()

   - Provide coda with its own splice_read to call down to a lower layer
     as coda doesn't provide ->read_folio()

   - Direct ->splice_read to copy_splice_read() in tty, procfs, kernfs
     and random files as they just copy to the output buffer and don't
     splice pages

   - Provide wrappers for afs, ceph, ecryptfs, ext4, f2fs, nfs, ntfs3,
     ocfs2, orangefs, xfs and zonefs to do locking and/or revalidation

   - Make cifs use filemap_splice_read()

   - Replace pointers to generic_file_splice_read() with pointers to
     filemap_splice_read() as DIO and DAX are handled in the caller;
     filesystems can still provide their own alternate ->splice_read()
     op

   - Remove generic_file_splice_read()

   - Remove ITER_PIPE and its paraphernalia as generic_file_splice_read
     was the only user"

* tag 'for-6.5/splice-2023-06-23' of git://git.kernel.dk/linux: (31 commits)
  splice: kdoc for filemap_splice_read() and copy_splice_read()
  iov_iter: Kill ITER_PIPE
  splice: Remove generic_file_splice_read()
  splice: Use filemap_splice_read() instead of generic_file_splice_read()
  cifs: Use filemap_splice_read()
  trace: Convert trace/seq to use copy_splice_read()
  zonefs: Provide a splice-read wrapper
  xfs: Provide a splice-read wrapper
  orangefs: Provide a splice-read wrapper
  ocfs2: Provide a splice-read wrapper
  ntfs3: Provide a splice-read wrapper
  nfs: Provide a splice-read wrapper
  f2fs: Provide a splice-read wrapper
  ext4: Provide a splice-read wrapper
  ecryptfs: Provide a splice-read wrapper
  ceph: Provide a splice-read wrapper
  afs: Provide a splice-read wrapper
  9p: Add splice_read wrapper
  net: Make sock_splice_read() use copy_splice_read() by default
  tty, proc, kernfs, random: Use copy_splice_read()
  ...
2023-06-26 11:52:12 -07:00
Linus Torvalds
cc423f6337 Merge tag 'for-6.5-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux
Pull btrfs updates from David Sterba:
 "Mainly core changes, refactoring and optimizations.

  Performance is improved in some areas, overall there may be a
  cumulative improvement due to refactoring that removed lookups in the
  IO path or simplified IO submission tracking.

  Core:

   - submit IO synchronously for fast checksums (crc32c and xxhash),
     remove high priority worker kthread

   - read extent buffer in one go, simplify IO tracking, bio submission
     and locking

   - remove additional tracking of redirtied extent buffers, originally
     added for zoned mode but actually not needed

   - track ordered extent pointer in bio to avoid rbtree lookups during
     IO

   - scrub, use recovered data stripes as cache to avoid unnecessary
     read

   - in zoned mode, optimize logical to physical mappings of extents

   - remove PageError handling, not set by VFS nor writeback

   - cleanups, refactoring, better structure packing

   - lots of error handling improvements

   - more assertions, lockdep annotations

   - print assertion failure with the exact line where it happens

   - tracepoint updates

   - more debugging prints

  Performance:

   - speedup in fsync(), better tracking of inode logged status can
     avoid transaction commit

   - IO path structures track logical offsets in data structures and
     does not need to look it up

  User visible changes:

   - don't commit transaction for every created subvolume, this can
     reduce time when many subvolumes are created in a batch

   - print affected files when relocation fails

   - trigger orphan file cleanup during START_SYNC ioctl

  Notable fixes:

   - fix crash when disabling quota and relocation

   - fix crashes when removing roots from drity list

   - fix transacion abort during relocation when converting from newer
     profiles not covered by fallback

   - in zoned mode, stop reclaiming block groups if filesystem becomes
     read-only

   - fix rare race condition in tree mod log rewind that can miss some
     btree node slots

   - with enabled fsverity, drop up-to-date page bit in case the
     verification fails"

* tag 'for-6.5-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux: (194 commits)
  btrfs: fix race between quota disable and relocation
  btrfs: add comment to struct btrfs_fs_info::dirty_cowonly_roots
  btrfs: fix race when deleting free space root from the dirty cow roots list
  btrfs: fix race when deleting quota root from the dirty cow roots list
  btrfs: tracepoints: also show actual number of the outstanding extents
  btrfs: update i_version in update_dev_time
  btrfs: make btrfs_compressed_bioset static
  btrfs: add handling for RAID1C23/DUP to btrfs_reduce_alloc_profile
  btrfs: scrub: remove btrfs_fs_info::scrub_wr_completion_workers
  btrfs: scrub: remove scrub_ctx::csum_list member
  btrfs: do not BUG_ON after failure to migrate space during truncation
  btrfs: do not BUG_ON on failure to get dir index for new snapshot
  btrfs: send: do not BUG_ON() on unexpected symlink data extent
  btrfs: do not BUG_ON() when dropping inode items from log root
  btrfs: replace BUG_ON() at split_item() with proper error handling
  btrfs: do not BUG_ON() on tree mod log failures at btrfs_del_ptr()
  btrfs: do not BUG_ON() on tree mod log failures at insert_ptr()
  btrfs: do not BUG_ON() on tree mod log failure at insert_new_root()
  btrfs: do not BUG_ON() on tree mod log failures at push_nodes_for_insert()
  btrfs: abort transaction at update_ref_for_cow() when ref count is zero
  ...
2023-06-26 11:41:38 -07:00
Linus Torvalds
e940efa936 Merge tag 'zonefs-6.5-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal/zonefs
Pull zonefs updates from Damien Le Moal:

 - Modify the synchronous direct write path to use iomap instead of
   manually coding issuing zone append write BIOs (me)

 - Use the FMODE_CAN_ODIRECT file flag to indicate support from direct
   IO instead of using the old way with noop direct_io methods
   (Christoph)

* tag 'zonefs-6.5-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal/zonefs:
  zonefs: set FMODE_CAN_ODIRECT instead of a dummy direct_IO method
  zonefs: use iomap for synchronous direct writes
2023-06-26 11:29:29 -07:00
Linus Torvalds
098c5dd9cf Merge tag 'erofs-for-6.5-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs
Pull erofs updates from Gao Xiang:
 "No outstanding new feature for this cycle.

  Most of these commits are decompression cleanups which are part of the
  ongoing development for subpage/folio compression support as well as
  xattr cleanups for the upcoming xattr bloom filter optimization [1].

  In addition, there are bugfixes to address some corner cases of
  compressed images due to global data de-duplication and arm64 16k
  pages.

  Summary:

   - Fix rare I/O hang on deduplicated compressed images due to loop
     hooked chains

   - Fix compact compression layout of 16k blocks on arm64 devices

   - Fix atomic context detection of async decompression

   - Decompression/Xattr code cleanups"

Link: https://lore.kernel.org/r/20230621083209.116024-1-jefflexu@linux.alibaba.com [1]

* tag 'erofs-for-6.5-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs:
  erofs: clean up zmap.c
  erofs: remove unnecessary goto
  erofs: Fix detection of atomic context
  erofs: use separate xattr parsers for listxattr/getxattr
  erofs: unify inline/shared xattr iterators for listxattr/getxattr
  erofs: make the size of read data stored in buffer_ofs
  erofs: unify xattr_iter structures
  erofs: use absolute position in xattr iterator
  erofs: fix compact 4B support for 16k block size
  erofs: convert erofs_read_metabuf() to erofs_bread() for xattr
  erofs: use poison pointer to replace the hard-coded address
  erofs: use struct lockref to replace handcrafted approach
  erofs: adapt managed inode operations into folios
  erofs: kill hooked chains to avoid loops on deduplicated compressed images
  erofs: avoid on-stack pagepool directly passed by arguments
  erofs: allocate extra bvec pages directly instead of retrying
  erofs: clean up z_erofs_pcluster_readmore()
  erofs: remove the member readahead from struct z_erofs_decompress_frontend
  erofs: fold in z_erofs_decompress()
2023-06-26 11:00:18 -07:00
Bjorn Helgaas
6ecac465ee Merge branch 'pci/controller/remove-void-callbacks'
- Convert platform_device .remove() callbacks to return void instead of a
  mostly useless int (Uwe Kleine-König)

* pci/controller/remove-void-callbacks:
  PCI: xgene-msi: Convert to platform remove callback returning void
  PCI: tegra: Convert to platform remove callback returning void
  PCI: rockchip-host: Convert to platform remove callback returning void
  PCI: mvebu: Convert to platform remove callback returning void
  PCI: mt7621: Convert to platform remove callback returning void
  PCI: mediatek-gen3: Convert to platform remove callback returning void
  PCI: mediatek: Convert to platform remove callback returning void
  PCI: iproc: Convert to platform remove callback returning void
  PCI: hisi-error: Convert to platform remove callback returning void
  PCI: dwc: Convert to platform remove callback returning void
  PCI: j721e: Convert to platform remove callback returning void
  PCI: brcmstb: Convert to platform remove callback returning void
  PCI: altera-msi: Convert to platform remove callback returning void
  PCI: altera: Convert to platform remove callback returning void
  PCI: aardvark: Convert to platform remove callback returning void
2023-06-26 13:00:00 -05:00
Bjorn Helgaas
d8c226ac1f Merge branch 'pci/controller/endpoint'
- Change "PCI Endpoint Virtual NTB driver" Kconfig prompt to be different
  from "PCI Endpoint NTB driver" (Shunsuke Mie)

- Automatically create a function specific attributes group for endpoint
  drivers to avoid reference counting issues (Damien Le Moal)

- Move and unexport pci_epf_type_add_cfs() (Damien Le Moal)

- Reinitialize EPF test DMA transfer completion before submitting it to
  avoid losing the completion notification (Damien Le Moal)

- Fix EPF test DMA transfer completion detection (Damien Le Moal)

- Submit EPF test DMA transfers with dmaengine_submit(), not tx_submit()
  (Damien Le Moal)

- Simplify EPF test read/write/copy functions (Damien Le Moal)

- Simplify EPF test "raise IRQ" interface (Damien Le Moal)

- Simplify EPF test IRQ command execution (Damien Le Moal)

- Improve EPF test command/status register handling (Damien Le Moal)

- Free IRQs before removing device (Damien Le Moal)

- Reinitialize IRQ completions for every test (Damien Le Moal)

- Don't write status in IRQ handler to avoid race (Damien Le Moal)

- Fix dma_chan direction in data transfer test (Yoshihiro Shimoda)

- Return pci_epf_type_add_cfs() error if EPF has no driver (Damien Le Moal)

- Add kernel-doc for pci_epc_raise_irq() and pci_epc_map_msi_irq() MSI
  vector parameters (Manivannan Sadhasivam)

- Pass EPF device ID to driver probe functions (Manivannan Sadhasivam)

- Return -EALREADY if EPC has already been started/stopped (Manivannan
  Sadhasivam)

- Add linkdown notifier support and use it in qcom-ep (Manivannan
  Sadhasivam)

- Add Bus Master Enable event support and use it in qcom-ep (Manivannan
  Sadhasivam)

- Add Qualcomm Modem Host Interface (MHI) endpoint driver (Manivannan
  Sadhasivam)

- Add Layerscape PME interrupt handling to manage link-up notification
  (Frank Li)

* pci/controller/endpoint:
  PCI: layerscape: Add the endpoint linkup notifier support
  PCI: endpoint: pci-epf-vntb: Fix typo in comments
  MAINTAINERS: Add PCI MHI endpoint function driver under MHI bus
  PCI: endpoint: Add PCI Endpoint function driver for MHI bus
  PCI: qcom-ep: Add support for BME notification
  PCI: qcom-ep: Add support for Link down notification
  PCI: endpoint: Add BME notifier support
  PCI: endpoint: Add linkdown notifier support
  PCI: endpoint: Return error if EPC is started/stopped multiple times
  PCI: endpoint: Pass EPF device ID to the probe function
  PCI: endpoint: Add missing documentation about the MSI/MSI-X range
  PCI: endpoint: Improve pci_epf_type_add_cfs()
  PCI: endpoint: functions/pci-epf-test: Fix dma_chan direction
  misc: pci_endpoint_test: Simplify pci_endpoint_test_msi_irq()
  misc: pci_endpoint_test: Do not write status in IRQ handler
  misc: pci_endpoint_test: Re-init completion for every test
  misc: pci_endpoint_test: Free IRQs before removing the device
  PCI: epf-test: Simplify transfers result print
  PCI: epf-test: Simplify DMA support checks
  PCI: epf-test: Cleanup request result handling
  PCI: epf-test: Cleanup pci_epf_test_cmd_handler()
  PCI: epf-test: Improve handling of command and status registers
  PCI: epf-test: Simplify IRQ test commands execution
  PCI: epf-test: Simplify pci_epf_test_raise_irq()
  PCI: epf-test: Simplify read/write/copy test functions
  PCI: epf-test: Use dmaengine_submit() to initiate DMA transfer
  PCI: epf-test: Fix DMA transfer completion detection
  PCI: epf-test: Fix DMA transfer completion initialization
  PCI: endpoint: Move pci_epf_type_add_cfs() code
  PCI: endpoint: Automatically create a function specific attributes group
  PCI: endpoint: Fix a Kconfig prompt of vNTB driver
2023-06-26 13:00:00 -05:00
Bjorn Helgaas
b5abb12cdd Merge branch 'pci/controller/vmd'
- Reset VMD config register between soft reboots (Nirmal Patel)

- Capture pci_reset_bus() return value instead of printing junk when it
  fails (Xinghui Li)

* pci/controller/vmd:
  PCI: vmd: Fix uninitialized variable usage in vmd_enable_domain()
  PCI: vmd: Reset VMD config register between soft reboots
2023-06-26 12:59:59 -05:00
Bjorn Helgaas
9f5eb1bf55 Merge branch 'pci/controller/rockchip'
- Remove writes to unused registers (Rick Wertenbroek)

- Write endpoint Device ID using correct register (Rick Wertenbroek)

- Assert PCI Configuration Enable bit after probe so endpoint responds
  instead of generating Request Retry Status messages (Rick Wertenbroek)

- Poll waiting for PHY PLLs to lock (Rick Wertenbroek)

- Update RK3399 example DT binding to be valid (Rick Wertenbroek)

- Use RK3399 PCIE_CLIENT_LEGACY_INT_CTRL to generate INTx instead of
  manually generating PCIe message (Rick Wertenbroek)

- Use multiple windows to avoid address translation conflicts (Rick
  Wertenbroek)

- Use u32 (not u16) when accessing 32-bit registers (Rick Wertenbroek)

- Hide MSI-X Capability, since RK3399 can't generate MSI-X (Rick
  Wertenbroek)

- Set endpoint controller required alignment to 256 (Damien Le Moal)

* pci/controller/rockchip:
  PCI: rockchip: Set address alignment for endpoint mode
  PCI: rockchip: Don't advertise MSI-X in PCIe capabilities
  PCI: rockchip: Use u32 variable to access 32-bit registers
  PCI: rockchip: Fix window mapping and address translation for endpoint
  PCI: rockchip: Fix legacy IRQ generation for RK3399 PCIe endpoint core
  dt-bindings: PCI: Update the RK3399 example to a valid one
  PCI: rockchip: Add poll and timeout to wait for PHY PLLs to be locked
  PCI: rockchip: Assert PCI Configuration Enable bit after probe
  PCI: rockchip: Write PCI Device ID to correct register
  PCI: rockchip: Remove writes to unused registers
2023-06-26 12:59:59 -05:00
Bjorn Helgaas
9cd5f2cec7 Merge branch 'pci/controller/rcar'
- Remove unused static pcie_base and pcie_dev (Geert Uytterhoeven)

* pci/controller/rcar:
  PCI: rcar: Use correct product family name for Renesas R-Car
  PCI: rcar-host: Remove unused static pcie_base and pcie_dev
2023-06-26 12:59:59 -05:00
Bjorn Helgaas
5c13b3c19a Merge branch 'pci/controller/qcom'
- Disable register write access after init for IP v2.3.3, v2.9.0
  (Manivannan Sadhasivam)

- Use DWC helpers for enabling/disabling writes to DBI registers
  (Manivannan Sadhasivam)

- Hide slot hotplug capability for IP v1.0.0, v1.9.0, v2.1.0, v2.3.2,
  v2.3.3, v2.7.0, v2.9.0 (Manivannan Sadhasivam)

- Reuse v2.3.2 post-init sequence for v2.4.0 (Manivannan Sadhasivam)

-

* pci/controller/qcom:
  PCI: qcom: Do not advertise hotplug capability for IP v2.1.0
  PCI: qcom: Do not advertise hotplug capability for IP v1.0.0
  PCI: qcom: Use post init sequence of IP v2.3.2 for v2.4.0
  PCI: qcom: Do not advertise hotplug capability for IP v2.3.2
  PCI: qcom: Do not advertise hotplug capability for IPs v2.3.3 and v2.9.0
  PCI: qcom: Do not advertise hotplug capability for IPs v2.7.0 and v1.9.0
  PCI: qcom: Disable write access to read only registers for IP v2.9.0
  PCI: qcom: Use DWC helpers for modifying the read-only DBI registers
  PCI: qcom: Disable write access to read only registers for IP v2.3.3
2023-06-26 12:59:59 -05:00
Bjorn Helgaas
69fa3ef3d2 Merge branch 'pci/pci/ftpci100'
- Release clock resources on error paths (Junyan Ye)

* pci/pci/ftpci100:
  PCI: ftpci100: Release the clock resources
2023-06-26 12:59:58 -05:00
Bjorn Helgaas
99f7b80906 Merge branch 'pci/controller/dwc'
- Wait for link to come up only if we've initiated link training (Ajay
  Agarwal)

- Save and restore imx6 Root Port MSI control to work around hardware
  defect (Richard Zhu)

* pci/controller/dwc:
  PCI: imx6: Save and restore root port MSI control in suspend and resume
  PCI: dwc: Wait for link up only if link is started
2023-06-26 12:59:58 -05:00
Bjorn Helgaas
375328faa2 Merge branch 'pci/controller/cadence'
- Wait for link retrain to complete when working around the J721E i2085
  erratum with Gen2 mode (Siddharth Vadapalli)

* pci/controller/cadence:
  PCI: cadence: Fix Gen2 Link Retraining process
2023-06-26 12:59:58 -05:00
Bjorn Helgaas
41370553c0 Merge branch 'pci/controller/dt'
- Add Qualcomm SDX65 endpoint DT compatible string (Rohit Agarwal)

* pci/controller/dt:
  dt-bindings: PCI: qcom: Add SDX65 SoC
2023-06-26 12:59:57 -05:00
Bjorn Helgaas
30fec3b884 Merge branch 'pci/misc'
- Add pci_clear_master() stub for non-CONFIG_PCI (Sui Jingfeng)

- Correct documentation typos (Randy Dunlap)

* pci/misc:
  Documentation: PCI: correct spelling
  PCI: Add pci_clear_master() stub for non-CONFIG_PCI
  PCI: Expand comment about sorting pci_ids.h entries
2023-06-26 12:59:57 -05:00
Bjorn Helgaas
283810ac54 Merge branch 'pci/virtualization'
- Delay extra 250ms after FLR of Solidigm P44 Pro NVMe to avoid KVM hang
  when guest is rebooted (Mike Pastore)

- Add function 1 DMA alias quirk for Marvell 88SE9235 (Robin Murphy)

* pci/virtualization:
  PCI: Add function 1 DMA alias quirk for Marvell 88SE9235
  PCI: Delay after FLR of Solidigm P44 Pro NVMe
2023-06-26 12:59:57 -05:00
Bjorn Helgaas
d0b7b3a422 Merge branch 'pci/resource'
- When we coalesce host bridge windows, remove invalidated resources from
  the resource tree so future allocations work correctly (Ross Lagerwall)

* pci/resource:
  PCI: Release resource invalidated by coalescing
2023-06-26 12:59:57 -05:00
Bjorn Helgaas
7e229f0e05 Merge branch 'pci/pm'
- Reduce wait time for secondary bus to be ready to speed up resume (Mika
  Westerberg)

- Avoid putting EloPOS E2/S2/H2 (as well as Elo i2) PCIe Ports in D3cold
  (Ondrej Zary)

- Call _REG when transitioning D-states so AML that uses the PCI config
  space OpRegion works, which fixes some ASMedia GPIO controllers (Mario
  Limonciello)

* pci/pm:
  PCI/ACPI: Call _REG when transitioning D-states
  PCI/ACPI: Validate acpi_pci_set_power_state() parameter
  PCI/PM: Avoid putting EloPOS E2/S2/H2 PCIe Ports in D3cold
  PCI/PM: Shorten pci_bridge_wait_for_secondary_bus() wait time for slow links
2023-06-26 12:59:56 -05:00
Bjorn Helgaas
db5ccb2eda Merge branch 'pci/hotplug'
- Simplify Attention Button logging (Bjorn Helgaas)

- Cancel bringup sequence if card is not present, to keep from blinking
  Power Indicator indefinitely (Rongguang Wei)

- Reassign bridge resources if necessary for ACPI hotplug (Igor Mammedov)

* pci/hotplug:
  PCI: acpiphp: Reassign resources on bridge if necessary
  PCI: pciehp: Cancel bringup sequence if card is not present
  PCI: pciehp: Simplify Attention Button logging
2023-06-26 12:59:56 -05:00
Bjorn Helgaas
1abb473903 Merge branch 'pci/enumeration'
- Add PCI_EXT_CAP_ID_PL_32GT define (Ben Dooks)

- Propagate firmware node by calling device_set_node() for better
  modularity (Andy Shevchenko)

- Discover Data Link Layer Link Active Reporting earlier so quirks can take
  advantage of it (Maciej W. Rozycki)

- Use cached Data Link Layer Link Active Reporting capability in pciehp,
  powerpc/eeh, and mlx5 (Maciej W. Rozycki)

- Run quirk for devices that require OS to clear Retrain Link earlier, so
  later quirks can rely on it (Maciej W. Rozycki)

- Export pcie_retrain_link() for use outside ASPM (Maciej W. Rozycki)

- Add Data Link Layer Link Active Reporting as another way for
  pcie_retrain_link() to determine the link is up (Maciej W. Rozycki)

- Work around link training failures (especially on the ASMedia ASM2824
  switch) by training first at 2.5GT/s and then attempting higher rates
  (Maciej W. Rozycki)

* pci/enumeration:
  PCI: Add failed link recovery for device reset events
  PCI: Work around PCIe link training failures
  PCI: Use pcie_wait_for_link_status() in pcie_wait_for_link_delay()
  PCI: Add support for polling DLLLA to pcie_retrain_link()
  PCI: Export pcie_retrain_link() for use outside ASPM
  PCI: Export PCIe link retrain timeout
  PCI: Execute quirk_enable_clear_retrain_link() earlier
  PCI/ASPM: Factor out waiting for link training to complete
  PCI/ASPM: Avoid unnecessary pcie_link_state use
  PCI/ASPM: Use distinct local vars in pcie_retrain_link()
  net/mlx5: Rely on dev->link_active_reporting
  powerpc/eeh: Rely on dev->link_active_reporting
  PCI: pciehp: Rely on dev->link_active_reporting
  PCI: Initialize dev->link_active_reporting earlier
  PCI: of: Propagate firmware node by calling device_set_node()
  PCI: Add PCI_EXT_CAP_ID_PL_32GT define

# Conflicts:
#	drivers/pci/pcie/aspm.c
2023-06-26 12:59:56 -05:00
Bjorn Helgaas
0f32114ea0 Merge branch 'pci/aspm'
- Disable ASPM on MFD function removal to avoid use-after-free (Ding Hui)

- Tighten up pci_enable_link_state() and pci_disable_link_state()
  interfaces so they don't enable/disable states the driver didn't specify
  (Ajay Agarwal)

- Avoid link retraining race that can happen if ASPM sets link control
  parameters while the link is in the midst of training for some other
  reason (Ilpo Järvinen)

* pci/aspm:
  PCI/ASPM: Avoid link retraining race
  PCI/ASPM: Factor out pcie_wait_for_retrain()
  PCI/ASPM: Return 0 or -ETIMEDOUT from  pcie_retrain_link()
  PCI/ASPM: Remove unnecessary ASPM_STATE_L1SS check
  PCI/ASPM: Rename L1.2-specific functions from 'l1ss' to 'l12'
  PCI/ASPM: Set ASPM_STATE_L1 when driver enables L1.1 or L1.2
  PCI/ASPM: Set only ASPM_STATE_L1 when driver enables L1
  PCI/ASPM: Disable only ASPM_STATE_L1 when driver disables L1
  PCI/ASPM: Disable ASPM on MFD function removal to avoid use-after-free
2023-06-26 12:59:55 -05:00
Bjorn Helgaas
a274a4e65f Merge branch 'pci/aer'
- Unexport pci_save_aer_state() since it's only used in drivers/pci/ (Bjorn
  Helgaas)

- Drop recommendation for drivers to configure AER Capability, since the
  PCI core does this for all devices (Dave Jiang, Bjorn Helgaas)

* pci/aer:
  Documentation: PCI: Tidy AER documentation
  Documentation: PCI: Update cross references to .rst files
  Documentation: PCI: Drop recommendation to configure AER Capability
  PCI: Unexport pci_save_aer_state()
2023-06-26 12:59:55 -05:00
Linus Torvalds
74774e243c Merge tag 'fsverity-for-linus' of git://git.kernel.org/pub/scm/fs/fsverity/linux
Pull fsverity updates from Eric Biggers:
 "Several updates for fs/verity/:

   - Do all hashing with the shash API instead of with the ahash API.

     This simplifies the code and reduces API overhead. It should also
     make things slightly easier for XFS's upcoming support for
     fsverity. It does drop fsverity's support for off-CPU hash
     accelerators, but that support was incomplete and not known to be
     used

   - Update and export fsverity_get_digest() so that it's ready for
     overlayfs's upcoming support for fsverity checking of lowerdata

   - Improve the documentation for builtin signature support

   - Fix a bug in the large folio support"

* tag 'fsverity-for-linus' of git://git.kernel.org/pub/scm/fs/fsverity/linux:
  fsverity: improve documentation for builtin signature support
  fsverity: rework fsverity_get_digest() again
  fsverity: simplify error handling in verify_data_block()
  fsverity: don't use bio_first_page_all() in fsverity_verify_bio()
  fsverity: constify fsverity_hash_alg
  fsverity: use shash API instead of ahash API
2023-06-26 10:56:13 -07:00
Linus Torvalds
4d483ab702 Merge tag 'fscrypt-for-linus' of git://git.kernel.org/pub/scm/fs/fscrypt/linux
Pull fscrypt update from Eric Biggers:
 "Just one flex array conversion patch"

* tag 'fscrypt-for-linus' of git://git.kernel.org/pub/scm/fs/fscrypt/linux:
  fscrypt: Replace 1-element array with flexible array
2023-06-26 10:54:55 -07:00
Linus Torvalds
f7976a6493 Merge tag 'nfsd-6.5' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux
Pull nfsd updates from Chuck Lever:

 - Clean-ups in the READ path in anticipation of MSG_SPLICE_PAGES

 - Better NUMA awareness when allocating pages and other objects

 - A number of minor clean-ups to XDR encoding

 - Elimination of a race when accepting a TCP socket

 - Numerous observability enhancements

* tag 'nfsd-6.5' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux: (46 commits)
  nfsd: remove redundant assignments to variable len
  svcrdma: Fix stale comment
  NFSD: Distinguish per-net namespace initialization
  nfsd: move init of percpu reply_cache_stats counters back to nfsd_init_net
  SUNRPC: Address RCU warning in net/sunrpc/svc.c
  SUNRPC: Use sysfs_emit in place of strlcpy/sprintf
  SUNRPC: Remove transport class dprintk call sites
  SUNRPC: Fix comments for transport class registration
  svcrdma: Remove an unused argument from __svc_rdma_put_rw_ctxt()
  svcrdma: trace cc_release calls
  svcrdma: Convert "might sleep" comment into a code annotation
  NFSD: Add an nfsd4_encode_nfstime4() helper
  SUNRPC: Move initialization of rq_stime
  SUNRPC: Optimize page release in svc_rdma_sendto()
  svcrdma: Prevent page release when nothing was received
  svcrdma: Revert 2a1e4f21d8 ("svcrdma: Normalize Send page handling")
  SUNRPC: Revert 579900670a ("svcrdma: Remove unused sc_pages field")
  SUNRPC: Revert cc93ce9529 ("svcrdma: Retain the page backing rq_res.head[0].iov_base")
  NFSD: add encoding of op_recall flag for write delegation
  NFSD: Add "official" reviewers for this subsystem
  ...
2023-06-26 10:48:57 -07:00
Linus Torvalds
c0a572d9d3 Merge tag 'v6.5/vfs.mount' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs
Pull vfs mount updates from Christian Brauner:
 "This contains the work to extend move_mount() to allow adding a mount
  beneath the topmost mount of a mount stack.

  There are two LWN articles about this. One covers the original patch
  series in [1]. The other in [2] summarizes the session and roughly the
  discussion between Al and me at LSFMM. The second article also goes
  into some good questions from attendees.

  Since all details are found in the relevant commit with a technical
  dive into semantics and locking at the end I'm only adding the
  motivation and core functionality for this from commit message and
  leave out the invasive details. The code is also heavily commented and
  annotated as well which was explicitly requested.

  TL;DR:

    > mount -t ext4 /dev/sda /mnt
      |
      └─/mnt    /dev/sda    ext4

    > mount --beneath -t xfs /dev/sdb /mnt
      |
      └─/mnt    /dev/sdb    xfs
        └─/mnt  /dev/sda    ext4

    > umount /mnt
      |
      └─/mnt    /dev/sdb    xfs

  The longer motivation is that various distributions are adding or are
  in the process of adding support for system extensions and in the
  future configuration extensions through various tools. A more detailed
  explanation on system and configuration extensions can be found on the
  manpage which is listed below at [3].

  System extension images may – dynamically at runtime — extend the
  /usr/ and /opt/ directory hierarchies with additional files. This is
  particularly useful on immutable system images where a /usr/ and/or
  /opt/ hierarchy residing on a read-only file system shall be extended
  temporarily at runtime without making any persistent modifications.

  When one or more system extension images are activated, their /usr/
  and /opt/ hierarchies are combined via overlayfs with the same
  hierarchies of the host OS, and the host /usr/ and /opt/ overmounted
  with it ("merging"). When they are deactivated, the mount point is
  disassembled — again revealing the unmodified original host version of
  the hierarchy ("unmerging"). Merging thus makes the extension's
  resources suddenly appear below the /usr/ and /opt/ hierarchies as if
  they were included in the base OS image itself. Unmerging makes them
  disappear again, leaving in place only the files that were shipped
  with the base OS image itself.

  System configuration images are similar but operate on directories
  containing system or service configuration.

  On nearly all modern distributions mount propagation plays a crucial
  role and the rootfs of the OS is a shared mount in a peer group
  (usually with peer group id 1):

     TARGET  SOURCE  FSTYPE  PROPAGATION  MNT_ID  PARENT_ID
     /       /       ext4    shared:1     29      1

  On such systems all services and containers run in a separate mount
  namespace and are pivot_root()ed into their rootfs. A separate mount
  namespace is almost always used as it is the minimal isolation
  mechanism services have. But usually they are even much more isolated
  up to the point where they almost become indistinguishable from
  containers.

  Mount propagation again plays a crucial role here. The rootfs of all
  these services is a slave mount to the peer group of the host rootfs.
  This is done so the service will receive mount propagation events from
  the host when certain files or directories are updated.

  In addition, the rootfs of each service, container, and sandbox is
  also a shared mount in its separate peer group:

     TARGET  SOURCE  FSTYPE  PROPAGATION         MNT_ID  PARENT_ID
     /       /       ext4    shared:24 master:1  71      47

  For people not too familiar with mount propagation, the master:1 means
  that this is a slave mount to peer group 1. Which as one can see is
  the host rootfs as indicated by shared:1 above. The shared:24
  indicates that the service rootfs is a shared mount in a separate peer
  group with peer group id 24.

  A service may run other services. Such nested services will also have
  a rootfs mount that is a slave to the peer group of the outer service
  rootfs mount.

  For containers things are just slighly different. A container's rootfs
  isn't a slave to the service's or host rootfs' peer group. The rootfs
  mount of a container is simply a shared mount in its own peer group:

     TARGET                    SOURCE  FSTYPE  PROPAGATION  MNT_ID  PARENT_ID
     /home/ubuntu/debian-tree  /       ext4    shared:99    61      60

  So whereas services are isolated OS components a container is treated
  like a separate world and mount propagation into it is restricted to a
  single well known mount that is a slave to the peer group of the
  shared mount /run on the host:

     TARGET                  SOURCE              FSTYPE  PROPAGATION  MNT_ID  PARENT_ID
     /propagate/debian-tree  /run/host/incoming  tmpfs   master:5     71      68

  Here, the master:5 indicates that this mount is a slave to the peer
  group with peer group id 5. This allows to propagate mounts into the
  container and served as a workaround for not being able to insert
  mounts into mount namespaces directly. But the new mount api does
  support inserting mounts directly. For the interested reader the
  blogpost in [4] might be worth reading where I explain the old and the
  new approach to inserting mounts into mount namespaces.

  Containers of course, can themselves be run as services. They often
  run full systems themselves which means they again run services and
  containers with the exact same propagation settings explained above.

  The whole system is designed so that it can be easily updated,
  including all services in various fine-grained ways without having to
  enter every single service's mount namespace which would be
  prohibitively expensive. The mount propagation layout has been
  carefully chosen so it is possible to propagate updates for system
  extensions and configurations from the host into all services.

  The simplest model to update the whole system is to mount on top of
  /usr, /opt, or /etc on the host. The new mount on /usr, /opt, or /etc
  will then propagate into every service. This works cleanly the first
  time. However, when the system is updated multiple times it becomes
  necessary to unmount the first update on /opt, /usr, /etc and then
  propagate the new update. But this means, there's an interval where
  the old base system is accessible. This has to be avoided to protect
  against downgrade attacks.

  The vfs already exposes a mechanism to userspace whereby mounts can be
  mounted beneath an existing mount. Such mounts are internally referred
  to as "tucked". The patch series exposes the ability to mount beneath
  a top mount through the new MOVE_MOUNT_BENEATH flag for the
  move_mount() system call. This allows userspace to seamlessly upgrade
  mounts. After this series the only thing that will have changed is
  that mounting beneath an existing mount can be done explicitly instead
  of just implicitly.

  The crux is that the proposed mechanism already exists and that it is
  so powerful as to cover cases where mounts are supposed to be updated
  with new versions. Crucially, it offers an important flexibility.
  Namely that updates to a system may either be forced or can be delayed
  and the umount of the top mount be left to a service if it is a
  cooperative one"

Link: https://lwn.net/Articles/927491 [1]
Link: https://lwn.net/Articles/934094 [2]
Link: https://man7.org/linux/man-pages/man8/systemd-sysext.8.html [3]
Link: https://brauner.io/2023/02/28/mounting-into-mount-namespaces.html [4]
Link: https://github.com/flatcar/sysext-bakery
Link: https://fedoraproject.org/wiki/Changes/Unified_Kernel_Support_Phase_1
Link: https://fedoraproject.org/wiki/Changes/Unified_Kernel_Support_Phase_2
Link: https://github.com/systemd/systemd/pull/26013

* tag 'v6.5/vfs.mount' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
  fs: allow to mount beneath top mount
  fs: use a for loop when locking a mount
  fs: properly document __lookup_mnt()
  fs: add path_mounted()
2023-06-26 10:27:04 -07:00
Linus Torvalds
1f2300a738 Merge tag 'v6.5/vfs.file' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs
Pull vfs file handling updates from Christian Brauner:
 "This contains Amir's work to fix a long-standing problem where an
  unprivileged overlayfs mount can be used to avoid fanotify permission
  events that were requested for an inode or superblock on the
  underlying filesystem.

  Some background about files opened in overlayfs. If a file is opened
  in overlayfs @file->f_path will refer to a "fake" path. What this
  means is that while @file->f_inode will refer to inode of the
  underlying layer, @file->f_path refers to an overlayfs
  {dentry,vfsmount} pair. The reasons for doing this are out of scope
  here but it is the reason why the vfs has been providing the
  open_with_fake_path() helper for overlayfs for very long time now. So
  nothing new here.

  This is for sure not very elegant and everyone including the overlayfs
  maintainers agree. Improving this significantly would involve more
  fragile and potentially rather invasive changes.

  In various codepaths access to the path of the underlying filesystem
  is needed for such hybrid file. The best example is fsnotify where
  this becomes security relevant. Passing the overlayfs
  @file->f_path->dentry will cause fsnotify to skip generating fsnotify
  events registered on the underlying inode or superblock.

  To fix this we extend the vfs provided open_with_fake_path() concept
  for overlayfs to create a backing file container that holds the real
  path and to expose a helper that can be used by relevant callers to
  get access to the path of the underlying filesystem through the new
  file_real_path() helper. This pattern is similar to what we do in
  d_real() and d_real_inode().

  The first beneficiary is fsnotify and fixes the security sensitive
  problem mentioned above.

  There's a couple of nice cleanups included as well.

  Over time, the old open_with_fake_path() helper added specifically for
  overlayfs a long time ago started to get used in other places such as
  cachefiles. Even though cachefiles have nothing to do with hybrid
  files.

  The only reason cachefiles used that concept was that files opened
  with open_with_fake_path() aren't charged against the caller's open
  file limit by raising FMODE_NOACCOUNT. It's just mere coincidence that
  both overlayfs and cachefiles need to ensure to not overcharge the
  caller for their internal open calls.

  So this work disentangles FMODE_NOACCOUNT use cases and backing file
  use-cases by adding the FMODE_BACKING flag which indicates that the
  file can be used to retrieve the backing file of another filesystem.
  (Fyi, Jens will be sending you a really nice cleanup from Christoph
  that gets rid of 3 FMODE_* flags otherwise this would be the last
  fmode_t bit we'd be using.)

  So now overlayfs becomes the sole user of the renamed
  open_with_fake_path() helper which is now named backing_file_open().
  For internal kernel users such as cachefiles that are only interested
  in FMODE_NOACCOUNT but not in FMODE_BACKING we add a new
  kernel_file_open() helper which opens a file without being charged
  against the caller's open file limit. All new helpers are properly
  documented and clearly annotated to mention their special uses.

  We also rename vfs_tmpfile_open() to kernel_tmpfile_open() to clearly
  distinguish it from vfs_tmpfile() and align it the other kernel_*()
  internal helpers"

* tag 'v6.5/vfs.file' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
  ovl: enable fsnotify events on underlying real files
  fs: use backing_file container for internal files with "fake" f_path
  fs: move kmem_cache_zalloc() into alloc_empty_file*() helpers
  fs: use a helper for opening kernel internal files
  fs: rename {vfs,kernel}_tmpfile_open()
2023-06-26 10:14:36 -07:00
Linus Torvalds
2eedfa9e27 Merge tag 'v6.5/vfs.rename.locking' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs
Pull vfs rename locking updates from Christian Brauner:
 "This contains the work from Jan to fix problems with cross-directory
  renames originally reported in [1].

  To quickly sum it up some filesystems (so far we know at least about
  ext4, udf, f2fs, ocfs2, likely also reiserfs, gfs2 and others) need to
  lock the directory when it is being renamed into another directory.

  This is because we need to update the parent pointer in the directory
  in that case and if that races with other operations on the directory,
  in particular a conversion from one directory format into another, bad
  things can happen.

  So far we've done the locking in the filesystem code but recently
  Darrick pointed out in [2] that the RENAME_EXCHANGE case was missing.
  That one is particularly nasty because RENAME_EXCHANGE can arbitrarily
  mix regular files and directories and proper lock ordering is not
  achievable in the filesystems alone.

  This patch set adds locking into vfs_rename() so that not only parent
  directories but also moved inodes, regardless of whether they are
  directories or not, are locked when calling into the filesystem.

  This means establishing a locking order for unrelated directories. New
  helpers are added for this purpose and our documentation is updated to
  cover this in detail.

  The locking is now actually easier to follow as we now always lock
  source and target. We've always locked the target independent of
  whether it was a directory or file and we've always locked source if
  it was a regular file. The exact details for why this came about can
  be found in [3] and [4]"

Link: https://lore.kernel.org/all/20230117123735.un7wbamlbdihninm@quack3 [1]
Link: https://lore.kernel.org/all/20230517045836.GA11594@frogsfrogsfrogs [2]
Link: https://lore.kernel.org/all/20230526-schrebergarten-vortag-9cd89694517e@brauner [3]
Link: https://lore.kernel.org/all/20230530-seenotrettung-allrad-44f4b00139d4@brauner [4]

* tag 'v6.5/vfs.rename.locking' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
  fs: Restrict lock_two_nondirectories() to non-directory inodes
  fs: Lock moved directories
  fs: Establish locking order for unrelated directories
  Revert "f2fs: fix potential corruption when moving a directory"
  Revert "udf: Protect rename against modification of moved directory"
  ext4: Remove ext4 locking of moved directory
2023-06-26 10:01:26 -07:00
Rafael J. Wysocki
a8460ba594 Merge tag 'thermal-v6.5-rc1' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/thermal/linux into thermal
Pull thermal control updates for 6.5-rc1 from Daniel Lezcano:

"- Add DT bindings for SM6375, MSM8226 and QCM2290 Qcom platforms (Konrad
   Dybcio)

 - Add DT bindings and support for QCom MSM8226 (Matti Lehtimäki)

 - Add DT bindings for QCom ipq9574 (Praveenkumar I)

 - Convert bcm2835 DT bindings to the yaml schema (Stefan Wahren)

 - Allow selecting the bang-bang governor as default (Thierry Reding)

 - Refactor and prepare the code to set the scene for RCar Gen4
   (Wolfram Sang)

 - Cleanup and fixes for the QCom tsens drivers. Add DT bindings and
   calibration for the MSM8909 platform (Stephan Gerhold)

 - Revert a patch introducing a wrong usage of devm_of_iomap() on the
   Mediatek platform (Ricardo Cañuelo)

 - Fix the clock vs reset ordering in order to conform to the
   documentation on the sun8i (Christophe JAILLET)

 - Prevent setting up undocumented registers, enable the only described
   sensors and add the version 2.1 on the Qoriq sensor (Peng Fan)

 - Add DT bindings and support for the Armada AP807 (Alex Leibovich)

 - Update the mlx5 driver with the recent thermal changes (Daniel
   Lezcano)

 - Convert to platform remove callback returning void on STM32 (Uwe
   Kleine-König)

 - Add an error information printing for devm_thermal_add_hwmon_sysfs()
   and remove the error from the Sun8i, Amlogic, i.MX, TI, K3, Tegra,
   Qoriq, Mediateka and QCom (Yangtao Li)

 - Register as hwmon sensor for the Generic ADC (Chen-Yu Tsai)

 - Use the dev_err_probe() function in the QCom tsens alarm driver
   (Luca Weiss)"

* tag 'thermal-v6.5-rc1' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/thermal/linux: (38 commits)
  thermal/drivers/qcom/temp-alarm: Use dev_err_probe
  thermal/drivers/generic-adc: Register thermal zones as hwmon sensors
  thermal/drivers/mediatek/lvts_thermal: Remove redundant msg in lvts_ctrl_start()
  thermal/drivers/qcom: Remove redundant msg at probe time
  thermal/drivers/ti-soc: Remove redundant msg in ti_thermal_expose_sensor()
  thermal/drivers/qoriq: Remove redundant msg in qoriq_tmu_register_tmu_zone()
  thermal/drivers/tegra: Remove redundant msg in tegra_tsensor_register_channel()
  drivers/thermal/k3: Remove redundant msg in k3_bandgap_probe()
  thermal/drivers/imx: Remove redundant msg in imx8mm_tmu_probe() and imx_sc_thermal_probe()
  thermal/drivers/amlogic: Remove redundant msg in amlogic_thermal_probe()
  thermal/drivers/sun8i: Remove redundant msg in sun8i_ths_register()
  thermal/hwmon: Add error information printing for devm_thermal_add_hwmon_sysfs()
  thermal/drivers/stm32: Convert to platform remove callback returning void
  net/mlx5: Update the driver with the recent thermal changes
  thermal/drivers/armada: Add support for AP807 thermal data
  dt-bindings: armada-thermal: Add armada-ap807-thermal compatible
  thermal/drivers/qoriq: Support version 2.1
  thermal/drivers/qoriq: Only enable supported sensors
  thermal/drivers/qoriq: No need to program site adjustment register
  thermal/drivers/mediatek/lvts_thermal: Register thermal zones as hwmon sensors
  ...
2023-06-26 18:56:58 +02:00
Linus Torvalds
64bf6ae93e Merge tag 'v6.5/vfs.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs
Pull misc vfs updates from Christian Brauner:
 "Miscellaneous features, cleanups, and fixes for vfs and individual fs

  Features:

   - Use mode 0600 for file created by cachefilesd so it can be run by
     unprivileged users. This aligns them with directories which are
     already created with mode 0700 by cachefilesd

   - Reorder a few members in struct file to prevent some false sharing
     scenarios

   - Indicate that an eventfd is used a semaphore in the eventfd's
     fdinfo procfs file

   - Add a missing uapi header for eventfd exposing relevant uapi
     defines

   - Let the VFS protect transitions of a superblock from read-only to
     read-write in addition to the protection it already provides for
     transitions from read-write to read-only. Protecting read-only to
     read-write transitions allows filesystems such as ext4 to perform
     internal writes, keeping writers away until the transition is
     completed

  Cleanups:

   - Arnd removed the architecture specific arch_report_meminfo()
     prototypes and added a generic one into procfs.h. Note, we got a
     report about a warning in amdpgpu codepaths that suggested this was
     bisectable to this change but we concluded it was a false positive

   - Remove unused parameters from split_fs_names()

   - Rename put_and_unmap_page() to unmap_and_put_page() to let the name
     reflect the order of the cleanup operation that has to unmap before
     the actual put

   - Unexport buffer_check_dirty_writeback() as it is not used outside
     of block device aops

   - Stop allocating aio rings from highmem

   - Protecting read-{only,write} transitions in the VFS used open-coded
     barriers in various places. Replace them with proper little helpers
     and document both the helpers and all barrier interactions involved
     when transitioning between read-{only,write} states

   - Use flexible array members in old readdir codepaths

  Fixes:

   - Use the correct type __poll_t for epoll and eventfd

   - Replace all deprecated strlcpy() invocations, whose return value
     isn't checked with an equivalent strscpy() call

   - Fix some kernel-doc warnings in fs/open.c

   - Reduce the stack usage in jffs2's xattr codepaths finally getting
     rid of this: fs/jffs2/xattr.c:887:1: error: the frame size of 1088
     bytes is larger than 1024 bytes [-Werror=frame-larger-than=]
     royally annoying compilation warning

   - Use __FMODE_NONOTIFY instead of FMODE_NONOTIFY where an int and not
     fmode_t is required to avoid fmode_t to integer degradation
     warnings

   - Create coredumps with O_WRONLY instead of O_RDWR. There's a long
     explanation in that commit how O_RDWR is actually a bug which we
     found out with the help of Linus and git archeology

   - Fix "no previous prototype" warnings in the pipe codepaths

   - Add overflow calculations for remap_verify_area() as a signed
     addition overflow could be triggered in xfstests

   - Fix a null pointer dereference in sysv

   - Use an unsigned variable for length calculations in jfs avoiding
     compilation warnings with gcc 13

   - Fix a dangling pipe pointer in the watch queue codepath

   - The legacy mount option parser provided as a fallback by the VFS
     for filesystems not yet converted to the new mount api did prefix
     the generated mount option string with a leading ',' causing issues
     for some filesystems

   - Fix a repeated word in a comment in fs.h

   - autofs: Update the ctime when mtime is updated as mandated by
     POSIX"

* tag 'v6.5/vfs.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: (27 commits)
  readdir: Replace one-element arrays with flexible-array members
  fs: Provide helpers for manipulating sb->s_readonly_remount
  fs: Protect reconfiguration of sb read-write from racing writes
  eventfd: add a uapi header for eventfd userspace APIs
  autofs: set ctime as well when mtime changes on a dir
  eventfd: show the EFD_SEMAPHORE flag in fdinfo
  fs/aio: Stop allocating aio rings from HIGHMEM
  fs: Fix comment typo
  fs: unexport buffer_check_dirty_writeback
  fs: avoid empty option when generating legacy mount string
  watch_queue: prevent dangling pipe pointer
  fs.h: Optimize file struct to prevent false sharing
  highmem: Rename put_and_unmap_page() to unmap_and_put_page()
  cachefiles: Allow the cache to be non-root
  init: remove unused names parameter in split_fs_names()
  jfs: Use unsigned variable for length calculations
  fs/sysv: Null check to prevent null-ptr-deref bug
  fs: use UB-safe check for signed addition overflow in remap_verify_area
  procfs: consolidate arch_report_meminfo declaration
  fs: pipe: reveal missing function protoypes
  ...
2023-06-26 09:50:21 -07:00
Linus Torvalds
5c1c88cddb Merge tag 'v6.5/fs.ntfs' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs
Pull ntfs updates from Christian Brauner:
 "A pile of various smaller fixes for ntfs"

* tag 'v6.5/fs.ntfs' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
  ntfs: do not dereference a null ctx on error
  ntfs: Remove unneeded semicolon
  ntfs: Correct spelling
  ntfs: remove redundant initialization to pointer cb_sb_start
2023-06-26 09:47:39 -07:00
Linus Torvalds
1f268d6d2c Merge tag 'auxdisplay-6.5' of https://github.com/ojeda/linux
Pull auxdisplay update from Miguel Ojeda:
 "A single cleanup for i2c drivers to switch them back to use
  '.probe()'"

* tag 'auxdisplay-6.5' of https://github.com/ojeda/linux:
  auxdisplay: Switch i2c drivers back to use .probe()
2023-06-26 09:42:03 -07:00
Linus Torvalds
a1257b5e3b Merge tag 'rust-6.5' of https://github.com/Rust-for-Linux/linux
Pull rust updates from Miguel Ojeda:
 "A fairly small one in terms of feature additions. Most of the changes
  in terms of lines come from the upgrade to the new version of the
  toolchain (which in turn is big due to the vendored 'alloc' crate).

  Upgrade to Rust 1.68.2:

   - This is the first such upgrade, and we will try to update it often
     from now on, in order to remain close to the latest release, until
     a minimum version (which is "in the future") can be established.

     The upgrade brings the stabilization of 4 features we used (and 2
     more that we used in our old 'rust' branch).

     Commit 3ed03f4da0 ("rust: upgrade to Rust 1.68.2") contains the
     details and rationale.

  pin-init API:

   - Several internal improvements and fixes to the pin-init API, e.g.
     allowing to use 'Self' in a struct definition with '#[pin_data]'.

  'error' module:

   - New 'name()' method for the 'Error' type (with 'errname()'
     integration), used to implement the 'Debug' trait for 'Error'.

   - Add error codes from 'include/linux/errno.h' to the list of Rust
     'Error' constants.

   - Allow specifying error type on the 'Result' type (with the default
     still being our usual 'Error' type).

  'str' module:

   - 'TryFrom' implementation for 'CStr', and new 'to_cstring()' method
     based on it.

  'sync' module:

   - Implement 'AsRef' trait for 'Arc', allowing to use 'Arc' in code
     that is generic over smart pointer types.

   - Add 'ptr_eq' method to 'Arc' for easier, less error prone
     comparison between two 'Arc' pointers.

   - Reword the 'Send' safety comment for 'Arc', and avoid referencing
     it from the 'Sync' one.

  'task' module:

   - Implement 'Send' marker for 'Task'.

  'types' module:

   - Implement 'Send' and 'Sync' markers for 'ARef<T>' when 'T' is
     'AlwaysRefCounted', 'Send' and 'Sync'.

  Other changes:

   - Documentation improvements and '.gitattributes' change to start
     using the Rust diff driver"

* tag 'rust-6.5' of https://github.com/Rust-for-Linux/linux:
  rust: error: `impl Debug` for `Error` with `errname()` integration
  rust: task: add `Send` marker to `Task`
  rust: specify when `ARef` is thread safe
  rust: sync: reword the `Arc` safety comment for `Sync`
  rust: sync: reword the `Arc` safety comment for `Send`
  rust: sync: implement `AsRef<T>` for `Arc<T>`
  rust: sync: add `Arc::ptr_eq`
  rust: error: add missing error codes
  rust: str: add conversion from `CStr` to `CString`
  rust: error: allow specifying error type on `Result`
  rust: init: update macro expansion example in docs
  rust: macros: replace Self with the concrete type in #[pin_data]
  rust: macros: refactor generics parsing of `#[pin_data]` into its own function
  rust: macros: fix usage of `#[allow]` in `quote!`
  docs: rust: point directly to the standalone installers
  .gitattributes: set diff driver for Rust source code files
  rust: upgrade to Rust 1.68.2
  rust: arc: fix intra-doc link in `Arc<T>::init`
  rust: alloc: clarify what is the upstream version
2023-06-26 09:35:50 -07:00
Linus Torvalds
9d9a9bf07e Merge tag 's390-6.4-4' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux
Pull s390 updates from Alexander Gordeev:

 - Use correct type for size of memory allocated for ELF core header on
   kernel crash.

 - Fix insecure W+X mapping warning when KASAN shadow memory range is
   not aligned on page boundary.

 - Avoid allocation of short by one page KASAN shadow memory when the
   original memory range is less than (PAGE_SIZE << 3).

 - Fix virtual vs physical address confusion in physical memory
   enumerator. It is not a real issue, since virtual and physical
   addresses are currently the same.

 - Set CONFIG_NET_TC_SKB_EXT=y in s390 config files as it is required
   for offloading TC as well as bridges on switchdev capable ConnectX
   devices.

* tag 's390-6.4-4' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
  s390/defconfigs: set CONFIG_NET_TC_SKB_EXT=y
  s390/boot: fix physmem_info virtual vs physical address confusion
  s390/kasan: avoid short by one page shadow memory
  s390/kasan: fix insecure W+X mapping warning
  s390/crash: use the correct type for memory allocation
2023-06-26 09:31:06 -07:00
Rob Herring
74f02dd739 dt-bindings: auxdisplay: holtek: Add missing type for "linux,no-autorepeat"
"linux,no-autorepeat" is missing a type, add it.

Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
Link: https://lore.kernel.org/r/20230613201049.2824028-1-robh@kernel.org
Signed-off-by: Rob Herring <robh@kernel.org>
2023-06-26 10:26:52 -06:00