Commit Graph

83402 Commits

Author SHA1 Message Date
Christian König
245a4a7b53 dma-buf: generalize dma_fence unwrap & merging v3
Introduce a dma_fence_unwrap_merge() macro which allows to unwrap fences
which potentially can be containers as well and then merge them back
together into a flat dma_fence_array.

v2: rename the function, add some more comments about how the wrapper is
    used, move filtering of signaled fences into the unwrap iterator,
    add complex selftest which covers more cases.
v3: fix signaled fence filtering once more

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20220518135844.3338-5-christian.koenig@amd.com
2022-05-30 14:24:04 +02:00
Christian König
8f61973718 dma-buf: return only unsignaled fences in dma_fence_unwrap_for_each v3
dma_fence_chain containers cleanup signaled fences automatically, so
filter those out from arrays as well.

v2: fix missing walk over the array
v3: massively simplify the patch and actually update the description.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20220518135844.3338-4-christian.koenig@amd.com
2022-05-30 14:23:21 +02:00
Christian König
01357a5a45 dma-buf: cleanup dma_fence_unwrap implementation
Move the code from the inline functions into exported functions.

Signed-off-by: Christian König <christian.koenig@amd.com>
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20220518135844.3338-3-christian.koenig@amd.com
2022-05-30 14:16:32 +02:00
Javier Martinez Canillas
3367aa7d74 fbdev: Restart conflicting fb removal loop when unregistering devices
Drivers that want to remove registered conflicting framebuffers prior to
register their own framebuffer, call to remove_conflicting_framebuffers().

This function takes the registration_lock mutex, to prevent a race when
drivers register framebuffer devices. But if a conflicting framebuffer
device is found, the underlaying platform device is unregistered and this
will lead to the platform driver .remove callback to be called. Which in
turn will call to unregister_framebuffer() that takes the same lock.

To prevent this, a struct fb_info.forced_out field was used as indication
to unregister_framebuffer() whether the mutex has to be grabbed or not.

But this could be unsafe, since the fbdev core is making assumptions about
what drivers may or may not do in their .remove callbacks. Allowing to run
these callbacks with the registration_lock held can cause deadlocks, since
the fbdev core has no control over what drivers do in their removal path.

A better solution is to drop the lock before platform_device_unregister(),
so unregister_framebuffer() can take it when called from the fbdev driver.
The lock is acquired again after the device has been unregistered and at
this point the removal loop can be restarted.

Since the conflicting framebuffer device has already been removed, the
loop would just finish when no more conflicting framebuffers are found.

Suggested-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Javier Martinez Canillas <javierm@redhat.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20220511113039.1252432-1-javierm@redhat.com
2022-05-13 13:48:28 +02:00
Thomas Zimmermann
e80eec1b87 fbdev: Rename pagelist to pagereflist for deferred I/O
Rename various instances of pagelist to pagereflist. The list now
stores pageref structures, so the new name is more appropriate.

In their write-back helpers, several fbdev drivers refer to the
pageref list in struct fb_deferred_io instead of using the one
supplied as argument to the function. Convert them over to the
supplied one. It's the same instance, so no change of behavior
occurs.

v4:
	* fix commit message (Javier)

Suggested-by: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Reviewed-by: Javier Martinez Canillas <javierm@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220429100834.18898-5-tzimmermann@suse.de
2022-05-03 16:04:22 +02:00
Thomas Zimmermann
56c134f7f1 fbdev: Track deferred-I/O pages in pageref struct
Store the per-page state for fbdev's deferred I/O in struct
fb_deferred_io_pageref. Maintain a list of pagerefs for the pages
that have to be written back to video memory. Update all affected
drivers.

As with pages before, fbdev acquires a pageref when an mmaped page
of the framebuffer is being written to. It holds the pageref in a
list of all currently written pagerefs until it flushes the written
pages to video memory. Writeback occurs periodically. After writeback
fbdev releases all pagerefs and builds up a new dirty list until the
next writeback occurs.

Using pagerefs has a number of benefits.

For pages of the framebuffer, the deferred I/O code used struct
page.lru as an entry into the list of dirty pages. The lru field is
owned by the page cache, which makes deferred I/O incompatible with
some memory pages (e.g., most notably DRM's GEM SHMEM allocator).
struct fb_deferred_io_pageref now provides an entry into a list of
dirty framebuffer pages, freeing lru for use with the page cache.

Drivers also assumed that struct page.index is the page offset into
the framebuffer. This is not true for DRM buffers, which are located
at various offset within a mapped area. struct fb_deferred_io_pageref
explicitly stores an offset into the framebuffer. struct page.index
is now only the page offset into the mapped area.

These changes will allow DRM to use fbdev deferred I/O without an
intermediate shadow buffer.

v3:
	* use pageref->offset for sorting
	* fix grammar in comment
v2:
	* minor fixes in commit message

Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Reviewed-by: Javier Martinez Canillas <javierm@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220429100834.18898-3-tzimmermann@suse.de
2022-05-03 16:04:22 +02:00
Dave Airlie
e954d2c94d Backmerge tag 'v5.18-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux into drm-next
Linux 5.18-rc5

There was a build fix for arm I wanted in drm-next, so backmerge rather then cherry-pick.

Signed-off-by: Dave Airlie <airlied@redhat.com>
2022-05-03 16:08:48 +10:00
Linus Torvalds
b2da7df52e Merge tag 'x86_urgent_for_v5.18_rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fixes from Borislav Petkov:

 - A fix to disable PCI/MSI[-X] masking for XEN_HVM guests as that is
   solely controlled by the hypervisor

 - A build fix to make the function prototype (__warn()) as visible as
   the definition itself

 - A bunch of objtool annotation fixes which have accumulated over time

 - An ORC unwinder fix to handle bad input gracefully

 - Well, we thought the microcode gets loaded in time in order to
   restore the microcode-emulated MSRs but we thought wrong. So there's
   a fix for that to have the ordering done properly

 - Add new Intel model numbers

 - A spelling fix

* tag 'x86_urgent_for_v5.18_rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/pci/xen: Disable PCI/MSI[-X] masking for XEN_HVM guests
  bug: Have __warn() prototype defined unconditionally
  x86/Kconfig: fix the spelling of 'becoming' in X86_KERNEL_IBT config
  objtool: Use offstr() to print address of missing ENDBR
  objtool: Print data address for "!ENDBR" data warnings
  x86/xen: Add ANNOTATE_NOENDBR to startup_xen()
  x86/uaccess: Add ENDBR to __put_user_nocheck*()
  x86/retpoline: Add ANNOTATE_NOENDBR for retpolines
  x86/static_call: Add ANNOTATE_NOENDBR to static call trampoline
  objtool: Enable unreachable warnings for CLANG LTO
  x86,objtool: Explicitly mark idtentry_body()s tail REACHABLE
  x86,objtool: Mark cpu_startup_entry() __noreturn
  x86,xen,objtool: Add UNWIND hint
  lib/strn*,objtool: Enforce user_access_begin() rules
  MAINTAINERS: Add x86 unwinding entry
  x86/unwind/orc: Recheck address range after stack info was updated
  x86/cpu: Load microcode during restore_processor_state()
  x86/cpu: Add new Alderlake and Raptorlake CPU model numbers
2022-05-01 10:03:36 -07:00
Linus Torvalds
da1b4042bd Merge tag 'usb-5.18-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb
Pull USB fixes from Greg KH:
 "Here are a number of small USB driver fixes for 5.18-rc5 for some
  reported issues and new quirks. They include:

   - dwc3 driver fixes

   - xhci driver fixes

   - typec driver fixes

   - new usb-serial driver ids

   - added new USB devices to existing quirk tables

   - other tiny fixes

  All of these have been in linux-next for a while with no reported
  issues"

* tag 'usb-5.18-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: (31 commits)
  usb: phy: generic: Get the vbus supply
  usb: dwc3: gadget: Return proper request status
  usb: dwc3: pci: add support for the Intel Meteor Lake-P
  usb: dwc3: core: Only handle soft-reset in DCTL
  usb: gadget: configfs: clear deactivation flag in configfs_composite_unbind()
  usb: misc: eud: Fix an error handling path in eud_probe()
  usb: core: Don't hold the device lock while sleeping in do_proc_control()
  usb: dwc3: Try usb-role-switch first in dwc3_drd_init
  usb: dwc3: core: Fix tx/rx threshold settings
  usb: mtu3: fix USB 3.0 dual-role-switch from device to host
  xhci: Enable runtime PM on second Alderlake controller
  usb: dwc3: fix backwards compat with rockchip devices
  dt-bindings: usb: samsung,exynos-usb2: add missing required reg
  usb: misc: fix improper handling of refcount in uss720_probe()
  USB: Fix ehci infinite suspend-resume loop issue in zhaoxin
  usb: typec: tcpm: Fix undefined behavior due to shift overflowing the constant
  usb: typec: rt1719: Fix build error without CONFIG_POWER_SUPPLY
  usb: typec: ucsi: Fix role swapping
  usb: typec: ucsi: Fix reuse of completion structure
  usb: xhci: tegra:Fix PM usage reference leak of tegra_xusb_unpowergate_partitions
  ...
2022-04-30 09:58:46 -07:00
Linus Torvalds
249aca0d3d Merge tag 'net-5.18-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from Jakub Kicinski:
 "Including fixes from bluetooth, bpf and netfilter.

  Current release - new code bugs:

   - bridge: switchdev: check br_vlan_group() return value

   - use this_cpu_inc() to increment net->core_stats, fix preempt-rt

  Previous releases - regressions:

   - eth: stmmac: fix write to sgmii_adapter_base

  Previous releases - always broken:

   - netfilter: nf_conntrack_tcp: re-init for syn packets only,
     resolving issues with TCP fastopen

   - tcp: md5: fix incorrect tcp_header_len for incoming connections

   - tcp: fix F-RTO may not work correctly when receiving DSACK

   - tcp: ensure use of most recently sent skb when filling rate samples

   - tcp: fix potential xmit stalls caused by TCP_NOTSENT_LOWAT

   - virtio_net: fix wrong buf address calculation when using xdp

   - xsk: fix forwarding when combining copy mode with busy poll

   - xsk: fix possible crash when multiple sockets are created

   - bpf: lwt: fix crash when using bpf_skb_set_tunnel_key() from
     bpf_xmit lwt hook

   - sctp: null-check asoc strreset_chunk in sctp_generate_reconf_event

   - wireguard: device: check for metadata_dst with skb_valid_dst()

   - netfilter: update ip6_route_me_harder to consider L3 domain

   - gre: make o_seqno start from 0 in native mode

   - gre: switch o_seqno to atomic to prevent races in collect_md mode

  Misc:

   - add Eric Dumazet to networking maintainers

   - dt: dsa: realtek: remove realtek,rtl8367s string

   - netfilter: flowtable: Remove the empty file"

* tag 'net-5.18-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (65 commits)
  tcp: fix F-RTO may not work correctly when receiving DSACK
  Revert "ibmvnic: Add ethtool private flag for driver-defined queue limits"
  net: enetc: allow tc-etf offload even with NETIF_F_CSUM_MASK
  ixgbe: ensure IPsec VF<->PF compatibility
  MAINTAINERS: Update BNXT entry with firmware files
  netfilter: nft_socket: only do sk lookups when indev is available
  net: fec: add missing of_node_put() in fec_enet_init_stop_mode()
  bnx2x: fix napi API usage sequence
  tls: Skip tls_append_frag on zero copy size
  Add Eric Dumazet to networking maintainers
  netfilter: conntrack: fix udp offload timeout sysctl
  netfilter: nf_conntrack_tcp: re-init for syn packets only
  net: dsa: lantiq_gswip: Don't set GSWIP_MII_CFG_RMII_CLK
  net: Use this_cpu_inc() to increment net->core_stats
  Bluetooth: hci_sync: Cleanup hci_conn if it cannot be aborted
  Bluetooth: hci_event: Fix creating hci_conn object on error status
  Bluetooth: hci_event: Fix checking for invalid handle on error status
  ice: fix use-after-free when deinitializing mailbox snapshot
  ice: wait 5 s for EMP reset after firmware flash
  ice: Protect vf_state check by cfg_lock in ice_vc_process_vf_msg()
  ...
2022-04-28 12:34:50 -07:00
Dave Airlie
9bda072a7b Merge tag 'drm-intel-gt-next-2022-04-27' of git://anongit.freedesktop.org/drm/drm-intel into drm-next
UAPI Changes:

- GuC hwconfig support and query (John Harrison, Rodrigo Vivi, Tvrtko Ursulin)
- Sysfs support for multi-tile devices (Andi Shyti, Sujaritha Sundaresan)
- Per client GPU utilisation via fdinfo (Tvrtko Ursulin, Ashutosh Dixit)
- Add DRM_I915_QUERY_GEOMETRY_SUBSLICES (Matt Atwood)

Cross-subsystem Changes:

- Add GSC as a MEI auxiliary device (Tomas Winkler, Alexander Usyskin)

Core Changes:

- Document fdinfo format specification (Tvrtko Ursulin)

Driver Changes:

- Fix prime_mmap to work when using LMEM (Gwan-gyeong Mun)
- Fix vm open count and remove vma refcount (Thomas Hellström)
- Fixup setting screen_size (Matthew Auld)
- Opportunistically apply ALLOC_CONTIGIOUS (Matthew Auld)
- Limit where we apply TTM_PL_FLAG_CONTIGUOUS (Matthew Auld)
- Drop aux table invalidation on FlatCCS platforms (Matt Roper)
- Add missing boundary check in vm_access (Mastan Katragadda)
- Update topology dumps for Xe_HP (Matt Roper)
- Add support for steered register writes (Matt Roper)
- Add steering info to GuC register save/restore list (Daniele Ceraolo Spurio)
- Small PCI BAR enabling (Matthew Auld, Akeem G Abodunrin, CQ Tang)
- Add preemption changes for Wa_14015141709 (Akeem G Abodunrin)
- Add logical mapping for video decode engines (Matthew Brost)
- Don't evict unmappable VMAs when pinning with PIN_MAPPABLE (v2) (Vivek Kasireddy)
- GuC error capture support (Alan Previn, Daniele Ceraolo Spurio)
- avoid concurrent writes to aux_inv (Fei Yang)
- Add Wa_22014226127 (José Roberto de Souza)
- Sunset igpu legacy mmap support based on GRAPHICS_VER_FULL (Matt Roper)
- Evict and restore of compressed objects (Ramalingam C)
- Update to GuC version 70.1.1 (John Harrison)
- Add Wa_22011802037 force cs halt (Tilak Tangudu)
- Enable Wa_22011802037 for gen12 GuC based platforms (Umesh Nerlige Ramappa)
- GuC based workarounds for DG2 (Vinay Belgaumkar, John Harrison, Matthew Brost, José Roberto de Souza)
- consider min_page_size when migrating (Matthew Auld)

- Prep work for next GuC firmware release (John Harrison)
- Support platforms with CCS engines but no RCS (Matt Roper, Stuart Summers)
- Don't overallocate subslice storage (Matt Roper)
- Reduce stack usage in debugfs due to SSEU (John Harrison)
- Report steering details in debugfs (Matt Roper)
- Refactor some x86-ism out to prepare for non-x86 builds (Michael Cheng)
- add lmem_size modparam (CQ Tang)
- Refactor for non-x86 driver builds (Casey Bowman)
- Centralize computation of freq caps (Ashutosh Dixit)

- Update dma_buf_ops.unmap_dma_buf callback to use drm_gem_unmap_dma_buf() (Gwan-gyeong Mun)
- Limit the async bind to bind_async_flags (Matthew Auld)
- Stop checking for NULL vma->obj (Matthew Auld)
- Reduce overzealous alignment constraints for GGTT (Matthew Auld)
- Remove GEN12_SFC_DONE_MAX from register defs header (Matt Roper)
- Fix renamed struct field (Lucas De Marchi)
- Do not return '0' if there is nothing to return (Andi Shyti)
- fix i915_reg_t initialization (Jani Nikula)
- move the migration sanity check (Matthew Auld)
- handle more rounding in selftests (Matthew Auld)
- Perf and i915 query kerneldoc updates (Matt Roper)
- Use i915_probe_error instead of drm_err (Vinay Belgaumkar)
- sanity check object size in the buddy allocator (Matthew Auld)
- fixup selftests min_alignment usage (Matthew Auld)
- tweak selftests misaligned_case (Matthew Auld)

Signed-off-by: Dave Airlie <airlied@redhat.com>

# Conflicts:
#	drivers/gpu/drm/i915/i915_vma.c
From: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/Ymkfy8FjsG2JrodK@tursulin-mobl2
2022-04-28 15:32:29 +10:00
Mikulas Patocka
e5be15767e hex2bin: make the function hex_to_bin constant-time
The function hex2bin is used to load cryptographic keys into device
mapper targets dm-crypt and dm-integrity.  It should take constant time
independent on the processed data, so that concurrently running
unprivileged code can't infer any information about the keys via
microarchitectural convert channels.

This patch changes the function hex_to_bin so that it contains no
branches and no memory accesses.

Note that this shouldn't cause performance degradation because the size
of the new function is the same as the size of the old function (on
x86-64) - and the new function causes no branch misprediction penalties.

I compile-tested this function with gcc on aarch64 alpha arm hppa hppa64
i386 ia64 m68k mips32 mips64 powerpc powerpc64 riscv sh4 s390x sparc32
sparc64 x86_64 and with clang on aarch64 arm hexagon i386 mips32 mips64
powerpc powerpc64 s390x sparc32 sparc64 x86_64 to verify that there are
no branches in the generated code.

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Cc: stable@vger.kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-04-27 10:57:33 -07:00
Linus Torvalds
03498b7131 Merge tag 'mtd/fixes-for-5.18-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/mtd/linux
Pull MTD fixes from Miquel Raynal:
 "Core fix:

   - Fix a possible data corruption of the 'part' field in mtd_info

  Rawnand fixes:

   - Fix the check on the return value of wait_for_completion_timeout

   - Fix wrong ECC parameters for mt7622

   - Fix a possible memory corruption that might panic in the Qcom
     driver"

* tag 'mtd/fixes-for-5.18-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/mtd/linux:
  mtd: rawnand: qcom: fix memory corruption that causes panic
  mtd: fix 'part' field data corruption in mtd_info
  mtd: rawnand: Fix return value check of wait_for_completion_timeout
  mtd: rawnand: fix ecc parameters for mt7622
2022-04-27 10:14:52 -07:00
Sebastian Andrzej Siewior
6510ea973d net: Use this_cpu_inc() to increment net->core_stats
The macro dev_core_stats_##FIELD##_inc() disables preemption and invokes
netdev_core_stats_alloc() to return a per-CPU pointer.
netdev_core_stats_alloc() will allocate memory on its first invocation
which breaks on PREEMPT_RT because it requires non-atomic context for
memory allocation.

This can be avoided by enabling preemption in netdev_core_stats_alloc()
assuming the caller always disables preemption.

It might be better to replace local_inc() with this_cpu_inc() now that
dev_core_stats_##FIELD##_inc() gained a preempt-disable section and does
not rely on already disabled preemption. This results in less
instructions on x86-64:
local_inc:
|          incl %gs:__preempt_count(%rip)  # __preempt_count
|          movq    488(%rdi), %rax # _1->core_stats, _22
|          testq   %rax, %rax      # _22
|          je      .L585   #,
|          add %gs:this_cpu_off(%rip), %rax        # this_cpu_off, tcp_ptr__
|  .L586:
|          testq   %rax, %rax      # _27
|          je      .L587   #,
|          incq (%rax)            # _6->a.counter
|  .L587:
|          decl %gs:__preempt_count(%rip)  # __preempt_count

this_cpu_inc(), this patch:
|         movq    488(%rdi), %rax # _1->core_stats, _5
|         testq   %rax, %rax      # _5
|         je      .L591   #,
| .L585:
|         incq %gs:(%rax) # _18->rx_dropped

Use unsigned long as type for the counter. Use this_cpu_inc() to
increment the counter. Use a plain read of the counter.

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://lore.kernel.org/r/YmbO0pxgtKpCw4SY@linutronix.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-04-26 17:32:30 -07:00
Linus Torvalds
13bc32bad7 Merge tag 'drm-fixes-2022-04-23' of git://anongit.freedesktop.org/drm/drm
Pull more drm fixes from Dave Airlie:
 "Maarten was away, so Maxine stepped up and sent me the drm-fixes
  merge, so no point leaving it for another week.

  The big change is an OF revert around bridge/panels, it may have some
  driver fallout, but hopefully this revert gets them shook out in the
  next week easier.

  Otherwise it's a bunch of locking/refcounts across drivers, a radeon
  dma_resv logic fix and some raspberry pi panel fixes.

  panel:
   - revert of patch that broke panel/bridge issues

  dma-buf:
   - remove unused header file.

  amdgpu:
   - partial revert of locking change

  radeon:
   - fix dma_resv logic inversion

  panel:
   - pi touchscreen panel init fixes

  vc4:
   - build fix
   - runtime pm refcount fix

  vmwgfx:
   - refcounting fix"

* tag 'drm-fixes-2022-04-23' of git://anongit.freedesktop.org/drm/drm:
  drm/amdgpu: partial revert "remove ctx->lock" v2
  Revert "drm: of: Lookup if child node has panel or bridge"
  Revert "drm: of: Properly try all possible cases for bridge/panel detection"
  drm/vc4: Use pm_runtime_resume_and_get to fix pm_runtime_get_sync() usage
  drm/vmwgfx: Fix gem refcounting and memory evictions
  drm/vc4: Fix build error when CONFIG_DRM_VC4=y && CONFIG_RASPBERRYPI_FIRMWARE=m
  drm/panel/raspberrypi-touchscreen: Initialise the bridge in prepare
  drm/panel/raspberrypi-touchscreen: Avoid NULL deref if not initialised
  dma-buf-map: remove renamed header file
  drm/radeon: fix logic inversion in radeon_sync_resv
2022-04-23 09:57:30 -07:00
Dave Airlie
c18a2a280c Merge tag 'drm-misc-fixes-2022-04-22' of git://anongit.freedesktop.org/drm/drm-misc into drm-fixes
Two fixes for the raspberrypi panel initialisation, one fix for a logic
inversion in radeon, a build and pm refcounting fix for vc4, two reverts
for drm_of_get_bridge that caused a number of regression and a locking
regression for amdgpu.

Signed-off-by: Dave Airlie <airlied@redhat.com>

From: Maxime Ripard <maxime@cerno.tech>
Link: https://patchwork.freedesktop.org/patch/msgid/20220422084403.2xrhf3jusdej5yo4@houat
2022-04-23 15:00:44 +10:00
Linus Torvalds
bb4ce2c658 Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull kvm fixes from Paolo Bonzini:
 "The main and larger change here is a workaround for AMD's lack of
  cache coherency for encrypted-memory guests.

  I have another patch pending, but it's waiting for review from the
  architecture maintainers.

  RISC-V:

   - Remove 's' & 'u' as valid ISA extension

   - Do not allow disabling the base extensions 'i'/'m'/'a'/'c'

  x86:

   - Fix NMI watchdog in guests on AMD

   - Fix for SEV cache incoherency issues

   - Don't re-acquire SRCU lock in complete_emulated_io()

   - Avoid NULL pointer deref if VM creation fails

   - Fix race conditions between APICv disabling and vCPU creation

   - Bugfixes for disabling of APICv

   - Preserve BSP MSR_KVM_POLL_CONTROL across suspend/resume

  selftests:

   - Do not use bitfields larger than 32-bits, they differ between GCC
     and clang"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
  kvm: selftests: introduce and use more page size-related constants
  kvm: selftests: do not use bitfields larger than 32-bits for PTEs
  KVM: SEV: add cache flush to solve SEV cache incoherency issues
  KVM: SVM: Flush when freeing encrypted pages even on SME_COHERENT CPUs
  KVM: SVM: Simplify and harden helper to flush SEV guest page(s)
  KVM: selftests: Silence compiler warning in the kvm_page_table_test
  KVM: x86/pmu: Update AMD PMC sample period to fix guest NMI-watchdog
  x86/kvm: Preserve BSP MSR_KVM_POLL_CONTROL across suspend/resume
  KVM: SPDX style and spelling fixes
  KVM: x86: Skip KVM_GUESTDBG_BLOCKIRQ APICv update if APICv is disabled
  KVM: x86: Pend KVM_REQ_APICV_UPDATE during vCPU creation to fix a race
  KVM: nVMX: Defer APICv updates while L2 is active until L1 is active
  KVM: x86: Tag APICv DISABLE inhibit, not ABSENT, if APICv is disabled
  KVM: Initialize debugfs_dentry when a VM is created to avoid NULL deref
  KVM: Add helpers to wrap vcpu->srcu_idx and yell if it's abused
  KVM: RISC-V: Use kvm_vcpu.srcu_idx, drop RISC-V's unnecessary copy
  KVM: x86: Don't re-acquire SRCU lock in complete_emulated_io()
  RISC-V: KVM: Restrict the extensions that can be disabled
  RISC-V: KVM: Remove 's' & 'u' as valid ISA extension
2022-04-22 17:58:36 -07:00
Nico Pache
e4a38402c3 oom_kill.c: futex: delay the OOM reaper to allow time for proper futex cleanup
The pthread struct is allocated on PRIVATE|ANONYMOUS memory [1] which
can be targeted by the oom reaper.  This mapping is used to store the
futex robust list head; the kernel does not keep a copy of the robust
list and instead references a userspace address to maintain the
robustness during a process death.

A race can occur between exit_mm and the oom reaper that allows the oom
reaper to free the memory of the futex robust list before the exit path
has handled the futex death:

    CPU1                               CPU2
    --------------------------------------------------------------------
    page_fault
    do_exit "signal"
    wake_oom_reaper
                                        oom_reaper
                                        oom_reap_task_mm (invalidates mm)
    exit_mm
    exit_mm_release
    futex_exit_release
    futex_cleanup
    exit_robust_list
    get_user (EFAULT- can't access memory)

If the get_user EFAULT's, the kernel will be unable to recover the
waiters on the robust_list, leaving userspace mutexes hung indefinitely.

Delay the OOM reaper, allowing more time for the exit path to perform
the futex cleanup.

Reproducer: https://gitlab.com/jsavitz/oom_futex_reproducer

Based on a patch by Michal Hocko.

Link: https://elixir.bootlin.com/glibc/glibc-2.35/source/nptl/allocatestack.c#L370 [1]
Link: https://lkml.kernel.org/r/20220414144042.677008-1-npache@redhat.com
Fixes: 2129258024 ("mm: oom: let oom_reap_task and exit_mmap run concurrently")
Signed-off-by: Joel Savitz <jsavitz@redhat.com>
Signed-off-by: Nico Pache <npache@redhat.com>
Co-developed-by: Joel Savitz <jsavitz@redhat.com>
Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Rafael Aquini <aquini@redhat.com>
Cc: Waiman Long <longman@redhat.com>
Cc: Herton R. Krzesinski <herton@redhat.com>
Cc: Juri Lelli <juri.lelli@redhat.com>
Cc: Vincent Guittot <vincent.guittot@linaro.org>
Cc: Dietmar Eggemann <dietmar.eggemann@arm.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Ben Segall <bsegall@google.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Daniel Bristot de Oliveira <bristot@redhat.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Joel Savitz <jsavitz@redhat.com>
Cc: Darren Hart <dvhart@infradead.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-04-21 20:01:10 -07:00
Christophe Leroy
5f24d5a579 mm, hugetlb: allow for "high" userspace addresses
This is a fix for commit f6795053da ("mm: mmap: Allow for "high"
userspace addresses") for hugetlb.

This patch adds support for "high" userspace addresses that are
optionally supported on the system and have to be requested via a hint
mechanism ("high" addr parameter to mmap).

Architectures such as powerpc and x86 achieve this by making changes to
their architectural versions of hugetlb_get_unmapped_area() function.
However, arm64 uses the generic version of that function.

So take into account arch_get_mmap_base() and arch_get_mmap_end() in
hugetlb_get_unmapped_area().  To allow that, move those two macros out
of mm/mmap.c into include/linux/sched/mm.h

If these macros are not defined in architectural code then they default
to (TASK_SIZE) and (base) so should not introduce any behavioural
changes to architectures that do not define them.

For the time being, only ARM64 is affected by this change.

Catalin (ARM64) said
 "We should have fixed hugetlb_get_unmapped_area() as well when we added
  support for 52-bit VA. The reason for commit f6795053da was to
  prevent normal mmap() from returning addresses above 48-bit by default
  as some user-space had hard assumptions about this.

  It's a slight ABI change if you do this for hugetlb_get_unmapped_area()
  but I doubt anyone would notice. It's more likely that the current
  behaviour would cause issues, so I'd rather have them consistent.

  Basically when arm64 gained support for 52-bit addresses we did not
  want user-space calling mmap() to suddenly get such high addresses,
  otherwise we could have inadvertently broken some programs (similar
  behaviour to x86 here). Hence we added commit f6795053da. But we
  missed hugetlbfs which could still get such high mmap() addresses. So
  in theory that's a potential regression that should have bee addressed
  at the same time as commit f6795053da (and before arm64 enabled
  52-bit addresses)"

Link: https://lkml.kernel.org/r/ab847b6edb197bffdfe189e70fb4ac76bfe79e0d.1650033747.git.christophe.leroy@csgroup.eu
Fixes: f6795053da ("mm: mmap: Allow for "high" userspace addresses")
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Steve Capper <steve.capper@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: <stable@vger.kernel.org>	[5.0.x]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-04-21 20:01:09 -07:00
Shakeel Butt
9b3016154c memcg: sync flush only if periodic flush is delayed
Daniel Dao has reported [1] a regression on workloads that may trigger a
lot of refaults (anon and file).  The underlying issue is that flushing
rstat is expensive.  Although rstat flush are batched with (nr_cpus *
MEMCG_BATCH) stat updates, it seems like there are workloads which
genuinely do stat updates larger than batch value within short amount of
time.  Since the rstat flush can happen in the performance critical
codepaths like page faults, such workload can suffer greatly.

This patch fixes this regression by making the rstat flushing
conditional in the performance critical codepaths.  More specifically,
the kernel relies on the async periodic rstat flusher to flush the stats
and only if the periodic flusher is delayed by more than twice the
amount of its normal time window then the kernel allows rstat flushing
from the performance critical codepaths.

Now the question: what are the side-effects of this change? The worst
that can happen is the refault codepath will see 4sec old lruvec stats
and may cause false (or missed) activations of the refaulted page which
may under-or-overestimate the workingset size.  Though that is not very
concerning as the kernel can already miss or do false activations.

There are two more codepaths whose flushing behavior is not changed by
this patch and we may need to come to them in future.  One is the
writeback stats used by dirty throttling and second is the deactivation
heuristic in the reclaim.  For now keeping an eye on them and if there
is report of regression due to these codepaths, we will reevaluate then.

Link: https://lore.kernel.org/all/CA+wXwBSyO87ZX5PVwdHm-=dBjZYECGmfnydUicUyrQqndgX2MQ@mail.gmail.com [1]
Link: https://lkml.kernel.org/r/20220304184040.1304781-1-shakeelb@google.com
Fixes: 1f828223b7 ("memcg: flush lruvec stats in the refault")
Signed-off-by: Shakeel Butt <shakeelb@google.com>
Reported-by: Daniel Dao <dqminh@cloudflare.com>
Tested-by: Ivan Babrou <ivan@cloudflare.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Roman Gushchin <roman.gushchin@linux.dev>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Koutný <mkoutny@suse.com>
Cc: Frank Hofmann <fhofmann@cloudflare.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-04-21 20:01:09 -07:00
Naoya Horiguchi
405ce05123 mm/hwpoison: fix race between hugetlb free/demotion and memory_failure_hugetlb()
There is a race condition between memory_failure_hugetlb() and hugetlb
free/demotion, which causes setting PageHWPoison flag on the wrong page.
The one simple result is that wrong processes can be killed, but another
(more serious) one is that the actual error is left unhandled, so no one
prevents later access to it, and that might lead to more serious results
like consuming corrupted data.

Think about the below race window:

  CPU 1                                   CPU 2
  memory_failure_hugetlb
  struct page *head = compound_head(p);
                                          hugetlb page might be freed to
                                          buddy, or even changed to another
                                          compound page.

  get_hwpoison_page -- page is not what we want now...

The current code first does prechecks roughly and then reconfirms after
taking refcount, but it's found that it makes code overly complicated,
so move the prechecks in a single hugetlb_lock range.

A newly introduced function, try_memory_failure_hugetlb(), always takes
hugetlb_lock (even for non-hugetlb pages).  That can be improved, but
memory_failure() is rare in principle, so should not be a big problem.

Link: https://lkml.kernel.org/r/20220408135323.1559401-2-naoya.horiguchi@linux.dev
Fixes: 761ad8d7c7 ("mm: hwpoison: introduce memory_failure_hugetlb()")
Signed-off-by: Naoya Horiguchi <naoya.horiguchi@nec.com>
Reported-by: Mike Kravetz <mike.kravetz@oracle.com>
Reviewed-by: Miaohe Lin <linmiaohe@huawei.com>
Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Yang Shi <shy828301@gmail.com>
Cc: Dan Carpenter <dan.carpenter@oracle.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-04-21 20:01:09 -07:00
Mingwei Zhang
683412ccf6 KVM: SEV: add cache flush to solve SEV cache incoherency issues
Flush the CPU caches when memory is reclaimed from an SEV guest (where
reclaim also includes it being unmapped from KVM's memslots).  Due to lack
of coherency for SEV encrypted memory, failure to flush results in silent
data corruption if userspace is malicious/broken and doesn't ensure SEV
guest memory is properly pinned and unpinned.

Cache coherency is not enforced across the VM boundary in SEV (AMD APM
vol.2 Section 15.34.7). Confidential cachelines, generated by confidential
VM guests have to be explicitly flushed on the host side. If a memory page
containing dirty confidential cachelines was released by VM and reallocated
to another user, the cachelines may corrupt the new user at a later time.

KVM takes a shortcut by assuming all confidential memory remain pinned
until the end of VM lifetime. Therefore, KVM does not flush cache at
mmu_notifier invalidation events. Because of this incorrect assumption and
the lack of cache flushing, malicous userspace can crash the host kernel:
creating a malicious VM and continuously allocates/releases unpinned
confidential memory pages when the VM is running.

Add cache flush operations to mmu_notifier operations to ensure that any
physical memory leaving the guest VM get flushed. In particular, hook
mmu_notifier_invalidate_range_start and mmu_notifier_release events and
flush cache accordingly. The hook after releasing the mmu lock to avoid
contention with other vCPUs.

Cc: stable@vger.kernel.org
Suggested-by: Sean Christpherson <seanjc@google.com>
Reported-by: Mingwei Zhang <mizhang@google.com>
Signed-off-by: Mingwei Zhang <mizhang@google.com>
Message-Id: <20220421031407.2516575-4-mizhang@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-04-21 15:41:00 -04:00
Tomas Winkler
1e3dc1d862 drm/i915/gsc: add gsc as a mei auxiliary device
GSC is a graphics system controller, it provides
a chassis controller for graphics discrete cards.

There are two MEI interfaces in GSC: HECI1 and HECI2.

Both interfaces are on the BAR0 at offsets 0x00258000 and 0x00259000.
GSC is a GT Engine (class 4: instance 6). HECI1 interrupt is signaled
via bit 15 and HECI2 via bit 14 in the interrupt register.

This patch exports GSC as auxiliary device for mei driver to bind to
for HECI2 interface and prepares for HECI1 interface as
it will follow up soon.

CC: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Tomas Winkler <tomas.winkler@intel.com>
Signed-off-by: Vitaly Lubart <vitaly.lubart@intel.com>
Signed-off-by: Alexander Usyskin <alexander.usyskin@intel.com>
Acked-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220419193314.526966-2-daniele.ceraolospurio@intel.com
2022-04-21 11:33:56 -07:00
Sean Christopherson
2031f28768 KVM: Add helpers to wrap vcpu->srcu_idx and yell if it's abused
Add wrappers to acquire/release KVM's SRCU lock when stashing the index
in vcpu->src_idx, along with rudimentary detection of illegal usage,
e.g. re-acquiring SRCU and thus overwriting vcpu->src_idx.  Because the
SRCU index is (currently) either 0 or 1, illegal nesting bugs can go
unnoticed for quite some time and only cause problems when the nested
lock happens to get a different index.

Wrap the WARNs in PROVE_RCU=y, and make them ONCE, otherwise KVM will
likely yell so loudly that it will bring the kernel to its knees.

Signed-off-by: Sean Christopherson <seanjc@google.com>
Tested-by: Fabiano Rosas <farosas@linux.ibm.com>
Message-Id: <20220415004343.2203171-4-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-04-21 13:16:11 -04:00
Borislav Petkov
8d084b2eae usb: typec: tcpm: Fix undefined behavior due to shift overflowing the constant
Fix:

  drivers/usb/typec/tcpm/tcpm.c: In function ‘run_state_machine’:
  drivers/usb/typec/tcpm/tcpm.c:4724:3: error: case label does not reduce to an integer constant
     case BDO_MODE_TESTDATA:
     ^~~~

See https://lore.kernel.org/r/YkwQ6%2BtIH8GQpuct@zn.tnic for the gory
details as to why it triggers with older gccs only.

Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: linux-usb@vger.kernel.org
Link: https://lore.kernel.org/r/20220405151517.29753-8-bp@alien8.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-04-21 18:33:56 +02:00
Oleksandr Ocheretnyi
37c5f9e80e mtd: fix 'part' field data corruption in mtd_info
Commit 46b5889cc2 ("mtd: implement proper partition handling")
started using "mtd_get_master_ofs()" in mtd callbacks to determine
memory offsets by means of 'part' field from mtd_info, what previously
was smashed accessing 'master' field in the mtd_set_dev_defaults() method.
That provides wrong offset what causes hardware access errors.

Just make 'part', 'master' as separate fields, rather than using
union type to avoid 'part' data corruption when mtd_set_dev_defaults()
is called.

Fixes: 46b5889cc2 ("mtd: implement proper partition handling")
Signed-off-by: Oleksandr Ocheretnyi <oocheret@cisco.com>
Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
Link: https://lore.kernel.org/linux-mtd/20220417184649.449289-1-oocheret@cisco.com
2022-04-21 09:29:05 +02:00
Peter Zijlstra
d4e5268a08 x86,objtool: Mark cpu_startup_entry() __noreturn
GCC-8 isn't clever enough to figure out that cpu_start_entry() is a
noreturn while objtool is. This results in code after the call in
start_secondary(). Give GCC a hand so that they all agree on things.

  vmlinux.o: warning: objtool: start_secondary()+0x10e: unreachable

Reported-by: Rick Edgecombe <rick.p.edgecombe@intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Link: https://lore.kernel.org/r/20220408094718.383658532@infradead.org
2022-04-19 21:58:48 +02:00
Song Liu
559089e0a9 vmalloc: replace VM_NO_HUGE_VMAP with VM_ALLOW_HUGE_VMAP
Huge page backed vmalloc memory could benefit performance in many cases.
However, some users of vmalloc may not be ready to handle huge pages for
various reasons: hardware constraints, potential pages split, etc.
VM_NO_HUGE_VMAP was introduced to allow vmalloc users to opt-out huge
pages.  However, it is not easy to track down all the users that require
the opt-out, as the allocation are passed different stacks and may cause
issues in different layers.

To address this issue, replace VM_NO_HUGE_VMAP with an opt-in flag,
VM_ALLOW_HUGE_VMAP, so that users that benefit from huge pages could ask
specificially.

Also, remove vmalloc_no_huge() and add opt-in helper vmalloc_huge().

Fixes: fac54e2bfb ("x86/Kconfig: Select HAVE_ARCH_HUGE_VMALLOC with HAVE_ARCH_HUGE_VMAP")
Link: https://lore.kernel.org/netdev/14444103-d51b-0fb3-ee63-c3f182f0b546@molgen.mpg.de/"
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Song Liu <song@kernel.org>
Reviewed-by: Rik van Riel <riel@surriel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-04-19 12:08:57 -07:00
Christian Brauner
705191b03d fs: fix acl translation
Last cycle we extended the idmapped mounts infrastructure to support
idmapped mounts of idmapped filesystems (No such filesystem yet exist.).
Since then, the meaning of an idmapped mount is a mount whose idmapping
is different from the filesystems idmapping.

While doing that work we missed to adapt the acl translation helpers.
They still assume that checking for the identity mapping is enough.  But
they need to use the no_idmapping() helper instead.

Note, POSIX ACLs are always translated right at the userspace-kernel
boundary using the caller's current idmapping and the initial idmapping.
The order depends on whether we're coming from or going to userspace.
The filesystem's idmapping doesn't matter at the border.

Consequently, if a non-idmapped mount is passed we need to make sure to
always pass the initial idmapping as the mount's idmapping and not the
filesystem idmapping.  Since it's irrelevant here it would yield invalid
ids and prevent setting acls for filesystems that are mountable in a
userns and support posix acls (tmpfs and fuse).

I verified the regression reported in [1] and verified that this patch
fixes it.  A regression test will be added to xfstests in parallel.

Link: https://bugzilla.kernel.org/show_bug.cgi?id=215849 [1]
Fixes: bd303368b7 ("fs: support mapped mounts of mapped filesystems")
Cc: Seth Forshee <sforshee@digitalocean.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: <stable@vger.kernel.org> # 5.17
Cc: <regressions@lists.linux.dev>
Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-04-19 10:19:02 -07:00
Paul Cercueil
40f458b781 Merge drm/drm-next into drm-misc-next
drm/drm-next has a build fix for the NewVision NV3052C panel
(drivers/gpu/drm/panel/panel-newvision-nv3052c.c), which needs to be
merged back to drm-misc-next, as it was failing to build there.

Signed-off-by: Paul Cercueil <paul@crapouillou.net>
2022-04-18 20:46:55 +01:00
Linus Torvalds
de6e933668 Merge tag 'gpio-fixes-for-v5.18-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux
Pull gpio fixes from Bartosz Golaszewski:
 "A single fix for gpio-sim and two patches for GPIO ACPI pulled from
  Andy:

   - fix the set/get_multiple() callbacks in gpio-sim

   - use correct format characters in gpiolib-acpi

   - use an unsigned type for pins in gpiolib-acpi"

* tag 'gpio-fixes-for-v5.18-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux:
  gpio: sim: fix setting and getting multiple lines
  gpiolib: acpi: Convert type for pin to be unsigned
  gpiolib: acpi: use correct format characters
2022-04-16 17:01:43 -07:00
Linus Torvalds
92edbe32e3 Merge tag 'random-5.18-rc3-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/crng/random
Pull random number generator fixes from Jason Donenfeld:

 - Per your suggestion, random reads now won't fail if there's a page
   fault after some non-zero amount of data has been read, which makes
   the behavior consistent with all other reads in the kernel.

 - Rather than an inconsistent mix of random_get_entropy() returning an
   unsigned long or a cycles_t, now it just returns an unsigned long.

 - A memcpy() was replaced with an memmove(), because the addresses are
   sometimes overlapping. In practice the destination is always before
   the source, so not really an issue, but better to be correct than
   not.

* tag 'random-5.18-rc3-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/crng/random:
  random: use memmove instead of memcpy for remaining 32 bytes
  random: make random_get_entropy() return an unsigned long
  random: allow partial reads if later user copies fail
2022-04-16 16:42:53 -07:00
Bartosz Golaszewski
0ebb4fbe31 Merge tag 'intel-gpio-v5.18-2' of gitolite.kernel.org:pub/scm/linux/kernel/git/andy/linux-gpio-intel into gpio/for-current
intel-gpio for v5.18-2

* Couple of fixes related to handling unsigned value of the pin from ACPI

gpiolib:
 -  acpi: Convert type for pin to be unsigned
 -  acpi: use correct format characters
2022-04-16 21:57:00 +02:00
Linus Torvalds
59250f8a7f Merge branch 'akpm' (patches from Andrew)
Merge misc fixes from Andrew Morton:
 "14 patches.

  Subsystems affected by this patch series: MAINTAINERS, binfmt, and
  mm (tmpfs, secretmem, kasan, kfence, pagealloc, zram, compaction,
  hugetlb, vmalloc, and kmemleak)"

* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
  mm: kmemleak: take a full lowmem check in kmemleak_*_phys()
  mm/vmalloc: fix spinning drain_vmap_work after reading from /proc/vmcore
  revert "fs/binfmt_elf: use PT_LOAD p_align values for static PIE"
  revert "fs/binfmt_elf: fix PT_LOAD p_align values for loaders"
  hugetlb: do not demote poisoned hugetlb pages
  mm: compaction: fix compiler warning when CONFIG_COMPACTION=n
  mm: fix unexpected zeroed page mapping with zram swap
  mm, page_alloc: fix build_zonerefs_node()
  mm, kfence: support kmem_dump_obj() for KFENCE objects
  kasan: fix hw tags enablement when KUNIT tests are disabled
  irq_work: use kasan_record_aux_stack_noalloc() record callstack
  mm/secretmem: fix panic when growing a memfd_secret
  tmpfs: fix regressions from wider use of ZERO_PAGE
  MAINTAINERS: Broadcom internal lists aren't maintainers
2022-04-15 15:57:18 -07:00
Marco Elver
2dfe63e61c mm, kfence: support kmem_dump_obj() for KFENCE objects
Calling kmem_obj_info() via kmem_dump_obj() on KFENCE objects has been
producing garbage data due to the object not actually being maintained
by SLAB or SLUB.

Fix this by implementing __kfence_obj_info() that copies relevant
information to struct kmem_obj_info when the object was allocated by
KFENCE; this is called by a common kmem_obj_info(), which also calls the
slab/slub/slob specific variant now called __kmem_obj_info().

For completeness, kmem_dump_obj() now displays if the object was
allocated by KFENCE.

Link: https://lore.kernel.org/all/20220323090520.GG16885@xsang-OptiPlex-9020/
Link: https://lkml.kernel.org/r/20220406131558.3558585-1-elver@google.com
Fixes: b89fb5ef0c ("mm, kfence: insert KFENCE hooks for SLUB")
Fixes: d3fb45f370 ("mm, kfence: insert KFENCE hooks for SLAB")
Signed-off-by: Marco Elver <elver@google.com>
Reviewed-by: Hyeonggon Yoo <42.hyeyoo@gmail.com>
Reported-by: kernel test robot <oliver.sang@intel.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>	[slab]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-04-15 14:49:55 -07:00
Linus Torvalds
fb649bda6f Merge tag 'block-5.18-2022-04-15' of git://git.kernel.dk/linux-block
Pull block fixes from Jens Axboe:

 - Moving of lower_48_bits() to the block layer and a fix for the
   unaligned_be48 added with that originally (Alexander, Keith)

 - Fix a bad WARN_ON() for trim size checking (Ming)

 - A polled IO timeout fix for null_blk (Ming)

 - Silence IO error printing for dead disks (Christoph)

 - Compat mode range fix (Khazhismel)

 - NVMe pull request via Christoph:
     - Tone down the error logging added this merge window a bit
       (Chaitanya Kulkarni)
     - Quirk devices with non-unique unique identifiers (Christoph)

* tag 'block-5.18-2022-04-15' of git://git.kernel.dk/linux-block:
  block: don't print I/O error warning for dead disks
  block/compat_ioctl: fix range check in BLKGETSIZE
  nvme-pci: disable namespace identifiers for Qemu controllers
  nvme-pci: disable namespace identifiers for the MAXIO MAP1002/1202
  nvme: add a quirk to disable namespace identifiers
  nvme: don't print verbose errors for internal passthrough requests
  block: null_blk: end timed out poll request
  block: fix offset/size check in bio_trim()
  asm-generic: fix __get_unaligned_be48() on 32 bit platforms
  block: move lower_48_bits() to block
2022-04-15 11:38:55 -07:00
Linus Torvalds
38a5e3fb17 Merge tag 'vfio-v5.18-rc3' of https://github.com/awilliam/linux-vfio
Pull vfio fix from Alex Williamson:

 - Fix VF token checking for vfio-pci variant drivers (Jason Gunthorpe)

* tag 'vfio-v5.18-rc3' of https://github.com/awilliam/linux-vfio:
  vfio/pci: Fix vf_token mechanism when device-specific VF drivers are used
2022-04-14 16:06:56 -07:00
Linus Torvalds
ec9c57a732 Merge tag 'fscache-fixes-20220413' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs
Pull fscache fixes from David Howells:
 "Here's a collection of fscache and cachefiles fixes and misc small
  cleanups. The two main fixes are:

   - Add a missing unmark of the inode in-use mark in an error path.

   - Fix a KASAN slab-out-of-bounds error when setting the xattr on a
     cachefiles volume due to the wrong length being given to memcpy().

  In addition, there's the removal of an unused parameter, removal of an
  unused Kconfig option, conditionalising a bit of procfs-related stuff
  and some doc fixes"

* tag 'fscache-fixes-20220413' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs:
  fscache: remove FSCACHE_OLD_API Kconfig option
  fscache: Use wrapper fscache_set_cache_state() directly when relinquishing
  fscache: Move fscache_cookies_seq_ops specific code under CONFIG_PROC_FS
  fscache: Remove the cookie parameter from fscache_clear_page_bits()
  docs: filesystems: caching/backend-api.rst: fix an object withdrawn API
  docs: filesystems: caching/backend-api.rst: correct two relinquish APIs use
  cachefiles: Fix KASAN slab-out-of-bounds in cachefiles_set_volume_xattr
  cachefiles: unmark inode in use in error path
2022-04-14 10:51:20 -07:00
Jason Gunthorpe
1ef3342a93 vfio/pci: Fix vf_token mechanism when device-specific VF drivers are used
get_pf_vdev() tries to check if a PF is a VFIO PF by looking at the driver:

       if (pci_dev_driver(physfn) != pci_dev_driver(vdev->pdev)) {

However now that we have multiple VF and PF drivers this is no longer
reliable.

This means that security tests realted to vf_token can be skipped by
mixing and matching different VFIO PCI drivers.

Instead of trying to use the driver core to find the PF devices maintain a
linked list of all PF vfio_pci_core_device's that we have called
pci_enable_sriov() on.

When registering a VF just search the list to see if the PF is present and
record the match permanently in the struct. PCI core locking prevents a PF
from passing pci_disable_sriov() while VF drivers are attached so the VFIO
owned PF becomes a static property of the VF.

In common cases where vfio does not own the PF the global list remains
empty and the VF's pointer is statically NULL.

This also fixes a lockdep splat from recursive locking of the
vfio_group::device_lock between vfio_device_get_from_name() and
vfio_device_get_from_dev(). If the VF and PF share the same group this
would deadlock.

Fixes: ff53edf6d6 ("vfio/pci: Split the pci_driver code out of vfio_pci_core.c")
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/0-v3-876570980634+f2e8-vfio_vf_token_jgg@nvidia.com
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2022-04-13 11:37:44 -06:00
Karol Herbst
f8e6b7babf dma-buf-map: remove renamed header file
commit 7938f42181 ("dma-buf-map: Rename to iosys-map") already renamed
this file, but it got brought back by a merge.

Delete it for real this time.

Fixes: 30424ebae8 ("Merge tag 'drm-intel-gt-next-2022-02-17' of git://anongit.freedesktop.org/drm/drm-intel into drm-intel-next")
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Cc: dri-devel@lists.freedesktop.org
Signed-off-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Michel Dänzer <mdaenzer@redhat.com>
Reviewed-by: Thomas Zimmermann <tzimmermann@suse.de>
Link: https://patchwork.freedesktop.org/patch/msgid/20220411134404.524776-1-kherbst@redhat.com
2022-04-13 15:27:38 +02:00
Jason A. Donenfeld
b0c3e796f2 random: make random_get_entropy() return an unsigned long
Some implementations were returning type `unsigned long`, while others
that fell back to get_cycles() were implicitly returning a `cycles_t` or
an untyped constant int literal. That makes for weird and confusing
code, and basically all code in the kernel already handled it like it
was an `unsigned long`. I recently tried to handle it as the largest
type it could be, a `cycles_t`, but doing so doesn't really help with
much.

Instead let's just make random_get_entropy() return an unsigned long all
the time. This also matches the commonly used `arch_get_random_long()`
function, so now RDRAND and RDTSC return the same sized integer, which
means one can fallback to the other more gracefully.

Cc: Dominik Brodowski <linux@dominikbrodowski.net>
Cc: Theodore Ts'o <tytso@mit.edu>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2022-04-13 13:58:57 +02:00
Linus Torvalds
c1488c9751 Merge tag 'nfsd-5.18-1' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux
Pull nfsd fixes from Chuck Lever:

 - Fix a write performance regression

 - Fix crashes during request deferral on RDMA transports

* tag 'nfsd-5.18-1' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux:
  SUNRPC: Fix the svc_deferred_event trace class
  SUNRPC: Fix NFSD's request deferral on RDMA transports
  nfsd: Clean up nfsd_file_put()
  nfsd: Fix a write performance regression
  SUNRPC: Return true/false (not 1/0) from bool functions
2022-04-12 14:23:19 -10:00
Dave Airlie
b85ffe47c4 Merge tag 'drm-misc-next-2022-04-07' of git://anongit.freedesktop.org/drm/drm-misc into drm-next
drm-misc-next for 5.19:

UAPI Changes:

Cross-subsystem Changes:

Core Changes:
  - atomic: Add atomic_print_state to private objects
  - edid: Constify the EDID parsing API, rework of the API
  - dma-buf: Add dma_resv_replace_fences, dma_resv_get_singleton, make
    dma_resv_excl_fence private
  - format: Support monochrome formats
  - fbdev: fixes for cfb_imageblit and sys_imageblit, pagelist
    corruption fix
  - selftests: several small fixes
  - ttm: Rework bulk move handling

Driver Changes:
  - Switch all relevant drivers to drm_mode_copy or drm_mode_duplicate
  - bridge: conversions to devm_drm_of_get_bridge and panel_bridge,
    autosuspend for analogix_dp, audio support for it66121, DSI to DPI
    support for tc358767, PLL fixes and I2C support for icn6211
  - bridge_connector: Enable HPD if supported
  - etnaviv: fencing improvements
  - gma500: GEM and GTT improvements, connector handling fixes
  - komeda: switch to plane reset helper
  - mediatek: MIPI DSI improvements
  - omapdrm: GEM improvements
  - panel: DT bindings fixes for st7735r, few fixes for ssd130x, new
    panels: ltk035c5444t, B133UAN01, NV3052C
  - qxl: Allow to run on arm64
  - sysfb: Kconfig rework, support for VESA graphic mode selection
  - vc4: Add a tracepoint for CL submissions, HDMI YUV output,
    HDMI and clock improvements
  - virtio: Remove restriction of non-zero blob_flags,
  - vmwgfx: support for CursorMob and CursorBypass 4, various
    improvements and small fixes

[airlied: fixup conflict with newvision panel callbacks]
Signed-off-by: Dave Airlie <airlied@redhat.com>

From: Maxime Ripard <maxime@cerno.tech>
Link: https://patchwork.freedesktop.org/patch/msgid/20220407085940.pnflvjojs4qw4b77@houat
2022-04-12 17:44:27 +10:00
Keith Busch
868e6139c5 block: move lower_48_bits() to block
The function is not generally applicable enough to be included in the core
kernel header. Move it to block since it's the only subsystem using it.

Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Keith Busch <kbusch@kernel.org>
Link: https://lore.kernel.org/r/20220327173316.315-1-kbusch@kernel.org
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-04-11 19:18:27 -06:00
Linus Torvalds
33563138ac Merge tag 'driver-core-5.18-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core
Pull driver core updates from Greg KH:
 "Here are two small driver core changes for 5.18-rc2.

  They are the final bits in the removal of the default_attrs field in
  struct kobj_type. I had to wait until after 5.18-rc1 for all of the
  changes to do this came in through different development trees, and
  then one new user snuck in. So this series has two changes:

   - removal of the default_attrs field in the powerpc/pseries/vas code.

     The change has been acked by the PPC maintainers to come through
     this tree

   - removal of default_attrs from struct kobj_type now that all
     in-kernel users are removed.

     This cleans up the kobject code a little bit and removes some
     duplicated functionality that confused people (now there is only
     one way to do default groups)

  Both of these have been in linux-next for all of this week with no
  reported problems"

* tag 'driver-core-5.18-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core:
  kobject: kobj_type: remove default_attrs
  powerpc/pseries/vas: use default_groups in kobj_type
2022-04-10 09:55:09 -10:00
Linus Torvalds
50c94de67c Merge tag 'locking_urgent_for_v5.18_rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull locking fixes from Borislav Petkov:

 - Allow the compiler to optimize away unused percpu accesses and change
   the local_lock_* macros back to inline functions

 - A couple of fixes to static call insn patching

* tag 'locking_urgent_for_v5.18_rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  Revert "mm/page_alloc: mark pagesets as __maybe_unused"
  Revert "locking/local_lock: Make the empty local_lock_*() function a macro."
  x86/percpu: Remove volatile from arch_raw_cpu_ptr().
  static_call: Remove __DEFINE_STATIC_CALL macro
  static_call: Properly initialise DEFINE_STATIC_CALL_RET0()
  static_call: Don't make __static_call_return0 static
  x86,static_call: Fix __static_call_return0 for i386
2022-04-10 06:56:46 -10:00
Linus Torvalds
fa3b895da8 Merge tag 'gpio-fixes-for-v5.18-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux
Pull gpio fix from Bartosz Golaszewski:

 - fix a race condition with consumers accessing the fields of GPIO IRQ
   chips before they're fully initialized

* tag 'gpio-fixes-for-v5.18-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux:
  gpio: Restrict usage of GPIO chip irq members before initialization
2022-04-09 18:17:43 -10:00
Linus Torvalds
911b2b9516 Merge branch 'akpm' (patches from Andrew)
Merge fixes from Andrew Morton:
 "9 patches.

  Subsystems affected by this patch series: mm (migration, highmem,
  sparsemem, mremap, mempolicy, and memcg), lz4, mailmap, and
  MAINTAINERS"

* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
  MAINTAINERS: add Tom as clang reviewer
  mm/list_lru.c: revert "mm/list_lru: optimize memcg_reparent_list_lru_node()"
  mailmap: update Vasily Averin's email address
  mm/mempolicy: fix mpol_new leak in shared_policy_replace
  mmmremap.c: avoid pointless invalidate_range_start/end on mremap(old_size=0)
  mm/sparsemem: fix 'mem_section' will never be NULL gcc 12 warning
  lz4: fix LZ4_decompress_safe_partial read out of bound
  highmem: fix checks in __kmap_local_sched_{in,out}
  mm: migrate: use thp_order instead of HPAGE_PMD_ORDER for new page allocation.
2022-04-08 14:31:41 -10:00
Waiman Long
a431dbbc54 mm/sparsemem: fix 'mem_section' will never be NULL gcc 12 warning
The gcc 12 compiler reports a "'mem_section' will never be NULL" warning
on the following code:

    static inline struct mem_section *__nr_to_section(unsigned long nr)
    {
    #ifdef CONFIG_SPARSEMEM_EXTREME
        if (!mem_section)
                return NULL;
    #endif
        if (!mem_section[SECTION_NR_TO_ROOT(nr)])
                return NULL;
       :

It happens with CONFIG_SPARSEMEM_EXTREME off.  The mem_section definition
is

    #ifdef CONFIG_SPARSEMEM_EXTREME
    extern struct mem_section **mem_section;
    #else
    extern struct mem_section mem_section[NR_SECTION_ROOTS][SECTIONS_PER_ROOT];
    #endif

In the !CONFIG_SPARSEMEM_EXTREME case, mem_section is a static
2-dimensional array and so the check "!mem_section[SECTION_NR_TO_ROOT(nr)]"
doesn't make sense.

Fix this warning by moving the "!mem_section[SECTION_NR_TO_ROOT(nr)]"
check up inside the CONFIG_SPARSEMEM_EXTREME block and adding an
explicit NR_SECTION_ROOTS check to make sure that there is no
out-of-bound array access.

Link: https://lkml.kernel.org/r/20220331180246.2746210-1-longman@redhat.com
Fixes: 3e347261a8 ("sparsemem extreme implementation")
Signed-off-by: Waiman Long <longman@redhat.com>
Reported-by: Justin Forbes <jforbes@redhat.com>
Cc: "Kirill A . Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Rafael Aquini <aquini@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-04-08 14:20:36 -10:00
Yue Hu
2c547f2998 fscache: Remove the cookie parameter from fscache_clear_page_bits()
The cookie is not used at all, remove it and update the usage in io.c
and afs/write.c (which is the only user outside of fscache currently)
at the same time.

[DH: Amended the documentation also]

Signed-off-by: Yue Hu <huyue2@coolpad.com>
Signed-off-by: David Howells <dhowells@redhat.com>
cc: linux-cachefs@redhat.com
Link: https://listman.redhat.com/archives/linux-cachefs/2022-April/006659.html
2022-04-08 23:54:37 +01:00