I found a bug in the previous version and this patch fixes the gap from
upstream version.
Fixes: fcc385fd44 ("FROMGIT: f2fs: factor out discard_cmd usage from general rb_tree use")
Signed-off-by: Jaegeuk Kim <jaegeuk@google.com>
(cherry picked from commit e39836183be8
https: //git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs.git dev)
Change-Id: I4dbfb9f1f2cc956685a7c4de5fcfbba705c30cfb
Add a vendor hook for pagecache hit/miss and other
vendor specific functions.
Bug: 174088128
Bug: 172987241
Signed-off-by: Chiawei Wang <chiaweiwang@google.com>
Change-Id: Ie9f14a69a86b8ed81de766e44e30f2eba1d9bd84
Signed-off-by: Richard Chang <richardycc@google.com>
(cherry picked from commit db158b4ae0)
Add a vendor hook for costly order page counting
and other vendor specific functions.
Bug: 174521902
Bug: 172987241
Signed-off-by: Chiawei Wang <chiaweiwang@google.com>
Change-Id: I89206727a462548cc3500b695d85c83ff003eec7
Signed-off-by: Richard Chang <richardycc@google.com>
(cherry picked from commit 369de37804)
This reverts commit 3df32812eb which is
commit b1a37ed00d upstream.
It breaks the Android KABI and if needed, should come back in an
abi-safe way.
Bug: 161946584
Cc: Lee Jones <joneslee@google.com>
Change-Id: I1f160797720e8bdf4960542e711fd17940a975d9
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
This reverts commit 02904e8a2f which is
commit 1c5d422124 upstream.
It breaks the Android KABI and if needed, should come back in an
abi-safe way.
Bug: 161946584
Cc: Lee Jones <joneslee@google.com>
Change-Id: I9a460d9dbc41512ee71ff607e875f2da9be7f9f6
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Even if we have multiple queues in the plug list, chances that they
are very interspersed is minimal. Don't bother spending CPU cycles
sorting the list.
Reviewed-by: Christoph Hellwig <hch@lst.de>
Change-Id: Ia85d5c75ef4f2bf3f90e4d3408cffec5c41dcfe2
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Bug: 274474142
(cherry picked from commit df87eb0fce)
Signed-off-by: Bart Van Assche <bvanassche@google.com>
In addition to reverting commit 7b05bf7710 ("Revert "block/mq-deadline:
Prioritize high-priority requests""), this patch uses 'jiffies' instead
of ktime_get() in the code for aging lower priority requests.
This patch has been tested as follows:
Measured QD=1/jobs=1 IOPS for nullb with the mq-deadline scheduler.
Result without and with this patch: 555 K IOPS.
Measured QD=1/jobs=8 IOPS for nullb with the mq-deadline scheduler.
Result without and with this patch: about 380 K IOPS.
Ran the following script:
set -e
scriptdir=$(dirname "$0")
if [ -e /sys/module/scsi_debug ]; then modprobe -r scsi_debug; fi
modprobe scsi_debug ndelay=1000000 max_queue=16
sd=''
while [ -z "$sd" ]; do
sd=$(basename /sys/bus/pseudo/drivers/scsi_debug/adapter*/host*/target*/*/block/*)
done
echo $((100*1000)) > "/sys/block/$sd/queue/iosched/prio_aging_expire"
if [ -e /sys/fs/cgroup/io.prio.class ]; then
cd /sys/fs/cgroup
echo restrict-to-be >io.prio.class
echo +io > cgroup.subtree_control
else
cd /sys/fs/cgroup/blkio/
echo restrict-to-be >blkio.prio.class
fi
echo $$ >cgroup.procs
mkdir -p hipri
cd hipri
if [ -e io.prio.class ]; then
echo none-to-rt >io.prio.class
else
echo none-to-rt >blkio.prio.class
fi
{ "${scriptdir}/max-iops" -a1 -d32 -j1 -e mq-deadline "/dev/$sd" >& ~/low-pri.txt & }
echo $$ >cgroup.procs
"${scriptdir}/max-iops" -a1 -d32 -j1 -e mq-deadline "/dev/$sd" >& ~/hi-pri.txt
Result:
* 11000 IOPS for the high-priority job
* 40 IOPS for the low-priority job
If the prio aging expiry time is changed from 100s into 0, the IOPS results
change into 6712 and 6796 IOPS.
The max-iops script is a script that runs fio with the following arguments:
--bs=4K --gtod_reduce=1 --ioengine=libaio --ioscheduler=${arg_e} --runtime=60
--norandommap --rw=read --thread --buffered=0 --numjobs=${arg_j}
--iodepth=${arg_d} --iodepth_batch_submit=${arg_a}
--iodepth_batch_complete=$((arg_d / 2)) --name=${positional_argument_1}
--filename=${positional_argument_1}
Cc: Damien Le Moal <damien.lemoal@wdc.com>
Cc: Niklas Cassel <Niklas.Cassel@wdc.com>
Cc: Hannes Reinecke <hare@suse.de>
Change-Id: I6eea845db892741089014853e7f5c5756b44288e
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Reviewed-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Link: https://lore.kernel.org/r/20210927220328.1410161-5-bvanassche@acm.org
[axboe: @latest -> @latest_start]
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Bug: 274474142
(cherry picked from commit 322cff70d4)
Signed-off-by: Bart Van Assche <bvanassche@google.com>
Calculating the sum over all CPUs of per-CPU counters frequently is
inefficient. Hence switch from per-CPU to individual counters. Three
counters are protected by the mq-deadline spinlock since these are
only accessed from contexts that already hold that spinlock. The fourth
counter is atomic because protecting it with the mq-deadline spinlock
would trigger lock contention.
Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com>
Reviewed-by: Niklas Cassel <Niklas.Cassel@wdc.com>
Cc: Hannes Reinecke <hare@suse.de>
Change-Id: If9a323c47dfa6aa1c61d0d43a0b1bfed92e137d8
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Link: https://lore.kernel.org/r/20210927220328.1410161-4-bvanassche@acm.org
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Bug: 274474142
(cherry picked from commit bce0363ed8)
Signed-off-by: Bart Van Assche <bvanassche@google.com>
The scheduler .insert_requests() callback is called when a request is
queued for the first time and also when it is requeued. Only count a
request the first time it is queued. Additionally, since the mq-deadline
scheduler only performs zone locking for requests that have been
inserted, skip the zone unlock code for requests that have not been
inserted into the mq-deadline scheduler.
Fixes: 38ba64d12d ("block/mq-deadline: Track I/O statistics")
Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com>
Reviewed-by: Niklas Cassel <Niklas.Cassel@wdc.com>
Cc: Hannes Reinecke <hare@suse.de>
Change-Id: I75923e60be67bd6da62ac25acd7d0635151d99f5
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Link: https://lore.kernel.org/r/20210927220328.1410161-2-bvanassche@acm.org
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Bug: 274474142
(cherry picked from commit e2c7275dc0)
Signed-off-by: Bart Van Assche <bvanassche@google.com>
The attempts so far to make write pipelining work are unsuccessful.
Revert commit 10d6ef4ce0 until write
pipelining works reliably.
Bug: 274474142
Change-Id: Ie7fd92c40ddefd1803b15329a3b1bd1d94012365
Signed-off-by: Bart Van Assche <bvanassche@google.com>
Add new vendor hook when cpuset of task changed. This allows Pixel to
find a more energy efficient CPU instead of random distribution.
Bug: 236775946
Change-Id: I407637c85e2ea93585877312f090981fee848979
Signed-off-by: Jing-Ting Wu <Jing-Ting.Wu@mediatek.com>
Signed-off-by: Will McVicker <willmcvicker@google.com>
This reverts commit a027f0d72e. Multiple
partners have requested for this hook which has resulted in two
different versions -- android_rvh_set_cpus_allowed_by_task and
android_rvh_set_cpus_allowed_ptr_locked. These have since been
consolidated into a single vendor hook on android-mainline
(https://r.android.com/2135713). So let's update this branch to only
use android_rvh_set_cpus_allowed_by_task().
Bug: 236775946
Change-Id: I86f08021d6d87be96f559e133ccd09031bd1b8cd
Signed-off-by: Will McVicker <willmcvicker@google.com>
hwrng core uses two buffers that can be mixed in the
virtio-rng queue.
If the buffer is provided with wait=0 it is enqueued in the
virtio-rng queue but unused by the caller.
On the next call, core provides another buffer but the
first one is filled instead and the new one queued.
And the caller reads the data from the new one that is not
updated, and the data in the first one are lost.
To avoid this mix, virtio-rng needs to use its own unique
internal buffer at a cost of a data copy to the caller buffer.
Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Link: https://lore.kernel.org/r/20211028101111.128049-2-lvivier@redhat.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit bf3175bc50)
Bug: 249566340
Change-Id: Ica2fd680de4bb359923b94dae48c00f6207a6876
Signed-off-by: Alistair Delva <adelva@google.com>
Changes in 5.15.104
xfrm: Allow transport-mode states with AF_UNSPEC selector
drm/panfrost: Don't sync rpm suspension after mmu flushing
cifs: Move the in_send statistic to __smb_send_rqst()
drm/meson: fix 1px pink line on GXM when scaling video overlay
clk: HI655X: select REGMAP instead of depending on it
docs: Correct missing "d_" prefix for dentry_operations member d_weak_revalidate
scsi: mpt3sas: Fix NULL pointer access in mpt3sas_transport_port_add()
ALSA: hda: Match only Intel devices with CONTROLLER_IN_GPU()
netfilter: nft_nat: correct length for loading protocol registers
netfilter: nft_masq: correct length for loading protocol registers
netfilter: nft_redir: correct length for loading protocol registers
netfilter: nft_redir: correct value of inet type `.maxattrs`
scsi: core: Fix a procfs host directory removal regression
tcp: tcp_make_synack() can be called from process context
nfc: pn533: initialize struct pn533_out_arg properly
ipvlan: Make skb->skb_iif track skb->dev for l3s mode
i40e: Fix kernel crash during reboot when adapter is in recovery mode
vdpa_sim: not reset state in vdpasim_queue_ready
vdpa_sim: set last_used_idx as last_avail_idx in vdpasim_queue_ready
PCI: s390: Fix use-after-free of PCI resources with per-function hotplug
drm/i915/display: Workaround cursor left overs with PSR2 selective fetch enabled
drm/i915/display/psr: Use drm damage helpers to calculate plane damaged area
drm/i915/display/psr: Handle plane and pipe restrictions at every page flip
drm/i915/display: clean up comments
drm/i915/psr: Use calculated io and fast wake lines
net/smc: fix NULL sndbuf_desc in smc_cdc_tx_handler()
qed/qed_dev: guard against a possible division by zero
net: dsa: mt7530: remove now incorrect comment regarding port 5
net: dsa: mt7530: set PLL frequency and trgmii only when trgmii is used
loop: Fix use-after-free issues
net: tunnels: annotate lockless accesses to dev->needed_headroom
net: phy: smsc: bail out in lan87xx_read_status if genphy_read_status fails
nfc: st-nci: Fix use after free bug in ndlc_remove due to race condition
net/smc: fix deadlock triggered by cancel_delayed_work_syn()
net: usb: smsc75xx: Limit packet length to skb->len
drm/bridge: Fix returned array size name for atomic_get_input_bus_fmts kdoc
block: null_blk: Fix handling of fake timeout request
nvme: fix handling single range discard request
nvmet: avoid potential UAF in nvmet_req_complete()
block: sunvdc: add check for mdesc_grab() returning NULL
ice: xsk: disable txq irq before flushing hw
net: dsa: mv88e6xxx: fix max_mtu of 1492 on 6165, 6191, 6220, 6250, 6290
ravb: avoid PHY being resumed when interface is not up
sh_eth: avoid PHY being resumed when interface is not up
ipv4: Fix incorrect table ID in IOCTL path
net: usb: smsc75xx: Move packet length check to prevent kernel panic in skb_pull
net/iucv: Fix size of interrupt data
selftests: net: devlink_port_split.py: skip test if no suitable device available
qed/qed_mng_tlv: correctly zero out ->min instead of ->hour
ethernet: sun: add check for the mdesc_grab()
bonding: restore IFF_MASTER/SLAVE flags on bond enslave ether type change
bonding: restore bond's IFF_SLAVE flag if a non-eth dev enslave fails
hwmon: (adt7475) Display smoothing attributes in correct order
hwmon: (adt7475) Fix masking of hysteresis registers
hwmon: (xgene) Fix use after free bug in xgene_hwmon_remove due to race condition
hwmon: (ina3221) return prober error code
hwmon: (ucd90320) Add minimum delay between bus accesses
hwmon: tmp512: drop of_match_ptr for ID table
kconfig: Update config changed flag before calling callback
hwmon: (adm1266) Set `can_sleep` flag for GPIO chip
hwmon: (ltc2992) Set `can_sleep` flag for GPIO chip
media: m5mols: fix off-by-one loop termination error
mmc: atmel-mci: fix race between stop command and start of next command
jffs2: correct logic when creating a hole in jffs2_write_begin
ext4: fail ext4_iget if special inode unallocated
ext4: update s_journal_inum if it changes after journal replay
ext4: fix task hung in ext4_xattr_delete_inode
drm/amdkfd: Fix an illegal memory access
net/9p: fix bug in client create for .L
sh: intc: Avoid spurious sizeof-pointer-div warning
drm/amd/display: fix shift-out-of-bounds in CalculateVMAndRowBytes
ext4: fix possible double unlock when moving a directory
tty: serial: fsl_lpuart: skip waiting for transmission complete when UARTCTRL_SBK is asserted
serial: 8250_em: Fix UART port type
serial: 8250_fsl: fix handle_irq locking
firmware: xilinx: don't make a sleepable memory allocation from an atomic context
s390/ipl: add missing intersection check to ipl_report handling
interconnect: fix mem leak when freeing nodes
interconnect: exynos: fix node leak in probe PM QoS error path
tracing: Make splice_read available again
tracing: Check field value in hist_field_name()
tracing: Make tracepoint lockdep check actually test something
cifs: Fix smb2_set_path_size()
KVM: nVMX: add missing consistency checks for CR0 and CR4
ALSA: hda: intel-dsp-config: add MTL PCI id
ALSA: hda/realtek: Fix the speaker output on Samsung Galaxy Book2 Pro
Revert "riscv: mm: notify remote harts about mmu cache updates"
riscv: asid: Fixup stale TLB entry cause application crash
drm/shmem-helper: Remove another errant put in error path
drm/sun4i: fix missing component unbind on bind errors
drm/amd/pm: Fix sienna cichlid incorrect OD volage after resume
mptcp: fix possible deadlock in subflow_error_report
mptcp: add ro_after_init for tcp{,v6}_prot_override
mptcp: avoid setting TCP_CLOSE state twice
mptcp: fix lockdep false positive in mptcp_pm_nl_create_listen_socket()
ftrace: Fix invalid address access in lookup_rec() when index is 0
nvme-pci: add NVME_QUIRK_BOGUS_NID for Netac NV3000
ice: avoid bonding causing auxiliary plug/unplug under RTNL lock
mm/userfaultfd: propagate uffd-wp bit when PTE-mapping the huge zeropage
mmc: sdhci_am654: lower power-on failed message severity
fbdev: stifb: Provide valid pixelclock and add fb_check_var() checks
trace/hwlat: Do not wipe the contents of per-cpu thread data
net: phy: nxp-c45-tja11xx: fix MII_BASIC_CONFIG_REV bit
cpuidle: psci: Iterate backwards over list in psci_pd_remove()
x86/mce: Make sure logged MCEs are processed after sysfs update
x86/mm: Fix use of uninitialized buffer in sme_enable()
x86/resctrl: Clear staged_config[] before and after it is used
drm/i915: Don't use stolen memory for ring buffers with LLC
drm/i915/active: Fix misuse of non-idle barriers as fence trackers
io_uring: avoid null-ptr-deref in io_arm_poll_handler
PCI: Unify delay handling for reset and resume
PCI/DPC: Await readiness of secondary bus after reset
HID: core: Provide new max_buffer_size attribute to over-ride the default
HID: uhid: Over-ride the default maximum data buffer value with our own
perf: Fix check before add_event_to_groups() in perf_group_detach()
Linux 5.15.104
Change-Id: Ibe292ef3acc57f5ff1ab272fc99756aa49f68c62
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
kthread_create_on_cpu no longer marks the created thread as a per cpu
thread, so the affinity might get lost on suspend or other hotplug
events.
Export kthread_set_per_cpu so a module that needs a kthread to stay on a
specific cpu can accomplish that.
Bug: 274202992
Change-Id: Iaafc12f93f341f9e0586cb051b7f1c941f140866
Signed-off-by: Arve Hjønnevåg <arve@android.com>
Changes in 5.15.103
fs: prevent out-of-bounds array speculation when closing a file descriptor
btrfs: fix percent calculation for bg reclaim message
perf inject: Fix --buildid-all not to eat up MMAP2
fork: allow CLONE_NEWTIME in clone3 flags
x86/CPU/AMD: Disable XSAVES on AMD family 0x17
drm/amdgpu: fix error checking in amdgpu_read_mm_registers for soc15
drm/connector: print max_requested_bpc in state debugfs
staging: rtl8723bs: Pass correct parameters to cfg80211_get_bss()
ext4: fix cgroup writeback accounting with fs-layer encryption
ext4: fix RENAME_WHITEOUT handling for inline directories
ext4: fix another off-by-one fsmap error on 1k block filesystems
ext4: move where set the MAY_INLINE_DATA flag is set
ext4: fix WARNING in ext4_update_inline_data
ext4: zero i_disksize when initializing the bootloader inode
nfc: change order inside nfc_se_io error path
KVM: Optimize kvm_make_vcpus_request_mask() a bit
KVM: Pre-allocate cpumasks for kvm_make_all_cpus_request_except()
KVM: Register /dev/kvm as the _very_ last thing during initialization
KVM: SVM: Don't rewrite guest ICR on AVIC IPI virtualization failure
KVM: SVM: Process ICR on AVIC IPI delivery failure due to invalid target
fs: dlm: fix log of lowcomms vs midcomms
fs: dlm: add midcomms init/start functions
fs: dlm: start midcomms before scand
udf: Fix off-by-one error when discarding preallocation
f2fs: avoid down_write on nat_tree_lock during checkpoint
f2fs: do not bother checkpoint by f2fs_get_node_info
f2fs: retry to update the inode page given data corruption
ipmi:ssif: Increase the message retry time
ipmi:ssif: Add a timer between request retries
irqdomain: Refactor __irq_domain_alloc_irqs()
iommu/vt-d: Fix PASID directory pointer coherency
block/brd: add error handling support for add_disk()
brd: mark as nowait compatible
arm64: efi: Make efi_rt_lock a raw_spinlock
RISC-V: Avoid dereferening NULL regs in die()
riscv: Avoid enabling interrupts in die()
riscv: Add header include guards to insn.h
scsi: core: Remove the /proc/scsi/${proc_name} directory earlier
regulator: Flag uncontrollable regulators as always_on
regulator: core: Fix off-on-delay-us for always-on/boot-on regulators
regulator: core: Use ktime_get_boottime() to determine how long a regulator was off
ext4: Fix possible corruption when moving a directory
drm/nouveau/kms/nv50-: remove unused functions
drm/nouveau/kms/nv50: fix nv50_wndw_new_ prototype
drm/msm: Fix potential invalid ptr free
drm/msm/a5xx: fix setting of the CP_PREEMPT_ENABLE_LOCAL register
drm/msm/a5xx: fix highest bank bit for a530
drm/msm/a5xx: fix the emptyness check in the preempt code
drm/msm/a5xx: fix context faults during ring switch
bgmac: fix *initial* chip reset to support BCM5358
nfc: fdp: add null check of devm_kmalloc_array in fdp_nci_i2c_read_device_properties
powerpc: dts: t1040rdb: fix compatible string for Rev A boards
ila: do not generate empty messages in ila_xlat_nl_cmd_get_mapping()
selftests: nft_nat: ensuring the listening side is up before starting the client
perf stat: Fix counting when initial delay configured
net: lan78xx: fix accessing the LAN7800's internal phy specific registers from the MAC driver
net: caif: Fix use-after-free in cfusbl_device_notify()
ice: copy last block omitted in ice_get_module_eeprom()
bpf, sockmap: Fix an infinite loop error when len is 0 in tcp_bpf_recvmsg_parser()
drm/msm/dpu: fix len of sc7180 ctl blocks
net: stmmac: add to set device wake up flag when stmmac init phy
net: phylib: get rid of unnecessary locking
bnxt_en: Avoid order-5 memory allocation for TPA data
netfilter: ctnetlink: revert to dumping mark regardless of event type
netfilter: tproxy: fix deadlock due to missing BH disable
btf: fix resolving BTF_KIND_VAR after ARRAY, STRUCT, UNION, PTR
net: phy: smsc: Cache interrupt mask
net: phy: smsc: fix link up detection in forced irq mode
net: ethernet: mtk_eth_soc: fix RX data corruption issue
scsi: megaraid_sas: Update max supported LD IDs to 240
netfilter: conntrack: adopt safer max chain length
platform: x86: MLX_PLATFORM: select REGMAP instead of depending on it
net/smc: fix fallback failed while sendmsg with fastopen
octeontx2-af: Unlock contexts in the queue context cache in case of fault detection
SUNRPC: Fix a server shutdown leak
net: dsa: mt7530: permit port 5 to work without port 6 on MT7621 SoC
af_unix: Remove unnecessary brackets around CONFIG_AF_UNIX_OOB.
af_unix: fix struct pid leaks in OOB support
riscv: Use READ_ONCE_NOCHECK in imprecise unwinding stack mode
s390/ftrace: remove dead code
RISC-V: Don't check text_mutex during stop_machine
ext4: Fix deadlock during directory rename
irqdomain: Fix mapping-creation race
nbd: use the correct block_device in nbd_bdev_reset
iommu/amd: Add PCI segment support for ivrs_[ioapic/hpet/acpihid] commands
iommu/amd: Fix ill-formed ivrs_ioapic, ivrs_hpet and ivrs_acpihid options
iommu/amd: Add a length limitation for the ivrs_acpihid command-line parameter
staging: rtl8723bs: clean up comparsions to NULL
Staging: rtl8723bs: Placing opening { braces in previous line
staging: rtl8723bs: fix placement of braces
staging: rtl8723bs: Fix key-store index handling
watch_queue: fix IOC_WATCH_QUEUE_SET_SIZE alloc error paths
tpm/eventlog: Don't abort tpm_read_log on faulty ACPI address
xfs: use setattr_copy to set vfs inode attributes
xfs: remove XFS_PREALLOC_SYNC
xfs: fallocate() should call file_modified()
xfs: set prealloc flag in xfs_alloc_file_space()
fs: add mode_strip_sgid() helper
fs: move S_ISGID stripping into the vfs_*() helpers
attr: add in_group_or_capable()
fs: move should_remove_suid()
attr: add setattr_should_drop_sgid()
attr: use consistent sgid stripping checks
fs: use consistent setgid checks in is_sxid()
MIPS: Fix a compilation issue
powerpc/iommu: fix memory leak with using debugfs_lookup()
powerpc/kcsan: Exclude udelay to prevent recursive instrumentation
alpha: fix R_ALPHA_LITERAL reloc for large modules
macintosh: windfarm: Use unsigned type for 1-bit bitfields
PCI: Add SolidRun vendor ID
scripts: handle BrokenPipeError for python scripts
media: ov5640: Fix analogue gain control
media: rc: gpio-ir-recv: add remove function
filelocks: use mount idmapping for setlease permission check
ext4: refactor ext4_free_blocks() to pull out ext4_mb_clear_bb()
ext4: add ext4_sb_block_valid() refactored out of ext4_inode_block_valid()
ext4: add strict range checks while freeing blocks
ext4: block range must be validated before use in ext4_mb_clear_bb()
arch: fix broken BuildID for arm64 and riscv
powerpc/vmlinux.lds: Define RUNTIME_DISCARD_EXIT
powerpc/vmlinux.lds: Don't discard .rela* for relocatable builds
s390: define RUNTIME_DISCARD_EXIT to fix link error with GNU ld < 2.36
sh: define RUNTIME_DISCARD_EXIT
tools build: Add feature test for init_disassemble_info API changes
tools include: add dis-asm-compat.h to handle version differences
tools perf: Fix compilation error with new binutils
tools bpf_jit_disasm: Fix compilation error with new binutils
tools bpftool: Fix compilation error with new binutils
KVM: fix memoryleak in kvm_init()
xfs: remove xfs_setattr_time() declaration
UML: define RUNTIME_DISCARD_EXIT
fs: hold writers when changing mount's idmapping
KVM: nVMX: Don't use Enlightened MSR Bitmap for L3
KVM: VMX: Introduce vmx_msr_bitmap_l01_changed() helper
KVM: VMX: Fix crash due to uninitialized current_vmcs
Makefile: use -gdwarf-{4|5} for assembler for DEBUG_INFO_DWARF{4|5}
Linux 5.15.103
Change-Id: I7ab86cd0356da0ac0fe5d54635cad5408f73bafe
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Changes in 5.15.101
Revert "drm/i915: Don't use BAR mappings for ring buffers with LLC"
Linux 5.15.101
Change-Id: I6050cdc6c5fbb3020c826d0abfa3ab820d8de1c8
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
kmap_atomic was deprecated in 5.11, and checkpatch now warns about use
of it. Replace with kmap_local_page, and do not manually disable
preemption or page faults.
Bug: 264474028
Fixes: ef2ab77cc1 ("ANDROID: dma-buf: system_heap: Add pagepool support to system heap")
Change-Id: Idd6413ff56aadf4fd925acb6f567366d0e03166f
Signed-off-by: T.J. Mercier <tjmercier@google.com>
commit fd0815f632 upstream.
Events should only be added to a groups rb tree if they have not been
removed from their context by list_del_event(). Since remove_on_exec
made it possible to call list_del_event() on individual events before
they are detached from their group, perf_group_detach() should check each
sibling's attach_state before calling add_event_to_groups() on it.
Fixes: 2e498d0a74 ("perf: Add support for event removal on exec")
Signed-off-by: Budimir Markovic <markovicbudimir@gmail.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/ZBFzvQV9tEqoHEtH@gentoo
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 1c5d422124 upstream.
The default maximum data buffer size for this interface is UHID_DATA_MAX
(4k). When data buffers are being processed, ensure this value is used
when ensuring the sanity, rather than a value between the user provided
value and HID_MAX_BUFFER_SIZE (16k).
Signed-off-by: Lee Jones <lee@kernel.org>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Lee Jones <lee@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit b1a37ed00d upstream.
Presently, when a report is processed, its proposed size, provided by
the user of the API (as Report Size * Report Count) is compared against
the subsystem default HID_MAX_BUFFER_SIZE (16k). However, some
low-level HID drivers allocate a reduced amount of memory to their
buffers (e.g. UHID only allocates UHID_DATA_MAX (4k) buffers), rending
this check inadequate in some cases.
In these circumstances, if the received report ends up being smaller
than the proposed report size, the remainder of the buffer is zeroed.
That is, the space between sizeof(csize) (size of the current report)
and the rsize (size proposed i.e. Report Size * Report Count), which can
be handled up to HID_MAX_BUFFER_SIZE (16k). Meaning that memset()
shoots straight past the end of the buffer boundary and starts zeroing
out in-use values, often resulting in calamity.
This patch introduces a new variable into 'struct hid_ll_driver' where
individual low-level drivers can over-ride the default maximum value of
HID_MAX_BUFFER_SIZE (16k) with something more sympathetic to the
interface.
Signed-off-by: Lee Jones <lee@kernel.org>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
[Lee: Backported to v5.15.y]
Signed-off-by: Lee Jones <lee@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 53b54ad074 upstream.
pci_bridge_wait_for_secondary_bus() is called after a Secondary Bus
Reset, but not after a DPC-induced Hot Reset.
As a result, the delays prescribed by PCIe r6.0 sec 6.6.1 are not
observed and devices on the secondary bus may be accessed before
they're ready.
One affected device is Intel's Ponte Vecchio HPC GPU. It comprises a
PCIe switch whose upstream port is not immediately ready after reset.
Because its config space is restored too early, it remains in
D0uninitialized, its subordinate devices remain inaccessible and DPC
recovery fails with messages such as:
i915 0000:8c:00.0: can't change power state from D3cold to D0 (config space inaccessible)
intel_vsec 0000:8e:00.1: can't change power state from D3cold to D0 (config space inaccessible)
pcieport 0000:89:02.0: AER: device recovery failed
Fix it.
Link: https://lore.kernel.org/r/9f5ff00e1593d8d9a4b452398b98aa14d23fca11.1673769517.git.lukas@wunner.de
Tested-by: Ravi Kishore Koppuravuri <ravi.kishore.koppuravuri@intel.com>
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit ac91e69805 upstream.
Sheng Bi reports that pci_bridge_secondary_bus_reset() may fail to wait
for devices on the secondary bus to become accessible after reset:
Although it does call pci_dev_wait(), it erroneously passes the bridge's
pci_dev rather than that of a child. The bridge of course is always
accessible while its secondary bus is reset, so pci_dev_wait() returns
immediately.
Sheng Bi proposes introducing a new pci_bridge_secondary_bus_wait()
function which is called from pci_bridge_secondary_bus_reset():
https://lore.kernel.org/linux-pci/20220523171517.32407-1-windy.bi.enflame@gmail.com/
However we already have pci_bridge_wait_for_secondary_bus() which does
almost exactly what we need. So far it's only called on resume from
D3cold (which implies a Fundamental Reset per PCIe r6.0 sec 5.8).
Re-using it for Secondary Bus Resets is a leaner and more rational
approach than introducing a new function.
That only requires a few minor tweaks:
- Amend pci_bridge_wait_for_secondary_bus() to await accessibility of
the first device on the secondary bus by calling pci_dev_wait() after
performing the prescribed delays. pci_dev_wait() needs two parameters,
a reset reason and a timeout, which callers must now pass to
pci_bridge_wait_for_secondary_bus(). The timeout is 1 sec for resume
(PCIe r6.0 sec 6.6.1) and 60 sec for reset (commit 821cdad5c4 ("PCI:
Wait up to 60 seconds for device to become ready after FLR")).
Introduce a PCI_RESET_WAIT macro for the 1 sec timeout.
- Amend pci_bridge_wait_for_secondary_bus() to return 0 on success or
-ENOTTY on error for consumption by pci_bridge_secondary_bus_reset().
- Drop an unnecessary 1 sec delay from pci_reset_secondary_bus() which
is now performed by pci_bridge_wait_for_secondary_bus(). A static
delay this long is only necessary for Conventional PCI, so modern
PCIe systems benefit from shorter reset times as a side effect.
Fixes: 6b2f1351af ("PCI: Wait for device to become ready after secondary bus reset")
Link: https://lore.kernel.org/r/da77c92796b99ec568bd070cbe4725074a117038.1673769517.git.lukas@wunner.de
Reported-by: Sheng Bi <windy.bi.enflame@gmail.com>
Tested-by: Ravi Kishore Koppuravuri <ravi.kishore.koppuravuri@intel.com>
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
Cc: stable@vger.kernel.org # v4.17+
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
No upstream commit exists for this commit.
The issue was introduced with backporting upstream commit c16bda3759
("io_uring/poll: allow some retries for poll triggering spuriously").
Memory allocation can possibly fail causing invalid pointer be
dereferenced just before comparing it to NULL value.
Move the pointer check in proper place (upstream has the similar location
of the check). In case the request has REQ_F_POLLED flag up, apoll can't
be NULL so no need to check there.
Found by Linux Verification Center (linuxtesting.org) with Syzkaller.
Signed-off-by: Fedor Pchelkin <pchelkin@ispras.ru>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit e0e6b416b2 upstream.
Users reported oopses on list corruptions when using i915 perf with a
number of concurrently running graphics applications. Root cause analysis
pointed at an issue in barrier processing code -- a race among perf open /
close replacing active barriers with perf requests on kernel context and
concurrent barrier preallocate / acquire operations performed during user
context first pin / last unpin.
When adding a request to a composite tracker, we try to reuse an existing
fence tracker, already allocated and registered with that composite. The
tracker we obtain may already track another fence, may be an idle barrier,
or an active barrier.
If the tracker we get occurs a non-idle barrier then we try to delete that
barrier from a list of barrier tasks it belongs to. However, while doing
that we don't respect return value from a function that performs the
barrier deletion. Should the deletion ever fail, we would end up reusing
the tracker still registered as a barrier task. Since the same structure
field is reused with both fence callback lists and barrier tasks list,
list corruptions would likely occur.
Barriers are now deleted from a barrier tasks list by temporarily removing
the list content, traversing that content with skip over the node to be
deleted, then populating the list back with the modified content. Should
that intentionally racy concurrent deletion attempts be not serialized,
one or more of those may fail because of the list being temporary empty.
Related code that ignores the results of barrier deletion was initially
introduced in v5.4 by commit d8af05ff38 ("drm/i915: Allow sharing the
idle-barrier from other kernel requests"). However, all users of the
barrier deletion routine were apparently serialized at that time, then the
issue didn't exhibit itself. Results of git bisect with help of a newly
developed igt@gem_barrier_race@remote-request IGT test indicate that list
corruptions might start to appear after commit 311770173f ("drm/i915/gt:
Schedule request retirement when timeline idles"), introduced in v5.5.
Respect results of barrier deletion attempts -- mark the barrier as idle
only if successfully deleted from the list. Then, before proceeding with
setting our fence as the one currently tracked, make sure that the tracker
we've got is not a non-idle barrier. If that check fails then don't use
that tracker but go back and try to acquire a new, usable one.
v3: use unlikely() to document what outcome we expect (Andi),
- fix bad grammar in commit description.
v2: no code changes,
- blame commit 311770173f ("drm/i915/gt: Schedule request retirement
when timeline idles"), v5.5, not commit d8af05ff38 ("drm/i915: Allow
sharing the idle-barrier from other kernel requests"), v5.4,
- reword commit description.
Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/6333
Fixes: 311770173f ("drm/i915/gt: Schedule request retirement when timeline idles")
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: stable@vger.kernel.org # v5.5
Cc: Andi Shyti <andi.shyti@linux.intel.com>
Signed-off-by: Janusz Krzysztofik <janusz.krzysztofik@linux.intel.com>
Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com>
Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230302120820.48740-1-janusz.krzysztofik@linux.intel.com
(cherry picked from commit 5060060557)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Janusz Krzysztofik <janusz.krzysztofik@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 0424a7dfe9 upstream.
As a temporary storage, staged_config[] in rdt_domain should be cleared
before and after it is used. The stale value in staged_config[] could
cause an MSR access error.
Here is a reproducer on a system with 16 usable CLOSIDs for a 15-way L3
Cache (MBA should be disabled if the number of CLOSIDs for MB is less than
16.) :
mount -t resctrl resctrl -o cdp /sys/fs/resctrl
mkdir /sys/fs/resctrl/p{1..7}
umount /sys/fs/resctrl/
mount -t resctrl resctrl /sys/fs/resctrl
mkdir /sys/fs/resctrl/p{1..8}
An error occurs when creating resource group named p8:
unchecked MSR access error: WRMSR to 0xca0 (tried to write 0x00000000000007ff) at rIP: 0xffffffff82249142 (cat_wrmsr+0x32/0x60)
Call Trace:
<IRQ>
__flush_smp_call_function_queue+0x11d/0x170
__sysvec_call_function+0x24/0xd0
sysvec_call_function+0x89/0xc0
</IRQ>
<TASK>
asm_sysvec_call_function+0x16/0x20
When creating a new resource control group, hardware will be configured
by the following process:
rdtgroup_mkdir()
rdtgroup_mkdir_ctrl_mon()
rdtgroup_init_alloc()
resctrl_arch_update_domains()
resctrl_arch_update_domains() iterates and updates all resctrl_conf_type
whose have_new_ctrl is true. Since staged_config[] holds the same values as
when CDP was enabled, it will continue to update the CDP_CODE and CDP_DATA
configurations. When group p8 is created, get_config_index() called in
resctrl_arch_update_domains() will return 16 and 17 as the CLOSIDs for
CDP_CODE and CDP_DATA, which will be translated to an invalid register -
0xca0 in this scenario.
Fix it by clearing staged_config[] before and after it is used.
[reinette: re-order commit tags]
Fixes: 75408e4350 ("x86/resctrl: Allow different CODE/DATA configurations to be staged")
Suggested-by: Xin Hao <xhao@linux.alibaba.com>
Signed-off-by: Shawn Wang <shawnwang@linux.alibaba.com>
Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Tested-by: Reinette Chatre <reinette.chatre@intel.com>
Cc:stable@vger.kernel.org
Link: https://lore.kernel.org/all/2fad13f49fbe89687fc40e9a5a61f23a28d1507a.1673988935.git.reinette.chatre%40intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit cbebd68f59 upstream.
cmdline_find_option() may fail before doing any initialization of
the buffer array. This may lead to unpredictable results when the same
buffer is used later in calls to strncmp() function. Fix the issue by
returning early if cmdline_find_option() returns an error.
Found by Linux Verification Center (linuxtesting.org) with static
analysis tool SVACE.
Fixes: aca20d5462 ("x86/mm: Add support to make use of Secure Memory Encryption")
Signed-off-by: Nikita Zhandarovich <n.zhandarovich@fintech.ru>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Acked-by: Tom Lendacky <thomas.lendacky@amd.com>
Cc: <stable@kernel.org>
Link: https://lore.kernel.org/r/20230306160656.14844-1-n.zhandarovich@fintech.ru
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 4783b9cb37 upstream.
A recent change introduced a flag to queue up errors found during
boot-time polling. These errors will be processed during late init once
the MCE subsystem is fully set up.
A number of sysfs updates call mce_restart() which goes through a subset
of the CPU init flow. This includes polling MCA banks and logging any
errors found. Since the same function is used as boot-time polling,
errors will be queued. However, the system is now past late init, so the
errors will remain queued until another error is found and the workqueue
is triggered.
Call mce_schedule_work() at the end of mce_restart() so that queued
errors are processed.
Fixes: 3bff147b18 ("x86/mce: Defer processing of early errors")
Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: Tony Luck <tony.luck@intel.com>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20230301221420.2203184-1-yazen.ghannam@amd.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 203873a535 upstream.
Find a valid modeline depending on the machine graphic card
configuration and add the fb_check_var() function to validate
Xorg provided graphics settings.
Signed-off-by: Helge Deller <deller@gmx.de>
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>