Add the functionality that allow users of shmem to reclaim its pages
without going through the kswapd/direct reclaim path. An example usecase
is: Say that device allocates a larger amount of shmem pages and shares
it with hardware. To faster reclaims such pages, drivers can register
the shrinkers and call reclaim_shmem_address_space().
This commit is a squash of changes that contains all the fixes in the
reclaim_shmem_address_space() API.
Bug: 201263305
Change-Id: I03d2c3b9610612af977f89ddeabb63b8e9e50918
Signed-off-by: Charan Teja Reddy <charante@codeaurora.org>
Signed-off-by: Charan Teja Kalla <quic_charante@quicinc.com>
Move the code for aborting all SCSI commands and TMFs into a new function.
This patch makes the ufshcd_err_handler() easier to read. Except for adding
more logging, this patch does not change any functionality.
Change-Id: I55c4c5ab2e0d98977ba74c57e249b0d9573c016a
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Link: https://lore.kernel.org/r/20221031183433.2443554-1-bvanassche@acm.org
Reviewed-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(cherry picked from commit b817e6ffba git://git.kernel.org/pub/scm/linux/kernel/git/mkp/scsi.git for-next)
Signed-off-by: Bart Van Assche <bvanassche@google.com>
Add the following symbols in order to get info on processes.
4 function symbol(s) added
'int cgroup_path_ns(struct cgroup*, char*, size_t, struct cgroup_namespace*)'
'struct task_struct* find_task_by_vpid(pid_t)'
'struct nlattr* nla_reserve_64bit(struct sk_buff*, int, int, int)'
'pid_t pid_nr_ns(struct pid*, struct pid_namespace*)'
Bug: 233972073
Change-Id: I3ed02480b5e349562f2fe0f421d1d297db3c43d1
Signed-off-by: Chris Goldsworthy <quic_cgoldswo@quicinc.com>
The fact that blk_mq_destroy_queue also drops a queue reference leads
to various places having to grab an extra reference. Move the call to
blk_put_queue into the callers to allow removing the extra references.
Change-Id: Ic79ef3c0dbc005013f78c580aa27b5434f143bd9
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Link: https://lore.kernel.org/r/20221018135720.670094-2-hch@lst.de
[axboe: fix fabrics_q vs admin_q conflict in nvme core.c]
Signed-off-by: Jens Axboe <axboe@kernel.dk>
(cherry picked from commit 2b3f056f72 git://git.kernel.org/pub/scm/linux/kernel/git/mkp/scsi.git for-next)
Signed-off-by: Bart Van Assche <bvanassche@google.com>
Depending upon the toolchain, empty ELF sections in the object files
of a hypervisor module can be emitted with unusual flags such as
"Writeable, non-allocatable" which modpost gets grumpy about:
| WARNING: modpost: drivers/misc/pkvm-pl011/pkvm_pl011.o (.hyp.rodata): unexpected non-allocatable section.
Since these sections are empty, whether or not they are allocatable is
not of any concern to the module loader, so adjust the check in modpost
to quietly tolerate this insanity.
Signed-off-by: Will Deacon <willdeacon@google.com>
Bug: 269245057
Change-Id: I6bf0ccdb1c0440b0a117ea0a0949aff47f7ccadb
Although the host lock had to be held by ufshcd_clk_scaling_start_busy()
callers when that function was introduced, that is no longer the case
today. Hence remove the comment that claims that callers of this function
must hold the host lock.
Reviewed-by: Bean Huo <beanhuo@micron.com>
Reviewed-by: Adrian Hunter <adrian.hunter@intel.com>
Change-Id: Ia7e828b4f0e767e04d39a47bf40f6a809834f188
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Link: https://lore.kernel.org/r/20221018202958.1902564-5-bvanassche@acm.org
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(cherry picked from commit 1626c7bba1 git://git.kernel.org/pub/scm/linux/kernel/git/mkp/scsi.git for-next)
Signed-off-by: Bart Van Assche <bvanassche@google.com>
Some of the platforms might be still expecting dedicated memory region
for ramoops node. So add logic to detect the start and size of the
ramoops memory region by looking up reserved memory region with
of_reserved_mem_lookup() when platform_get_resource() failed.
Bug: 191636717
Change-Id: Idc479b45fb3f637f7235efd6eabac62059d5e92b
Signed-off-by: Prasad Sodagudi <psodagud@codeaurora.org>
(cherry picked from commit 9b136eab76)
The reserved memory region for ramoops is assumed to be at a fixed
and known location when read from the devicetree. This is not desirable
in environments where it is preferred for the region to be dynamically
allocated at runtime, as opposed to it being fixed at compile time.
Change the logic for detecting the start and size of the ramoops
memory region by looking up the reserved memory region instead of
using platform_get_resource(), which assumes that the location
of the memory is known ahead of time.
Bug: 191636717
Link: https://lore.kernel.org/patchwork/patch/1451704/
Change-Id: I24066de9f4fe1f1575cb1bbb1687c37a2b1938a4
Signed-off-by: Isaac J. Manjarres <isaacm@codeaurora.org>
Signed-off-by: Mukesh Ojha <mojha@codeaurora.org>
Signed-off-by: Prasad Sodagudi <psodagud@codeaurora.org>
(cherry picked from commit bd2ca0ba5b)
Bug: 241479010
Test: incfs_test passes, play confirm behavior in bug is fixed
Signed-off-by: Paul Lawrence <paullawrence@google.com>
Change-Id: Ie51f2b76d0873057f54fecf7fcc793c66df20969
Also fix race whereby multiple providers writinig the same block would
actually write out the same block.
Note that multiple_providers_test started failing when incfs was ported
to 5.15, and these fixes are needed to make the test reliable
Bug: 264703896
Test: incfs-test passes, specifically multiple_providers_test. Ran 100
times
Change-Id: I05ad5b2b2f62cf218256222cecb79bbe9953bd97
Signed-off-by: Paul Lawrence <paullawrence@google.com>
MGLRU tries to avoid doing unnecessary anon reclaim work if swap is
low by checking if the available swap is less than the MIN_BATCH_SIZE
(256kB). This can lead to unintened consequences where PSI pressure is
less and LMKD doesn't wake up in time to avoid file cache thrashing.
Remove this check to preserve the old bahavior. This can be improved
later on once we have a low swap notification event from the kernel.
Bug: 268574308
Change-Id: Id381316931a9cf6e7ea8b1ea7800c77f176c9892
Signed-off-by: Kalesh Singh <kaleshsingh@google.com>
Use the assert_in_mod_range() to validate the registered callback is
part of a module VA space. This feature requires CONFIG_NVHE_EL2_DEBUG.
Bug: 269245057
Change-Id: I4c4d60ac77882fc2d36c3c73b096f4ba9afb83e5
Signed-off-by: Vincent Donnefort <vdonnefort@google.com>
Add the following symbols to the symbol list:
- irq_do_set_affinity
- __traceiter_android_rvh_gic_v3_set_affinity
- __tracepoint_android_rvh_gic_v3_set_affinity
Bug: 269786647
Change-Id: I5e081606476f3eea727dd7ec94e6290a7ce1acc1
Signed-off-by: Guru Das Srinagesh <quic_gurus@quicinc.com>
Changes in 6.1.13
mptcp: sockopt: make 'tcp_fastopen_connect' generic
mptcp: fix locking for setsockopt corner-case
mptcp: deduplicate error paths on endpoint creation
mptcp: fix locking for in-kernel listener creation
btrfs: move the auto defrag code to defrag.c
btrfs: lock the inode in shared mode before starting fiemap
ASoC: amd: yc: Add DMI support for new acer/emdoor platforms
ASoC: SOF: sof-audio: start with the right widget type
ALSA: usb-audio: Add FIXED_RATE quirk for JBL Quantum610 Wireless
ASoC: Intel: sof_rt5682: always set dpcm_capture for amplifiers
ASoC: Intel: sof_cs42l42: always set dpcm_capture for amplifiers
ASoC: Intel: sof_nau8825: always set dpcm_capture for amplifiers
ASoC: Intel: sof_ssp_amp: always set dpcm_capture for amplifiers
selftests/bpf: Verify copy_register_state() preserves parent/live fields
ALSA: hda: Do not unset preset when cleaning up codec
ASoC: amd: yc: Add Xiaomi Redmi Book Pro 15 2022 into DMI table
bpf, sockmap: Don't let sock_map_{close,destroy,unhash} call itself
ASoC: cs42l56: fix DT probe
tools/virtio: fix the vringh test for virtio ring changes
vdpa: ifcvf: Do proper cleanup if IFCVF init fails
net/rose: Fix to not accept on connected socket
selftest: net: Improve IPV6_TCLASS/IPV6_HOPLIMIT tests apparmor compatibility
net: stmmac: do not stop RX_CLK in Rx LPI state for qcs404 SoC
powerpc/64: Fix perf profiling asynchronous interrupt handlers
fscache: Use clear_and_wake_up_bit() in fscache_create_volume_work()
drm/nouveau/devinit/tu102-: wait for GFW_BOOT_PROGRESS == COMPLETED
net: ethernet: mtk_eth_soc: Avoid truncating allocation
net: sched: sch: Bounds check priority
s390/decompressor: specify __decompress() buf len to avoid overflow
nvme-fc: fix a missing queue put in nvmet_fc_ls_create_association
nvme: clear the request_queue pointers on failure in nvme_alloc_admin_tag_set
nvme: clear the request_queue pointers on failure in nvme_alloc_io_tag_set
drm/amd/display: Add missing brackets in calculation
drm/amd/display: Adjust downscaling limits for dcn314
drm/amd/display: Unassign does_plane_fit_in_mall function from dcn3.2
drm/amd/display: Reset DMUB mailbox SW state after HW reset
drm/amdgpu: enable HDP SD for gfx 11.0.3
drm/amdgpu: Enable vclk dclk node for gc11.0.3
drm/amd/display: Properly handle additional cases where DCN is not supported
platform/x86: touchscreen_dmi: Add Chuwi Vi8 (CWI501) DMI match
ceph: move mount state enum to super.h
ceph: blocklist the kclient when receiving corrupted snap trace
selftests: mptcp: userspace: fix v4-v6 test in v6.1
of: reserved_mem: Have kmemleak ignore dynamically allocated reserved mem
kasan: fix Oops due to missing calls to kasan_arch_is_ready()
mm: shrinkers: fix deadlock in shrinker debugfs
aio: fix mremap after fork null-deref
vmxnet3: move rss code block under eop descriptor
fbdev: Fix invalid page access after closing deferred I/O devices
drm: Disable dynamic debug as broken
drm/amd/amdgpu: fix warning during suspend
drm/amd/display: Fail atomic_check early on normalize_zpos error
drm/vmwgfx: Stop accessing buffer objects which failed init
drm/vmwgfx: Do not drop the reference to the handle too soon
mmc: jz4740: Work around bug on JZ4760(B)
mmc: meson-gx: fix SDIO mode if cap_sdio_irq isn't set
mmc: sdio: fix possible resource leaks in some error paths
mmc: mmc_spi: fix error handling in mmc_spi_probe()
ALSA: hda: Fix codec device field initializan
ALSA: hda/conexant: add a new hda codec SN6180
ALSA: hda/realtek - fixed wrong gpio assigned
ALSA: hda/realtek: fix mute/micmute LEDs don't work for a HP platform.
ALSA: hda/realtek: Enable mute/micmute LEDs and speaker support for HP Laptops
ata: ahci: Add Tiger Lake UP{3,4} AHCI controller
ata: libata-core: Disable READ LOG DMA EXT for Samsung MZ7LH
sched/psi: Fix use-after-free in ep_remove_wait_queue()
hugetlb: check for undefined shift on 32 bit architectures
nilfs2: fix underflow in second superblock position calculations
mm/MADV_COLLAPSE: set EAGAIN on unexpected page refcount
mm/filemap: fix page end in filemap_get_read_batch
mm/migrate: fix wrongly apply write bit after mkdirty on sparc64
gpio: sim: fix a memory leak
freezer,umh: Fix call_usermode_helper_exec() vs SIGKILL
coredump: Move dump_emit_page() to kill unused warning
Revert "mm: Always release pages to the buddy allocator in memblock_free_late()."
net: Fix unwanted sign extension in netdev_stats_to_stats64()
revert "squashfs: harden sanity check in squashfs_read_xattr_id_table"
drm/vc4: crtc: Increase setup cost in core clock calculation to handle extreme reduced blanking
drm/vc4: Fix YUV plane handling when planes are in different buffers
drm/i915/gen11: Wa_1408615072/Wa_1407596294 should be on GT list
ice: fix lost multicast packets in promisc mode
ixgbe: allow to increase MTU to 3K with XDP enabled
i40e: add double of VLAN header when computing the max MTU
net: bgmac: fix BCM5358 support by setting correct flags
net: ethernet: ti: am65-cpsw: Add RX DMA Channel Teardown Quirk
sctp: sctp_sock_filter(): avoid list_entry() on possibly empty list
net/sched: tcindex: update imperfect hash filters respecting rcu
ice: xsk: Fix cleaning of XDP_TX frames
dccp/tcp: Avoid negative sk_forward_alloc by ipv6_pinfo.pktoptions.
net/usb: kalmia: Don't pass act_len in usb_bulk_msg error path
net/sched: act_ctinfo: use percpu stats
net: openvswitch: fix possible memory leak in ovs_meter_cmd_set()
net: stmmac: fix order of dwmac5 FlexPPS parametrization sequence
bnxt_en: Fix mqprio and XDP ring checking logic
tracing: Make trace_define_field_ext() static
net: stmmac: Restrict warning on disabling DMA store and fwd mode
net: use a bounce buffer for copying skb->mark
tipc: fix kernel warning when sending SYN message
net: mpls: fix stale pointer if allocation fails during device rename
igb: conditionalize I2C bit banging on external thermal sensor support
igb: Fix PPS input and output using 3rd and 4th SDP
ixgbe: add double of VLAN header when computing the max MTU
ipv6: Fix datagram socket connection with DSCP.
ipv6: Fix tcp socket connection with DSCP.
mm/gup: add folio to list when folio_isolate_lru() succeed
mm: extend max struct page size for kmsan
i40e: Add checking for null for nlmsg_find_attr()
net/sched: tcindex: search key must be 16 bits
nvme-tcp: stop auth work after tearing down queues in error recovery
nvme-rdma: stop auth work after tearing down queues in error recovery
KVM: x86/pmu: Disable vPMU support on hybrid CPUs (host PMUs)
kvm: initialize all of the kvm_debugregs structure before sending it to userspace
perf/x86: Refuse to export capabilities for hybrid PMUs
alarmtimer: Prevent starvation by small intervals and SIG_IGN
nvme-pci: refresh visible attrs for cmb attributes
ASoC: SOF: Intel: hda-dai: fix possible stream_tag leak
net: sched: sch: Fix off by one in htb_activate_prios()
Linux 6.1.13
Change-Id: I8a1e4175939c14f726c545001061b95462566386
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
commit e917a849c3 upstream.
The sysfs group containing the cmb attributes is registered before the
driver knows if they need to be visible or not. Update the group when
cmb attributes are known to exist so the visibility setting is correct.
Link: https://bugzilla.kernel.org/show_bug.cgi?id=217037
Fixes: 86adbf0cdb ("nvme: simplify transport specific device attribute handling")
Signed-off-by: Keith Busch <kbusch@kernel.org>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit d125d1349a upstream.
syzbot reported a RCU stall which is caused by setting up an alarmtimer
with a very small interval and ignoring the signal. The reproducer arms the
alarm timer with a relative expiry of 8ns and an interval of 9ns. Not a
problem per se, but that's an issue when the signal is ignored because then
the timer is immediately rearmed because there is no way to delay that
rearming to the signal delivery path. See posix_timer_fn() and commit
58229a1899 ("posix-timers: Prevent softirq starvation by small intervals
and SIG_IGN") for details.
The reproducer does not set SIG_IGN explicitely, but it sets up the timers
signal with SIGCONT. That has the same effect as explicitely setting
SIG_IGN for a signal as SIGCONT is ignored if there is no handler set and
the task is not ptraced.
The log clearly shows that:
[pid 5102] --- SIGCONT {si_signo=SIGCONT, si_code=SI_TIMER, si_timerid=0, si_overrun=316014, si_int=0, si_ptr=NULL} ---
It works because the tasks are traced and therefore the signal is queued so
the tracer can see it, which delays the restart of the timer to the signal
delivery path. But then the tracer is killed:
[pid 5087] kill(-5102, SIGKILL <unfinished ...>
...
./strace-static-x86_64: Process 5107 detached
and after it's gone the stall can be observed:
syzkaller login: [ 79.439102][ C0] hrtimer: interrupt took 68471 ns
[ 184.460538][ C1] rcu: INFO: rcu_preempt detected stalls on CPUs/tasks:
...
[ 184.658237][ C1] rcu: Stack dump where RCU GP kthread last ran:
[ 184.664574][ C1] Sending NMI from CPU 1 to CPUs 0:
[ 184.669821][ C0] NMI backtrace for cpu 0
[ 184.669831][ C0] CPU: 0 PID: 5108 Comm: syz-executor192 Not tainted 6.2.0-rc6-next-20230203-syzkaller #0
...
[ 184.670036][ C0] Call Trace:
[ 184.670041][ C0] <IRQ>
[ 184.670045][ C0] alarmtimer_fired+0x327/0x670
posix_timer_fn() prevents that by checking whether the interval for
timers which have the signal ignored is smaller than a jiffie and
artifically delay it by shifting the next expiry out by a jiffie. That's
accurate vs. the overrun accounting, but slightly inaccurate
vs. timer_gettimer(2).
The comment in that function says what needs to be done and there was a fix
available for the regular userspace induced SIG_IGN mechanism, but that did
not work due to the implicit ignore for SIGCONT and similar signals. This
needs to be worked on, but for now the only available workaround is to do
exactly what posix_timer_fn() does:
Increase the interval of self-rearming timers, which have their signal
ignored, to at least a jiffie.
Interestingly this has been fixed before via commit ff86bf0c65
("alarmtimer: Rate limit periodic intervals") already, but that fix got
lost in a later rework.
Reported-by: syzbot+b9564ba6e8e00694511b@syzkaller.appspotmail.com
Fixes: f2c45807d3 ("alarmtimer: Switch over to generic set/get/rearm routine")
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: John Stultz <jstultz@google.com>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/87k00q1no2.ffs@tglx
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 4d7404e5ee upstream.
Disable KVM support for virtualizing PMUs on hosts with hybrid PMUs until
KVM gains a sane way to enumeration the hybrid vPMU to userspace and/or
gains a mechanism to let userspace opt-in to the dangers of exposing a
hybrid vPMU to KVM guests. Virtualizing a hybrid PMU, or at least part of
a hybrid PMU, is possible, but it requires careful, deliberate
configuration from userspace.
E.g. to expose full functionality, vCPUs need to be pinned to pCPUs to
prevent migrating a vCPU between a big core and a little core, userspace
must enumerate a reasonable topology to the guest, and guest CPUID must be
curated per vCPU to enumerate accurate vPMU capabilities.
The last point is especially problematic, as KVM doesn't control which
pCPU it runs on when enumerating KVM's vPMU capabilities to userspace,
i.e. userspace can't rely on KVM_GET_SUPPORTED_CPUID in it's current form.
Alternatively, userspace could enable vPMU support by enumerating the
set of features that are common and coherent across all cores, e.g. by
filtering PMU events and restricting guest capabilities. But again, that
requires userspace to take action far beyond reflecting KVM's supported
feature set into the guest.
For now, simply disable vPMU support on hybrid CPUs to avoid inducing
seemingly random #GPs in guests, and punt support for hybrid CPUs to a
future enabling effort.
Reported-by: Jianfeng Gao <jianfeng.gao@intel.com>
Cc: stable@vger.kernel.org
Cc: Andrew Cooper <Andrew.Cooper3@citrix.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Link: https://lore.kernel.org/all/20220818181530.2355034-1-kan.liang@linux.intel.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20230208204230.1360502-2-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 91c11d5f32 ]
when starting error recovery there might be a authentication work
running, and it involves I/O commands. Given the controller is tearing
down there is no chance for the I/O to complete other than timing out
which may unnecessarily take a full io timeout.
So first tear down the queues, fail/cancel all inflight I/O (including
potentially authentication) and only then stop authentication. This
ensures that failover is not stalled due to blocked authentication I/O.
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 1f1a4f8956 ]
when starting error recovery there might be a authentication work
running, and it involves I/O commands. Given the controller is tearing
down there is no chance for the I/O to complete other than timing out
which may unnecessarily take a full io timeout.
So first tear down the queues, fail/cancel all inflight I/O (including
potentially authentication) and only then stop authentication. This
ensures that failover is not stalled due to blocked authentication I/O.
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
commit 8230680f36 upstream.
Take into account the IPV6_TCLASS socket option (DSCP) in
tcp_v6_connect(). Otherwise fib6_rule_match() can't properly
match the DSCP value, resulting in invalid route lookup.
For example:
ip route add unreachable table main 2001:db8::10/124
ip route add table 100 2001:db8::10/124 dev eth0
ip -6 rule add dsfield 0x04 table 100
echo test | socat - TCP6:[2001:db8::11]:54321,ipv6-tclass=0x04
Without this patch, socat fails at connect() time ("No route to host")
because the fib-rule doesn't jump to table 100 and the lookup ends up
being done in the main table.
Fixes: 2cc67cc731 ("[IPV6] ROUTE: Routing by Traffic Class.")
Signed-off-by: Guillaume Nault <gnault@redhat.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit e010ae08c7 upstream.
Take into account the IPV6_TCLASS socket option (DSCP) in
ip6_datagram_flow_key_init(). Otherwise fib6_rule_match() can't
properly match the DSCP value, resulting in invalid route lookup.
For example:
ip route add unreachable table main 2001:db8::10/124
ip route add table 100 2001:db8::10/124 dev eth0
ip -6 rule add dsfield 0x04 table 100
echo test | socat - UDP6:[2001:db8::11]:54321,ipv6-tclass=0x04
Without this patch, socat fails at connect() time ("No route to host")
because the fib-rule doesn't jump to table 100 and the lookup ends up
being done in the main table.
Fixes: 2cc67cc731 ("[IPV6] ROUTE: Routing by Traffic Class.")
Signed-off-by: Guillaume Nault <gnault@redhat.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 5d54cb1767 upstream.
Commit a97f8783a9 ("igb: unbreak I2C bit-banging on i350") introduced
code to change I2C settings to bit banging unconditionally.
However, this patch introduced a regression: On an Intel S2600CWR
Server Board with three NICs:
- 1x dual-port copper
Intel I350 Gigabit Network Connection [8086:1521] (rev 01)
fw 1.63, 0x80000dda
- 2x quad-port SFP+ with copper SFP Avago ABCU-5700RZ
Intel I350 Gigabit Fiber Network Connection [8086:1522] (rev 01)
fw 1.52.0
the SFP NICs no longer get link at all. Reverting commit a97f8783a9
or switching to the Intel out-of-tree driver both fix the problem.
Per the igb out-of-tree driver, I2C bit banging on i350 depends on
support for an external thermal sensor (ETS). However, commit
a97f8783a9 added bit banging unconditionally. Additionally, the
out-of-tree driver always calls init_thermal_sensor_thresh on probe,
while our driver only calls init_thermal_sensor_thresh only in
igb_reset(), and only if an ETS is present, ignoring the internal
thermal sensor. The affected SFPs don't provide an ETS. Per Intel,
the behaviour is a result of i350 firmware requirements.
This patch fixes the problem by aligning the behaviour to the
out-of-tree driver:
- split igb_init_i2c() into two functions:
- igb_init_i2c() only performs the basic I2C initialization.
- igb_set_i2c_bb() makes sure that E1000_CTRL_I2C_ENA is set
and enables bit-banging.
- igb_probe() only calls igb_set_i2c_bb() if an ETS is present.
- igb_probe() calls init_thermal_sensor_thresh() unconditionally.
- igb_reset() aligns its behaviour to igb_probe(), i. e., call
igb_set_i2c_bb() if an ETS is present and call
init_thermal_sensor_thresh() unconditionally.
Fixes: a97f8783a9 ("igb: unbreak I2C bit-banging on i350")
Tested-by: Mateusz Palczewski <mateusz.palczewski@intel.com>
Co-developed-by: Jamie Bainbridge <jbainbri@redhat.com>
Signed-off-by: Jamie Bainbridge <jbainbri@redhat.com>
Signed-off-by: Corinna Vinschen <vinschen@redhat.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Link: https://lore.kernel.org/r/20230214185549.1306522-1-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>