The ast2600 no longer uses bit 4 in the control register to indicate a
1MHz clock (It now controls whether this watchdog is reset by a SOC
reset). This means we do not want to set it. It also does not need to be
set for the ast2500, as it is read-only on that SoC.
The comment next to the clock rate selection wandered away from where it
was set, so put it back next to the register setting it's describing.
Fixes: b3528b4874 ("watchdog: aspeed: Add support for AST2600")
Signed-off-by: Joel Stanley <joel@jms.id.au>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Link: https://lore.kernel.org/r/20191108032905.22463-1-joel@jms.id.au
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Wim Van Sebroeck <wim@linux-watchdog.org>
The following hang is observed when a 'reboot' command is issued:
# reboot
# Stopping network: OK
Stopping klogd: OK
Stopping syslogd: OK
umount: devtmpfs busy - remounted read-only
[ 8.612079] EXT4-fs (mmcblk0p2): re-mounted. Opts: (null)
The system is going down NOW!
Sent SIGTERM to all processes
Sent SIGKILL to all processes
Requesting system reboot
[ 10.694753] reboot: Restarting system
[ 11.699008] Reboot failed -- System halted
Fix this problem by adding a .restart ops member.
Fixes: 41b630f41b ("watchdog: Add i.MX7ULP watchdog support")
Signed-off-by: Fabio Estevam <festevam@gmail.com>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Link: https://lore.kernel.org/r/20191029174037.25381-1-festevam@gmail.com
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Wim Van Sebroeck <wim@linux-watchdog.org>
It can be useful to delay setting the nowayout feature for a watchdog
device. Moreover, not every driver (notably gpio_wdt) implements a
nowayout module parameter/otherwise respects CONFIG_WATCHDOG_NOWAYOUT,
and modifying those drivers carries a risk of causing a regression for
someone who has two watchdog devices, sets CONFIG_WATCHDOG_NOWAYOUT
and somehow relies on the gpio_wdt driver being ignorant of
that (i.e., allowing one to gracefully close a gpio_wdt but not the
other watchdog in the system).
So instead, simply make the nowayout sysfs file writable. Obviously,
setting nowayout is a one-way street.
Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Link: https://lore.kernel.org/r/20191105205118.11359-1-linux@rasmusvillemoes.dk
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Wim Van Sebroeck <wim@linux-watchdog.org>
When PREEMPT_RT is enabled, all hrtimer expiry functions are
deferred for execution into the context of ksoftirqd unless otherwise
annotated.
Deferring the expiry of the hrtimer used by the watchdog core, however,
is a waste, as the callback does nothing but queue a kthread work item
and wakeup watchdogd.
It's worst then that, too: the deferral through ksoftirqd also means
that for correct behavior a user must adjust the scheduling parameters
of both watchdogd _and_ ksoftirqd, which is unnecessary and has other
side effects (like causing unrelated expiry functions to execute at
potentially elevated priority).
Instead, mark the hrtimer used by the watchdog core as being _HARD to
allow it's execution directly from hardirq context. The work done in
this expiry function is well-bounded and minimal.
A user still must adjust the scheduling parameters of the watchdogd
to be correct w.r.t. their application needs.
Link: https://lkml.kernel.org/r/0e02d8327aeca344096c246713033887bc490dd7.1538089180.git.julia@ni.com
Cc: Guenter Roeck <linux@roeck-us.net>
Reported-and-tested-by: Steffen Trumtrar <s.trumtrar@pengutronix.de>
Reported-by: Tim Sander <tim@krieglstein.org>
Signed-off-by: Julia Cartwright <julia@ni.com>
Acked-by: Guenter Roeck <linux@roeck-us.net>
[bigeasy: use only HRTIMER_MODE_REL_HARD]
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Link: https://lore.kernel.org/r/20191105144506.clyadjbvnn7b7b2m@linutronix.de
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Wim Van Sebroeck <wim@linux-watchdog.org>
The struct cdev is embedded in the struct watchdog_core_data. In the
current code, we manage the watchdog_core_data with a kref, but the
cdev is manged by a kobject. There is no any relationship between
this kref and kobject. So it is possible that the watchdog_core_data is
freed before the cdev is entirely released. We can easily get the
following call trace with CONFIG_DEBUG_KOBJECT_RELEASE and
CONFIG_DEBUG_OBJECTS_TIMERS enabled.
ODEBUG: free active (active state 0) object type: timer_list hint: delayed_work_timer_fn+0x0/0x38
WARNING: CPU: 23 PID: 1028 at lib/debugobjects.c:481 debug_print_object+0xb0/0xf0
Modules linked in: softdog(-) deflate ctr twofish_generic twofish_common camellia_generic serpent_generic blowfish_generic blowfish_common cast5_generic cast_common cmac xcbc af_key sch_fq_codel openvswitch nsh nf_conncount nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4
CPU: 23 PID: 1028 Comm: modprobe Not tainted 5.3.0-next-20190924-yoctodev-standard+ #180
Hardware name: Marvell OcteonTX CN96XX board (DT)
pstate: 00400009 (nzcv daif +PAN -UAO)
pc : debug_print_object+0xb0/0xf0
lr : debug_print_object+0xb0/0xf0
sp : ffff80001cbcfc70
x29: ffff80001cbcfc70 x28: ffff800010ea2128
x27: ffff800010bad000 x26: 0000000000000000
x25: ffff80001103c640 x24: ffff80001107b268
x23: ffff800010bad9e8 x22: ffff800010ea2128
x21: ffff000bc2c62af8 x20: ffff80001103c600
x19: ffff800010e867d8 x18: 0000000000000060
x17: 0000000000000000 x16: 0000000000000000
x15: ffff000bd7240470 x14: 6e6968207473696c
x13: 5f72656d6974203a x12: 6570797420746365
x11: 6a626f2029302065 x10: 7461747320657669
x9 : 7463612820657669 x8 : 3378302f3078302b
x7 : 0000000000001d7a x6 : ffff800010fd5889
x5 : 0000000000000000 x4 : 0000000000000000
x3 : 0000000000000000 x2 : ffff000bff948548
x1 : 276a1c9e1edc2300 x0 : 0000000000000000
Call trace:
debug_print_object+0xb0/0xf0
debug_check_no_obj_freed+0x1e8/0x210
kfree+0x1b8/0x368
watchdog_cdev_unregister+0x88/0xc8
watchdog_dev_unregister+0x38/0x48
watchdog_unregister_device+0xa8/0x100
softdog_exit+0x18/0xfec4 [softdog]
__arm64_sys_delete_module+0x174/0x200
el0_svc_handler+0xd0/0x1c8
el0_svc+0x8/0xc
This is a common issue when using cdev embedded in a struct.
Fortunately, we already have a mechanism to solve this kind of issue.
Please see commit 233ed09d7f ("chardev: add helper function to
register char devs with a struct device") for more detail.
In this patch, we choose to embed the struct device into the
watchdog_core_data, and use the API provided by the commit 233ed09d7f
to make sure that the release of watchdog_core_data and cdev are
in sequence.
Signed-off-by: Kevin Hao <haokexin@gmail.com>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Link: https://lore.kernel.org/r/20191008112934.29669-1-haokexin@gmail.com
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Wim Van Sebroeck <wim@linux-watchdog.org>
Pull ARM SoC fixes from Olof Johansson:
"A set of fixes that have trickled in over the last couple of weeks:
- MAINTAINER update for Cavium/Marvell ThunderX2
- stm32 tweaks to pinmux for Joystick/Camera, and RAM allocation for
CAN interfaces
- i.MX fixes for voltage regulator GPIO mappings, fixes voltage
scaling issues
- More i.MX fixes for various issues on i.MX eval boards: interrupt
storm due to u-boot leaving pins in new states, fixing power button
config, a couple of compatible-string corrections.
- Powerdown and Suspend/Resume fixes for Allwinner A83-based tablets
- A few documentation tweaks and a fix of a memory leak in the reset
subsystem"
* tag 'armsoc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc:
MAINTAINERS: update Cavium ThunderX2 maintainers
ARM: dts: stm32: change joystick pinctrl definition on stm32mp157c-ev1
ARM: dts: stm32: remove OV5640 pinctrl definition on stm32mp157c-ev1
ARM: dts: stm32: Fix CAN RAM mapping on stm32mp157c
ARM: dts: stm32: relax qspi pins slew-rate for stm32mp157
arm64: dts: zii-ultra: fix ARM regulator GPIO handle
ARM: sunxi: Fix CPU powerdown on A83T
ARM: dts: sun8i-a83t-tbs-a711: Fix WiFi resume from suspend
arm64: dts: imx8mn: fix compatible string for sdma
arm64: dts: imx8mm: fix compatible string for sdma
reset: fix reset_control_ops kerneldoc comment
ARM: dts: imx6-logicpd: Re-enable SNVS power key
soc: imx: gpc: fix initialiser format
ARM: dts: imx6qdl-sabreauto: Fix storm of accelerometer interrupts
arm64: dts: ls1028a: fix a compatible issue
reset: fix reset_control_get_exclusive kerneldoc comment
reset: fix reset_control_lookup kerneldoc comment
reset: fix of_reset_control_get_count kerneldoc comment
reset: fix of_reset_simple_xlate kerneldoc comment
reset: Fix memory leak in reset_control_array_put()
Pull IIO fixes and staging driver from Greg KH:
"Here is a mix of a number of IIO driver fixes for 5.4-rc7, and a whole
new staging driver.
The IIO fixes resolve some reported issues, all are tiny.
The staging driver addition is the vboxsf filesystem, which is the
VirtualBox guest shared folder code. Hans has been trying to get
filesystem reviewers to review the code for many months now, and
Christoph finally said to just merge it in staging now as it is
stand-alone and the filesystem people can review it easier over time
that way.
I know it's late for this big of an addition, but it is stand-alone.
The code has been in linux-next for a while, long enough to pick up a
few tiny fixes for it already so people are looking at it.
All of these have been in linux-next with no reported issues"
* tag 'staging-5.4-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging:
staging: Fix error return code in vboxsf_fill_super()
staging: vboxsf: fix dereference of pointer dentry before it is null checked
staging: vboxsf: Remove unused including <linux/version.h>
staging: Add VirtualBox guest shared folder (vboxsf) support
iio: adc: stm32-adc: fix stopping dma
iio: imu: inv_mpu6050: fix no data on MPU6050
iio: srf04: fix wrong limitation in distance measuring
iio: imu: adis16480: make sure provided frequency is positive
Pull char/misc driver fixes from Greg KH:
"Here are a number of late-arrival driver fixes for issues reported for
some char/misc drivers for 5.4-rc7
These all come from the different subsystem/driver maintainers as
things that they had reports for and wanted to see fixed.
All of these have been in linux-next with no reported issues"
* tag 'char-misc-5.4-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc:
intel_th: pci: Add Jasper Lake PCH support
intel_th: pci: Add Comet Lake PCH support
intel_th: msu: Fix possible memory leak in mode_store()
intel_th: msu: Fix overflow in shift of an unsigned int
intel_th: msu: Fix missing allocation failure check on a kstrndup
intel_th: msu: Fix an uninitialized mutex
intel_th: gth: Fix the window switching sequence
soundwire: slave: fix scanf format
soundwire: intel: fix intel_register_dai PDI offsets and numbers
interconnect: Add locking in icc_set_tag()
interconnect: qcom: Fix icc_onecell_data allocation
soundwire: depend on ACPI || OF
soundwire: depend on ACPI
thunderbolt: Drop unnecessary read when writing LC command in Ice Lake
thunderbolt: Fix lockdep circular locking depedency warning
thunderbolt: Read DP IN adapter first two dwords in one go
Pull configfs regression fix from Christoph Hellwig:
"Fix a regression from this merge window in the configfs symlink
handling (Honggang Li)"
* tag 'configfs-for-5.4-2' of git://git.infradead.org/users/hch/configfs:
configfs: calculate the depth of parent item
Pull x86 fixes from Thomas Gleixner:
"A small set of fixes for x86:
- Make the tsc=reliable/nowatchdog command line parameter work again.
It was broken with the introduction of the early TSC clocksource.
- Prevent the evaluation of exception stacks before they are set up.
This causes a crash in dumpstack because the stack walk termination
gets screwed up.
- Prevent a NULL pointer dereference in the rescource control file
system.
- Avoid bogus warnings about APIC id mismatch related to the LDR
which can happen when the LDR is not in use and therefore not
initialized. Only evaluate that when the APIC is in logical
destination mode"
* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/tsc: Respect tsc command line paraemeter for clocksource_tsc_early
x86/dumpstack/64: Don't evaluate exception stacks before setup
x86/apic/32: Avoid bogus LDR warnings
x86/resctrl: Prevent NULL pointer dereference when reading mondata
Pull timer fixes from Thomas Gleixner:
"A small set of fixes for timekeepoing and clocksource drivers:
- VDSO data was updated conditional on the availability of a VDSO
capable clocksource. This causes the VDSO functions which do not
depend on a VDSO capable clocksource to operate on stale data.
Always update unconditionally.
- Prevent a double free in the mediatek driver
- Use the proper helper in the sh_mtu2 driver so it won't attempt to
initialize non-existing interrupts"
* 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
timekeeping/vsyscall: Update VDSO data unconditionally
clocksource/drivers/sh_mtu2: Do not loop using platform_get_irq_by_name()
clocksource/drivers/mediatek: Fix error handling
Pull scheduler fixes from Thomas Gleixner:
"Two fixes for scheduler regressions:
- Plug a subtle race condition which was introduced with the rework
of the next task selection functionality. The change of task
properties became unprotected which can be observed inconsistently
causing state corruption.
- A trivial compile fix for CONFIG_CGROUPS=n"
* 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
sched: Fix pick_next_task() vs 'change' pattern race
sched/core: Fix compilation error when cgroup not selected
Pull perf tooling fixes from Thomas Gleixner:
- Fix the time sorting algorithm which was broken due to truncation of
big numbers
- Fix the python script generator fail caused by a broken tracepoint
array iterator
* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
perf tools: Fix time sorting
perf tools: Remove unused trace_find_next_event()
perf scripting engines: Iterate on tep event arrays directly
Pull irq fixlet from Thomas Gleixner:
"A trivial fix for a kernel doc regression where an argument change was
not reflected in the documentation"
* 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
irq/irqdomain: Update __irq_domain_alloc_fwnode() function documentation
Pull stacktrace fix from Thomas Gleixner:
"A small fix for a stacktrace regression.
Saving a stacktrace for a foreign task skipped an extra entry which
makes e.g. the output of /proc/$PID/stack incomplete"
* 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
stacktrace: Don't skip first entry on noncurrent tasks
Pull cifs fix from Steve French:
"Small fix for an smb3 reconnect bug (also marked for stable)"
* tag '5.4-rc7-smb3-fix' of git://git.samba.org/sfrench/cifs-2.6:
SMB3: Fix persistent handles reconnect
config option GENERIC_IO was removed but still selected by lib/kconfig
This patch finish the cleaning.
Fixes: 9de8da4774 ("kconfig: kill off GENERIC_IO option")
Acked-by: Rob Herring <robh@kernel.org>
Signed-off-by: Corentin Labbe <clabbe@baylibre.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Pull btrfs fixes from David Sterba:
"A few regressions and fixes for stable.
Regressions:
- fix a race leading to metadata space leak after task received a
signal
- un-deprecate 2 ioctls, marked as deprecated by mistake
Fixes:
- fix limit check for number of devices during chunk allocation
- fix a race due to double evaluation of i_size_read inside max()
macro, can cause a crash
- remove wrong device id check in tree-checker"
* tag 'for-5.4-rc6-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
btrfs: un-deprecate ioctls START_SYNC and WAIT_SYNC
btrfs: save i_size to avoid double evaluation of i_size_read in compress_file_range
Btrfs: fix race leading to metadata space leak after task received signal
btrfs: tree-checker: Fix wrong check on max devid
btrfs: Consider system chunk array size for new SYSTEM chunks
Pull watchdog fixes from Wim Van Sebroeck:
- cpwd: fix build regression
- pm8916_wdt: fix pretimeout registration flow
- meson: Fix the wrong value of left time
- imx_sc_wdt: Pretimeout should follow SCU firmware format
- bd70528: Add MODULE_ALIAS to allow module auto loading
* tag 'linux-watchdog-5.4-rc7' of git://www.linux-watchdog.org/linux-watchdog:
watchdog: bd70528: Add MODULE_ALIAS to allow module auto loading
watchdog: imx_sc_wdt: Pretimeout should follow SCU firmware format
watchdog: meson: Fix the wrong value of left time
watchdog: pm8916_wdt: fix pretimeout registration flow
watchdog: cpwd: fix build regression
Pull networking fixes from David Miller:
1) BPF sample build fixes from Björn Töpel
2) Fix powerpc bpf tail call implementation, from Eric Dumazet.
3) DCCP leaks jiffies on the wire, fix also from Eric Dumazet.
4) Fix crash in ebtables when using dnat target, from Florian Westphal.
5) Fix port disable handling whne removing bcm_sf2 driver, from Florian
Fainelli.
6) Fix kTLS sk_msg trim on fallback to copy mode, from Jakub Kicinski.
7) Various KCSAN fixes all over the networking, from Eric Dumazet.
8) Memory leaks in mlx5 driver, from Alex Vesker.
9) SMC interface refcounting fix, from Ursula Braun.
10) TSO descriptor handling fixes in stmmac driver, from Jose Abreu.
11) Add a TX lock to synchonize the kTLS TX path properly with crypto
operations. From Jakub Kicinski.
12) Sock refcount during shutdown fix in vsock/virtio code, from Stefano
Garzarella.
13) Infinite loop in Intel ice driver, from Colin Ian King.
* git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (108 commits)
ixgbe: need_wakeup flag might not be set for Tx
i40e: need_wakeup flag might not be set for Tx
igb/igc: use ktime accessors for skb->tstamp
i40e: Fix for ethtool -m issue on X722 NIC
iavf: initialize ITRN registers with correct values
ice: fix potential infinite loop because loop counter being too small
qede: fix NULL pointer deref in __qede_remove()
net: fix data-race in neigh_event_send()
vsock/virtio: fix sock refcnt holding during the shutdown
net: ethernet: octeon_mgmt: Account for second possible VLAN header
mac80211: fix station inactive_time shortly after boot
net/fq_impl: Switch to kvmalloc() for memory allocation
mac80211: fix ieee80211_txq_setup_flows() failure path
ipv4: Fix table id reference in fib_sync_down_addr
ipv6: fixes rt6_probe() and fib6_nh->last_probe init
net: hns: Fix the stray netpoll locks causing deadlock in NAPI path
net: usb: qmi_wwan: add support for DW5821e with eSIM support
CDC-NCM: handle incomplete transfer of MTU
nfc: netlink: fix double device reference drop
NFC: st21nfca: fix double free
...
Pull block fixes from Jens Axboe:
- Two NVMe device removal crash fixes, and a compat fixup for for an
ioctl that was introduced in this release (Anton, Charles, Max - via
Keith)
- Missing error path mutex unlock for drbd (Dan)
- cgroup writeback fixup on dead memcg (Tejun)
- blkcg online stats print fix (Tejun)
* tag 'for-linus-2019-11-08' of git://git.kernel.dk/linux-block:
cgroup,writeback: don't switch wbs immediately on dead wbs if the memcg is dead
block: drbd: remove a stray unlock in __drbd_send_protocol()
blkcg: make blkcg_print_stat() print stats only for online blkgs
nvme: change nvme_passthru_cmd64 to explicitly mark rsvd
nvme-multipath: fix crash in nvme_mpath_clear_ctrl_paths
nvme-rdma: fix a segmentation fault during module unload
Jeff Kirsher says:
====================
Intel Wired LAN Driver Fixes 2019-11-08
This series contains fixes to igb, igc, ixgbe, i40e, iavf and ice
drivers.
Colin Ian King fixes a potentially wrap-around counter in a for-loop.
Nick fixes the default ITR values for the iavf driver to 50 usecs
interval.
Arkadiusz fixes 'ethtool -m' for X722 devices where the correct value
cannot be obtained from the firmware, so add X722 to the check to ensure
the wrong value is not returned.
Jake fixes igb and igc drivers in their implementation of launch time
support by declaring skb->tstamp value as ktime_t instead of s64.
Magnus fixes ixgbe and i40e where the need_wakeup flag for transmit may
not be set for AF_XDP sockets that are only used to send packets.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
The need_wakeup flag for Tx might not be set for AF_XDP sockets that
are only used to send packets. This happens if there is at least one
outstanding packet that has not been completed by the hardware and we
get that corresponding completion (which will not generate an
interrupt since interrupts are disabled in the napi poll loop) between
the time we stopped processing the Tx completions and interrupts are
enabled again. In this case, the need_wakeup flag will have been
cleared at the end of the Tx completion processing as we believe we
will get an interrupt from the outstanding completion at a later point
in time. But if this completion interrupt occurs before interrupts
are enable, we lose it and should at that point really have set the
need_wakeup flag since there are no more outstanding completions that
can generate an interrupt to continue the processing. When this
happens, user space will see a Tx queue need_wakeup of 0 and skip
issuing a syscall, which means will never get into the Tx processing
again and we have a deadlock.
This patch introduces a quick fix for this issue by just setting the
need_wakeup flag for Tx to 1 all the time. I am working on a proper
fix for this that will toggle the flag appropriately, but it is more
challenging than I anticipated and I am afraid that this patch will
not be completed before the merge window closes, therefore this easier
fix for now. This fix has a negative performance impact in the range
of 0% to 4%. Towards the higher end of the scale if you have driver
and application on the same core and issue a lot of packets, and
towards no negative impact if you use two cores, lower transmission
speeds and/or a workload that also receives packets.
Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
The need_wakeup flag for Tx might not be set for AF_XDP sockets that
are only used to send packets. This happens if there is at least one
outstanding packet that has not been completed by the hardware and we
get that corresponding completion (which will not generate an
interrupt since interrupts are disabled in the napi poll loop) between
the time we stopped processing the Tx completions and interrupts are
enabled again. In this case, the need_wakeup flag will have been
cleared at the end of the Tx completion processing as we believe we
will get an interrupt from the outstanding completion at a later point
in time. But if this completion interrupt occurs before interrupts
are enable, we lose it and should at that point really have set the
need_wakeup flag since there are no more outstanding completions that
can generate an interrupt to continue the processing. When this
happens, user space will see a Tx queue need_wakeup of 0 and skip
issuing a syscall, which means will never get into the Tx processing
again and we have a deadlock.
This patch introduces a quick fix for this issue by just setting the
need_wakeup flag for Tx to 1 all the time. I am working on a proper
fix for this that will toggle the flag appropriately, but it is more
challenging than I anticipated and I am afraid that this patch will
not be completed before the merge window closes, therefore this easier
fix for now. This fix has a negative performance impact in the range
of 0% to 4%. Towards the higher end of the scale if you have driver
and application on the same core and issue a lot of packets, and
towards no negative impact if you use two cores, lower transmission
speeds and/or a workload that also receives packets.
Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
When implementing launch time support in the igb and igc drivers, the
skb->tstamp value is assumed to be a s64, but it's declared as a ktime_t
value.
Although ktime_t is typedef'd to s64 it wasn't always, and the kernel
provides accessors for ktime_t values.
Use the ktime_to_timespec64 and ktime_set accessors instead of directly
assuming that the variable is always an s64.
This improves portability if the code is ever moved to another kernel
version, or if the definition of ktime_t ever changes again in the
future.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Acked-by: Vinicius Costa Gomes <vinicius.gomes@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This patch contains fix for a problem with command:
'ethtool -m <dev>'
which breaks functionality of:
'ethtool <dev>'
when called on X722 NIC
Disallowed update of link phy_types on X722 NIC
Currently correct value cannot be obtained from FW
Previously wrong value returned by FW was used and was
a root cause for incorrect output of 'ethtool <dev>' command
Signed-off-by: Arkadiusz Kubalewski <arkadiusz.kubalewski@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Since commit 92418fb147 ("i40e/i40evf: Use usec value instead of reg
value for ITR defines") the driver tracks the interrupt throttling
intervals in single usec units, although the actual ITRN registers are
programmed in 2 usec units. Most register programming flows in the driver
correctly handle the conversion, although it is currently not applied when
the registers are initialized to their default values. Most of the time
this doesn't present a problem since the default values are usually
immediately overwritten through the standard adaptive throttling mechanism,
or updated manually by the user, but if adaptive throttling is disabled and
the interval values are left alone then the incorrect value will persist.
Since the intended default interval of 50 usecs (vs. 100 usecs as
programmed) performs better for most traffic workloads, this can lead to
performance regressions.
This patch adds the correct conversion when writing the initial values to
the ITRN registers.
Signed-off-by: Nicholas Nunley <nicholas.d.nunley@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Currently the for-loop counter i is a u8 however it is being checked
against a maximum value hw->num_tx_sched_layers which is a u16. Hence
there is a potential wrap-around of counter i back to zero if
hw->num_tx_sched_layers is greater than 255. Fix this by making i
a u16.
Addresses-Coverity: ("Infinite loop")
Fixes: b36c598c99 ("ice: Updates to Tx scheduler code")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Pull pwm fix from Thierry Reding:
"One more fix to keep a reference to the driver's module as long as
there are users of the PWM exposed by the driver"
* tag 'pwm/for-5.4-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/thierry.reding/linux-pwm:
pwm: bcm-iproc: Prevent unloading the driver module while in use
Commit 67692435c4 ("sched: Rework pick_next_task() slow-path")
inadvertly introduced a race because it changed a previously
unexplored dependency between dropping the rq->lock and
sched_class::put_prev_task().
The comments about dropping rq->lock, in for example
newidle_balance(), only mentions the task being current and ->on_cpu
being set. But when we look at the 'change' pattern (in for example
sched_setnuma()):
queued = task_on_rq_queued(p); /* p->on_rq == TASK_ON_RQ_QUEUED */
running = task_current(rq, p); /* rq->curr == p */
if (queued)
dequeue_task(...);
if (running)
put_prev_task(...);
/* change task properties */
if (queued)
enqueue_task(...);
if (running)
set_next_task(...);
It becomes obvious that if we do this after put_prev_task() has
already been called on @p, things go sideways. This is exactly what
the commit in question allows to happen when it does:
prev->sched_class->put_prev_task(rq, prev, rf);
if (!rq->nr_running)
newidle_balance(rq, rf);
The newidle_balance() call will drop rq->lock after we've called
put_prev_task() and that allows the above 'change' pattern to
interleave and mess up the state.
Furthermore, it turns out we lost the RT-pull when we put the last DL
task.
Fix both problems by extracting the balancing from put_prev_task() and
doing a multi-class balance() pass before put_prev_task().
Fixes: 67692435c4 ("sched: Rework pick_next_task() slow-path")
Reported-by: Quentin Perret <qperret@google.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Quentin Perret <qperret@google.com>
Tested-by: Valentin Schneider <valentin.schneider@arm.com>
When cgroup is disabled the following compilation error was hit
kernel/sched/core.c: In function ‘uclamp_update_active_tasks’:
kernel/sched/core.c:1081:23: error: storage size of ‘it’ isn’t known
struct css_task_iter it;
^~
kernel/sched/core.c:1084:2: error: implicit declaration of function ‘css_task_iter_start’; did you mean ‘__sg_page_iter_start’? [-Werror=implicit-function-declaration]
css_task_iter_start(css, 0, &it);
^~~~~~~~~~~~~~~~~~~
__sg_page_iter_start
kernel/sched/core.c:1085:14: error: implicit declaration of function ‘css_task_iter_next’; did you mean ‘__sg_page_iter_next’? [-Werror=implicit-function-declaration]
while ((p = css_task_iter_next(&it))) {
^~~~~~~~~~~~~~~~~~
__sg_page_iter_next
kernel/sched/core.c:1091:2: error: implicit declaration of function ‘css_task_iter_end’; did you mean ‘get_task_cred’? [-Werror=implicit-function-declaration]
css_task_iter_end(&it);
^~~~~~~~~~~~~~~~~
get_task_cred
kernel/sched/core.c:1081:23: warning: unused variable ‘it’ [-Wunused-variable]
struct css_task_iter it;
^~
cc1: some warnings being treated as errors
make[2]: *** [kernel/sched/core.o] Error 1
Fix by protetion uclamp_update_active_tasks() with
CONFIG_UCLAMP_TASK_GROUP
Fixes: babbe170e0 ("sched/uclamp: Update CPU's refcount on TG's clamp changes")
Reported-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Qais Yousef <qais.yousef@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Randy Dunlap <rdunlap@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Vincent Guittot <vincent.guittot@linaro.org>
Cc: Patrick Bellasi <patrick.bellasi@matbug.net>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Dietmar Eggemann <dietmar.eggemann@arm.com>
Cc: Juri Lelli <juri.lelli@redhat.com>
Cc: Ben Segall <bsegall@google.com>
Link: https://lkml.kernel.org/r/20191105112212.596-1-qais.yousef@arm.com