It can be useful to derive min/max rates of a snd_pcm_hardware without
having a snd_pcm_runtime, such as before constructing an ASoC DAI link.
Create a new helper that takes a pointer to a snd_pcm_hardware directly,
and refactor the original function as a wrapper around it, to avoid
needing to update any call sites.
Signed-off-by: Samuel Holland <samuel@sholland.org>
Reviewed-by: Takashi Iwai <tiwai@suse.de>
Link: https://lore.kernel.org/r/20200305051143.60691-2-samuel@sholland.org
Signed-off-by: Mark Brown <broonie@kernel.org>
Merge series from Bard Liao <yung-chuan.liao@linux.intel.com>:
As discussed in [1], ASoC core supports multi codec DAIs
on a DAI link. However it does not do so for CPU DAIs.
So, add support for multi CPU DAIs on a DAI Link by adding
multi CPU DAI in Card instantiation, suspend and resume
functions, PCM ops, stream handling functions and DAPM.
[1]: https://www.spinics.net/lists/alsa-devel/msg71369.html
changes in v5:
- rebase to latest kernel base
Bard Liao (2):
ASoC: Return error if the function does not support multi-cpu
ASoC: pcm: check if cpu-dai supports a given stream
Shreyas NC (4):
ASoC: Add initial support for multiple CPU DAIs
ASoC: Add multiple CPU DAI support for PCM ops
ASoC: Add dapm_add_valid_dai_widget helper
ASoC: Add multiple CPU DAI support in DAPM
include/sound/soc.h | 15 +
sound/soc/soc-compress.c | 5 +-
sound/soc/soc-core.c | 168 +++++-----
sound/soc/soc-dapm.c | 133 ++++----
sound/soc/soc-generic-dmaengine-pcm.c | 18 +
sound/soc/soc-pcm.c | 463 ++++++++++++++++++--------
6 files changed, 531 insertions(+), 271 deletions(-)
--
2.17.1
ASoC core supports multiple codec DAIs but supports only a CPU DAI.
To support multiple cpu DAIs, add cpu_dai and num_cpu_dai in
snd_soc_dai_link and snd_soc_pcm_runtime structures similar to
support for codec_dai. This is intended as a preparatory patch to
eventually support the unification of the Codec and CPU DAI.
Inline with multiple codec DAI approach, add support to allocate,
init, bind and probe multiple cpu_dai on init if driver specifies
that. Also add support to loop over multiple cpu_dai during
suspend and resume.
This is intended as a preparatory patch to eventually unify the CPU
and Codec DAI into DAI components.
Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
Signed-off-by: Bard Liao <yung-chuan.liao@linux.intel.com>
Signed-off-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com>
Signed-off-by: Shreyas NC <shreyas.nc@intel.com>
Link: https://lore.kernel.org/r/20200225133917.21314-2-yung-chuan.liao@linux.intel.com
Signed-off-by: Mark Brown <broonie@kernel.org>
ASoC component open/close and snd_soc_component_module_get/put are called
independently for each component-substream pair, so the logic added in
commit dd03907bf1 ("ASoC: soc-pcm: call snd_soc_component_open/close()
once") was not sufficient and led to PCM playback and module unload errors.
Implement handling of failures directly in soc_pcm_components_open(),
so that any successfully opened components are closed upon error with
other components. This allows to clean up error handling in
soc_pcm_open() without adding more state tracking.
Fixes: dd03907bf1 ("ASoC: soc-pcm: call snd_soc_component_open/close() once")
Signed-off-by: Kai Vehmanen <kai.vehmanen@linux.intel.com>
Tested-by: Dmitry Osipenko <digetx@gmail.com>
Link: https://lore.kernel.org/r/20200220094955.16968-1-kai.vehmanen@linux.intel.com
Signed-off-by: Mark Brown <broonie@kernel.org>
dpcm_path_put() (A) is calling kfree(*list).
The freed list is created by dapm_widget_list_create() (B) which is called
from snd_soc_dapm_dai_get_connected_widgets() (C) which is called from
dpcm_path_get() (D).
(B) dapm_widget_list_create(**list, ...)
{
...
=> *list = kzalloc();
...
}
(C) snd_soc_dapm_dai_get_connected_widgets(..., **list, ...)
{
...
dapm_widget_list_create(list, ...);
...
}
(D) dpcm_path_get(..., **list)
{
...
snd_soc_dapm_dai_get_connected_widgets(..., list, ...);
...
}
(A) dpcm_path_put(**list)
{
=> kfree(*list);
}
This kind of unbalance code is very difficult to read/understand.
To avoid this issue, this patch adds each missing paired function
dapm_widget_list_free() for dapm_widget_list_create() (B), and
snd_soc_dapm_dai_free_widgets() for snd_soc_dapm_dai_get_connected_widgets() (C).
This patch uses these, and moves dpcm_path_put() next to dpcm_path_get().
Signed-off-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com>
Link: https://lore.kernel.org/r/87a75fjc9q.wl-kuninori.morimoto.gx@renesas.com
Signed-off-by: Mark Brown <broonie@kernel.org>
DAI driver has playback/capture stream.
OTOH, we have SNDRV_PCM_STREAM_PLAYBACK/CAPTURE.
Because of this kind of implementation,
ALSA SoC needs to have many verbose code.
To solve this issue, this patch adds snd_soc_dai_get_pcm_stream() macro
to get playback/capture stream pointer from stream.
Signed-off-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com>
Link: https://lore.kernel.org/r/87ftf7jcab.wl-kuninori.morimoto.gx@renesas.com
Signed-off-by: Mark Brown <broonie@kernel.org>
There is a need to use RT5682 as DAI clock master for other codecs
within a platform, which means that the DAI clocks are required to
remain, regardless of whether the RT5682 is actually running
playback/capture.
The RT5682 CCF basic functions are implemented almost by the existing
internal functions and asoc apis. It needs a clk provider (rt5682 mclk)
to generate the bclk and wclk outputs.
The RT5682 CCF supports and restricts as below:
1. Fmt of DAI-AIF1 must be configured to master before using CCF.
2. Only accept a 48MHz clk as the clk provider.
3. Only provide a 48kHz wclk and a set of multiples of wclk as bclk.
There are some temporary limitations in this patch until a better
implementation.
Signed-off-by: Derek Fang <derek.fang@realtek.com>
Link: https://lore.kernel.org/r/1582033912-6841-1-git-send-email-derek.fang@realtek.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Add all required types and methods to support each and every request
that driver could sent to firmware. Probe is one of SOF firmware
features which allows for data extraction and injection directly from
or to DMA stream.
Exposes eight IPCs:
- addition and removal of injection DMAs
- addition and removal of probe points
- info retrieval of injection DMAs and probe points
- probe initialization and cleanup
Signed-off-by: Cezary Rojewski <cezary.rojewski@intel.com>
Acked-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
Link: https://lore.kernel.org/r/20200218143924.10565-5-cezary.rojewski@intel.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Current soc_pcm_open() calls snd_soc_dai_startup() under loop.
Thus, it needs to care about started/not-yet-started codec DAI.
But, if soc-dai.c is handling it, soc-pcm.c don't need to care
about it.
This patch adds started flag to soc-dai.h, and simplify soc-pcm.c.
This is one of prepare for cleanup soc-pcm-open()
Signed-off-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com>
Link: https://lore.kernel.org/r/875zgfcey5.wl-kuninori.morimoto.gx@renesas.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Pull new zonefs file system from Damien Le Moal:
"Zonefs is a very simple file system exposing each zone of a zoned
block device as a file.
Unlike a regular file system with native zoned block device support
(e.g. f2fs or the on-going btrfs effort), zonefs does not hide the
sequential write constraint of zoned block devices to the user. As a
result, zonefs is not a POSIX compliant file system. Its goal is to
simplify the implementation of zoned block devices support in
applications by replacing raw block device file accesses with a richer
file based API, avoiding relying on direct block device file ioctls
which may be more obscure to developers.
One example of this approach is the implementation of LSM
(log-structured merge) tree structures (such as used in RocksDB and
LevelDB) on zoned block devices by allowing SSTables to be stored in a
zone file similarly to a regular file system rather than as a range of
sectors of a zoned device. The introduction of the higher level
construct "one file is one zone" can help reducing the amount of
changes needed in the application while at the same time allowing the
use of zoned block devices with various programming languages other
than C.
Zonefs IO management implementation uses the new iomap generic code.
Zonefs has been successfully tested using a functional test suite
(available with zonefs userland format tool on github) and a prototype
implementation of LevelDB on top of zonefs"
* tag 'zonefs-5.6-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal/zonefs:
zonefs: Add documentation
fs: New zonefs file system
Pull x86 fixes from Thomas Gleixner:
"A set of fixes for X86:
- Ensure that the PIT is set up when the local APIC is disable or
configured in legacy mode. This is caused by an ordering issue
introduced in the recent changes which skip PIT initialization when
the TSC and APIC frequencies are already known.
- Handle malformed SRAT tables during early ACPI parsing which caused
an infinite loop anda boot hang.
- Fix a long standing race in the affinity setting code which affects
PCI devices with non-maskable MSI interrupts. The problem is caused
by the non-atomic writes of the MSI address (destination APIC id)
and data (vector) fields which the device uses to construct the MSI
message. The non-atomic writes are mandated by PCI.
If both fields change and the device raises an interrupt after
writing address and before writing data, then the MSI block
constructs a inconsistent message which causes interrupts to be
lost and subsequent malfunction of the device.
The fix is to redirect the interrupt to the new vector on the
current CPU first and then switch it over to the new target CPU.
This allows to observe an eventually raised interrupt in the
transitional stage (old CPU, new vector) to be observed in the APIC
IRR and retriggered on the new target CPU and the new vector.
The potential spurious interrupts caused by this are harmless and
can in the worst case expose a buggy driver (all handlers have to
be able to deal with spurious interrupts as they can and do happen
for various reasons).
- Add the missing suspend/resume mechanism for the HYPERV hypercall
page which prevents resume hibernation on HYPERV guests. This
change got lost before the merge window.
- Mask the IOAPIC before disabling the local APIC to prevent
potentially stale IOAPIC remote IRR bits which cause stale
interrupt lines after resume"
* tag 'x86-urgent-2020-02-09' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/apic: Mask IOAPIC entries when disabling the local APIC
x86/hyperv: Suspend/resume the hypercall page for hibernation
x86/apic/msi: Plug non-maskable MSI affinity race
x86/boot: Handle malformed SRAT tables during early ACPI parsing
x86/timer: Don't skip PIT setup when APIC is disabled or in legacy mode
Pull perf fixes from Thomas Gleixner:
"A set of fixes and improvements for the perf subsystem:
Kernel fixes:
- Install cgroup events to the correct CPU context to prevent a
potential list double add
- Prevent an integer underflow in the perf mlock accounting
- Add a missing prototype for arch_perf_update_userpage()
Tooling:
- Add a missing unlock in the error path of maps__insert() in perf
maps.
- Fix the build with the latest libbfd
- Fix the perf parser so it does not delete parse event terms, which
caused a regression for using perf with the ARM CoreSight as the
sink configuration was missing due to the deletion.
- Fix the double free in the perf CPU map merging test case
- Add the missing ustring support for the perf probe command"
* tag 'perf-urgent-2020-02-09' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
perf maps: Add missing unlock to maps__insert() error case
perf probe: Add ustring support for perf probe command
perf: Make perf able to build with latest libbfd
perf test: Fix test case Merge cpu map
perf parse: Copy string to perf_evsel_config_term
perf parse: Refactor 'struct perf_evsel_config_term'
kernel/events: Add a missing prototype for arch_perf_update_userpage()
perf/cgroups: Install cgroup events to correct cpuctx
perf/core: Fix mlock accounting in perf_mmap()
Pull interrupt fixes from Thomas Gleixner:
"A set of fixes for the interrupt subsystem:
- Provision only ACPI enabled redistributors on GICv3
- Use the proper command colums when building the INVALL command for
the GICv3-ITS
- Ensure the allocation of the L2 vPE table for GICv4.1
- Correct the GICv4.1 VPROBASER programming so it uses the proper
size
- A set of small GICv4.1 tidy up patches
- Configuration cleanup for C-SKY interrupt chip
- Clarify the function documentation for irq_set_wake() to document
that the wakeup functionality is orthogonal to the irq
disable/enable mechanism"
* tag 'irq-urgent-2020-02-09' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
irqchip/gic-v3-its: Rename VPENDBASER/VPROPBASER accessors
irqchip/gic-v3-its: Remove superfluous WARN_ON
irqchip/gic-v4.1: Drop 'tmp' in inherit_vpe_l1_table_from_rd()
irqchip/gic-v4.1: Ensure L2 vPE table is allocated at RD level
irqchip/gic-v4.1: Set vpe_l1_base for all redistributors
irqchip/gic-v4.1: Fix programming of GICR_VPROPBASER_4_1_SIZE
genirq: Clarify that irq wake state is orthogonal to enable/disable
irqchip/gic-v3-its: Reference to its_invall_cmd descriptor when building INVALL
irqchip: Some Kconfig cleanup for C-SKY
irqchip/gic-v3: Only provision redistributors that are enabled in ACPI
Pull networking fixes from David Miller:
1) Unbalanced locking in mwifiex_process_country_ie, from Brian Norris.
2) Fix thermal zone registration in iwlwifi, from Andrei
Otcheretianski.
3) Fix double free_irq in sgi ioc3 eth, from Thomas Bogendoerfer.
4) Use after free in mptcp, from Florian Westphal.
5) Use after free in wireguard's root_remove_peer_lists, from Eric
Dumazet.
6) Properly access packets heads in bonding alb code, from Eric
Dumazet.
7) Fix data race in skb_queue_len(), from Qian Cai.
8) Fix regression in r8169 on some chips, from Heiner Kallweit.
9) Fix XDP program ref counting in hv_netvsc, from Haiyang Zhang.
10) Certain kinds of set link netlink operations can cause a NULL deref
in the ipv6 addrconf code. Fix from Eric Dumazet.
11) Don't cancel uninitialized work queue in drop monitor, from Ido
Schimmel.
* git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (84 commits)
net: thunderx: use proper interface type for RGMII
mt76: mt7615: fix max_nss in mt7615_eeprom_parse_hw_cap
bpf: Improve bucket_log calculation logic
selftests/bpf: Test freeing sockmap/sockhash with a socket in it
bpf, sockhash: Synchronize_rcu before free'ing map
bpf, sockmap: Don't sleep while holding RCU lock on tear-down
bpftool: Don't crash on missing xlated program instructions
bpf, sockmap: Check update requirements after locking
drop_monitor: Do not cancel uninitialized work item
mlxsw: spectrum_dpipe: Add missing error path
mlxsw: core: Add validation of hardware device types for MGPIR register
mlxsw: spectrum_router: Clear offload indication from IPv6 nexthops on abort
selftests: mlxsw: Add test cases for local table route replacement
mlxsw: spectrum_router: Prevent incorrect replacement of local table routes
net: dsa: microchip: enable module autoprobe
ipv6/addrconf: fix potential NULL deref in inet6_set_link_af()
dpaa_eth: support all modes with rate adapting PHYs
net: stmmac: update pci platform data to use phy_interface
net: stmmac: xgmac: fix missing IFF_MULTICAST checki in dwxgmac2_set_filter
net: stmmac: fix missing IFF_MULTICAST check in dwmac4_set_filter
...
Pull ARM SoC late updates from Olof Johansson:
"This is some material that we picked up into our tree late, or that
had more complex dependencies on more than one topic branch that makes
sense to keep separately.
- TI support for secure accelerators and hwrng on OMAP4/5
- TI camera changes for dra7 and am437x and SGX improvement due to
better reset control support on am335x, am437x and dra7
- Davinci moves to proper clocksource on DM365, and regulator/audio
improvements for DM365 and DM644x eval boards"
* tag 'armsoc-late' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: (32 commits)
ARM: dts: omap4-droid4: Enable hdq for droid4 ds250x 1-wire battery nvmem
ARM: dts: motorola-cpcap-mapphone: Configure calibration interrupt
ARM: dts: Configure interconnect target module for am437x sgx
ARM: dts: Configure sgx for dra7
ARM: dts: Configure rstctrl reset for am335x SGX
ARM: dts: dra7: Add ti-sysc node for VPE
ARM: dts: dra7: add vpe clkctrl node
ARM: dts: am43x-epos-evm: Add VPFE and OV2659 entries
ARM: dts: am437x-sk-evm: Add VPFE and OV2659 entries
ARM: dts: am43xx: add support for clkout1 clock
arm: dts: dra76-evm: Add CAL and OV5640 nodes
arm: dtsi: dra76x: Add CAL dtsi node
arm: dts: dra72-evm-common: Add entries for the CSI2 cameras
ARM: dts: DRA72: Add CAL dtsi node
ARM: dts: dra7-l4: Add ti-sysc node for CAM
ARM: OMAP: DRA7xx: Make CAM clock domain SWSUP only
ARM: dts: dra7: add cam clkctrl node
ARM: OMAP2+: Drop legacy platform data for omap4 des
ARM: OMAP2+: Drop legacy platform data for omap4 sham
ARM: OMAP2+: Drop legacy platform data for omap4 aes
...
Pull ARM SoC-related driver updates from Olof Johansson:
"Various driver updates for platforms:
- Nvidia: Fuse support for Tegra194, continued memory controller
pieces for Tegra30
- NXP/FSL: Refactorings of QuickEngine drivers to support
ARM/ARM64/PPC
- NXP/FSL: i.MX8MP SoC driver pieces
- TI Keystone: ring accelerator driver
- Qualcomm: SCM driver cleanup/refactoring + support for new SoCs.
- Xilinx ZynqMP: feature checking interface for firmware. Mailbox
communication for power management
- Overall support patch set for cpuidle on more complex hierarchies
(PSCI-based)
and misc cleanups, refactorings of Marvell, TI, other platforms"
* tag 'armsoc-drivers' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: (166 commits)
drivers: soc: xilinx: Use mailbox IPI callback
dt-bindings: power: reset: xilinx: Add bindings for ipi mailbox
drivers: soc: ti: knav_qmss_queue: Pass lockdep expression to RCU lists
MAINTAINERS: Add brcmstb PCIe controller entry
soc/tegra: fuse: Unmap registers once they are not needed anymore
soc/tegra: fuse: Correct straps' address for older Tegra124 device trees
soc/tegra: fuse: Warn if straps are not ready
soc/tegra: fuse: Cache values of straps and Chip ID registers
memory: tegra30-emc: Correct error message for timed out auto calibration
memory: tegra30-emc: Firm up hardware programming sequence
memory: tegra30-emc: Firm up suspend/resume sequence
soc/tegra: regulators: Do nothing if voltage is unchanged
memory: tegra: Correct reset value of xusb_hostr
soc/tegra: fuse: Add APB DMA dependency for Tegra20
bus: tegra-aconnect: Remove PM_CLK dependency
dt-bindings: mediatek: add MT6765 power dt-bindings
soc: mediatek: cmdq: delete not used define
memory: tegra: Add support for the Tegra194 memory controller
memory: tegra: Only include support for enabled SoCs
memory: tegra: Support DVFS on Tegra186 and later
...
Pull vfs file system parameter updates from Al Viro:
"Saner fs_parser.c guts and data structures. The system-wide registry
of syntax types (string/enum/int32/oct32/.../etc.) is gone and so is
the horror switch() in fs_parse() that would have to grow another case
every time something got added to that system-wide registry.
New syntax types can be added by filesystems easily now, and their
namespace is that of functions - not of system-wide enum members. IOW,
they can be shared or kept private and if some turn out to be widely
useful, we can make them common library helpers, etc., without having
to do anything whatsoever to fs_parse() itself.
And we already get that kind of requests - the thing that finally
pushed me into doing that was "oh, and let's add one for timeouts -
things like 15s or 2h". If some filesystem really wants that, let them
do it. Without somebody having to play gatekeeper for the variants
blessed by direct support in fs_parse(), TYVM.
Quite a bit of boilerplate is gone. And IMO the data structures make a
lot more sense now. -200LoC, while we are at it"
* 'merge.nfs-fs_parse.1' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (25 commits)
tmpfs: switch to use of invalfc()
cgroup1: switch to use of errorfc() et.al.
procfs: switch to use of invalfc()
hugetlbfs: switch to use of invalfc()
cramfs: switch to use of errofc() et.al.
gfs2: switch to use of errorfc() et.al.
fuse: switch to use errorfc() et.al.
ceph: use errorfc() and friends instead of spelling the prefix out
prefix-handling analogues of errorf() and friends
turn fs_param_is_... into functions
fs_parse: handle optional arguments sanely
fs_parse: fold fs_parameter_desc/fs_parameter_spec
fs_parser: remove fs_parameter_description name field
add prefix to fs_context->log
ceph_parse_param(), ceph_parse_mon_ips(): switch to passing fc_log
new primitive: __fs_parse()
switch rbd and libceph to p_log-based primitives
struct p_log, variants of warnf() et.al. taking that one instead
teach logfc() to handle prefices, give it saner calling conventions
get rid of cg_invalf()
...
Pull misc vfs updates from Al Viro:
- bmap series from cmaiolino
- getting rid of convolutions in copy_mount_options() (use a couple of
copy_from_user() instead of the __get_user() crap)
* 'work.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
saner copy_mount_options()
fibmap: Reject negative block numbers
fibmap: Use bmap instead of ->bmap method in ioctl_fibmap
ecryptfs: drop direct calls to ->bmap
cachefiles: drop direct usage of ->bmap method.
fs: Enable bmap() function to properly return errors
Merge thundering herd avoidance on pipe IO.
This would have been applied for 5.5 already, but got delayed because of
a user-space race condition in the GNU make jobserver code. Now that
there's a new GNU make 4.3 release, and most distributions seem to have
at least applied the (almost three year old) fix for the problem, let's
see if people notice.
And it might have been just bad random timing luck on my machine.
If you do hit the race condition, things will still work, but the
symptom is that you don't get nearly the expected parallelism when using
"make -j<N>".
The jobserver bug can definitely happen without this patch too, but
seems to be easier to trigger when we no longer wake up pipe waiters
unnecessarily.
* pipe-exclusive-wakeup:
pipe: use exclusive waits when reading or writing
This makes the pipe code use separate wait-queues and exclusive waiting
for readers and writers, avoiding a nasty thundering herd problem when
there are lots of readers waiting for data on a pipe (or, less commonly,
lots of writers waiting for a pipe to have space).
While this isn't a common occurrence in the traditional "use a pipe as a
data transport" case, where you typically only have a single reader and
a single writer process, there is one common special case: using a pipe
as a source of "locking tokens" rather than for data communication.
In particular, the GNU make jobserver code ends up using a pipe as a way
to limit parallelism, where each job consumes a token by reading a byte
from the jobserver pipe, and releases the token by writing a byte back
to the pipe.
This pattern is fairly traditional on Unix, and works very well, but
will waste a lot of time waking up a lot of processes when only a single
reader needs to be woken up when a writer releases a new token.
A simplified test-case of just this pipe interaction is to create 64
processes, and then pass a single token around between them (this
test-case also intentionally passes another token that gets ignored to
test the "wake up next" logic too, in case anybody wonders about it):
#include <unistd.h>
int main(int argc, char **argv)
{
int fd[2], counters[2];
pipe(fd);
counters[0] = 0;
counters[1] = -1;
write(fd[1], counters, sizeof(counters));
/* 64 processes */
fork(); fork(); fork(); fork(); fork(); fork();
do {
int i;
read(fd[0], &i, sizeof(i));
if (i < 0)
continue;
counters[0] = i+1;
write(fd[1], counters, (1+(i & 1)) *sizeof(int));
} while (counters[0] < 1000000);
return 0;
}
and in a perfect world, passing that token around should only cause one
context switch per transfer, when the writer of a token causes a
directed wakeup of just a single reader.
But with the "writer wakes all readers" model we traditionally had, on
my test box the above case causes more than an order of magnitude more
scheduling: instead of the expected ~1M context switches, "perf stat"
shows
231,852.37 msec task-clock # 15.857 CPUs utilized
11,250,961 context-switches # 0.049 M/sec
616,304 cpu-migrations # 0.003 M/sec
1,648 page-faults # 0.007 K/sec
1,097,903,998,514 cycles # 4.735 GHz
120,781,778,352 instructions # 0.11 insn per cycle
27,997,056,043 branches # 120.754 M/sec
283,581,233 branch-misses # 1.01% of all branches
14.621273891 seconds time elapsed
0.018243000 seconds user
3.611468000 seconds sys
before this commit.
After this commit, I get
5,229.55 msec task-clock # 3.072 CPUs utilized
1,212,233 context-switches # 0.232 M/sec
103,951 cpu-migrations # 0.020 M/sec
1,328 page-faults # 0.254 K/sec
21,307,456,166 cycles # 4.074 GHz
12,947,819,999 instructions # 0.61 insn per cycle
2,881,985,678 branches # 551.096 M/sec
64,267,015 branch-misses # 2.23% of all branches
1.702148350 seconds time elapsed
0.004868000 seconds user
0.110786000 seconds sys
instead. Much better.
[ Note! This kernel improvement seems to be very good at triggering a
race condition in the make jobserver (in GNU make 4.2.1) for me. It's
a long known bug that was fixed back in June 2017 by GNU make commit
b552b0525198 ("[SV 51159] Use a non-blocking read with pselect to
avoid hangs.").
But there wasn't a new release of GNU make until 4.3 on Jan 19 2020,
so a number of distributions may still have the buggy version. Some
have backported the fix to their 4.2.1 release, though, and even
without the fix it's quite timing-dependent whether the bug actually
is hit. ]
Josh Triplett says:
"I've been hammering on your pipe fix patch (switching to exclusive
wait queues) for a month or so, on several different systems, and I've
run into no issues with it. The patch *substantially* improves
parallel build times on large (~100 CPU) systems, both with parallel
make and with other things that use make's pipe-based jobserver.
All current distributions (including stable and long-term stable
distributions) have versions of GNU make that no longer have the
jobserver bug"
Tested-by: Josh Triplett <josh@joshtriplett.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Daniel Borkmann says:
====================
pull-request: bpf 2020-02-07
The following pull-request contains BPF updates for your *net* tree.
We've added 15 non-merge commits during the last 10 day(s) which contain
a total of 12 files changed, 114 insertions(+), 31 deletions(-).
The main changes are:
1) Various BPF sockmap fixes related to RCU handling in the map's tear-
down code, from Jakub Sitnicki.
2) Fix macro state explosion in BPF sk_storage map when calculating its
bucket_log on allocation, from Martin KaFai Lau.
3) Fix potential BPF sockmap update race by rechecking socket's established
state under lock, from Lorenz Bauer.
4) Fix crash in bpftool on missing xlated instructions when kptr_restrict
sysctl is set, from Toke Høiland-Jørgensen.
5) Fix i40e's XSK wakeup code to return proper error in busy state and
various misc fixes in xdpsock BPF sample code, from Maciej Fijalkowski.
6) Fix the way modifiers are skipped in BTF in the verifier while walking
pointers to avoid program rejection, from Alexei Starovoitov.
7) Fix Makefile for runqslower BPF tool to i) rebuild on libbpf changes and
ii) to fix undefined reference linker errors for older gcc version due to
order of passed gcc parameters, from Yulia Kartseva and Song Liu.
8) Fix a trampoline_count BPF kselftest warning about missing braces around
initializer, from Andrii Nakryiko.
9) Fix up redundant "HAVE" prefix from large INSN limit kernel probe in
bpftool, from Michal Rostecki.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently, we will not set vpe_l1_page for the current RD if we can
inherit the vPE configuration table from another RD (or ITS), which
results in an inconsistency between RDs within the same CommonLPIAff
group.
Let's rename it to vpe_l1_base to indicate the base address of the
vPE configuration table of this RD, and set it properly for *all*
v4.1 redistributors.
Signed-off-by: Zenghui Yu <yuzenghui@huawei.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20200206075711.1275-3-yuzenghui@huawei.com
Puyll NFS client updates from Anna Schumaker:
"Stable bugfixes:
- Fix memory leaks and corruption in readdir # v2.6.37+
- Directory page cache needs to be locked when read # v2.6.37+
New features:
- Convert NFS to use the new mount API
- Add "softreval" mount option to let clients use cache if server goes down
- Add a config option to compile without UDP support
- Limit the number of inactive delegations the client can cache at once
- Improved readdir concurrency using iterate_shared()
Other bugfixes and cleanups:
- More 64-bit time conversions
- Add additional diagnostic tracepoints
- Check for holes in swapfiles, and add dependency on CONFIG_SWAP
- Various xprtrdma cleanups to prepare for 5.7's changes
- Several fixes for NFS writeback and commit handling
- Fix acls over krb5i/krb5p mounts
- Recover from premature loss of openstateids
- Fix NFS v3 chacl and chmod bug
- Compare creds using cred_fscmp()
- Use kmemdup_nul() in more places
- Optimize readdir cache page invalidation
- Lease renewal and recovery fixes"
* tag 'nfs-for-5.6-1' of git://git.linux-nfs.org/projects/anna/linux-nfs: (93 commits)
NFSv4.0: nfs4_do_fsinfo() should not do implicit lease renewals
NFSv4: try lease recovery on NFS4ERR_EXPIRED
NFS: Fix memory leaks
nfs: optimise readdir cache page invalidation
NFS: Switch readdir to using iterate_shared()
NFS: Use kmemdup_nul() in nfs_readdir_make_qstr()
NFS: Directory page cache pages need to be locked when read
NFS: Fix memory leaks and corruption in readdir
SUNRPC: Use kmemdup_nul() in rpc_parse_scope_id()
NFS: Replace various occurrences of kstrndup() with kmemdup_nul()
NFSv4: Limit the total number of cached delegations
NFSv4: Add accounting for the number of active delegations held
NFSv4: Try to return the delegation immediately when marked for return on close
NFS: Clear NFS_DELEGATION_RETURN_IF_CLOSED when the delegation is returned
NFSv4: nfs_inode_evict_delegation() should set NFS_DELEGATION_RETURNING
NFS: nfs_find_open_context() should use cred_fscmp()
NFS: nfs_access_get_cached_rcu() should use cred_fscmp()
NFSv4: pnfs_roc() must use cred_fscmp() to compare creds
NFS: remove unused macros
nfs: Return EINVAL rather than ERANGE for mount parse errors
...
Pull i2c updates from Wolfram Sang:
"i2c core:
- huge improvements and refactorizations of the Linux I2C
documentation (lots of thanks to Luca for doing it and Jean for the
careful review)
- subsystem wide API conversion to i2c_new_client_device()
- remove obsolete parport-light driver
- smaller core updates (removal of 'extern', enabling more compile
testing, use more helper macros)
- and quite a bunch of driver updates (new IDs, simplifications,
better PM, support of atomic transfers and other improvements)
i2c-mux:
- The main feature is the idle-state rework of the pca954x driver
from Biwen Li
at24 driver:
- minor maintenance: update the license tag, sort headers
- move support for the write-protect pin into nvmem core
- add a reference to the new wp-gpios property in nvmem to at25
bindings
- add support for regulator and pm_runtime control"
* 'i2c/for-5.6' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux: (91 commits)
i2c: cros-ec-tunnel: Fix ACPI identifier
i2c: cros-ec-tunnel: Fix slave device enumeration
i2c: stm32f7: add PM_SLEEP suspend/resume support
i2c: cadence: Fix wording in i2c-cadence driver
i2c: cadence: Fix power management order of operations
i2c: cadence: Fix error printing in case of defer
i2c: cadence: Handle transfer_size rollover
i2c: i801: Add support for Intel Comet Lake PCH-V
docs: i2c: writing-clients: properly name the stop condition
docs: i2c: i2c-protocol: use same wording as smbus-protocol
docs: i2c: rename sections so the overall picture is clearer
docs: i2c: old-module-parameters: use monospace instead of ""
docs: i2c: old-module-parameters: clarify this is for obsolete kernels
docs: i2c: old-module-parameters: fix internal hyperlink
docs: i2c: instantiating-devices: use monospace for sysfs attributes
docs: i2c: instantiating-devices: rearrange static instatiation
docs: i2c: instantiating-devices: fix internal hyperlink
docs: i2c: smbus-protocol: improve I2C Block transactions description
docs: i2c: smbus-protocol: fix punctuation
docs: i2c: smbus-protocol: fix typo
...
Pull drm fixes from Dave Airlie:
"Just some fixes for this merge window: the tegra changes fix some
regressions in the merge, nouveau has a few modesetting fixes.
The amdgpu fixes are bit bigger, but they contain a couple of weeks of
fixes, and don't seem to contain anything that isn't really a fix.
Summary:
tegra:
- merge window regression fixes
nouveau:
- couple of volta/turing modesetting fixes
amdgpu:
- EDC fixes for Arcturus
- GDDR6 memory training fixe
- Fix for reading gfx clockgating registers while in GFXOFF state
- i2c freq fixes
- Misc display fixes
- TLB invalidation fix when using semaphores
- VCN 2.5 instancing fixes
- Switch raven1 gfxoff to a blacklist
- Coreboot workaround for KV/KB
- Root cause dongle fixes for display and revert workaround
- Enable GPU reset for renoir and navi
- Navi overclocking fixes
- Fix up confusing warnings in display clock validation on raven
amdkfd:
- SDMA fix
radeon:
- Misc LUT fixes"
* tag 'drm-next-2020-02-07' of git://anongit.freedesktop.org/drm/drm: (90 commits)
gpu: host1x: Set DMA direction only for DMA-mapped buffer objects
drm/tegra: Reuse IOVA mapping where possible
drm/tegra: Relax IOMMU usage criteria on old Tegra
drm/amd/dm/mst: Ignore payload update failures
drm/amdgpu: update default voltage for boot od table for navi1x
drm/amdgpu/smu10: fix smu10_get_clock_by_type_with_voltage
drm/amdgpu/smu10: fix smu10_get_clock_by_type_with_latency
drm/amdgpu/display: handle multiple numbers of fclks in dcn_calcs.c (v2)
drm/amdgpu: fetch default VDDC curve voltages (v2)
drm/amdgpu/smu_v11_0: Correct behavior of restoring default tables (v2)
drm/amdgpu/navi10: add OD_RANGE for navi overclocking
drm/amdgpu/navi: fix index for OD MCLK
drm/amd/display: Fix HW/SW state mismatch
drm/amd/display: Fix a typo when computing dsc configuration
drm/amd/powerplay: fix navi10 system intermittent reboot issue V2
drm/amdkfd: Fix a bug in SDMA RLC queue counting under HWS mode
drm/amd/display: Only enable cursor on pipes that need it
drm/nouveau/kms/gv100-: avoid sending a core update until the first modeset
drm/nouveau/kms/gv100-: move window ownership setup into modesetting path
drm/nouveau/disp/gv100-: halt NV_PDISP_FE_RM_INTR_STAT_CTRL_DISP_ERROR storms
...
Pull clk fixes from Stephen Boyd:
"A collection of fixes:
- Make of_clk.h self contained
- Fix new qcom DT bindings that just merged to match the DTS files
- Fix qcom clk driver to properly detect DFS clk frequencies
- Fix the ls1028a driver to not deref a pointer before assigning it"
* tag 'clk-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux:
of: clk: Make <linux/of_clk.h> self-contained
clk: qcom: Use ARRAY_SIZE in videocc-sc7180 for parent clocks
clk: qcom: Get rid of the test clock for videocc-sc7180
dt-bindings: clock: Cleanup qcom,videocc bindings for sdm845/sc7180
clk: qcom: Use ARRAY_SIZE in gpucc-sc7180 for parent clocks
clk: qcom: Get rid of the test clock for gpucc-sc7180
dt-bindings: clock: Fix qcom,gpucc bindings for sdm845/sc7180/msm8998
clk: qcom: Use ARRAY_SIZE in dispcc-sc7180 for parent clocks
clk: qcom: Get rid of the test clock for dispcc-sc7180
clk: qcom: Get rid of fallback global names for dispcc-sc7180
dt-bindings: clock: Fix qcom,dispcc bindings for sdm845/sc7180
clk: qcom: rcg2: Don't crash if our parent can't be found; return an error
clk: ls1028a: fix a dereference of pointer 'parent' before a null check
dt-bindings: clk: qcom: Fix self-validation, split, and clean cruft
clk: qcom: Don't overwrite 'cfg' in clk_rcg2_dfs_populate_freq()