I've done a lot of history digging. The first signs of this
optimization was introduced in i915:
commit 25067bfc06
Author: Gustavo Padovan <gustavo.padovan@collabora.co.uk>
Date: Wed Sep 10 12:03:17 2014 -0300
drm/i915: pin sprite fb only if it changed
without much justification. Pinning already pinned stuff is real cheap
(it's just obj->pin_count++ really), and the missing implicit sync was
entirely forgotten about it seems. It's at least not mentioned
anywhere it the commit message.
It was also promptly removed shortly afterwards in
commit ea2c67bb4a
Author: Matt Roper <matthew.d.roper@intel.com>
Date: Tue Dec 23 10:41:52 2014 -0800
drm/i915: Move to atomic plane helpers (v9)
again without really mentioning the side-effect that plane updates
with the same fb now again obey implicit syncing.
Note that this only ever applied to the plane_update hook, all other
legacy entry points (set_base, page_flip) always obeyed implicit sync
in the drm/i915 driver.
The real source of this code here seems to be msm, copied to vc4, then
copied to tinydrm. I've also tried to dig around in all available msm
sources, but the corresponding check for fb != old_fb is present ever
since the initial merge in
commit cf3a7e4ce0
Author: Rob Clark <robdclark@gmail.com>
Date: Sat Nov 8 13:21:06 2014 -0500
drm/msm: atomic core bits
The only older version I've found of msm atomic code predates the
atomic helpers, and so didn't even use any of this. It also does not
have a corresponding check (because it simply did no implicit sync at
all).
I've chatted with Rob on irc, and he didn't remember the reason for
this either.
Note we had epic amounts of fun with too much syncing against
_vblank_, especially around cursor updates. But I don't ever
discussing a need for less syncing against implicit fences.
Also note that explicit fencing allows you to sidetrack all of this,
at least for all the drivers correctly implemented using
drm_atomic_set_fence_for_plane().
Given that it seems to be an accident of history, and that big drivers
like i915 (and also nouveau it seems, I didn't follow the
amdgpu/radeon sync code to figure this out properly there) never have
done it, let's remove this.
Cc: Rob Clark <robdclark@gmail.com>
Cc: Matt Roper <matthew.d.roper@intel.com>
Cc: Gustavo Padovan <gustavo.padovan@collabora.co.uk>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Sean Paul <seanpaul@chromium.org>
Cc: David Airlie <airlied@linux.ie>
Cc: "Noralf Trønnes" <noralf@tronnes.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180405154449.23038-8-daniel.vetter@ffwll.ch
The vop irq is shared between vop and iommu and irq probing in the
iommu driver moved to the probe function recently. This can in some
cases lead to a stall if the irq is triggered while the vop driver
still has it disabled, but the vop irq handler gets called.
But there is no real need to disable the irq, as the vop can simply
also track its enabled state and ignore irqs in that case.
For this we can simply check the power-domain state of the vop,
similar to how the iommu driver does it.
So remove the enable/disable handling and add appropriate condition
to the irq handler.
changes in v2:
- move to just check the power-domain state
- add clock handling
changes in v3:
- clarify comment to speak of runtime-pm not power-domain
changes in v4:
- address Marc's comments (clk-enable WARN_ON and style improvement)
Fixes: d0b912bd4c ("iommu/rockchip: Request irqs in rk_iommu_probe()")
Cc: stable@vger.kernel.org
Signed-off-by: Sandy Huang <hjc@rock-chips.com>
Signed-off-by: Heiko Stuebner <heiko@sntech.de>
Tested-by: Ezequiel Garcia <ezequiel@collabora.com>
Reviewed-by: Tomasz Figa <tfiga@chromium.org>
Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180612132028.27490-3-heiko@sntech.de
This patch makes RC_CORE to be selected with this driver.
sil_sii8620 driver calls remote controller interfaces directly
so RC_CORE should be enabled mandatorily.
And some boards not using remote controller device don't really
need to know that RC_CORE config should be enabled to use sil_sii8620
driver only for HDMI.
Changelog v2:
- select INPUT because compiling will fail without INPUT.
Signed-off-by: Inki Dae <inki.dae@samsung.com>
Signed-off-by: Andrzej Hajda <a.hajda@samsung.com>
Link: https://patchwork.freedesktop.org/patch/msgid/1527154379-31886-1-git-send-email-inki.dae@samsung.com
This is the format generated by VC4's H.264 engine, and preferred by
the ISP as well. By displaying SAND buffers directly, we can avoid
needing to use the ISP to rewrite the SAND H.264 output to linear
before display.
This is a joint effort by Dave Stevenson (who wrote the initial patch
and DRM demo) and Eric Anholt (drm_fourcc.h generalization, safety
checks, RGBA support).
v2: Make the parameter macro give all of the middle 48 bits (suggested
by Daniels). Fix fourcc_mod_broadcom_mod()'s bits/shift being
swapped. Mark NV12/21 as supported, not YUV420.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org>
Signed-off-by: Eric Anholt <eric@anholt.net>
Cc: Daniel Vetter <daniel@ffwll.ch>
Acked-by: Daniel Stone <daniels@collabora.com> (v1)
Cc: Boris Brezillon <boris.brezillon@bootlin.com>
Cc: Maxime Ripard <maxime.ripard@bootlin.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180316220435.31416-3-eric@anholt.net
Disabling CONFIG_PM produces a compile time warning when these
functions are not referenced:
drivers/gpu/drm/sun4i/sun6i_mipi_dsi.c:1072:12: error: 'sun6i_dsi_runtime_suspend' defined but not used [-Werror=unused-function]
static int sun6i_dsi_runtime_suspend(struct device *dev)
^~~~~~~~~~~~~~~~~~~~~~~~~
drivers/gpu/drm/sun4i/sun6i_mipi_dsi.c:1043:12: error: 'sun6i_dsi_runtime_resume' defined but not used [-Werror=unused-function]
static int sun6i_dsi_runtime_resume(struct device *dev)
^~~~~~~~~~~~~~~~~~~~~~~~
Fixes: 133add5b5a ("drm/sun4i: Add Allwinner A31 MIPI-DSI controller support")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Maxime Ripard <maxime.ripard@bootlin.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180525155030.3667352-1-arnd@arndb.de
To no surprise (since we've flip-flopped over the use of PIN_HIGH a few
times), doing a search by address over a pathologically fragmented
address space is exceeding slow. To protect ourselves from nearly
unbounded latency (think searching a million holes while under
struct_mutex), limit the search for the highest available hole and
fallback to best-fit if it fails.
In the pathologically fragmented case, such as igt/gem_ctx_thrash, the
effect is dramatic, bringing the runtime down from hours to seconds
(depending on how many other slow searches you hit, e.g. alloc_iova()
and alloc_vmap_area() both degrade to a slow rbtree walk after their
small cache is exhausted). For the real world, the number of search
steps is unlikely to be significant as we should only need to search
once per new context.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180521082131.13744-3-chris@chris-wilson.co.uk
Searching for an available hole by address is slow, as there no
guarantee that a hole will be available and so we must walk over all
nodes in the rbtree before we determine the search was futile. In many
cases, the caller doesn't strictly care for the highest available hole
and was just opportunistically laying out the address space in a
preferred order. In such cases, the caller can accept any address and
would rather do so then do a slow walk.
To be able to mix search strategies, the caller wants to tell the drm_mm
how long to spend on the search. Without a good guide for what should be
the best split, start with a request to try once at most. That is return
the top-most (or lowest) hole if it fulfils the alignment and size
requirements.
v2: Documentation, by why of example (selftests) and kerneldoc.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180521082131.13744-2-chris@chris-wilson.co.uk
As we keep an rbtree of available holes sorted by their size, we can
very easily determine if there is any hole large enough that might
satisfy the allocation request. This helps when dealing with a highly
fragmented address space and a request for a search by address.
To cache the largest size, we convert into the cached rbtree variant
which tracks the leftmost node for us. However, currently we sorted into
ascending size order so the leftmost node is the smallest, and so to
make it the largest hole we need to invert our sorting.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180521082131.13744-1-chris@chris-wilson.co.uk