Commit Graph

361229 Commits

Author SHA1 Message Date
Alastair Bridgewater
2709b275c5 drm/nouveau/disp/gf119: Use supplied HDMI InfoFrames
Now that we have the InfoFrame data being provided, for the most
part, program the hardware to use it.

While we're here, and since the functionality will come in handy
for supporting 3D stereoscopy, implement setting the Vendor
("generic"?) InfoFrame.

Also don't enable any InfoFrame that is not provided, and disable
the Vendor InfoFrame when disabling the output.

Signed-off-by: Alastair Bridgewater <alastair.bridgewater@gmail.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2017-06-16 14:04:18 +10:00
Alastair Bridgewater
ba32836879 drm/nouveau/disp/gt215: Use supplied HDMI InfoFrames
Now that we have the InfoFrame data being provided, for the most
part, program the hardware to use it.

While we're here, and since the functionality will come in handy
for supporting 3D stereoscopy, implement setting the Vendor
("generic") InfoFrame.

Also don't enable any AVI or Vendor InfoFrame that is not provided,
and disable the Vendor InfoFrame when disabling the output.

Ignore the Audio InfoFrame: We don't supply it, and altering HDMI
audio semantics (for better or worse) on this hardware is out of
scope for me at this time.

Signed-off-by: Alastair Bridgewater <alastair.bridgewater@gmail.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2017-06-16 14:04:18 +10:00
Alastair Bridgewater
a45f7908b3 drm/nouveau/disp/g84-gt200: Use supplied HDMI InfoFrames
Now that we have the InfoFrame data being provided, for the most
part, program the hardware to use it.

While we're here, and since the functionality will come in handy
for supporting 3D stereoscopy, implement setting the Vendor
("generic"?) InfoFrame.

Also don't enable any AVI or Vendor InfoFrame that is not provided,
and disable the Vendor InfoFrame when disabling the output.

Ignore the Audio InfoFrame: We don't supply it, and altering HDMI
audio semantics (for better or worse) on this hardware is out of
scope for me at this time.

Signed-off-by: Alastair Bridgewater <alastair.bridgewater@gmail.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2017-06-16 14:04:18 +10:00
Alastair Bridgewater
f60213c0ee drm/nouveau/disp: Add mechanism to convert HDMI InfoFrames to hardware format
HDMI InfoFrames are passed to NVKM as bags of bytes, but the
hardware needs them to be packed into words.  Rather than having
four (or more) copies of the packing logic, introduce a single copy
now, in a central place.

We currently need these for AVI and Vendor InfoFrames, but we may
also expect to need them for Audio InfoFrames at some point.

Signed-off-by: Alastair Bridgewater <alastair.bridgewater@gmail.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2017-06-16 14:04:18 +10:00
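As a rough illustration of the byte-to-word packing that f60213c0ee describes, the sketch below packs an InfoFrame given as a byte string into 32-bit words. The function name, the little-endian byte order and the zero padding are assumptions made for this example; the commit message does not state the layout the hardware actually expects.

```python
import struct


def pack_infoframe_words(frame):
    """Pack a byte string into 32-bit words (hypothetical helper).

    Assumptions for illustration only: little-endian byte order within
    each word, and zero padding of the final partial word.
    """
    padded = frame + b"\x00" * (-len(frame) % 4)
    return [word for (word,) in struct.iter_unpack("<I", padded)]


# Five input bytes become two words; the second word is zero-padded.
words = pack_infoframe_words(bytes([0x82, 0x02, 0x0D, 0x00, 0x11]))
```

Keeping one such helper in a central place is what lets each chipset-specific path share the same packing logic.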
Alastair Bridgewater
34fd3e5d8c drm/nouveau: Pass mode-dependent AVI and Vendor HDMI InfoFrames to NVKM
Now that we have a mechanism by which to pass mode-dependent HDMI
InfoFrames to the low-level hardware driver, it is incumbent upon
us to do so.

Signed-off-by: Alastair Bridgewater <alastair.bridgewater@gmail.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2017-06-16 14:04:18 +10:00
Alastair Bridgewater
31fe2c2002 drm/nouveau/disp/g84-: Extend NVKM HDMI power control method to set InfoFrames
The nouveau driver, in the Linux 3.7 days, used to try and set the
AVI InfoFrame based on the selected display mode.  These days, it
uses a fixed set of InfoFrames.  Start to correct that, by
providing a mechanism whereby InfoFrame data may be passed to the
NVKM functions that do the actual configuration.

At this point, only establish the new parameters and their parsing,
don't actually use the data anywhere yet (since it's not supplied
anywhere).

Signed-off-by: Alastair Bridgewater <alastair.bridgewater@gmail.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2017-06-16 14:04:18 +10:00
Alastair Bridgewater
35dd9874bf drm/nouveau: Clean up nv50_head_atomic_check_mode() and fix blankus calculation
drm_mode_set_crtcinfo() does compensation for interlace and
doublescan timing effects already, so do it first and use the
compensated figures instead of the constant "vscan / ilace" terms
that we had before.

And then it turns out that the hardware model for how the timing
parameters are configured is basically the standard model, but
starting one clock before the sync pulse rather than at the start
of the display area, which lets us drastically simplify the
overall timing calculations (verifying the changes by algebraic
operations is left as an exercise for the reader).

Finally, there were a couple of issues with the computation of
m->v.blankus that are addressed here.  Interlaced modes would
generate a negative intermediate result.  Double scan modes would
generate an overestimate rather than an underestimate.  And when
enabling frame-packing modes, a rather extreme overestimate would
be generated.  Fixed, by using the timings as adjusted for the
CRTC to find the length of the vertical blanking period instead of
mixing adjusted and pre-adjustment timing parameters.

Signed-off-by: Alastair Bridgewater <alastair.bridgewater@gmail.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2017-06-16 14:04:18 +10:00
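The vertical blanking estimate discussed in 35dd9874bf can be sketched as simple arithmetic on the CRTC-adjusted timings. This is a simplified model, not the driver's exact expression: the parameter names mirror the `crtc_*` fields of `struct drm_display_mode`, and any hardware-specific margin the driver subtracts is left out.

```python
def vblank_us(crtc_vtotal, crtc_vdisplay, crtc_htotal, crtc_clock_khz):
    """Estimate the vertical blanking period in microseconds.

    Uses only adjusted (crtc_*) timings, so interlace and doublescan
    compensation is assumed to have been applied to every input already.
    Integer division rounds down, which yields an underestimate.
    """
    blank_lines = crtc_vtotal - crtc_vdisplay
    # pixels in blanking / (kHz = pixels per millisecond) gives ms; x1000 gives us
    return blank_lines * crtc_htotal * 1000 // crtc_clock_khz


# 1920x1080@60 (CEA timings): 45 blank lines of 2200 pixels at 148.5 MHz.
blank = vblank_us(1125, 1080, 2200, 148500)
```

Mixing an adjusted total with an unadjusted active height is the kind of inconsistency that could produce the negative or inflated results the message mentions.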
Dave Airlie
925344ccc9 BackMerge tag 'v4.12-rc5' into drm-next
Linux 4.12-rc5 for nouveau fixes
2017-06-16 13:58:27 +10:00
Tariq Toukan
4c07c13240 net/mlx4_en: Refactor mlx4_en_free_tx_desc
Some code re-ordering, functionally equivalent.

- The !tx_info->inl check is evaluated anyway in both flows
  (common case/end case). Run it first, this might finish
  the flows earlier.
- dma_unmap calls are identical in both flows, get it out
  of the if block into the common area.

Performance tests:
Tested on ConnectX3Pro, Intel(R) Xeon(R) CPU E5-2680 v3 @ 2.50GHz

Gain is too small to be measurable, no degradation sensed.
Results are similar for IPv4 and IPv6.

Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Reviewed-by: Saeed Mahameed <saeedm@mellanox.com>
Cc: kernel-team@fb.com
Cc: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-15 22:53:23 -04:00
Tariq Toukan
9573e0d39f net/mlx4_en: Replace TXBB_SIZE multiplications with shift operations
Define LOG_TXBB_SIZE, log of TXBB_SIZE, and use it with a shift
operation instead of a multiplication with TXBB_SIZE.
Operations are equivalent as TXBB_SIZE is a power of two.

Performance tests:
Tested on ConnectX3Pro, Intel(R) Xeon(R) CPU E5-2680 v3 @ 2.50GHz

Gain is too small to be measurable, no degradation sensed.
Results are similar for IPv4 and IPv6.

Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Reviewed-by: Saeed Mahameed <saeedm@mellanox.com>
Cc: kernel-team@fb.com
Cc: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-15 22:53:23 -04:00
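The equivalence claimed in 9573e0d39f holds for any power-of-two size and can be checked with a short sketch. The value 64 for TXBB_SIZE is an assumption made for this example, and the function names are hypothetical.

```python
TXBB_SIZE = 64  # assumed value for illustration; must be a power of two
LOG_TXBB_SIZE = TXBB_SIZE.bit_length() - 1  # log2 of a power of two


def offset_by_multiply(index):
    """Byte offset of a TXBB computed with a multiplication."""
    return index * TXBB_SIZE


def offset_by_shift(index):
    """Same offset computed with a left shift."""
    return index << LOG_TXBB_SIZE


# The two forms agree for every index checked here.
same = all(offset_by_multiply(i) == offset_by_shift(i) for i in range(1024))
```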
Tariq Toukan
77788b5bf6 net/mlx4_en: Increase default TX ring size
Increase the default TX ring size (from 512 to 1024) to match
the RX ring size.
This gives the XDP TX ring a better chance to keep up with the
rate of its RX ring in case of a high load of XDP_TX actions.

Tested:
The ethtool counter rx_xdp_tx_full used to increase; after applying this
patch it stopped increasing.

Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Reviewed-by: Saeed Mahameed <saeedm@mellanox.com>
Cc: kernel-team@fb.com
Cc: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-15 22:53:23 -04:00
Tariq Toukan
6c78511b05 net/mlx4_en: Poll XDP TX completion queue in RX NAPI
Instead of having their own NAPIs, XDP TX completion queues get
polled within the corresponding RX NAPI.
This prevents any possible race on TX ring prod/cons indices,
between the context that issues the transmits (RX NAPI) and the
context that handles the completions (was previously done in
a separate NAPI).

This also improves performance, as it decreases the number
of NAPIs running on a CPU, saving the overhead of syncing
and switching between the contexts.

Performance tests:
Tested on ConnectX3Pro, Intel(R) Xeon(R) CPU E5-2680 v3 @ 2.50GHz
Single queue no-RSS optimization ON.

XDP_TX packet rate:
-------------------------------------
     | Before    | After     | Gain |
IPv4 | 12.0 Mpps | 13.8 Mpps |  15% |
IPv6 | 12.0 Mpps | 13.8 Mpps |  15% |
-------------------------------------

Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Reviewed-by: Saeed Mahameed <saeedm@mellanox.com>
Cc: kernel-team@fb.com
Cc: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-15 22:53:23 -04:00
Tariq Toukan
36ea796498 net/mlx4_en: Improve XDP xmit function
Several performance improvements in XDP TX datapath,
including:
- Ring a single doorbell for XDP TX ring per NAPI budget,
  instead of doing it per a lower threshold (was 8).
  This includes removing the flow of immediate doorbell ringing
  in case of a full TX ring.
- Compiler branch predictor hints.
- Calculate values at compile time rather than at runtime.

Performance tests:
Tested on ConnectX3Pro, Intel(R) Xeon(R) CPU E5-2680 v3 @ 2.50GHz
Single queue no-RSS optimization ON.

XDP_TX packet rate:
-------------------------------------
     | Before    | After     | Gain |
IPv4 | 10.3 Mpps | 12.0 Mpps |  17% |
IPv6 | 10.3 Mpps | 12.0 Mpps |  17% |
-------------------------------------

Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Reviewed-by: Saeed Mahameed <saeedm@mellanox.com>
Cc: kernel-team@fb.com
Cc: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-15 22:53:23 -04:00
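The doorbell batching described in 36ea796498 can be modelled with a small counting sketch. It is a toy model, not driver code: the function name is hypothetical, and it assumes the threshold-based scheme also rings once for a trailing partial batch.

```python
def count_doorbells(packets_in_poll, threshold=None):
    """Count doorbell rings for one NAPI poll (toy model).

    threshold=None models ringing once per poll; an integer threshold
    models ringing after every `threshold` packets, plus once for any
    remainder (the remainder behaviour is an assumption).
    """
    if packets_in_poll == 0:
        return 0
    if threshold is None:
        return 1
    full_batches, remainder = divmod(packets_in_poll, threshold)
    return full_batches + (1 if remainder else 0)


# A poll that transmits 64 packets: threshold of 8 versus a single ring.
before = count_doorbells(64, threshold=8)
after = count_doorbells(64)
```

Fewer doorbell writes per poll means fewer MMIO operations, which is one plausible source of the measured gain.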
Tariq Toukan
f28186d6b5 net/mlx4_en: Improve stack xmit function
Several small code and performance improvements in stack TX datapath,
including:
- Compiler branch predictor hints.
- Minimize variable scope.
- Move tx_info non-inline flow handling to a separate function.
- Calculate data_offset at compile time rather than at runtime
  (for !lso_header_size branch).
- Avoid the ternary operator ("?") when the value can be preset in a matching
  branch.

Performance tests:
Tested on ConnectX3Pro, Intel(R) Xeon(R) CPU E5-2680 v3 @ 2.50GHz

Gain is too small to be measurable, no degradation sensed.
Results are similar for IPv4 and IPv6.

Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Reviewed-by: Saeed Mahameed <saeedm@mellanox.com>
Cc: kernel-team@fb.com
Cc: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-15 22:53:23 -04:00
Tariq Toukan
cc26a49086 net/mlx4_en: Improve transmit CQ polling
Several small performance improvements in TX CQ polling,
including:
- Compiler branch predictor hints.
- Minimize variable scope.
- More proper check of cq type.
- Use boolean instead of int for a binary indication.

Performance tests:
Tested on ConnectX3Pro, Intel(R) Xeon(R) CPU E5-2680 v3 @ 2.50GHz

Packet-rate tests for both regular stack and XDP use cases:
No noticeable gain, no degradation.

Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Reviewed-by: Saeed Mahameed <saeedm@mellanox.com>
Cc: kernel-team@fb.com
Cc: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-15 22:53:23 -04:00
Tariq Toukan
9bcee89ac4 net/mlx4_en: Improve receive data-path
Several small performance improvements in RX datapath,
including:
- Compiler branch predictor hints.
- Replace a multiplication with a shift operation.
- Minimize variable scope.
- Write-prefetch for packet header.
- Avoid the ternary operator ("?") when the value can be preset in a matching
  branch.
- Save a branch by updating RX ring doorbell within
  mlx4_en_refill_rx_buffers(), which now returns void.

Performance tests:
Tested on ConnectX3Pro, Intel(R) Xeon(R) CPU E5-2680 v3 @ 2.50GHz
Single queue no-RSS optimization ON
(enable by ethtool -L <interface> rx 1).

XDP_DROP packet rate:
Same (28.1 Mpps), lower CPU utilization (from ~100% to ~92%).

Drop packets in TC:
-------------------------------------
     | Before    | After     | Gain |
IPv4 | 4.14 Mpps | 4.18 Mpps |   1% |
-------------------------------------

XDP_TX packet rate:
-------------------------------------
     | Before    | After     | Gain |
IPv4 | 10.1 Mpps | 10.3 Mpps |   2% |
IPv6 | 10.1 Mpps | 10.3 Mpps |   2% |
-------------------------------------

Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Reviewed-by: Saeed Mahameed <saeedm@mellanox.com>
Cc: kernel-team@fb.com
Cc: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-15 22:53:23 -04:00
Saeed Mahameed
4931c6ef04 net/mlx4_en: Optimized single ring steering
Avoid touching RX QP RSS context when loading with only
one RX ring, to allow optimized A0 RX steering.

Enable by:
- loading mlx4_core with module param: log_num_mgm_entry_size = -6.
- then: ethtool -L <interface> rx 1

Performance tests:
Tested on ConnectX3Pro, Intel(R) Xeon(R) CPU E5-2680 v3 @ 2.50GHz

XDP_DROP packet rate:
-------------------------------------
     | Before    | After     | Gain |
IPv4 | 20.5 Mpps | 28.1 Mpps |  37% |
IPv6 | 18.4 Mpps | 28.1 Mpps |  53% |
-------------------------------------

Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Cc: kernel-team@fb.com
Cc: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-15 22:53:22 -04:00
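The Gain column in the 4931c6ef04 table is the relative increase in packet rate, rounded to a whole percent. A minimal sketch of that arithmetic, with a hypothetical helper name:

```python
def gain_percent(before_mpps, after_mpps):
    """Relative gain of `after` over `before`, rounded to a whole percent."""
    return round((after_mpps - before_mpps) / before_mpps * 100)


ipv4_gain = gain_percent(20.5, 28.1)
ipv6_gain = gain_percent(18.4, 28.1)
```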
Tariq Toukan
cf97050d54 net/mlx4_en: Remove unused argument in TX datapath function
Remove owner argument, as it is obsolete and unused.
This also saves the overhead of calculating its value in data-path.

Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Reviewed-by: Saeed Mahameed <saeedm@mellanox.com>
Cc: kernel-team@fb.com
Cc: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-15 22:53:22 -04:00
Dave Airlie
a682169891 Merge tag 'imx-drm-next-2017-06-08' of git://git.pengutronix.de/git/pza/linux into drm-next
imx-drm: cleanups and YUV 4:2:0 memory read/write reduction support

- Remove counter load enable from PRE, which has no effect.
- Add support for setting the double read/write reduction flag in channel
  parameter memory. This can be used to save some memory bandwidth when
  capturing in YUV 4:2:0 chroma subsampled formats.
- Allocate DMA channel structures as needed, most of the 64 channels are
  unused or even reserved.
- Remove unused interrupt busy waiting routine.
- Set VDIC field order for both AUTO and MAN inputs simultaneously as
  both can't be active at the same time.

* tag 'imx-drm-next-2017-06-08' of git://git.pengutronix.de/git/pza/linux:
  gpu: ipu-v3: vdic: include AUTO field order bit in ipu_vdi_set_field_order
  gpu: ipu-v3: remove interrupt busy waiting routine
  gpu: ipu-v3: allocate ipuv3_channels as needed
  gpu: ipu-v3: Add support for double read/write reduction
  gpu: ipu-v3: prg: remove counter load enable
2017-06-16 10:05:38 +10:00
Dave Airlie
033fd3256f Merge tag 'drm-fsl-dcu-for-v4.13' of http://git.agner.ch/git/linux-drm-fsl-dcu into drm-next
some fsl-dcu cleanups

* tag 'drm-fsl-dcu-for-v4.13' of http://git.agner.ch/git/linux-drm-fsl-dcu:
  drm/fsl-dcu: use new drm_atomic_helper_shutdown
  drm/fsl-dcu: implement irq_preinstall/uninstall callbacks
  drm/fsl: Drop drm_vblank_cleanup
2017-06-16 10:05:03 +10:00
Dave Airlie
202dfa086d Merge branch 'drm/next/du' of git://linuxtv.org/pinchartl/media into drm-next
The series interleaves DRM and V4L2 patches due to dependencies between the R-
Car DU and VSP drivers. Mauro has acked all the V4L2 patches to go through
your tree, and they don't conflict with anything queued for v4.13 in his tree.
If I need to send any conflicting patches through Mauro's tree for v4.13, I'll
make sure to base them on this branch.

* 'drm/next/du' of git://linuxtv.org/pinchartl/media:
  drm: rcar-du: Map memory through the VSP device
  v4l: vsp1: Add API to map and unmap DRM buffers through the VSP
  v4l: vsp1: Map the DL and video buffers through the proper bus master
  v4l: rcar-fcp: Add an API to retrieve the FCP device
  v4l: rcar-fcp: Don't get/put module reference
  drm: rcar-du: Register a completion callback with VSP1
  v4l: vsp1: Extend VSP1 module API to allow DRM callbacks
  v4l: vsp1: Postpone frame end handling in event of display list race
  drm: rcar-du: Arm the page flip event after queuing the page flip
2017-06-16 10:04:14 +10:00
Dave Airlie
7249e3d64e Merge tag 'sunxi-drm-for-4.13' of https://git.kernel.org/pub/scm/linux/kernel/git/mripard/linux into drm-next
sun4i-drm changes for 4.13

An unusually big pull request for this merge window, with three notable
features:
  - V3s display engine support. This is especially notable because it uses
    a different display engine, the one found on the newer Allwinner SoCs (H3,
    A64 and the like), which will be quite easily supported now.
  - HDMI support for the old Allwinner SoCs. This is enabled only on the
    A10s for now, but should be really easy to extend to deal with A10, A20
    and A31.
  - Preliminary work to deal with dual-pipeline SoCs (A10, A20, A31, H3,
    etc.). It currently ignores the second pipeline, but we can use the
    dual-pipelines bindings. This will be useful to enable the display
    pipeline while we work on the dual-pipeline.

* tag 'sunxi-drm-for-4.13' of https://git.kernel.org/pub/scm/linux/kernel/git/mripard/linux: (27 commits)
  drm/sun4i: Add compatible for the A10s pipeline
  drm/sun4i: Add HDMI support
  dt-bindings: display: sun4i: Add allwinner,tcon-channel property
  dt-bindings: display: sun4i: Add HDMI display bindings
  drm/sun4i: Ignore the generic connectors for components
  drm/sun4i: tcon: multiply the vtotal when not in interlace
  drm/sun4i: tcon: Change vertical total size computation inconsistency
  drm/sun4i: tcon: Fix tcon channel 1 backporch calculation
  drm/sun4i: tcon: Switch mux on only for composite
  drm/sun4i: tcon: Move the muxing out of the mode set function
  drm/sun4i: tcon: Add channel debug
  drm/sun4i: tcon: add support for V3s TCON
  drm/sun4i: Add compatible string for V3s display engine
  drm/sun4i: add support for Allwinner DE2 mixers
  drm/sun4i: add a Kconfig option for sun4i-backend
  drm/sun4i: abstract a engine type
  drm/sun4i: return only planes for layers created
  dt-bindings: add bindings for DE2 on V3s SoC
  drm/sun4i: backend: Clarify sun4i_backend_layer_enable debug message
  drm/sun4i: Set TCON clock inside sun4i_tconX_mode_set
  ...
2017-06-16 10:02:35 +10:00
Dave Airlie
7119dbdf7c Merge tag 'drm-intel-fixes-2017-06-15' of git://anongit.freedesktop.org/git/drm-intel into drm-fixes
drm/i915 fixes for v4.12-rc6

* tag 'drm-intel-fixes-2017-06-15' of git://anongit.freedesktop.org/git/drm-intel:
  drm/i915: Fix GVT-g PVINFO version compatibility check
  drm/i915: Fix SKL+ watermarks for 90/270 rotation
  drm/i915: Fix scaling check for 90/270 degree plane rotation
2017-06-16 10:01:52 +10:00
Dave Airlie
91c0719c69 Merge tag 'drm-misc-fixes-2017-06-15' of git://anongit.freedesktop.org/git/drm-misc into drm-fixes
Driver Changes:
- dw-hdmi: Fix compilation error if REGMAP_MMIO not selected (Laurent)
- host1x: Fix incorrect return value (Christophe)
- tegra: Shore up idr API usage in tegra staging code (Dmitry)
- mgag200: Always use HiPri mode for G200e4v2 and limit max bandwidth (Mathieu)
- mxsfb: Ensure display can be lit up without bootloader initialization (Fabio)

Cc: Laurent Pinchart <laurent.pinchart+renesas@ideasonboard.com>
Cc: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Cc: Dmitry Osipenko <digetx@gmail.com>
Cc: Mathieu Larouche <mathieu.larouche@matrox.com>
Cc: Fabio Estevam <fabio.estevam@nxp.com>

* tag 'drm-misc-fixes-2017-06-15' of git://anongit.freedesktop.org/git/drm-misc:
  drm: mxsfb_crtc: Reset the eLCDIF controller
  drm/mgag200: Fix to always set HiPri for G200e4 V2
  drm/tegra: Correct idr_alloc() minimum id
  drm/tegra: Fix lockup on a use of staging API
  gpu: host1x: Fix error handling
  drm: dw-hdmi: Fix compilation breakage by selecting REGMAP_MMIO
2017-06-16 10:01:04 +10:00
Dave Airlie
1b22f6d72a Merge branch 'drm-fixes-4.12' of git://people.freedesktop.org/~agd5f/linux into drm-fixes
A few fixes for 4.12:
- fix a UVD regression on SI
- fix overflow in watermark calcs on large modes

* 'drm-fixes-4.12' of git://people.freedesktop.org/~agd5f/linux:
  drm/radeon: Fix overflow of watermark calcs at > 4k resolutions.
  drm/amdgpu: Fix overflow of watermark calcs at > 4k resolutions.
  drm/radeon: fix "force the UVD DPB into VRAM as well"
2017-06-16 10:00:11 +10:00
Dave Airlie
04d4fb5fa6 Merge branch 'drm-next-4.13' of git://people.freedesktop.org/~agd5f/linux into drm-next
New radeon and amdgpu features for 4.13:
- Lots of Vega10 bug fixes
- Preliminary Raven support
- KIQ support for compute rings
- MEC queue management rework from Andres
- Audio support for DCE6
- SR-IOV improvements
- Improved module parameters for controlling radeon vs amdgpu support
  for SI and CIK
- Bug fixes
- General code cleanups

[airlied: dropped drmP.h header from one file was needed and build broke]

* 'drm-next-4.13' of git://people.freedesktop.org/~agd5f/linux: (362 commits)
  drm/amdgpu: Fix compiler warnings
  drm/amdgpu: vm_update_ptes remove code duplication
  drm/amd/amdgpu: Port VCN over to new SOC15 macros
  drm/amd/amdgpu: Port PSP v10.0 over to new SOC15 macros
  drm/amd/amdgpu: Port PSP v3.1 over to new SOC15 macros
  drm/amd/amdgpu: Port NBIO v7.0 driver over to new SOC15 macros
  drm/amd/amdgpu: Port NBIO v6.1 driver over to new SOC15 macros
  drm/amd/amdgpu: Port UVD 7.0 over to new SOC15 macros
  drm/amd/amdgpu: Port MMHUB over to new SOC15 macros
  drm/amd/amdgpu: Cleanup gfxhub read-modify-write patterns
  drm/amd/amdgpu: Port GFXHUB over to new SOC15 macros
  drm/amd/amdgpu: Add offset variant to SOC15 macros
  drm/amd/powerplay: add avfs control for Vega10
  drm/amdgpu: add virtual display support for raven
  drm/amdgpu/gfx9: fix compute ring doorbell index
  drm/amd/amdgpu: Rename KIQ ring to avoid spaces
  drm/amd/amdgpu: gfx9 tidy ups (v2)
  drm/amdgpu: add contiguous flag in ucode bo create
  drm/amdgpu: fix missed gpu info firmware when cache firmware during S3
  drm/amdgpu: export test ib debugfs interface
  ...
2017-06-16 09:56:53 +10:00
Dave Airlie
bfda9aa153 Merge tag 'drm-misc-next-2017-06-15' of git://anongit.freedesktop.org/git/drm-misc into drm-next
Cross-subsystem Changes:
- dt-bindings: add vendor prefix for NLT Technologies, Ltd. (Lucas)
- dt-bindings: Add support for samsung s6e3hf2 panel (Hoegeun)

Core Changes:
- Add drm_panel_bridge to avoid connector boilerplate in drivers (Eric)
- Trivial fixes for dupe forward decl and reduce scope of variable (Dawid)

Driver Changes:
- dw-hdmi: Use mode_valid hook on bridge instead of connector (Jose)
- vc4,atmel-hlcdc: Use drm_panel_bridge where appropriate (Eric)
- panel: Add Innolux P079ZCA panel driver (Chris)
- panel-simple: Add NL12880B20-05, NL192108AC18-02D, P320HVN03 panels (Lucas)
- panel-samsung-s6e3ha2: Add s6e3hf2 panel support (Hoegeun)
- zte,vc4,pl111,panel,mxsfb: Miscellaneous fixes

Cc: Jose Abreu <Jose.Abreu@synopsys.com>
Cc: Eric Anholt <eric@anholt.net>
Cc: Chris Zhong <zyw@rock-chips.com>
Cc: Lucas Stach <l.stach@pengutronix.de>
Cc: Hoegeun Kwon <hoegeun.kwon@samsung.com>
Cc: Dawid Kurek <dawikur@gmail.com>

* tag 'drm-misc-next-2017-06-15' of git://anongit.freedesktop.org/git/drm-misc: (26 commits)
  drm: Reduce scope of 'state' variable
  drm: mxsfb_crtc: Reset the eLCDIF controller
  drm: Remove duplicate forward declaration
  drm/panel: s6e3ha2: Add support for s6e3hf2 panel on TM2e board
  dt-bindings: Add support for samsung s6e3hf2 panel
  drm/panel: add backlight dependency for sitronix-st7789v
  drm/panel: S6E3HA2 needs backlight code
  drm/panel: simple: add support for AUO P320HVN03
  drm/panel: simple: add support for NLT NL192108AC18-02D
  dt-bindings: add vendor prefix for NLT Technologies, Ltd.
  drm/panel: simple: add support for NEC NL12880B20-05
  drm/panel: add Innolux P079ZCA panel driver
  dt-bindings: Add INNOLUX P079ZCA panel bindings
  drm/vc4: Fix resource leak in 'vc4_get_hang_state_ioctl()' in error handling path
  drm/vc4/vc4_bo.c: always set bo->resv
  drm: Add const to name field declaration in struct drm_prop_enum_list
  drm/pl111: Fix offset calculation for the primary plane.
  drm/atmel-hlcdc: Fix panel registration
  drm/bridge: Build the panel wrapper in drm_kms_helper
  drm/atmel-hlcdc: Replace the panel usage with drm_panel_bridge.
  ...
2017-06-16 09:33:43 +10:00
Boris Brezillon
34c8ea400f drm/vc4: Mimic drm_atomic_helper_commit() behavior
The VC4 KMS driver is implementing its own ->atomic_commit() but there
are a few generic helpers we can use instead of open-coding the logic.

Signed-off-by: Boris Brezillon <boris.brezillon@free-electrons.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Link: http://patchwork.freedesktop.org/patch/msgid/1496392332-8722-4-git-send-email-boris.brezillon@free-electrons.com
2017-06-15 16:29:08 -07:00
Eric Anholt
83753117f1 drm/vc4: Add get/set tiling ioctls.
This allows mesa to set the tiling format for a BO and have that
tiling format be respected by mesa on the other side of an
import/export (and by vc4 scanout in the kernel), without defining a
protocol to pass the tiling through userspace.

Signed-off-by: Eric Anholt <eric@anholt.net>
Link: http://patchwork.freedesktop.org/patch/msgid/20170608001336.12842-2-eric@anholt.net
Acked-by: Dave Airlie <airlied@redhat.com>
2017-06-15 16:02:45 -07:00
Eric Anholt
98830d91da drm/vc4: Add T-format scanout support.
The T tiling format is what V3D uses for textures, with no raster
support at all until later revisions of the hardware (and always at a
large 3D performance penalty).  If we can't scan out V3D's format,
then we often need to do a relayout at some stage of the pipeline,
either right before texturing from the scanout buffer (common in X11
without a compositor) or between a tiled screen buffer right before
scanout (an option I've considered in trying to resolve this
inconsistency, but which means needing to use the dirty fb ioctl and
having some update policy).

T-format scanout lets us avoid either of those shadow copies, for a
massive, obvious performance improvement to X11 window dragging
without a compositor.  Unfortunately, enabling a compositor to work
around the discrepancy has turned out to be too costly in memory
consumption for the Raspbian distribution.

Because the HVS operates a scanline at a time, compositing from T does
increase the memory bandwidth cost of scanout.  On my 1920x1080@32bpp
display on a RPi3, we go from about 15% of system memory bandwidth
with linear to about 20% with tiled.  However, for X11 this still ends
up being a huge performance win in active usage.

This patch doesn't yet handle src_x/src_y offsetting within the tiled
buffer.  However, we fail to do so for untiled buffers already.

Signed-off-by: Eric Anholt <eric@anholt.net>
Link: http://patchwork.freedesktop.org/patch/msgid/20170608001336.12842-1-eric@anholt.net
Reviewed-by: Boris Brezillon <boris.brezillon@free-electrons.com>
2017-06-15 16:02:45 -07:00
Toshi Kani
56b47fe657 acpi/nfit: Add support of NVDIMM memory error notification in ACPI 6.2
ACPI 6.2 defines a new ACPI notification value to NVDIMM Root Device
in Table 5-169.

 0x81 Unconsumed Uncorrectable Memory Error Detected
      Used to pro-actively notify OSPM of uncorrectable memory errors
      detected (for example a memory scrubbing engine that continuously
      scans the NVDIMMs memory). This is an optional notification. Only
      locations that were mapped in to SPA by the platform will generate
      a notification.

Add support of this notification value by initiating an ARS scan. This
will find new error locations and add their badblocks information.

Link: http://www.uefi.org/sites/default/files/resources/ACPI_6_2.pdf
Signed-off-by: Toshi Kani <toshi.kani@hpe.com>
Cc: Rafael J. Wysocki <rjw@rjwysocki.net>
Cc: Vishal Verma <vishal.l.verma@intel.com>
Cc: Linda Knippers <linda.knippers@hpe.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2017-06-15 14:39:42 -07:00
Dan Williams
4e4f00a9b5 x86, dax, libnvdimm: remove wb_cache_pmem() indirection
With all handling of the CONFIG_ARCH_HAS_PMEM_API case being moved to
libnvdimm and the pmem driver directly we do not need to provide global
wrappers and fallbacks in the CONFIG_ARCH_HAS_PMEM_API=n case. The pmem
driver will simply not link to arch_wb_cache_pmem() in that case.  Same
as before, pmem flushing is only defined for x86_64, via
clean_cache_range(), but it is straightforward to add other archs in the
future.

arch_wb_cache_pmem() is an exported function since the pmem module needs
to find it, but it is privately declared in drivers/nvdimm/pmem.h because
there are no consumers outside of the pmem driver.

Cc: <x86@kernel.org>
Cc: Jan Kara <jack@suse.cz>
Cc: Jeff Moyer <jmoyer@redhat.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Oliver O'Halloran <oohall@gmail.com>
Cc: Matthew Wilcox <mawilcox@microsoft.com>
Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
Suggested-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2017-06-15 14:35:24 -07:00
Dan Williams
abebfbe2f7 dm: add ->flush() dax operation support
Allow device-mapper to route flush operations to the
per-target implementation. In order for the device stacking to work we
need a dax_dev and a pgoff relative to that device. This gives each
layer of the stack the information it needs to look up the operation
pointer for the next level.

This conceptually allows for an array of mixed device drivers with
varying flush implementations.

Reviewed-by: Toshi Kani <toshi.kani@hpe.com>
Reviewed-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2017-06-15 14:34:59 -07:00
Dan Williams
3c1cebff23 dax, pmem: introduce an optional 'flush' dax_operation
Filesystem-DAX flushes caches whenever it writes to the address returned
through dax_direct_access() and when writing back dirty radix entries.
That flushing is only required in the pmem case, so add a dax operation
to allow pmem to take this extra action, but skip it for other dax
capable devices that do not provide a flush routine.

An example for this differentiation might be a volatile ram disk where
there is no expectation of persistence. In fact the pmem driver itself might
front such an address range specified by the NFIT. So, this "no flush"
property might be something passed down by the bus / libnvdimm.

Cc: Christoph Hellwig <hch@lst.de>
Cc: Matthew Wilcox <mawilcox@microsoft.com>
Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2017-06-15 14:34:59 -07:00
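The optional-operation pattern from 3c1cebff23 can be sketched as a device object whose flush hook may be absent. All names here are hypothetical stand-ins; the real kernel interface is a C operations structure and is not reproduced.

```python
class DaxDevice:
    """Toy stand-in for a DAX-capable device with an optional flush hook."""

    def __init__(self, name, flush=None):
        self.name = name
        self.flush = flush  # None means the device needs no cache flushing


def dax_flush(dev, addr, size):
    """Invoke the device's flush hook if it has one; report whether it ran."""
    if dev.flush is None:
        return False  # e.g. a volatile RAM disk: nothing to persist
    dev.flush(addr, size)
    return True


flushed = []
pmem = DaxDevice("pmem0", flush=lambda addr, size: flushed.append((addr, size)))
ramdisk = DaxDevice("ram0")

dax_flush(pmem, 0x1000, 64)     # calls the hook
dax_flush(ramdisk, 0x1000, 64)  # skipped, no hook provided
```

Leaving the hook unset keeps the common write path free of flushing cost on devices that make no persistence promise.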
Toshi Kani
975750a98c libnvdimm, pmem: Add sysfs notifications to badblocks
Sysfs "badblocks" information may be updated during run-time:
 - MCE, SCI, and sysfs "scrub" may add new bad blocks
 - Writes and ioctl() may clear bad blocks

Add support to send sysfs notifications to sysfs "badblocks" file
under region and pmem directories when their badblocks information
is re-evaluated (but is not necessarily changed) during run-time.

Signed-off-by: Toshi Kani <toshi.kani@hpe.com>
Cc: Vishal Verma <vishal.l.verma@intel.com>
Cc: Linda Knippers <linda.knippers@hpe.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2017-06-15 14:31:41 -07:00
Dan Williams
8990cdf10c libnvdimm, label: switch to using v1.2 labels by default
The rules for which version of the label specification are in effect at
any given point in time are as follows:

1/ If a DIMM has an existing / valid index block then the version
   specified is used, even if it is a previous version.

2/ By default when the kernel is initializing new index blocks the
   latest specification version (v1.2 at time of writing) is used.

3/ An environment that wants to force create v1.1 label-sets must
   arrange for userspace to disable all active regions / namespaces /
   dimms and write a valid set of v1.1 index blocks to the dimms.
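
Rules 1/ and 2/ amount to a small decision, sketched below under stated assumptions. The type and function names are hypothetical and the version encoding is invented for illustration; it is not how the driver represents label versions.

```c
#include <stdbool.h>

/* Hypothetical encoding of a label specification version. */
struct demo_label_version {
    unsigned int major;
    unsigned int minor;
};

/* Latest version at the time of the commit message above. */
static const struct demo_label_version demo_latest = { 1, 2 };

/*
 * Rule 1: a valid existing index block dictates the version.
 * Rule 2: otherwise new index blocks use the latest version.
 */
static struct demo_label_version
demo_pick_label_version(bool have_valid_index,
                        struct demo_label_version existing)
{
    if (have_valid_index)
        return existing;
    return demo_latest;
}
```

Rule 3/ needs no extra branch in this sketch: once userspace has written valid v1.1 index blocks, the DIMM falls under rule 1/.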

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2017-06-15 14:31:41 -07:00
Dan Williams
b3fde74ea1 libnvdimm, label: add address abstraction identifiers
Starting with v1.2 labels, 'address abstractions' can be hinted via an
address abstraction id that implies an info-block format. The standard
address abstraction in the specification is the v2 format of the
Block-Translation-Table (BTT). Support for that is saved for a later
patch; for now we add support for the Linux supported address
abstractions BTT (v1), PFN, and DAX.

The new 'holder_class' attribute for namespace devices is added for
tooling to specify the 'abstraction_guid' to store in the namespace label.
For v1.1 labels this field is undefined and any setting of
'holder_class' away from the default 'none' value will only have effect
until the driver is unloaded. Setting 'holder_class' requires that
whatever device tries to claim the namespace must be of the specified
class.

Cc: Vishal Verma <vishal.l.verma@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2017-06-15 14:31:40 -07:00
Dan Williams
355d838878 libnvdimm, label: add v1.2 label checksum support
The v1.2 namespace label specification adds a fletcher checksum to each
label instance. Add generation and validation support for the new field.
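
For readers unfamiliar with Fletcher-style checksums, here is a minimal sketch of a 64-bit variant over 32-bit words. The function name is hypothetical, and the word size, byte order (host order here), and treatment of the checksum field itself are assumptions to check against the label specification, not details taken from this patch.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Fletcher-style running sums over 32-bit words:
 * lo32 accumulates the words (wrapping at 32 bits),
 * hi32 accumulates the successive values of lo32.
 */
static uint64_t demo_fletcher64(const uint32_t *words, size_t nwords)
{
    uint32_t lo32 = 0;
    uint64_t hi32 = 0;
    size_t i;

    for (i = 0; i < nwords; i++) {
        lo32 += words[i];
        hi32 += lo32;
    }

    return (hi32 << 32) | lo32;
}
```

A validator built on such a sum would commonly save the stored checksum, zero the field, recompute over the whole label, and compare the two values.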

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2017-06-15 14:31:40 -07:00
Dan Williams
3934d8410c libnvdimm, label: update 'nlabel' and 'position' handling for local namespaces
The v1.2 namespace label specification requires 'nlabel' and 'position'
to be valid for the first ("lowest dpa") label in the set. It also
requires all non-first labels to set those fields to 0xff.

Linux does not much care if these values are correct, because we can
just trust the count of labels with the matching uuid like the v1.1
case. However, we set them correctly in case other environments care.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2017-06-15 14:31:40 -07:00
Dan Williams
8f2bc2430e libnvdimm, label: populate 'isetcookie' for blk-aperture namespaces
Starting with the v1.2 definition of namespace labels, the isetcookie
field is populated and validated for blk-aperture namespaces. This adds
some safety against inadvertent copying of namespace labels from one
DIMM-device to another.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2017-06-15 14:31:40 -07:00
Dan Williams
faec6f8a1c libnvdimm, label: populate the type_guid property for v1.2 namespaces
The type_guid refers to the "Address Range Type GUID" for the region
backing a namespace as defined in the ACPI NFIT (NVDIMM Firmware Interface
Table). This 'type' identifier specifies an access mechanism for the
given namespace. This capability replaces the confusing usage of the
'NSLABEL_FLAG_LOCAL' flag to indicate a block-aperture-mode namespace.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2017-06-15 14:31:40 -07:00
Dan Williams
f979b13c3c libnvdimm, label: honor the lba size specified in v1.2 labels
Previously we only honored the lba size for blk-aperture mode
namespaces. For pmem namespaces the lba size was just assumed to be 512.
With the new v1.2 label definition and compatibility with other
operating environments, the ->lbasize property is now respected for pmem
namespaces.

Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2017-06-15 14:31:39 -07:00
Dan Williams
c12c48ce86 libnvdimm, label: add v1.2 interleave-set-cookie algorithm
The interleave-set-cookie algorithm is extended to incorporate all the
same components that are used to generate an nvdimm unique-id. For
backwards compatibility we still maintain the old v1.1 definition.

Reported-by: Nicholas Moulin <nicholas.w.moulin@intel.com>
Reported-by: Kaushik Kanetkar <kaushik.a.kanetkar@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2017-06-15 14:31:39 -07:00
Dan Williams
564e871aa6 libnvdimm, label: add v1.2 nvdimm label definitions
In support of improved interoperability between operating systems and pre-boot
environments, the Intel-proposed NVDIMM Namespace Specification [1] has been
adopted and modified into the UEFI 2.7 NVDIMM Label Protocol [2].

Update the definitions of the namespace label data structures so that the new
format can be supported alongside the existing label format.

The new specification changes the default label size to 256 bytes, so
everywhere that relied on sizeof(struct nd_namespace_label) must now use the
sizeof_namespace_label() helper.
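
Why a compile-time sizeof() stops working can be sketched as follows. Only the helper's purpose comes from the text above; the `demo_` names, the struct layout, and the field names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-DIMM state; the field names are assumptions. */
struct demo_dimm {
    size_t label_size;   /* 128 for v1.1, 256 for v1.2 */
    uint8_t *label_area; /* start of the label slots */
};

/* Run-time label size, replacing a compile-time sizeof(). */
static size_t demo_sizeof_namespace_label(const struct demo_dimm *dimm)
{
    return dimm->label_size;
}

/* Locate a label slot using the per-DIMM label size. */
static uint8_t *demo_label_at(const struct demo_dimm *dimm, size_t slot)
{
    return dimm->label_area + slot * demo_sizeof_namespace_label(dimm);
}
```

The same slot index lands at a different byte offset depending on the format in use, which is why every former sizeof() user has to go through the helper.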

There should be no functional differences from these changes as the
default is still the v1.1 128-byte format. Future patches will move the
default to the v1.2 definition.

[1]: http://pmem.io/documents/NVDIMM_Namespace_Spec.pdf
[2]: http://www.uefi.org/sites/default/files/resources/UEFI_Spec_2_7.pdf

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2017-06-15 14:31:39 -07:00
Gustavo A. R. Silva
1492a3a7b2 atm: solos-pci: remove useless variable assignments
Value assigned to variable _data32_ at lines 1254 and 1257 is
overwritten at line 1260 before it can be used. This makes
such variable assignments useless.

Addresses-Coverity-ID: 1227049
Signed-off-by: Gustavo A. R. Silva <garsilva@embeddedor.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-15 17:25:33 -04:00
Wolfram Sang
d44005672d i2c: stub: fix build warning regression
Commit 6c42778780 ("i2c: stub: use pr_fmt") changed the DEBUG
handling and caused build warnings. Revert back to the original.

Fixes: 6c42778780 ("i2c: stub: use pr_fmt")
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
2017-06-15 23:22:11 +02:00
Majd Dibbiny
8812c24d28 net/mlx5: Add fast unload support in shutdown flow
Add support to flush all HW resources with one FW command and
skip all the heavy unload flows of the driver on kernel shutdown.
There's no need to free all the SW context since a new fresh kernel
will be loaded afterwards.

Regarding the FW resources, they should be closed, otherwise we will
have leakage in the FW. To accelerate this flow, we execute one command
in the beginning that tells the FW that the driver isn't going to close
any of the FW resources and asks the FW to clean up everything.
Once the command completes, it's safe to close the PCI resources and
finish the routine.

Signed-off-by: Majd Dibbiny <majd@mellanox.com>
Signed-off-by: Maor Gottlieb <maorg@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2017-06-16 00:19:44 +03:00
Majd Dibbiny
4525abeaae net/mlx5: Expose command polling interface
Add a new interface for command execution that allows the
caller to wait for the command's completion in a busy-wait
loop (polling mode).

This is useful if we want to execute a command in a polling mode
while the driver is working in events mode for the rest of
the commands.
This interface will be used in the downstream patches.

Signed-off-by: Majd Dibbiny <majd@mellanox.com>
Signed-off-by: Maor Gottlieb <maorg@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2017-06-16 00:19:43 +03:00
Gal Pressman
3834a5e626 net/mlx5e: Optimize update stats work
Unlike ethtool stats, get_stats ndo provides information cached by
update stats work that is running in the background without updating
them explicitly.
We cannot update all counters inside the ndo because some
updates require firmware commands that cannot be performed under a
spinlock.

update_stats work does not need to update ALL counters, since only
some of them are needed by ndo_get_stats.
This patch will allow for a minimal run of update_stats using an extra
parameter which will update necessary counters only and cut 13
firmware commands in each iteration of the work.

Work duration previous to this patch: ~4200us.
Work duration after this patch: ~700us (17% of the original time).

Signed-off-by: Gal Pressman <galp@mellanox.com>
Reviewed-by: Eran Ben Elisha <eranbe@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Cc: kernel-team@fb.com
2017-06-16 00:19:32 +03:00
Gal Pressman
432609a4cd net/mlx5e: Move and optimize query out of buffer function
Move "query queue counter out of buffer" helper function out of
qp.c to en_main.c, since mlx5e netdev driver is the only one to use it.

Also allocate the output buffer on the stack instead of the heap, to reduce
number of heap allocs on update_stats work.

Signed-off-by: Gal Pressman <galp@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Cc: kernel-team@fb.com
2017-06-16 00:19:02 +03:00