Rodrigo gave a persuasive argument for keeping workarounds: that they
serve as a good guide for the bring up of the next generation. Not only
do workarounds persist into the early revisions, they show where the
workarounds were previously added to the code flow and sometimes the old
workarounds have an explanation that give insight into their wider
implications.
Based on his suggestion, document the policy that we want to keep the
workarounds from the current generation to guide the next. Older
preproduction workarounds we still want to remove to keep the code
clean.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Jani Nikula <jani.nikula@intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20171117102635.8689-1-chris@chris-wilson.co.uk
Acked-by: Jani Nikula <jani.nikula@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
The watermarks it should calculate against are the old optimal watermarks.
The currently active crtc watermarks are pure fiction, and are invalid in
case of a nonblocking modeset, page flip enabling/disabling planes or any
other reason.
When the crtc is disabled or during a modeset the intermediate watermarks
don't need to be programmed separately, and could be directly assigned
to the optimal watermarks.
CXSR must always be disabled in the intermediate case for modesets,
else we get a WARN for vblank wait timeout.
Also rename crtc_state to new_crtc_state, to distinguish it from the old
state.
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171115163157.14372-2-maarten.lankhorst@linux.intel.com
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
The watermarks it should calculate against are the old optimal watermarks.
The currently active crtc watermarks are pure fiction, and are invalid in
case of a nonblocking modeset, page flip enabling/disabling planes or any
other reason.
When the crtc is disabled or during a modeset the intermediate watermarks
don't need to be programmed separately, and could be directly assigned
to the optimal watermarks.
CXSR must always be disabled in the intermediate case for modesets, else
we get a WARN for vblank wait timeout.
Also rename crtc_state to new_crtc_state, to distinguish it from the old state.
Changes since v1:
- Use intel_atomic_get_old_crtc_state. (ville)
Changes since v2:
- Always unset cxsr during modeset.
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171115163157.14372-1-maarten.lankhorst@linux.intel.com
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
The firmware may have set up the pipe correctly, but the FIFO
underrun and CRC interrupts are likely not enabled.
This resulted in debugfs_test.read_all_entries failing on haswell,
because of a timeout when reading the crc debugfs entry.
Solve this by enabling FIFO underrun reporting after the initial
fastset, which lets interrupts be generated as expected.
Changes since v1:
- Always enable CPU FIFO underrun reporting for >GEN2,
and handle GEN2 correctly.
Changes since v2:
- Remove unneeded HAS_DDI, simplify GEN2 case.
Changes since v3:
- Use intel_crtc_pch_transcoder to determine pch transcoder for underruns. (Ville)
- Remove crtc->config dereference in intel_crtc_pch_transcoder. (Ville)
Testcase: debugfs_test.read_all_entries
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171113144043.58658-1-maarten.lankhorst@linux.intel.com
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
The first test aims to check guc_init_doorbell_hw, changing the existing
guc clients and doorbells state before calling it.
The second test tries to create as many clients as it is currently possible
(currently limited to max number of doorbells) and exercise the doorbell
alloc/dealloc code.
Since our usage mode require very few clients/doorbells, this code has
been exercised very lightly and it's good to have a simple test for it.
As reference, this test already helped identify the bug fixed by
commit 7f1ea2ac30 ("drm/i915/guc: Fix doorbell id selection").
v2: Extend number of clients; check for client allocation failure when
number of doorbells is exceeded; validate client properties; reuse
guc_init_doorbell_hw (Chris).
v3: guc_init_doorbell_hw test added per Chris suggestion.
v4: Try to explain why guc_init_doorbell_hw exist and comment some
details in the subtest.
v5: Remove redundant pr_info at the beginning of each subtest (Chris);
rebase (s/i915_guc_client/intel_guc_client/).
Signed-off-by: Michel Thierry <michel.thierry@intel.com>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20171116220632.1909-1-michel.thierry@intel.com
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Pull AFS updates from David Howells:
"kAFS filesystem driver overhaul.
The major points of the overhaul are:
(1) Preliminary groundwork is laid for supporting network-namespacing
of kAFS. The remainder of the namespacing work requires some way
to pass namespace information to submounts triggered by an
automount. This requires something like the mount overhaul that's
in progress.
(2) sockaddr_rxrpc is used in preference to in_addr for holding
addresses internally and add support for talking to the YFS VL
server. With this, kAFS can do everything over IPv6 as well as
IPv4 if it's talking to servers that support it.
(3) Callback handling is overhauled to be generally passive rather
than active. 'Callbacks' are promises by the server to tell us
about data and metadata changes. Callbacks are now checked when
we next touch an inode rather than actively going and looking for
it where possible.
(4) File access permit caching is overhauled to store the caching
information per-inode rather than per-directory, shared over
subordinate files. Whilst older AFS servers only allow ACLs on
directories (shared to the files in that directory), newer AFS
servers break that restriction.
To improve memory usage and to make it easier to do mass-key
removal, permit combinations are cached and shared.
(5) Cell database management is overhauled to allow lighter locks to
be used and to make cell records autonomous state machines that
look after getting their own DNS records and cleaning themselves
up, in particular preventing races in acquiring and relinquishing
the fscache token for the cell.
(6) Volume caching is overhauled. The afs_vlocation record is got rid
of to simplify things and the superblock is now keyed on the cell
and the numeric volume ID only. The volume record is tied to a
superblock and normal superblock management is used to mediate
the lifetime of the volume fscache token.
(7) File server record caching is overhauled to make server records
independent of cells and volumes. A server can be in multiple
cells (in such a case, the administrator must make sure that the
VL services for all cells correctly reflect the volumes shared
between those cells).
Server records are now indexed using the UUID of the server
rather than the address since a server can have multiple
addresses.
(8) File server rotation is overhauled to handle VMOVED, VBUSY (and
similar), VOFFLINE and VNOVOL indications and to handle rotation
both of servers and addresses of those servers. The rotation will
also wait and retry if the server says it is busy.
(9) Data writeback is overhauled. Each inode no longer stores a list
of modified sections tagged with the key that authorised it in
favour of noting the modified region of a page in page->private
and storing a list of keys that made modifications in the inode.
This simplifies things and allows other keys to be used to
actually write to the server if a key that made a modification
becomes useless.
(10) Writable mmap() is implemented. This allows a kernel to be build
entirely on AFS.
Note that Pre AFS-3.4 servers are no longer supported, though this can
be added back if necessary (AFS-3.4 was released in 1998)"
* tag 'afs-next-20171113' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs: (35 commits)
afs: Protect call->state changes against signals
afs: Trace page dirty/clean
afs: Implement shared-writeable mmap
afs: Get rid of the afs_writeback record
afs: Introduce a file-private data record
afs: Use a dynamic port if 7001 is in use
afs: Fix directory read/modify race
afs: Trace the sending of pages
afs: Trace the initiation and completion of client calls
afs: Fix documentation on # vs % prefix in mount source specification
afs: Fix total-length calculation for multiple-page send
afs: Only progress call state at end of Tx phase from rxrpc callback
afs: Make use of the YFS service upgrade to fully support IPv6
afs: Overhaul volume and server record caching and fileserver rotation
afs: Move server rotation code into its own file
afs: Add an address list concept
afs: Overhaul cell database management
afs: Overhaul permit caching
afs: Overhaul the callback handling
afs: Rename struct afs_call server member to cm_server
...
this can fix the memory leak under the case that not all
BO are freed during "takedown" stage, because originally
it blocks following kfree on mgr.
Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Spec describe all values in MHz. We handle our
clocks in KHz. This includes the best_dco_centrality that was
forgot in the same unity as spec. Consequently we couldn't
get a good divider for high frequenies. Hence HDMI 2.0 wasn't
working.
Spec tells 999999 for initial best_dco_centrality meaning the
max value in MHz.
Since we convert dco from MHz to KHz we also need to convert
this initial best_doc_centrality to 999999000 or 999999999
or even better, to the max that its variable allow.
This patch also replaces the use of "* KHz(1)" with the values
directly on KHz to avoid future confusion.
v2: Use U32_MAX instead of random 99999 as spec tells. (Ville).
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Shashank Sharma <shashank.sharma@intel.com>
Cc: Mika Kahola <mika.kahola@intel.com>
Cc: Manasi Navare <manasi.d.navare@intel.com>
Cc: James Ausmus <james.ausmus@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171114234223.10600-1-rodrigo.vivi@intel.com
We check whether the multiplies will overflow prior to calling
kmalloc_array so that we can respond with -EINVAL for the invalid user
arguments rather than treating it as an -ENOMEM that would otherwise
occur. However, as Dan Carpenter pointed out, we did an addition on the
unsigned int prior to passing to kmalloc_array where it would be
promoted to size_t for the calculation, thereby allowing it to overflow
and underallocate.
v2: buffer_count is currently limited to INT_MAX because we treat it as
signaled variable for LUT_HANDLE in eb_lookup_vma
v3: Move common checks for eb1/eb2 into the same function
v4: Put the check back for nfence*sizeof(user_fence) overflow
v5: access_ok uses ULONG_MAX but kvmalloc_array uses SIZE_MAX
v6: size_t and unsigned long are not type-equivalent on 32b
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171116105059.25142-1-chris@chris-wilson.co.uk
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
When we call intel_engine_cancel_signaling() to stop reporting when
a request is completed via an asynchronous signal, we remove that request
from the breadcrumb wait queue. However, we may be concurrently
processing that request in the signaler itself, the actual operations on
the request's node itself are serialised but we do not actually clear the
waiter after removing it from the tree allowing both parties to attempt
to do so and corrupting the rbtree. (Previously removing from the
breadcrumb wait queue could only be done on behalf of i915_wait_request,
so this race could not happen).
Reported-by: "He, Bo" <bo.he@intel.com>
Fixes: 9eb143bbec ("drm/i915: Allow a request to be cancelled")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: "He, Bo" <bo.he@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Michał Winiarski <michal.winiarski@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171115121458.24655-1-chris@chris-wilson.co.uk
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
If the sink device is in HDMI mode, enable infoframe interrupt in scdt
irq handle function else call start_video function immediately, because
in DVI mode, there is no infoframe interrupt provided.
Rename start_hdmi function to start_video and get rid of the old
start_video function. In start_video, if the sink is DVI and mode is
MHL1 or MHl2, write appropriate values to registers else the path
should remain the same as in HDMI mode.
Signed-off-by: Maciej Purski <m.purski@samsung.com>
Signed-off-by: Andrzej Hajda <a.hajda@samsung.com>
Link: https://patchwork.freedesktop.org/patch/msgid/1510224822-7732-1-git-send-email-m.purski@samsung.com
Pull drm updates from Dave Airlie:
"This is the main drm pull request for v4.15.
Core:
- Atomic object lifetime fixes
- Atomic iterator improvements
- Sparse/smatch fixes
- Legacy kms ioctls to be interruptible
- EDID override improvements
- fb/gem helper cleanups
- Simple outreachy patches
- Documentation improvements
- Fix dma-buf rcu races
- DRM mode object leasing for improving VR use cases.
- vgaarb improvements for non-x86 platforms.
New driver:
- tve200: Faraday Technology TVE200 block.
This "TV Encoder" encodes a ITU-T BT.656 stream and can be found in
the StorLink SL3516 (later Cortina Systems CS3516) as well as the
Grain Media GM8180.
New bridges:
- SiI9234 support
New panels:
- S6E63J0X03, OTM8009A, Seiko 43WVF1G, 7" rpi touch panel, Toshiba
LT089AC19000, Innolux AT043TN24
i915:
- Remove Coffeelake from alpha support
- Cannonlake workarounds
- Infoframe refactoring for DisplayPort
- VBT updates
- DisplayPort vswing/emph/buffer translation refactoring
- CCS fixes
- Restore GPU clock boost on missed vblanks
- Scatter list updates for userptr allocations
- Gen9+ transition watermarks
- Display IPC (Isochronous Priority Control)
- Private PAT management
- GVT: improved error handling and pci config sanitizing
- Execlist refactoring
- Transparent Huge Page support
- User defined priorities support
- HuC/GuC firmware refactoring
- DP MST fixes
- eDP power sequencing fixes
- Use RCU instead of stop_machine
- PSR state tracking support
- Eviction fixes
- BDW DP aux channel timeout fixes
- LSPCON fixes
- Cannonlake PLL fixes
amdgpu:
- Per VM BO support
- Powerplay cleanups
- CI powerplay support
- PASID mgr for kfd
- SR-IOV fixes
- initial GPU reset for vega10
- Prime mmap support
- TTM updates
- Clock query interface for Raven
- Fence to handle ioctl
- UVD encode ring support on Polaris
- Transparent huge page DMA support
- Compute LRU pipe tweaks
- BO flag to allow buffers to opt out of implicit sync
- CTX priority setting API
- VRAM lost infrastructure plumbing
qxl:
- fix flicker since atomic rework
amdkfd:
- Further improvements from internal AMD tree
- Usermode events
- Drop radeon support
nouveau:
- Pascal temperature sensor support
- Improved BAR2 handling
- MMU rework to support Pascal MMU
exynos:
- Improved HDMI/mixer support
- HDMI audio interface support
tegra:
- Prep work for tegra186
- Cleanup/fixes
msm:
- Preemption support for a5xx
- Display fixes for 8x96 (snapdragon 820)
- Async cursor plane fixes
- FW loading rework
- GPU debugging improvements
vc4:
- Prep for DSI panels
- fix T-format tiling scanout
- New madvise ioctl
Rockchip:
- LVDS support
omapdrm:
- omap4 HDMI CEC support
etnaviv:
- GPU performance counters groundwork
sun4i:
- refactor driver load + TCON backend
- HDMI improvements
- A31 support
- Misc fixes
udl:
- Probe/EDID read fixes.
tilcdc:
- Misc fixes.
pl111:
- Support more variants
adv7511:
- Improve EDID handling.
- HDMI CEC support
sii8620:
- Add remote control support"
* tag 'drm-for-v4.15' of git://people.freedesktop.org/~airlied/linux: (1480 commits)
drm/rockchip: analogix_dp: Use mutex rather than spinlock
drm/mode_object: fix documentation for object lookups.
drm/i915: Reorder context-close to avoid calling i915_vma_close() under RCU
drm/i915: Move init_clock_gating() back to where it was
drm/i915: Prune the reservation shared fence array
drm/i915: Idle the GPU before shinking everything
drm/i915: Lock llist_del_first() vs llist_del_all()
drm/i915: Calculate ironlake intermediate watermarks correctly, v2.
drm/i915: Disable lazy PPGTT page table optimization for vGPU
drm/i915/execlists: Remove the priority "optimisation"
drm/i915: Filter out spurious execlists context-switch interrupts
drm/amdgpu: use irq-safe lock for kiq->ring_lock
drm/amdgpu: bypass lru touch for KIQ ring submission
drm/amdgpu: Potential uninitialized variable in amdgpu_vm_update_directories()
drm/amdgpu: potential uninitialized variable in amdgpu_vce_ring_parse_cs()
drm/amd/powerplay: initialize a variable before using it
drm/amd/powerplay: suppress KASAN out of bounds warning in vega10_populate_all_memory_levels
drm/amd/amdgpu: fix evicted VRAM bo adjudgement condition
drm/vblank: Tune drm_crtc_accurate_vblank_count() WARN down to a debug
drm/rockchip: add CONFIG_OF dependency for lvds
...
Previously the performance is improved through the workload auditing
and shadowing ahead of vGPU scheduling, however, there is the case that
more requests are allocated in submit_context before the previous request
is added, the timeline will hold its seqno which is later.
This patch is to move the request alloc to dispatch_workload function,
where is the same place as request is added.
It will fix the issue of kernel BUG for (timeline->seqno != request->fence.seqno)
check when add_request.
Fixes: 89ea20b930 ("drm/i915/gvt: Factor out scan and shadow from workload dispatch")
Signed-off-by: Chuanxiao Dong <chuanxiao.dong@intel.com>
Signed-off-by: fred gao <fred.gao@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Currently every vgpu share a common gvt opregion memory, but
it is freed at vgpu destroy, then the later vgpu doesn't have
opregion memory once the first vgpu is destroyed. This cause
guest function failure like reboot, second or later boot.
This patch allocate and init virt opregion memory for each
vgpu, so this memory could be freed at vgpu destroy.
Signed-off-by: Xiong Zhang <xiong.y.zhang@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
mmio_read_from_hw() let vgpu could read hw reg, if vgpu's workload
is running on hw, things is good. Otherwise vgpu will get other
vgpu's reg val, it is unsafe.
This patch limit such hw access to active vgpu. If vgpu isn't
running on hw, the reg read of this vgpu will get the last active
val which saved at schedule_out.
v2: ring timestamp is walking continuously even if the ring is idle.
so read hw directly. (Zhenyu)
Signed-off-by: Xiong Zhang <xiong.y.zhang@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
This reverts commit b20d09886fd1b74cd2255d846029a049e524db14.
This caused windows driver boot errors for invalid page address.
Revert for now.
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Cc: Zhi Wang <zhi.a.wang@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Our vGPU doesn't have a device ROM, we need follow the PCI spec to
report this info to drivers. Otherwise, we would see below errors.
Inspecting possible rom at 0xfe049000 (vd=8086:1912 bdf=00:10.0)
qemu-system-x86_64: vfio-pci: Cannot read device rom at 00000000-0000-0000-0000-000000000001
Device option ROM contents are probably invalid (check dmesg).
Skip option ROM probe with rombar=0, or load from file with romfile=No option rom signature (got 4860)
I will also send a improvement patch to PCI subsystem related to PCI ROM.
But no idea to omit below error, since no pattern to detect vbios shadow
without touch its content.
0000:00:10.0: Invalid PCI ROM header signature: expecting 0xaa55, got 0x0000
Signed-off-by: Changbin Du <changbin.du@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
gvt_vgpu_err means something goes wrong. We need the error propagates to
kernel message by default.
Signed-off-by: Changbin Du <changbin.du@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
I have seen the cmd parser dump partial odd info. Stop that and only dump
the full verbose info when debug enabled.
Signed-off-by: Changbin Du <changbin.du@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Use I915_WRITE_FW instead of I915_WRITE to reduce overhead.
The overall mmio switch latency lowers from ~600us to ~180us.
Signed-off-by: Changbin Du <changbin.du@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
This new debugfs entry is used to figure out which registers of vGPU
is different to host. It is a useful tool for new platform enabling
and debugging. When read this entry, all the diff mmio are recognized
and sorted by mmio offset. Besides, the bit positions of different
value are listed in 'Diff' column. Here is a show:
$ sudo cat ./mmio_diff
Offset HW vGPU Diff
00002030 000025f8 00000000 3-8,10,13
00002034 012025f8 00000000 3-8,10,13,21,24
00002038 027fb000 00000000 12-13,15-22,25
0000203c 00003000 00000000 12-13
00002054 0000000a 00000040 1,3,6
00002074 012025f8 00000000 3-8,10,13,21,24
00002080 fffe6000 00000000 13-14,17-31
000020a8 fffffeff ffffffff 8
000020d4 00000004 00000000 2
....
00145974 eb42718c 010c11b0 2-5,13-14,17-19,22,25,27,29-31
00145978 0000002f 0000002a 0,2
0014597c 0000002f 0000002a 0,2
00145980 0000002b 00000028 0-1
00145984 a5a87c9e b27d20c0 1-4,6,10-12,14,16,18,20,22-26,28
001459c0 88390000 883c0000 16,18
00146200 88350000 883a0000 16-19
Total: 72432, Diff: 901
v3: fix a typo.
v2: add mmio_hw_access_pre/post().
Signed-off-by: Changbin Du <changbin.du@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
This patch add a function intel_gvt_for_each_tracked_mmio() to
iterate each tracked mmio. The caller don't be aware of how the
tracked mmios are presented internally.
v2: remove snapshot_hw_mmio_registers().
Signed-off-by: Changbin Du <changbin.du@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
this is an enhanced opregion emulation for win guest support
by initializing more data members including opregion header
size, version and child device propertity for display port.
for simplicity, redefined child_device_config structure.
Signed-off-by: Xiaolin Zhang <xiaolin.zhang@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>