linux

mirror of https://github.com/hardkernel/linux.git synced 2026-06-05 18:41:58 +09:00

Author	SHA1	Message	Date
Chris Wilson	141a0895d5	drm/i915/pmu: Remove conditional HOTPLUG_CPU registration Even for static CPU configurations, the hotplug CPU framework is still used to determine the CPU topology, and is still being used by the perf event register to check for valid CPUs. Fixes: `b46a33e271` ("drm/i915/pmu: Expose a PMU interface for perf queries") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20171123123432.25035-1-tvrtko.ursulin@linux.intel.com	2017-11-24 09:54:43 +00:00
Chris Wilson	e7e5da7127	drm/i915/selftests: Hold rpm wakeref for request + ggtt usage Since the removal of the delayed rc6 enabling, we now setup and drop the early rpm wakeref during modules initialisation before we start the live selftests. As such, we are now detecting errors in the tests where we were not holding the required wakeref for various actions. As rpm is not the primary goal of the tests involved, take a coarse and convenient rpm wakeref around the tests. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Matthew Auld <matthew.william.auld@gmail.com> Link: https://patchwork.freedesktop.org/patch/msgid/20171123233712.21836-1-chris@chris-wilson.co.uk Reviewed-by: Matthew Auld <matthew.william.auld@gmail.com>	2017-11-24 09:10:58 +00:00
Chris Wilson	b7d3aabf90	drm/i915/pmu: Hide the (unsigned long)ptr cast We pretend the PMU config id is a pointer value when encoding it into the device parameters for presentation via sysfs. This requires casting of an unsigned long into and out of the pointer member, which annoys smatch: drivers/gpu/drm/i915/i915_pmu.c:684 i915_pmu_event_show() warn: argument 3 to %lx specifier is cast from pointer Instead of abusing a generic dev_ext_attribute, define our own typesafe attributes. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Michal Wajdeczko <michal.wajdeczko@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20171123211751.2885-1-chris@chris-wilson.co.uk Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>	2017-11-24 08:54:02 +00:00
Chris Wilson	8911a31c81	drm/i915: Move mi_set_context() into the legacy ringbuffer submission The legacy i915_switch_context() is only applicable to the legacy ringbuffer submission method, so move it from the general i915_gem_context.c to intel_ringbuffer.c (rename pending!). Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20171123152631.31385-2-chris@chris-wilson.co.uk	2017-11-23 16:12:06 +00:00
Chris Wilson	b1c24a6137	drm/i915: Unwind incomplete legacy context switches The legacy context switch for ringbuffer submission is multistaged, where each of those stages may fail. However, we were updating global state after some stages, and so we had to force the incomplete request to be submitted because we could not unwind. Save the global state before performing the switches, and so enable us to unwind back to the previous global state should any phase fail. We then must cancel the request instead of submitting it should the construction fail. v2: s/saved_ctx/from_ctx/; s/ctx/to_ctx/ etc. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20171123152631.31385-1-chris@chris-wilson.co.uk	2017-11-23 16:12:04 +00:00
Matthew Auld	c83a8d4a2e	drm/i915/selftests: test descending addresses For igt_write_huge make sure the higher gtt offsets don't feel left out, which is especially true when dealing with the 48b PPGTT, where we timeout long before we are able exhaust the address space. v2: just use IGT_TIMEOUT Suggested-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Matthew Auld <matthew.auld@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20171123135421.17967-2-matthew.auld@intel.com	2017-11-23 16:09:12 +00:00
Matthew Auld	621d07b20e	drm/i915/selftests: rein in igt_write_huge Rather than repeat the test for each engine, which takes a long time, let's try alternating between the engines in some randomized order. v2: fix gen2 blunder fix !order blunder more cunning permutation construction! Suggested-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Matthew Auld <matthew.auld@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20171123135421.17967-1-matthew.auld@intel.com	2017-11-23 16:09:11 +00:00
Daniel Vetter	6a44e17721	drm/i915: remove stale comment from sanitize_encoder This goes back to pre-atomic, where due to intermediate dpms states connectors and encoder states might indeed not have matched. With atomic that's all smashed together (and hopefully no bios ever enables a vga output in dpms standby/suspedn state or we're toast). In commit `873ffe69a9` Author: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Date: Wed Aug 5 12:37:07 2015 +0200 drm/i915: Remove connectors_active from sanitization, v2. sanitize_encoders was changed to disable the encoder in all cases, which made the comment obsolete. Remove the misleading comment. Reported-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20171121094241.9129-1-daniel.vetter@ffwll.ch	2017-11-23 14:59:07 +01:00
Daniel Vetter	42e5e65765	drm/i915: sync dp link status checks against atomic commmits Two bits: - check actual atomic state, the legacy stuff can only be looked at from within the atomic_commit_tail function, since it's only protected by ordering and not by any locks. - Make sure we don't wreak the work an ongoing nonblocking commit is doing. v2: We need the crtc lock too, because a plane update might change it without having to acquire the connection_mutex (Maarten). Use Maarten's changes for this locking, while keeping the logic that uses the connection->commit->hw_done signal for syncing with nonblocking commits. v3: The initial state objects from the hw state readout do not have a commit object. Check for that (spotted by CI). v4: Fix deadlock from jumping to put_power with locks still held. (mlankhorst) Cc: Manasi Navare <manasi.d.navare@intel.com> Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> References: https://bugs.freedesktop.org/show_bug.cgi?id=103336 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99272 Signed-off-by: Daniel Vetter <daniel.vetter@intel.com> Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/msgid/20171113160140.22679-1-maarten.lankhorst@linux.intel.com	2017-11-23 14:59:07 +01:00
Tvrtko Ursulin	fbba5559d9	drm/i915/pmu: Clear the previous sample value when parking When turning off the engines, and the pmu sampling, clear the previous value as the current measurement should be 0. v2: Use a for-loop v3: * Move clearing to timer self-dis-arm to avoid race with parking. * Clear frequency samples as well. v4: * Init frequency to idle_freq. (Chris Wilson) Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Suggested-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> (v3) Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20171123102654.29296-1-tvrtko.ursulin@linux.intel.com	2017-11-23 12:27:52 +00:00
Tvrtko Ursulin	b552ae444e	drm/i915/pmu: Drop I915_ENGINE_SAMPLE_MAX from uapi headers We have agreed during the engine classes discussion that fields marked as non-ABI are better left out altogether from uapi headers. v2: Use a local define for maintanability. (Chris Wilson) Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20171123100701.18430-1-tvrtko.ursulin@linux.intel.com	2017-11-23 12:27:43 +00:00
Anusha Srivatsa	4f0aa1fa3e	drm/i915/dmc: DMC 1.04 for Kabylake There is a new version of DMC available for KBL. The release notes mentions: 1. Fix for the issue where DC_STATE was getting enabled even when disabled by driver causing data corruption. v2: Remove pull request from commit message (Rodrigo). Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Anusha Srivatsa <anusha.srivatsa@intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Jani Nikula <jani.nikula@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/1510253503-12634-1-git-send-email-anusha.srivatsa@intel.com	2017-11-23 11:14:11 +02:00
Chris Wilson	b4e3c935b2	drm/i915: Save/restore irq state for vlv_residency_raw() Since commit `6060b6aec0` ("drm/i915/pmu: Add RC6 residency metrics"), vlv_residency_raw() may be called from an irq-disabled context (via perf event sampling on remote cpu). As such, we can no longer assume that we are called from process context and must save/restore the irq state for the spinlock. Fixes: `6060b6aec0` ("drm/i915/pmu: Add RC6 residency metrics") Testcase: igt/perf_pmu/other-init-3 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20171122222510.22627-1-chris@chris-wilson.co.uk Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>	2017-11-23 07:19:10 +00:00
Chris Wilson	ee48700dd5	drm/i915: Call i915_gem_init_userptr() before taking struct_mutex We don't need struct_mutex to initialise userptr (it just allocates a workqueue for itself etc), but we do need struct_mutex later on in i915_gem_init() in order to feed requests onto the HW. This should break the chain [ 385.697902] ====================================================== [ 385.697907] WARNING: possible circular locking dependency detected [ 385.697913] 4.14.0-CI-Patchwork_7234+ #1 Tainted: G U [ 385.697917] ------------------------------------------------------ [ 385.697922] perf_pmu/2631 is trying to acquire lock: [ 385.697927] (&mm->mmap_sem){++++}, at: [<ffffffff811bfe1e>] __might_fault+0x3e/0x90 [ 385.697941] but task is already holding lock: [ 385.697946] (&cpuctx_mutex){+.+.}, at: [<ffffffff8116fe8c>] perf_event_ctx_lock_nested+0xbc/0x1d0 [ 385.697957] which lock already depends on the new lock. [ 385.697963] the existing dependency chain (in reverse order) is: [ 385.697970] -> #4 (&cpuctx_mutex){+.+.}: [ 385.697980] __mutex_lock+0x86/0x9b0 [ 385.697985] perf_event_init_cpu+0x5a/0x90 [ 385.697991] perf_event_init+0x178/0x1a4 [ 385.697997] start_kernel+0x27f/0x3f1 [ 385.698003] verify_cpu+0x0/0xfb [ 385.698006] -> #3 (pmus_lock){+.+.}: [ 385.698015] __mutex_lock+0x86/0x9b0 [ 385.698020] perf_event_init_cpu+0x21/0x90 [ 385.698025] cpuhp_invoke_callback+0xca/0xc00 [ 385.698030] _cpu_up+0xa7/0x170 [ 385.698035] do_cpu_up+0x57/0x70 [ 385.698039] smp_init+0x62/0xa6 [ 385.698044] kernel_init_freeable+0x97/0x193 [ 385.698050] kernel_init+0xa/0x100 [ 385.698055] ret_from_fork+0x27/0x40 [ 385.698058] -> #2 (cpu_hotplug_lock.rw_sem){++++}: [ 385.698068] cpus_read_lock+0x39/0xa0 [ 385.698073] apply_workqueue_attrs+0x12/0x50 [ 385.698078] __alloc_workqueue_key+0x1d8/0x4d8 [ 385.698134] i915_gem_init_userptr+0x5f/0x80 [i915] [ 385.698176] i915_gem_init+0x7c/0x390 [i915] [ 385.698213] i915_driver_load+0x99e/0x15c0 [i915] [ 385.698250] i915_pci_probe+0x33/0x90 [i915] [ 385.698256] pci_device_probe+0xa1/0x130 [ 385.698262] driver_probe_device+0x293/0x440 [ 385.698267] __driver_attach+0xde/0xe0 [ 385.698272] bus_for_each_dev+0x5c/0x90 [ 385.698277] bus_add_driver+0x16d/0x260 [ 385.698282] driver_register+0x57/0xc0 [ 385.698287] do_one_initcall+0x3e/0x160 [ 385.698292] do_init_module+0x5b/0x1fa [ 385.698297] load_module+0x2374/0x2dc0 [ 385.698302] SyS_finit_module+0xaa/0xe0 [ 385.698307] entry_SYSCALL_64_fastpath+0x1c/0xb1 [ 385.698311] -> #1 (&dev->struct_mutex){+.+.}: [ 385.698320] __mutex_lock+0x86/0x9b0 [ 385.698361] i915_mutex_lock_interruptible+0x4c/0x130 [i915] [ 385.698403] i915_gem_fault+0x206/0x760 [i915] [ 385.698409] __do_fault+0x1a/0x70 [ 385.698413] __handle_mm_fault+0x7c4/0xdb0 [ 385.698417] handle_mm_fault+0x154/0x300 [ 385.698440] __do_page_fault+0x2d6/0x570 [ 385.698445] page_fault+0x22/0x30 [ 385.698449] -> #0 (&mm->mmap_sem){++++}: [ 385.698459] lock_acquire+0xaf/0x200 [ 385.698464] __might_fault+0x68/0x90 [ 385.698470] _copy_to_user+0x1e/0x70 [ 385.698475] perf_read+0x1aa/0x290 [ 385.698480] __vfs_read+0x23/0x120 [ 385.698484] vfs_read+0xa3/0x150 [ 385.698488] SyS_read+0x45/0xb0 [ 385.698493] entry_SYSCALL_64_fastpath+0x1c/0xb1 [ 385.698497] other info that might help us debug this: [ 385.698505] Chain exists of: &mm->mmap_sem --> pmus_lock --> &cpuctx_mutex [ 385.698517] Possible unsafe locking scenario: [ 385.698522] CPU0 CPU1 [ 385.698526] ---- ---- [ 385.698529] lock(&cpuctx_mutex); [ 385.698553] lock(pmus_lock); [ 385.698558] lock(&cpuctx_mutex); [ 385.698564] lock(&mm->mmap_sem); [ 385.698568] * DEADLOCK * [ 385.698574] 1 lock held by perf_pmu/2631: [ 385.698578] #0: (&cpuctx_mutex){+.+.}, at: [<ffffffff8116fe8c>] perf_event_ctx_lock_nested+0xbc/0x1d0 [ 385.698589] stack backtrace: [ 385.698595] CPU: 3 PID: 2631 Comm: perf_pmu Tainted: G U 4.14.0-CI-Patchwork_7234+ #1 [ 385.698602] Hardware name: /NUC6CAYB, BIOS AYAPLCEL.86A.0040.2017.0619.1722 06/19/2017 [ 385.698609] Call Trace: [ 385.698615] dump_stack+0x5f/0x86 [ 385.698621] print_circular_bug.isra.18+0x1d0/0x2c0 [ 385.698627] __lock_acquire+0x19c3/0x1b60 [ 385.698634] ? generic_exec_single+0x77/0xe0 [ 385.698640] ? lock_acquire+0xaf/0x200 [ 385.698644] lock_acquire+0xaf/0x200 [ 385.698650] ? __might_fault+0x3e/0x90 [ 385.698655] __might_fault+0x68/0x90 [ 385.698660] ? __might_fault+0x3e/0x90 [ 385.698665] _copy_to_user+0x1e/0x70 [ 385.698670] perf_read+0x1aa/0x290 [ 385.698675] __vfs_read+0x23/0x120 [ 385.698682] ? __fget+0x101/0x1f0 [ 385.698686] vfs_read+0xa3/0x150 [ 385.698691] SyS_read+0x45/0xb0 [ 385.698696] entry_SYSCALL_64_fastpath+0x1c/0xb1 [ 385.698701] RIP: 0033:0x7ff1c46876ed [ 385.698705] RSP: 002b:00007fff13552f90 EFLAGS: 00000293 ORIG_RAX: 0000000000000000 [ 385.698712] RAX: ffffffffffffffda RBX: ffffc90000647ff0 RCX: 00007ff1c46876ed [ 385.698718] RDX: 0000000000000010 RSI: 00007fff13552fa0 RDI: 0000000000000005 [ 385.698723] RBP: 000056063d300580 R08: 0000000000000000 R09: 0000000000000060 [ 385.698729] R10: 0000000000000000 R11: 0000000000000293 R12: 0000000000000046 [ 385.698734] R13: 00007fff13552c6f R14: 00007ff1c6279d00 R15: 00007ff1c6279a40 Testcase: igt/perf_pmu Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20171122172621.16158-1-chris@chris-wilson.co.uk Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>	2017-11-22 17:40:37 +00:00
Chris Wilson	62d0fe4529	drm/i915: Remove success dmesg noise for intel_rotate_pages() During selftesting intel_rotate_pages() is very, very verbose without giving us any information. Suppress the noise. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20171122145646.1859-1-chris@chris-wilson.co.uk Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>	2017-11-22 17:29:58 +00:00
Chris Wilson	c65c8b0f7a	drm/i915/selftests: Use NOWARN for large allocations We may try to do a large kmalloc for the permutation array, falling back to a smaller array/test if the first allocation fails. Since we are intentionally trying a large allocation which may fail, pass __GFP_NOWARN. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103842 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20171122120600.27025-1-chris@chris-wilson.co.uk Reviewed-by: Matthew Auld <matthew.auld@intel.com>	2017-11-22 12:15:39 +00:00
Tvrtko Ursulin	6060b6aec0	drm/i915/pmu: Add RC6 residency metrics For clients like intel-gpu-overlay it is easier to read the counters via the perf API than having to parse sysfs. Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20171121181852.16128-9-tvrtko.ursulin@linux.intel.com	2017-11-22 11:25:06 +00:00
Tvrtko Ursulin	36cc8b963f	drm/i915: Convert intel_rc6_residency_us to ns Will be used for exposing the PMU counters. v2: * Move intel_runtime_pm_get/put to the callers. (Chris Wilson) * Restore full unit conversion precision. Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20171121181852.16128-8-tvrtko.ursulin@linux.intel.com	2017-11-22 11:25:05 +00:00
Tvrtko Ursulin	0cd4684d6e	drm/i915/pmu: Add interrupt count metric For clients like intel-gpu-overlay it is easier to read the count via the perf API than having to parse /proc. Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20171121181852.16128-7-tvrtko.ursulin@linux.intel.com	2017-11-22 11:25:04 +00:00
Tvrtko Ursulin	b3add01ee2	drm/i915/pmu: Wire up engine busy stats to PMU We can use engine busy stats instead of the sampling timer for better accuracy. By doing this we replace the stohastic sampling with busyness metric derived directly from engine activity. This is context switch interrupt driven, so as accurate as we can get from software tracking. As a secondary benefit, we can also not run the sampling timer in cases only busyness metric is enabled. v2: Rebase. v3: * Rebase, comments. * Leave engine busyness controls out of workers. v4: Checkpatch cleanup. v5: Added comment to pmu_needs_timer change. v6: * Rebase. * Fix style of some comments. (Chris Wilson) v7: Rebase and commit message update. (Chris Wilson) v8: Add delayed stats disabling to improve accuracy in face of CPU hotplug events. v9: Rebase. v10: Rebase - i915_modparams.enable_execlists removal. Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20171121181852.16128-6-tvrtko.ursulin@linux.intel.com	2017-11-22 11:25:03 +00:00
Tvrtko Ursulin	30e17b7847	drm/i915: Engine busy time tracking Track total time requests have been executing on the hardware. We add new kernel API to allow software tracking of time GPU engines are spending executing requests. Both per-engine and global API is added with the latter also being exported for use by external users. v2: * Squashed with the internal API. * Dropped static key. * Made per-engine. * Store time in monotonic ktime. v3: Moved stats clearing to disable. v4: * Comments. * Don't export the API just yet. v5: Whitespace cleanup. v6: * Rename ref to active. * Drop engine aggregate stats for now. * Account initial busy period after enabling stats. v7: * Rebase. v8: * Move context in notification after the notifier. (Chris Wilson) v9: In cases where stats tracking is getting disabled while there is an active context on an engine, add up the current value to the total. This also implies we don't clear the total when tracking is disabled any longer. There is no real need to do so because we define the stats as relative while enabled, meaning comparison between two samples while tracking is enabled is the valid usage. However, when busy stats will later be plugged into the perf PMU API, it is beneficial to not reset the total, since the PMU core likes to do some counter disable/enable cycles on startup, and while doing so during a single long context executing on an engine we would lose some accuracy and so make unit testing more difficult than needs to be. v10: * Fix accounting for preemption. v11: * Rebase for i915_modparams.enable_execlists removal. Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20171121181852.16128-5-tvrtko.ursulin@linux.intel.com	2017-11-22 11:25:02 +00:00
Tvrtko Ursulin	73fd9d3816	drm/i915: Wrap context schedule notification No functional change just something which will be handy in the following patch. Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20171121181852.16128-4-tvrtko.ursulin@linux.intel.com	2017-11-22 11:25:01 +00:00
Tvrtko Ursulin	feff0dc6cd	drm/i915/pmu: Suspend sampling when GPU is idle If only a subset of events is enabled we can afford to suspend the sampling timer when the GPU is idle and so save some cycles and power. v2: Rebase and limit timer even more. v3: Rebase. v4: Rebase. v5: Skip action if perf PMU failed to register. v6: Checkpatch cleanup. v7: * Add a common helper to start the timer if needed. (Chris Wilson) * Add comment explaining bitwise logic in pmu_needs_timer. v8: Fix some comments styles. (Chris Wilson) v9: Rebase. v10: Move function declarations to i915_pmu.h. v11: Rename functions to i915_pmu_gt_(un)parked. (Chris Wilson) Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20171121181852.16128-3-tvrtko.ursulin@linux.intel.com	2017-11-22 11:25:00 +00:00
Tvrtko Ursulin	b46a33e271	drm/i915/pmu: Expose a PMU interface for perf queries From: Chris Wilson <chris@chris-wilson.co.uk> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com> From: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com> The first goal is to be able to measure GPU (and invidual ring) busyness without having to poll registers from userspace. (Which not only incurs holding the forcewake lock indefinitely, perturbing the system, but also runs the risk of hanging the machine.) As an alternative we can use the perf event counter interface to sample the ring registers periodically and send those results to userspace. Functionality we are exporting to userspace is via the existing perf PMU API and can be exercised via the existing tools. For example: perf stat -a -e i915/rcs0-busy/ -I 1000 Will print the render engine busynnes once per second. All the performance counters can be enumerated (perf list) and have their unit of measure correctly reported in sysfs. v1-v2 (Chris Wilson): v2: Use a common timer for the ring sampling. v3: (Tvrtko Ursulin) * Decouple uAPI from i915 engine ids. * Complete uAPI defines. * Refactor some code to helpers for clarity. * Skip sampling disabled engines. * Expose counters in sysfs. * Pass in fake regs to avoid null ptr deref in perf core. * Convert to class/instance uAPI. * Use shared driver code for rc6 residency, power and frequency. v4: (Dmitry Rogozhkin) * Register PMU with .task_ctx_nr=perf_invalid_context * Expose cpumask for the PMU with the single CPU in the mask * Properly support pmu->stop(): it should call pmu->read() * Properly support pmu->del(): it should call stop(event, PERF_EF_UPDATE) * Introduce refcounting of event subscriptions. * Make pmu.busy_stats a refcounter to avoid busy stats going away with some deleted event. * Expose cpumask for i915 PMU to avoid multiple events creation of the same type followed by counter aggregation by perf-stat. * Track CPUs getting online/offline to migrate perf context. If (likely) cpumask will initially set CPU0, CONFIG_BOOTPARAM_HOTPLUG_CPU0 will be needed to see effect of CPU status tracking. * End result is that only global events are supported and perf stat works correctly. * Deny perf driver level sampling - it is prohibited for uncore PMU. v5: (Tvrtko Ursulin) * Don't hardcode number of engine samplers. * Rewrite event ref-counting for correctness and simplicity. * Store initial counter value when starting already enabled events to correctly report values to all listeners. * Fix RC6 residency readout. * Comments, GPL header. v6: * Add missing entry to v4 changelog. * Fix accounting in CPU hotplug case by copying the approach from arch/x86/events/intel/cstate.c. (Dmitry Rogozhkin) v7: * Log failure message only on failure. * Remove CPU hotplug notification state on unregister. v8: * Fix error unwind on failed registration. * Checkpatch cleanup. v9: * Drop the energy metric, it is available via intel_rapl_perf. (Ville Syrjälä) * Use HAS_RC6(p). (Chris Wilson) * Handle unsupported non-engine events. (Dmitry Rogozhkin) * Rebase for intel_rc6_residency_ns needing caller managed runtime pm. * Drop HAS_RC6 checks from the read callback since creating those events will be rejected at init time already. * Add counter units to sysfs so perf stat output is nicer. * Cleanup the attribute tables for brevity and readability. v10: * Fixed queued accounting. v11: * Move intel_engine_lookup_user to intel_engine_cs.c * Commit update. (Joonas Lahtinen) v12: * More accurate sampling. (Chris Wilson) * Store and report frequency in MHz for better usability from perf stat. * Removed metrics: queued, interrupts, rc6 counters. * Sample engine busyness based on seqno difference only for less MMIO (and forcewake) on all platforms. (Chris Wilson) v13: * Comment spelling, use mul_u32_u32 to work around potential GCC issue and somne code alignment changes. (Chris Wilson) v14: * Rebase. v15: * Rebase for RPS refactoring. v16: * Use the dynamic slot in the CPU hotplug state machine so that we are free to setup our state as multi-instance. Previously we were re-using the CPUHP_AP_PERF_X86_UNCORE_ONLINE slot which is neither used as multi-instance, nor owned by our driver to start with. * Register the CPU hotplug handlers after the PMU, otherwise the callback will get called before the PMU is initialized which can end up in perf_pmu_migrate_context with an un-initialized base. * Added workaround for a probable bug in cpuhp core. v17: * Remove workaround for the cpuhp bug. v18: * Rebase for drm_i915_gem_engine_class getting upstream before us. v19: * Rebase. (trivial) Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Signed-off-by: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20171121181852.16128-2-tvrtko.ursulin@linux.intel.com	2017-11-22 11:24:57 +00:00
Tvrtko Ursulin	c84b270546	drm/i915: Extract intel_get_cagf Code to be shared between debugfs and the PMU implementation. v2: Checkpatch cleanup. v3: Also consolidate i915_sysfs.c/gt_act_freq_mhz_show. v4: Rebase. Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20171121181852.16128-1-tvrtko.ursulin@linux.intel.com	2017-11-22 11:24:56 +00:00
Chris Wilson	f9eb63b98c	drm/i915/selftests: Avoid drm_gem_handle_create under struct_mutex Despite us reloading the module around every selftest, the lockclasses persist and the chains used in selftesting may then dictate how we are allowed to nest locks during runtime testing. As such we have to be just as careful, and in particular it turns out we are not allowed to nest dev->object_name_lock (drm_gem_handle_create) inside dev->struct_mutex. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103830 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Matthew Auld <matthew.auld@intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20171121110652.1107-1-chris@chris-wilson.co.uk Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>	2017-11-21 21:44:55 +00:00
Ville Syrjälä	cff109f06d	drm/i915: Add rudimentary plane state verification Check that the planes are in the state we expect them to be. For now we can only check whether each plane is correctly enabled or disabled. In the future we may want to expand the plane state readout to support a more thorough verification. v2: Verify all planes part of the state as long as at least one crtc is doing a modeset (Daniel) v3: Fix typoes (James) Cc: James Ausmus <james.ausmus@intel.com> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Suggested-by: Daniel Vetter <daniel.vetter@ffwll.ch> Reviewed-by: James Ausmus <james.ausmus@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20171117191917.11506-11-ville.syrjala@linux.intel.com Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>	2017-11-21 19:54:29 +02:00
Ville Syrjälä	2924b8cc41	drm/i915: Use plane->get_hw_state() for initial plane fb readout Since we now have a ->get_hw_state() method for planes, let's use that during the initial plane fb readout. v2: s/plane/i9xx_plane/ etc. (James) Cc: James Ausmus <james.ausmus@intel.com> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Suggested-by: Daniel Vetter <daniel.vetter@ffwll.ch> Reviewed-by: James Ausmus <james.ausmus@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20171117191917.11506-10-ville.syrjala@linux.intel.com Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>	2017-11-21 19:52:28 +02:00
Ville Syrjälä	b1558c7ea1	drm/i915: Nuke crtc->plane Eliminate crtc->plane since it's pretty much a layering violation. We can always get the plane via crtc->primary if we actually need it. The only ugly thing left is plane_to_crtc_mapping[], but that's still needed by the pre-g4x watermark code. v2: Removed a misplaced comment change (Daniel) v3: Rebase due to fbc crtc->y usage removal v4: s/plane/i9xx_plane/ etc. (James) Cc: James Ausmus <james.ausmus@intel.com> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/msgid/20171117191917.11506-9-ville.syrjala@linux.intel.com Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>	2017-11-21 19:51:34 +02:00
Ville Syrjälä	dd57602efb	drm/i915: Switch fbc over to for_each_new_intel_plane_in_state() Stop using the old for_each_intel_plane_in_state() type iteration macro and replace it with for_each_new_intel_plane_in_state(). And similarly replace drm_atomic_get_existing_crtc_state() with intel_atomic_get_new_crtc_state(). Switch over to intel_ types as well to make the code less cluttered. v2: s/plane/i9xx_plane/ etc. (James) Cc: James Ausmus <james.ausmus@intel.com> Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/msgid/20171117191917.11506-8-ville.syrjala@linux.intel.com Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>	2017-11-21 19:50:43 +02:00
Ville Syrjälä	81894b2fb9	drm/i915: Nuke ironlake_get_initial_plane_config() The only relevant difference between i9xx_get_initial_plane_config() and ironlake_get_initial_plane_config() is the HSW/BDW TILEOFF handling. Add that to i9xx_get_initial_plane_config() and nuke ironlake_get_initial_plane_config(). v2: s/plane/i9xx_plane/ etc. (James) Cc: James Ausmus <james.ausmus@intel.com> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/msgid/20171117191917.11506-7-ville.syrjala@linux.intel.com Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>	2017-11-21 19:47:47 +02:00
Ville Syrjälä	282e83ef62	drm/i915: Cleanup enum pipe/enum plane_id/enum i9xx_plane_id in initial fb readout Use enum pipe, enum plane_id, and enum i9xx_plane_id consistently in the initial framebuffe readout. v2: Use old_plane_id in the ilk code v3: s/old_plane_id/i9xx_plane_id/ (Daniel) v4: Rebase due to GLK/CNL PLANE_COLOR_CTL alpha stuff v5: s/plane/i9xx_plane/ etc. (James) Cc: James Ausmus <james.ausmus@intel.com> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/msgid/20171117191917.11506-6-ville.syrjala@linux.intel.com Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>	2017-11-21 19:47:07 +02:00
Ville Syrjälä	bdaf8439ba	drm/i915: Use enum i9xx_plane_id for the .get_fifo_size() hooks Replace the 0 and 1 with PLANE_A and PLANE_B in the pre-g4x wm code. v2: s/old_plane_id/i9xx_plane_id/ (Daniel) v3: s/plane/i9xx_plane/ etc. (James) Cc: James Ausmus <james.ausmus@intel.com> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/msgid/20171117191917.11506-5-ville.syrjala@linux.intel.com Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>	2017-11-21 19:45:05 +02:00
Ville Syrjälä	ed15030d7a	drm/i915: s/enum plane/enum i9xx_plane_id/ Rename enum plane to enum i9xx_plane_id to make it clear that it only applies to pre-SKL platforms. enum i9xx_plane_id is a global identifier, whereas enum plane_id is per-pipe. We need the old global identifier to index the primary plane (and the pre-g4x sprite C if we ever expose it) registers on pre-SKL platforms. v2: Reorder patches v3: s/old_plane_id/i9xx_plane_id/ (Daniel) Pimp the commit message a bit Note that i9xx_plane_id doesn't apply to SKL+ v4: Rebase due to power domain handling in plane readout v5: Rebase due to crtc->dspaddr_offset removal v6: s/plane/i9xx_plane/ etc. (James) Cc: James Ausmus <james.ausmus@intel.com> Cc: Daniel Vetter <daniel@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/msgid/20171117191917.11506-4-ville.syrjala@linux.intel.com Reviewed-by: James Ausmus <james.ausmus@intel.com> Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>	2017-11-21 19:44:03 +02:00
Ville Syrjälä	b1e01595a6	drm/i915: Redo plane sanitation during readout Unify the plane disabling during state readout by pulling the code into a new helper intel_plane_disable_noatomic(). We'll also read out the state of all planes, so that we know which planes really need to be diabled. Additonally we change the plane<->pipe mapping sanitation to work by simply disabling the offending planes instead of entire pipes. And we do it before we otherwise sanitize the crtcs, which means we don't have to worry about misassigned planes during crtc sanitation anymore. v2: Reoder patches to not depend on enum old_plane_id v3: s/for_each_pipe/for_each_intel_crtc/ Cc: Thierry Reding <thierry.reding@gmail.com> Cc: Alex Villacís Lasso <alexvillacislasso@hotmail.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103223 Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Tested-by: Thierry Reding <thierry.reding@gmail.com> Link: https://patchwork.freedesktop.org/patch/msgid/20171117191917.11506-3-ville.syrjala@linux.intel.com Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>	2017-11-21 19:40:47 +02:00
Ville Syrjälä	51f5a09639	drm/i915: Add .get_hw_state() method for planes Add a .get_hw_state() method for planes, returning true or false depending on whether the plane is enabled. Use it to rewrite the plane enabled/disabled asserts in platform agnostic fashion. We do lose the pre-gen4 plane<->pipe mapping checks, but since we're supposed sanitize that anyway it doesn't really matter. v2: Reoder patches to not depend on enum old_plane_id Just call assert_plane_disabled() from assert_planes_disabled() v3: Deal with disabled power wells in .get_hw_state() v4: Rebase due skl primary plane code removal Cc: Thierry Reding <thierry.reding@gmail.com> Cc: Alex Villacís Lasso <alexvillacislasso@hotmail.com> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> #v2 Tested-by: Thierry Reding <thierry.reding@gmail.com> #v2 Link: https://patchwork.freedesktop.org/patch/msgid/20171117191917.11506-2-ville.syrjala@linux.intel.com Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>	2017-11-21 19:40:28 +02:00
David Weinehall	36fe778a48	drm/i915: Don't use GEN6_RC_VIDEO_FREQ on gen10+ GEN6_RC_VIDEO_FREQ is deprecated for >= gen10; don't try to program it. v2: Use IS_GEN9() instead of INTEL_GEN() and remove comment (Rodrigo) Signed-off-by: David Weinehall <david.weinehall@linux.intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20171117080146.20150-1-david.weinehall@linux.intel.com	2017-11-20 14:32:22 -08:00
Chris Wilson	0ab42a7871	drm/i915/selftests: Declare we allocated the guc clients Silence smatch over drivers/gpu/drm/i915/selftests/intel_guc.c:135 igt_guc_init_doorbell_hw() error: we previously assumed 'guc->execbuf_client' could be null (see line 123) drivers/gpu/drm/i915/selftests/intel_guc.c:142 igt_guc_init_doorbell_hw() error: we previously assumed 'guc->preempt_client' could be null (see line 123) by asserting that we did succeed in creating the pair of clients for testing. References: `55bd6bd757` ("drm/i915/selftests: Add a GuC doorbells selftest") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Michel Thierry <michel.thierry@intel.com> Cc: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20171120211907.1649-1-chris@chris-wilson.co.uk Reviewed-by: Michel Thierry <michel.thierry@intel.com>	2017-11-20 21:59:30 +00:00
Chris Wilson	93c6e966b4	drm/i915: Remove i915.semaphores modparam Having disabled the broken semaphores on Sandybridge, there is no need for a modparam any more, so remove it in favour of a simple HAS_LEGACY_SEMAPHORES() guard. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20171120205504.21892-5-chris@chris-wilson.co.uk	2017-11-20 21:59:09 +00:00
Chris Wilson	af9ff6c70d	drm/i915: Move debugfs/i915_semaphore_status to i915_engine_info As the semaphores is just part of the engine, include it with the general pretty printer universally used for debugging. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20171120205504.21892-4-chris@chris-wilson.co.uk	2017-11-20 21:59:08 +00:00
Chris Wilson	0da715ee60	drm/i915: Disable semaphores on Sandybridge I should have admitted defeat long ago as there has been a rare but persistent error on Sandybridge where semaphore signaling did not propagate to the waiter, leading to a GPU hang. With the work on fence signaling for v4.9, the impact of using CPU driven signaling was greatly reduced wrt to the latency of GPU semaphores, though without logical rings support, the benefit of reordering work to avoid bubbles is not realised (i.e. as it stands fence signaling is just a slower, more costly version of HW semaphores; but works more consistently). As a rough indicator of the difference, with semaphores: Sequential (3 engines, 1 processes): average 5.470us per cycle [expected 4.988us] w/o semaphores: Sequential (3 engines, 1 processes): average 15.771us per cycle [expected 4.923us] In comparison, v3.4: with semaphores: Sequential (3 engines, 1 processes): average 16.066us per cycle [expected 11.842us] w/o semaphores: Sequential (3 engines, 1 processes): average 23.460us per cycle [expected 11.839us] Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=54226 #and 100+ dupes Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Acked-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20171120205504.21892-3-chris@chris-wilson.co.uk	2017-11-20 21:59:08 +00:00
Chris Wilson	79e6770cb1	drm/i915: Remove obsolete ringbuffer emission for gen8+ Since removing the module parameter to force selection of ringbuffer emission for gen8, the code is defunct. Remove it. To put the difference into perspective, a couple of microbenchmarks (bdw i7-5557u, 20170324): ring execlists exec continuous nops on all rings: 1.491us 2.223us exec sequential nops on each ring: 12.508us 53.682us single nop + sync: 9.272us 30.291us vblank_mode=0 glxgears: ~11000fps ~9000fps Since the earlier submission, gen8 ringbuffer submission has fallen further and further behind in features. So while ringbuffer may hold the throughput crown, in terms of interactive latency, execlists is much better. Alas, we have no convenient metrics for such, other than demonstrating things we can do with execlists but can not using legacy ringbuffer submission. We have made a few improvements to lowlevel execlists throughput, and ringbuffer currently panics on boot! (bdw i7-5557u, 20171026): ring execlists exec continuous nops on all rings: n/a 1.921us exec sequential nops on each ring: n/a 44.621us single nop + sync: n/a 21.953us vblank_mode=0 glxgears: n/a ~18500fps References: https://bugs.freedesktop.org/show_bug.cgi?id=87725 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Once-upon-a-time-Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20171120205504.21892-2-chris@chris-wilson.co.uk	2017-11-20 21:54:58 +00:00
Chris Wilson	fb5c551ad5	drm/i915: Remove i915.enable_execlists module parameter Execlists and legacy ringbuffer submission are no longer feature comparable (execlists now offer greater functionality that should overcome their performance hit) and obsoletes the unsafe module parameter, i.e. comparing the two modes of execution is no longer useful, so remove the debug tool. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> #i915_perf.c Link: https://patchwork.freedesktop.org/patch/msgid/20171120205504.21892-1-chris@chris-wilson.co.uk	2017-11-20 21:53:59 +00:00
Michel Thierry	ba74cb10c7	drm/i915/execlists: Delay writing to ELSP until HW has processed the previous write The hardware needs some time to process the information received in the ExecList Submission Port, and expects us to not write anything more until it has 'acknowledged' this new submission by sending an IDLE_ACTIVE or PREEMPTED CSB event. If we do not follow this, the driver could write new data into the ELSP before HW had finishing fetching the previous one, putting us in 'undefined behaviour' space. This seems to be the problem causing the spurious PREEMPTED & COMPLETE events after a COMPLETE like the one below: [] vcs0: sw rd pointer = 2, hw wr pointer = 0, current 'head' = 3. [] vcs0: Execlist CSB[0]: 0x00000018 _ 0x00000007 [] vcs0: Execlist CSB[1]: 0x00000001 _ 0x00000000 [] vcs0: Execlist CSB[2]: 0x00000018 _ 0x00000007 <<< COMPLETE [] vcs0: Execlist CSB[3]: 0x00000012 _ 0x00000007 <<< PREEMPTED & COMPLETE [] vcs0: Execlist CSB[4]: 0x00008002 _ 0x00000006 [] vcs0: Execlist CSB[5]: 0x00000014 _ 0x00000006 The ELSP writes that lead to this CSB sequence show that the HW hadn't started executing the previous execlist (the one with only ctx 0x6) by the time the new one was submitted; this is a bit more clear in the data show in the EXECLIST_STATUS register at the time of the ELSP write. [] vcs0: ELSP[0] = 0x0_0 [execlist1] - status_reg = 0x0_302 [] vcs0: ELSP[1] = 0x6_fedb2119 [execlist0] - status_reg = 0x0_8302 [] vcs0: ELSP[2] = 0x7_fedaf119 [execlist1] - status_reg = 0x0_8308 [] vcs0: ELSP[3] = 0x6_fedb2119 [execlist0] - status_reg = 0x7_8308 Note that having to wait for this ack does not disable lite-restores, although it may reduce their numbers. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102035 Signed-off-by: Michel Thierry <michel.thierry@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/<20171118003038.7935-1-michel.thierry@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20171120123458.23242-4-chris@chris-wilson.co.uk Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Tested-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2017-11-20 17:01:38 +00:00
Chris Wilson	2a6c4241fc	drm/i915/selftest: Make guc clients static Make the private array used for stashing test clients static, to silence sparse. References: `55bd6bd757` ("drm/i915/selftests: Add a GuC doorbells selftest") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Michel Thierry <michel.thierry@intel.com> Cc: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20171120132606.4254-1-chris@chris-wilson.co.uk Reviewed-by: Michel Thierry <michel.thierry@intel.com>	2017-11-20 16:50:27 +00:00
Lionel Landwerlin	9f9b2792b6	drm/i915/perf: reuse timestamp frequency from device info Now that we have this stored in the device info, we can drop it from perf part of the driver. Note that this requires to init perf after we've computed the frequency, hence why we move i915_perf_init() from i915_driver_init_early() to after intel_device_info_runtime_init(). v2: Use div_u64 (Chris) v3: Drop u64 divs by switching to kHz (Chris/Ville) Move i915_perf_fini to i915_driver_cleanup_hw (Matthew) Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20171113181902.12411-2-lionel.g.landwerlin@intel.com	2017-11-20 16:09:04 +00:00
Chris Wilson	3fef5cda97	drm/i915: Automatic i915_switch_context for legacy During request construction, after pinning the context we know whether or not we have to emit a context switch. So move this common operation from every caller into i915_gem_request_alloc() itself. v2: Always submit the request if we emitted some commands during request construction, as typically it also involves changes in global state. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20171120102002.22254-2-chris@chris-wilson.co.uk	2017-11-20 15:56:16 +00:00
Chris Wilson	2113184c6f	drm/i915: Pull the unconditional GPU cache invalidation into request construction As the request will, in the following patch, implicitly invoke a context-switch on construction, we should precede that with a GPU TLB invalidation. Also, even before using GGTT, we always want to invalidate the TLBs for any updates (as well as the ppgtt invalidates that are unconditionally applied by execbuf). Since we almost always require the TLB invalidate, do it unconditionally on request allocation and so we can remove it from all other paths. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20171120102002.22254-1-chris@chris-wilson.co.uk Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>	2017-11-20 15:56:16 +00:00
Lionel Landwerlin	7c52a2219d	drm/i915/perf: replace .reg accesses with i915_mmio_reg_offset This replaces accesses to the reg field of the i915_reg_t structure with the i915_mmio_reg_offset() inline function. Suggested-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Ewelina Musial <ewelina.musial@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20171113233455.12085-2-lionel.g.landwerlin@intel.com	2017-11-20 15:46:02 +00:00
Chris Wilson	1f5f9edb44	drm/i915/execlists: Assert that we don't get mixed IDLE_ACTIVE \| COMPLETE events If IDLE_ACTIVE is set, then all other bits are invalid. For us, we can assert that if we see a COMPLETE \| PREEMPTED event, then it should be impossible for it to also contain an IDLE_ACTIVE flag. Suggested-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Michal Winiarski <michal.winiarski@intel.com> Cc: Michel Thierry <michel.thierry@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20171120123458.23242-3-chris@chris-wilson.co.uk Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>	2017-11-20 14:50:45 +00:00

1 2 3 4 5 ...

709740 Commits