commit 5a90af67c2 upstream.
Since commtit 8a7b1227e3 (cpufreq: davinci: move cpufreq driver to
drivers/cpufreq) this added dependancy only for CONFIG_ARCH_DAVINCI_DA850
where as davinci_cpufreq_init() call is used by all davinci platform.
This patch fixes following build error:
arch/arm/mach-davinci/built-in.o: In function `davinci_init_late':
:(.init.text+0x928): undefined reference to `davinci_cpufreq_init'
make: *** [vmlinux] Error 1
Fixes: 8a7b1227e3 (cpufreq: davinci: move cpufreq driver to drivers/cpufreq)
Signed-off-by: Lad, Prabhakar <prabhakar.csengg@gmail.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 3617f2ca6d upstream.
When a CPU is hot removed we'll cancel all the delayed work items
via gov_cancel_work(). Normally this will just cancels a delayed
timer on each CPU that the policy is managing and the work won't
run, but if the work is already running the workqueue code will
wait for the work to finish before continuing to prevent the
work items from re-queuing themselves like they normally do. This
scheme will work most of the time, except for the case where the
work function determines that it should adjust the delay for all
other CPUs that the policy is managing. If this scenario occurs,
the canceling CPU will cancel its own work but queue up the other
CPUs works to run. For example:
CPU0 CPU1
---- ----
cpu_down()
...
__cpufreq_remove_dev()
cpufreq_governor_dbs()
case CPUFREQ_GOV_STOP:
gov_cancel_work(dbs_data, policy);
cpu0 work is canceled
timer is canceled
cpu1 work is canceled <work runs>
<waits for cpu1> od_dbs_timer()
gov_queue_work(*, *, true);
cpu0 work queued
cpu1 work queued
cpu2 work queued
...
cpu1 work is canceled
cpu2 work is canceled
...
At the end of the GOV_STOP case cpu0 still has a work queued to
run although the code is expecting all of the works to be
canceled. __cpufreq_remove_dev() will then proceed to
re-initialize all the other CPUs works except for the CPU that is
going down. The CPUFREQ_GOV_START case in cpufreq_governor_dbs()
will trample over the queued work and debugobjects will spit out
a warning:
WARNING: at lib/debugobjects.c:260 debug_print_object+0x94/0xbc()
ODEBUG: init active (active state 0) object type: timer_list hint: delayed_work_timer_fn+0x0/0x10
Modules linked in:
CPU: 0 PID: 1491 Comm: sh Tainted: G W 3.10.0 #19
[<c010c178>] (unwind_backtrace+0x0/0x11c) from [<c0109dec>] (show_stack+0x10/0x14)
[<c0109dec>] (show_stack+0x10/0x14) from [<c01904cc>] (warn_slowpath_common+0x4c/0x6c)
[<c01904cc>] (warn_slowpath_common+0x4c/0x6c) from [<c019056c>] (warn_slowpath_fmt+0x2c/0x3c)
[<c019056c>] (warn_slowpath_fmt+0x2c/0x3c) from [<c0388a7c>] (debug_print_object+0x94/0xbc)
[<c0388a7c>] (debug_print_object+0x94/0xbc) from [<c0388e34>] (__debug_object_init+0x2d0/0x340)
[<c0388e34>] (__debug_object_init+0x2d0/0x340) from [<c019e3b0>] (init_timer_key+0x14/0xb0)
[<c019e3b0>] (init_timer_key+0x14/0xb0) from [<c0635f78>] (cpufreq_governor_dbs+0x3e8/0x5f8)
[<c0635f78>] (cpufreq_governor_dbs+0x3e8/0x5f8) from [<c06325a0>] (__cpufreq_governor+0xdc/0x1a4)
[<c06325a0>] (__cpufreq_governor+0xdc/0x1a4) from [<c0633704>] (__cpufreq_remove_dev.isra.10+0x3b4/0x434)
[<c0633704>] (__cpufreq_remove_dev.isra.10+0x3b4/0x434) from [<c08989f4>] (cpufreq_cpu_callback+0x60/0x80)
[<c08989f4>] (cpufreq_cpu_callback+0x60/0x80) from [<c08a43c0>] (notifier_call_chain+0x38/0x68)
[<c08a43c0>] (notifier_call_chain+0x38/0x68) from [<c01938e0>] (__cpu_notify+0x28/0x40)
[<c01938e0>] (__cpu_notify+0x28/0x40) from [<c0892ad4>] (_cpu_down+0x7c/0x2c0)
[<c0892ad4>] (_cpu_down+0x7c/0x2c0) from [<c0892d3c>] (cpu_down+0x24/0x40)
[<c0892d3c>] (cpu_down+0x24/0x40) from [<c0893ea8>] (store_online+0x2c/0x74)
[<c0893ea8>] (store_online+0x2c/0x74) from [<c04519d8>] (dev_attr_store+0x18/0x24)
[<c04519d8>] (dev_attr_store+0x18/0x24) from [<c02a69d4>] (sysfs_write_file+0x100/0x148)
[<c02a69d4>] (sysfs_write_file+0x100/0x148) from [<c0255c18>] (vfs_write+0xcc/0x174)
[<c0255c18>] (vfs_write+0xcc/0x174) from [<c0255f70>] (SyS_write+0x38/0x64)
[<c0255f70>] (SyS_write+0x38/0x64) from [<c0106120>] (ret_fast_syscall+0x0/0x30)
Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Cc: Krzysztof Kozlowski <k.kozlowski@samsung.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 95731ebb11 upstream.
Cpufreq governors' stop and start operations should be carried out
in sequence. Otherwise, there will be unexpected behavior, like in
the example below.
Suppose there are 4 CPUs and policy->cpu=CPU0, CPU1/2/3 are linked
to CPU0. The normal sequence is:
1) Current governor is userspace. An application tries to set the
governor to ondemand. It will call __cpufreq_set_policy() in
which it will stop the userspace governor and then start the
ondemand governor.
2) Current governor is userspace. The online of CPU3 runs on CPU0.
It will call cpufreq_add_policy_cpu() in which it will first
stop the userspace governor, and then start it again.
If the sequence of the above two cases interleaves, it becomes:
1) Application stops userspace governor
2) Hotplug stops userspace governor
which is a problem, because the governor shouldn't be stopped twice
in a row. What happens next is:
3) Application starts ondemand governor
4) Hotplug starts a governor
In step 4, the hotplug is supposed to start the userspace governor,
but now the governor has been changed by the application to ondemand,
so the ondemand governor is started once again, which is incorrect.
The solution is to prevent policy governors from being stopped
multiple times in a row. A governor should only be stopped once for
one policy. After it has been stopped, no more governor stop
operations should be executed.
Also add a mutex to serialize governor operations.
[rjw: Changelog. And you owe me a beverage of my choice.]
Signed-off-by: Xiaoguang Chen <chenxg@marvell.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Cc: Krzysztof Kozlowski <k.kozlowski@samsung.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit d82b922a4a upstream.
The powernow-k6 driver used to read the initial multiplier from the
powernow register. However, there is a problem with this:
* If there was a frequency transition before, the multiplier read from the
register corresponds to the current multiplier.
* If there was no frequency transition since reset, the field in the
register always reads as zero, regardless of the current multiplier that
is set using switches on the mainboard and that the CPU is running at.
The zero value corresponds to multiplier 4.5, so as a consequence, the
powernow-k6 driver always assumes multiplier 4.5.
For example, if we have 550MHz CPU with bus frequency 100MHz and
multiplier 5.5, the powernow-k6 driver thinks that the multiplier is 4.5
and bus frequency is 122MHz. The powernow-k6 driver then sets the
multiplier to 4.5, underclocking the CPU to 450MHz, but reports the
current frequency as 550MHz.
There is no reliable way how to read the initial multiplier. I modified
the driver so that it contains a table of known frequencies (based on
parameters of existing CPUs and some common overclocking schemes) and sets
the multiplier according to the frequency. If the frequency is unknown
(because of unusual overclocking or underclocking), the user must supply
the bus speed and maximum multiplier as module parameters.
This patch should be backported to all stable kernels. If it doesn't
apply cleanly, change it, or ask me to change it.
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit e20e1d0ac0 upstream.
I found out that a system with k6-3+ processor is unstable during network
server load. The system locks up or the network card stops receiving. The
reason for the instability is the CPU frequency scaling.
During frequency transition the processor is in "EPM Stop Grant" state.
The documentation says that the processor doesn't respond to inquiry
requests in this state. Consequently, coherency of processor caches and
bus master devices is not maintained, causing the system instability.
This patch flushes the cache during frequency transition. It fixes the
instability.
Other minor changes:
* u64 invalue changed to unsigned long because the variable is 32-bit
* move the logic to set the multiplier to a separate function
powernow_k6_set_cpu_multiplier
* preserve lower 5 bits of the powernow port instead of 4 (the voltage
field has 5 bits)
* mask interrupts when reading the multiplier, so that the port is not
open during other activity (running other kernel code with the port open
shouldn't cause any misbehavior, but we should better be safe and keep
the port closed)
This patch should be backported to all stable kernels. If it doesn't
apply cleanly, change it, or ask me to change it.
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit c3274763bf upstream.
The powernow-k8 driver maintains a per-cpu data-structure called
powernow_data that is used to perform the frequency transitions.
It initializes this data structure only for the policy->cpu. So,
accesses to this data structure by other CPUs results in various
problems because they would have been uninitialized.
Specifically, if a cpu (!= policy->cpu) invokes the drivers' ->get()
function, it returns 0 as the KHz value, since its per-cpu memory
doesn't point to anything valid. This causes problems during
suspend/resume since cpufreq_update_policy() tries to enforce this
(0 KHz) as the current frequency of the CPU, and this madness gets
propagated to adjust_jiffies() as well. Eventually, lots of things
start breaking down, including the r8169 ethernet card, in one
particularly interesting case reported by Pierre Ossman.
Fix this by initializing the per-cpu data-structures of all the CPUs
in the policy appropriately.
References: https://bugzilla.kernel.org/show_bug.cgi?id=70311
Reported-by: Pierre Ossman <pierre@ossman.eu>
Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 7244cb62d9 upstream.
The minimum pstate is supposed to be a percentage of the maximum P
state available. Calculate min using max pstate and not the
current max which may have been limited by the user
Signed-off-by: Dirk Brandewie <dirk.j.brandewie@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit fbbc5bfb44 upstream.
Calxeda's new ECX-2000 part uses the same cpufreq interface as highbank,
so add it to the driver's compatibility list.
This is a minor change that can safely be applied to the 3.10 and 3.11
stable trees.
Signed-off-by: Mark Langsdorf <mark.langsdorf@calxeda.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 6c4640c3ad upstream.
This sysfs file was called ignore_nice_load earlier and commit
4d5dcc4 (cpufreq: governor: Implement per policy instances of
governors) changed its name to ignore_nice by mistake.
Lets get it renamed back to its original name.
Reported-by: Martin von Gagern <Martin.vGagern@gmx.net>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit f54fe64d14 upstream.
Commit 42913c799 (MIPS: Loongson2: Use clk API instead of direct
dereferences) broke the cpufreq functionality on Loongson2 boards:
clk_set_rate() is called before the CPU frequency table is
initialized, and therefore will always fail.
Fix by moving the clk_set_rate() after the table initialization.
Tested on Lemote FuLoong mini-PC.
Signed-off-by: Aaro Koskinen <aaro.koskinen@iki.fi>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 2a99859932 upstream.
Since cpufreq_cpu_put() called by __cpufreq_remove_dev() drops the
driver module refcount, __cpufreq_remove_dev() causes that refcount
to become negative for the cpufreq driver after a suspend/resume
cycle.
This is not the only bad thing that happens there, however, because
kobject_put() should only be called for the policy kobject at this
point if the CPU is not the last one for that policy.
Namely, if the given CPU is the last one for that policy, the
policy kobject's refcount should be 1 at this point, as set by
cpufreq_add_dev_interface(), and only needs to be dropped once for
the kobject to go away. This actually happens under the cpu == 1
check, so it need not be done before by cpufreq_cpu_put().
On the other hand, if the given CPU is not the last one for that
policy, this means that cpufreq_add_policy_cpu() has been called
at least once for that policy and cpufreq_cpu_get() has been
called for it too. To balance that cpufreq_cpu_get(), we need to
call cpufreq_cpu_put() in that case.
Thus, to fix the described problem and keep the reference
counters balanced in both cases, move the cpufreq_cpu_get() call
in __cpufreq_remove_dev() to the code path executed only for
CPUs that share the policy with other CPUs.
Reported-and-tested-by: Toralf Förster <toralf.foerster@gmx.de>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit e8d05276f2 upstream.
commit 2f7021a8 "cpufreq: protect 'policy->cpus' from offlining
during __gov_queue_work()" caused a regression in CPU hotplug,
because it lead to a deadlock between cpufreq governor worker thread
and the CPU hotplug writer task.
Lockdep splat corresponding to this deadlock is shown below:
[ 60.277396] ======================================================
[ 60.277400] [ INFO: possible circular locking dependency detected ]
[ 60.277407] 3.10.0-rc7-dbg-01385-g241fd04-dirty #1744 Not tainted
[ 60.277411] -------------------------------------------------------
[ 60.277417] bash/2225 is trying to acquire lock:
[ 60.277422] ((&(&j_cdbs->work)->work)){+.+...}, at: [<ffffffff810621b5>] flush_work+0x5/0x280
[ 60.277444] but task is already holding lock:
[ 60.277449] (cpu_hotplug.lock){+.+.+.}, at: [<ffffffff81042d8b>] cpu_hotplug_begin+0x2b/0x60
[ 60.277465] which lock already depends on the new lock.
[ 60.277472] the existing dependency chain (in reverse order) is:
[ 60.277477] -> #2 (cpu_hotplug.lock){+.+.+.}:
[ 60.277490] [<ffffffff810ac6d4>] lock_acquire+0xa4/0x200
[ 60.277503] [<ffffffff815b6157>] mutex_lock_nested+0x67/0x410
[ 60.277514] [<ffffffff81042cbc>] get_online_cpus+0x3c/0x60
[ 60.277522] [<ffffffff814b842a>] gov_queue_work+0x2a/0xb0
[ 60.277532] [<ffffffff814b7891>] cs_dbs_timer+0xc1/0xe0
[ 60.277543] [<ffffffff8106302d>] process_one_work+0x1cd/0x6a0
[ 60.277552] [<ffffffff81063d31>] worker_thread+0x121/0x3a0
[ 60.277560] [<ffffffff8106ae2b>] kthread+0xdb/0xe0
[ 60.277569] [<ffffffff815bb96c>] ret_from_fork+0x7c/0xb0
[ 60.277580] -> #1 (&j_cdbs->timer_mutex){+.+...}:
[ 60.277592] [<ffffffff810ac6d4>] lock_acquire+0xa4/0x200
[ 60.277600] [<ffffffff815b6157>] mutex_lock_nested+0x67/0x410
[ 60.277608] [<ffffffff814b785d>] cs_dbs_timer+0x8d/0xe0
[ 60.277616] [<ffffffff8106302d>] process_one_work+0x1cd/0x6a0
[ 60.277624] [<ffffffff81063d31>] worker_thread+0x121/0x3a0
[ 60.277633] [<ffffffff8106ae2b>] kthread+0xdb/0xe0
[ 60.277640] [<ffffffff815bb96c>] ret_from_fork+0x7c/0xb0
[ 60.277649] -> #0 ((&(&j_cdbs->work)->work)){+.+...}:
[ 60.277661] [<ffffffff810ab826>] __lock_acquire+0x1766/0x1d30
[ 60.277669] [<ffffffff810ac6d4>] lock_acquire+0xa4/0x200
[ 60.277677] [<ffffffff810621ed>] flush_work+0x3d/0x280
[ 60.277685] [<ffffffff81062d8a>] __cancel_work_timer+0x8a/0x120
[ 60.277693] [<ffffffff81062e53>] cancel_delayed_work_sync+0x13/0x20
[ 60.277701] [<ffffffff814b89d9>] cpufreq_governor_dbs+0x529/0x6f0
[ 60.277709] [<ffffffff814b76a7>] cs_cpufreq_governor_dbs+0x17/0x20
[ 60.277719] [<ffffffff814b5df8>] __cpufreq_governor+0x48/0x100
[ 60.277728] [<ffffffff814b6b80>] __cpufreq_remove_dev.isra.14+0x80/0x3c0
[ 60.277737] [<ffffffff815adc0d>] cpufreq_cpu_callback+0x38/0x4c
[ 60.277747] [<ffffffff81071a4d>] notifier_call_chain+0x5d/0x110
[ 60.277759] [<ffffffff81071b0e>] __raw_notifier_call_chain+0xe/0x10
[ 60.277768] [<ffffffff815a0a68>] _cpu_down+0x88/0x330
[ 60.277779] [<ffffffff815a0d46>] cpu_down+0x36/0x50
[ 60.277788] [<ffffffff815a2748>] store_online+0x98/0xd0
[ 60.277796] [<ffffffff81452a28>] dev_attr_store+0x18/0x30
[ 60.277806] [<ffffffff811d9edb>] sysfs_write_file+0xdb/0x150
[ 60.277818] [<ffffffff8116806d>] vfs_write+0xbd/0x1f0
[ 60.277826] [<ffffffff811686fc>] SyS_write+0x4c/0xa0
[ 60.277834] [<ffffffff815bbbbe>] tracesys+0xd0/0xd5
[ 60.277842] other info that might help us debug this:
[ 60.277848] Chain exists of:
(&(&j_cdbs->work)->work) --> &j_cdbs->timer_mutex --> cpu_hotplug.lock
[ 60.277864] Possible unsafe locking scenario:
[ 60.277869] CPU0 CPU1
[ 60.277873] ---- ----
[ 60.277877] lock(cpu_hotplug.lock);
[ 60.277885] lock(&j_cdbs->timer_mutex);
[ 60.277892] lock(cpu_hotplug.lock);
[ 60.277900] lock((&(&j_cdbs->work)->work));
[ 60.277907] *** DEADLOCK ***
[ 60.277915] 6 locks held by bash/2225:
[ 60.277919] #0: (sb_writers#6){.+.+.+}, at: [<ffffffff81168173>] vfs_write+0x1c3/0x1f0
[ 60.277937] #1: (&buffer->mutex){+.+.+.}, at: [<ffffffff811d9e3c>] sysfs_write_file+0x3c/0x150
[ 60.277954] #2: (s_active#61){.+.+.+}, at: [<ffffffff811d9ec3>] sysfs_write_file+0xc3/0x150
[ 60.277972] #3: (x86_cpu_hotplug_driver_mutex){+.+...}, at: [<ffffffff81024cf7>] cpu_hotplug_driver_lock+0x17/0x20
[ 60.277990] #4: (cpu_add_remove_lock){+.+.+.}, at: [<ffffffff815a0d32>] cpu_down+0x22/0x50
[ 60.278007] #5: (cpu_hotplug.lock){+.+.+.}, at: [<ffffffff81042d8b>] cpu_hotplug_begin+0x2b/0x60
[ 60.278023] stack backtrace:
[ 60.278031] CPU: 3 PID: 2225 Comm: bash Not tainted 3.10.0-rc7-dbg-01385-g241fd04-dirty #1744
[ 60.278037] Hardware name: Acer Aspire 5741G /Aspire 5741G , BIOS V1.20 02/08/2011
[ 60.278042] ffffffff8204e110 ffff88014df6b9f8 ffffffff815b3d90 ffff88014df6ba38
[ 60.278055] ffffffff815b0a8d ffff880150ed3f60 ffff880150ed4770 3871c4002c8980b2
[ 60.278068] ffff880150ed4748 ffff880150ed4770 ffff880150ed3f60 ffff88014df6bb00
[ 60.278081] Call Trace:
[ 60.278091] [<ffffffff815b3d90>] dump_stack+0x19/0x1b
[ 60.278101] [<ffffffff815b0a8d>] print_circular_bug+0x2b6/0x2c5
[ 60.278111] [<ffffffff810ab826>] __lock_acquire+0x1766/0x1d30
[ 60.278123] [<ffffffff81067e08>] ? __kernel_text_address+0x58/0x80
[ 60.278134] [<ffffffff810ac6d4>] lock_acquire+0xa4/0x200
[ 60.278142] [<ffffffff810621b5>] ? flush_work+0x5/0x280
[ 60.278151] [<ffffffff810621ed>] flush_work+0x3d/0x280
[ 60.278159] [<ffffffff810621b5>] ? flush_work+0x5/0x280
[ 60.278169] [<ffffffff810a9b14>] ? mark_held_locks+0x94/0x140
[ 60.278178] [<ffffffff81062d77>] ? __cancel_work_timer+0x77/0x120
[ 60.278188] [<ffffffff810a9cbd>] ? trace_hardirqs_on_caller+0xfd/0x1c0
[ 60.278196] [<ffffffff81062d8a>] __cancel_work_timer+0x8a/0x120
[ 60.278206] [<ffffffff81062e53>] cancel_delayed_work_sync+0x13/0x20
[ 60.278214] [<ffffffff814b89d9>] cpufreq_governor_dbs+0x529/0x6f0
[ 60.278225] [<ffffffff814b76a7>] cs_cpufreq_governor_dbs+0x17/0x20
[ 60.278234] [<ffffffff814b5df8>] __cpufreq_governor+0x48/0x100
[ 60.278244] [<ffffffff814b6b80>] __cpufreq_remove_dev.isra.14+0x80/0x3c0
[ 60.278255] [<ffffffff815adc0d>] cpufreq_cpu_callback+0x38/0x4c
[ 60.278265] [<ffffffff81071a4d>] notifier_call_chain+0x5d/0x110
[ 60.278275] [<ffffffff81071b0e>] __raw_notifier_call_chain+0xe/0x10
[ 60.278284] [<ffffffff815a0a68>] _cpu_down+0x88/0x330
[ 60.278292] [<ffffffff81024cf7>] ? cpu_hotplug_driver_lock+0x17/0x20
[ 60.278302] [<ffffffff815a0d46>] cpu_down+0x36/0x50
[ 60.278311] [<ffffffff815a2748>] store_online+0x98/0xd0
[ 60.278320] [<ffffffff81452a28>] dev_attr_store+0x18/0x30
[ 60.278329] [<ffffffff811d9edb>] sysfs_write_file+0xdb/0x150
[ 60.278337] [<ffffffff8116806d>] vfs_write+0xbd/0x1f0
[ 60.278347] [<ffffffff81185950>] ? fget_light+0x320/0x4b0
[ 60.278355] [<ffffffff811686fc>] SyS_write+0x4c/0xa0
[ 60.278364] [<ffffffff815bbbbe>] tracesys+0xd0/0xd5
[ 60.280582] smpboot: CPU 1 is now offline
The intention of that commit was to avoid warnings during CPU
hotplug, which indicated that offline CPUs were getting IPIs from the
cpufreq governor's work items. But the real root-cause of that
problem was commit a66b2e5 (cpufreq: Preserve sysfs files across
suspend/resume) because it totally skipped all the cpufreq callbacks
during CPU hotplug in the suspend/resume path, and hence it never
actually shut down the cpufreq governor's worker threads during CPU
offline in the suspend/resume path.
Reflecting back, the reason why we never suspected that commit as the
root-cause earlier, was that the original issue was reported with
just the halt command and nobody had brought in suspend/resume to the
equation.
The reason for _that_ in turn, as it turns out, is that earlier
halt/shutdown was being done by disabling non-boot CPUs while tasks
were frozen, just like suspend/resume.... but commit cf7df378a
(reboot: migrate shutdown/reboot to boot cpu) which came somewhere
along that very same time changed that logic: shutdown/halt no longer
takes CPUs offline. Thus, the test-cases for reproducing the bug
were vastly different and thus we went totally off the trail.
Overall, it was one hell of a confusion with so many commits
affecting each other and also affecting the symptoms of the problems
in subtle ways. Finally, now since the original problematic commit
(a66b2e5) has been completely reverted, revert this intermediate fix
too (2f7021a8), to fix the CPU hotplug deadlock. Phew!
Reported-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Reported-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Tested-by: Peter Wu <lekensteyn@gmail.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit aae760ed21 upstream.
commit a66b2e (cpufreq: Preserve sysfs files across suspend/resume)
has unfortunately caused several things in the cpufreq subsystem to
break subtly after a suspend/resume cycle.
The intention of that patch was to retain the file permissions of the
cpufreq related sysfs files across suspend/resume. To achieve that,
the commit completely removed the calls to cpufreq_add_dev() and
__cpufreq_remove_dev() during suspend/resume transitions. But the
problem is that those functions do 2 kinds of things:
1. Low-level initialization/tear-down that are critical to the
correct functioning of cpufreq-core.
2. Kobject and sysfs related initialization/teardown.
Ideally we should have reorganized the code to cleanly separate these
two responsibilities, and skipped only the sysfs related parts during
suspend/resume. Since we skipped the entire callbacks instead (which
also included some CPU and cpufreq-specific critical components),
cpufreq subsystem started behaving erratically after suspend/resume.
So revert the commit to fix the regression. We'll revisit and address
the original goal of that commit separately, since it involves quite a
bit of careful code reorganization and appears to be non-trivial.
(While reverting the commit, note that another commit f51e1eb
(cpufreq: Fix cpufreq regression after suspend/resume) already
reverted part of the original set of changes. So revert only the
remaining ones).
Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Tested-by: Paul Bolle <pebolle@tiscali.nl>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit f51e1eb63d upstream.
Toralf Förster reported that the cpufreq ondemand governor behaves erratically
(doesn't scale well) after a suspend/resume cycle. The problem was that the
cpufreq subsystem's idea of the cpu frequencies differed from the actual
frequencies set in the hardware after a suspend/resume cycle. Toralf bisected
the problem to commit a66b2e5 (cpufreq: Preserve sysfs files across
suspend/resume).
Among other (harmless) things, that commit skipped the call to
cpufreq_update_policy() in the resume path. But cpufreq_update_policy() plays
an important role during resume, because it is responsible for checking if
the BIOS changed the cpu frequencies behind our back and resynchronize the
cpufreq subsystem's knowledge of the cpu frequencies, and update them
accordingly.
So, restore the call to cpufreq_update_policy() in the resume path to fix
the cpufreq regression.
Reported-and-tested-by: Toralf Förster <toralf.foerster@gmx.de>
Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
When initializing the default powersave_bias value, we need to first
make sure that this policy is running the ondemand governor.
Reported-and-tested-by: Tim Gardner <tim.gardner@canonical.com>
Signed-off-by: Jacob Shin <jacob.shin@amd.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
clk_set_rate() isn't supposed to accept approximate frequencies, instead
a supported frequency should be obtained from clk_round_rate() and then
used to set the clock.
Signed-off-by: Guennadi Liakhovetski <g.liakhovetski@gmx.de>
Acked-by: Shawn Guo <shawn.guo@linaro.org>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Commit 4b31e774 (Always set P-state on initialization) fixed bug
#4634 and caused the driver to always set the target P-State at
least once since the initial P-State may not be the desired one.
Commit 5a1c0228 (cpufreq: Avoid calling cpufreq driver's target()
routine if target_freq == policy->cur) caused a regression in
this behavior.
This fixes the regression by setting policy->cur based on the CPU's
target frequency rather than the CPU's current reported frequency
(which may be different). This means that the P-State will be set
initially if the CPU's target frequency is different from the
governor's target frequency.
This fixes an issue where setting the default governor to
performance wouldn't correctly enable turbo mode on all cores.
Signed-off-by: Ross Lagerwall <rosslagerwall@gmail.com>
Reviewed-by: Len Brown <len.brown@intel.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Cc: 3.8+ <stable@vger.kernel.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Pull power management and ACPI fixes from Rafael Wysocki:
- Additional CPU ID for the intel_pstate driver from Dirk Brandewie.
- More cpufreq fixes related to ARM big.LITTLE support and locking from
Viresh Kumar.
- VIA C7 cpufreq build fix from Rafał Bilski.
- ACPI power management fix making it possible to use device power
states regardless of the CONFIG_PM setting from Rafael J Wysocki.
- New ACPI video blacklist item from Bastian Triller.
* tag 'pm+acpi-3.10-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
ACPI / video: Add "Asus UL30A" to ACPI video detect blacklist
cpufreq: arm_big_little_dt: Instantiate as platform_driver
cpufreq: arm_big_little_dt: Register driver only if DT has valid data
cpufreq / e_powersaver: Fix linker error when ACPI processor is a module
cpufreq / intel_pstate: Add additional supported CPU ID
cpufreq: Drop rwsem lock around CPUFREQ_GOV_POLICY_EXIT
ACPI / PM: Allow device power states to be used for CONFIG_PM unset
As multiplatform build is being adopted by more and more ARM platforms, initcall
function should be used very carefully. For example, when both arm_big_little_dt
and cpufreq-cpu0 drivers are compiled in, arm_big_little_dt driver may try to
register even if we had platform device for cpufreq-cpu0 registered.
To eliminate this undesired the effect, the patch changes arm_big_little_dt
driver to have it instantiated as a platform_driver. Then it will only run on
platforms that create the platform_device "arm-bL-cpufreq-dt".
Reported-and-tested-by: Rob Herring <robherring2@gmail.com>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
If arm_big_little_dt driver is enabled, then it will always try to register with
big LITTLE cpufreq core driver. In case DT doesn't have relevant data for cpu
nodes, i.e. operating points aren't present, then we should exit early and
shouldn't register with big LITTLE cpufreq core driver. Otherwise we will fail
continuously from the driver->init() routine.
This patch fixes this issue.
Reported-and-tested-by: Jon Medhurst <tixy@linaro.org>
Reviewed-by: Jon Medhurst <tixy@linaro.org>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
on i386:
CONFIG_ACPI_PROCESSOR=m
CONFIG_X86_E_POWERSAVER=y
drivers/built-in.o: In function `eps_cpu_init.part.8':
e_powersaver.c:(.text.unlikely+0x2243): undefined reference to `acpi_processor_register_performance'
e_powersaver.c:(.text.unlikely+0x22a2): undefined reference to `acpi_processor_unregister_performance'
e_powersaver.c:(.text.unlikely+0x246b): undefined reference to `acpi_processor_get_bios_limit'
X86_E_POWERSAVER should also depend on ACPI_PROCESSOR.
Signed-off-by: Rafal Bilski <rafalbilski@interia.pl>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
The file permissions of cpufreq per-cpu sysfs files are not preserved
across suspend/resume because we internally go through the CPU
Hotplug path which reinitializes the file permissions on CPU online.
But the user is not supposed to know that we are using CPU hotplug
internally within suspend/resume (IOW, the kernel should not silently
wreck the user-set file permissions across a suspend cycle).
Therefore, we need to preserve the file permissions as they are
across suspend/resume.
The simplest way to achieve that is to just not touch the sysfs files
at all - ie., just ignore the CPU hotplug notifications in the
suspend/resume path (_FROZEN) in the cpufreq hotplug callback.
Reported-by: Robert Jarzmik <robert.jarzmik@intel.com>
Reported-by: Durgadoss R <durgadoss.r@intel.com>
Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
I don't see how the virtual address of the tuners pointer would be of
any help to anyone so remove it.
Signed-off-by: Borislav Petkov <bp@suse.de>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
devm_ioremap_resource does sanity checks on the given resource. No need to
duplicate this in the driver.
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
The driver can no longer be built as a module remove the compile fence
around cpufreq tracing call.
Signed-off-by: Dirk Brandewie <dirk.j.brandewie@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
The ffmpeg benchmark in the phoronix test suite has threads on
multiple cores that rely on the progress on of threads on other cores
and ping pong back and forth fast enough to make the core appear less
busy than it "should" be. If the core has been at minimum p-state for
a while bump the pstate up to kick the core to see if it is in this
ping pong state. If the core is truly idle the p-state will be
reduced at the next sample time. If the core makes more progress it
will send more work to the thread bringing both threads out of the
ping pong scenario and the p-state will be selected normally.
This fixes a performance regression of approximately 30%
Cc: 3.9+ <stable@vger.kernel.org>
Signed-off-by: Dirk Brandewie <dirk.j.brandewie@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
There are two ways that the maximum p-state can be clamped, via a
policy change and via the sysfs file.
The acpi-thermal driver adjusts the p-state policy in response to
thermal events. These changes override the users settings at the
moment.
Use the lowest of the two requested values this ensures that we will
not exceed the requested pstate from either mechanism.
Reported-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Cc: 3.9+ <stable@vger.kernel.org>
Signed-off-by: Dirk Brandewie <dirk.j.brandewie@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Idle time is taken into account in the APERF/MPERF ratio calculation
there is no reason for the driver to track it seperately. This
reduces the work in the driver and makes the code more readable.
Removal of the tracking of sample duration removes the possibility of
the divide by zero exception when the duration is sub 1us
References: https://bugzilla.kernel.org/show_bug.cgi?id=56691
Reported-by: Mike Lothian <mike@fireburn.co.uk>
Cc: 3.9+ <stable@vger.kernel.org>
Signed-off-by: Dirk Brandewie <dirk.j.brandewie@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Kconfig dependecies for ARM SA11xx drivers are incorrect, so fix
them.
[rjw: Changelog]
Signed-off-by: Alexander Shiyan <shc_work@mail.ru>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
This fixes usage of "depends on" and "select" options in Kconfig for ARM big
LITTLE cpufreq driver. Otherwise we get these warnings:
warning: (ARM_DT_BL_CPUFREQ) selects ARM_BIG_LITTLE_CPUFREQ which
has unmet direct dependencies (ARCH_HAS_CPUFREQ && CPU_FREQ && ARM &&
ARM_CPU_TOPOLOGY)
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
With commit 1e4b545, regulator_get will now return -EPROBE_DEFER
when the cpu0-supply node is present, but the regulator is not yet
registered.
It is possible for this to occur when the regulator registration
by itself might be defered due to some dependent interface not yet
instantiated. For example: an regulator which uses I2C and GPIO might
need both systems available before proceeding, in this case, the
regulator might defer it's registration.
However, the cpufreq-cpu0 driver assumes that any un-successful
return result is equivalent of failure.
When the regulator_get returns failure other than -EPROBE_DEFER, it
makes sense to assume that supply node is not present and proceed
with the assumption that only clock control is necessary in the
platform.
With this change, we can now handle the following conditions:
a) cpu0-supply binding is not present, regulator_get will return
appropriate error result, resulting in cpufreq-cpu0 driver
controlling just the clock.
b) cpu0-supply binding is present, regulator_get returns
-EPROBE_DEFER, we retry resulting in cpufreq-cpu0 driver
registering later once the regulator is available.
c) cpu0-supply binding is present, regulator_get returns
-EPROBE_DEFER, however, regulator never registers, we retry until
cpufreq-cpu0 driver fails to register pointing at device tree
information bug. However, in this case, the fact that
cpufreq-cpu0 operates with clock only when the DT binding clearly
indicates need of a supply is a bug of it's own.
d) cpu0-supply gets an regulator at probe - cpufreq-cpu0 driver
controls both the clock and regulator
Signed-off-by: Nishanth Menon <nm@ti.com>
Acked-by: Shawn Guo <shawn.guo@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
We must call __cpufreq_governor(data, CPUFREQ_GOV_POLICY_EXIT) before
calling cpufreq_cpu_put(data), so that policy kobject have valid
fields. Otherwise, removing last online cpu of policy->cpus causes
this crash for ondemand/conservative governor.
[<c00fb076>] (sysfs_find_dirent+0xe/0xa8) from [<c00fb1bd>] (sysfs_get_dirent+0x21/0x58)
[<c00fb1bd>] (sysfs_get_dirent+0x21/0x58) from [<c00fc259>] (sysfs_remove_group+0x85/0xbc)
[<c00fc259>] (sysfs_remove_group+0x85/0xbc) from [<c02faad9>] (cpufreq_governor_dbs+0x369/0x4a0)
[<c02faad9>] (cpufreq_governor_dbs+0x369/0x4a0) from [<c02f66d7>] (__cpufreq_governor+0x2b/0x8c)
[<c02f66d7>] (__cpufreq_governor+0x2b/0x8c) from [<c02f6893>] (__cpufreq_remove_dev.isra.12+0x15b/0x250)
[<c02f6893>] (__cpufreq_remove_dev.isra.12+0x15b/0x250) from [<c03e91c7>] (cpufreq_cpu_callback+0x2f/0x3c)
[<c03e91c7>] (cpufreq_cpu_callback+0x2f/0x3c) from [<c0036fe1>] (notifier_call_chain+0x45/0x54)
[<c0036fe1>] (notifier_call_chain+0x45/0x54) from [<c001e611>] (__cpu_notify+0x1d/0x34)
[<c001e611>] (__cpu_notify+0x1d/0x34) from [<c03e5833>] (_cpu_down+0x63/0x1ac)
[<c03e5833>] (_cpu_down+0x63/0x1ac) from [<c03e5997>] (cpu_down+0x1b/0x30)
[<c03e5997>] (cpu_down+0x1b/0x30) from [<c03e60eb>] (store_online+0x27/0x54)
[<c03e60eb>] (store_online+0x27/0x54) from [<c0295629>] (dev_attr_store+0x11/0x18)
[<c0295629>] (dev_attr_store+0x11/0x18) from [<c00f9edd>] (sysfs_write_file+0xed/0x114)
[<c00f9edd>] (sysfs_write_file+0xed/0x114) from [<c00b42a9>] (vfs_write+0x65/0xd8)
[<c00b42a9>] (vfs_write+0x65/0xd8) from [<c00b4523>] (sys_write+0x2f/0x50)
[<c00b4523>] (sys_write+0x2f/0x50) from [<c000cdc1>] (ret_fast_syscall+0x1/0x52)
Of course this only impacted drivers which have
have_governor_per_policy set to true. i.e. big LITTLE cpufreq driver.
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
There are two types of INIT/EXIT activities that we need to do for
governors:
- Done only once per governor (doesn't depend how many instances of
the governor there are). eg: cpufreq_register_notifier() for
conservative governor.
- Done per governor instance, eg: sysfs_{create|remove}_group().
There were some corner cases where current code isn't able to handle
them separately and so failing for some test cases.
We use two separate variables now for keeping track of above two
requirements.
- governor->initialized for first one
- dbs_data->usage_count for per governor instance
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
The message printed at the end of driver->init() doesn't include the
"cpufreq" string at all and so is difficult to find in dmesg. Add
function name to that message to clearly state where the message is
coming from.
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>