linux

mirror of https://github.com/hardkernel/linux.git synced 2026-06-07 19:30:30 +09:00

Author	SHA1	Message	Date
Archit Taneja	11354dd58d	OMAPDSS/OMAP_VOUT: Fix incorrect OMAP3-alpha compatibility setting On OMAP3, in order to enable alpha blending for LCD and TV managers, we needed to set LCDALPHABLENDERENABLE/TVALPHABLENDERENABLE bits in DISPC_CONFIG. On OMAP4, alpha blending is always enabled by default, if the above bits are set, we switch to an OMAP3 compatibility mode where the zorder values in the pipeline attribute registers are ignored and a fixed priority is configured. Rename the manager_info member "alpha_enabled" to "partial_alpha_enabled" for more clarity. Introduce two dss_features FEAT_ALPHA_FIXED_ZORDER and FEAT_ALPHA_FREE_ZORDER which represent OMAP3-alpha compatibility mode and OMAP4 alpha mode respectively. Introduce an overlay cap for ZORDER. The DSS2 user is expected to check for the ZORDER cap, if an overlay doesn't have this cap, the user is expected to set the parameter partial_alpha_enabled. If the overlay has ZORDER cap, the DSS2 user can assume that alpha blending is already enabled. Don't support OMAP3 compatibility mode for now. Trying to read/write to alpha_blending_enabled sysfs attribute issues a warning for OMAP4 and does not set the LCDALPHABLENDERENABLE/TVALPHABLENDERENABLE bits. Change alpha_enabled to partial_alpha_enabled in the omap_vout driver. Use overlay cap "OMAP_DSS_OVL_CAP_GLOBAL_ALPHA" to check if overlay supports alpha blending or not. Replace this with checks for VIDEO1 pipeline. Cc: linux-media@vger.kernel.org Cc: Lajos Molnar <molnar@ti.com> Signed-off-by: Archit Taneja <archit@ti.com> Acked-by: Vaibhav Hiremath <hvaibhav@ti.com> Signed-off-by: Tomi Valkeinen <tomi.valkeinen@ti.com>	2011-10-03 16:51:54 +03:00
Pierre-Louis Bossart	14bc52b8fe	ALSA: hda/hdmi: expose ELD control Applications may want to read ELD information to understand what codecs are supported on the HDMI receiver and handle the a-v delay for better lip-sync. ELD information is exposed in a device-specific IFACE_PCM kcontrol. Tested both with amixer and PulseAudio; with a corresponding patch passthrough modes are enabled automagically. ELD control size is set to zero in case of errors or wrong configurations. No notifications are implemented for now, it is expected that jack detection is used to reconfigure the audio outputs. Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com> Signed-off-by: Takashi Iwai <tiwai@suse.de>	2011-10-03 15:48:12 +02:00
Marc Zyngier	1e7c5fd294	genirq: percpu: allow interrupt type to be set at enable time As request_percpu_irq() doesn't allow for a percpu interrupt to have its type configured (it is generally impossible to configure it on all CPUs at once), add a 'type' argument to enable_percpu_irq(). This allows some low-level, board specific init code to be switched to a generic API. [ tglx: Added WARN_ON argument ] Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Cc: Abhijeet Dharmapurikar <adharmap@codeaurora.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2011-10-03 15:35:27 +02:00
Marc Zyngier	31d9d9b6d8	genirq: Add support for per-cpu dev_id interrupts The ARM GIC interrupt controller offers per CPU interrupts (PPIs), which are usually used to connect local timers to each core. Each CPU has its own private interface to the GIC, and only sees the PPIs that are directly connect to it. While these timers are separate devices and have a separate interrupt line to a core, they all use the same IRQ number. For these devices, request_irq() is not the right API as it assumes that an IRQ number is visible by a number of CPUs (through the affinity setting), but makes it very awkward to express that an IRQ number can be handled by all CPUs, and yet be a different interrupt line on each CPU, requiring a different dev_id cookie to be passed back to the handler. The _percpu_irq() functions is designed to overcome these limitations, by providing a per-cpu dev_id vector: int request_percpu_irq(unsigned int irq, irq_handler_t handler, const char devname, void __percpu percpu_dev_id); void free_percpu_irq(unsigned int, void __percpu ); int setup_percpu_irq(unsigned int irq, struct irqaction new); void remove_percpu_irq(unsigned int irq, struct irqaction act); void enable_percpu_irq(unsigned int irq); void disable_percpu_irq(unsigned int irq); The API has a number of limitations: - no interrupt sharing - no threading - common handler across all the CPUs Once the interrupt is requested using setup_percpu_irq() or request_percpu_irq(), it must be enabled by each core that wishes its local interrupt to be delivered. Based on an initial patch by Thomas Gleixner. Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Cc: linux-arm-kernel@lists.infradead.org Link: http://lkml.kernel.org/r/1316793788-14500-2-git-send-email-marc.zyngier@arm.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2011-10-03 15:35:26 +02:00
Axel Lin	9b5999b1bc	ASoC: Fix setting update bits for WM8741_DACRMSB_ATTENUATION After checking the code and datasheet, I think what we want in the second snd_soc_update_bits call is to update WM8741_DACRMSB_ATTENUATION register instead of WM8741_DACRLSB_ATTENUATION. Signed-off-by: Axel Lin <axel.lin@gmail.com> Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>	2011-10-03 14:15:40 +01:00
Axel Lin	f6a74ced46	ASoC: txx9: Add __exit_p at necessary place We have __exit annotation for txx9aclc_generic_remove(), thus add __devexit_p to wrap it. Signed-off-by: Axel Lin <axel.lin@gmail.com> Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>	2011-10-03 14:15:40 +01:00
Axel Lin	1c6927f872	ASoC: Staticise ep93xx_ac97_dai Signed-off-by: Axel Lin <axel.lin@gmail.com> Acked-by: Mika Westerberg <mika.westerberg@iki.fi> Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>	2011-10-03 14:15:32 +01:00
Axel Lin	2e4891a5ec	ASoC: Staticise simtec_audio_resume() It is exported via resume callback of struct dev_pm_ops rather than referenced directly and so should be staticised. Signed-off-by: Axel Lin <axel.lin@gmail.com> Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>	2011-10-03 14:15:23 +01:00
Axel Lin	43419b80fa	ASoC: Remove needless codec->dapm.bias_level assignment to SND_SOC_BIAS_OFF This assignment is done by the snd_soc_register_codec so there is no need to redo it in probe function of a codec driver. Signed-off-by: Axel Lin <axel.lin@gmail.com> Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>	2011-10-03 14:15:22 +01:00
Wu Fengguang	b00949aa2d	writeback: per-bdi background threshold One thing puzzled me is that in JBOD case, the per-disk writeout performance is smaller than the corresponding single-disk case even when they have comparable bdi_thresh. Tracing shows find that in single disk case, bdi_writeback is always kept high while in JBOD case, it could drop low from time to time and correspondingly bdi_reclaimable could sometimes rush high. The fix is to watch bdi_reclaimable and kick background writeback as soon as it goes high. This resembles the global background threshold but in per-bdi manner. The trick is, as long as bdi_reclaimable does not go high, bdi_writeback naturally won't go low because bdi_reclaimable+bdi_writeback ~= bdi_thresh. With less fluctuated writeback pages, JBOD performance is observed to increase noticeably in various cases. vmstat:nr_written values before/after patch: 3.1.0-rc4-wo-underrun+ 3.1.0-rc4-bgthresh3+ ------------------------ ------------------------ 125596480 +25.9% 158179363 JBOD-10HDD-16G/ext4-100dd-1M-24p-16384M-20:10-X 61790815 +110.4% 130032231 JBOD-10HDD-16G/ext4-10dd-1M-24p-16384M-20:10-X 58853546 -0.1% 58823828 JBOD-10HDD-16G/ext4-1dd-1M-24p-16384M-20:10-X 110159811 +24.7% 137355377 JBOD-10HDD-16G/xfs-100dd-1M-24p-16384M-20:10-X 69544762 +10.8% 77080047 JBOD-10HDD-16G/xfs-10dd-1M-24p-16384M-20:10-X 50644862 +0.5% 50890006 JBOD-10HDD-16G/xfs-1dd-1M-24p-16384M-20:10-X 42677090 +28.0% 54643527 JBOD-10HDD-thresh=100M/ext4-100dd-1M-24p-16384M-100M:10-X 47491324 +13.3% 53785605 JBOD-10HDD-thresh=100M/ext4-10dd-1M-24p-16384M-100M:10-X 52548986 +0.9% 53001031 JBOD-10HDD-thresh=100M/ext4-1dd-1M-24p-16384M-100M:10-X 26783091 +36.8% 36650248 JBOD-10HDD-thresh=100M/xfs-100dd-1M-24p-16384M-100M:10-X 35526347 +14.0% 40492312 JBOD-10HDD-thresh=100M/xfs-10dd-1M-24p-16384M-100M:10-X 44670723 -1.1% `44177606` JBOD-10HDD-thresh=100M/xfs-1dd-1M-24p-16384M-100M:10-X 127996037 +22.4% 156719990 JBOD-10HDD-thresh=2G/ext4-100dd-1M-24p-16384M-2048M:10-X 57518856 +3.8% 59677625 JBOD-10HDD-thresh=2G/ext4-10dd-1M-24p-16384M-2048M:10-X 51919909 +12.2% 58269894 JBOD-10HDD-thresh=2G/ext4-1dd-1M-24p-16384M-2048M:10-X 86410514 +79.0% 154660433 JBOD-10HDD-thresh=2G/xfs-100dd-1M-24p-16384M-2048M:10-X 40132519 +38.6% 55617893 JBOD-10HDD-thresh=2G/xfs-10dd-1M-24p-16384M-2048M:10-X 48423248 +7.5% 52042927 JBOD-10HDD-thresh=2G/xfs-1dd-1M-24p-16384M-2048M:10-X 206041046 +44.1% 296846536 JBOD-10HDD-thresh=4G/xfs-100dd-1M-24p-16384M-4096M:10-X 72312903 -19.4% 58272885 JBOD-10HDD-thresh=4G/xfs-10dd-1M-24p-16384M-4096M:10-X 50635672 -0.5% 50384787 JBOD-10HDD-thresh=4G/xfs-1dd-1M-24p-16384M-4096M:10-X 68308534 +115.7% 147324758 JBOD-10HDD-thresh=800M/ext4-100dd-1M-24p-16384M-800M:10-X 57882933 +14.5% 66269621 JBOD-10HDD-thresh=800M/ext4-10dd-1M-24p-16384M-800M:10-X 52183472 +12.8% 58855181 JBOD-10HDD-thresh=800M/ext4-1dd-1M-24p-16384M-800M:10-X 53788956 +94.2% 104460352 JBOD-10HDD-thresh=800M/xfs-100dd-1M-24p-16384M-800M:10-X 44493342 +35.5% 60298210 JBOD-10HDD-thresh=800M/xfs-10dd-1M-24p-16384M-800M:10-X 42641209 +18.9% 50681038 JBOD-10HDD-thresh=800M/xfs-1dd-1M-24p-16384M-800M:10-X Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>	2011-10-03 21:08:58 +08:00
Wu Fengguang	8927f66c4e	writeback: dirty position control - bdi reserve area Keep a minimal pool of dirty pages for each bdi, so that the disk IO queues won't underrun. Also gently increase a small bdi_thresh to avoid it stuck in 0 for some light dirtied bdi. It's particularly useful for JBOD and small memory system. It may result in (pos_ratio > 1) at the setpoint and push the dirty pages high. This is more or less intended because the bdi is in the danger of IO queue underflow. Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>	2011-10-03 21:08:58 +08:00
Wu Fengguang	57fc978cfb	writeback: control dirty pause time The dirty pause time shall ultimately be controlled by adjusting nr_dirtied_pause, since there is relationship pause = pages_dirtied / task_ratelimit Assuming pages_dirtied ~= nr_dirtied_pause task_ratelimit ~= dirty_ratelimit We get nr_dirtied_pause ~= dirty_ratelimit * desired_pause Here dirty_ratelimit is preferred over task_ratelimit because it's more stable. It's also important to limit possible large transitional errors: - bw is changing quickly - pages_dirtied << nr_dirtied_pause on entering dirty exceeded area - pages_dirtied >> nr_dirtied_pause on btrfs (to be improved by a separate fix, but still expect non-trivial errors) So we end up using the above formula inside clamp_val(). The best test case for this code is to run 100 "dd bs=4M" tasks on btrfs and check its pause time distribution. Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>	2011-10-03 21:08:58 +08:00
Wu Fengguang	c8462cc9de	writeback: limit max dirty pause time Apply two policies to scale down the max pause time for 1) small number of concurrent dirtiers 2) small memory system (comparing to storage bandwidth) MAX_PAUSE=200ms may only be suitable for high end servers with lots of concurrent dirtiers, where the large pause time can reduce much overheads. Otherwise, smaller pause time is desirable whenever possible, so as to get good responsiveness and smooth user experiences. It's actually required for good disk utilization in the case when all the dirty pages can be synced to disk within MAX_PAUSE=200ms. Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>	2011-10-03 21:08:57 +08:00
Wu Fengguang	143dfe8611	writeback: IO-less balance_dirty_pages() As proposed by Chris, Dave and Jan, don't start foreground writeback IO inside balance_dirty_pages(). Instead, simply let it idle sleep for some time to throttle the dirtying task. In the mean while, kick off the per-bdi flusher thread to do background writeback IO. RATIONALS ========= - disk seeks on concurrent writeback of multiple inodes (Dave Chinner) If every thread doing writes and being throttled start foreground writeback, it leads to N IO submitters from at least N different inodes at the same time, end up with N different sets of IO being issued with potentially zero locality to each other, resulting in much lower elevator sort/merge efficiency and hence we seek the disk all over the place to service the different sets of IO. OTOH, if there is only one submission thread, it doesn't jump between inodes in the same way when congestion clears - it keeps writing to the same inode, resulting in large related chunks of sequential IOs being issued to the disk. This is more efficient than the above foreground writeback because the elevator works better and the disk seeks less. - lock contention and cache bouncing on concurrent IO submitters (Dave Chinner) With this patchset, the fs_mark benchmark on a 12-drive software RAID0 goes from CPU bound to IO bound, freeing "3-4 CPUs worth of spinlock contention". * "CPU usage has dropped by ~55%", "it certainly appears that most of the CPU time saving comes from the removal of contention on the inode_wb_list_lock" (IMHO at least 10% comes from the reduction of cacheline bouncing, because the new code is able to call much less frequently into balance_dirty_pages() and hence access the global page states) * the user space "App overhead" is reduced by 20%, by avoiding the cacheline pollution by the complex writeback code path * "for a ~5% throughput reduction", "the number of write IOs have dropped by ~25%", and the elapsed time reduced from 41:42.17 to 40:53.23. * On a simple test of 100 dd, it reduces the CPU %system time from 30% to 3%, and improves IO throughput from 38MB/s to 42MB/s. - IO size too small for fast arrays and too large for slow USB sticks The write_chunk used by current balance_dirty_pages() cannot be directly set to some large value (eg. 128MB) for better IO efficiency. Because it could lead to more than 1 second user perceivable stalls. Even the current 4MB write size may be too large for slow USB sticks. The fact that balance_dirty_pages() starts IO on itself couples the IO size to wait time, which makes it hard to do suitable IO size while keeping the wait time under control. Now it's possible to increase writeback chunk size proportional to the disk bandwidth. In a simple test of 50 dd's on XFS, 1-HDD, 3GB ram, the larger writeback size dramatically reduces the seek count to 1/10 (far beyond my expectation) and improves the write throughput by 24%. - long block time in balance_dirty_pages() hurts desktop responsiveness Many of us may have the experience: it often takes a couple of seconds or even long time to stop a heavy writing dd/cp/tar command with Ctrl-C or "kill -9". - IO pipeline broken by bumpy write() progress There are a broad class of "loop {read(buf); write(buf);}" applications whose read() pipeline will be under-utilized or even come to a stop if the write()s have long latencies _or_ don't progress in a constant rate. The current threshold based throttling inherently transfers the large low level IO completion fluctuations to bumpy application write()s, and further deteriorates with increasing number of dirtiers and/or bdi's. For example, when doing 50 dd's + 1 remote rsync to an XFS partition, the rsync progresses very bumpy in legacy kernel, and throughput is improved by 67% by this patchset. (plus the larger write chunk size, it will be 93% speedup). The new rate based throttling can support 1000+ dd's with excellent smoothness, low latency and low overheads. For the above reasons, it's much better to do IO-less and low latency pauses in balance_dirty_pages(). Jan Kara, Dave Chinner and me explored the scheme to let balance_dirty_pages() wait for enough writeback IO completions to safeguard the dirty limit. However it's found to have two problems: - in large NUMA systems, the per-cpu counters may have big accounting errors, leading to big throttle wait time and jitters. - NFS may kill large amount of unstable pages with one single COMMIT. Because NFS server serves COMMIT with expensive fsync() IOs, it is desirable to delay and reduce the number of COMMITs. So it's not likely to optimize away such kind of bursty IO completions, and the resulted large (and tiny) stall times in IO completion based throttling. So here is a pause time oriented approach, which tries to control the pause time in each balance_dirty_pages() invocations, by controlling the number of pages dirtied before calling balance_dirty_pages(), for smooth and efficient dirty throttling: - avoid useless (eg. zero pause time) balance_dirty_pages() calls - avoid too small pause time (less than 4ms, which burns CPU power) - avoid too large pause time (more than 200ms, which hurts responsiveness) - avoid big fluctuations of pause times It can control pause times at will. The default policy (in a followup patch) will be to do ~10ms pauses in 1-dd case, and increase to ~100ms in 1000-dd case. BEHAVIOR CHANGE =============== (1) dirty threshold Users will notice that the applications will get throttled once crossing the global (background + dirty)/2=15% threshold, and then balanced around 17.5%. Before patch, the behavior is to just throttle it at 20% dirtyable memory in 1-dd case. Since the task will be soft throttled earlier than before, it may be perceived by end users as performance "slow down" if his application happens to dirty more than 15% dirtyable memory. (2) smoothness/responsiveness Users will notice a more responsive system during heavy writeback. "killall dd" will take effect instantly. Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>	2011-10-03 21:08:57 +08:00
Wu Fengguang	9d823e8f6b	writeback: per task dirty rate limit Add two fields to task_struct. 1) account dirtied pages in the individual tasks, for accuracy 2) per-task balance_dirty_pages() call intervals, for flexibility The balance_dirty_pages() call interval (ie. nr_dirtied_pause) will scale near-sqrt to the safety gap between dirty pages and threshold. The main problem of per-task nr_dirtied is, if 1k+ tasks start dirtying pages at exactly the same time, each task will be assigned a large initial nr_dirtied_pause, so that the dirty threshold will be exceeded long before each task reached its nr_dirtied_pause and hence call balance_dirty_pages(). The solution is to watch for the number of pages dirtied on each CPU in between the calls into balance_dirty_pages(). If it exceeds ratelimit_pages (3% dirty threshold), force call balance_dirty_pages() for a chance to set bdi->dirty_exceeded. In normal situations, this safeguarding condition is not expected to trigger at all. On the sqrt in dirty_poll_interval(): It will serve as an initial guess when dirty pages are still in the freerun area. When dirty pages are floating inside the dirty control scope [freerun, limit], a followup patch will use some refined dirty poll interval to get the desired pause time. thresh-dirty (MB) sqrt 1 16 2 22 4 32 8 45 16 64 32 90 64 128 128 181 256 256 512 362 1024 512 The above table means, given 1MB (or 1GB) gap and the dd tasks polling balance_dirty_pages() on every 16 (or 512) pages, the dirty limit won't be exceeded as long as there are less than 16 (or 512) concurrent dd's. So sqrt naturally leads to less overheads and more safe concurrent tasks for large memory servers, which have large (thresh-freerun) gaps. peter: keep the per-CPU ratelimit for safeguarding the 1k+ tasks case CC: Peter Zijlstra <a.p.zijlstra@chello.nl> Reviewed-by: Andrea Righi <andrea@betterlinux.com> Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>	2011-10-03 21:08:57 +08:00
Wu Fengguang	7381131cbc	writeback: stabilize bdi->dirty_ratelimit There are some imperfections in balanced_dirty_ratelimit. 1) large fluctuations The dirty_rate used for computing balanced_dirty_ratelimit is merely averaged in the past 200ms (very small comparing to the 3s estimation period for write_bw), which makes rather dispersed distribution of balanced_dirty_ratelimit. It's pretty hard to average out the singular points by increasing the estimation period. Considering that the averaging technique will introduce very undesirable time lags, I give it up totally. (btw, the 3s write_bw averaging time lag is much more acceptable because its impact is one-way and therefore won't lead to oscillations.) The more practical way is filtering -- most singular balanced_dirty_ratelimit points can be filtered out by remembering some prev_balanced_rate and prev_prev_balanced_rate. However the more reliable way is to guard balanced_dirty_ratelimit with task_ratelimit. 2) due to truncates and fs redirties, the (write_bw <=> dirty_rate) match could become unbalanced, which may lead to large systematical errors in balanced_dirty_ratelimit. The truncates, due to its possibly bumpy nature, can hardly be compensated smoothly. So let's face it. When some over-estimated balanced_dirty_ratelimit brings dirty_ratelimit high, dirty pages will go higher than the setpoint. task_ratelimit will in turn become lower than dirty_ratelimit. So if we consider both balanced_dirty_ratelimit and task_ratelimit and update dirty_ratelimit only when they are on the same side of dirty_ratelimit, the systematical errors in balanced_dirty_ratelimit won't be able to bring dirty_ratelimit far away. The balanced_dirty_ratelimit estimation may also be inaccurate near @limit or @freerun, however is less an issue. 3) since we ultimately want to - keep the fluctuations of task ratelimit as small as possible - keep the dirty pages around the setpoint as long time as possible the update policy used for (2) also serves the above goals nicely: if for some reason the dirty pages are high (task_ratelimit < dirty_ratelimit), and dirty_ratelimit is low (dirty_ratelimit < balanced_dirty_ratelimit), there is no point to bring up dirty_ratelimit in a hurry only to hurt both the above two goals. So, we make use of task_ratelimit to limit the update of dirty_ratelimit in two ways: 1) avoid changing dirty rate when it's against the position control target (the adjusted rate will slow down the progress of dirty pages going back to setpoint). 2) limit the step size. task_ratelimit is changing values step by step, leaving a consistent trace comparing to the randomly jumping balanced_dirty_ratelimit. task_ratelimit also has the nice smaller errors in stable state and typically larger errors when there are big errors in rate. So it's a pretty good limiting factor for the step size of dirty_ratelimit. Note that bdi->dirty_ratelimit is always tracking balanced_dirty_ratelimit. task_ratelimit is merely used as a limiting factor. Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>	2011-10-03 21:08:57 +08:00
Wu Fengguang	be3ffa2764	writeback: dirty rate control It's all about bdi->dirty_ratelimit, which aims to be (write_bw / N) when there are N dd tasks. On write() syscall, use bdi->dirty_ratelimit ============================================ balance_dirty_pages(pages_dirtied) { task_ratelimit = bdi->dirty_ratelimit * bdi_position_ratio(); pause = pages_dirtied / task_ratelimit; sleep(pause); } On every 200ms, update bdi->dirty_ratelimit =========================================== bdi_update_dirty_ratelimit() { task_ratelimit = bdi->dirty_ratelimit * bdi_position_ratio(); balanced_dirty_ratelimit = task_ratelimit * write_bw / dirty_rate; bdi->dirty_ratelimit = balanced_dirty_ratelimit } Estimation of balanced bdi->dirty_ratelimit =========================================== balanced task_ratelimit ----------------------- balance_dirty_pages() needs to throttle tasks dirtying pages such that the total amount of dirty pages stays below the specified dirty limit in order to avoid memory deadlocks. Furthermore we desire fairness in that tasks get throttled proportionally to the amount of pages they dirty. IOW we want to throttle tasks such that we match the dirty rate to the writeout bandwidth, this yields a stable amount of dirty pages: dirty_rate == write_bw (1) The fairness requirement gives us: task_ratelimit = balanced_dirty_ratelimit == write_bw / N (2) where N is the number of dd tasks. We don't know N beforehand, but still can estimate balanced_dirty_ratelimit within 200ms. Start by throttling each dd task at rate task_ratelimit = task_ratelimit_0 (3) (any non-zero initial value is OK) After 200ms, we measured dirty_rate = # of pages dirtied by all dd's / 200ms write_bw = # of pages written to the disk / 200ms For the aggressive dd dirtiers, the equality holds dirty_rate == N * task_rate == N * task_ratelimit_0 (4) Or task_ratelimit_0 == dirty_rate / N (5) Now we conclude that the balanced task ratelimit can be estimated by write_bw balanced_dirty_ratelimit = task_ratelimit_0 * ---------- (6) dirty_rate Because with (4) and (5) we can get the desired equality (1): write_bw balanced_dirty_ratelimit == (dirty_rate / N) * ---------- dirty_rate == write_bw / N Then using the balanced task ratelimit we can compute task pause times like: task_pause = task->nr_dirtied / task_ratelimit task_ratelimit with position control ------------------------------------ However, while the above gives us means of matching the dirty rate to the writeout bandwidth, it at best provides us with a stable dirty page count (assuming a static system). In order to control the dirty page count such that it is high enough to provide performance, but does not exceed the specified limit we need another control. The dirty position control works by extending (2) to task_ratelimit = balanced_dirty_ratelimit * pos_ratio (7) where pos_ratio is a negative feedback function that subjects to 1) f(setpoint) = 1.0 2) df/dx < 0 That is, if the dirty pages are ABOVE the setpoint, we throttle each task a bit more HEAVY than balanced_dirty_ratelimit, so that the dirty pages are created less fast than they are cleaned, thus DROP to the setpoints (and the reverse). Based on (7) and the assumption that both dirty_ratelimit and pos_ratio remains CONSTANT for the past 200ms, we get task_ratelimit_0 = balanced_dirty_ratelimit * pos_ratio (8) Putting (8) into (6), we get the formula used in bdi_update_dirty_ratelimit(): write_bw balanced_dirty_ratelimit = pos_ratio ---------- (9) dirty_rate Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>	2011-10-03 21:08:56 +08:00
Wu Fengguang	af6a311384	writeback: add bg_threshold parameter to __bdi_update_bandwidth() No behavior change. Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>	2011-10-03 21:08:56 +08:00
Wu Fengguang	6c14ae1e92	writeback: dirty position control bdi_position_ratio() provides a scale factor to bdi->dirty_ratelimit, so that the resulted task rate limit can drive the dirty pages back to the global/bdi setpoints. Old scheme is, \| free run area \| throttle area ----------------------------------------+----------------------------> thresh^ dirty pages New scheme is, ^ task rate limit \| \| * \| * \| * \|[free run] * [smooth throttled] \| * \| * \| * ..bdi->dirty_ratelimit..........* \| . * \| . * \| . * \| . * \| . * +-------------------------------.-----------------------------------> setpoint^ limit^ dirty pages The slope of the bdi control line should be 1) large enough to pull the dirty pages to setpoint reasonably fast 2) small enough to avoid big fluctuations in the resulted pos_ratio and hence task ratelimit Since the fluctuation range of the bdi dirty pages is typically observed to be within 1-second worth of data, the bdi control line's slope is selected to be a linear function of bdi write bandwidth, so that it can adapt to slow/fast storage devices well. Assume the bdi control line pos_ratio = 1.0 + k (dirty - bdi_setpoint) where k is the negative slope. If targeting for 12.5% fluctuation range in pos_ratio when dirty pages are fluctuating in range [bdi_setpoint - write_bw/2, bdi_setpoint + write_bw/2], we get slope k = - 1 / (8 * write_bw) Let pos_ratio(x_intercept) = 0, we get the parameter used in code: x_intercept = bdi_setpoint + 8 * write_bw The global/bdi slopes are nicely complementing each other when the system has only one major bdi (indicated by bdi_thresh ~= thresh): 1) slope of global control line => scaling to the control scope size 2) slope of main bdi control line => scaling to the writeout bandwidth so that - in memory tight systems, (1) becomes strong enough to squeeze dirty pages inside the control scope - in large memory systems where the "gravity" of (1) for pulling the dirty pages to setpoint is too weak, (2) can back (1) up and drive dirty pages to bdi_setpoint ~= setpoint reasonably fast. Unfortunately in JBOD setups, the fluctuation range of bdi threshold is related to memory size due to the interferences between disks. In this case, the bdi slope will be weighted sum of write_bw and bdi_thresh. Given equations span = x_intercept - bdi_setpoint k = df/dx = - 1 / span and the extremum values span = bdi_thresh dx = bdi_thresh we get df = - dx / span = - 1.0 That means, when bdi_dirty deviates bdi_thresh up, pos_ratio and hence task ratelimit will fluctuate by -100%. peter: use 3rd order polynomial for the global control line CC: Peter Zijlstra <a.p.zijlstra@chello.nl> Acked-by: Jan Kara <jack@suse.cz> Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>	2011-10-03 21:08:56 +08:00
Wu Fengguang	c8e28ce049	writeback: account per-bdi accumulated dirtied pages Introduce the BDI_DIRTIED counter. It will be used for estimating the bdi's dirty bandwidth. CC: Jan Kara <jack@suse.cz> CC: Michael Rubin <mrubin@google.com> CC: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>	2011-10-03 21:08:56 +08:00
Nobuhiro Iwamatsu	d762cc290b	HID: Add support MacbookAir 4,1 keyboard Added USB device IDs and keyboard map for MacBookAir 4,1 keyboard. Signed-off-by: Nobuhiro Iwamatsu <iwamatsu@nigauri.org> Signed-off-by: Jiri Kosina <jkosina@suse.cz>	2011-10-03 14:00:05 +02:00
Kalle Valo	62c83ac4d6	ath6kl: include vmalloc.h in debug.c Fixes compilation errors when compiling for ARM: ath6kl/debug.c:312: error: implicit declaration of function 'vmalloc' ath6kl/debug.c:312: warning: assignment makes pointer from integer without a cast ath6kl/debug.c:342: error: implicit declaration of function 'vfree' ath6kl/debug.c:696: warning: assignment makes pointer from integer without a cast ath6kl/debug.c:871: warning: assignment makes pointer from integer without a cast Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>	2011-10-03 13:52:46 +03:00
Dimitris Papastamos	0eef6b0415	regmap: Fix doc comment Signed-off-by: Dimitris Papastamos <dp@opensource.wolfsonmicro.com> Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>	2011-10-03 11:50:53 +01:00
Dimitris Papastamos	c08604b8ae	regmap: Optimize the lookup path to use binary search Since there are more lookups than insertions in a typical scenario, optimize the linear search into a binary search. For this to work, we need to keep reg_defaults sorted upon insertions, for now be lazy and use sort(). Signed-off-by: Dimitris Papastamos <dp@opensource.wolfsonmicro.com> Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>	2011-10-03 11:50:53 +01:00
Florian Westphal	98d9ae841a	netfilter: nf_conntrack: fix event flooding in GRE protocol tracker GRE connections cause ctnetlink event flood because the ASSURED event is set for every packet received. Reported-by: Denys Fedoryshchenko <denys@visp.net.lb> Tested-by: Denys Fedoryshchenko <denys@visp.net.lb> Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2011-10-03 12:43:24 +02:00
Arnd Bergmann	28748782b7	video/omap: fix build dependencies Four of the LCD panel drivers depend on the backlight class, so add the dependency in Kconfig. Selecting the BACKLIGHT_CLASS_DEVICE symbol does not generally work since it has other dependencies. Signed-off-by: Arnd Bergmann <arnd@arndb.de> [tomi.valkeinen@ti.com: changed also N8x0 panel] Signed-off-by: Tomi Valkeinen <tomi.valkeinen@ti.com>	2011-10-03 13:28:14 +03:00
Linus Walleij	b1e3be0647	clocksource: fixup ux500 build problems Based on a patch from Arnd Bergmann this fixes up the build problem of assigning a non-existing global when the ux500 PRCMU timer is not linked in by passing its base address to the init function. We also add a missing <linux/errno.h> inclusion and staticize the dummy function. Cc: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Linus Walleij <linus.walleij@linaro.org>	2011-10-03 09:34:16 +02:00
Linus Torvalds	9b13776977	Merge branch 'for-linus' of git://git.infradead.org/users/sameo/mfd-2.6 * 'for-linus' of git://git.infradead.org/users/sameo/mfd-2.6: mfd: Fix generic irq chip ack function name for jz4740-adc	2011-10-02 19:23:44 -07:00
Linus Torvalds	4edf5886bb	Merge branch 'for-linus' of git://github.com/tiwai/sound * 'for-linus' of git://github.com/tiwai/sound: ALSA: hda - Fix a regression of the position-buffer check	2011-10-02 19:22:44 -07:00
Sachin Kamat	6ca3f8bdf8	ARM: EXYNOS4: Add HDMI support for ORIGEN This patch adds HDMI (TVout) support for ORIGEN board. Signed-off-by: Sachin Kamat <sachin.kamat@linaro.org> Signed-off-by: Kukjin Kim <kgene.kim@samsung.com>	2011-10-03 08:42:22 +09:00
Sachin Kamat	c86cfdd012	ARM: EXYNOS4: Add keypad support for ORIGEN This patch adds keypad support for ORIGEN board as GPIO keys. Signed-off-by: Sachin Kamat <sachin.kamat@linaro.org> Signed-off-by: Kukjin Kim <kgene.kim@samsung.com>	2011-10-03 08:42:22 +09:00
Tushar Behera	cf1dad9d7f	ARM: EXYNOS4: Add support for secondary MMC port on ORIGEN Secondary MMC port on ORIGEN is connected to sdhci instance 0. Support for secondary MMC port is extended by registering sdhci instance 0. Since sdhci instance 2 can contain a bootable media, sdhci instance 0 is registered after instance 2. Signed-off-by: Tushar Behera <tushar.behera@linaro.org> [kgene.kim@samsung.com: Added comments in registering sdhci] Signed-off-by: Kukjin Kim <kgene.kim@samsung.com>	2011-10-03 08:42:22 +09:00
Tushar Behera	92e41efd73	ARM: EXYNOS4: Fix sdhci card detection for ORIGEN Fix incorrect value of cd_type field in platform data for sdhci device. Signed-off-by: Tushar Behera <tushar.behera@linaro.org> Signed-off-by: Kukjin Kim <kgene.kim@samsung.com>	2011-10-03 08:42:22 +09:00
Giridhar Maruthy	9edff0f79f	ARM: EXYNOS4: Add PWM backlight support on ORIGEN This patch adds support for LCD backlight using PWM timer on ORIGEN board. Signed-off-by: Giridhar Maruthy <giridhar.maruthy@linaro.org> Signed-off-by: Sachin Kamat <sachin.kamat@linaro.org> Signed-off-by: Kukjin Kim <kgene.kim@samsung.com>	2011-10-03 08:42:22 +09:00
Sachin Kamat	6f8eb324a3	ARM: EXYNOS4: Add FIMC device on ORIGEN board This patch adds definitions to enable support for s5p-fimc driver on ORIGEN board. Signed-off-by: Sachin Kamat <sachin.kamat@linaro.org> [kgene.kim@samsung.com: re-ordering changes] Signed-off-by: Kukjin Kim <kgene.kim@samsung.com>	2011-10-03 08:42:21 +09:00
Sachin Kamat	24f9e1f314	ARM: EXYNOS4: Add USB EHCI device to ORIGEN board Signed-off-by: Pankaj Dubey <pankaj.dubey@samsung.com> Signed-off-by: Sachin Kamat <sachin.kamat@linaro.org> [kgene.kim@samsung.com: re-ordering changes in Kconfig] Signed-off-by: Kukjin Kim <kgene.kim@samsung.com>	2011-10-03 08:42:21 +09:00
Kukjin Kim	218f82e624	Merge branch 'next-samsung-board-v3.1' into next/topic-exynos4-devel-origen	2011-10-03 08:42:01 +09:00
Arnd Bergmann	ca7aceef21	ASoC: sh: use correct __iomem annotations This removes a few unnecessary type casts and avoids sparse warnings. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>	2011-10-02 21:51:27 +01:00
Arnd Bergmann	b381bc8ab8	ASoC: imx: eukrea_tlv320 needs i2c Add a missing dependency that is required for random configurations. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>	2011-10-02 21:51:19 +01:00
Mark Brown	6010c4c6c8	Merge branch 'for-3.1' into for-3.2 Conflicts: sound/soc/omap/mcpdm.c sound/soc/omap/mcpdm.h	2011-10-02 20:20:35 +01:00
Arnd Bergmann	b5c49d49b9	ASoC: omap_mcpdm_remove cannot be __devexit omap_mcpdm_remove is used from asoc_mcpdm_probe, which is an initcall, and must not be discarded when HOTPLUG is disabled. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>	2011-10-02 20:19:59 +01:00
Axel Lin	4b8713fd54	ASoC: Remove unused srate variable in tegra_spdif_hw_params Signed-off-by: Axel Lin <axel.lin@gmail.com> Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>	2011-10-02 19:58:45 +01:00
Axel Lin	c6add3f0e6	ASoC: Remove unused rate variable in magician_playback_hw_params Signed-off-by: Axel Lin <axel.lin@gmail.com> Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>	2011-10-02 19:58:45 +01:00
Axel Lin	8fccf46905	ASoC: Staticise sh4_ssi_dai Signed-off-by: Axel Lin <axel.lin@gmail.com> Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>	2011-10-02 19:58:45 +01:00
Axel Lin	bc6bd90eb7	ASoC: Staticise samsung_spdif_dai Signed-off-by: Axel Lin <axel.lin@gmail.com> Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>	2011-10-02 19:58:45 +01:00
Axel Lin	019cd3b25a	ASoC: tegra: Staticise tegra_i2s_dai and tegra_spdif_dai Signed-off-by: Axel Lin <axel.lin@gmail.com> Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>	2011-10-02 19:58:44 +01:00
Axel Lin	c4c5839f98	ASoC: samsung: Add __devexit_p at necessary places According to the comments in include/linux/init.h: "Pointers to __devexit functions must use __devexit_p(function_name), the wrapper will insert either the function_name or NULL, depending on the confi options." Signed-off-by: Axel Lin <axel.lin@gmail.com> Cc: Jaswinder Singh <jassi.brar@samsung.com> Cc: Ben Dooks <ben@simtec.co.uk> Cc: Seungwhan Youn <sw.youn@samsung.com> Cc: Jassi Brar <jassisinghbrar@gmail.com> Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>	2011-10-02 19:58:44 +01:00
Axel Lin	48860e409d	ASoC: kirkwood-i2s: Add __devexit_p at necessary place According to the comments in include/linux/init.h: "Pointers to __devexit functions must use __devexit_p(function_name), the wrapper will insert either the function_name or NULL, depending on the config options." We have __devexit annotation for kirkwood_i2s_dev_remove(), thus add __devexit_p at necessary place. Signed-off-by: Axel Lin <axel.lin@gmail.com> Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>	2011-10-02 19:58:44 +01:00
Axel Lin	3f0456bfd7	ASoC: wm8782: Add __devexit_p at necessary place According to the comments in include/linux/init.h: "Pointers to __devexit functions must use __devexit_p(function_name), the wrapper will insert either the function_name or NULL, depending on the config options." We have __devexit annotation for wm8782_remove(), thus add __devexit_p at necessary place. Signed-off-by: Axel Lin <axel.lin@gmail.com> Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>	2011-10-02 19:58:44 +01:00
Mark Brown	ea945ab4d2	Merge branch 'for-3.1' into for-3.2	2011-10-02 19:57:19 +01:00

... 97 98 99 100 101 ...

275188 Commits