linux

mirror of https://github.com/hardkernel/linux.git synced 2026-04-06 13:13:06 +09:00

Author	SHA1	Message	Date
Krzysztof Kozlowski	bddd8ddd9f	drivers/rtc/rtc-at91sam9.c: constify struct regmap_config The regmap_config struct may be const because it is not modified by the driver and regmap_init() accepts pointer to const. Signed-off-by: Krzysztof Kozlowski <k.kozlowski@samsung.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2015-02-13 21:21:43 -08:00
Juergen Borleis	46edeffa1f	drivers/rtc/rtc-imxdi.c: add more known register bits Intended for monitoring and controlling the security features. These bits are required to bring this unit back to live after a security violation event was detected. The code to bring it back to live will follow after a vendor clearance. Signed-off-by: Juergen Borleis <jbe@pengutronix.de> Cc: Alessandro Zummo <a.zummo@towertech.it> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2015-02-13 21:21:43 -08:00
Juergen Borleis	6df17a6577	drivers/rtc/rtc-imxdi.c: trivial clean up code Signed-off-by: Juergen Borleis <jbe@pengutronix.de> Cc: Alessandro Zummo <a.zummo@towertech.it> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2015-02-13 21:21:43 -08:00
Arnaud Ebalard	298ff0122a	rtc: rtc-isl12057: add isil,irq2-can-wakeup-machine property for in-tree users Current in-tree users of ISL12057 RTC chip (NETGEAR ReadyNAS 102, 104 and 2120) do not have the IRQ#2 pin of the chip (associated w/ the Alarm1 mechanism) connected to their SoC, but to a PMIC (TPS65251 FWIW). This specific hardware configuration allows the NAS to wake up when the alarms rings. Recently introduced alarm support for ISL12057 relies on the provision of an "interrupts" property in system .dts file, which previous three users will never get. For that reason, alarm support on those devices is not function. To support this use case, this patch adds a new DT property for ISL12057 (isil,irq2-can-wakeup-machine) to indicate that the chip is capable of waking up the device using its IRQ#2 pin (even though it does not have its IRQ#2 pin connected directly to the SoC). This specific configuration was tested on a ReadyNAS 102 by setting an alarm, powering off the device and see it reboot as expected when the alarm rang w/: # echo `date '+%s' -d '+ 1 minutes'` > /sys/class/rtc/rtc0/wakealarm # shutdown -h now As a side note, the ISL12057 remains in the list of trivial devices, because the property is not per se required by the device to work but can help handle system w/ specific requirements. In exchange, the new feature is described in details in a specific documentation file. Signed-off-by: Arnaud Ebalard <arno@natisbad.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Alessandro Zummo <a.zummo@towertech.it> Cc: Peter Huewe <peter.huewe@infineon.com> Cc: Linus Walleij <linus.walleij@linaro.org> Cc: Thierry Reding <treding@nvidia.com> Cc: Mark Brown <broonie@kernel.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Darshana Padmadas <darshanapadmadas@gmail.com> Cc: Rob Herring <rob.herring@calxeda.com> Cc: Pawel Moll <pawel.moll@arm.com> Cc: Stephen Warren <swarren@wwwdotorg.org> Cc: Ian Campbell <ijc+devicetree@hellion.org.uk> Cc: Grant Likely <grant.likely@linaro.org> Cc: Rob Landley <rob@landley.net> Cc: Jason Cooper <jason@lakedaemon.net> Cc: Guenter Roeck <linux@roeck-us.net> Cc: Jason Gunthorpe <jgunthorpe@obsidianresearch.com> Cc: Kumar Gala <galak@codeaurora.org> Cc: Uwe Kleine-König <uwe@kleine-koenig.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2015-02-13 21:21:42 -08:00
Arnaud Ebalard	fd71493d67	drivers/rtc/rtc-isl12057.c: add alarm support to Intersil ISL12057 RTC driver This patch adds alarm support to Intersil ISL12057 driver. This allows to configure the chip to generate an interrupt when the alarm matches current time value. Alarm can be programmed up to one month in the future and is accurate to the second. The patch was developed to support two different configurations: systems w/ and w/o RTC chip IRQ line connected to the main CPU. The latter is the one found on current 3 kernel users of the chip for which support was initially developed (Netgear ReadyNAS 102, 104 and 2120 NAS). On those devices, the IRQ#2 pin of the chip is not connected to the SoC but to a PMIC. This allows setting an alarm, powering off the device and have it wake up when the alarm rings. To support that configuration the driver does the following: 1. it has alarm_irq_enable() function returns -ENOTTY when no IRQ is passed to the driver. 2. it marks the device as a wakeup source in all cases (whether an IRQ is passed to the driver or not) to have 'wakealarm' sysfs entry created. 3. it marks the device has not supporting UIE mode when no IRQ is passed to the driver (see the commmit message of `c9f5c7e7a8`) This specific configuration was tested on a ReadyNAS 102 by setting an alarm, powering off the device and see it reboot as expected when the alarm rang. The former configuration was tested on a Netgear ReadyNAS 102 after some soldering of the IRQ#2 pin of the RTC chip to a MPP line of the SoC (the one used usually handles the reset button). The test was performed using a modified .dts file reflecting this change (see below) and rtc-test.c program available in Documentation/rtc.txt. This test program ran as expected, which validates alarm supports, including interrupt support. As a side note, the ISL12057 remains in the list of trivial devices, i.e. no specific DT binding being added by this patch: i2c core automatically handles extraction of IRQ line info from .dts file. For instance, if one wants to reference the interrupt line for the alarm in its .dts file, adding interrupt and interrupt-parent properties works as expected: isl12057: isl12057@68 { compatible =3D "isil,isl12057"; interrupt-parent =3D <&gpio0>; interrupts =3D <6 IRQ_TYPE_EDGE_FALLING>; reg =3D <0x68>; }; FWIW, if someone is looking for a way to test alarm support on a system on which the chip IRQ line has the ability to boot the system (e.g. ReadyNAS 102, 104, etc): # echo 0 > /sys/class/rtc/rtc0/wakealarm # echo `date '+%s' -d '+ 1 minutes'` > /sys/class/rtc/rtc0/wakealarm # shutdown -h now With the commands above, after a minute, the system comes back to life. Signed-off-by: Arnaud Ebalard <arno@natisbad.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Alessandro Zummo <a.zummo@towertech.it> Cc: Peter Huewe <peter.huewe@infineon.com> Cc: Linus Walleij <linus.walleij@linaro.org> Cc: Thierry Reding <treding@nvidia.com> Cc: Mark Brown <broonie@kernel.org> Cc: Grant Likely <grant.likely@linaro.org> Cc: Uwe Kleine-König <uwe@kleine-koenig.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2015-02-13 21:21:42 -08:00
Joshua Clayton	3fc70077e6	drivers/rtc/rtc-pcf2123.c: add support for devicetree Add compatible string "nxp,rtc-pcf2123" Document the binding Signed-off-by: Joshua Clayton <stillcompiling@gmail.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Alessandro Zummo <a.zummo@towertech.it> Cc: Grant Likely <grant.likely@linaro.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2015-02-13 21:21:42 -08:00
Andrey Ryabinin	393f203f5f	x86_64: kasan: add interceptors for memset/memmove/memcpy functions Recently instrumentation of builtin functions calls was removed from GCC 5.0. To check the memory accessed by such functions, userspace asan always uses interceptors for them. So now we should do this as well. This patch declares memset/memmove/memcpy as weak symbols. In mm/kasan/kasan.c we have our own implementation of those functions which checks memory before accessing it. Default memset/memmove/memcpy now now always have aliases with '__' prefix. For files that built without kasan instrumentation (e.g. mm/slub.c) original mem* replaced (via #define) with prefixed variants, cause we don't want to check memory accesses there. Signed-off-by: Andrey Ryabinin <a.ryabinin@samsung.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Konstantin Serebryany <kcc@google.com> Cc: Dmitry Chernenkov <dmitryc@google.com> Signed-off-by: Andrey Konovalov <adech.fo@gmail.com> Cc: Yuri Gribov <tetra2005@gmail.com> Cc: Konstantin Khlebnikov <koct9i@gmail.com> Cc: Sasha Levin <sasha.levin@oracle.com> Cc: Christoph Lameter <cl@linux.com> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Christoph Lameter <cl@linux.com> Cc: Pekka Enberg <penberg@kernel.org> Cc: David Rientjes <rientjes@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2015-02-13 21:21:41 -08:00
Andrey Ryabinin	0b24becc81	kasan: add kernel address sanitizer infrastructure Kernel Address sanitizer (KASan) is a dynamic memory error detector. It provides fast and comprehensive solution for finding use-after-free and out-of-bounds bugs. KASAN uses compile-time instrumentation for checking every memory access, therefore GCC > v4.9.2 required. v4.9.2 almost works, but has issues with putting symbol aliases into the wrong section, which breaks kasan instrumentation of globals. This patch only adds infrastructure for kernel address sanitizer. It's not available for use yet. The idea and some code was borrowed from [1]. Basic idea: The main idea of KASAN is to use shadow memory to record whether each byte of memory is safe to access or not, and use compiler's instrumentation to check the shadow memory on each memory access. Address sanitizer uses 1/8 of the memory addressable in kernel for shadow memory and uses direct mapping with a scale and offset to translate a memory address to its corresponding shadow address. Here is function to translate address to corresponding shadow address: unsigned long kasan_mem_to_shadow(unsigned long addr) { return (addr >> KASAN_SHADOW_SCALE_SHIFT) + KASAN_SHADOW_OFFSET; } where KASAN_SHADOW_SCALE_SHIFT = 3. So for every 8 bytes there is one corresponding byte of shadow memory. The following encoding used for each shadow byte: 0 means that all 8 bytes of the corresponding memory region are valid for access; k (1 <= k <= 7) means that the first k bytes are valid for access, and other (8 - k) bytes are not; Any negative value indicates that the entire 8-bytes are inaccessible. Different negative values used to distinguish between different kinds of inaccessible memory (redzones, freed memory) (see mm/kasan/kasan.h). To be able to detect accesses to bad memory we need a special compiler. Such compiler inserts a specific function calls (__asan_load(addr), __asan_store(addr)) before each memory access of size 1, 2, 4, 8 or 16. These functions check whether memory region is valid to access or not by checking corresponding shadow memory. If access is not valid an error printed. Historical background of the address sanitizer from Dmitry Vyukov: "We've developed the set of tools, AddressSanitizer (Asan), ThreadSanitizer and MemorySanitizer, for user space. We actively use them for testing inside of Google (continuous testing, fuzzing, running prod services). To date the tools have found more than 10'000 scary bugs in Chromium, Google internal codebase and various open-source projects (Firefox, OpenSSL, gcc, clang, ffmpeg, MySQL and lots of others): [2] [3] [4]. The tools are part of both gcc and clang compilers. We have not yet done massive testing under the Kernel AddressSanitizer (it's kind of chicken and egg problem, you need it to be upstream to start applying it extensively). To date it has found about 50 bugs. Bugs that we've found in upstream kernel are listed in [5]. We've also found ~20 bugs in out internal version of the kernel. Also people from Samsung and Oracle have found some. [...] As others noted, the main feature of AddressSanitizer is its performance due to inline compiler instrumentation and simple linear shadow memory. User-space Asan has ~2x slowdown on computational programs and ~2x memory consumption increase. Taking into account that kernel usually consumes only small fraction of CPU and memory when running real user-space programs, I would expect that kernel Asan will have ~10-30% slowdown and similar memory consumption increase (when we finish all tuning). I agree that Asan can well replace kmemcheck. We have plans to start working on Kernel MemorySanitizer that finds uses of unitialized memory. Asan+Msan will provide feature-parity with kmemcheck. As others noted, Asan will unlikely replace debug slab and pagealloc that can be enabled at runtime. Asan uses compiler instrumentation, so even if it is disabled, it still incurs visible overheads. Asan technology is easily portable to other architectures. Compiler instrumentation is fully portable. Runtime has some arch-dependent parts like shadow mapping and atomic operation interception. They are relatively easy to port." Comparison with other debugging features: ======================================== KMEMCHECK: - KASan can do almost everything that kmemcheck can. KASan uses compile-time instrumentation, which makes it significantly faster than kmemcheck. The only advantage of kmemcheck over KASan is detection of uninitialized memory reads. Some brief performance testing showed that kasan could be x500-x600 times faster than kmemcheck: $ netperf -l 30 MIGRATED TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to localhost (127.0.0.1) port 0 AF_INET Recv Send Send Socket Socket Message Elapsed Size Size Size Time Throughput bytes bytes bytes secs. 10^6bits/sec no debug: 87380 16384 16384 30.00 41624.72 kasan inline: 87380 16384 16384 30.00 12870.54 kasan outline: 87380 16384 16384 30.00 10586.39 kmemcheck: 87380 16384 16384 30.03 20.23 - Also kmemcheck couldn't work on several CPUs. It always sets number of CPUs to 1. KASan doesn't have such limitation. DEBUG_PAGEALLOC: - KASan is slower than DEBUG_PAGEALLOC, but KASan works on sub-page granularity level, so it able to find more bugs. SLUB_DEBUG (poisoning, redzones): - SLUB_DEBUG has lower overhead than KASan. - SLUB_DEBUG in most cases are not able to detect bad reads, KASan able to detect both reads and writes. - In some cases (e.g. redzone overwritten) SLUB_DEBUG detect bugs only on allocation/freeing of object. KASan catch bugs right before it will happen, so we always know exact place of first bad read/write. [1] https://code.google.com/p/address-sanitizer/wiki/AddressSanitizerForKernel [2] https://code.google.com/p/address-sanitizer/wiki/FoundBugs [3] https://code.google.com/p/thread-sanitizer/wiki/FoundBugs [4] https://code.google.com/p/memory-sanitizer/wiki/FoundBugs [5] https://code.google.com/p/address-sanitizer/wiki/AddressSanitizerForKernel#Trophies Based on work by Andrey Konovalov. Signed-off-by: Andrey Ryabinin <a.ryabinin@samsung.com> Acked-by: Michal Marek <mmarek@suse.cz> Signed-off-by: Andrey Konovalov <adech.fo@gmail.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Konstantin Serebryany <kcc@google.com> Cc: Dmitry Chernenkov <dmitryc@google.com> Cc: Yuri Gribov <tetra2005@gmail.com> Cc: Konstantin Khlebnikov <koct9i@gmail.com> Cc: Sasha Levin <sasha.levin@oracle.com> Cc: Christoph Lameter <cl@linux.com> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Christoph Lameter <cl@linux.com> Cc: Pekka Enberg <penberg@kernel.org> Cc: David Rientjes <rientjes@google.com> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2015-02-13 21:21:40 -08:00
Andrew Morton	0f989f749b	MODULE_DEVICE_TABLE: fix some callsites The patch "module: fix types of device tables aliases" newly requires that invocations of MODULE_DEVICE_TABLE(type, name); come after the definition of `name'. That is reasonable, but some drivers weren't doing this. Fix them. Cc: James Bottomley <James.Bottomley@HansenPartnership.com> Cc: Andrey Ryabinin <a.ryabinin@samsung.com> Cc: David Miller <davem@davemloft.net> Cc: Hans Verkuil <hverkuil@xs4all.nl> Acked-by: Mauro Carvalho Chehab <mchehab@osg.samsung.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2015-02-13 21:21:40 -08:00
Tejun Heo	f799b1a7fb	drivers/base: use %pb[l] to print bitmaps including cpumasks and nodemasks printk and friends can now format bitmaps using '%pb[l]'. cpumask and nodemask also provide cpumask_pr_args() and nodemask_pr_args() respectively which can be used to generate the two printf arguments necessary to format the specified cpu/nodemask. * Line termination only requires one extra space at the end of the buffer. Use PAGE_SIZE - 1 instead of PAGE_SIZE - 2 when formatting. Signed-off-by: Tejun Heo <tj@kernel.org> Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2015-02-13 21:21:38 -08:00
Tejun Heo	125918dbd8	usb: use %pb[l] to print bitmaps including cpumasks and nodemasks printk and friends can now format bitmaps using '%pb[l]'. cpumask and nodemask also provide cpumask_pr_args() and nodemask_pr_args() respectively which can be used to generate the two printf arguments necessary to format the specified cpu/nodemask. * drivers/uwb/drp.c::uwb_drp_handle_alien_drp() was formatting mas.bm into a buffer but never used it. Removed. Signed-off-by: Tejun Heo <tj@kernel.org> Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2015-02-13 21:21:38 -08:00
Tejun Heo	c7badc9017	scsi: use %pb[l] to print bitmaps including cpumasks and nodemasks printk and friends can now format bitmaps using '%pb[l]'. cpumask and nodemask also provide cpumask_pr_args() and nodemask_pr_args() respectively which can be used to generate the two printf arguments necessary to format the specified cpu/nodemask. * map_show()'s return value is too high by one and the function could modify beyond the end of the buffer when the formatted text is long enough. Signed-off-by: Tejun Heo <tj@kernel.org> Cc: "James E.J. Bottomley" <James.Bottomley@HansenPartnership.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2015-02-13 21:21:38 -08:00
Tejun Heo	0b480037e8	input: use %pb[l] to print bitmaps including cpumasks and nodemasks printk and friends can now format bitmaps using '%pb[l]'. cpumask and nodemask also provide cpumask_pr_args() and nodemask_pr_args() respectively which can be used to generate the two printf arguments necessary to format the specified cpu/nodemask. * Line termination only requires one extra space at the end of the buffer. Use PAGE_SIZE - 1 instead of PAGE_SIZE - 2 when formatting. Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Dmitry Torokhov <dmitry.torokhov@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2015-02-13 21:21:38 -08:00
Tejun Heo	898600380c	wireless: use %pb[l] to print bitmaps including cpumasks and nodemasks printk and friends can now format bitmaps using '%pb[l]'. cpumask and nodemask also provide cpumask_pr_args() and nodemask_pr_args() respectively which can be used to generate the two printf arguments necessary to format the specified cpu/nodemask. Signed-off-by: Tejun Heo <tj@kernel.org> Cc: "John W. Linville" <linville@tuxdriver.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2015-02-13 21:21:38 -08:00
Tejun Heo	660e5ec02d	arm: use %pb[l] to print bitmaps including cpumasks and nodemasks printk and friends can now format bitmaps using '%pb[l]'. cpumask and nodemask also provide cpumask_pr_args() and nodemask_pr_args() respectively which can be used to generate the two printf arguments necessary to format the specified cpu/nodemask. * Line termination only requires one extra space at the end of the buffer. Use PAGE_SIZE - 1 instead of PAGE_SIZE - 2 when formatting. Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Russell King <linux@arm.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2015-02-13 21:21:37 -08:00
Tejun Heo	839b268033	tile: use %pb[l] to print bitmaps including cpumasks and nodemasks printk and friends can now format bitmaps using '%pb[l]'. cpumask and nodemask also provide cpumask_pr_args() and nodemask_pr_args() respectively which can be used to generate the two printf arguments necessary to format the specified cpu/nodemask. Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Chris Metcalf <cmetcalf@tilera.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2015-02-13 21:21:37 -08:00
Andrzej Hajda	612936f212	clk: convert clock name allocations to kstrdup_const Clock subsystem frequently performs duplication of strings located in read-only memory section. Replacing kstrdup by kstrdup_const allows to avoid such operations. Signed-off-by: Andrzej Hajda <a.hajda@samsung.com> Cc: Marek Szyprowski <m.szyprowski@samsung.com> Cc: Kyungmin Park <kyungmin.park@samsung.com> Cc: Mike Turquette <mturquette@linaro.org> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Christoph Lameter <cl@linux.com> Cc: Pekka Enberg <penberg@kernel.org> Cc: David Rientjes <rientjes@google.com> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Tejun Heo <tj@kernel.org> Cc: Greg KH <greg@kroah.com> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2015-02-13 21:21:36 -08:00
Nicholas Bellinger	aa04dae454	target: Set LBPWS10 bit in Logical Block Provisioning EVPD This patch sets the missing LBPWS10 bit within spc_emulate_evpd_b2() in order to signal WRITE_SAME (10) w/ UNMAP support, following the existing LBPWS bit to signal WRITE_SAME (16) w/ UNMAP support. Cc: Martin Petersen <martin.petersen@oracle.com> Cc: Christoph Hellwig <hch@lst.de> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>	2015-02-14 02:54:49 +00:00
Nicholas Bellinger	61fdb4acc8	target: Fail UNMAP when emulate_tpu=0 This patch adds a check within sbc_parse_cdb() to fail a UNMAP op, if the backend device has emulate_tpu disabled. Cc: Martin Petersen <martin.petersen@oracle.com> Cc: Christoph Hellwig <hch@lst.de> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>	2015-02-14 02:54:48 +00:00
Nicholas Bellinger	d0a9129555	target: Fail WRITE_SAME w/ UNMAP=1 when emulate_tpws=0 This patch adds a check within sbc_setup_write_same() to fail a WRITE_SAME w/ UNMAP=1 op, if the backend device has emulate_tpws disabled. Cc: Martin Petersen <martin.petersen@oracle.com> Cc: Christoph Hellwig <hch@lst.de> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>	2015-02-14 02:54:40 +00:00
Nicholas Bellinger	fde9f50f80	target: Add sanity checks for DPO/FUA bit usage This patch adds a sbc_check_dpofua() function that performs sanity checks for DPO/FUA command bits. It introduces checks to fail when either bit is set, but the backend device is not advertising support for them. It also moves the existing cmd->se_cmd_flags \|= SCF_FUA assignement into the new helper function. Cc: Christoph Hellwig <hch@lst.de> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>	2015-02-14 02:09:46 +00:00
Nicholas Bellinger	afd73f1b60	target: Perform PROTECT sanity checks for WRITE_SAME This patch adds a call to sbc_check_prot() within sbc_setup_write_same() code to perform the various protection releated sanity checks, including failing if WRPROTECT or RDPROTECT is set for a backend device that has not advertised support for T10-PI. Also, since WRITE_SAME + T10-PI is currently not supported by IBLOCK + FILEIO backends, go ahead and fail if ->execute_write_same() is invoked with a non zero cmd->prot_op. Cc: Martin Petersen <martin.petersen@oracle.com> Cc: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>	2015-02-14 02:09:45 +00:00
Nicholas Bellinger	f7b7c06f38	target: Fail I/O with PROTECT bit when protection is unsupported This patch adds an explicit check for WRPROTECT + RDPROTECT bit usage within sbc_check_prot(), and fails with TCM_INVALID_CDB_FIELD if the backend device does not have protection enabled. Also, update sbc_check_prot() to return sense_reason_t in order to propigate up the correct sense ASQ. Cc: Martin Petersen <martin.petersen@oracle.com> Cc: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>	2015-02-14 02:09:44 +00:00
Nicholas Bellinger	aa179935ed	target: Check for LBA + sectors wrap-around in sbc_parse_cdb This patch adds a check to sbc_parse_cdb() in order to detect when an LBA + sector vs. end-of-device calculation wraps when the LBA is sufficently large enough (eg: 0xFFFFFFFFFFFFFFFF). Cc: Martin Petersen <martin.petersen@oracle.com> Cc: Christoph Hellwig <hch@lst.de> Cc: stable@vger.kernel.org Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>	2015-02-14 02:09:44 +00:00
Nicholas Bellinger	8e575c50a1	target: Add missing WRITE_SAME end-of-device sanity check This patch adds a check to sbc_setup_write_same() to verify the incoming WRITE_SAME LBA + number of blocks does not exceed past the end-of-device. Also check for potential LBA wrap-around as well. Reported-by: Bart Van Assche <bart.vanassche@sandisk.com> Cc: Martin Petersen <martin.petersen@oracle.com> Cc: Christoph Hellwig <hch@lst.de> Cc: stable@vger.kernel.org # 3.8+ Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>	2015-02-14 02:09:35 +00:00
Darrick J. Wong	37527b8692	dm io: reject unsupported DISCARD requests with EOPNOTSUPP I created a dm-raid1 device backed by a device that supports DISCARD and another device that does NOT support DISCARD with the following dm configuration: # echo '0 2048 mirror core 1 512 2 /dev/sda 0 /dev/sdb 0' \| dmsetup create moo # lsblk -D NAME DISC-ALN DISC-GRAN DISC-MAX DISC-ZERO sda 0 4K 1G 0 `-moo (dm-0) 0 4K 1G 0 sdb 0 0B 0B 0 `-moo (dm-0) 0 4K 1G 0 Notice that the mirror device /dev/mapper/moo advertises DISCARD support even though one of the mirror halves doesn't. If I issue a DISCARD request (via fstrim, mount -o discard, or ioctl BLKDISCARD) through the mirror, kmirrord gets stuck in an infinite loop in do_region() when it tries to issue a DISCARD request to sdb. The problem is that when we call do_region() against sdb, num_sectors is set to zero because q->limits.max_discard_sectors is zero. Therefore, "remaining" never decreases and the loop never terminates. To fix this: before entering the loop, check for the combination of REQ_DISCARD and no discard and return -EOPNOTSUPP to avoid hanging up the mirror device. This bug was found by the unfortunate coincidence of pvmove and a discard operation in the RHEL 6.5 kernel; upstream is also affected. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Acked-by: "Martin K. Petersen" <martin.petersen@oracle.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com> Cc: stable@vger.kernel.org	2015-02-13 19:51:09 -05:00
Mikulas Patocka	f2ed51ac64	dm mirror: do not degrade the mirror on discard error It may be possible that a device claims discard support but it rejects discards with -EOPNOTSUPP. It happens when using loopback on ext2/ext3 filesystem driven by the ext4 driver. It may also happen if the underlying devices are moved from one disk on another. If discard error happens, we reject the bio with -EOPNOTSUPP, but we do not degrade the array. This patch fixes failed test shell/lvconvert-repair-transient.sh in the lvm2 testsuite if the testsuite is extracted on an ext2 or ext3 filesystem and it is being driven by the ext4 driver. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com> Cc: stable@vger.kernel.org	2015-02-13 19:50:46 -05:00
Mike Snitzer	145b9006a0	dm space map disk: fix sm_disk_count_is_more_than_one() dm_tm_shadow_block() is the only caller of dm_sm_count_is_more_than_one() which only ever operates on a metadata space-map. So in practice, sm_disk_count_is_more_than_one() isn't actually used (which explains why this bug never amounted to anything). But fix sm_disk_count_is_more_than_one() to properly set *result and return 0. Reported-by: Vivek Goyal <vgoyal@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>	2015-02-13 19:32:58 -05:00
Rafael J. Wysocki	3810631332	PM / sleep: Re-implement suspend-to-idle handling In preparation for adding support for quiescing timers in the final stage of suspend-to-idle transitions, rework the freeze_enter() function making the system wait on a wakeup event, the freeze_wake() function terminating the suspend-to-idle loop and the mechanism by which deep idle states are entered during suspend-to-idle. First of all, introduce a simple state machine for suspend-to-idle and make the code in question use it. Second, prevent freeze_enter() from losing wakeup events due to race conditions and ensure that the number of online CPUs won't change while it is being executed. In addition to that, make it force all of the CPUs re-enter the idle loop in case they are in idle states already (so they can enter deeper idle states if possible). Next, drop cpuidle_use_deepest_state() and replace use_deepest_state checks in cpuidle_select() and cpuidle_reflect() with a single suspend-to-idle state check in cpuidle_idle_call(). Finally, introduce cpuidle_enter_freeze() that will simply find the deepest idle state available to the given CPU and enter it using cpuidle_enter(). Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>	2015-02-13 23:49:36 +01:00
Linus Torvalds	18320f2a68	Merge tag 'pm+acpi-3.20-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull more ACPI and power management updates from Rafael Wysocki: "These are two reverts related to system suspend breakage by one of a recent commits, a fix for a recently introduced bug in devfreq and a bunch of other things that didn't make it into my previous pull request, but otherwise are ready to go. Specifics: - Revert two ACPI EC driver commits, one that broke system suspend on Acer Aspire S5 and one that depends on it (Rafael J Wysocki). - Fix a typo leading to an incorrect check in the exynos-ppmu devfreq driver (Dan Carpenter). - Add support for one more Broadwell CPU model to intel_idle (Len Brown). - Fix an obscure problem with state transitions related to interrupts in the speedstep-smi cpufreq driver (Mikulas Patocka). - Remove some unnecessary messages related to the "out of memory" condition from the core PM code (Quentin Lambert). - Update turbostat parameters and documentation, add support for one more Broadwell CPU model to it and modify it to skip printing disabled package C-states (Len Brown)" * tag 'pm+acpi-3.20-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: PM / devfreq: event: testing the wrong variable cpufreq: speedstep-smi: enable interrupts when waiting PM / OPP / clk: Remove unnecessary OOM message Revert "ACPI / EC: Add query flushing support" Revert "ACPI / EC: Add GPE reference counting debugging messages" tools/power turbostat: support additional Broadwell model intel_idle: support additional Broadwell model tools/power turbostat: update parameters, documentation tools/power turbostat: Skip printing disabled package C-states	2015-02-13 13:45:57 -08:00
Rafael J. Wysocki	c7fb90dfbe	Merge branches 'pm-cpufreq', 'pm-cpuidle', 'pm-devfreq', 'pm-opp' and 'pm-tools' * pm-cpufreq: cpufreq: speedstep-smi: enable interrupts when waiting * pm-cpuidle: intel_idle: support additional Broadwell model * pm-devfreq: PM / devfreq: event: testing the wrong variable * pm-opp: PM / OPP / clk: Remove unnecessary OOM message * pm-tools: tools/power turbostat: support additional Broadwell model tools/power turbostat: update parameters, documentation tools/power turbostat: Skip printing disabled package C-states	2015-02-13 21:39:06 +01:00
Rafael J. Wysocki	69bf75e9ae	Merge branch 'acpi-ec' * acpi-ec: Revert "ACPI / EC: Add query flushing support" Revert "ACPI / EC: Add GPE reference counting debugging messages"	2015-02-13 21:38:20 +01:00
Roi Dayan	c6c95ef4ce	IB/iser: Use correct dma direction when unmapping SGs We always unmap SGs with the same direction instead of unmapping with the direction the mapping was done, fix that. Fixes: `9a8b08fad2` ("IB/iser: Generalize iser_unmap_task_data and [...]") Signed-off-by: Roi Dayan <roid@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2015-02-13 11:27:31 -08:00
Rickard Strandqvist	d6522223e4	IB/ipath: Remove unused function in ipath_wc_ppc64 Remove the function ipath_unordered_wc() that is not used anywhere. This was partially found by using a static code analysis program called cppcheck. Signed-off-by: Rickard Strandqvist <rickard_strandqvist@spectrumdigital.se> Acked-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2015-02-13 11:17:29 -08:00
Hariprasad S	c62e689631	RDMA/cxgb4: Serialize CQ event upcalls with CQ destruction A race exists where the application can be destroying the CQ concurrently with a HW interrupt indicating a completion has been inserted into the CQ. This can cause an event notification upcall to the application after the CQ has been destroyed. The solution is to serialize looking up the CQ in the IDR table and referencing the CQ in c4iw_ev_handler() with removing the CQID from the IDR table and blocking until the refcnt reaches 0 in c4iw_destroy_cq(). Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2015-02-13 11:13:16 -08:00
Linus Torvalds	db3ecdee1c	Merge branch 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/cooloney/linux-leds Pull LED subsystem update from Bryan Wu: "The big change of LED subsystem is introducing a new LED class for Flash type LEDs which will be used for V4L2 subsystem. Also we got some cleanup and fixes" * 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/cooloney/linux-leds: leds: leds-gpio: Pass on error codes unmodified DT: leds: Add led-sources property leds: Add LED Flash class extension to the LED subsystem leds: leds-mc13783: Use of_get_child_by_name() instead of refcount hack leds: Use setup_timer leds: Don't allow brightness values greater than max_brightness DT: leds: Add flash LED devices related properties	2015-02-13 10:54:44 -08:00
Linus Torvalds	b9085bcbf5	Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm Pull KVM update from Paolo Bonzini: "Fairly small update, but there are some interesting new features. Common: Optional support for adding a small amount of polling on each HLT instruction executed in the guest (or equivalent for other architectures). This can improve latency up to 50% on some scenarios (e.g. O_DSYNC writes or TCP_RR netperf tests). This also has to be enabled manually for now, but the plan is to auto-tune this in the future. ARM/ARM64: The highlights are support for GICv3 emulation and dirty page tracking s390: Several optimizations and bugfixes. Also a first: a feature exposed by KVM (UUID and long guest name in /proc/sysinfo) before it is available in IBM's hypervisor! :) MIPS: Bugfixes. x86: Support for PML (page modification logging, a new feature in Broadwell Xeons that speeds up dirty page tracking), nested virtualization improvements (nested APICv---a nice optimization), usual round of emulation fixes. There is also a new option to reduce latency of the TSC deadline timer in the guest; this needs to be tuned manually. Some commits are common between this pull and Catalin's; I see you have already included his tree. Powerpc: Nothing yet. The KVM/PPC changes will come in through the PPC maintainers, because I haven't received them yet and I might end up being offline for some part of next week" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (130 commits) KVM: ia64: drop kvm.h from installed user headers KVM: x86: fix build with !CONFIG_SMP KVM: x86: emulate: correct page fault error code for NoWrite instructions KVM: Disable compat ioctl for s390 KVM: s390: add cpu model support KVM: s390: use facilities and cpu_id per KVM KVM: s390/CPACF: Choose crypto control block format s390/kernel: Update /proc/sysinfo file with Extended Name and UUID KVM: s390: reenable LPP facility KVM: s390: floating irqs: fix user triggerable endless loop kvm: add halt_poll_ns module parameter kvm: remove KVM_MMIO_SIZE KVM: MIPS: Don't leak FPU/DSP to guest KVM: MIPS: Disable HTW while in guest KVM: nVMX: Enable nested posted interrupt processing KVM: nVMX: Enable nested virtual interrupt delivery KVM: nVMX: Enable nested apic register virtualization KVM: nVMX: Make nested control MSRs per-cpu KVM: nVMX: Enable nested virtualize x2apic mode KVM: nVMX: Prepare for using hardware MSR bitmap ...	2015-02-13 09:55:09 -08:00
Aleksander Morgado	0416605548	hso: fix rx parsing logic when skb allocation fails If skb allocation fails once the IP header has been received, the rx state is being set to WAIT_SYNC. The logic, though, shouldn't directly return, as the buffer may contain a full packet, and therefore the WAIT_SYNC state needs to be processed (resetting state to WAIT_IP, clearing rx_buf_size and re-initializing rx_buf_missing). So, just let the while loop continue so that in the next iteration the WAIT_SYNC state cleanly stops the loop. The WAIT_SYNC processing will be done just after that, only if the end of packet is flagged. Signed-off-by: Aleksander Morgado <aleksander@aleksander.es> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-02-13 07:17:05 -08:00
Alex Deucher	09b6e85fc8	drm/radeon: fix voltage setup on hawaii Missing parameter when fetching the real voltage values from atom. Fixes problems with dynamic clocking on certain boards. bug: https://bugs.freedesktop.org/show_bug.cgi?id=87457 Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org	2015-02-13 10:03:48 -05:00
Alex Deucher	66c2b84ba6	drm/radeon/dp: Set EDP_CONFIGURATION_SET for bridge chips if necessary Don't restrict it to just eDP panels. Some LVDS bridge chips require this. Fixes blank panels on resume on certain laptops. Noticed by mrnuke on IRC. bug: https://bugs.freedesktop.org/show_bug.cgi?id=42960 Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org	2015-02-13 10:03:33 -05:00
Dave Airlie	ab07881a2a	Merge branch 'exynos-drm-next' of git://git.kernel.org/pub/scm/linux/kernel/git/daeinki/drm-exynos into drm-next Summary: - Add code cleanups and bug fixups. - Add a new display controller dirver, DECON which is a new display controller of Exynos7 SoC. This device is much different from FIMD of Exynos4 and Exynos4 SoC series. * 'exynos-drm-next' of git://git.kernel.org/pub/scm/linux/kernel/git/daeinki/drm-exynos: drm/exynos: Add DECON driver drm/exynos: fix NULL pointer reference drm/exynos: remove exynos_plane_dpms drm/exynos: remove mode property of exynos crtc drm/exynos: Remove exynos_plane_dpms() call with no effect drm/exynos: fix DMA_ATTR_NO_KERNEL_MAPPING usage drm/exynos: hdmi: replace fb size with mode size from win commit drm/exynos: fix no hdmi output drm/exynos: use driver internal struct drm/exynos: fix wrong pipe calculation for crtc drm/exynos: remove to use unnecessary MODULE_xxx macro drm/exynos: remove DRM_EXYNOS_DMABUF config drm/exynos: IOMMU support should not be selectable by user drm/exynos: add support for 'hdmi' clock	2015-02-13 13:02:49 +10:00
Linus Torvalds	818099574b	Merge branch 'akpm' (patches from Andrew) Merge third set of updates from Andrew Morton: - the rest of MM [ This includes getting rid of the numa hinting bits, in favor of just generic protnone logic. Yay. - Linus ] - core kernel - procfs - some of lib/ (lots of lib/ material this time) * emailed patches from Andrew Morton <akpm@linux-foundation.org>: (104 commits) lib/lcm.c: replace include lib/percpu_ida.c: remove redundant includes lib/strncpy_from_user.c: replace module.h include lib/stmp_device.c: replace module.h include lib/sort.c: move include inside #if 0 lib/show_mem.c: remove redundant include lib/radix-tree.c: change to simpler include lib/plist.c: remove redundant include lib/nlattr.c: remove redundant include lib/kobject_uevent.c: remove redundant include lib/llist.c: remove redundant include lib/md5.c: simplify include lib/list_sort.c: rearrange includes lib/genalloc.c: remove redundant include lib/idr.c: remove redundant include lib/halfmd4.c: simplify includes lib/dynamic_queue_limits.c: simplify includes lib/sort.c: use simpler includes lib/interval_tree.c: simplify includes hexdump: make it return number of bytes placed in buffer ...	2015-02-12 18:54:28 -08:00
Rasmus Villemoes	02f1f2170d	kernel.h: remove ancient __FUNCTION__ hack __FUNCTION__ hasn't been treated as a string literal since gcc 3.4, so this only helps people who only test-compile using 3.3 (compiler-gcc3.h barks at anything older than that). Besides, there are almost no occurrences of __FUNCTION__ left in the tree. [akpm@linux-foundation.org: convert remaining __FUNCTION__ references] Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk> Cc: Michal Nazarewicz <mina86@mina86.com> Cc: Joe Perches <joe@perches.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2015-02-12 18:54:13 -08:00
Ganesh Mahendran	3eba0c6a56	mm/zpool: add name argument to create zpool Currently the underlay of zpool: zsmalloc/zbud, do not know who creates them. There is not a method to let zsmalloc/zbud find which caller they belong to. Now we want to add statistics collection in zsmalloc. We need to name the debugfs dir for each pool created. The way suggested by Minchan Kim is to use a name passed by caller(such as zram) to create the zsmalloc pool. /sys/kernel/debug/zsmalloc/zram0 This patch adds an argument `name' to zs_create_pool() and other related functions. Signed-off-by: Ganesh Mahendran <opensource.ganesh@gmail.com> Acked-by: Minchan Kim <minchan@kernel.org> Cc: Seth Jennings <sjennings@variantweb.net> Cc: Nitin Gupta <ngupta@vflare.org> Cc: Dan Streetman <ddstreet@ieee.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2015-02-12 18:54:12 -08:00
Sergey Senozhatsky	ee98016010	zram: remove request_queue from struct zram `struct zram' contains both `struct gendisk' and `struct request_queue'. the latter can be deleted, because zram->disk carries ->queue pointer, and ->queue carries zram pointer: create_device() zram->queue->queuedata = zram zram->disk->queue = zram->queue zram->disk->private_data = zram so zram->queue is not needed, we can access all necessary data anyway. Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> Cc: Minchan Kim <minchan@kernel.org> Cc: Jerome Marchand <jmarchan@redhat.com> Cc: Nitin Gupta <ngupta@vflare.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2015-02-12 18:54:12 -08:00
Minchan Kim	08eee69fcf	zram: remove init_lock in zram_make_request Admin could reset zram during I/O operation going on so we have used zram->init_lock as read-side lock in I/O path to prevent sudden zram meta freeing. However, the init_lock is really troublesome. We can't do call zram_meta_alloc under init_lock due to lockdep splat because zram_rw_page is one of the function under reclaim path and hold it as read_lock while other places in process context hold it as write_lock. So, we have used allocation out of the lock to avoid lockdep warn but it's not good for readability and fainally, I met another lockdep splat between init_lock and cpu_hotplug from kmem_cache_destroy during working zsmalloc compaction. :( Yes, the ideal is to remove horrible init_lock of zram in rw path. This patch removes it in rw path and instead, add atomic refcount for meta lifetime management and completion to free meta in process context. It's important to free meta in process context because some of resource destruction needs mutex lock, which could be held if we releases the resource in reclaim context so it's deadlock, again. As a bonus, we could remove init_done check in rw path because zram_meta_get will do a role for it, instead. Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> Signed-off-by: Minchan Kim <minchan@kernel.org> Cc: Nitin Gupta <ngupta@vflare.org> Cc: Jerome Marchand <jmarchan@redhat.com> Cc: Ganesh Mahendran <opensource.ganesh@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2015-02-12 18:54:12 -08:00
Minchan Kim	2b269ce6fc	zram: check bd_openers instead of bd_holders bd_holders is increased only when user open the device file as FMODE_EXCL so if something opens zram0 as !FMODE_EXCL and request I/O while another user reset zram0, we can see following warning. zram0: detected capacity change from 0 to 64424509440 Buffer I/O error on dev zram0, logical block 180823, lost async page write Buffer I/O error on dev zram0, logical block 180824, lost async page write Buffer I/O error on dev zram0, logical block 180825, lost async page write Buffer I/O error on dev zram0, logical block 180826, lost async page write Buffer I/O error on dev zram0, logical block 180827, lost async page write Buffer I/O error on dev zram0, logical block 180828, lost async page write Buffer I/O error on dev zram0, logical block 180829, lost async page write Buffer I/O error on dev zram0, logical block 180830, lost async page write Buffer I/O error on dev zram0, logical block 180831, lost async page write Buffer I/O error on dev zram0, logical block 180832, lost async page write ------------[ cut here ]------------ WARNING: CPU: 11 PID: 1996 at fs/block_dev.c:57 __blkdev_put+0x1d7/0x210() Modules linked in: CPU: 11 PID: 1996 Comm: dd Not tainted 3.19.0-rc6-next-20150202+ #1125 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011 Call Trace: dump_stack+0x45/0x57 warn_slowpath_common+0x8a/0xc0 warn_slowpath_null+0x1a/0x20 __blkdev_put+0x1d7/0x210 blkdev_put+0x50/0x130 blkdev_close+0x25/0x30 __fput+0xdf/0x1e0 ____fput+0xe/0x10 task_work_run+0xa7/0xe0 do_notify_resume+0x49/0x60 int_signal+0x12/0x17 ---[ end trace 274fbbc5664827d2 ]--- The warning comes from bdev_write_node in blkdev_put path. static void bdev_write_inode(struct inode *inode) { spin_lock(&inode->i_lock); while (inode->i_state & I_DIRTY) { spin_unlock(&inode->i_lock); WARN_ON_ONCE(write_inode_now(inode, true)); <========= here. spin_lock(&inode->i_lock); } spin_unlock(&inode->i_lock); } The reason is dd process encounters I/O fails due to sudden block device disappear so in filemap_check_errors in __writeback_single_inode returns -EIO. If we check bd_openers instead of bd_holders, we could address the problem. When I see the brd, it already have used it rather than bd_holders so although I'm not a expert of block layer, it seems to be better. I can make following warning with below simple script. In addition, I added msleep(2000) below set_capacity(zram->disk, 0) after applying your patch to make window huge(Kudos to Ganesh!) script: echo $((60<<30)) > /sys/block/zram0/disksize setsid dd if=/dev/zero of=/dev/zram0 & sleep 1 setsid echo 1 > /sys/block/zram0/reset Signed-off-by: Minchan Kim <minchan@kernel.org> Acked-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> Cc: Nitin Gupta <ngupta@vflare.org> Cc: Jerome Marchand <jmarchan@redhat.com> Cc: Ganesh Mahendran <opensource.ganesh@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2015-02-12 18:54:11 -08:00
Sergey Senozhatsky	a096cafc31	zram: rework reset and destroy path We need to return set_capacity(disk, 0) from reset_store() back to zram_reset_device(), a catch by Ganesh Mahendran. Potentially, we can race set_capacity() calls from init and reset paths. The problem is that zram_reset_device() is also getting called from zram_exit(), which performs operations in misleading reversed order -- we first create_device() and then init it, while zram_exit() perform destroy_device() first and then does zram_reset_device(). This is done to remove sysfs group before we reset device, so we can continue with device reset/destruction not being raced by sysfs attr write (f.e. disksize). Apart from that, destroy_device() releases zram->disk (but we still have ->disk pointer), so we cannot acces zram->disk in later zram_reset_device() call, which may cause additional errors in the future. So, this patch rework and cleanup destroy path. 1) remove several unneeded goto labels in zram_init() 2) factor out zram_init() error path and zram_exit() into destroy_devices() function, which takes the number of devices to destroy as its argument. 3) remove sysfs group in destroy_devices() first, so we can reorder operations -- reset device (as expected) goes before disk destroy and queue cleanup. So we can always access ->disk in zram_reset_device(). 4) and, finally, return set_capacity() back under ->init_lock. [akpm@linux-foundation.org: tweak comment] Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> Reported-by: Ganesh Mahendran <opensource.ganesh@gmail.com> Cc: Minchan Kim <minchan@kernel.org> Cc: Jerome Marchand <jmarchan@redhat.com> Cc: Nitin Gupta <ngupta@vflare.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2015-02-12 18:54:11 -08:00
Sergey Senozhatsky	ba6b17d68c	zram: fix umount-reset_store-mount race condition Ganesh Mahendran was the first one who proposed to use bdev->bd_mutex to avoid ->bd_holders race condition: CPU0 CPU1 umount /* zram->init_done is true */ reset_store() bdev->bd_holders == 0 mount ... zram_make_request() zram_reset_device() However, his solution required some considerable amount of code movement, which we can avoid. Apart from using bdev->bd_mutex in reset_store(), this patch also simplifies zram_reset_device(). zram_reset_device() has a bool parameter reset_capacity which tells it whether disk capacity and itself disk should be reset. There are two zram_reset_device() callers: -- zram_exit() passes reset_capacity=false -- reset_store() passes reset_capacity=true So we can move reset_capacity-sensitive work out of zram_reset_device() and perform it unconditionally in reset_store(). This also lets us drop reset_capacity parameter from zram_reset_device() and pass zram pointer only. Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> Reported-by: Ganesh Mahendran <opensource.ganesh@gmail.com> Cc: Minchan Kim <minchan@kernel.org> Cc: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2015-02-12 18:54:11 -08:00
Ganesh Mahendran	1fec117281	zram: free meta table in zram_meta_free zram_meta_alloc() and zram_meta_free() are a pair. In zram_meta_alloc(), meta table is allocated. So it it better to free it in zram_meta_free(). Signed-off-by: Ganesh Mahendran <opensource.ganesh@gmail.com> Acked-by: Minchan Kim <minchan@kernel.org> Acked-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> Cc: Nitin Gupta <ngupta@vflare.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2015-02-12 18:54:11 -08:00

... 17 18 19 20 21 ...

256091 Commits