commit 42dd037c08 upstream.
Doing rbd_obj_request_put() in rbd_img_request_fill() error paths is
not only insufficient, but also triggers an rbd_assert() in
rbd_obj_request_destroy():
Assertion failure in rbd_obj_request_destroy() at line 1867:
rbd_assert(obj_request->img_request == NULL);
rbd_img_obj_request_add() adds obj_requests to the img_request, the
opposite is rbd_img_obj_request_del(). Use it.
Fixes: http://tracker.ceph.com/issues/7327
Signed-off-by: Ilya Dryomov <ilya.dryomov@inktank.com>
Reviewed-by: Alex Elder <elder@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 2145e15e05 upstream.
Do not leak kernel-only floppy_raw_cmd structure members to userspace.
This includes the linked-list pointer and the pointer to the allocated
DMA space.
Signed-off-by: Matthew Daley <mattd@bugfuzz.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit ef87dbe761 upstream.
Always clear out these floppy_raw_cmd struct members after copying the
entire structure from userspace so that the in-kernel version is always
valid and never left in an interdeterminate state.
Signed-off-by: Matthew Daley <mattd@bugfuzz.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 1044b1bb92 upstream.
We need to set the queue bounce limit during the device initialization to
prevent excessive bouncing on 32 bit architectures.
Signed-off-by: Felipe Franciosi <felipe@paradoxo.org>
Signed-off-by: Jens Axboe <axboe@fb.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 668f9abbd4 upstream.
Commit bf6bddf192 ("mm: introduce compaction and migration for
ballooned pages") introduces page_count(page) into memory compaction
which dereferences page->first_page if PageTail(page).
This results in a very rare NULL pointer dereference on the
aforementioned page_count(page). Indeed, anything that does
compound_head(), including page_count() is susceptible to racing with
prep_compound_page() and seeing a NULL or dangling page->first_page
pointer.
This patch uses Andrea's implementation of compound_trans_head() that
deals with such a race and makes it the default compound_head()
implementation. This includes a read memory barrier that ensures that
if PageTail(head) is true that we return a head page that is neither
NULL nor dangling. The patch then adds a store memory barrier to
prep_compound_page() to ensure page->first_page is set.
This is the safest way to ensure we see the head page that we are
expecting, PageTail(page) is already in the unlikely() path and the
memory barriers are unfortunately required.
Hugetlbfs is the exception, we don't enforce a store memory barrier
during init since no race is possible.
Signed-off-by: David Rientjes <rientjes@google.com>
Cc: Holger Kiehl <Holger.Kiehl@dwd.de>
Cc: Christoph Lameter <cl@linux.com>
Cc: Rafael Aquini <aquini@redhat.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 3661371701 upstream.
Backend drivers shouldn't transistion to CLOSED unless the frontend is
CLOSED. If a backend does transition to CLOSED too soon then the
frontend may not see the CLOSING state and will not properly shutdown.
So, treat an unexpected backend CLOSED state the same as CLOSING.
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Acked-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit da6a6b6397 upstream.
rbd_snap_name() calls rbd_dev_v{1,2}_snap_name() depending on the
format of the image. The format 1 version returns NULL on error, which
is handled by the caller. The format 2 version returns an ERR_PTR,
which the caller of rbd_snap_name() does not expect.
Fortunately this is unlikely to occur in practice because
rbd_snap_id_by_name() is called before rbd_snap_name(). This would hit
similar errors to rbd_snap_name() (like the snapshot not existing) and
return early, so rbd_snap_name() would not hit an error unless the
snapshot was removed between the two calls or memory was exhausted.
Use an ERR_PTR in rbd_dev_v1_snap_name() so that the specific error
can be propagated, and it is consistent with rbd_dev_v2_snap_name().
Handle the ERR_PTR in the only rbd_snap_name() caller.
Suggested-by: Alex Elder <alex.elder@linaro.org>
Signed-off-by: Josh Durgin <josh.durgin@inktank.com>
Reviewed-by: Alex Elder <elder@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit efadc98aab upstream.
This prevents erroring out while adding a device when a snapshot
unrelated to the current mapping is deleted between reading the
snapshot context and reading the snapshot names. If the mapped
snapshot name is not found an error still occurs as usual.
Signed-off-by: Josh Durgin <josh.durgin@inktank.com>
Reviewed-by: Alex Elder <elder@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 9875201e10 upstream.
Removing a device deallocates the disk, unschedules the watch, and
finally cleans up the rbd_dev structure. rbd_dev_refresh(), called
from the watch callback, updates the disk size and rbd_dev
structure. With no locking between them, rbd_dev_refresh() may use the
device or rbd_dev after they've been freed.
To fix this, check whether RBD_DEV_FLAG_REMOVING is set before
updating the disk size in rbd_dev_refresh(). In order to prevent a
race where rbd_dev_refresh() is already revalidating the disk when
rbd_remove() is called, move the call to rbd_bus_del_dev() after the
watch is unregistered and all notifies are complete. It's safe to
defer deleting this structure because no new requests can be submitted
once the RBD_DEV_FLAG_REMOVING is set, since the device cannot be
opened.
Fixes: http://tracker.ceph.com/issues/5636
Signed-off-by: Josh Durgin <josh.durgin@inktank.com>
Reviewed-by: Alex Elder <elder@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 20e0af67ce upstream.
The only user of rbd_obj_notify_ack() is rbd_watch_cb(). It used
asynchronously with no tracking of when the notify ack completes, so
it may still be in progress when the osd_client is shut down. This
results in a BUG() since the osd client assumes no requests are in
flight when it stops. Since all notifies are flushed before the
osd_client is stopped, waiting for the notify ack to complete before
returning from the watch callback ensures there are no notify acks in
flight during shutdown.
Rename rbd_obj_notify_ack() to rbd_obj_notify_ack_sync() to reflect
its new synchronous nature.
Signed-off-by: Josh Durgin <josh.durgin@inktank.com>
Reviewed-by: Alex Elder <elder@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 9abc59908e upstream.
To ensure rbd_dev is not used after it's released, flush all pending
notify callbacks before calling rbd_dev_image_release(). No new
notifies can be added to the queue at this point because the watch has
already be unregistered with the osd_client.
Signed-off-by: Josh Durgin <josh.durgin@inktank.com>
Reviewed-by: Alex Elder <elder@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit c35455791c upstream.
The order parameter is sometimes NULL in _rbd_dev_v2_snap_size(), but
the dout() always derefences it. Move this to another dout() protected
by a check that order is non-NULL.
Signed-off-by: Josh Durgin <josh.durgin@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
Reviewed-by: Alex Elder <alex.elder@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 03507db631 upstream.
rbd_osd_req_create() needs to know the snapshot context size to create
a buffer large enough to send it with the message front. It gets this
from the img_request, which was not set for the obj_request yet. This
resulted in trying to write past the end of the front payload, hitting
this BUG:
libceph: BUG_ON(p > msg->front.iov_base + msg->front.iov_len);
Fix this by associating the obj_request with its img_request
immediately after it's created, before the osd request is created.
Fixes: http://tracker.ceph.com/issues/5760
Suggested-by: Alex Elder <alex.elder@linaro.org>
Signed-off-by: Josh Durgin <josh.durgin@inktank.com>
Reviewed-by: Alex Elder <alex.elder@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit e976cad0f0 upstream.
gcc isn't quite smart enough and generates these warnings:
drivers/block/rbd.c: In function 'rbd_img_request_fill':
drivers/block/rbd.c:1266:22: warning: 'bio_list' may be used uninitialized in this function [-Wmaybe-uninitialized]
drivers/block/rbd.c:2186:14: note: 'bio_list' was declared here
drivers/block/rbd.c:2247:10: warning: 'pages' may be used uninitialized in this function [-Wmaybe-uninitialized]
even though they are initialized for their respective code paths.
Signed-off-by: Sage Weil <sage@inktank.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 82a442d239 upstream.
Make sure two concurrent unmap operations on the same rbd device
won't collide, by only proceeding with the removal and cleanup of a
device if is not already underway.
Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 751cc0e3cf upstream.
When unmapping a device, its id is supplied, and that is used to
look up which rbd device should be unmapped. Looking up the
device involves searching the rbd device list while holding
a spinlock that protects access to that list.
Currently all of this is done under protection of the control lock,
but that protection is going away soon. To ensure the rbd_dev is
still valid (still on the list) while setting its REMOVING flag, do
so while still holding the list lock. To do so, get rid of
__rbd_get_dev(), and open code what it did in the one place it
was used.
Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 17c1cc1d92 upstream.
When a request returns an error, the driver needs to report the entire
extent of the request as completed. Writes already did this, since
they always set xferred = length, but reads were skipping that step if
an error other than -ENOENT occurred. Instead, rbd would end up
passing 0 xferred to blk_end_request(), which would always report
needing more data. This resulted in an assert failing when more data
was required by the block layer, but all the object requests were
done:
[ 1868.719077] rbd: obj_request read result -108 xferred 0
[ 1868.719077]
[ 1868.719518] end_request: I/O error, dev rbd1, sector 0
[ 1868.719739]
[ 1868.719739] Assertion failure in rbd_img_obj_callback() at line 1736:
[ 1868.719739]
[ 1868.719739] rbd_assert(more ^ (which == img_request->obj_request_count));
Without this assert, reads that hit errors would hang forever, since
the block layer considered them incomplete.
Fixes: http://tracker.ceph.com/issues/5647
Signed-off-by: Josh Durgin <josh.durgin@inktank.com>
Reviewed-by: Alex Elder <alex.elder@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit b7649158a0 upstream.
In blkif_queue_request blkfront iterates over the scatterlist in order
to set the segments of the request, and in blkif_completion blkfront
iterates over the raw request, which makes it hard to know the exact
position of the source and destination memory positions.
This can be solved by allocating a scatterlist for each request, that
will be keep until the request is finished, allowing us to copy the
data back to the original memory without having to iterate over the
raw request.
Oracle-Bug: 16660413 - LARGE ASYNCHRONOUS READS APPEAR BROKEN ON 2.6.39-400
CC: stable@vger.kernel.org
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Reported-and-Tested-by: Anne Milicia <anne.milicia@oracle.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 604c499cbb upstream.
We need to make sure that the device is not RO or that
the request is not past the number of sectors we want to
issue the DISCARD operation for.
This fixes CVE-2013-2140.
Acked-by: Jan Beulich <JBeulich@suse.com>
Acked-by: Ian Campbell <Ian.Campbell@citrix.com>
[v1: Made it pr_warn instead of pr_debug]
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit c378f70adb upstream.
Currently, when a disconnect is requested by the user (via NBD_DISCONNECT
ioctl) the return from NBD_DO_IT is undefined (it is usually one of
several error codes). This means that nbd-client does not know if a
manual disconnect was performed or whether a network error occurred.
Because of this, nbd-client's persist mode (which tries to reconnect after
error, but not after manual disconnect) does not always work correctly.
This change fixes this by causing NBD_DO_IT to always return 0 if a user
requests a disconnect. This means that nbd-client can correctly either
persist the connection (if an error occurred) or disconnect (if the user
requested it).
Signed-off-by: Paul Clements <paul.clements@steeleye.com>
Acked-by: Rob Landley <rob@landley.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit ffc8b30866 upstream.
Disk names may contain arbitrary strings, so they must not be
interpreted as format strings. It seems that only md allows arbitrary
strings to be used for disk names, but this could allow for a local
memory corruption from uid 0 into ring 0.
CVE-2013-2851
Signed-off-by: Kees Cook <keescook@chromium.org>
Cc: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Pull Ceph fix from Sage Weil:
"This is a recently spotted regression in the snapshot behavior...
It turns out several tests weren't being run in the nightlies so this
took a while to spot"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client:
rbd: send snapshot context with writes
Sending the right snapshot context with each write is required for
snapshots to work. Due to the ordering of calls, the snapshot context
is never set for any requests. This causes writes to the current
version of the image to be reflected in all snapshots, which are
supposed to be read-only.
This happens because rbd_osd_req_format_write() sets the snapshot
context based on obj_request->img_request. At this point, however,
obj_request->img_request has not been set yet, to the snapshot context
is set to NULL. Fix this by moving rbd_img_obj_request_add(), which
sets obj_request->img_request, before the osd request formatting
calls.
This resolves:
http://tracker.ceph.com/issues/5465
Reported-by: Karol Jurak <karol.jurak@gmail.com>
Signed-off-by: Josh Durgin <josh.durgin@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
Reviewed-by: Alex Elder <elder@linaro.org>
Pull Ceph fix from Sage Weil:
"This fixes another problem with using v2 images on 3.10 due to the
order in which fields are read from the image header.
Hopefully this is the last one"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client:
rbd: fetch object order before using it
rbd_dev_v2_header_onetime() fetches striping information, and
checks whether the image can be read by compariing the stripe unit
to the object size. It determines the object size by shifting
the object order, which is 0 at this point since it has not been
read yet. Move the call to get the image size and object order
before rbd_dev_v2_header_onetime() so it is set before use.
Signed-off-by: Josh Durgin <josh.durgin@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
Pull Ceph fix from Sage Weil:
"This fixes a problem preventing the kernel and userland librbd
libraries from sharing data with the new format 2 images"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client:
rbd: use the correct length for format 2 object names
Format 2 objects use 16 characters for the object name suffix to be
able to express the full 64-bit range of object numbers. Format 1
images only use 12 characters for this. Using 12-character names for
format 2 caused userspace and kernel rbd clients to read differently
named objects, which made an image written by one client look empty to
the other client.
CC: stable@vger.kernel.org # 3.9+
Reported-by: Chris Dunlop <chris@onthe.net.au>
Signed-off-by: Josh Durgin <josh.durgin@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
Pull block layer fixes from Jens Axboe:
"Outside of bcache (which really isn't super big), these are all
few-liners. There are a few important fixes in here:
- Fix blk pm sleeping when holding the queue lock
- A small collection of bcache fixes that have been done and tested
since bcache was included in this merge window.
- A fix for a raid5 regression introduced with the bio changes.
- Two important fixes for mtip32xx, fixing an oops and potential data
corruption (or hang) due to wrong bio iteration on stacked devices."
* 'for-linus' of git://git.kernel.dk/linux-block:
scatterlist: sg_set_buf() argument must be in linear mapping
raid5: Initialize bi_vcnt
pktcdvd: silence static checker warning
block: remove refs to XD disks from documentation
blkpm: avoid sleep when holding queue lock
mtip32xx: Correctly handle bio->bi_idx != 0 conditions
mtip32xx: Fix NULL pointer dereference during module unload
bcache: Fix error handling in init code
bcache: clarify free/available/unused space
bcache: drop "select CLOSURES"
bcache: Fix incompatible pointer type warning
Merge misc fixes from Andrew Morton:
"Bunch of fixes and one little addition to math64.h"
* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (27 commits)
include/linux/math64.h: add div64_ul()
mm: memcontrol: fix lockless reclaim hierarchy iterator
frontswap: fix incorrect zeroing and allocation size for frontswap_map
kernel/audit_tree.c:audit_add_tree_rule(): protect `rule' from kill_rules()
mm: migration: add migrate_entry_wait_huge()
ocfs2: add missing lockres put in dlm_mig_lockres_handler
mm/page_alloc.c: fix watermark check in __zone_watermark_ok()
drivers/misc/sgi-gru/grufile.c: fix info leak in gru_get_config_info()
aio: fix io_destroy() regression by using call_rcu()
rtc-at91rm9200: use shadow IMR on at91sam9x5
rtc-at91rm9200: add shadow interrupt mask
rtc-at91rm9200: refactor interrupt-register handling
rtc-at91rm9200: add configuration support
rtc-at91rm9200: add match-table compile guard
fs/ocfs2/namei.c: remove unecessary ERROR when removing non-empty directory
swap: avoid read_swap_cache_async() race to deadlock while waiting on discard I/O completion
drivers/rtc/rtc-twl.c: fix missing device_init_wakeup() when booted with device tree
cciss: fix broken mutex usage in ioctl
audit: wait_for_auditd() should use TASK_UNINTERRUPTIBLE
drivers/rtc/rtc-cmos.c: fix accidentally enabling rtc channel
...
If a new logical drive is added and the CCISS_REGNEWD ioctl is invoked
(as is normal with the Array Configuration Utility) the process will
hang as below. It attempts to acquire the same mutex twice, once in
do_ioctl() and once in cciss_unlocked_open(). The BKL was recursive,
the mutex isn't.
Linux version 3.10.0-rc2 (scameron@localhost.localdomain) (gcc version 4.4.7 20120313 (Red Hat 4.4.7-3) (GCC) ) #1 SMP Fri May 24 14:32:12 CDT 2013
[...]
acu D 0000000000000001 0 3246 3191 0x00000080
Call Trace:
schedule+0x29/0x70
schedule_preempt_disabled+0xe/0x10
__mutex_lock_slowpath+0x17b/0x220
mutex_lock+0x2b/0x50
cciss_unlocked_open+0x2f/0x110 [cciss]
__blkdev_get+0xd3/0x470
blkdev_get+0x5c/0x1e0
register_disk+0x182/0x1a0
add_disk+0x17c/0x310
cciss_add_disk+0x13a/0x170 [cciss]
cciss_update_drive_info+0x39b/0x480 [cciss]
rebuild_lun_table+0x258/0x370 [cciss]
cciss_ioctl+0x34f/0x470 [cciss]
do_ioctl+0x49/0x70 [cciss]
__blkdev_driver_ioctl+0x28/0x30
blkdev_ioctl+0x200/0x7b0
block_ioctl+0x3c/0x40
do_vfs_ioctl+0x89/0x350
SyS_ioctl+0xa1/0xb0
system_call_fastpath+0x16/0x1b
This mutex usage was added into the ioctl path when the big kernel lock
was removed. As it turns out, these paths are all thread safe anyway
(or can easily be made so) and we don't want ioctl() to be single
threaded in any case.
Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Mike Miller <mike.miller@hp.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Pull ceph fixes from Sage Weil:
"There is a pair of fixes for double-frees in the recent bundle for
3.10, a couple of fixes for long-standing bugs (sleep while atomic and
an endianness fix), and a locking fix that can be triggered when osds
are going down"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client:
rbd: fix cleanup in rbd_add()
rbd: don't destroy ceph_opts in rbd_add()
ceph: ceph_pagelist_append might sleep while atomic
ceph: add cpu_to_le32() calls when encoding a reconnect capability
libceph: must hold mutex for reset_changed_osds()
Pull NVMe fixes from Matthew Wilcox.
* 'fixes-3.10' of git://git.infradead.org/users/willy/linux-nvme:
NVMe: Add MSI support
NVMe: Use dma_set_mask() correctly
Return the result from user admin command IOCTL even in case of failure
NVMe: Do not cancel command multiple times
NVMe: fix error return code in nvme_submit_bio_queue()
NVMe: check for integer overflow in nvme_map_user_pages()
MAINTAINERS: update NVM EXPRESS DRIVER file list
NVMe: Fix a signedness bug in nvme_trans_modesel_get_mp
NVMe: Remove redundant version.h header include
Some devices only have support for MSI, not MSI-X. While MSI is more
limited, it still provides better performance than line-based interrupts.
Signed-off-by: Ramachandra Gajula <rama@fastorsystems.com>
Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>
Jiri writes:
please pull from
git://git.kernel.org/pub/scm/linux/kernel/git/jikos/linux-block.git for-jens
to receive one pktcdvd fix. It fixes a highly theoretical issue with using.
pktcdvd to work with media that'd be larger than 2TB :) But it's a correct.
fix and makes static checkers shut up about improperly cleaning upper.
32bits.
Static checkers complain about widening the binary "not" operations here
because sectors are u64 and "(pd)->settings.size" is unsigned int.
It unintentionally clears the high 32 bits of the sector. This means
that the driver won't work for devices with over 2TB of space. Since
this is a DVD drive, we're unlikely to reach that limit, but we may as
well silence the warning.
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
In some circumstances setting a 64-bit DMA mask can fail, as explained
in Documentation/DMA-API-HOWTO.txt. Use the recommended code sequence
to set a 32-bit DMA mask if setting a 64-bit DMA mask fails.
Reported-by: Chayan Biswas <Chayan.Biswas@sandisk.com>
Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>
The index on the page must be set before it is inserted in the radix
tree. Otherwise there is a small race which can occur during lookup
where the page can be found with the incorrect index. This will trigger
the BUG_ON() in brd_lookup_page().
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reported-by: Chris Wedgwood <cw@f00f.org>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
We copy the result to user if the command is completed from the
controller even if it completes with failure (non-zero) status.
A return status of < 0 indicates the command was not completed
by the controller. The user application may expect the error code
in the result field in case of failure.
Signed-off-by: Chayan Biswas <Chayan.Biswas@sandisk.com>
Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>
Bjorn Helgaas pointed out that a recent commit introduced a
use-after-free condition in an error path for rbd_add().
He correctly stated:
I think b536f69a3a "rbd: set up devices only for mapped images"
introduced a use-after-free error in rbd_add():
...
If rbd_dev_device_setup() returns an error, we call
rbd_dev_image_release(), which ultimately kfrees rbd_dev.
Then we call rbd_dev_destroy(), which references fields in
the already-freed rbd_dev struct before kfreeing it again.
The simple fix is to return the error code after the call to
rbd_dev_image_release().
Closer examination revealed that there's no need to clean up
rbd_opts in that function, so fix that too.
Update some other comments that have also become out of date.
Reported-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
Whether rbd_client_create() successfully creates a new client or
not, it takes responsibility for getting the ceph_opts structure
it's passed destroyed. If successful, the structure becomes
associated with the created client; if not, rbd_client_create()
will destroy it.
Previously, rbd_get_client() would call ceph_destroy_options()
if rbd_get_client() failed, and that meant it got called twice.
That led freeing various pointers more than once, which is never a
good idea.
This resolves:
http://tracker.ceph.com/issues/4559
Cc: stable@vger.kernel.org # 3.8+
Reported-by: Dan van der Ster <dan@vanderster.com>
Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
Cancelling an already cancelled command does not do anything, so check
the command context before cancelling it, continuing if had already been
cancelled so we do not log the same problem every second if a device
stops responding.
Signed-off-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>
nvme_submit_flush_data() might overwrite the initialisation of the
return value with 0, so move the -ENOMEM setting close to the usage.
Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>