Commit Graph

79574 Commits

Author SHA1 Message Date
Jesse Gross
9f9818f8c1 tunnels: Remove encapsulation offloads on decap.
commit a09a4c8dd1 upstream.

If a packet is either locally encapsulated or processed through GRO
it is marked with the offloads that it requires. However, when it is
decapsulated these tunnel offload indications are not removed. This
means that if we receive an encapsulated TCP packet, aggregate it with
GRO, decapsulate, and retransmit the resulting frame on a NIC that does
not support encapsulation, we won't be able to take advantage of hardware
offloads even though it is just a simple TCP packet at this point.

This fixes the problem by stripping off encapsulation offload indications
when packets are decapsulated.

The performance impacts of this bug are significant. In a test where a
Geneve encapsulated TCP stream is sent to a hypervisor, GRO'ed, decapsulated,
and bridged to a VM performance is improved by 60% (5Gbps->8Gbps) as a
result of avoiding unnecessary segmentation at the VM tap interface.

Reported-by: Ramu Ramamurthy <sramamur@linux.vnet.ibm.com>
Fixes: 68c33163 ("v4 GRE: Add TCP segmentation offload for GRE")
Signed-off-by: Jesse Gross <jesse@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
(backported from commit a09a4c8dd1)
[adapt iptunnel_pull_header arguments, avoid 7f290c9]
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
Signed-off-by: Juerg Haefliger <juerg.haefliger@hpe.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-10-31 04:13:59 -06:00
Jesse Gross
5699b3431e tunnels: Don't apply GRO to multiple layers of encapsulation.
commit fac8e0f579 upstream.

When drivers express support for TSO of encapsulated packets, they
only mean that they can do it for one layer of encapsulation.
Supporting additional levels would mean updating, at a minimum,
more IP length fields and they are unaware of this.

No encapsulation device expresses support for handling offloaded
encapsulated packets, so we won't generate these types of frames
in the transmit path. However, GRO doesn't have a check for
multiple levels of encapsulation and will attempt to build them.

UDP tunnel GRO actually does prevent this situation but it only
handles multiple UDP tunnels stacked on top of each other. This
generalizes that solution to prevent any kind of tunnel stacking
that would cause problems.

Fixes: bf5a755f ("net-gre-gro: Add GRE support to the GRO stack")
Signed-off-by: Jesse Gross <jesse@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Juerg Haefliger <juerg.haefliger@hpe.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-10-31 04:13:59 -06:00
Jan Kara
57c9cfdb61 posix_acl: Clear SGID bit when setting file permissions
commit 073931017b upstream.

When file permissions are modified via chmod(2) and the user is not in
the owning group or capable of CAP_FSETID, the setgid bit is cleared in
inode_change_ok().  Setting a POSIX ACL via setxattr(2) sets the file
permissions as well as the new ACL, but doesn't clear the setgid bit in
a similar way; this allows to bypass the check in chmod(2).  Fix that.

References: CVE-2016-7097
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-10-31 04:13:58 -06:00
Chris Wilson
7a9ddd2f1f drm/prime: Pass the right module owner through to dma_buf_export()
commit 56a76c0123 upstream.

dma_buf_export() adds a reference to the owning module to the dmabuf (to
prevent the driver from being unloaded whilst a third party still refers
to the dmabuf). However, drm_gem_prime_export() was passing its own
THIS_MODULE (i.e. drm.ko) rather than the driver. Extract the right
owner from the device->fops instead.

v2: Use C99 initializers to zero out unset elements of
dma_buf_export_info
v3: Extract the right module from dev->fops.

Testcase: igt/vgem_basic/unload
Reported-by: Petri Latvala <petri.latvala@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Petri Latvala <petri.latvala@intel.com>
Cc: Christian König <christian.koenig@amd.com>
Tested-by: Petri Latvala <petri.latvala@intel.com>
Reviewed-by: Petri Latvala <petri.latvala@intel.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: http://patchwork.freedesktop.org/patch/msgid/20161005122145.1507-1-chris@chris-wilson.co.uk
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-10-31 04:13:57 -06:00
Nicholas Bellinger
f4186e0b11 target: Make EXTENDED_COPY 0xe4 failure return COPY TARGET DEVICE NOT REACHABLE
commit 449a137846 upstream.

This patch addresses a bug where EXTENDED_COPY across multiple LUNs
results in a CHECK_CONDITION when the source + destination are not
located on the same physical node.

ESX Host environments expect sense COPY_ABORTED w/ COPY TARGET DEVICE
NOT REACHABLE to be returned when this occurs, in order to signal
fallback to local copy method.

As described in section 6.3.3 of spc4r22:

  "If it is not possible to complete processing of a segment because the
   copy manager is unable to establish communications with a copy target
   device, because the copy target device does not respond to INQUIRY,
   or because the data returned in response to INQUIRY indicates
   an unsupported logical unit, then the EXTENDED COPY command shall be
   terminated with CHECK CONDITION status, with the sense key set to
   COPY ABORTED, and the additional sense code set to COPY TARGET DEVICE
   NOT REACHABLE."

Tested on v4.1.y with ESX v5.5u2+ with BlockCopy across multiple nodes.

Reported-by: Nixon Vincent <nixon.vincent@calsoftinc.com>
Tested-by: Nixon Vincent <nixon.vincent@calsoftinc.com>
Cc: Nixon Vincent <nixon.vincent@calsoftinc.com>
Tested-by: Dinesh Israni <ddi@datera.io>
Signed-off-by: Dinesh Israni <ddi@datera.io>
Cc: Dinesh Israni <ddi@datera.io>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-10-28 03:01:36 -04:00
Vladimir Murzin
76fd38eae3 irqchip/gic-v3-its: Fix entry size mask for GITS_BASER
commit 9224eb77e6 upstream.

Entry Size in GITS_BASER<n> occupies 5 bits [52:48], but we mask out 8
bits.

Fixes: cc2d3216f5 ("irqchip: GICv3: ITS command queue")
Signed-off-by: Vladimir Murzin <vladimir.murzin@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-10-28 03:01:34 -04:00
Linus Torvalds
e765b19209 Using BUG_ON() as an assert() is _never_ acceptable
commit 21f54ddae4 upstream.

That just generally kills the machine, and makes debugging only much
harder, since the traces may long be gone.

Debugging by assert() is a disease.  Don't do it.  If you can continue,
you're much better off doing so with a live machine where you have a
much higher chance that the report actually makes it to the system logs,
rather than result in a machine that is just completely dead.

The only valid situation for BUG_ON() is when continuing is not an
option, because there is massive corruption.  But if you are just
verifying that something is true, you warn about your broken assumptions
(preferably just once), and limp on.

Fixes: 22f2ac51b6 ("mm: workingset: fix crash in shadow node shrinker caused by replace_page_cache_page()")
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Miklos Szeredi <miklos@szeredi.hu>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Michal Hocko <mhocko@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-10-28 03:01:34 -04:00
Johannes Weiner
f84311d7cd mm: workingset: fix crash in shadow node shrinker caused by replace_page_cache_page()
commit 22f2ac51b6 upstream.

Antonio reports the following crash when using fuse under memory pressure:

  kernel BUG at /build/linux-a2WvEb/linux-4.4.0/mm/workingset.c:346!
  invalid opcode: 0000 [#1] SMP
  Modules linked in: all of them
  CPU: 2 PID: 63 Comm: kswapd0 Not tainted 4.4.0-36-generic #55-Ubuntu
  Hardware name: System manufacturer System Product Name/P8H67-M PRO, BIOS 3904 04/27/2013
  task: ffff88040cae6040 ti: ffff880407488000 task.ti: ffff880407488000
  RIP: shadow_lru_isolate+0x181/0x190
  Call Trace:
    __list_lru_walk_one.isra.3+0x8f/0x130
    list_lru_walk_one+0x23/0x30
    scan_shadow_nodes+0x34/0x50
    shrink_slab.part.40+0x1ed/0x3d0
    shrink_zone+0x2ca/0x2e0
    kswapd+0x51e/0x990
    kthread+0xd8/0xf0
    ret_from_fork+0x3f/0x70

which corresponds to the following sanity check in the shadow node
tracking:

  BUG_ON(node->count & RADIX_TREE_COUNT_MASK);

The workingset code tracks radix tree nodes that exclusively contain
shadow entries of evicted pages in them, and this (somewhat obscure)
line checks whether there are real pages left that would interfere with
reclaim of the radix tree node under memory pressure.

While discussing ways how fuse might sneak pages into the radix tree
past the workingset code, Miklos pointed to replace_page_cache_page(),
and indeed there is a problem there: it properly accounts for the old
page being removed - __delete_from_page_cache() does that - but then
does a raw raw radix_tree_insert(), not accounting for the replacement
page.  Eventually the page count bits in node->count underflow while
leaving the node incorrectly linked to the shadow node LRU.

To address this, make sure replace_page_cache_page() uses the tracked
page insertion code, page_cache_tree_insert().  This fixes the page
accounting and makes sure page-containing nodes are properly unlinked
from the shadow node LRU again.

Also, make the sanity checks a bit less obscure by using the helpers for
checking the number of pages and shadows in a radix tree node.

[mhocko@suse.com: backport for 4.4]
Fixes: 449dd6984d ("mm: keep page cache radix tree nodes in check")
Link: http://lkml.kernel.org/r/20160919155822.29498-1-hannes@cmpxchg.org
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Reported-by: Antonio SJ Musumeci <trapexit@spawn.link>
Debugged-by: Miklos Szeredi <miklos@szeredi.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Michal Hocko <mhocko@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-10-28 03:01:34 -04:00
Jens Axboe
f2e81c0260 lightnvm: ensure that nvm_dev_ops can be used without CONFIG_NVM
commit a7fd9a4f3e upstream.

null_blk defines an empty version of this ops structure if CONFIG_NVM
isn't set, but it doesn't know the type. Move those bits out of the
protection of CONFIG_NVM in the main lightnvm include.

Signed-off-by: Jens Axboe <axboe@fb.com>
[pebolle: backport to v4.4]
Signed-off-by: Paul Bolle <pebolle@tiscali.nl>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-10-28 03:01:32 -04:00
Manfred Spraul
f6031d9532 ipc/sem.c: fix complex_count vs. simple op race
commit 5864a2fd30 upstream.

Commit 6d07b68ce1 ("ipc/sem.c: optimize sem_lock()") introduced a
race:

sem_lock has a fast path that allows parallel simple operations.
There are two reasons why a simple operation cannot run in parallel:
 - a non-simple operations is ongoing (sma->sem_perm.lock held)
 - a complex operation is sleeping (sma->complex_count != 0)

As both facts are stored independently, a thread can bypass the current
checks by sleeping in the right positions.  See below for more details
(or kernel bugzilla 105651).

The patch fixes that by creating one variable (complex_mode)
that tracks both reasons why parallel operations are not possible.

The patch also updates stale documentation regarding the locking.

With regards to stable kernels:
The patch is required for all kernels that include the
commit 6d07b68ce1 ("ipc/sem.c: optimize sem_lock()") (3.10?)

The alternative is to revert the patch that introduced the race.

The patch is safe for backporting, i.e. it makes no assumptions
about memory barriers in spin_unlock_wait().

Background:
Here is the race of the current implementation:

Thread A: (simple op)
- does the first "sma->complex_count == 0" test

Thread B: (complex op)
- does sem_lock(): This includes an array scan. But the scan can't
  find Thread A, because Thread A does not own sem->lock yet.
- the thread does the operation, increases complex_count,
  drops sem_lock, sleeps

Thread A:
- spin_lock(&sem->lock), spin_is_locked(sma->sem_perm.lock)
- sleeps before the complex_count test

Thread C: (complex op)
- does sem_lock (no array scan, complex_count==1)
- wakes up Thread B.
- decrements complex_count

Thread A:
- does the complex_count test

Bug:
Now both thread A and thread C operate on the same array, without
any synchronization.

Fixes: 6d07b68ce1 ("ipc/sem.c: optimize sem_lock()")
Link: http://lkml.kernel.org/r/1469123695-5661-1-git-send-email-manfred@colorfullife.com
Reported-by: <felixh@informatik.uni-bremen.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: <1vier1@web.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-10-28 03:01:32 -04:00
Kees Cook
e0d61779d3 lib: move strtobool() to kstrtobool()
commit ef95159907 upstream.

Create the kstrtobool_from_user() helper and move strtobool() logic into
the new kstrtobool() (matching all the other kstrto* functions).
Provides an inline wrapper for existing strtobool() callers.

Signed-off-by: Kees Cook <keescook@chromium.org>
Cc: Joe Perches <joe@perches.com>
Cc: Andy Shevchenko <andy.shevchenko@gmail.com>
Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Amitkumar Karwar <akarwar@marvell.com>
Cc: Nishant Sarmukadam <nishants@marvell.com>
Cc: Kalle Valo <kvalo@codeaurora.org>
Cc: Steve French <sfrench@samba.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-10-28 03:01:30 -04:00
Lin Huang
8bfd0c41d2 PM / devfreq: event: remove duplicate devfreq_event_get_drvdata()
commit c8a9a6dacc upstream.

there define two devfreq_event_get_drvdata() function in devfreq-event.h
when disable CONFIG_PM_DEVFREQ_EVENT, it will lead to build fail. So
remove devfreq_event_get_drvdata() function.

Fixes: f262f28c14 ("PM / devfreq: event: Add devfreq_event class")
Signed-off-by: Lin Huang <hl@rock-chips.com>
Signed-off-by: Chanwoo Choi <cw00.choi@samsung.com>
Signed-off-by: MyungJoo Ham <myungjoo.ham@samsung.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-10-28 03:01:26 -04:00
Miklos Szeredi
b3b4283f7e vfs: move permission checking into notify_change() for utimes(NULL)
commit f2b20f6ee8 upstream.

This fixes a bug where the permission was not properly checked in
overlayfs.  The testcase is ltp/utimensat01.

It is also cleaner and safer to do the permission checking in the vfs
helper instead of the caller.

This patch introduces an additional ia_valid flag ATTR_TOUCH (since
touch(1) is the most obvious user of utimes(NULL)) that is passed into
notify_change whenever the conditions for this special permission checking
mode are met.

Reported-by: Aihua Zhang <zhangaihua1@huawei.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Tested-by: Aihua Zhang <zhangaihua1@huawei.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-10-22 12:26:56 +02:00
Marcelo Cerri
4f8c2ad3ee crypto: ghash-generic - move common definitions to a new header file
commit a397ba829d upstream.

Move common values and types used by ghash-generic to a new header file
so drivers can directly use ghash-generic as a fallback implementation.

Fixes: cc333cd68d ("crypto: vmx - Adding GHASH routines for VMX module")
Signed-off-by: Marcelo Cerri <marcelo.cerri@canonical.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-10-22 12:26:56 +02:00
Linus Torvalds
1294d35588 mm: remove gup_flags FOLL_WRITE games from __get_user_pages()
commit 19be0eaffa upstream.

This is an ancient bug that was actually attempted to be fixed once
(badly) by me eleven years ago in commit 4ceb5db975 ("Fix
get_user_pages() race for write access") but that was then undone due to
problems on s390 by commit f33ea7f404 ("fix get_user_pages bug").

In the meantime, the s390 situation has long been fixed, and we can now
fix it by checking the pte_dirty() bit properly (and do it better).  The
s390 dirty bit was implemented in abf09bed3c ("s390/mm: implement
software dirty bits") which made it into v3.9.  Earlier kernels will
have to look at the page state itself.

Also, the VM has become more scalable, and what used a purely
theoretical race back then has become easier to trigger.

To fix it, we introduce a new internal FOLL_COW flag to mark the "yes,
we already did a COW" rather than play racy games with FOLL_WRITE that
is very fundamental, and then use the pte dirty flag to validate that
the FOLL_COW flag is still valid.

Reported-and-tested-by: Phil "not Paul" Oester <kernel@linuxace.com>
Acked-by: Hugh Dickins <hughd@google.com>
Reviewed-by: Michal Hocko <mhocko@suse.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Kees Cook <keescook@chromium.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Willy Tarreau <w@1wt.eu>
Cc: Nick Piggin <npiggin@gmail.com>
Cc: Greg Thelen <gthelen@google.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-10-20 10:00:47 +02:00
Dan Carpenter
0114e3e527 mfd: 88pm80x: Double shifting bug in suspend/resume
commit 9a6dc64451 upstream.

set_bit() and clear_bit() take the bit number so this code is really
doing "1 << (1 << irq)" which is a double shift bug.  It's done
consistently so it won't cause a problem unless "irq" is more than 4.

Fixes: 70c6cce040 ('mfd: Support 88pm80x in 80x driver')
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Lee Jones <lee.jones@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-10-16 17:36:14 +02:00
Subash Abhinov Kasiviswanathan
70cd763eb1 sysctl: handle error writing UINT_MAX to u32 fields
commit e7d316a02f upstream.

We have scripts which write to certain fields on 3.18 kernels but this
seems to be failing on 4.4 kernels.  An entry which we write to here is
xfrm_aevent_rseqth which is u32.

  echo 4294967295  > /proc/sys/net/core/xfrm_aevent_rseqth

Commit 230633d109 ("kernel/sysctl.c: detect overflows when converting
to int") prevented writing to sysctl entries when integer overflow
occurs.  However, this does not apply to unsigned integers.

Heinrich suggested that we introduce a new option to handle 64 bit
limits and set min as 0 and max as UINT_MAX.  This might not work as it
leads to issues similar to __do_proc_doulongvec_minmax.  Alternatively,
we would need to change the datatype of the entry to 64 bit.

  static int __do_proc_doulongvec_minmax(void *data, struct ctl_table
  {
      i = (unsigned long *) data;   //This cast is causing to read beyond the size of data (u32)
      vleft = table->maxlen / sizeof(unsigned long); //vleft is 0 because maxlen is sizeof(u32) which is lesser than sizeof(unsigned long) on x86_64.

Introduce a new proc handler proc_douintvec.  Individual proc entries
will need to be updated to use the new handler.

[akpm@linux-foundation.org: coding-style fixes]
Fixes: 230633d109 ("kernel/sysctl.c:detect overflows when converting to int")
Link: http://lkml.kernel.org/r/1471479806-5252-1-git-send-email-subashab@codeaurora.org
Signed-off-by: Subash Abhinov Kasiviswanathan <subashab@codeaurora.org>
Cc: Heinrich Schuchardt <xypron.glpk@gmx.de>
Cc: Kees Cook <keescook@chromium.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Ingo Molnar <mingo@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-10-07 15:23:46 +02:00
Guenter Roeck
8e8a07f900 nvmem: Declare nvmem_cell_read() consistently
commit a6c5091250 upstream.

nvmem_cell_read() is declared as void * if CONFIG_NVMEM is enabled, and
as char * otherwise. This can result in a build warning if CONFIG_NVMEM
is not enabled and a caller asigns the result to a type other than char *
without using a typecast. Use a consistent declaration to avoid the
problem.

Fixes: e2a5402ec7 ("nvmem: Add nvmem_device based consumer apis.")
Cc: Srinivas Kandagatla <srinivas.kandagatla@linaro.org>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Srinivas Kandagatla <srinivas.kandagatla@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-10-07 15:23:41 +02:00
Sergei Miroshnichenko
7677956e40 can: dev: fix deadlock reported after bus-off
commit 9abefcb1aa upstream.

A timer was used to restart after the bus-off state, leading to a
relatively large can_restart() executed in an interrupt context,
which in turn sets up pinctrl. When this happens during system boot,
there is a high probability of grabbing the pinctrl_list_mutex,
which is locked already by the probe() of other device, making the
kernel suspect a deadlock condition [1].

To resolve this issue, the restart_timer is replaced by a delayed
work.

[1] https://github.com/victronenergy/venus/issues/24

Signed-off-by: Sergei Miroshnichenko <sergeimir@emcraft.com>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-10-07 15:23:40 +02:00
Al Viro
3f5d8326a8 fix fault_in_multipages_...() on architectures with no-op access_ok()
commit e23d4159b1 upstream.

Switching iov_iter fault-in to multipages variants has exposed an old
bug in underlying fault_in_multipages_...(); they break if the range
passed to them wraps around.  Normally access_ok() done by callers will
prevent such (and it's a guaranteed EFAULT - ERR_PTR() values fall into
such a range and they should not point to any valid objects).

However, on architectures where userland and kernel live in different
MMU contexts (e.g. s390) access_ok() is a no-op and on those a range
with a wraparound can reach fault_in_multipages_...().

Since any wraparound means EFAULT there, the fix is trivial - turn
those

    while (uaddr <= end)
	    ...
into

    if (unlikely(uaddr > end))
	    return -EFAULT;
    do
	    ...
    while (uaddr <= end);

Reported-by: Jan Stancek <jstancek@redhat.com>
Tested-by: Jan Stancek <jstancek@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-09-30 10:18:37 +02:00
Jan Kara
6e67de3922 fanotify: fix list corruption in fanotify_get_response()
commit 96d41019e3 upstream.

fanotify_get_response() calls fsnotify_remove_event() when it finds that
group is being released from fanotify_release() (bypass_perm is set).

However the event it removes need not be only in the group's notification
queue but it can have already moved to access_list (userspace read the
event before closing the fanotify instance fd) which is protected by a
different lock.  Thus when fsnotify_remove_event() races with
fanotify_release() operating on access_list, the list can get corrupted.

Fix the problem by moving all the logic removing permission events from
the lists to one place - fanotify_release().

Fixes: 5838d4442b ("fanotify: fix double free of pending permission events")
Link: http://lkml.kernel.org/r/1473797711-14111-3-git-send-email-jack@suse.cz
Signed-off-by: Jan Kara <jack@suse.cz>
Reported-by: Miklos Szeredi <mszeredi@redhat.com>
Tested-by: Miklos Szeredi <mszeredi@redhat.com>
Reviewed-by: Miklos Szeredi <mszeredi@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-09-30 10:18:37 +02:00
Jan Kara
af426ec184 fsnotify: add a way to stop queueing events on group shutdown
commit 12703dbfeb upstream.

Implement a function that can be called when a group is being shutdown
to stop queueing new events to the group.  Fanotify will use this.

Fixes: 5838d4442b ("fanotify: fix double free of pending permission events")
Link: http://lkml.kernel.org/r/1473797711-14111-2-git-send-email-jack@suse.cz
Signed-off-by: Jan Kara <jack@suse.cz>
Reviewed-by: Miklos Szeredi <mszeredi@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-09-30 10:18:37 +02:00
Russell King
8c945f5aac net: smc91x: fix SMC accesses
[ Upstream commit 2fb04fdf30 ]

Commit b70661c708 ("net: smc91x: use run-time configuration on all ARM
machines") broke some ARM platforms through several mistakes.  Firstly,
the access size must correspond to the following rule:

(a) at least one of 16-bit or 8-bit access size must be supported
(b) 32-bit accesses are optional, and may be enabled in addition to
    the above.

Secondly, it provides no emulation of 16-bit accesses, instead blindly
making 16-bit accesses even when the platform specifies that only 8-bit
is supported.

Reorganise smc91x.h so we can make use of the existing 16-bit access
emulation already provided - if 16-bit accesses are supported, use
16-bit accesses directly, otherwise if 8-bit accesses are supported,
use the provided 16-bit access emulation.  If neither, BUG().  This
exactly reflects the driver behaviour prior to the commit being fixed.

Since the conversion incorrectly cut down the available access sizes on
several platforms, we also need to go through every platform and fix up
the overly-restrictive access size: Arnd assumed that if a platform can
perform 32-bit, 16-bit and 8-bit accesses, then only a 32-bit access
size needed to be specified - not so, all available access sizes must
be specified.

This likely fixes some performance regressions in doing this: if a
platform does not support 8-bit accesses, 8-bit accesses have been
emulated by performing a 16-bit read-modify-write access.

Tested on the Intel Assabet/Neponset platform, which supports only 8-bit
accesses, which was broken by the original commit.

Fixes: b70661c708 ("net: smc91x: use run-time configuration on all ARM machines")
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Tested-by: Robert Jarzmik <robert.jarzmik@free.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-09-30 10:18:37 +02:00
Linus Torvalds
9b5390d7b6 af_unix: split 'u->readlock' into two: 'iolock' and 'bindlock'
commit 6e1ce3c345 upstream.

Right now we use the 'readlock' both for protecting some of the af_unix
IO path and for making the bind be single-threaded.

The two are independent, but using the same lock makes for a nasty
deadlock due to ordering with regards to filesystem locking.  The bind
locking would want to nest outside the VSF pathname locking, but the IO
locking wants to nest inside some of those same locks.

We tried to fix this earlier with commit c845acb324 ("af_unix: Fix
splice-bind deadlock") which moved the readlock inside the vfs locks,
but that caused problems with overlayfs that will then call back into
filesystem routines that take the lock in the wrong order anyway.

Splitting the locks means that we can go back to having the bind lock be
the outermost lock, and we don't have any deadlocks with lock ordering.

Acked-by: Rainer Weikusat <rweikusat@cyberadapt.com>
Acked-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-09-30 10:18:36 +02:00
Mahesh Bandewar
f357a79839 bonding: Fix bonding crash
[ Upstream commit 24b27fc4cd ]

Following few steps will crash kernel -

  (a) Create bonding master
      > modprobe bonding miimon=50
  (b) Create macvlan bridge on eth2
      > ip link add link eth2 dev mvl0 address aa:0:0:0:0:01 \
	   type macvlan
  (c) Now try adding eth2 into the bond
      > echo +eth2 > /sys/class/net/bond0/bonding/slaves
      <crash>

Bonding does lots of things before checking if the device enslaved is
busy or not.

In this case when the notifier call-chain sends notifications, the
bond_netdev_event() assumes that the rx_handler /rx_handler_data is
registered while the bond_enslave() hasn't progressed far enough to
register rx_handler for the new slave.

This patch adds a rx_handler check that can be performed right at the
beginning of the enslave code to avoid getting into this situation.

Signed-off-by: Mahesh Bandewar <maheshb@google.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-09-30 10:18:36 +02:00
Eric Dumazet
0f55fa7541 tcp: fix use after free in tcp_xmit_retransmit_queue()
[ Upstream commit bb1fceca22 ]

When tcp_sendmsg() allocates a fresh and empty skb, it puts it at the
tail of the write queue using tcp_add_write_queue_tail()

Then it attempts to copy user data into this fresh skb.

If the copy fails, we undo the work and remove the fresh skb.

Unfortunately, this undo lacks the change done to tp->highest_sack and
we can leave a dangling pointer (to a freed skb)

Later, tcp_xmit_retransmit_queue() can dereference this pointer and
access freed memory. For regular kernels where memory is not unmapped,
this might cause SACK bugs because tcp_highest_sack_seq() is buggy,
returning garbage instead of tp->snd_nxt, but with various debug
features like CONFIG_DEBUG_PAGEALLOC, this can crash the kernel.

This bug was found by Marco Grassi thanks to syzkaller.

Fixes: 6859d49475 ("[TCP]: Abstract tp->highest_sack accessing & point to next skb")
Reported-by: Marco Grassi <marco.gra@gmail.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Cc: Yuchung Cheng <ycheng@google.com>
Cc: Neal Cardwell <ncardwell@google.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
Reviewed-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Tested-by: Holger Hoffstätte <holger@applied-asynchrony.com>
2016-09-30 10:18:34 +02:00
Michal Nazarewicz
df12772578 include/linux/kernel.h: change abs() macro so it uses consistent return type
commit 8f57e4d930 upstream.

Rewrite abs() so that its return type does not depend on the
architecture and no unexpected type conversion happen inside of it.  The
only conversion is from unsigned to signed type.  char is left as a
return type but treated as a signed type regradless of it's actual
signedness.

With the old version, int arguments were promoted to long and depending
on architecture a long argument might result in s64 or long return type
(which may or may not be the same).

This came after some back and forth with Nicolas.  The current macro has
different return type (for the same input type) depending on
architecture which might be midly iritating.

An alternative version would promote to int like so:

	#define abs(x)	__abs_choose_expr(x, long long,			\
			__abs_choose_expr(x, long,			\
			__builtin_choose_expr(				\
				sizeof(x) <= sizeof(int),		\
				({ int __x = (x); __x<0?-__x:__x; }),	\
				((void)0))))

I have no preference but imagine Linus might.  :] Nicolas argument against
is that promoting to int causes iconsistent behaviour:

	int main(void) {
		unsigned short a = 0, b = 1, c = a - b;
		unsigned short d = abs(a - b);
		unsigned short e = abs(c);
		printf("%u %u\n", d, e);  // prints: 1 65535
	}

Then again, no sane person expects consistent behaviour from C integer
arithmetic.  ;)

Note:

  __builtin_types_compatible_p(unsigned char, char) is always false, and
  __builtin_types_compatible_p(signed char, char) is also always false.

Signed-off-by: Michal Nazarewicz <mina86@mina86.com>
Reviewed-by: Nicolas Pitre <nico@linaro.org>
Cc: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Cc: Wey-Yi Guy <wey-yi.w.guy@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-09-30 10:18:33 +02:00
Al Viro
2e51ca2d94 asm-generic: make copy_from_user() zero the destination properly
commit 2545e5da08 upstream.

... in all cases, including the failing access_ok()

Note that some architectures using asm-generic/uaccess.h have
__copy_from_user() not zeroing the tail on failure halfway
through.  This variant works either way.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-09-24 10:07:45 +02:00
Al Viro
c6a4404dc2 asm-generic: make get_user() clear the destination on errors
commit 9ad18b75c2 upstream.

both for access_ok() failures and for faults halfway through

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-09-24 10:07:44 +02:00
Al Viro
99526912c9 fix iov_iter_fault_in_readable()
commit d4690f1e1c upstream.

... by turning it into what used to be multipages counterpart

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-09-24 10:07:43 +02:00
Boris Brezillon
17b54ccf49 genirq: Provide irq_gc_{lock_irqsave,unlock_irqrestore}() helpers
commit ebf9ff753c upstream.

Some irqchip drivers need to take the generic chip lock outside of the
irq context.

Provide the irq_gc_{lock_irqsave,unlock_irqrestore}() helpers to allow
one to disable irqs while entering a critical section protected by
gc->lock.

Note that we do not provide optimized version of these helpers for !SMP,
because they are not called from the hot-path.

[ tglx: Added a comment when these helpers should be [not] used ]

Signed-off-by: Boris Brezillon <boris.brezillon@free-electrons.com>
Cc: Jason Cooper <jason@lakedaemon.net>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: Nicolas Ferre <nicolas.ferre@atmel.com>
Cc: Alexandre Belloni <alexandre.belloni@free-electrons.com>
Link: http://lkml.kernel.org/r/1473775109-4192-1-git-send-email-boris.brezillon@free-electrons.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-09-24 10:07:43 +02:00
Arnd Bergmann
b214985cfa net: simplify napi_synchronize() to avoid warnings
commit facc432faa upstream.

The napi_synchronize() function is defined twice: The definition
for SMP builds waits for other CPUs to be done, while the uniprocessor
variant just contains a barrier and ignores its argument.

In the mvneta driver, this leads to a warning about an unused variable
when we lookup the NAPI struct of another CPU and then don't use it:

ethernet/marvell/mvneta.c: In function 'mvneta_percpu_notifier':
ethernet/marvell/mvneta.c:2910:30: error: unused variable 'other_port' [-Werror=unused-variable]

There are no other CPUs on a UP build, so that code never runs, but
gcc does not know this.

The nicest solution seems to be to turn the napi_synchronize() helper
into an inline function for the UP case as well, as that leads gcc to
not complain about the argument being unused. Once we do that, we can
also combine the two cases into a single function definition and use
if(IS_ENABLED()) rather than #ifdef to make it look a bit nicer.

The warning first came up in linux-4.4, but I failed to catch it
earlier.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Fixes: f864288544 ("net: mvneta: Statically assign queues to CPUs")
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-09-24 10:07:42 +02:00
Arnd Bergmann
59e62eb42a mmc: dw_mmc: use resource_size_t to store physical address
commit 260b316436 upstream.

The dw_mmc driver stores the physical address of the MMIO registers
in a pointer, which requires the use of type casts, and is actually
broken if anyone ever has this device on a 32-bit SoC in registers
above 4GB. Gcc warns about this possibility when the driver is built
with ARM LPAE enabled:

mmc/host/dw_mmc.c: In function 'dw_mci_edmac_start_dma':
mmc/host/dw_mmc.c:702:17: warning: cast from pointer to integer of different size
  cfg.dst_addr = (dma_addr_t)(host->phy_regs + fifo_offset);
                 ^
mmc/host/dw_mmc-pltfm.c: In function 'dw_mci_pltfm_register':
mmc/host/dw_mmc-pltfm.c:63:19: warning: cast to pointer from integer of different size
  host->phy_regs = (void *)(regs->start);

This changes the code to use resource_size_t, which gets rid of the
warning, the bug and the useless casts.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Jaehoon Chung <jh80.chung@samsung.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-09-24 10:07:42 +02:00
Vignesh R
023e76b1d1 iio: adc: ti_am335x_adc: Increase timeout value waiting for ADC sample
commit 7175cce1c3 upstream.

Now that open delay and sample delay for each channel is configurable
via DT, the default IDLE_TIMEOUT value is not enough as this is
calculated based on hardcoded macros. This results in driver returning
EBUSY sometimes. Fix this by increasing the timeout
value based on maximum value possible to open delay and sample delays
for each channel.

Fixes: 5dc11e8106 ("iio: adc: ti_am335x_adc: make sample delay, open delay, averaging DT parameters")
Signed-off-by: Vignesh R <vigneshr@ti.com>
Acked-by: Lee Jones <lee.jones@linaro.org>
Signed-off-by: Jonathan Cameron <jic23@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-09-24 10:07:38 +02:00
Mateusz Guzik
f750847daa mm: introduce get_task_exe_file
commit cd81a9170e upstream.

For more convenient access if one has a pointer to the task.

As a minor nit take advantage of the fact that only task lock + rcu are
needed to safely grab ->exe_file. This saves mm refcount dance.

Use the helper in proc_exe_link.

Signed-off-by: Mateusz Guzik <mguzik@redhat.com>
Acked-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
Acked-by: Richard Guy Briggs <rgb@redhat.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-09-24 10:07:36 +02:00
Al Viro
d72e9b2566 wrappers for ->i_mutex access
commit 5955102c99 upstream

parallel to mutex_{lock,unlock,trylock,is_locked,lock_nested},
inode_foo(inode) being mutex_foo(&inode->i_mutex).

Please, use those for access to ->i_mutex; over the coming cycle
->i_mutex will become rwsem, with ->lookup() done with it held
only shared.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
[only the fs.h change included to make backports easier - gregkh]
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-09-15 08:27:52 +02:00
Tyler Hicks
ad7c1399b7 kernel: Add noaudit variant of ns_capable()
commit 98f368e9e2 upstream.

When checking the current cred for a capability in a specific user
namespace, it isn't always desirable to have the LSMs audit the check.
This patch adds a noaudit variant of ns_capable() for when those
situations arise.

The common logic between ns_capable() and the new ns_capable_noaudit()
is moved into a single, shared function to keep duplicated code to a
minimum and ease maintainability.

Signed-off-by: Tyler Hicks <tyhicks@canonical.com>
Acked-by: Serge E. Hallyn <serge.hallyn@ubuntu.com>
Signed-off-by: James Morris <james.l.morris@oracle.com>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-09-15 08:27:50 +02:00
Matthew R. Ochs
f88503578d cxlflash: Fix to avoid virtual LUN failover failure
[ Upstream commit d5e26bb1d8 ]

Applications which use virtual LUN's that are backed by a physical LUN
over both adapter ports may experience an I/O failure in the event of a
link loss (e.g. cable pull).

Virtual LUNs may be accessed through one or both ports of the adapter.
This access is encoded in the translation entries that comprise the
virtual LUN and used by the AFU for load-balancing I/O and handling
failover scenarios. In a link loss scenario, even though the AFU is able
to maintain connectivity to the LUN, it is up to the application to
retry the failed I/O. When applications are unaware of the virtual LUN's
underlying topology, they are unable to make a sound decision of when to
retry an I/O and therefore are forced to make their reaction to a failed
I/O absolute. The result is either a failure to retry I/O or increased
latency for scenarios where a retry is pointless.

To remedy this scenario, provide feedback back to the application on
virtual LUN creation as to which ports the LUN may be accessed. LUN's
spanning both ports are candidates for a retry in a presence of an I/O
failure.

Signed-off-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
Acked-by: Manoj Kumar <manoj@linux.vnet.ibm.com>
Reviewed-by: Uma Krishnan <ukrishn@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-09-15 08:27:49 +02:00
Christoph Hellwig
f7cd8506b3 block: fix blk_rq_get_max_sectors for driver private requests
[ Upstream commit f21018427c ]

Driver private request types should not get the artifical cap for the
FS requests.  This is important to use the full device capabilities
for internal command or NVMe pass through commands.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reported-by: Jeff Lien <Jeff.Lien@hgst.com>
Tested-by: Jeff Lien <Jeff.Lien@hgst.com>
Reviewed-by: Keith Busch <keith.busch@intel.com>

Updated by me to use an explicit check for the one command type that
does support extended checking, instead of relying on the ordering
of the enum command values - as suggested by Keith.

Signed-off-by: Jens Axboe <axboe@fb.com>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-09-15 08:27:47 +02:00
John Stultz
e79e7333c3 time: Verify time values in adjtimex ADJ_SETOFFSET to avoid overflow
[ Upstream commit 37cf4dc337 ]

For adjtimex()'s ADJ_SETOFFSET, make sure the tv_usec value is
sane. We might multiply them later which can cause an overflow
and undefined behavior.

This patch introduces new helper functions to simplify the
checking code and adds comments to clarify

Orginally this patch was by Sasha Levin, but I've basically
rewritten it, so he should get credit for finding the issue
and I should get the blame for any mistakes made since.

Also, credit to Richard Cochran for the phrasing used in the
comment for what is considered valid here.

Cc: Sasha Levin <sasha.levin@oracle.com>
Cc: Richard Cochran <richardcochran@gmail.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Reported-by: Sasha Levin <sasha.levin@oracle.com>
Signed-off-by: John Stultz <john.stultz@linaro.org>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-09-15 08:27:47 +02:00
Vikas Shivappa
0b152db042 perf/x86/cqm: Fix CQM handling of grouping events into a cache_group
[ Upstream commit a223c1c7ab ]

Currently CQM (cache quality of service monitoring) is grouping all
events belonging to same PID to use one RMID. However its not counting
all of these different events. Hence we end up with a count of zero
for all events other than the group leader.

The patch tries to address the issue by keeping a flag in the
perf_event.hw which has other CQM related fields. The field is updated
at event creation and during grouping.

Signed-off-by: Vikas Shivappa <vikas.shivappa@linux.intel.com>
[peterz: Changed hw_perf_event::is_group_event to an int]
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Tony Luck <tony.luck@intel.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Matt Fleming <matt@codeblueprint.co.uk>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: fenghua.yu@intel.com
Cc: h.peter.anvin@intel.com
Cc: ravi.v.shankar@intel.com
Cc: vikas.shivappa@intel.com
Link: http://lkml.kernel.org/r/1457652732-4499-2-git-send-email-vikas.shivappa@linux.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>

Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-09-15 08:27:46 +02:00
Johannes Weiner
f3de8fbe2a proc: revert /proc/<pid>/maps [stack:TID] annotation
[ Upstream commit 65376df582 ]

Commit b76437579d ("procfs: mark thread stack correctly in
proc/<pid>/maps") added [stack:TID] annotation to /proc/<pid>/maps.

Finding the task of a stack VMA requires walking the entire thread list,
turning this into quadratic behavior: a thousand threads means a
thousand stacks, so the rendering of /proc/<pid>/maps needs to look at a
million combinations.

The cost is not in proportion to the usefulness as described in the
patch.

Drop the [stack:TID] annotation to make /proc/<pid>/maps (and
/proc/<pid>/numa_maps) usable again for higher thread counts.

The [stack] annotation inside /proc/<pid>/task/<tid>/maps is retained, as
identifying the stack VMA there is an O(1) operation.

Siddesh said:
 "The end users needed a way to identify thread stacks programmatically and
  there wasn't a way to do that.  I'm afraid I no longer remember (or have
  access to the resources that would aid my memory since I changed
  employers) the details of their requirement.  However, I did do this on my
  own time because I thought it was an interesting project for me and nobody
  really gave any feedback then as to its utility, so as far as I am
  concerned you could roll back the main thread maps information since the
  information is available in the thread-specific files"

Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: "Kirill A. Shutemov" <kirill@shutemov.name>
Cc: Siddhesh Poyarekar <siddhesh.poyarekar@gmail.com>
Cc: Shaohua Li <shli@fb.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-09-15 08:27:46 +02:00
Aviv Greenberg
840a59324e UVC: Add support for R200 depth camera
[ Upstream commit 5d8d8db851 ]

Add support for Intel R200 depth camera in uvc driver.
This includes adding new uvc GUIDs for the new pixel formats,
adding new V4L pixel format definition to user api headers,
and updating the uvc driver GUID-to-4cc tables with the new formats.

Tested-by: Greenberg, Aviv D <aviv.d.greenberg@intel.com>
Signed-off-by: Aviv Greenberg <aviv.d.greenberg@intel.com>
Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Signed-off-by: Guennadi Liakhovetski <g.liakhovetski@gmx.de>
Signed-off-by: Mauro Carvalho Chehab <mchehab@osg.samsung.com>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-09-15 08:27:45 +02:00
Michał Winiarski
bea3a6d7c5 drm/i915/skl: Add missing SKL ids
[ Upstream commit 7157bb27e7 ]

Used by production devices:
    Intel(R) Iris Graphics 540 (Skylake GT3e)
    Intel(R) Iris Graphics 550 (Skylake GT3e)

v2: More ids
v3: Less ids (GT1 got duplicated)

Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1454674902-26207-1-git-send-email-michal.winiarski@intel.com
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-09-15 08:27:44 +02:00
Imre Deak
c459dadb94 drm/i915/bxt: update list of PCIIDs
[ Upstream commit 985dd4360f ]

Add PCIIDs for new versions of the SOC, based on BSpec. Also add the
name of the versions as code comment where this is available. The new
versions don't have any changes visible to the kernel driver.

Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1453989852-13569-1-git-send-email-imre.deak@intel.com
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-09-15 08:27:44 +02:00
Olaf Hering
10b8e4ebc4 tools: hv: report ENOSPC errors in hv_fcopy_daemon
[ Upstream commit b4ed5d1682 ]

Currently some "Unspecified error 0x80004005" is reported on the Windows
side if something fails. Handle the ENOSPC case and return
ERROR_DISK_FULL, which allows at least Copy-VMFile to report a meaning
full error.

Signed-off-by: Olaf Hering <olaf@aepfle.de>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-09-15 08:27:40 +02:00
Matias Bjørling
12e2d36594 lightnvm: fix missing grown bad block type
[ Upstream commit b5d4acd4cb ]

The get/set bad block interface defines good block, factory bad block,
grown bad block, device reserved block, and host reserved block.
Unfortunately the grown bad block was missing, leaving the offsets wrong
for device and host side reserved blocks.

This patch adds the missing type and corrects the offsets.

Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-09-15 08:27:40 +02:00
Lorenzo Pieralisi
0b21b21b58 ACPI / drivers: fix typo in ACPI_DECLARE_PROBE_ENTRY macro
commit 3feab13c91 upstream.

When the ACPI_DECLARE_PROBE_ENTRY macro was added in
commit e647b53227 ("ACPI: Add early device probing infrastructure"),
a stub macro adding an unused entry was added for the !CONFIG_ACPI
Kconfig option case to make sure kernel code making use of the
macro did not require to be guarded within CONFIG_ACPI in order to
be compiled.

The stub macro was never used since all kernel code that defines
ACPI_DECLARE_PROBE_ENTRY entries is currently guarded within
CONFIG_ACPI; it contains a typo that should be nonetheless fixed.

Fix the typo in the stub (ie !CONFIG_ACPI) ACPI_DECLARE_PROBE_ENTRY()
macro so that it can actually be used if needed.

Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Fixes: e647b53227 (ACPI: Add early device probing infrastructure)
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-09-07 08:32:45 +02:00
Dmitry Torokhov
11dd037e42 Input: i8042 - break load dependency between atkbd/psmouse and i8042
commit 4097461897 upstream.

As explained in 1407814240-4275-1-git-send-email-decui@microsoft.com we
have a hard load dependency between i8042 and atkbd which prevents
keyboard from working on Gen2 Hyper-V VMs.

> hyperv_keyboard invokes serio_interrupt(), which needs a valid serio
> driver like atkbd.c.  atkbd.c depends on libps2.c because it invokes
> ps2_command().  libps2.c depends on i8042.c because it invokes
> i8042_check_port_owner().  As a result, hyperv_keyboard actually
> depends on i8042.c.
>
> For a Generation 2 Hyper-V VM (meaning no i8042 device emulated), if a
> Linux VM (like Arch Linux) happens to configure CONFIG_SERIO_I8042=m
> rather than =y, atkbd.ko can't load because i8042.ko can't load(due to
> no i8042 device emulated) and finally hyperv_keyboard can't work and
> the user can't input: https://bugs.archlinux.org/task/39820
> (Ubuntu/RHEL/SUSE aren't affected since they use CONFIG_SERIO_I8042=y)

To break the dependency we move away from using i8042_check_port_owner()
and instead allow serio port owner specify a mutex that clients should use
to serialize PS/2 command stream.

Reported-by: Mark Laws <mdl@60hz.org>
Tested-by: Mark Laws <mdl@60hz.org>
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-09-07 08:32:44 +02:00
Tomeu Vizoso
d91c348e4c mfd: cros_ec: Add cros_ec_cmd_xfer_status() helper
commit 9798ac6d32 upstream.

So that callers of cros_ec_cmd_xfer() don't have to repeat boilerplate
code when checking for errors from the EC side.

Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Benson Leung <bleung@chromium.org>
Signed-off-by: Brian Norris <briannorris@chromium.org>
Acked-by: Lee Jones <lee.jones@linaro.org>
Tested-by: Enric Balletbo i Serra <enric.balletbo@collabora.com>
Signed-off-by: Thierry Reding <thierry.reding@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-09-07 08:32:43 +02:00