Commit Graph

574597 Commits

Author SHA1 Message Date
Linus Torvalds
1d3671df72 Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs
Pull UDF fixes and quota cleanups from Jan Kara:
 "Several UDF fixes and some minor quota cleanups"

* 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs:
  udf: Check output buffer length when converting name to CS0
  udf: Prevent buffer overrun with multi-byte characters
  quota: constify qtree_fmt_operations structures
  udf: avoid uninitialized variable use
  udf: Fix lost indirect extent block
  udf: Factor out code for creating indirect extent
  udf: limit the maximum number of indirect extents in a row
  udf: limit the maximum number of TD redirections
  fs: make quota/dquot.c explicitly non-modular
  fs: make quota/netlink.c explicitly non-modular
2016-01-15 11:51:51 -08:00
Sjoerd Simons
113c74d83e net: phy: turn carrier off on phy attach
The operstate of a networking device initially IF_OPER_UNKNOWN aka
"unknown", updated on carrier state changes (with carrier state being on
by default). This means it will stay unknown unless the carrier state
goes to off at some point, which is not the case if the phy is already
up/connected at startup.

Explicitly turn off the carrier on phy attach, leaving the phy state
machine to turn the carrier on when it has done the initial negotiation.

Signed-off-by: Sjoerd Simons <sjoerd.simons@collabora.co.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-01-15 14:49:11 -05:00
Nathan Sullivan
244683749e net: macb: clear interrupts when disabling them
Disabling interrupts with the IDR register does not stop the macb hardware
from asserting its interrupt line if there are interrupts pending.  Always
clear the interrupts using ISR, and be sure to write it on hardware that
is not read-to-clear, like Zynq.  Not doing so will cause interrupts when
the driver doesn't expect them.

Signed-off-by: Nathan Sullivan <nathan.sullivan@ni.com>
Acked-by: Nicolas Ferre <nicolas.ferre@atmel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-01-15 14:47:09 -05:00
Linus Torvalds
875fc4f5dd Merge branch 'akpm' (patches from Andrew)
Merge first patch-bomb from Andrew Morton:

 - A few hotfixes which missed 4.4 becasue I was asleep.  cc'ed to
   -stable

 - A few misc fixes

 - OCFS2 updates

 - Part of MM.  Including pretty large changes to page-flags handling
   and to thp management which have been buffered up for 2-3 cycles now.

  I have a lot of MM material this time.

[ It turns out the THP part wasn't quite ready, so that got dropped from
  this series  - Linus ]

* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (117 commits)
  zsmalloc: reorganize struct size_class to pack 4 bytes hole
  mm/zbud.c: use list_last_entry() instead of list_tail_entry()
  zram/zcomp: do not zero out zcomp private pages
  zram: pass gfp from zcomp frontend to backend
  zram: try vmalloc() after kmalloc()
  zram/zcomp: use GFP_NOIO to allocate streams
  mm: add tracepoint for scanning pages
  drivers/base/memory.c: fix kernel warning during memory hotplug on ppc64
  mm/page_isolation: use macro to judge the alignment
  mm: fix noisy sparse warning in LIBCFS_ALLOC_PRE()
  mm: rework virtual memory accounting
  include/linux/memblock.h: fix ordering of 'flags' argument in comments
  mm: move lru_to_page to mm_inline.h
  Documentation/filesystems: describe the shared memory usage/accounting
  memory-hotplug: don't BUG() in register_memory_resource()
  hugetlb: make mm and fs code explicitly non-modular
  mm/swapfile.c: use list_for_each_entry_safe in free_swap_count_continuations
  mm: /proc/pid/clear_refs: no need to clear VM_SOFTDIRTY in clear_soft_dirty_pmd()
  mm: make sure isolate_lru_page() is never called for tail page
  vmstat: make vmstat_updater deferrable again and shut down on idle
  ...
2016-01-15 11:41:44 -08:00
Xin Long
65a5124a71 sctp: support to lookup with ep+paddr in transport rhashtable
Now, when we sendmsg, we translate the ep to laddr by selecting the
first element of the list, and then do a lookup for a transport.

But sctp_hash_cmp() will compare it against asoc addr_list, which may
be a subset of ep addr_list, meaning that this chosen laddr may not be
there, and thus making it impossible to find the transport.

So we fix it by using ep + paddr to lookup transports in hashtable. In
sctp_hash_cmp, if .ep is set, we will check if this ep == asoc->ep,
or we will do the laddr check.

Fixes: d6c0256a60 ("sctp: add the rhashtable apis for sctp global transport hashtable")
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Reported-by: Vlad Yasevich <vyasevich@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-01-15 14:41:36 -05:00
Weijie Yang
7dfa461220 zsmalloc: reorganize struct size_class to pack 4 bytes hole
Reoder the pages_per_zspage field in struct size_class which can
eliminate the 4 bytes hole between it and stats field.

Signed-off-by: Weijie Yang <weijie.yang@samsung.com>
Reviewed-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Cc: Minchan Kim <minchan@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-01-15 11:40:52 -08:00
Geliang Tang
f58fb5e7f0 mm/zbud.c: use list_last_entry() instead of list_tail_entry()
list_last_entry*( has been defined in list.h, so replace
list_tail_entry() with it.

Signed-off-by: Geliang Tang <geliangtang@163.com>
Cc: Seth Jennings <sjennings@variantweb.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-01-15 11:40:52 -08:00
Sergey Senozhatsky
e02d238c98 zram/zcomp: do not zero out zcomp private pages
Do not __GFP_ZERO allocated zcomp ->private pages.  We keep allocated
streams around and use them for read/write requests, so we supply a
zeroed out ->private to compression algorithm as a scratch buffer only
once -- the first time we use that stream.  For the rest of IO requests
served by this stream ->private usually contains some temporarily data
from the previous requests.

Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Acked-by: Minchan Kim <minchan@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-01-15 11:40:52 -08:00
Minchan Kim
75d8947a36 zram: pass gfp from zcomp frontend to backend
Each zcomp backend uses own gfp flag but it's pointless because the
context they could be called is driven by upper layer(ie, zcomp
frontend).  As well, zcomp frondend could call them in different
context.  One context(ie, zram init part) is it should be better to make
sure successful allocation other context(ie, further stream allocation
part for accelarating I/O speed) is just optional so let's pass gfp down
from driver (ie, zcomp frontend) like normal MM convention.

[sergey.senozhatsky@gmail.com: add missing __vmalloc zero and highmem gfps]
Signed-off-by: Minchan Kim <minchan@kernel.org>
Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-01-15 11:40:51 -08:00
Kyeongdon Kim
d913897aba zram: try vmalloc() after kmalloc()
When we're using LZ4 multi compression streams for zram swap, we found
out page allocation failure message in system running test.  That was
not only once, but a few(2 - 5 times per test).  Also, some failure
cases were continually occurring to try allocation order 3.

In order to make parallel compression private data, we should call
kzalloc() with order 2/3 in runtime(lzo/lz4).  But if there is no order
2/3 size memory to allocate in that time, page allocation fails.  This
patch makes to use vmalloc() as fallback of kmalloc(), this prevents
page alloc failure warning.

After using this, we never found warning message in running test, also
It could reduce process startup latency about 60-120ms in each case.

For reference a call trace :

    Binder_1: page allocation failure: order:3, mode:0x10c0d0
    CPU: 0 PID: 424 Comm: Binder_1 Tainted: GW 3.10.49-perf-g991d02b-dirty #20
    Call trace:
      dump_backtrace+0x0/0x270
      show_stack+0x10/0x1c
      dump_stack+0x1c/0x28
      warn_alloc_failed+0xfc/0x11c
      __alloc_pages_nodemask+0x724/0x7f0
      __get_free_pages+0x14/0x5c
      kmalloc_order_trace+0x38/0xd8
      zcomp_lz4_create+0x2c/0x38
      zcomp_strm_alloc+0x34/0x78
      zcomp_strm_multi_find+0x124/0x1ec
      zcomp_strm_find+0xc/0x18
      zram_bvec_rw+0x2fc/0x780
      zram_make_request+0x25c/0x2d4
      generic_make_request+0x80/0xbc
      submit_bio+0xa4/0x15c
      __swap_writepage+0x218/0x230
      swap_writepage+0x3c/0x4c
      shrink_page_list+0x51c/0x8d0
      shrink_inactive_list+0x3f8/0x60c
      shrink_lruvec+0x33c/0x4cc
      shrink_zone+0x3c/0x100
      try_to_free_pages+0x2b8/0x54c
      __alloc_pages_nodemask+0x514/0x7f0
      __get_free_pages+0x14/0x5c
      proc_info_read+0x50/0xe4
      vfs_read+0xa0/0x12c
      SyS_read+0x44/0x74
    DMA: 3397*4kB (MC) 26*8kB (RC) 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB
         0*512kB 0*1024kB 0*2048kB 0*4096kB = 13796kB

[minchan@kernel.org: change vmalloc gfp and adding comment about gfp]
[sergey.senozhatsky@gmail.com: tweak comments and styles]
Signed-off-by: Kyeongdon Kim <kyeongdon.kim@lge.com>
Signed-off-by: Minchan Kim <minchan@kernel.org>
Acked-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-01-15 11:40:51 -08:00
Sergey Senozhatsky
3d5fe03a3e zram/zcomp: use GFP_NOIO to allocate streams
We can end up allocating a new compression stream with GFP_KERNEL from
within the IO path, which may result is nested (recursive) IO
operations.  That can introduce problems if the IO path in question is a
reclaimer, holding some locks that will deadlock nested IOs.

Allocate streams and working memory using GFP_NOIO flag, forbidding
recursive IO and FS operations.

An example:

  inconsistent {IN-RECLAIM_FS-W} -> {RECLAIM_FS-ON-W} usage.
  git/20158 [HC0[0]:SC0[0]:HE1:SE1] takes:
   (jbd2_handle){+.+.?.}, at:  start_this_handle+0x4ca/0x555
  {IN-RECLAIM_FS-W} state was registered at:
     __lock_acquire+0x8da/0x117b
     lock_acquire+0x10c/0x1a7
     start_this_handle+0x52d/0x555
     jbd2__journal_start+0xb4/0x237
     __ext4_journal_start_sb+0x108/0x17e
     ext4_dirty_inode+0x32/0x61
     __mark_inode_dirty+0x16b/0x60c
     iput+0x11e/0x274
     __dentry_kill+0x148/0x1b8
     shrink_dentry_list+0x274/0x44a
     prune_dcache_sb+0x4a/0x55
     super_cache_scan+0xfc/0x176
     shrink_slab.part.14.constprop.25+0x2a2/0x4d3
     shrink_zone+0x74/0x140
     kswapd+0x6b7/0x930
     kthread+0x107/0x10f
     ret_from_fork+0x3f/0x70
  irq event stamp: 138297
  hardirqs last  enabled at (138297):  debug_check_no_locks_freed+0x113/0x12f
  hardirqs last disabled at (138296):  debug_check_no_locks_freed+0x33/0x12f
  softirqs last  enabled at (137818):  __do_softirq+0x2d3/0x3e9
  softirqs last disabled at (137813):  irq_exit+0x41/0x95

               other info that might help us debug this:
   Possible unsafe locking scenario:
         CPU0
         ----
    lock(jbd2_handle);
    <Interrupt>
      lock(jbd2_handle);

                *** DEADLOCK ***
  5 locks held by git/20158:
   #0:  (sb_writers#7){.+.+.+}, at: [<ffffffff81155411>] mnt_want_write+0x24/0x4b
   #1:  (&type->i_mutex_dir_key#2/1){+.+.+.}, at: [<ffffffff81145087>] lock_rename+0xd9/0xe3
   #2:  (&sb->s_type->i_mutex_key#11){+.+.+.}, at: [<ffffffff8114f8e2>] lock_two_nondirectories+0x3f/0x6b
   #3:  (&sb->s_type->i_mutex_key#11/4){+.+.+.}, at: [<ffffffff8114f909>] lock_two_nondirectories+0x66/0x6b
   #4:  (jbd2_handle){+.+.?.}, at: [<ffffffff811e31db>] start_this_handle+0x4ca/0x555

               stack backtrace:
  CPU: 2 PID: 20158 Comm: git Not tainted 4.1.0-rc7-next-20150615-dbg-00016-g8bdf555-dirty #211
  Call Trace:
    dump_stack+0x4c/0x6e
    mark_lock+0x384/0x56d
    mark_held_locks+0x5f/0x76
    lockdep_trace_alloc+0xb2/0xb5
    kmem_cache_alloc_trace+0x32/0x1e2
    zcomp_strm_alloc+0x25/0x73 [zram]
    zcomp_strm_multi_find+0xe7/0x173 [zram]
    zcomp_strm_find+0xc/0xe [zram]
    zram_bvec_rw+0x2ca/0x7e0 [zram]
    zram_make_request+0x1fa/0x301 [zram]
    generic_make_request+0x9c/0xdb
    submit_bio+0xf7/0x120
    ext4_io_submit+0x2e/0x43
    ext4_bio_write_page+0x1b7/0x300
    mpage_submit_page+0x60/0x77
    mpage_map_and_submit_buffers+0x10f/0x21d
    ext4_writepages+0xc8c/0xe1b
    do_writepages+0x23/0x2c
    __filemap_fdatawrite_range+0x84/0x8b
    filemap_flush+0x1c/0x1e
    ext4_alloc_da_blocks+0xb8/0x117
    ext4_rename+0x132/0x6dc
    ? mark_held_locks+0x5f/0x76
    ext4_rename2+0x29/0x2b
    vfs_rename+0x540/0x636
    SyS_renameat2+0x359/0x44d
    SyS_rename+0x1e/0x20
    entry_SYSCALL_64_fastpath+0x12/0x6f

[minchan@kernel.org: add stable mark]
Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Acked-by: Minchan Kim <minchan@kernel.org>
Cc: Kyeongdon Kim <kyeongdon.kim@lge.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-01-15 11:40:51 -08:00
David S. Miller
628a96e7d3 Merge branch 'hisi-fixes'
Kejian Yan says:

====================
dts: hisi: fixes no syscon fault when init mdio

This patchset fixes the bug that eth can't initial successful on hip05-D02
because the dts files doesn't match the source code.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2016-01-15 14:40:04 -05:00
yankejian
28052e7583 net: hns: fixes no syscon error when init mdio
As dtsi files use the normal naming conventions using '-' instead of '_'
inside of property names, the driver needs to update the phandle name
strings of the of_parse_phandle func.

Signed-off-by: Kejian Yan <yankejian@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-01-15 14:40:03 -05:00
yankejian
b70ce2ab41 dts: hisi: fixes no syscon fault when init mdio
When linux start up, we get the log below:
"Hi-HNS_MDIO 803c0000.mdio: no syscon hisilicon,peri-c-subctrl
mdio_bus mdio@803c0000: mdio sys ctl reg has not maped"

The source code about the subctrl is dealt syscon, but dts doesn't.
It cause such fault, so this patch adds the syscon info on dts files to
fixes it.

Signed-off-by: Kejian Yan <yankejian@huawei.com>
Acked-by: Rob Herring <robh@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-01-15 14:40:03 -05:00
Konstantin Khlebnikov
9207f9d45b net: preserve IP control block during GSO segmentation
Skb_gso_segment() uses skb control block during segmentation.
This patch adds 32-bytes room for previous control block which
will be copied into all resulting segments.

This patch fixes kernel crash during fragmenting forwarded packets.
Fragmentation requires valid IP CB in skb for clearing ip options.
Also patch removes custom save/restore in ovs code, now it's redundant.

Signed-off-by: Konstantin Khlebnikov <koct9i@gmail.com>
Link: http://lkml.kernel.org/r/CALYGNiP-0MZ-FExV2HutTvE9U-QQtkKSoE--KN=JQE5STYsjAA@mail.gmail.com
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-01-15 14:35:24 -05:00
Jiri Olsa
96b9e70b8e perf build: Introduce FEATURES_DUMP make variable
Introducing FEATURES_DUMP make variable to provide features
detection dump file and bypass the feature detection.

The intention is to use this during build tests to skip
repeated features detection, like:

Get feature dump static build into /tmp/fd file:
  $ make feature-dump FEATURE_DUMP_COPY=/tmp/fd LDFLAGS=-static
    BUILD:   Doing 'make -j4' parallel build

  Auto-detecting system features:
  ...                         dwarf: [ OFF ]

  SNIP

  FEATURE-DUMP file copied into /tmp/fd

Use /tmp/fd to build perf:
  $ make FEATURES_DUMP=/tmp/fd LDFLAGS=-static

  $ file perf
  perf: ELF 64-bit LSB executable, x86-64, version 1 (GNU/Linux), statically linked, for ...

Suggested-by: Wang Nan <wangnan0@huawei.com>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1452830421-77757-7-git-send-email-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-01-15 16:32:00 -03:00
Jiri Olsa
b8e52be00c perf build: Add feature-dump target
To provide FEATURE-DUMP into $(FEATURE_DUMP_COPY) if defined, with no
further action.

Get feature dump of the current build:
  $ make feature-dump
    BUILD:   Doing 'make -j4' parallel build

  Auto-detecting system features:
  ...                         dwarf: [ on  ]

  FEATURE-DUMP file available in FEATURE-DUMP

Get feature dump static build into /tmp/fd file:

  $ make feature-dump FEATURE_DUMP_COPY=/tmp/fd LDFLAGS=-static
    BUILD:   Doing 'make -j4' parallel build

  Auto-detecting system features:
  ...                         dwarf: [ OFF ]

  SNIP

  FEATURE-DUMP file copied into /tmp/fd

Suggested-by: Wang Nan <wangnan0@huawei.com>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1452830421-77757-6-git-send-email-wangnan0@huawei.com
Signed-off-by: Wang Nan <wangnan0@huawei.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-01-15 16:32:00 -03:00
Wang Nan
c15e758c4b perf build: Pass O option to kernel makefile in build-test
Kernel makefile only follows an 'O' option passed from command line
explicitely. In build-test with 'O' option set, kernel makefile
contaminate kernel source directory. Build test also fail if we don't
create output directory manually.

K_O_OPT is added and passed to kernel makefile if 'O' is passed
to build-test.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1452830421-77757-5-git-send-email-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-01-15 16:32:00 -03:00
Wang Nan
68824de19a perf build: Test correct path of perf in build-test
If an 'O' is passed to 'make build-test', many 'test -x' and 'test -f'
will fail because perf resides in a different directory. Fix this by
computing PERF_OUT according to 'O' and test correct output files.
For make_kernelsrc and make_kernelsrc_tools, set KBUILD_OUTPUT_DIR
instead because the path is different from others ($(O)/perf vs
 $(O)/tools/perf).

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1452830421-77757-4-git-send-email-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-01-15 16:31:59 -03:00
Wang Nan
eb807730c0 perf build: Pass O option to Makefile.perf in build-test
Unlike tools/perf/Makefile, tools/perf/Makefile.perf obey 'O' option
when it is passed through cmdline only, due to code in
tools/scripts/Makefile.include:

 ifneq ($(O),)
 ifeq ($(origin O), command line)
 	...
 	ABSOLUTE_O := $(shell cd $(O) ; pwd)
 	OUTPUT := $(ABSOLUTE_O)/$(if $(subdir),$(subdir)/)
 endif
 endif

This patch passes 'O' to Makefile.perf through cmdline explicitly
to make it follow O variable during build-test.

'make clean' should have identical 'O' option with 'make'. If not,
config-clean may error.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1452830421-77757-3-git-send-email-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-01-15 16:31:59 -03:00
Wang Nan
7be43dfb1e perf build: Set parallel making options build-test
'make build-test' is painful because of time consuming. In a full test,
all test cases are built twice with tools/perf/Makefile and
tools/perf/Makefile.perf. 'Makefile' automatically computes parallel
options for make, but 'Makefile.perf' not, so all test cases is built
with one job. It is very slow.

This patch adds '-j' options to Makefile.perf testing. It computes
parallel building options like what tools/perf/Makefile does, and pass
'-j' option to Makefile.perf test.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1452687442-6186-2-git-send-email-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-01-15 16:31:59 -03:00
Ben Hutchings
40c4a0f92a perf symbols: Fix reading of build-id from vDSO
We need to use the long name (the filename) when reading the build-id
from a DSO.  Using the short name doesn't work for (at least) vDSOs.

Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20160113172301.GT28542@decadent.org.uk
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-01-15 16:31:58 -03:00
Ravi Bangoria
3caeaa5627 perf kvm record/report: 'unprocessable sample' error while recording/reporting guest data
While recording guest samples in host using perf kvm record, it will
populate unprocessable sample error, though samples will be recorded
properly. While generating report using perf kvm report, no samples will
be processed and same error will populate. We have seen this behaviour
with upstream perf(4.4-rc3) on x86 and ppc64 hardware.

Reason behind this failure is, when it tries to fetch machine from
rb_tree of machines, it fails. As a part of tracing a bug, we figured
out that this code was incorrectly refactored in commit 54245fdc35
("perf session: Remove wrappers to machines__find").

This patch will change the functionality such that if it can't fetch
machine in first trial, it will create one node of machine and add that to
rb_tree. So next time when it tries to fetch same machine from rb_tree,
it won't fail. Actually it was the case before refactoring of code in
aforementioned commit.

This patch is generated from acme perf/core branch.

Below I've mention an example that demonstrate the behaviour before and
after applying patch.

Before applying patch:
[Note: One needs to run guest before recording data in host]

  ravi@ravi-bangoria:~$ ./perf kvm record -a
  Warning:
  5903 unprocessable samples recorded.
  Do you have a KVM guest running and not using 'perf kvm'?
  [ perf record: Captured and wrote 1.409 MB perf.data.guest (285 samples) ]

  ravi@ravi-bangoria:~$ ./perf kvm report --stdio
  Warning:
  5903 unprocessable samples recorded.
  Do you have a KVM guest running and not using 'perf kvm'?
  # To display the perf.data header info, please use --header/--header-only options.
  #
  # Total Lost Samples: 0
  #
  # Samples: 285  of event 'cycles'
  # Event count (approx.): 88715406
  #
  # Overhead  Command  Shared Object  Symbol
  # ........  .......  .............  ......
  #

  # (For a higher level overview, try: perf report --sort comm,dso)
  #

After applying patch:

  ravi@ravi-bangoria:~$ ./perf kvm record -a
  [ perf record: Captured and wrote 1.188 MB perf.data.guest (17 samples) ]

  ravi@ravi-bangoria:~$ ./perf kvm report --stdio
  # To display the perf.data header info, please use --header/--header-only options.
  #
  # Total Lost Samples: 0
  #
  # Samples: 17  of event 'cycles'
  # Event count (approx.): 700746
  #
  # Overhead  Command  Shared Object     Symbol
  # ........  .......  ................  ......................
  #
      34.19%  :5758    [unknown]         [g] 0xffffffff818682ab
      22.79%  :5758    [unknown]         [g] 0xffffffff812dc7f8
      22.79%  :5758    [unknown]         [g] 0xffffffff818650d0
      14.83%  :5758    [unknown]         [g] 0xffffffff8161a1b6
       2.49%  :5758    [unknown]         [g] 0xffffffff818692bf
       0.48%  :5758    [unknown]         [g] 0xffffffff81869253
       0.05%  :5758    [unknown]         [g] 0xffffffff81869250

Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com>
Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Cc: stable@vger.kernel.org # v3.19+
Fixes: 54245fdc35 ("perf session: Remove wrappers to machines__find")
Link: http://lkml.kernel.org/r/1449471302-11283-1-git-send-email-ravi.bangoria@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-01-15 16:31:58 -03:00
Ville Syrjälä
2d7f3bdb2c drm/i915: Pass the dma_addr_t array as const to rotate_pages()
rotate_pages() doesn't modify the passed in dma addresses, so make
them const.

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: http://patchwork.freedesktop.org/patch/msgid/1452777736-4909-4-git-send-email-ville.syrjala@linux.intel.com
2016-01-15 21:04:27 +02:00
Ville Syrjälä
b5e16987a0 drm/i915: Set i915_ggtt_view_normal type explicitly
Just for clarity set the type for i915_ggtt_view_normal explicitly.

While at it fix the indentation fail for i915_ggtt_view_rotated.

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: http://patchwork.freedesktop.org/patch/msgid/1452777736-4909-3-git-send-email-ville.syrjala@linux.intel.com
2016-01-15 21:04:16 +02:00
Ville Syrjälä
0b05e1e0c9 drm/i915: Don't leak framebuffer_references if drm_framebuffer_init() fails
Don't increment obj->framebuffer_references until we know we actually
managed to create the framebuffer.

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: http://patchwork.freedesktop.org/patch/msgid/1452777736-4909-2-git-send-email-ville.syrjala@linux.intel.com
2016-01-15 21:04:07 +02:00
Bjorn Helgaas
472358412b Merge branches 'pci/hotplug' and 'pci/resource' into next
* pci/hotplug:
  PCI: ibmphp: Remove unneeded NULL test
  PCI: hotplug: Use list_for_each_entry() to simplify code
  PCI: acpiphp_ibm: Fix null dereferences on null ibm_slot

* pci/resource:
  PCI: Avoid iterating through memory outside the resource window
  PCI: Fix minimum allocation address overwrite
2016-01-15 12:33:29 -06:00
Bjorn Helgaas
c111e8bf6e Merge branches 'pci/host', 'pci/host-designware', 'pci/host-hisi', 'pci/host-qcom' and 'pci/host-rcar' into next
* pci/host:
  PCI: host: Add of_pci_get_host_bridge_resources() stub
  PCI: host: Mark PCIe/PCI (MSI) IRQ cascade handlers as IRQF_NO_THREAD

* pci/host-designware:
  PCI: designware: Make config accessor override checking symmetric
  PCI: designware: Simplify control flow

* pci/host-hisi:
  PCI: hisi: Add support for HiSilicon Hip06 PCIe host controllers

* pci/host-qcom:
  ARM: dts: ifc6410: enable PCIe DT node for this board
  ARM: dts: apq8064: add PCIe devicetree node
  PCI: qcom: Add Qualcomm PCIe controller driver
  PCI: qcom: Document PCIe devicetree bindings
  PCI: designware: Ensure ATU is enabled before IO/conf space accesses

* pci/host-rcar:
  PCI: rcar: Add Gen2 PHY setup to pcie-rcar
  PCI: rcar: Add runtime PM support to pcie-rcar
  PCI: rcar: Remove unused pci_sys_data struct from pcie-rcar
2016-01-15 12:33:14 -06:00
Arnd Bergmann
40704b1290 PCI: host: Add of_pci_get_host_bridge_resources() stub
The pcie-rcar driver can be built for any ARM platform (for COMPILE_TEST)
including those without CONFIG_OF enabled, and that results in a
compile-time error:

  drivers/pci/host/pcie-rcar.c: In function 'rcar_pcie_parse_request_of_pci_ranges':
  drivers/pci/host/pcie-rcar.c:939:8: error: implicit declaration of function 'of_pci_get_host_bridge_resources' [-Werror=implicit-function-declaration]
    err = of_pci_get_host_bridge_resources(np, 0, 0xff, &pci->resources, &iobase);

Add a of_pci_get_host_bridge_resources() stub function defined when
CONFIG_OF_ADDRESS is disabled to allow compile-testing on all platforms.
This mirrors what we do for other OF-specific functions.

Fixes: 5d2917d469 ("PCI: rcar: Convert to DT resource parsing API")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Acked-by: Simon Horman <horms+renesas@verge.net.au>
2016-01-15 12:30:35 -06:00
Sebastian Andrzej Siewior
546bed6312 btrfs: initialize the seq counter in struct btrfs_device
I managed to trigger this:
| INFO: trying to register non-static key.
| the code is fine but needs lockdep annotation.
| turning off the locking correctness validator.
| CPU: 1 PID: 781 Comm: systemd-gpt-aut Not tainted 4.4.0-rt2+ #14
| Hardware name: ARM-Versatile Express
| [<80307cec>] (dump_stack)
| [<80070e98>] (__lock_acquire)
| [<8007184c>] (lock_acquire)
| [<80287800>] (btrfs_ioctl)
| [<8012a8d4>] (do_vfs_ioctl)
| [<8012ac14>] (SyS_ioctl)

so I think that btrfs_device_data_ordered_init() is not invoked behind
a macro somewhere.

Fixes: 7cc8e58d53 ("Btrfs: fix unprotected device's variants on 32bits machine")
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2016-01-15 19:28:43 +01:00
Dan Carpenter
0dc924c5f2 Btrfs: clean up an error code in btrfs_init_space_info()
If we return 1 here, then the caller treats it as an error and returns
-EINVAL.  It causes a static checker warning to treat positive returns
as an error.

Fixes: 1aba86d67f ('Btrfs: fix easily get into ENOSPC in mixed case')
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2016-01-15 19:27:28 +01:00
Geliang Tang
8e217858ee btrfs: fix iterator with update error in backref.c
Fix the following error:

fs/btrfs/backref.c:565:1-20: iterator with update on line 577

Fixes: a7ca422('btrfs: use list_for_each_entry* in backref.c')
Signed-off-by: Geliang Tang <geliangtang@163.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2016-01-15 19:27:18 +01:00
Tsutomu Itoh
b7c47bbb2d Btrfs: fix output of compression message in btrfs_parse_options()
The compression message might not be correctly output.
Fix it.

[[before fix]]

# mount -o compress /dev/sdb3 /test3
[  996.874264] BTRFS info (device sdb3): disk space caching is enabled
[  996.874268] BTRFS: has skinny extents
# mount | grep /test3
/dev/sdb3 on /test3 type btrfs (rw,relatime,compress=zlib,space_cache,subvolid=5,subvol=/)

# mount -o remount,compress-force /dev/sdb3 /test3
[ 1035.075017] BTRFS info (device sdb3): force zlib compression
[ 1035.075021] BTRFS info (device sdb3): disk space caching is enabled
# mount | grep /test3
/dev/sdb3 on /test3 type btrfs (rw,relatime,compress-force=zlib,space_cache,subvolid=5,subvol=/)

# mount -o remount,compress /dev/sdb3 /test3
[ 1053.679092] BTRFS info (device sdb3): disk space caching is enabled
# mount | grep /test3
/dev/sdb3 on /test3 type btrfs (rw,relatime,compress=zlib,space_cache,subvolid=5,subvol=/)

[[after fix]]

# mount -o compress /dev/sdb3 /test3
[  401.021753] BTRFS info (device sdb3): use zlib compression
[  401.021758] BTRFS info (device sdb3): disk space caching is enabled
[  401.021760] BTRFS: has skinny extents
# mount | grep /test3
/dev/sdb3 on /test3 type btrfs (rw,relatime,compress=zlib,space_cache,subvolid=5,subvol=/)

# mount -o remount,compress-force /dev/sdb3 /test3
[  439.824624] BTRFS info (device sdb3): force zlib compression
[  439.824629] BTRFS info (device sdb3): disk space caching is enabled
# mount | grep /test3
/dev/sdb3 on /test3 type btrfs (rw,relatime,compress-force=zlib,space_cache,subvolid=5,subvol=/)

# mount -o remount,compress /dev/sdb3 /test3
[  459.918430] BTRFS info (device sdb3): use zlib compression
[  459.918434] BTRFS info (device sdb3): disk space caching is enabled
# mount | grep /test3
/dev/sdb3 on /test3 type btrfs (rw,relatime,compress=zlib,space_cache,subvolid=5,subvol=/)

Signed-off-by: Tsutomu Itoh <t-itoh@jp.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2016-01-15 19:25:36 +01:00
Chandan Rajendra
f32e48e925 Btrfs: Initialize btrfs_root->highest_objectid when loading tree root and subvolume roots
The following call trace is seen when btrfs/031 test is executed in a loop,

[  158.661848] ------------[ cut here ]------------
[  158.662634] WARNING: CPU: 2 PID: 890 at /home/chandan/repos/linux/fs/btrfs/ioctl.c:558 create_subvol+0x3d1/0x6ea()
[  158.664102] BTRFS: Transaction aborted (error -2)
[  158.664774] Modules linked in:
[  158.665266] CPU: 2 PID: 890 Comm: btrfs Not tainted 4.4.0-rc6-g511711a #2
[  158.666251] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
[  158.667392]  ffffffff81c0a6b0 ffff8806c7c4f8e8 ffffffff81431fc8 ffff8806c7c4f930
[  158.668515]  ffff8806c7c4f920 ffffffff81051aa1 ffff880c85aff000 ffff8800bb44d000
[  158.669647]  ffff8808863b5c98 0000000000000000 00000000fffffffe ffff8806c7c4f980
[  158.670769] Call Trace:
[  158.671153]  [<ffffffff81431fc8>] dump_stack+0x44/0x5c
[  158.671884]  [<ffffffff81051aa1>] warn_slowpath_common+0x81/0xc0
[  158.672769]  [<ffffffff81051b27>] warn_slowpath_fmt+0x47/0x50
[  158.673620]  [<ffffffff813bc98d>] create_subvol+0x3d1/0x6ea
[  158.674440]  [<ffffffff813777c9>] btrfs_mksubvol.isra.30+0x369/0x520
[  158.675376]  [<ffffffff8108a4aa>] ? percpu_down_read+0x1a/0x50
[  158.676235]  [<ffffffff81377a81>] btrfs_ioctl_snap_create_transid+0x101/0x180
[  158.677268]  [<ffffffff81377b52>] btrfs_ioctl_snap_create+0x52/0x70
[  158.678183]  [<ffffffff8137afb4>] btrfs_ioctl+0x474/0x2f90
[  158.678975]  [<ffffffff81144b8e>] ? vma_merge+0xee/0x300
[  158.679751]  [<ffffffff8115be31>] ? alloc_pages_vma+0x91/0x170
[  158.680599]  [<ffffffff81123f62>] ? lru_cache_add_active_or_unevictable+0x22/0x70
[  158.681686]  [<ffffffff813d99cf>] ? selinux_file_ioctl+0xff/0x1d0
[  158.682581]  [<ffffffff8117b791>] do_vfs_ioctl+0x2c1/0x490
[  158.683399]  [<ffffffff813d3cde>] ? security_file_ioctl+0x3e/0x60
[  158.684297]  [<ffffffff8117b9d4>] SyS_ioctl+0x74/0x80
[  158.685051]  [<ffffffff819b2bd7>] entry_SYSCALL_64_fastpath+0x12/0x6a
[  158.685958] ---[ end trace 4b63312de5a2cb76 ]---
[  158.686647] BTRFS: error (device loop0) in create_subvol:558: errno=-2 No such entry
[  158.709508] BTRFS info (device loop0): forced readonly
[  158.737113] BTRFS info (device loop0): disk space caching is enabled
[  158.738096] BTRFS error (device loop0): Remounting read-write after error is not allowed
[  158.851303] BTRFS error (device loop0): cleaner transaction attach returned -30

This occurs because,

Mount filesystem
Create subvol with ID 257
Unmount filesystem
Mount filesystem
Delete subvol with ID 257
  btrfs_drop_snapshot()
    Add root corresponding to subvol 257 into
    btrfs_transaction->dropped_roots list
Create new subvol (i.e. create_subvol())
  257 is returned as the next free objectid
  btrfs_read_fs_root_no_name()
    Finds the btrfs_root instance corresponding to the old subvol with ID 257
    in btrfs_fs_info->fs_roots_radix.
    Returns error since btrfs_root_item->refs has the value of 0.

To fix the issue the commit initializes tree root's and subvolume root's
highest_objectid when loading the roots from disk.

Signed-off-by: Chandan Rajendra <chandan@linux.vnet.ibm.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2016-01-15 19:25:02 +01:00
Jeff Mahoney
95617d6932 btrfs: cleanup, stop casting for extent_map->lookup everywhere
Overloading extent_map->bdev to struct map_lookup * might have started out
as a means to an end, but it's a pattern that's used all over the place
now. Let's get rid of the casting and just add a union instead.

Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2016-01-15 19:22:28 +01:00
Alex Deucher
7776a69386 drm/amdgpu: Add some tweaks to gfx 8 soft reset
Reviewed-by: Junwei Zhang <Jerry.Zhang@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Ken Wang <Qingqing.Wang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2016-01-15 12:43:22 -05:00
Alex Deucher
e160e4db83 drm/amdgpu: fix tonga smu resume
Need to make sure smu buffers are pinned on resume.  This
matches what Fiji does.

Cc: stable@vger.kernel.org
Reviewed-by: Junwei Zhang <Jerry.Zhang@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Ken Wang <Qingqing.Wang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2016-01-15 12:43:11 -05:00
Felipe Balbi
02f1cc3c1a ARM: omap2plus_defconfig: enable SPLIT and DWARF4
CONFIG_DEBUG_INFO_SPLIT will split debug info on
.dwo files. This will generate a smaller vmlinux and
smaller .ko modules, which will be easier to ship on
certain products.

CONFIG_DEBUG_INFO_DWARF4 will generate debug info in
DWARF4 format, which ends up being larger, but
improves the probability of resolving variables in
optmized code.

Signed-off-by: Felipe Balbi <balbi@ti.com>
[tony@atomide.com: this makes initramfs quite a bit smaller]
Signed-off-by: Tony Lindgren <tony@atomide.com>
2016-01-15 09:07:09 -08:00
H. Nikolaus Schaller
c08659d431 ARM: dts: omap5-board-common: enable rtc and charging of backup battery
tested on OMP5432 EVM

Cc: stable@vger.kernel.org # v4.4
Signed-off-by: H. Nikolaus Schaller <hns@goldelico.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
2016-01-15 09:07:09 -08:00
Paolo Bonzini
171b5682aa Merge branch 'kvm-ppc-next' of git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc into HEAD 2016-01-15 17:49:39 +01:00
Takashi Iwai
c3b1681375 ALSA: timer: Code cleanup
This is a minor code cleanup without any functional changes:
- Kill keep_flag argument from _snd_timer_stop(), as all callers pass
  only it false.
- Remove redundant NULL check in _snd_timer_stop().

Signed-off-by: Takashi Iwai <tiwai@suse.de>
2016-01-15 15:50:34 +01:00
Takashi Iwai
b5a663aa42 ALSA: timer: Harden slave timer list handling
A slave timer instance might be still accessible in a racy way while
operating the master instance as it lacks of locking.  Since the
master operation is mostly protected with timer->lock, we should cope
with it while changing the slave instance, too.  Also, some linked
lists (active_list and ack_list) of slave instances aren't unlinked
immediately at stopping or closing, and this may lead to unexpected
accesses.

This patch tries to address these issues.  It adds spin lock of
timer->lock (either from master or slave, which is equivalent) in a
few places.  For avoiding a deadlock, we ensure that the global
slave_active_lock is always locked at first before each timer lock.

Also, ack and active_list of slave instances are properly unlinked at
snd_timer_stop() and snd_timer_close().

Last but not least, remove the superfluous call of _snd_timer_stop()
at removing slave links.  This is a noop, and calling it may confuse
readers wrt locking.  Further cleanup will follow in a later patch.

Actually we've got reports of use-after-free by syzkaller fuzzer, and
this hopefully fixes these issues.

Reported-by: Dmitry Vyukov <dvyukov@google.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
2016-01-15 15:50:34 +01:00
Takashi Iwai
cf52103a21 ALSA: hda - Add fixup for Dell Latitidue E6540
Another Dell model, another fixup entry: Latitude E6540 needs the same
fixup as other Latitude E series as workaround for noise problems.

Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=104341
Cc: <stable@vger.kernel.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
2016-01-15 14:19:49 +01:00
Thomas Gleixner
98229aa36c x86/irq: Plug vector cleanup race
We still can end up with a stale vector due to the following:

CPU0                          CPU1                      CPU2
lock_vector()
data->move_in_progress=0
sendIPI()                       
unlock_vector()
                              set_affinity()
                              assign_irq_vector()
                              lock_vector()             handle_IPI
                              move_in_progress = 1      lock_vector()
                              unlock_vector()
                                                        move_in_progress == 1

So we need to serialize the vector assignment against a pending cleanup. The
solution is rather simple now. We not only check for the move_in_progress flag
in assign_irq_vector(), we also check whether there is still a cleanup pending
in the old_domain cpumask. If so, we return -EBUSY to the caller and let him
deal with it. Though we have to be careful in the cpu unplug case. If the
cleanout has not yet completed then the following setaffinity() call would
return -EBUSY. Add code which prevents this.

Full context is here: http://lkml.kernel.org/r/5653B688.4050809@stratus.com

Reported-and-tested-by: Joe Lawrence <joe.lawrence@stratus.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Borislav Petkov <bp@alien8.de>
Cc: Jiang Liu <jiang.liu@linux.intel.com>
Cc: Jeremiah Mahler <jmmahler@gmail.com>
Cc: andy.shevchenko@gmail.com
Cc: Guenter Roeck <linux@roeck-us.net>
Cc: stable@vger.kernel.org #4.3+
Link: http://lkml.kernel.org/r/20151231160107.207265407@linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-01-15 13:44:02 +01:00
Thomas Gleixner
90a2282e23 x86/irq: Call irq_force_move_complete with irq descriptor
First of all there is no point in looking up the irq descriptor again, but we
also need the descriptor for the final cleanup race fix in the next
patch. Make that change seperate. No functional difference.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Borislav Petkov <bp@alien8.de>
Tested-by: Joe Lawrence <joe.lawrence@stratus.com>
Cc: Jiang Liu <jiang.liu@linux.intel.com>
Cc: Jeremiah Mahler <jmmahler@gmail.com>
Cc: andy.shevchenko@gmail.com
Cc: Guenter Roeck <linux@roeck-us.net>
Cc: stable@vger.kernel.org #4.3+
Link: http://lkml.kernel.org/r/20151231160107.125211743@linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-01-15 13:44:01 +01:00
Thomas Gleixner
56d7d2f4bb x86/irq: Remove outgoing CPU from vector cleanup mask
We want to synchronize new vector assignments with a pending cleanup. Remove a
dying cpu from a pending cleanup mask.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Borislav Petkov <bp@alien8.de>
Tested-by: Joe Lawrence <joe.lawrence@stratus.com>
Cc: Jiang Liu <jiang.liu@linux.intel.com>
Cc: Jeremiah Mahler <jmmahler@gmail.com>
Cc: andy.shevchenko@gmail.com
Cc: Guenter Roeck <linux@roeck-us.net>
Cc: stable@vger.kernel.org #4.3+
Link: http://lkml.kernel.org/r/20151231160107.045961667@linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-01-15 13:44:01 +01:00
Thomas Gleixner
5da0c1217f x86/irq: Remove the cpumask allocation from send_cleanup_vector()
There is no need to allocate a new cpumask for sending the cleanup vector. The
old_domain mask is now protected by the vector_lock, so we can safely remove
the offline cpus from it and send the IPI with the resulting mask.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Borislav Petkov <bp@alien8.de>
Tested-by: Joe Lawrence <joe.lawrence@stratus.com>
Cc: Jiang Liu <jiang.liu@linux.intel.com>
Cc: Jeremiah Mahler <jmmahler@gmail.com>
Cc: andy.shevchenko@gmail.com
Cc: Guenter Roeck <linux@roeck-us.net>
Cc: stable@vger.kernel.org #4.3+
Link: http://lkml.kernel.org/r/20151231160106.967993932@linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-01-15 13:44:01 +01:00
Thomas Gleixner
c1684f5035 x86/irq: Clear move_in_progress before sending cleanup IPI
send_cleanup_vector() fiddles with the old_domain mask unprotected because it
relies on the protection by the move_in_progress flag. But this is fatal, as
the flag is reset after the IPI has been sent. So a cpu which receives the IPI
can still see the flag set and therefor ignores the cleanup request. If no
other cleanup request happens then the vector stays stale on that cpu and in
case of an irq removal the vector still persists. That can lead to use after
free when the next cleanup IPI happens.

Protect the code with vector_lock and clear move_in_progress before sending
the IPI.

This does not plug the race which Joe reported because:

CPU0                          CPU1                      CPU2
lock_vector()
data->move_in_progress=0
sendIPI()                       
unlock_vector()
                              set_affinity()
                              assign_irq_vector()
                              lock_vector()             handle_IPI
                              move_in_progress = 1      lock_vector()
                              unlock_vector()
                                                        move_in_progress == 1

The full fix comes with a later patch.

Reported-and-tested-by: Joe Lawrence <joe.lawrence@stratus.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Borislav Petkov <bp@alien8.de>
Cc: Jiang Liu <jiang.liu@linux.intel.com>
Cc: Jeremiah Mahler <jmmahler@gmail.com>
Cc: andy.shevchenko@gmail.com
Cc: Guenter Roeck <linux@roeck-us.net>
Cc: stable@vger.kernel.org #4.3+
Link: http://lkml.kernel.org/r/20151231160106.892412198@linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-01-15 13:44:01 +01:00
Thomas Gleixner
847667ef10 x86/irq: Remove offline cpus from vector cleanup
No point of keeping offline cpus in the cleanup mask.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Borislav Petkov <bp@alien8.de>
Tested-by: Joe Lawrence <joe.lawrence@stratus.com>
Cc: Jiang Liu <jiang.liu@linux.intel.com>
Cc: Jeremiah Mahler <jmmahler@gmail.com>
Cc: andy.shevchenko@gmail.com
Cc: Guenter Roeck <linux@roeck-us.net>
Cc: stable@vger.kernel.org #4.3+
Link: http://lkml.kernel.org/r/20151231160106.808642683@linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-01-15 13:44:01 +01:00
Thomas Gleixner
ab25ac0214 x86/irq: Get rid of code duplication
Reusing an existing vector and assigning a new vector has duplicated
code. Consolidate it.

This is also a preparatory patch for finally plugging the cleanup race.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Borislav Petkov <bp@alien8.de>
Tested-by: Joe Lawrence <joe.lawrence@stratus.com>
Cc: Jiang Liu <jiang.liu@linux.intel.com>
Cc: Jeremiah Mahler <jmmahler@gmail.com>
Cc: andy.shevchenko@gmail.com
Cc: Guenter Roeck <linux@roeck-us.net>
Cc: stable@vger.kernel.org #4.3+
Link: http://lkml.kernel.org/r/20151231160106.721599216@linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-01-15 13:44:00 +01:00