Commit Graph

72962 Commits

Author SHA1 Message Date
Linus Torvalds
e35c5a2759 Merge tag 'armsoc-for-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc
Pull ARM SoC fixes from Olof Johansson:
 "Another small set of fixes:

   - some DT compatible typo fixes
   - irq setup fix dealing with irq storms on orion
   - i2c quirk generalization for mvebu
   - a handful of smaller fixes for OMAP
   - a couple of added file patterns for OMAP entries in MAINTAINERS"

* tag 'armsoc-for-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc:
  ARM: at91/dt: Fix sama5d3x typos
  pinctrl: dra: dt-bindings: Fix output pull up/down
  MAINTAINERS: Update entry for omap related .dts files to cover new SoCs
  MAINTAINERS: add more files under OMAP SUPPORT
  ARM: dts: AM437x-SK-EVM: Fix DCDC3 voltage
  ARM: dts: AM437x-GP-EVM: Fix DCDC3 voltage
  ARM: dts: AM43x-EPOS-EVM: Fix DCDC3 voltage
  ARM: dts: am335x-evm: Fix 5th NAND partition's name
  ARM: orion: Fix for certain sequence of request_irq can cause irq storm
  ARM: mvebu: armada xp: Generalize use of i2c quirk
2014-11-16 16:21:57 -08:00
Olof Johansson
f7efdad025 Merge tag 'omap-fixes-against-v3.18-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap into fixes
Merge "omap fixes against v3.18-rc4" from Tony Lindgren:

Few omap fixes for hangs and wrong pinctrl defines, and update
MAINTAINERS file to avoid missing PMIC and SoC related patches:

- Fix random hangs on am437x because of incorrect default
  value for the DDR regulator

- Fix wrong partition name for NAND on am335x-evm

- Fix wrong pinctrl defines for dra7xx

- Update maintainers entries for PMICs and SoCs

* tag 'omap-fixes-against-v3.18-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap:
  pinctrl: dra: dt-bindings: Fix output pull up/down
  MAINTAINERS: Update entry for omap related .dts files to cover new SoCs
  MAINTAINERS: add more files under OMAP SUPPORT
  ARM: dts: AM437x-SK-EVM: Fix DCDC3 voltage
  ARM: dts: AM437x-GP-EVM: Fix DCDC3 voltage
  ARM: dts: AM43x-EPOS-EVM: Fix DCDC3 voltage
  ARM: dts: am335x-evm: Fix 5th NAND partition's name

Signed-off-by: Olof Johansson <olof@lixom.net>
2014-11-16 15:09:53 -08:00
Eric Dumazet
b9d1ab7eb4 mlx4: use netdev_rss_key_fill() helper
Use of well known RSS key increases attack surface.
Switch to a random one, using generic helper so that all
ports share a common key.

Also provide ethtool -x support to fetch RSS key

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Amir Vadai <amirv@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-16 15:59:13 -05:00
Eric Dumazet
960fb622f8 net: provide a per host RSS key generic infrastructure
RSS (Receive Side Scaling) typically uses Toeplitz hash and a 40 or 52 bytes
RSS key.

Some drivers use a constant (and well known key), some drivers use a random
key per port, making bonding setups hard to tune. Well known keys increase
attack surface, considering that number of queues is usually a power of two.

This patch provides infrastructure to help drivers doing the right thing.

netdev_rss_key_fill() should be used by drivers to initialize their RSS key,
even if they provide ethtool -X support to let user redefine the key later.

A new /proc/sys/net/core/netdev_rss_key file can be used to get the host
RSS key even for drivers not providing ethtool -x support, in case some
applications want to precisely setup flows to match some RX queues.

Tested:

myhost:~# cat /proc/sys/net/core/netdev_rss_key
11:63:99:bb:79:fb:a5:a7:07:45:b2:20:bf:02:42:2d:08:1a:dd:19:2b:6b:23:ac:56:28:9d:70:c3:ac:e8:16:4b:b7:c1:10:53:a4:78:41:36:40:74:b6:15:ca:27:44:aa:b3:4d:72

myhost:~# ethtool -x eth0
RX flow hash indirection table for eth0 with 8 RX ring(s):
    0:      0     1     2     3     4     5     6     7
RSS hash key:
11:63:99:bb:79:fb:a5:a7:07:45:b2:20:bf:02:42:2d:08:1a:dd:19:2b:6b:23:ac:56:28:9d:70:c3:ac:e8:16:4b:b7:c1:10:53:a4:78:41

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-16 15:59:11 -05:00
David S. Miller
f1227c5c1b Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf
Pablo Neira Ayuso says:

====================
Netfilter/IPVS fixes for net

The following patchset contains Netfilter updates for your net tree,
they are:

1) Fix missing initialization of the range structure (allocated in the
   stack) in nft_masq_{ipv4, ipv6}_eval, from Daniel Borkmann.

2) Make sure the data we receive from userspace contains the req_version
   structure, otherwise return an error incomplete on truncated input.
   From Dan Carpenter.

3) Fix handling og skb->sk which may cause incorrect handling
   of connections from a local process. Via Simon Horman, patch from
   Calvin Owens.

4) Fix wrong netns in nft_compat when setting target and match params
   structure.

5) Relax chain type validation in nft_compat that was recently included,
   this broke the matches that need to be run from the route chain type.
   Now iptables-test.py automated regression tests report success again
   and we avoid the only possible problematic case, which is the use of
   nat targets out of nat chain type.

6) Use match->table to validate the tablename, instead of the match->name.
   Again patch for nft_compat.

7) Restore the synchronous release of objects from the commit and abort
   path in nf_tables. This is causing two major problems: splats when using
   nft_compat, given that matches and targets may sleep and call_rcu is
   invoked from softirq context. Moreover Patrick reported possible event
   notification reordering when rules refer to anonymous sets.

8) Fix race condition in between packets that are being confirmed by
   conntrack and the ctnetlink flush operation. This happens since the
   removal of the central spinlock. Thanks to Jesper D. Brouer to looking
   into this.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-16 14:23:56 -05:00
Peter Zijlstra
2565711fb7 perf: Improve the perf_sample_data struct layout
This patch reorders fields in the perf_sample_data struct in order to
minimize the number of cachelines touched in perf_sample_data_init().
It also removes some intializations which are redundant with the code
in kernel/events/core.c

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1411559322-16548-7-git-send-email-eranian@google.com
Cc: cebbert.lkml@gmail.com
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: jolsa@redhat.com
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-11-16 11:42:04 +01:00
Stephane Eranian
60e2364e60 perf: Add ability to sample machine state on interrupt
Enable capture of interrupted machine state for each sample.

Registers to sample are passed per event in the sample_regs_intr bitmask.

To sample interrupt machine state, the PERF_SAMPLE_INTR_REGS must be passed in
sample_type.

The list of available registers is arch dependent and provided by asm/perf_regs.h

Registers are laid out as u64 in the order of the bit order of sample_intr_regs.

This patch also adds a new ABI version PERF_ATTR_SIZE_VER4 because we extend
the perf_event_attr struct with a new u64 field.

Reviewed-by: Jiri Olsa <jolsa@redhat.com>
Signed-off-by: Stephane Eranian <eranian@google.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: cebbert.lkml@gmail.com
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: linux-api@vger.kernel.org
Link: http://lkml.kernel.org/r/1411559322-16548-2-git-send-email-eranian@google.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-11-16 11:41:57 +01:00
Kirill Tkhai
d8b163c4c6 sched/numa: Init numa balancing fields of init_task
We do not initialize init_task.numa_preferred_nid,
but this value is inherited by userspace "init"
process:

rest_init()->kernel_thread(kernel_init)->do_fork(CLONE_VM);

__sched_fork()
{
	if (clone_flags & CLONE_VM)
		p->numa_preferred_nid = current->numa_preferred_nid;
	else
		p->numa_preferred_nid = -1;
}

kernel_init() becomes userspace "init" process.

So, we propagate garbage nid to userspace, and it may be used
during numa balancing.

Currently, we do not have reports about this brings a problem,
but it seem we should set it for sure.

Even if init_task.numa_preferred_nid is zero, we may meet a weird
configuration without nid#0. On sparc64, where processors are
numbered physically, I saw a machine without cpu#1, while cpu#2
existed. Possible, something similar may be with numa nodes.
So, let's initialize it and be sure we're safe.

Signed-off-by: Kirill Tkhai <ktkhai@parallels.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Eric Paris <eparis@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Sergey Dyasly <dserrg@gmail.com>
Link: http://lkml.kernel.org/r/1415699189.15631.6.camel@tkhai
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-11-16 10:59:01 +01:00
Chen Hanxiao
f622b429da sched: Update comments about CLONE_NEWUTS and CLONE_NEWIPC
Remove question mark:

 s/New utsname group?/New utsname namespace

Unified style for IPC:

 s/New ipcs/New ipc namespace

Signed-off-by: Chen Hanxiao <chenhanxiao@cn.fujitsu.com>
Acked-by: Serge E. Hallyn <serge.hallyn@ubuntu.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Jiri Kosina <trivial@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: linux-api@vger.kernel.org
Link: http://lkml.kernel.org/r/1415091082-15093-1-git-send-email-chenhanxiao@cn.fujitsu.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-11-16 10:58:53 +01:00
Ingo Molnar
e9ac5f0fa8 Merge branch 'sched/urgent' into sched/core, to pick up fixes before applying more changes
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-11-16 10:50:25 +01:00
Peter Zijlstra
23cfa361f3 sched/cputime: Fix cpu_timer_sample_group() double accounting
While looking over the cpu-timer code I found that we appear to add
the delta for the calling task twice, through:

  cpu_timer_sample_group()
    thread_group_cputimer()
      thread_group_cputime()
        times->sum_exec_runtime += task_sched_runtime();

    *sample = cputime.sum_exec_runtime + task_delta_exec();

Which would make the sample run ahead, making the sleep short.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Stanislaw Gruszka <sgruszka@redhat.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Tejun Heo <tj@kernel.org>
Link: http://lkml.kernel.org/r/20141112113737.GI10476@twins.programming.kicks-ass.net
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-11-16 10:04:18 +01:00
Maxime COQUELIN
00b4d9a141 bitops: Fix shift overflow in GENMASK macros
On some 32 bits architectures, including x86, GENMASK(31, 0) returns 0
instead of the expected ~0UL.

This is the same on some 64 bits architectures with GENMASK_ULL(63, 0).

This is due to an overflow in the shift operand, 1 << 32 for GENMASK,
1 << 64 for GENMASK_ULL.

Reported-by: Eric Paire <eric.paire@st.com>
Suggested-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Signed-off-by: Maxime Coquelin <maxime.coquelin@st.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: <stable@vger.kernel.org> # v3.13+
Cc: linux@rasmusvillemoes.dk
Cc: gong.chen@linux.intel.com
Cc: John Sullivan <jsrhbz@kanargh.force9.co.uk>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Theodore Ts'o <tytso@mit.edu>
Fixes: 10ef6b0dff ("bitops: Introduce a more generic BITMASK macro")
Link: http://lkml.kernel.org/r/1415267659-10563-1-git-send-email-maxime.coquelin@st.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-11-16 09:55:39 +01:00
Linus Torvalds
ec7de6567f Merge tag 'for-v3.18-rc' of git://git.infradead.org/battery-2.6
Pull power supply updates from Sebastian Reichel:
 "Power supply and reset changes for the v3.18-rc:

   - misc. charger-manager fixes
   - year 2038 fix in ab8500_fg
   - fix error handling of bq2415x_charger"

* tag 'for-v3.18-rc' of git://git.infradead.org/battery-2.6:
  power: charger-manager: Fix accessing invalidated power supply after charger unbind
  power: charger-manager: Fix accessing invalidated power supply after fuel gauge unbind
  power: charger-manager: Avoid recursive thermal get_temp call
  power_supply: Add no_thermal property to prevent recursive get_temp calls
  power: bq2415x_charger: Fix memory leak on DTS parsing error
  power: bq2415x_charger: Properly handle ENODEV from power_supply_get_by_phandle
  power: ab8500_fg.c: use 64-bit time types
2014-11-15 15:22:51 -08:00
Linus Torvalds
1afcb6ed0d Merge tag 'nfs-for-3.18-3' of git://git.linux-nfs.org/projects/trondmy/linux-nfs
Pull NFS client bugfixes from Trond Myklebust:
 "Highlights include:

   - stable patches to fix NFSv4.x delegation reclaim error paths
   - fix a bug whereby we were advertising NFSv4.1 but using NFSv4.2
     features
   - fix a use-after-free problem with pNFS block layouts
   - fix a memory leak in the pNFS files O_DIRECT code
   - replace an intrusive and Oops-prone performance fix in the NFSv4
     atomic open code with a safer one-line version and revert the two
     original patches"

* tag 'nfs-for-3.18-3' of git://git.linux-nfs.org/projects/trondmy/linux-nfs:
  sunrpc: fix sleeping under rcu_read_lock in gss_stringify_acceptor
  NFS: Don't try to reclaim delegation open state if recovery failed
  NFSv4: Ensure that we call FREE_STATEID when NFSv4.x stateids are revoked
  NFSv4: Fix races between nfs_remove_bad_delegation() and delegation return
  NFSv4.1: nfs41_clear_delegation_stateid shouldn't trust NFS_DELEGATED_STATE
  NFSv4: Ensure that we remove NFSv4.0 delegations when state has expired
  NFS: SEEK is an NFS v4.2 feature
  nfs: Fix use of uninitialized variable in nfs_getattr()
  nfs: Remove bogus assignment
  nfs: remove spurious WARN_ON_ONCE in write path
  pnfs/blocklayout: serialize GETDEVICEINFO calls
  nfs: fix pnfs direct write memory leak
  Revert "NFS: nfs4_do_open should add negative results to the dcache."
  Revert "NFS: remove BUG possibility in nfs4_open_and_get_state"
  NFSv4: Ensure nfs_atomic_open set the dentry verifier on ENOENT
2014-11-15 14:15:16 -08:00
Cristina Ciocan
ccf54555da iio: Fix IIO_EVENT_CODE_EXTRACT_DIR bit mask
The direction field is set on 7 bits, thus we need to AND it with 0111 111 mask
in order to retrieve it, that is 0x7F, not 0xCF as it is now.

Fixes: ade7ef7ba (staging:iio: Differential channel handling)
Signed-off-by: Cristina Ciocan <cristina.ciocan@intel.com>
Cc: <Stable@vger.kernel.org>
Signed-off-by: Jonathan Cameron <jic23@kernel.org>
2014-11-15 16:04:56 +00:00
Johan Hedberg
adae20cb2d Bluetooth: Convert IRK list to RCU
This patch set converts the hdev->identity_resolving_keys list to use
RCU to eliminate the need to use hci_dev_lock/unlock.

An additional change that must be done is to remove use of
CRYPTO_ALG_ASYNC for the hdev-specific AES crypto context. The reason is
that this context is used for matching RPAs and the loop that does the
matching is under the RCU read lock, i.e. is an atomic section which
cannot sleep.

Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2014-11-15 01:53:27 +01:00
Johan Hedberg
970d0f1b28 Bluetooth: Convert LTK list to RCU
This patch set converts the hdev->long_term_keys list to use RCU to
eliminate the need to use hci_dev_lock/unlock.

Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2014-11-15 01:53:27 +01:00
Dave Airlie
ca5a71de48 Merge tag 'drm/gem-cma/for-3.19-rc1' of git://people.freedesktop.org/~tagr/linux into drm-next
drm: Sanitize DRM_IOCTL_MODE_CREATE_DUMB input

Some drivers erroneously treat the .pitch and .size fields of struct
drm_mode_create_dumb as inputs. While the include/uapi/drm/drm_mode.h
header has a comment denoting them as outputs, that seemingly wasn't
enough to make drivers use them properly.

The result is that some userspace doesn't explicitly zero out those
fields, assuming that the kernel won't use them. That causes problems
since the data within the structure might be uninitialized, so bogus
data may end up confusing drivers (ridiculously large values for the
pitch, ...).

This series attempts to improve the situation by fixing all drivers to
not use the output fields. Furthermore to spare new drivers this bad
surprise, the DRM core now zeros out these fields prior to handing the
data structure to the driver.

Lessons learned from this are that future IOCTLs should be properly
documented (in the DRM DocBook for example) and should be rigorously
defined. To prevent misuse like this, userspace should be required to
zero out all output fields. The kernel should check for this and fail
if that's not the case.

* tag 'drm/gem-cma/for-3.19-rc1' of git://people.freedesktop.org/~tagr/linux:
  drm/cma: Remove call to drm_gem_free_mmap_offset()
  drm: Sanitize DRM_IOCTL_MODE_CREATE_DUMB input
  drm/rcar: gem: dumb: pitch is an output
  drm/omap: gem: dumb: pitch is an output
  drm/cma: Introduce drm_gem_cma_dumb_create_internal()
  drm/doc: Add GEM/CMA helpers to kerneldoc
  drm/doc: mm: Fix indentation
  drm/gem: Fix a few kerneldoc typos
2014-11-15 09:50:21 +10:00
Dave Airlie
5bb2bbf596 drm: add properties for suggested x/y offset for connectors. (v2)
Virtual GPUs would like to give the guest some indication where on the screen
the outputs are layed out. So far we only provide modes, these
properties could be exposed to userspace so the desktop environment
could use them as hints to set the correct offsets.

v2: rename properties to be more consistent.

Signed-off-by: Dave Airlie <airlied@redhat.com>
2014-11-15 09:43:23 +10:00
Dave Airlie
b0654103f5 Merge tag 'drm/tegra/for-3.19-rc1' of git://people.freedesktop.org/~tagr/linux into drm-next
drm/tegra: Changes for v3.19-rc1

The highlights in this pull request are:

  * IOMMU support: The Tegra DRM driver can now deal with discontiguous
    buffers if an IOMMU exists in the system. That means it can allocate
    using drm_gem_get_pages() and will map them into IOVA space via the
    IOMMU API. Similarly, non-contiguous PRIME buffers can be imported
    from a different driver, which allows better integration with gk20a
    (nouveau) and less hacks.

  * Universal planes: This is precursory work for atomic modesetting and
    will allow hardware cursor support to be implemented on pre-Tegra114
    where RGB cursors were not supported.

  * DSI ganged-mode support: The DSI controller can now gang up with a
    second DSI controller to drive high resolution DSI panels.

Besides those bigger changes there is a slew of fixes, cleanups, plugged
memory leaks and so on.

* tag 'drm/tegra/for-3.19-rc1' of git://people.freedesktop.org/~tagr/linux: (44 commits)
  drm/tegra: gem: Check before freeing CMA memory
  drm/tegra: fb: Add error codes to error messages
  drm/tegra: fb: Properly release GEM objects on failure
  drm/tegra: Detach panel when a connector is removed
  drm/tegra: Plug memory leak
  drm/tegra: gem: Use more consistent data types
  drm/tegra: fb: Do not destroy framebuffer
  drm/tegra: gem: dumb: pitch and size are outputs
  drm/tegra: Enable the hotplug interrupt only when necessary
  drm/tegra: dc: Universal plane support
  drm/tegra: dc: Registers are 32 bits wide
  drm/tegra: dc: Factor out DC, window and cursor commit
  drm/tegra: Add IOMMU support
  drm/tegra: Fix error handling cleanup
  drm/tegra: gem: Use dma_mmap_writecombine()
  drm/tegra: gem: Remove redundant drm_gem_free_mmap_offset()
  drm/tegra: gem: Cleanup tegra_bo_create_with_handle()
  drm/tegra: gem: Extract tegra_bo_alloc_object()
  drm/tegra: dsi: Set up PHY_TIMING & BTA_TIMING registers earlier
  drm/tegra: dsi: Replace 1000000 by USEC_PER_SEC
  ...
2014-11-15 09:38:55 +10:00
Dave Airlie
4fb2ac6ebe Merge tag 'drm/fixes/for-3.19-rc1' of git://people.freedesktop.org/~tagr/linux into drm-next
drm: Miscellaneous fixes for v3.19-rc1

This is a small collection of fixes that I've been carrying around for a
while now. Many of these have been posted and reviewed or acked. The few
that haven't I deemed too trivial to bother.

* tag 'drm/fixes/for-3.19-rc1' of git://people.freedesktop.org/~tagr/linux:
  video/hdmi: Relicense header under MIT license
  drm/gma500: mdfld: Reuse video/mipi_display.h
  drm: Make drm_mode_create_tv_properties() signature consistent
  drm: Implement drm_get_pci_dev() dummy for !PCI
  drm/prime: Use unsigned type for number of pages
  drm/gem: Fix typo in kerneldoc
  drm: Use const data when creating blob properties
  drm: Use size_t for blob property sizes
2014-11-15 09:37:20 +10:00
Dave Airlie
8aa3dc3c17 Merge tag 'drm/panel/for-3.19-rc1' of git://people.freedesktop.org/~tagr/linux into drm-next
drm/panel: Changes for v3.19-rc1

This contains support for a couple of new panels, updates for some GPIO
API changes and a bunch of updates to the MIPI DSI support that should
make it easier to write panel drivers in the future.

* tag 'drm/panel/for-3.19-rc1' of git://people.freedesktop.org/~tagr/linux: (31 commits)
  drm/panel: Add Sharp LQ101R1SX01 support
  drm/dsi: Do not require .owner field to be set
  drm/dsi: Resolve MIPI DSI device from phandle
  drm/dsi: Implement DCS set_{column,page}_address commands
  drm/dsi: Implement DCS {get,set}_pixel_format commands
  drm/dsi: Implement DCS get_power_mode command
  drm/dsi: Implement DCS soft_reset command
  drm/dsi: Implement DCS nop command
  drm/dsi: Add to DocBook documentation
  drm/dsi: Implement some standard DCS commands
  drm/dsi: Implement generic read and write commands
  drm/panel: s6e8aa0: Use standard MIPI DSI function
  drm/dsi: Add mipi_dsi_set_maximum_return_packet_size() helper
  drm/dsi: Constify mipi_dsi_msg
  drm/dsi: Make mipi_dsi_dcs_{read,write}() symmetrical
  drm/dsi: Add DSI transfer helper
  drm/dsi: Add message to packet translator
  drm/dsi: Introduce packet format helpers
  drm/panel: s6e8aa0: Fix build warnings on 64-bit
  drm/panel: ld9040: Fix build warnings on 64-bit
  ...
2014-11-15 09:36:13 +10:00
Dave Airlie
fd172d0c47 Merge tag 'drm-intel-next-2014-11-07-fixups' of git://anongit.freedesktop.org/drm-intel into drm-next
- skl watermarks code (Damien, Vandana, Pradeep)
- reworked audio codec /eld handling code (Jani)
- rework the mmio_flip code to use the vblank evade logic and wait for rendering
  using the standard wait_seqno interface (Ander)
- skl forcewake support (Zhe Wang)
- refactor the chv interrupt code to use functions shared with vlv (Ville)
- prep work for different global gtt views (Tvrtko Ursulin)
- precompute the display PLL config before touching hw state (Ander)
- completely reworked panel power sequencer code for chv/vlv (Ville)
- pre work to split the plane update code into a prepare and commit phase
  (Gustavo Padovan)
- golden context for skl (Armin Reese)
- as usual tons of fixes and improvements all over

* tag 'drm-intel-next-2014-11-07-fixups' of git://anongit.freedesktop.org/drm-intel: (135 commits)
  drm/i915: Use correct pipe config to update pll dividers. V2
  drm/i915: Plug memory leak in intel_shared_dpll_start_config()
  drm/i915: Update DRIVER_DATE to 20141107
  drm/i915: Add gen to the gpu hang ecode
  drm/i915: Cache HPLL frequency on VLV/CHV
  Revert "drm/i915/vlv: Remove check for Old Ack during forcewake"
  drm/i915: Make mmio flip wait for seqno in the work function
  drm/i915: Make __wait_seqno non-static and rename to __i915_wait_seqno
  drm/i915: Move the .global_resources() hook call into modeset_update_crtc_power_domains()
  drm/i915/audio: add DOC comment describing HDA over HDMI/DP
  drm/i915: make pipe/port based audio valid accessors easier to use
  drm/i915/audio: add audio codec enable debug log for g4x
  drm/i915/audio: add audio codec disable on g4x
  drm/i915: enable audio codec after port
  drm/i915/audio: add vlv/chv/gen5-7 audio codec disable sequence
  drm/i915/audio: rewrite vlv/chv and gen 5-7 audio codec enable sequence
  drm/i915/skl: Enable Gen9 RC6
  drm/i915/skl: Gen9 Forcewake
  drm/i915/skl: Log the order in which we flush the pipes in the WM code
  drm/i915/skl: Flush the WM configuration
  ...
2014-11-15 09:33:40 +10:00
Boris BREZILLON
d7f8db5300 drm: flip-work: change drm_flip_work_init prototype
Now that we're using lists instead of kfifo to store drm flip-work tasks
we do not need the size parameter passed to drm_flip_work_init function
anymore.
Moreover this function cannot fail anymore, we can thus remove the return
code.

Modify drm_flip_work_init users to take account of these changes.

[airlied: fixed two unused variable warnings]

Signed-off-by: Boris BREZILLON <boris.brezillon@free-electrons.com>
Reviewed-by: Rob Clark <robdclark@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2014-11-15 09:29:14 +10:00
Boris BREZILLON
8bd4ae2028 drm: rework flip-work helpers to avoid calling func when the FIFO is full
Make use of lists instead of kfifo in order to dynamically allocate
task entry when someone require some delayed work, and thus preventing
drm_flip_work_queue from directly calling func instead of queuing this
call.
This allow drm_flip_work_queue to be safely called even within irq
handlers.

Add new helper functions to allocate a flip work task and queue it when
needed. This prevents allocating data within irq context (which might
impact the time spent in the irq handler).

Signed-off-by: Boris BREZILLON <boris.brezillon@free-electrons.com>
Reviewed-by: Rob Clark <robdclark@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2014-11-15 09:25:35 +10:00
Jacek Anaszewski
4d71a4a12b leds: Add support for setting brightness in a synchronous way
There are use cases when setting a LED brightness has to
have immediate effect (e.g. setting a torch LED brightness).
This patch extends LED subsystem to support such operations.
The LED subsystem internal API __led_set_brightness is changed
to led_set_brightness_async and new led_set_brightness_sync API
is added.

Signed-off-by: Jacek Anaszewski <j.anaszewski@samsung.com>
Acked-by: Kyungmin Park <kyungmin.park@samsung.com>
Cc: Richard Purdie <rpurdie@rpsys.net>
Signed-off-by: Bryan Wu <cooloney@gmail.com>
2014-11-14 14:29:35 -08:00
Jacek Anaszewski
acd899e4f3 leds: implement sysfs interface locking mechanism
Add a mechanism for locking LED subsystem sysfs interface.
This patch prepares ground for addition of LED Flash Class
extension, whose API will be integrated with V4L2 Flash API.
Such a fusion enforces introducing a locking scheme, which
will secure consistent access to the LED Flash Class device.

The mechanism being introduced allows for disabling LED
subsystem sysfs interface by calling led_sysfs_disable function
and enabling it by calling led_sysfs_enable. The functions
alter the LED_SYSFS_DISABLE flag state and must be called
under mutex lock. The state of the lock is checked with use
of led_sysfs_is_disabled function. Such a design allows for
providing immediate feedback to the user space on whether
the LED Flash Class device is available or is under V4L2 Flash
sub-device control.

Signed-off-by: Jacek Anaszewski <j.anaszewski@samsung.com>
Acked-by: Kyungmin Park <kyungmin.park@samsung.com>
Cc: Richard Purdie <rpurdie@rpsys.net>
Signed-off-by: Bryan Wu <cooloney@gmail.com>
2014-11-14 14:29:35 -08:00
Joe Stringer
23e62de33d net: Add vxlan_gso_check() helper
Most NICs that report NETIF_F_GSO_UDP_TUNNEL support VXLAN, and not
other UDP-based encapsulation protocols where the format and size of the
header differs. This patch implements a generic ndo_gso_check() for
VXLAN which will only advertise GSO support when the skb looks like it
contains VXLAN (or no UDP tunnelling at all).

Implementation shamelessly stolen from Tom Herbert:
http://thread.gmane.org/gmane.linux.network/332428/focus=333111

Signed-off-by: Joe Stringer <joestringer@nicira.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-14 17:12:48 -05:00
Vincent BENAYOUN
84bc88688e inetdevice: fixed signed integer overflow
There could be a signed overflow in the following code.

The expression, (32-logmask) is comprised between 0 and 31 included.
It may be equal to 31.
In such a case the left shift will produce a signed integer overflow.
According to the C99 Standard, this is an undefined behavior.
A simple fix is to replace the signed int 1 with the unsigned int 1U.

Signed-off-by: Vincent BENAYOUN <vincent.benayoun@trust-in-soft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-14 17:08:58 -05:00
Linus Torvalds
78646f62db Merge tag 'pm+acpi-3.18-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull ACPI and power management fixes from Rafael Wysocki:
 "These are three regression fixes, two recent (generic power domains,
  suspend-to-idle) and one older (cpufreq), an ACPI blacklist entry for
  one more machine having problems with Windows 8 compatibility, a minor
  cpufreq driver fix (cpufreq-dt) and a fixup for new callback
  definitions (generic power domains).

  Specifics:

   - Fix a crash in the suspend-to-idle code path introduced by a recent
     commit that forgot to check a pointer against NULL before
     dereferencing it (Dmitry Eremin-Solenikov).

   - Fix a boot crash on Exynos5 introduced by a recent commit making
     that platform use generic Device Tree bindings for power domains
     which exposed a weakness in the generic power domains framework
     leading to that crash (Ulf Hansson).

   - Fix a crash during system resume on systems where cpufreq depends
     on Operation Performance Points (OPP) for functionality, but
     CONFIG_OPP is not set.  This leads the cpufreq driver registration
     to fail, but the resume code attempts to restore the pre-suspend
     cpufreq configuration (which does not exist) nevertheless and
     crashes.  From Geert Uytterhoeven.

   - Add a new ACPI blacklist entry for Dell Vostro 3546 that has
     problems if it is reported as Windows 8 compatible to the BIOS
     (Adam Lee).

   - Fix swapped arguments in an error message in the cpufreq-dt driver
     (Abhilash Kesavan).

   - Fix up the prototypes of new callbacks in struct generic_pm_domain
     to make them more useful.  Users of those callbacks will be added
     in 3.19 and it's better for them to be based on the correct struct
     definition in mainline from the start.  From Ulf Hansson and Kevin
     Hilman"

* tag 'pm+acpi-3.18-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  PM / Domains: Fix initial default state of the need_restore flag
  PM / sleep: Fix entering suspend-to-IDLE if no freeze_oops is set
  PM / Domains: Change prototype for the attach and detach callbacks
  cpufreq: Avoid crash in resume on SMP without OPP
  cpufreq: cpufreq-dt: Fix arguments in clock failure error message
  ACPI / blacklist: blacklist Win8 OSI for Dell Vostro 3546
2014-11-14 13:38:02 -08:00
Jay Vosburgh
a77f9c5dcd Revert "fast_hash: avoid indirect function calls"
This reverts commit e5a2c89995.

	Commit e5a2c899 introduced an alternative_call, arch_fast_hash2,
that selects between __jhash2 and __intel_crc4_2_hash based on the
X86_FEATURE_XMM4_2.

	Unfortunately, the alternative_call system does not appear to be
suitable for use with C functions, as register usage is not handled
properly for the called functions.  The __jhash2 function in particular
clobbers registers that are not preserved when called via
alternative_call, resulting in a panic for direct callers of
arch_fast_hash2 on older CPUs lacking sse4_2.  It is possible that
__intel_crc4_2_hash works merely by chance because it uses fewer
registers.

	This commit was suggested as the source of the problem by Jesse
Gross <jesse@nicira.com>.

Signed-off-by: Jay Vosburgh <jay.vosburgh@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-14 16:36:25 -05:00
Sebastian Reichel
bf2b892e66 [media] si4713: cleanup platform data
Remove unreferenced members from the platform
data's structure.

Signed-off-by: Sebastian Reichel <sre@kernel.org>
Signed-off-by: Hans Verkuil <hans.verkuil@cisco.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@osg.samsung.com>
2014-11-14 18:14:33 -02:00
Sebastian Reichel
98bea620c7 [media] si4713: add device tree support
Add device tree support by changing the device registration order.
In the device tree the si4713 node is a normal I2C device, which
will be probed as such. Thus the V4L device must be probed from
the I2C device and not the other way around.

Signed-off-by: Sebastian Reichel <sre@kernel.org>
Signed-off-by: Hans Verkuil <hans.verkuil@cisco.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@osg.samsung.com>
2014-11-14 18:08:19 -02:00
Boris BREZILLON
aa1b4da6af [media] v4l: Forbid usage of V4L2_MBUS_FMT definitions inside the kernel
Place v4l2_mbus_pixelcode in a #ifndef __KERNEL__ section so that kernel
users don't have access to these definitions.

We have to keep this definition for user-space users even though they're
encouraged to move to the new media_bus_format enum.

Signed-off-by: Boris Brezillon <boris.brezillon@free-electrons.com>
Acked-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Signed-off-by: Hans Verkuil <hans.verkuil@cisco.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@osg.samsung.com>
2014-11-14 17:56:18 -02:00
Boris BREZILLON
27ffaeb0ab [media] platform: Make use of media_bus_format enum
In order to have subsytem agnostic media bus format definitions we've
moved media bus definition to include/uapi/linux/media-bus-format.h and
prefixed values with MEDIA_BUS_FMT instead of V4L2_MBUS_FMT.

Reference new definitions in all platform drivers.

Signed-off-by: Boris Brezillon <boris.brezillon@free-electrons.com>
Acked-by: Hans Verkuil <hans.verkuil@cisco.com>
Acked-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Acked-by: Sekhar Nori <nsekhar@ti.com>
Acked-by: Lad, Prabhakar <prabhakar.csengg@gmail.com>
Signed-off-by: Hans Verkuil <hans.verkuil@cisco.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@osg.samsung.com>
2014-11-14 17:54:08 -02:00
Boris BREZILLON
32b32ce84a [media] Make use of the new media_bus_format definitions
Replace references to the v4l2_mbus_pixelcode enum with the new
media_bus_format enum in all common headers.

Signed-off-by: Boris Brezillon <boris.brezillon@free-electrons.com>
Acked-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Acked-by: Hans Verkuil <hans.verkuil@cisco.com>
Signed-off-by: Hans Verkuil <hans.verkuil@cisco.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@osg.samsung.com>
2014-11-14 17:51:18 -02:00
Boris BREZILLON
edcf58bc03 [media] Move mediabus format definition to a more standard place
Define MEDIA_BUS_FMT macros (re-using the values defined in the
v4l2_mbus_pixelcode enum) into a separate header file so that they can be
used from the DRM/KMS subsystem without any reference to the V4L2
subsystem.

Then set V4L2_MBUS_FMT definitions to the MEDIA_BUS_FMT values using the
V4L2_MBUS_FROM_MEDIA_BUS_FMT macro.

Signed-off-by: Boris Brezillon <boris.brezillon@free-electrons.com>
Acked-by: Guennadi Liakhovetski <g.liakhovetski@gmx.de>
Acked-by: Hans Verkuil <hans.verkuil@cisco.com>
Acked-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Signed-off-by: Hans Verkuil <hans.verkuil@cisco.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@osg.samsung.com>
2014-11-14 17:47:47 -02:00
Tony Lindgren
9889278181 Merge tag 'for-v3.19/omap-a' of git://git.kernel.org/pub/scm/linux/kernel/git/pjw/omap-pending into omap-for-v3.19/soc
Some OMAP clock/hwmod patches for v3.19.

Most of the patches are clock-related.  The DPLL implementation is
changed to better align to the common clock framework.
There is also a patch that removes a few lines from the hwmod code -
this patch should have no functional effect.

Basic build, boot, and PM test logs for these patches can be found here:

http://www.pwsan.com/omap/testlogs/omap-a-for-v3.19/20141113094101/
2014-11-14 10:25:12 -08:00
Antonios Motakis
c49866493b iommu: add capability IOMMU_CAP_NOEXEC
Some IOMMUs accept an IOMMU_NOEXEC protection flag in addition to
IOMMU_READ and IOMMU_WRITE. Expose this as an IOMMU capability.

Signed-off-by: Antonios Motakis <a.motakis@virtualopensystems.com>
Acked-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2014-11-14 14:41:38 +00:00
Antonios Motakis
a720b41c41 iommu/arm-smmu: change IOMMU_EXEC to IOMMU_NOEXEC
Exposing the XN flag of the SMMU driver as IOMMU_NOEXEC instead of
IOMMU_EXEC makes it enforceable, since for IOMMUs that don't support
the XN flag pages will always be executable.

Signed-off-by: Antonios Motakis <a.motakis@virtualopensystems.com>
Acked-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2014-11-14 14:41:38 +00:00
Rafael J. Wysocki
31689497d9 Merge branches 'pm-domains', 'pm-sleep' and 'pm-cpufreq'
* pm-domains:
  PM / Domains: Fix initial default state of the need_restore flag
  PM / Domains: Change prototype for the attach and detach callbacks

* pm-sleep:
  PM / sleep: Fix entering suspend-to-IDLE if no freeze_oops is set

* pm-cpufreq:
  cpufreq: Avoid crash in resume on SMP without OPP
  cpufreq: cpufreq-dt: Fix arguments in clock failure error message
2014-11-14 15:17:32 +01:00
Hans de Goede
6d09dc6b74 of.h: Keep extern declaration of of_* variables when !CONFIG_OF
Keep the extern declaration of of_allnodes and friends, when building without
of support, this way code using them can be written like this:

	if (IS_ENABLED(CONFIG_OF_PLATFORM) && of_chosen) {
		for_each_child_of_node(of_chosen, np)
			...
	}

And rely on the compiler optimizing it away, avoiding the need for #ifdef-ery.

Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Tomi Valkeinen <tomi.valkeinen@ti.com>
2014-11-14 15:28:58 +02:00
Chris Wilson
6a2c4232ec drm/i915: Make the physical object coherent with GTT
Currently objects for which the hardware needs a contiguous physical
address are allocated a shadow backing storage to satisfy the contraint.
This shadow buffer is not wired into the normal obj->pages and so the
physical object is incoherent with accesses via the GPU, GTT and CPU. By
setting up the appropriate scatter-gather table, we can allow userspace
to access the physical object via either a GTT mmaping of or by rendering
into the GEM bo. However, keeping the CPU mmap of the shmemfs backing
storage coherent with the contiguous shadow is not yet possible.
Fortuituously, CPU mmaps of objects requiring physical addresses are not
expected to be coherent anyway.

This allows the physical constraint of the GEM object to be transparent
to userspace and allow it to efficiently render into or update them via
the GTT and GPU.

v2: Fix leak of pci handle spotted by Ville
v3: Remove the now duplicate call to detach_phys_object during free.
v4: Wait for rendering before pwrite. As this patch makes it possible to
render into the phys object, we should make it correct as well!

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2014-11-14 10:29:18 +01:00
Aneesh Kumar K.V
f30c59e921 mm: Update generic gup implementation to handle hugepage directory
Update generic gup implementation with powerpc specific details.
On powerpc at pmd level we can have hugepte, normal pmd pointer
or a pointer to the hugepage directory.

Tested-by: Steve Capper <steve.capper@linaro.org>
Acked-by: Steve Capper <steve.capper@linaro.org>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-11-14 17:24:21 +11:00
David S. Miller
076ce44825 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Conflicts:
	drivers/net/ethernet/chelsio/cxgb4vf/sge.c
	drivers/net/ethernet/intel/ixgbe/ixgbe_phy.c

sge.c was overlapping two changes, one to use the new
__dev_alloc_page() in net-next, and one to use s->fl_pg_order in net.

ixgbe_phy.c was a set of overlapping whitespace changes.

Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-14 01:01:12 -05:00
Linus Torvalds
5cf5203704 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Pull networking fixes from David Miller:

 1) sunhme driver lacks DMA mapping error checks, based upon a report by
    Meelis Roos.

 2) Fix memory leak in mvpp2 driver, from Sudip Mukherjee.

 3) DMA memory allocation sizes are wrong in systemport ethernet driver,
    fix from Florian Fainelli.

 4) Fix use after free in mac80211 defragmentation code, from Johannes
    Berg.

 5) Some networking uapi headers missing from Kbuild file, from Stephen
    Hemminger.

 6) TUN driver gets csum_start offset wrong when VLAN accel is enabled,
    and macvtap has a similar bug, from Herbert Xu.

 7) Adjust several tunneling drivers to set dev->iflink after registry,
    because registry sets that to -1 overwriting whatever we did.  From
    Steffen Klassert.

 8) Geneve forgets to set inner tunneling type, causing GSO segmentation
    to fail on some NICs.  From Jesse Gross.

 9) Fix several locking bugs in stmmac driver, from Fabrice Gasnier and
    Giuseppe CAVALLARO.

10) Fix spurious timeouts with NewReno on low traffic connections, from
    Marcelo Leitner.

11) Fix descriptor updates in enic driver, from Govindarajulu
    Varadarajan.

12) PPP calls bpf_prog_create() with locks held, which isn't kosher.
    Fix from Takashi Iwai.

13) Fix NULL deref in SCTP with malformed INIT packets, from Daniel
    Borkmann.

14) psock_fanout selftest accesses past the end of the mmap ring, fix
    from Shuah Khan.

15) Fix PTP timestamping for VLAN packets, from Richard Cochran.

16) netlink_unbind() calls in netlink pass wrong initial argument, from
    Hiroaki SHIMODA.

17) vxlan socket reuse accidently reuses a socket when the address
    family is different, so we have to explicitly check this, from
    Marcelo Lietner.

18) Fix missing include in nft_reject_bridge.c breaking the build on ppc
    and other architectures, from Guenter Roeck.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (75 commits)
  vxlan: Do not reuse sockets for a different address family
  smsc911x: power-up phydev before doing a software reset.
  lib: rhashtable - Remove weird non-ASCII characters from comments
  net/smsc911x: Fix delays in the PHY enable/disable routines
  net/smsc911x: Fix rare soft reset timeout issue due to PHY power-down mode
  netlink: Properly unbind in error conditions.
  net: ptp: fix time stamp matching logic for VLAN packets.
  cxgb4 : dcb open-lldp interop fixes
  selftests/net: psock_fanout seg faults in sock_fanout_read_ring()
  net: bcmgenet: apply MII configuration in bcmgenet_open()
  net: bcmgenet: connect and disconnect from the PHY state machine
  net: qualcomm: Fix dependency
  ixgbe: phy: fix uninitialized status in ixgbe_setup_phy_link_tnx
  net: phy: Correctly handle MII ioctl which changes autonegotiation.
  ipv6: fix IPV6_PKTINFO with v4 mapped
  net: sctp: fix memory leak in auth key management
  net: sctp: fix NULL pointer dereference in af->from_addr_param on malformed packet
  net: ppp: Don't call bpf_prog_create() in ppp_lock
  net/mlx4_en: Advertize encapsulation offloads features only when VXLAN tunnel is set
  cxgb4 : Fix bug in DCB app deletion
  ...
2014-11-13 17:54:08 -08:00
Tang Chen
f784a3f196 mem-hotplug: reset node managed pages when hot-adding a new pgdat
In free_area_init_core(), zone->managed_pages is set to an approximate
value for lowmem, and will be adjusted when the bootmem allocator frees
pages into the buddy system.

But free_area_init_core() is also called by hotadd_new_pgdat() when
hot-adding memory.  As a result, zone->managed_pages of the newly added
node's pgdat is set to an approximate value in the very beginning.

Even if the memory on that node has node been onlined,
/sys/device/system/node/nodeXXX/meminfo has wrong value:

  hot-add node2 (memory not onlined)
  cat /sys/device/system/node/node2/meminfo
  Node 2 MemTotal:       33554432 kB
  Node 2 MemFree:               0 kB
  Node 2 MemUsed:        33554432 kB
  Node 2 Active:                0 kB

This patch fixes this problem by reset node managed pages to 0 after
hot-adding a new node.

1. Move reset_managed_pages_done from reset_node_managed_pages() to
   reset_all_zones_managed_pages()
2. Make reset_node_managed_pages() non-static
3. Call reset_node_managed_pages() in hotadd_new_pgdat() after pgdat
   is initialized

Signed-off-by: Tang Chen <tangchen@cn.fujitsu.com>
Signed-off-by: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
Cc: <stable@vger.kernel.org>	[3.16+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-11-13 16:17:06 -08:00
Joonsoo Kim
ad53f92eb4 mm/page_alloc: fix incorrect isolation behavior by rechecking migratetype
Before describing bugs itself, I first explain definition of freepage.

 1. pages on buddy list are counted as freepage.
 2. pages on isolate migratetype buddy list are *not* counted as freepage.
 3. pages on cma buddy list are counted as CMA freepage, too.

Now, I describe problems and related patch.

Patch 1: There is race conditions on getting pageblock migratetype that
it results in misplacement of freepages on buddy list, incorrect
freepage count and un-availability of freepage.

Patch 2: Freepages on pcp list could have stale cached information to
determine migratetype of buddy list to go.  This causes misplacement of
freepages on buddy list and incorrect freepage count.

Patch 4: Merging between freepages on different migratetype of
pageblocks will cause freepages accouting problem.  This patch fixes it.

Without patchset [3], above problem doesn't happens on my CMA allocation
test, because CMA reserved pages aren't used at all.  So there is no
chance for above race.

With patchset [3], I did simple CMA allocation test and get below
result:

 - Virtual machine, 4 cpus, 1024 MB memory, 256 MB CMA reservation
 - run kernel build (make -j16) on background
 - 30 times CMA allocation(8MB * 30 = 240MB) attempts in 5 sec interval
 - Result: more than 5000 freepage count are missed

With patchset [3] and this patchset, I found that no freepage count are
missed so that I conclude that problems are solved.

On my simple memory offlining test, these problems also occur on that
environment, too.

This patch (of 4):

There are two paths to reach core free function of buddy allocator,
__free_one_page(), one is free_one_page()->__free_one_page() and the
other is free_hot_cold_page()->free_pcppages_bulk()->__free_one_page().
Each paths has race condition causing serious problems.  At first, this
patch is focused on first type of freepath.  And then, following patch
will solve the problem in second type of freepath.

In the first type of freepath, we got migratetype of freeing page
without holding the zone lock, so it could be racy.  There are two cases
of this race.

 1. pages are added to isolate buddy list after restoring orignal
    migratetype

    CPU1                                   CPU2

    get migratetype => return MIGRATE_ISOLATE
    call free_one_page() with MIGRATE_ISOLATE

                                grab the zone lock
                                unisolate pageblock
                                release the zone lock

    grab the zone lock
    call __free_one_page() with MIGRATE_ISOLATE
    freepage go into isolate buddy list,
    although pageblock is already unisolated

This may cause two problems.  One is that we can't use this page anymore
until next isolation attempt of this pageblock, because freepage is on
isolate buddy list.  The other is that freepage accouting could be wrong
due to merging between different buddy list.  Freepages on isolate buddy
list aren't counted as freepage, but ones on normal buddy list are
counted as freepage.  If merge happens, buddy freepage on normal buddy
list is inevitably moved to isolate buddy list without any consideration
of freepage accouting so it could be incorrect.

 2. pages are added to normal buddy list while pageblock is isolated.
    It is similar with above case.

This also may cause two problems.  One is that we can't keep these
freepages from being allocated.  Although this pageblock is isolated,
freepage would be added to normal buddy list so that it could be
allocated without any restriction.  And the other problem is same as
case 1, that it, incorrect freepage accouting.

This race condition would be prevented by checking migratetype again
with holding the zone lock.  Because it is somewhat heavy operation and
it isn't needed in common case, we want to avoid rechecking as much as
possible.  So this patch introduce new variable, nr_isolate_pageblock in
struct zone to check if there is isolated pageblock.  With this, we can
avoid to re-check migratetype in common case and do it only if there is
isolated pageblock or migratetype is MIGRATE_ISOLATE.  This solve above
mentioned problems.

Changes from v3:
Add one more check in free_one_page() that checks whether migratetype is
MIGRATE_ISOLATE or not. Without this, abovementioned case 1 could happens.

Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Acked-by: Minchan Kim <minchan@kernel.org>
Acked-by: Michal Nazarewicz <mina86@mina86.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: "Kirill A. Shutemov" <kirill@shutemov.name>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
Cc: Zhang Yanfei <zhangyanfei@cn.fujitsu.com>
Cc: Tang Chen <tangchen@cn.fujitsu.com>
Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
Cc: Wen Congyang <wency@cn.fujitsu.com>
Cc: Marek Szyprowski <m.szyprowski@samsung.com>
Cc: Laura Abbott <lauraa@codeaurora.org>
Cc: Heesub Shin <heesub.shin@samsung.com>
Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
Cc: Ritesh Harjani <ritesh.list@gmail.com>
Cc: Gioh Kim <gioh.kim@lge.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-11-13 16:17:05 -08:00
Thomas Graf
6eba82248e rhashtable: Drop gfp_flags arg in insert/remove functions
Reallocation is only required for shrinking and expanding and both rely
on a mutex for synchronization and callers of rhashtable_init() are in
non atomic context. Therefore, no reason to continue passing allocation
hints through the API.

Instead, use GFP_KERNEL and add __GFP_NOWARN | __GFP_NORETRY to allow
for silent fall back to vzalloc() without the OOM killer jumping in as
pointed out by Eric Dumazet and Eric W. Biederman.

Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-13 15:18:40 -05:00
Matan Barak
de966c5928 net/mlx4_core: Support more than 64 VFs
We now allow up to 126 VFs. Note though that certain firmware
versions only allow up to 80 VFs. Moreover, old HCAs only support 64 VFs.
In these cases, we limit the maximum number of VFs to 64.

Signed-off-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-13 15:16:22 -05:00