Commit Graph

388088 Commits

Author SHA1 Message Date
Amerigo Wang
8965779d2c ipv6,mcast: always hold idev->lock before mca_lock
dingtianhong reported the following deadlock detected by lockdep:

 ======================================================
 [ INFO: possible circular locking dependency detected ]
 3.4.24.05-0.1-default #1 Not tainted
 -------------------------------------------------------
 ksoftirqd/0/3 is trying to acquire lock:
  (&ndev->lock){+.+...}, at: [<ffffffff8147f804>] ipv6_get_lladdr+0x74/0x120

 but task is already holding lock:
  (&mc->mca_lock){+.+...}, at: [<ffffffff8149d130>] mld_send_report+0x40/0x150

 which lock already depends on the new lock.

 the existing dependency chain (in reverse order) is:

 -> #1 (&mc->mca_lock){+.+...}:
        [<ffffffff810a8027>] validate_chain+0x637/0x730
        [<ffffffff810a8417>] __lock_acquire+0x2f7/0x500
        [<ffffffff810a8734>] lock_acquire+0x114/0x150
        [<ffffffff814f691a>] rt_spin_lock+0x4a/0x60
        [<ffffffff8149e4bb>] igmp6_group_added+0x3b/0x120
        [<ffffffff8149e5d8>] ipv6_mc_up+0x38/0x60
        [<ffffffff81480a4d>] ipv6_find_idev+0x3d/0x80
        [<ffffffff81483175>] addrconf_notify+0x3d5/0x4b0
        [<ffffffff814fae3f>] notifier_call_chain+0x3f/0x80
        [<ffffffff81073471>] raw_notifier_call_chain+0x11/0x20
        [<ffffffff813d8722>] call_netdevice_notifiers+0x32/0x60
        [<ffffffff813d92d4>] __dev_notify_flags+0x34/0x80
        [<ffffffff813d9360>] dev_change_flags+0x40/0x70
        [<ffffffff813ea627>] do_setlink+0x237/0x8a0
        [<ffffffff813ebb6c>] rtnl_newlink+0x3ec/0x600
        [<ffffffff813eb4d0>] rtnetlink_rcv_msg+0x160/0x310
        [<ffffffff814040b9>] netlink_rcv_skb+0x89/0xb0
        [<ffffffff813eb357>] rtnetlink_rcv+0x27/0x40
        [<ffffffff81403e20>] netlink_unicast+0x140/0x180
        [<ffffffff81404a9e>] netlink_sendmsg+0x33e/0x380
        [<ffffffff813c4252>] sock_sendmsg+0x112/0x130
        [<ffffffff813c537e>] __sys_sendmsg+0x44e/0x460
        [<ffffffff813c5544>] sys_sendmsg+0x44/0x70
        [<ffffffff814feab9>] system_call_fastpath+0x16/0x1b

 -> #0 (&ndev->lock){+.+...}:
        [<ffffffff810a798e>] check_prev_add+0x3de/0x440
        [<ffffffff810a8027>] validate_chain+0x637/0x730
        [<ffffffff810a8417>] __lock_acquire+0x2f7/0x500
        [<ffffffff810a8734>] lock_acquire+0x114/0x150
        [<ffffffff814f6c82>] rt_read_lock+0x42/0x60
        [<ffffffff8147f804>] ipv6_get_lladdr+0x74/0x120
        [<ffffffff8149b036>] mld_newpack+0xb6/0x160
        [<ffffffff8149b18b>] add_grhead+0xab/0xc0
        [<ffffffff8149d03b>] add_grec+0x3ab/0x460
        [<ffffffff8149d14a>] mld_send_report+0x5a/0x150
        [<ffffffff8149f99e>] igmp6_timer_handler+0x4e/0xb0
        [<ffffffff8105705a>] call_timer_fn+0xca/0x1d0
        [<ffffffff81057b9f>] run_timer_softirq+0x1df/0x2e0
        [<ffffffff8104e8c7>] handle_pending_softirqs+0xf7/0x1f0
        [<ffffffff8104ea3b>] __do_softirq_common+0x7b/0xf0
        [<ffffffff8104f07f>] __thread_do_softirq+0x1af/0x210
        [<ffffffff8104f1c1>] run_ksoftirqd+0xe1/0x1f0
        [<ffffffff8106c7de>] kthread+0xae/0xc0
        [<ffffffff814fff74>] kernel_thread_helper+0x4/0x10

actually we can just hold idev->lock before taking pmc->mca_lock,
and avoid taking idev->lock again when iterating idev->addr_list,
since the upper callers of mld_newpack() already take
read_lock_bh(&idev->lock).

Reported-by: dingtianhong <dingtianhong@huawei.com>
Cc: dingtianhong <dingtianhong@huawei.com>
Cc: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org>
Cc: David S. Miller <davem@davemloft.net>
Cc: Hannes Frederic Sowa <hannes@stressinduktion.org>
Tested-by: Ding Tianhong <dingtianhong@huawei.com>
Tested-by: Chen Weilong <chenweilong@huawei.com>
Signed-off-by: Cong Wang <amwang@redhat.com>
Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-07-01 23:39:21 -07:00
Cong Wang
ab6c7a0a43 vti: remove duplicated code to fix a memory leak
vti module allocates dev->tstats twice: in vti_fb_tunnel_init()
and in vti_tunnel_init(), this lead to a memory leak of
dev->tstats.

Just remove the duplicated operations in vti_fb_tunnel_init().

(candidate for -stable)

Cc: Stephen Hemminger <stephen@networkplumber.org>
Cc: Saurabh Mohan <saurabh.mohan@vyatta.com>
Cc: "David S. Miller" <davem@davemloft.net>
Signed-off-by: Cong Wang <amwang@redhat.com>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-07-01 23:37:14 -07:00
Cong Wang
6c734fb859 gre: fix a regression in ioctl
When testing GRE tunnel, I got:

 # ip tunnel show
 get tunnel gre0 failed: Invalid argument
 get tunnel gre1 failed: Invalid argument

This is a regression introduced by commit c544193214
("GRE: Refactor GRE tunneling code.") because previously we
only check the parameters for SIOCADDTUNNEL and SIOCCHGTUNNEL,
after that commit, the check is moved for all commands.

So, just check for SIOCADDTUNNEL and SIOCCHGTUNNEL.

After this patch I got:

 # ip tunnel show
 gre0: gre/ip  remote any  local any  ttl inherit  nopmtudisc
 gre1: gre/ip  remote 192.168.122.101  local 192.168.122.45  ttl inherit

Cc: Pravin B Shelar <pshelar@nicira.com>
Cc: "David S. Miller" <davem@davemloft.net>
Signed-off-by: Cong Wang <amwang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-07-01 23:35:22 -07:00
Jens Axboe
d0e3d0238d Merge branch 'bcache-for-3.11' of git://evilpiepirate.org/~kent/linux-bcache into for-3.11/drivers 2013-07-02 08:32:57 +02:00
Jens Axboe
5f0e5afa0d Merge tag 'v3.10-rc7' into for-3.11/drivers
Linux 3.10-rc7

Pull this in early to avoid doing it with the bcache merge,
since there are a number of changes to bcache between my old
base (3.10-rc1) and the new pull request.
2013-07-02 08:31:48 +02:00
Daniel Borkmann
bb33381d0c net: sctp: rework debugging framework to use pr_debug and friends
We should get rid of all own SCTP debug printk macros and use the ones
that the kernel offers anyway instead. This makes the code more readable
and conform to the kernel code, and offers all the features of dynamic
debbuging that pr_debug() et al has, such as only turning on/off portions
of debug messages at runtime through debugfs. The runtime cost of having
CONFIG_DYNAMIC_DEBUG enabled, but none of the debug statements printing,
is negligible [1]. If kernel debugging is completly turned off, then these
statements will also compile into "empty" functions.

While we're at it, we also need to change the Kconfig option as it /now/
only refers to the ifdef'ed code portions in outqueue.c that enable further
debugging/tracing of SCTP transaction fields. Also, since SCTP_ASSERT code
was enabled with this Kconfig option and has now been removed, we
transform those code parts into WARNs resp. where appropriate BUG_ONs so
that those bugs can be more easily detected as probably not many people
have SCTP debugging permanently turned on.

To turn on all SCTP debugging, the following steps are needed:

 # mount -t debugfs none /sys/kernel/debug
 # echo -n 'module sctp +p' > /sys/kernel/debug/dynamic_debug/control

This can be done more fine-grained on a per file, per line basis and others
as described in [2].

 [1] https://www.kernel.org/doc/ols/2009/ols2009-pages-39-46.pdf
 [2] Documentation/dynamic-debug-howto.txt

Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-07-01 23:22:13 -07:00
Daniel Borkmann
1067964305 lib: vsprintf: add IPv4/v6 generic %p[Ii]S[pfs] format specifier
In order to avoid making code that deals with printing both, IPv4 and
IPv6 addresses, unnecessary complicated as for example ...

  if (sa.sa_family == AF_INET6)
    printk("... %pI6 ...", ..sin6_addr);
  else
    printk("... %pI4 ...", ..sin_addr.s_addr);

... it would be better to introduce a format specifier that can deal
with those kind of situations internally; just as we have a "struct
sockaddr" for generic mapping into "struct sockaddr_in" or "struct
sockaddr_in6" as e.g. done in "union sctp_addr". Then, we could
reduce the above statement into something like:

  printk("... %pIS ..", &sockaddr);

In case our pointer is NULL, pointer() then deals with that already at
an earlier point in time internally. While we're at it, support for both
%piS/%pIS, where 'S' stands for sockaddr, comes (almost) for free.

Additionally to that, postfix specifiers 'p', 'f' and 's' are supported
as suggested and initially implemented in 2009 by Joe Perches [1].
Handling of those additional specifiers orientate on the initial RFC that
was proposed. Also we support IPv6 compressed format specified by 'c' and
various other IPv4 extensions as stated in the documentation part.

Likely, there are many other areas than just SCTP in the kernel to make
use of this extension as well.

 [1] http://patchwork.ozlabs.org/patch/31480/

Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
CC: Joe Perches <joe@perches.com>
CC: linux-kernel@vger.kernel.org
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-07-01 23:22:13 -07:00
David S. Miller
936faf6c49 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/shemminger/vxlan-next
Stephen Hemminger says:

====================
Here is current updates for vxlan in net-next.  It includes Mike's changes
to handle multiple destinations and lots of little cosmetic stuff.

This is a fresh vxlan-next repository which was forked from net-next.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2013-07-01 23:17:19 -07:00
Rusty Russell
0d69a65e97 tools/lguest: real barriers.
Lguest guests are UP, but the host is probably SMP, so real barriers are
required in case the device thread and the guest are on different CPUs.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2013-07-02 15:42:05 +09:30
Rusty Russell
8fd9a6365e tools/lguest: fix missing rmb().
The virtio spec was missing a barrier in example code, so I went back
to look at the lguest code.  Indeed, we need one.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2013-07-02 15:42:04 +09:30
Luiz Capitulino
8c6bab4f38 virtio_balloon: leak_balloon(): only tell host if we got pages deflated
balloon_page_dequeue() can return NULL.  If it does for the first page
being freed then leak_balloon() will create a scatter list with len=0.
Which in turn seems to generate an invalid virtio request.

I didn't get this in practice, I found it by code review.  On the other
hand, such an invalid virtio request will cause errors in QEMU and
fill_balloon() also performs the same check implemented by this commit.

This bug was introduced in e2250429.

Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
Acked-by: Rafael Aquini <aquini@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Cc: stable@kernel.org # 3.9
2013-07-02 15:42:03 +09:30
Andrew Vagin
f11335db5e virtio-pci: fix leaks of msix_affinity_masks
vp_dev->msix_vectors should be initialized before allocating
msix_affinity_masks, otherwise vp_free_vectors will not free these
objects.

unreferenced object 0xffff88010f969d88 (size 512):
  comm "systemd-udevd", pid 158, jiffies 4294673645 (age 80.545s)
  hex dump (first 32 bytes):
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
  backtrace:
    [<ffffffff816e455e>] kmemleak_alloc+0x5e/0xc0
    [<ffffffff811aa7f1>] kmem_cache_alloc_node_trace+0x141/0x2c0
    [<ffffffff8133ba23>] alloc_cpumask_var_node+0x23/0x80
    [<ffffffff8133ba8e>] alloc_cpumask_var+0xe/0x10
    [<ffffffff813fdb3d>] vp_try_to_find_vqs+0x25d/0x810
    [<ffffffff813fe171>] vp_find_vqs+0x81/0xb0
    [<ffffffffa00d2a05>] init_vqs+0x85/0x120 [virtio_balloon]
    [<ffffffffa00d2c29>] virtballoon_probe+0xf9/0x1a0 [virtio_balloon]
    [<ffffffff813fb61e>] virtio_dev_probe+0xde/0x140
    [<ffffffff814452b8>] driver_probe_device+0x98/0x3a0
    [<ffffffff8144566b>] __driver_attach+0xab/0xb0
    [<ffffffff814432f4>] bus_for_each_dev+0x94/0xb0
    [<ffffffff81444f4e>] driver_attach+0x1e/0x20
    [<ffffffff81444910>] bus_add_driver+0x200/0x280
    [<ffffffff81445c14>] driver_register+0x74/0x160
    [<ffffffff813fb7d0>] register_virtio_driver+0x20/0x40

v2: change msix_vectors uncoditionaly in vp_free_vectors

Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Jason Wang <jasowang@redhat.com>
Signed-off-by: Andrew Vagin <avagin@openvz.org>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2013-07-02 15:42:02 +09:30
Paul Bolle
95cc2c02cf Fix comment typo "CONFIG_PAE"
Signed-off-by: Paul Bolle <pebolle@tiscali.nl>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2013-07-02 15:42:00 +09:30
Rusty Russell
54041d8a73 modules: don't fail to load on unknown parameters.
Although parameters are supposed to be part of the kernel API, experimental
parameters are often removed.  In addition, downgrading a kernel might cause
previously-working modules to fail to load.

On balance, it's probably better to warn, and load the module anyway.
This may let through a typo, but at least the logs will show it.

Reported-by: Andy Lutomirski <luto@amacapital.net>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2013-07-02 15:38:21 +09:30
Jean Delvare
86f120031a ABI: Clarify when /sys/module/MODULENAME is created
/sys/module/MODULENAME is not created unconditionally. This can be
confusing so document the current conditions.

Signed-off-by: Jean Delvare <khali@linux-fr.org>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Rob Landley <rob@landley.net>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2013-07-02 15:38:20 +09:30
Jean Delvare
b634d130e4 There is no /sys/parameters
There is no such path as /sys/parameters, module parameters live in
/sys/module/*/parameters.

Signed-off-by: Jean Delvare <khali@linux-fr.org>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2013-07-02 15:38:19 +09:30
Mathias Krause
4f6de4d51f module: don't modify argument of module_kallsyms_lookup_name()
If we pass a pointer to a const string in the form "module:symbol"
module_kallsyms_lookup_name() will try to split the string at the colon,
i.e., will try to modify r/o data. That will, in fact, fail on a kernel
with enabled CONFIG_DEBUG_RODATA.

Avoid modifying the passed string in module_kallsyms_lookup_name(),
modify find_module_all() instead to pass it the module name length.

Signed-off-by: Mathias Krause <minipli@googlemail.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2013-07-02 15:38:18 +09:30
David Herrmann
77ef8bbc87 drm: make drm_mm_init() return void
There is no reason to return "int" as this function never fails.
Furthermore, several drivers (ast, sis) already depend on this.

Signed-off-by: David Herrmann <dh.herrmann@gmail.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2013-07-02 13:34:41 +10:00
Dave Airlie
6ef92fbea2 Merge branch 'drm-next-3.11' of git://people.freedesktop.org/~agd5f/linux into drm-next
A few more patches for 3.11:
- add debugfs interface to check current DPM state
- Fix a bug that caused problems with DPM on BTC+ asics.

* 'drm-next-3.11' of git://people.freedesktop.org/~agd5f/linux:
  drm/radeon/dpm: add debugfs support for SI
  drm/radeon/dpm: add debugfs support for cayman
  drm/radeon/dpm: add debugfs support for TN
  drm/radeon/dpm: add debugfs support for ON/LN
  drm/radeon/dpm: add debugfs support for 7xx/evergreen/btc
  drm/radeon/dpm: add debugfs support for rv6xx
  drm/radeon/dpm: add infrastructure to support debugfs info
  drm/radeon/dpm: re-enable state transitions for Cayman
  drm/radeon/dpm: re-enable state transitions for BTC
  drm/radeon: fix typo in radeon_atom_init_mc_reg_table()
  drm/radeon/atom: fix endian bug in radeon_atom_init_mc_reg_table()
  drm/radeon: remove sumo dpm/uvd bringup leftovers
2013-07-02 13:31:26 +10:00
Alexander Z Lam
a82274151a tracing: Protect ftrace_trace_arrays list in trace_events.c
There are multiple places where the ftrace_trace_arrays list is accessed in
trace_events.c without the trace_types_lock held.

Link: http://lkml.kernel.org/r/1372732674-22726-1-git-send-email-azl@google.com

Cc: Vaibhav Nagarnaik <vnagarnaik@google.com>
Cc: David Sharp <dhsharp@google.com>
Cc: Alexander Z Lam <lambchop468@gmail.com>
Cc: stable@vger.kernel.org # 3.10
Signed-off-by: Alexander Z Lam <azl@google.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2013-07-01 23:30:08 -04:00
Magnus Damm
1eb14ea1e6 ARM: shmobile: emev2 GIO3 resource fix
Fix GIO3 base addresses for EMEV2.

This bug was introduced by 088efd9273
("mach-shmobile: Emma Mobile EV2 GPIO support V3") which was included in v3.5.

Signed-off-by: Magnus Damm <damm@opensource.se>
Signed-off-by: Simon Horman <horms+renesas@verge.net.au>
Cc: stable@vger.kernel.org
2013-07-02 12:28:01 +09:00
Changli Gao
b1a5a34bd0 net: Swap ver and type in pppoe_hdr
Ver and type in pppoe_hdr should be swapped as defined by RFC2516
section-4.

Signed-off-by: David S. Miller <davem@davemloft.net>
2013-07-01 18:20:33 -07:00
Dave Jones
4ccb93ce74 x25: Fix broken locking in ioctl error paths.
Two of the x25 ioctl cases have error paths that break out of the function without
unlocking the socket, leading to this warning:

================================================
[ BUG: lock held when returning to user space! ]
3.10.0-rc7+ #36 Not tainted
------------------------------------------------
trinity-child2/31407 is leaving the kernel with locks still held!
1 lock held by trinity-child2/31407:
 #0:  (sk_lock-AF_X25){+.+.+.}, at: [<ffffffffa024b6da>] x25_ioctl+0x8a/0x740 [x25]

Signed-off-by: Dave Jones <davej@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-07-01 18:15:25 -07:00
Alexander Z Lam
2d71619c59 tracing: Make trace_marker use the correct per-instance buffer
The trace_marker file was present for each new instance created, but it
added the trace mark to the global trace buffer instead of to
the instance's buffer.

Link: http://lkml.kernel.org/r/1372717885-4543-2-git-send-email-azl@google.com

Cc: David Sharp <dhsharp@google.com>
Cc: Vaibhav Nagarnaik <vnagarnaik@google.com>
Cc: Alexander Z Lam <lambchop468@gmail.com>
Cc: stable@vger.kernel.org # 3.10
Signed-off-by: Alexander Z Lam <azl@google.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2013-07-01 21:08:15 -04:00
Eric Dumazet
aec0a40a6f netem: use rb tree to implement the time queue
Following typical setup to implement a ~100 ms RTT and big
amount of reorders has very poor performance because netem
implements the time queue using a linked list.
-----------------------------------------------------------
ETH=eth0
IFB=ifb0
modprobe ifb
ip link set dev $IFB up
tc qdisc add dev $ETH ingress 2>/dev/null
tc filter add dev $ETH parent ffff: \
   protocol ip u32 match u32 0 0 flowid 1:1 action mirred egress \
   redirect dev $IFB
ethtool -K $ETH gro off tso off gso off
tc qdisc add dev $IFB root netem delay 50ms 10ms limit 100000
tc qd add dev $ETH root netem delay 50ms limit 100000
---------------------------------------------------------

Switch netem time queue to a rb tree, so this kind of setup can work at
high speed.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-07-01 18:07:15 -07:00
Steven Rostedt (Red Hat)
f1ed7c741f ftrace: Do not run selftest if command line parameter is set
If the kernel command line ftrace filter parameters are set
(ftrace_filter or ftrace_notrace), force the function self test to
pass, with a warning why it was forced.

If the user adds a filter to the kernel command line, it is assumed
that they know what they are doing, and the self test should just not
run instead of failing (which disables function tracing) or clearing
the filter, as that will probably annoy the user.

If the user wants the selftest to run, the message will tell them why
it did not.

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2013-07-01 20:57:15 -04:00
Oleg Nesterov
cf6735a4b1 tracing/kprobes: Don't pass addr=ip to perf_trace_buf_submit()
kprobe_perf_func() and kretprobe_perf_func() pass addr=ip to
perf_trace_buf_submit() for no reason.

This sets perf_sample_data->addr for PERF_SAMPLE_ADDR, we already
have perf_sample_data->ip initialized if PERF_SAMPLE_IP.

Link: http://lkml.kernel.org/r/20130620173811.GA13161@redhat.com

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2013-07-01 20:34:28 -04:00
Steven Rostedt (Red Hat)
10246fa35d tracing: Use flag buffer_disabled for irqsoff tracer
If the ring buffer is disabled and the irqsoff tracer records a trace it
will clear out its buffer and lose the data it had previously recorded.

Currently there's a callback when writing to the tracing_of file, but if
tracing is disabled via the function tracer trigger, it will not inform
the irqsoff tracer to stop recording.

By using the "mirror" flag (buffer_disabled) in the trace_array, that keeps
track of the status of the trace_array's buffer, it gives the irqsoff
tracer a fast way to know if it should record a new trace or not.
The flag may be a little behind the real state of the buffer, but it
should not affect the trace too much. It's more important for the irqsoff
tracer to be fast.

Reported-by: Dave Jones <davej@redhat.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2013-07-01 20:34:28 -04:00
Oleg Nesterov
b04d52e368 tracing/kprobes: Turn trace_probe->files into list_head
I think that "ftrace_event_file *trace_probe[]" complicates the
code for no reason, turn it into list_head to simplify the code.
enable_trace_probe() no longer needs synchronize_sched().

This needs the extra sizeof(list_head) memory for every attached
ftrace_event_file, hopefully not a problem in this case.

Link: http://lkml.kernel.org/r/20130620173814.GA13165@redhat.com

Acked-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2013-07-01 20:34:27 -04:00
Tom Zanussi
3baa5e4cf2 tracing: Fix disabling of soft disable
The comment on the soft disable 'disable' case of
__ftrace_event_enable_disable() states that the soft disable bit
should be cleared in that case, but currently only the soft mode bit
is actually cleared.

This essentially leaves the standard non-soft-enable enable/disable
paths as the only way to clear the soft disable flag, but the soft
disable bit should also be cleared when removing a trigger with '!'.

Also, the SOFT_DISABLED bit should never be set if SOFT_MODE is
cleared.

This fixes the above discrepancies.

Link: http://lkml.kernel.org/r/b9c68dd50bc07019e6c67d3f9b29be4ef1b2badb.1372479499.git.tom.zanussi@linux.intel.com

Signed-off-by: Tom Zanussi <tom.zanussi@linux.intel.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2013-07-01 20:34:26 -04:00
Tom Zanussi
44a6a4ee1a tracing: Add missing syscall_metadata comment
Add the missing syscall_metadata description for the enter_fields
struct member.

Link: http://lkml.kernel.org/r/74c3407cd1e5d37f2c5aaca637aa4d35f66f1aa2.1372479499.git.tom.zanussi@linux.intel.com

Signed-off-by: Tom Zanussi <tom.zanussi@intel.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2013-07-01 20:34:25 -04:00
Tom Zanussi
a439059610 tracing: Simplify code for showing of soft disabled flag
Rather than enumerating each permutation, build the enable state
string up from the combination of states.  This also allows for the
simpler addition of more states.

Link: http://lkml.kernel.org/r/9aff5af6dee2f5a40ca30df41c39d5f33e998d7a.1372479499.git.tom.zanussi@linux.intel.com

Signed-off-by: Tom Zanussi <tom.zanussi@linux.intel.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2013-07-01 20:34:25 -04:00
Oleg Nesterov
3fe3d6193e tracing/kprobes: Kill probe_enable_lock
enable_trace_probe() and disable_trace_probe() should not worry about
serialization, the caller (perf_trace_init or __ftrace_set_clr_event)
holds event_mutex.

They are also called by kprobe_trace_self_tests_init(), but this __init
function can't race with itself or trace_events.c

And note that this code depended on event_mutex even before 41a7dd420c
which introduced probe_enable_lock. In fact it assumes that the caller
kprobe_register() can never race with itself. Otherwise, say, tp->flags
manipulations are racy.

Link: http://lkml.kernel.org/r/20130620173809.GA13158@redhat.com

Acked-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2013-07-01 20:34:24 -04:00
Oleg Nesterov
288e984e62 tracing/kprobes: Avoid perf_trace_buf_*() if ->perf_events is empty
perf_trace_buf_prepare() + perf_trace_buf_submit() make no sense
if this task/CPU has no active counters. Change kprobe_perf_func()
and kretprobe_perf_func() to check call->perf_events beforehand
and return if this list is empty.

For example, "perf record -e some_probe -p1". Only /sbin/init will
report, all other threads which hit the same probe will do
perf_trace_buf_prepare/perf_trace_buf_submit just to realize that
nobody wants perf_swevent_event().

Link: http://lkml.kernel.org/r/20130620173806.GA13151@redhat.com

Acked-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2013-07-01 20:34:23 -04:00
Steven Rostedt
6e94a78037 tracing: Failed to create system directory
Running the following:

 # cd /sys/kernel/debug/tracing
 # echo p:i do_sys_open > kprobe_events
 # echo p:j schedule >> kprobe_events
 # cat kprobe_events
p:kprobes/i do_sys_open
p:kprobes/j schedule
 # echo p:i do_sys_open >> kprobe_events
 # cat kprobe_events
p:kprobes/j schedule
p:kprobes/i do_sys_open
 # ls /sys/kernel/debug/tracing/events/kprobes/
enable  filter  j

Notice that the 'i' is missing from the kprobes directory.

The console produces:

"Failed to create system directory kprobes"

This is because kprobes passes in a allocated name for the system
and the ftrace event subsystem saves off that name instead of creating
a duplicate for it. But the kprobes may free the system name making
the pointer to it invalid.

This bug was introduced by 92edca073c "tracing: Use direct field, type
and system names" which switched from using kstrdup() on the system name
in favor of just keeping apointer to it, as the internal ftrace event
system names are static and exist for the life of the computer being booted.

Instead of reverting back to duplicating system names again, we can use
core_kernel_data() to determine if the passed in name was allocated or
static. Then use the MSB of the ref_count to be a flag to keep track if
the name was allocated or not. Then we can still save from having to duplicate
strings that will always exist, but still copy the ones that may be freed.

Cc: stable@vger.kernel.org # 3.10
Reported-by: "zhangwei(Jovi)" <jovi.zhangwei@huawei.com>
Reported-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Tested-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2013-07-01 20:34:22 -04:00
Bjørn Mork
d0b5e51629 net: qmi_wwan: add TP-LINK MA260
Signed-off-by: Bjørn Mork <bjorn@mork.no>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-07-01 17:01:24 -07:00
Bjørn Mork
aa3aba1cbc net: qmi_wwan: add Option GTM681W
A standard Gobi 3000 reference design module.

Reported-by: Richard Weinberger <richard@nod.at>
Signed-off-by: Bjørn Mork <bjorn@mork.no>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-07-01 17:01:24 -07:00
Bjørn Mork
5a008ffa73 net: qmi_wwan: fixup Sierra Wireless MC8305 entry
The MC8305 module got an additional entry added based solely on
information from a Windows driver *.inf file. We now have the
actual descriptor layout from one of these modules, and it
consists of two alternate configurations where cfg #1 is a
normal Gobi 2k layout and cfg #2 is MBIM only, using interface
numbers 5 and 6 for MBIM control and data. The extra Windows
driver entry for interface number 5 was most likely a bug.

Deleting the bogus entry to avoid unnecessary qmi_wwan probe
failures when using the MBIM configuration.

Reported-by: Lana Black <sickmind@lavabit.com>
Signed-off-by: Bjørn Mork <bjorn@mork.no>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-07-01 17:01:24 -07:00
Jaegeuk Kim
a1dd3c13ce f2fs: fix to recover i_size from roll-forward
If user requests many data writes and fsync together, the last updated i_size
should be stored to the inode block consistently.

But, previous write_end just marks the inode as dirty and doesn't update its
metadata into its inode block.
After that, fsync just writes the inode block with newly updated data index
excluding inode metadata updates.

So, this patch introduces write_end in which updates inode block too when the
i_size is changed.

Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2013-07-02 08:48:16 +09:00
Gu Zheng
5ebefc5b40 f2fs: remove the unused argument "sbi" of func destroy_fsync_dnodes()
As destroy_fsync_dnodes() is a simple list-cleanup func, so delete the unused
and unrelated f2fs_sb_info argument of it.

Signed-off-by: Gu Zheng <guz.fnst@cn.fujitsu.com>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2013-07-02 08:48:15 +09:00
Jaegeuk Kim
763bfe1bc5 f2fs: remove reusing any prefree segments
This patch removes check_prefree_segments initially designed to enhance the
performance by narrowing the range of LBA usage across the whole block device.

When allocating a new segment, previous f2fs tries to find proper prefree
segments, and then, if finds a segment, it reuses the segment for further
data or node block allocation.

However, I found that this was totally wrong approach since the prefree segments
have several data or node blocks that will be used by the roll-forward mechanism
operated after sudden-power-off.

Let's assume the following scenario.

/* write 8MB with fsync */
for (i = 0; i < 2048; i++) {
	offset = i * 4096;
	write(fd, offset, 4KB);
	fsync(fd);
}

In this case, naive segment allocation sequence will be like:
 data segment: x, x+1, x+2, x+3
 node segment: y, y+1, y+2, y+3.

But, if we can reuse prefree segments, the sequence can be like:
 data segment: x, x+1, y, y+1
 node segment: y, y+1, y+2, y+3.
Because, y, y+1, and y+2 became prefree segments one by one, and those are
reused by data allocation.

After conducting this workload, we should consider how to recover the latest
inode with its data.
If we reuse the prefree segments such as y or y+1, we lost the old node blocks
so that f2fs even cannot start roll-forward recovery.

Therefore, I suggest that we should remove reusing prefree segments.

Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2013-07-02 08:48:15 +09:00
Gu Zheng
6cc4af5606 f2fs: code cleanup and simplify in func {find/add}_gc_inode
This patch simplifies list operations in find_gc_inode and add_gc_inode.
Just simple code cleanup.

Signed-off-by: Gu Zheng <guz.fnst@cn.fujitsu.com>
[Jaegeuk Kim: add description]
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2013-07-02 08:48:14 +09:00
Namjae Jeon
8736fbf003 f2fs: optimize the init_dirty_segmap function
Optimize the while loop condition

Since this condition will always be true and while loop will
be terminated by the following condition in code:

if (segno >= TOTAL_SEGS(sbi))
    break;
Hence we can replace the while loop condition with while(1)
instead of always checking for segno to be less than Total segs.

Also we do not need to use TOTAL_SEGS() everytime. We can store
this value in a local variable since this value is constant.

Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com>
Signed-off-by: Pankaj Kumar <pankaj.km@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2013-07-02 08:48:14 +09:00
Jaegeuk Kim
060dd67b3c f2fs: fix an endian conversion bug detected by sparse
This patch should fix the following bug reported by kbuild test robot.

fs/f2fs/recovery.c:233:33: sparse: incorrect type in assignment
(different base types)

parse warnings: (new ones prefixed by >>)

>> recovery.c:233: sparse: incorrect type in assignment (different base types)
   recovery.c:233:    expected unsigned int [unsigned] [assigned] ofs_in_node
   recovery.c:233:    got restricted __le16 [assigned] [usertype] ofs_in_node
>> recovery.c:238: sparse: incorrect type in assignment (different base types)
   recovery.c:238:    expected unsigned int [unsigned] ofs_in_node
   recovery.c:238:    got restricted __le16 [assigned] [usertype] ofs_in_node

Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2013-07-02 08:48:13 +09:00
Jaegeuk Kim
7e586fa024 f2fs: fix crc endian conversion
While calculating CRC for the checkpoint block, we use __u32, but when storing
the crc value to the disk, we use __le32.

Let's fix the inconsistency.

Reported-and-Tested-by: Oded Gabbay <ogabbay@advaoptical.com>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2013-07-02 08:47:35 +09:00
Dongsheng.wang@freescale.com
a63b3bc7db powerpc/fsl: add MPIC timer wakeup support
The driver provides a way to wake up the system by the MPIC timer.

For example,
echo 5 > /sys/devices/system/mpic/timer_wakeup
echo standby > /sys/power/state

After 5 seconds the MPIC timer will generate an interrupt to wake up
the system.

Signed-off-by: Wang Dongsheng <dongsheng.wang@freescale.com>
Signed-off-by: Zhao Chenhui <chenhui.zhao@freescale.com>
Signed-off-by: Li Yang <leoli@freescale.com>
2013-07-01 18:38:42 -05:00
Dongsheng.wang@freescale.com
9e6f31a9db powerpc/mpic: create mpic subsystem object
Register a mpic subsystem at /sys/devices/system/

Signed-off-by: Wang Dongsheng <dongsheng.wang@freescale.com>
Signed-off-by: Scott Wood <scottwood@freescale.com>
2013-07-01 18:38:42 -05:00
Dongsheng.wang@freescale.com
36ca09be6f powerpc/mpic: add global timer support
The MPIC global timer is a hardware timer inside the Freescale PIC complying
with OpenPIC standard. When the specified interval times out, the hardware
timer generates an interrupt. The driver currently is only tested on fsl chip,
but it can potentially support other global timers complying to OpenPIC
standard.

The two independent groups of global timer on fsl chip, group A and group B,
are identical in their functionality, except that they appear at different
locations within the PIC register map. The hardware timer can be cascaded to
create timers larger than the default 31-bit global timers. Timer cascade
fields allow configuration of up to two 63-bit timers. But These two groups
of timers cannot be cascaded together.

It can be used as a wakeup source for low power modes. It also could be used
as periodical timer for protocols, drivers and etc.

Signed-off-by: Wang Dongsheng <dongsheng.wang@freescale.com>
Signed-off-by: Li Yang <leoli@freescale.com>
Signed-off-by: Scott Wood <scottwood@freescale.com>
2013-07-01 18:38:41 -05:00
Dongsheng.wang@freescale.com
5ff04b7287 powerpc/mpic: add irq_set_wake support
Add irq_set_wake support. Just add IRQF_NO_SUSPEND to desc->action->flag.
So the wake up interrupt will not be disable in suspend_device_irqs.

Signed-off-by: Wang Dongsheng <dongsheng.wang@freescale.com>
Signed-off-by: Scott Wood <scottwood@freescale.com>
2013-07-01 18:38:41 -05:00
Kevin Hao
9837b43c5f powerpc/85xx: enable coreint for all the 64bit boards
With the patch 7230c564 (powerpc: Rework lazy-interrupt handling),
it seems that the coreint works pretty well on the 85xx 64bit kernel.
So use the coreint by default for these boards.

Signed-off-by: Kevin Hao <haokexin@gmail.com>
Signed-off-by: Scott Wood <scottwood@freescale.com>
2013-07-01 18:38:41 -05:00