Commit Graph

28411 Commits

Author SHA1 Message Date
Erik Kline
0065bf4206 net: ipv6: allow choosing optimistic addresses with use_optimistic
The use_optimistic sysctl makes optimistic IPv6 addresses
equivalent to preferred addresses for source address selection
(e.g., when calling connect()), but it does not allow an
application to bind to optimistic addresses. This behaviour is
inconsistent - for example, it doesn't make sense for bind() to
an optimistic address fail with EADDRNOTAVAIL, but connect() to
choose that address outgoing address on the same socket.

Bug: 17769720
Bug: 18609055
Change-Id: I9de0d6c92ac45e29d28e318ac626c71806666f13
Signed-off-by: Erik Kline <ek@google.com>
Signed-off-by: Lorenzo Colitti <lorenzo@google.com>
2014-12-05 04:41:18 -08:00
Jane Zhou
78a68094b2 net/ping: handle protocol mismatching scenario
ping_lookup() may return a wrong sock if sk_buff's and sock's protocols
dont' match. For example, sk_buff's protocol is ETH_P_IPV6, but sock's
sk_family is AF_INET, in that case, if sk->sk_bound_dev_if is zero, a wrong
sock will be returned.
the fix is to "continue" the searching, if no matching, return NULL.

[cherry-pick of net 91a0b60346]

Bug: 18512516
Change-Id: I520223ce53c0d4e155c37d6b65a03489cc7fd494
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru>
Cc: James Morris <jmorris@namei.org>
Cc: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org>
Cc: Patrick McHardy <kaber@trash.net>
Cc: netdev@vger.kernel.org
Cc: stable@vger.kernel.org
Signed-off-by: Jane Zhou <a17711@motorola.com>
Signed-off-by: Yiwei Zhao <gbjc64@motorola.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Lorenzo Colitti <lorenzo@google.com>
2014-12-02 02:28:04 +00:00
Erik Kline
2ce95507d5 net: ipv6: Add a sysctl to make optimistic addresses useful candidates
Add a sysctl that causes an interface's optimistic addresses
to be considered equivalent to other non-deprecated addresses
for source address selection purposes.  Preferred addresses
will still take precedence over optimistic addresses, subject
to other ranking in the source address selection algorithm.

This is useful where different interfaces are connected to
different networks from different ISPs (e.g., a cell network
and a home wifi network).

The current behaviour complies with RFC 3484/6724, and it
makes sense if the host has only one interface, or has
multiple interfaces on the same network (same or cooperating
administrative domain(s), but not in the multiple distinct
networks case.

For example, if a mobile device has an IPv6 address on an LTE
network and then connects to IPv6-enabled wifi, while the wifi
IPv6 address is undergoing DAD, IPv6 connections will try use
the wifi default route with the LTE IPv6 address, and will get
stuck until they time out.

Also, because optimistic nodes can receive frames, issue
an RTM_NEWADDR as soon as DAD starts (with the IFA_F_OPTIMSTIC
flag appropriately set).  A second RTM_NEWADDR is sent if DAD
completes (the address flags have changed), otherwise an
RTM_DELADDR is sent.

Also: add an entry in ip-sysctl.txt for optimistic_dad.

[cherry-pick of net-next 7fd2561e4e]

Signed-off-by: Erik Kline <ek@google.com>
Acked-by: Lorenzo Colitti <lorenzo@google.com>
Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Bug: 17769720
Change-Id: Ic7e50781c607e1f3a492d9ce7395946efb95c533
2014-11-05 00:46:54 +00:00
Sreeram Ramachandran
455b09d66a Handle 'sk' being NULL in UID-based routing.
Bug: 15413527
Change-Id: Iab1fae9da6053b284591628ef1de878761b137b1
Signed-off-by: Sreeram Ramachandran <sreeram@google.com>
2014-07-08 18:58:21 +00:00
Yann Soubeyrand
8d0e99be24 net: wireless: fix misplaced #endif in net/wireless/nl80211.c
The patch "nl80211: cumulative vendor command support patch" introduced
compilation error in file net/wireless/nl80211.c. The nl80211_vendor_mcgrp
variable is defined only if the CONFIG_NL80211_TESTMODE preprocessor constant
is defined. However, this variable is later used wether
CONFIG_NL80211_TESTMODE is defined or not. The cause is a misplaced #endif.

Change-Id: I466488285578d57e6554a1f8ebe71d4f3385ecf2
Signed-off-by: Dmitry Shmidt <dimitrysh@google.com>
2014-06-27 15:26:49 -07:00
Lorenzo Colitti
99a6ea48b5 net: core: Support UID-based routing.
This contains the following commits:

1. cc2f522 net: core: Add a UID range to fib rules.
2. d7ed2bd net: core: Use the socket UID in routing lookups.
3. 2f9306a net: core: Add a RTA_UID attribute to routes.
    This is so that userspace can do per-UID route lookups.
4. 8e46efb net: ipv6: Use the UID in IPv6 PMTUD
    IPv4 PMTUD already does this because ipv4_sk_update_pmtu
    uses __build_flow_key, which includes the UID.

Bug: 15413527
Change-Id: I81bd31dae655de9cce7d7a1f9a905dc1c2feba7c
Signed-off-by: Lorenzo Colitti <lorenzo@google.com>
2014-06-25 12:41:52 +09:00
Dmitry Shmidt
f6f56efe7d net: wireless: Add NL80211_FLAG_NEED_WIPHY flag to vendor command
Change-Id: I52ee3bc8a422c2a4c57cccccccd6ba3e721b4c01
Signed-off-by: Dmitry Shmidt <dimitrysh@google.com>
2014-06-24 13:20:40 -07:00
Dmitry Shmidt
8af60dae09 net: wireless: Increase scan entry expiration to fit new scan time
Change-Id: I0e23ce45d78d7c17633670973f49943a5ed6032d
Signed-off-by: Dmitry Shmidt <dimitrysh@google.com>
2014-06-24 09:41:29 -07:00
Dmitry Shmidt
47f7337804 nl80211: cumulative vendor command support patch
Based on commit d3fd06d0259232e1362c6d1da136970d26628467
Author: Johannes Berg <johannes.berg@intel.com>
Date:   Sat Jan 25 10:17:18 2014 -0800
    nl80211: vendor command support

Change-Id: I832eb4da295fe7b2c9bd8ff69ae80fe7bfe30add
Signed-off-by: Dmitry Shmidt <dimitrysh@google.com>
2014-06-17 13:42:34 -07:00
Johannes Berg
36cafd2167 nl80211: fix another nl80211_fam.attrbuf race
commit c319d50bfc upstream.

This is similar to the race Linus had reported, but in this case
it's an older bug: nl80211_prepare_wdev_dump() uses the wiphy
index in cb->args[0] as it is and thus parses the message over
and over again instead of just once because 0 is the first valid
wiphy index. Similar code in nl80211_testmode_dump() correctly
offsets the wiphy_index by 1, do that here as well.

Reported-by: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-06-16 17:34:52 -07:00
Sherman Yin
186197b12c netfilter: fix seq_printf type mismatch warning
The return type of atomic64_read() varies depending on arch.  The
arm64 version is being changed from long long to long in the mainline
for v3.16, causing a seq_printf type mismatch (%llu) in
guid_ctrl_proc_show().

This commit fixes the type mismatch by casting atomic64_read() to u64.

Change-Id: Iae0a6bd4314f5686a9f4fecbe6203e94ec0870de
Signed-off-by: Sherman Yin <shermanyin@gmail.com>
2014-06-12 14:35:38 -07:00
Andreas Henriksson
803431de4e net: Fix "ip rule delete table 256"
[ Upstream commit 13eb2ab2d3 ]

When trying to delete a table >= 256 using iproute2 the local table
will be deleted.
The table id is specified as a netlink attribute when it needs more then
8 bits and iproute2 then sets the table field to RT_TABLE_UNSPEC (0).
Preconditions to matching the table id in the rule delete code
doesn't seem to take the "table id in netlink attribute" into condition
so the frh_get_table helper function never gets to do its job when
matching against current rule.
Use the helper function twice instead of peaking at the table value directly.

Originally reported at: http://bugs.debian.org/724783

Reported-by: Nicolas HICHER <nhicher@avencall.com>
Signed-off-by: Andreas Henriksson <andreas@fatal.se>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-05-18 23:36:45 -07:00
Lorenzo Colitti
6ba3a0e3b1 net: support marking accepting TCP sockets
When using mark-based routing, sockets returned from accept()
may need to be marked differently depending on the incoming
connection request.

This is the case, for example, if different socket marks identify
different networks: a listening socket may want to accept
connections from all networks, but each connection should be
marked with the network that the request came in on, so that
subsequent packets are sent on the correct network.

This patch adds a sysctl to mark TCP sockets based on the fwmark
of the incoming SYN packet. If enabled, and an unmarked socket
receives a SYN, then the SYN packet's fwmark is written to the
connection's inet_request_sock, and later written back to the
accepted socket when the connection is established.  If the
socket already has a nonzero mark, then the behaviour is the same
as it is today, i.e., the listening socket's fwmark is used.

Black-box tested using user-mode linux:

- IPv4/IPv6 SYN+ACK, FIN, etc. packets are routed based on the
  mark of the incoming SYN packet.
- The socket returned by accept() is marked with the mark of the
  incoming SYN packet.
- Tested with syncookies=1 and syncookies=2.

Change-Id: I26bc1eceefd2c588d73b921865ab70e4645ade57
Signed-off-by: Lorenzo Colitti <lorenzo@google.com>
2014-05-16 20:58:31 +00:00
Lorenzo Colitti
2cf4347e48 net: Use fwmark reflection in PMTU discovery.
Currently, routing lookups used for Path PMTU Discovery in
absence of a socket or on unmarked sockets use a mark of 0.
This causes PMTUD not to work when using routing based on
netfilter fwmark mangling and fwmark ip rules, such as:

  iptables -j MARK --set-mark 17
  ip rule add fwmark 17 lookup 100

This patch causes these route lookups to use the fwmark from the
received ICMP error when the fwmark_reflect sysctl is enabled.
This allows the administrator to make PMTUD work by configuring
appropriate fwmark rules to mark the inbound ICMP packets.

Black-box tested using user-mode linux by pointing different
fwmarks at routing tables egressing on different interfaces, and
using iptables mangling to mark packets inbound on each interface
with the interface's fwmark. ICMPv4 and ICMPv6 PMTU discovery
work as expected when mark reflection is enabled and fail when
it is disabled.

Change-Id: Id7fefb7ec1ff7f5142fba43db1960b050e0dfaec
Signed-off-by: Lorenzo Colitti <lorenzo@google.com>
2014-05-16 19:20:43 +00:00
Lorenzo Colitti
5a87fa6a43 net: add a sysctl to reflect the fwmark on replies
Kernel-originated IP packets that have no user socket associated
with them (e.g., ICMP errors and echo replies, TCP RSTs, etc.)
are emitted with a mark of zero. Add a sysctl to make them have
the same mark as the packet they are replying to.

This allows an administrator that wishes to do so to use
mark-based routing, firewalling, etc. for these replies by
marking the original packets inbound.

Tested using user-mode linux:
 - ICMP/ICMPv6 echo replies and errors.
 - TCP RST packets (IPv4 and IPv6).

Change-Id: I6873d973196797bcf32e2e91976df647c7e8b85a
Signed-off-by: Lorenzo Colitti <lorenzo@google.com>
2014-05-16 18:41:19 +00:00
Lorenzo Colitti
a03f539b16 net: ipv6: autoconf routes into per-device tables
Currently, IPv6 router discovery always puts routes into
RT6_TABLE_MAIN. This causes problems for connection managers
that want to support multiple simultaneous network connections
and want control over which one is used by default (e.g., wifi
and wired).

To work around this connection managers typically take the routes
they prefer and copy them to static routes with low metrics in
the main table. This puts the burden on the connection manager
to watch netlink to see if the routes have changed, delete the
routes when their lifetime expires, etc.

Instead, this patch adds a per-interface sysctl to have the
kernel put autoconf routes into different tables. This allows
each interface to have its own autoconf table, and choosing the
default interface (or using different interfaces at the same
time for different types of traffic) can be done using
appropriate ip rules.

The sysctl behaves as follows:

- = 0: default. Put routes into RT6_TABLE_MAIN as before.
- > 0: manual. Put routes into the specified table.
- < 0: automatic. Add the absolute value of the sysctl to the
       device's ifindex, and use that table.

The automatic mode is most useful in conjunction with
net.ipv6.conf.default.accept_ra_rt_table. A connection manager
or distribution could set it to, say, -100 on boot, and
thereafter just use IP rules.

Change-Id: I82d16e3737d9cdfa6489e649e247894d0d60cbb1
Signed-off-by: Lorenzo Colitti <lorenzo@google.com>
2014-05-16 18:19:20 +00:00
Lorenzo Colitti
03fdcba8c6 net: ipv6: ping: Use socket mark in routing lookup
Change-Id: I5a61e0f9f22f193c51b1aafd270fb0642a2e0fab
Signed-off-by: Lorenzo Colitti <lorenzo@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-05-16 18:18:51 +00:00
Wang, Xiaoming
a8fe78a7aa net: ipv4: current group_info should be put after using.
Plug a group_info refcount leak in ping_init.
group_info is only needed during initialization and
the code failed to release the reference on exit.
While here move grabbing the reference to a place
where it is actually needed.

Signed-off-by: Chuansheng Liu <chuansheng.liu@intel.com>
Signed-off-by: Zhang Dongxing <dongxing.zhang@intel.com>
Signed-off-by: xiaoming wang <xiaoming.wang@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-04-24 17:36:41 -07:00
Ruchi Kandoi
999aaabf02 nf: Remove compilation error caused by
e8430cbed3

Signed-off-by: Ruchi Kandoi <kandoiruchi@google.com>
2014-04-24 21:15:09 +00:00
Ruchi Kandoi
e8430cbed3 nf: IDLETIMER: time-stamp and suspend/resume handling.
Message notifications contains an additional timestamp field in nano seconds.
The expiry time for the timers are modified during suspend/resume.
If timer was supposed to expire while the system is suspended then a
notification is sent when it resumes with the timestamp of the scheduled expiry.

Removes the race condition for multiple work scheduled.

Bug: 13247811

Change-Id: I752c5b00225fe7085482819f975cc0eb5af89bff
Signed-off-by: Ruchi Kandoi <kandoiruchi@google.com>
2014-04-22 18:05:20 -07:00
Greg Hackmann
7d89969ad2 netfilter: xt_qtaguid: 64-bit warning fixes
Change-Id: I2adc517c0c51050ed601992fa0ea4de8f1449414
Signed-off-by: Greg Hackmann <ghackmann@google.com>
2014-02-25 10:04:34 -08:00
JP Abgrall
969ff3bbb3 tcp: add a sysctl to config the tcp_default_init_rwnd
The default initial rwnd is hardcoded to 10.

Now we allow it to be controlled via
  /proc/sys/net/ipv4/tcp_default_init_rwnd
which limits the values from 3 to 100

This is somewhat needed because ipv6 routes are
autoconfigured by the kernel.

See "An Argument for Increasing TCP's Initial Congestion Window"
in https://developers.google.com/speed/articles/tcp_initcwnd_paper.pdf

Change-Id: I386b2a9d62de0ebe05c1ebe1b4bd91b314af5c54
Signed-off-by: JP Abgrall <jpa@google.com>

Conflicts:
	net/ipv4/sysctl_net_ipv4.c
	net/ipv4/tcp_input.c
2014-02-07 18:40:10 -08:00
Ashish Sharma
537057b66d netfilter: xt_IDLETIMER: Revert to retain the kernel API format.
Reverted Change-Id: Iaeca5dd2d7878c0733923ae03309a2a7b86979ca

Change-Id: I0e0a4f60ec14330d8d8d1c5a508fa058d9919e07
Signed-off-by: Ashish Sharma <ashishsharma@google.com>
(cherry picked from commit e0a4e5b0e808d718dd9af500c5754118fc3935db)
2014-02-05 19:57:22 +00:00
Hannes Frederic Sowa
5a0312add7 ping: prevent NULL pointer dereference on write to msg_name
A plain read() on a socket does set msg->msg_name to NULL. So check for
NULL pointer first.

[Backport of net-next cf970c002d]

Bug: 12780426
Change-Id: I3df76aca2fa56478b9a33c404f7b1f0940475ef7
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Lorenzo Colitti <lorenzo@google.com>
2014-01-30 11:39:33 -08:00
Lorenzo Colitti
b918c72ce2 net: ipv6: add missing lock in ping_v6_sendmsg
[net-next commit a1bdc45580]

Bug: 12800827
Change-Id: I93d897e5043dc89bc99f111c89ef4f8b1fa1885d
Signed-off-by: Lorenzo Colitti <lorenzo@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-30 11:39:33 -08:00
Lorenzo Colitti
2ac994153e net: ipv6: fix wrong ping_v6_sendmsg return value
[net-next commit fbfe80c890]

ping_v6_sendmsg currently returns 0 on success. It should return
the number of bytes written instead.

Bug: 12800827
Change-Id: I7ed17dc61afbb68a84908e67e44db976ec812bad
Signed-off-by: Lorenzo Colitti <lorenzo@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-30 11:39:33 -08:00
Cong Wang
51d00bd5ef ping: always initialize ->sin6_scope_id and ->sin6_flowinfo
[net-next commit c26d6b46da]

If we don't need scope id, we should initialize it to zero.
Same for ->sin6_flowinfo.

Bug: 12800827
Change-Id: Ic19792cee3f5dc30237562cf48e6bdf49817c96e
Cc: Lorenzo Colitti <lorenzo@google.com>
Cc: David S. Miller <davem@davemloft.net>
Signed-off-by: Cong Wang <amwang@redhat.com>
Acked-by: Lorenzo Colitti <lorenzo@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-30 11:39:33 -08:00
Lorenzo Colitti
fd762f67b3 net: ipv6: Add IPv6 support to the ping socket.
[net-next commit 6d0bfe2261]

This adds the ability to send ICMPv6 echo requests without a
raw socket. The equivalent ability for ICMPv4 was added in
2011.

Instead of having separate code paths for IPv4 and IPv6, make
most of the code in net/ipv4/ping.c dual-stack and only add a
few IPv6-specific bits (like the protocol definition) to a new
net/ipv6/ping.c. Hopefully this will reduce divergence and/or
duplication of bugs in the future.

Caveats:

- Setting options via ancillary data (e.g., using IPV6_PKTINFO
  to specify the outgoing interface) is not yet supported.
- There are no separate security settings for IPv4 and IPv6;
  everything is controlled by /proc/net/ipv4/ping_group_range.
- The proc interface does not yet display IPv6 ping sockets
  properly.

Tested with a patched copy of ping6 and using raw socket calls.
Compiles and works with all of CONFIG_IPV6={n,m,y}.

Bug: 12800827
Change-Id: I718cd9931823873ab44df22e8a66e12d6a0a6eb1
Signed-off-by: Lorenzo Colitti <lorenzo@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-30 11:39:32 -08:00
Greg Hackmann
2c06cb2045 netfilter: xt_qtaguid: fix memory leak in seq_file handlers
Change-Id: I15b21230d52479d008a00d9e2191dda020f00925
Signed-off-by: Greg Hackmann <ghackmann@google.com>
2013-12-04 17:48:59 -08:00
Arve Hjønnevåg
21dd5d7718 netfilter: xt_qtaguid: 3.10 fixes
Stop using obsolete procfs api.

Signed-off-by: Arve Hjønnevåg <arve@android.com>
2013-07-01 15:52:04 -07:00
Arve Hjønnevåg
73570fe76d netfilter: xt_quota2: 3.10 fixes.
- Stop using obsolete create_proc_entry api.
- Use proc_set_user instead of directly accessing the private structure.

Signed-off-by: Arve Hjønnevåg <arve@android.com>
2013-07-01 15:52:03 -07:00
Arve Hjønnevåg
4af1c50c2b net: activity_stats: Stop using obsolete create_proc_read_entry api
Convert to use seq_read

Signed-off-by: Arve Hjønnevåg <arve@android.com>
2013-07-01 15:52:02 -07:00
JP Abgrall
24d852635b netfilter: xt_qtaguid: fix bad tcp_time_wait sock handling
Since (41063e9 ipv4: Early TCP socket demux), skb's can have an sk which
is not a struct sock but the smaller struct inet_timewait_sock without an
sk->sk_socket. Now we bypass sk_state == TCP_TIME_WAIT

Signed-off-by: JP Abgrall <jpa@google.com>
2013-07-01 15:51:58 -07:00
Colin Cross
1dc1499a43 af_unix: use freezable blocking calls in read
Avoid waking up every thread sleeping in read call on an AF_UNIX
socket during suspend and resume by calling a freezable blocking
call.  Previous patches modified the freezer to avoid sending
wakeups to threads that are blocked in freezable blocking calls.

This call was selected to be converted to a freezable call because
it doesn't hold any locks or release any resources when interrupted
that might be needed by another freezing task or a kernel driver
during suspend, and is a common site where idle userspace tasks are
blocked.

Change-Id: I788246a76780ea892659526e70be018b18f646c4
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Colin Cross <ccross@android.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-07-01 15:46:26 -07:00
Colin Cross
c4bdacb9e8 freezer: add unsafe versions of freezable helpers for NFS
NFS calls the freezable helpers with locks held, which is unsafe
and will cause lockdep warnings when 6aa9707 "lockdep: check
that no locks held at freeze time" is reapplied (it was reverted
in dbf520a).  NFS shouldn't be doing this, but it has
long-running syscalls that must hold a lock but also shouldn't
block suspend.  Until NFS freeze handling is rewritten to use a
signal to exit out of the critical section, add new *_unsafe
versions of the helpers that will not run the lockdep test when
6aa9707 is reapplied, and call them from NFS.

In practice the likley result of holding the lock while freezing
is that a second task blocked on the lock will never freeze,
aborting suspend, but it is possible to manufacture a case using
the cgroup freezer, the lock, and the suspend freezer to create
a deadlock.  Silencing the lockdep warning here will allow
problems to be found in other drivers that may have a more
serious deadlock risk, and prevent new problems from being added.

Change-Id: Ia17d32cdd013a6517bdd5759da900970a4427170
Signed-off-by: Colin Cross <ccross@android.com>
Acked-by: Pavel Machek <pavel@ucw.cz>
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-07-01 15:37:43 -07:00
JP Abgrall
52ae0941a4 netfilter: qtaguid: rate limit some of the printks
Some of the printks are in the packet handling path.
We now ratelimit the very unlikely errors to avoid
kmsg spamming.

Signed-off-by: JP Abgrall <jpa@google.com>
2013-07-01 14:16:23 -07:00
JP Abgrall
2b6cc26b6b net: bluetooth: Remove the AID_NET_BT* gid numbers
Removed bluetooth checks for AID_NET_BT and AID_NET_BT_ADMIN
which are not useful anymore.
This is in preparation for getting rid of all the AID_* gids.

Signed-off-by: JP Abgrall <jpa@google.com>
2013-07-01 14:16:21 -07:00
JP Abgrall
af4d7c85bc netfilter: xt_qtaguid: Allow tracking loopback
In the past it would always ignore interfaces with loopback addresses.
Now we just treat them like any other.
This also helps with writing tests that check for the presence
of the qtaguid module.

Signed-off-by: JP Abgrall <jpa@google.com>
2013-07-01 14:16:20 -07:00
JP Abgrall
9766450977 netfilter: xt_qtaguid: extend iface stat to report protocols
In the past the iface_stat_fmt would only show global bytes/packets
for the skb-based numbers.
For stall detection in userspace, distinguishing tcp vs other protocols
makes it easier.
Now we report
  ifname total_skb_rx_bytes total_skb_rx_packets total_skb_tx_bytes
  total_skb_tx_packets {rx,tx}_{tcp,udp,ohter}_{bytes,packets}

Bug: 6818637
Signed-off-by: JP Abgrall <jpa@google.com>
2013-07-01 14:16:20 -07:00
JP Abgrall
e585d17670 netfilter: xt_qtaguid: remove AID_* dependency for access control
qtaguid limits what can be done with /ctrl and /stats based on group
membership.
This changes removes AID_NET_BW_STATS and AID_NET_BW_ACCT, and picks
up the groups from the gid of the matching proc entry files.

Signed-off-by: JP Abgrall <jpa@google.com>
Change-Id: I42e477adde78a12ed5eb58fbc0b277cdaadb6f94
2013-07-01 14:16:19 -07:00
Pontus Fuchs
ef8e4b827d netfilter: qtaguid: Don't BUG_ON if create_if_tag_stat fails
If create_if_tag_stat fails to allocate memory (GFP_ATOMIC) the
following will happen:

qtaguid: iface_stat: tag stat alloc failed
...
kernel BUG at xt_qtaguid.c:1482!

Signed-off-by: Pontus Fuchs <pontus.fuchs@gmail.com>
2013-07-01 14:16:14 -07:00
JP Abgrall
cb80c2ec45 netfilter: xt_qtaguid: fix error exit that would keep a spinlock.
qtudev_open() could return with a uid_tag_data_tree_lock held
when an kzalloc(..., GFP_ATOMIC) would fail.
Very unlikely to get triggered AND survive the mayhem of running out of mem.

Signed-off-by: JP Abgrall <jpa@google.com>
2013-07-01 14:16:11 -07:00
JP Abgrall
c33bcd3772 netfilter: xt_qtaguid: report only uid tags to non-privileged processes
In the past, a process could only see its own stats (uid-based summary,
and details).
Now we allow any process to see other UIDs uid-based stats, but still
hide the detailed stats.

Change-Id: I7666961ed244ac1d9359c339b048799e5db9facc
Signed-off-by: JP Abgrall <jpa@google.com>
2013-07-01 14:16:05 -07:00
Ashish Sharma
c669cb755c netfilter: xt_IDLETIMER: Rename INTERFACE to LABEL in netlink notification.
Change-Id: Iaeca5dd2d7878c0733923ae03309a2a7b86979ca
Signed-off-by: Ashish Sharma <ashishsharma@google.com>
2013-07-01 13:40:56 -07:00
JP Abgrall
c90ab94236 netfilter: xt_qtaguid: start tracking iface rx/tx at low level
qtaguid tracks the device stats by monitoring when it goes up and down,
then it gets the dev_stats().
But devs don't correctly report stats (either they don't count headers
symmetrically between rx/tx, or they count internal control messages).

Now qtaguid counts the rx/tx bytes/packets during raw:prerouting and
mangle:postrouting (nat is not available in ipv6).

The results are in
  /proc/net/xt_qtaguid/iface_stat_fmt
which outputs a format line (bash expansion):
  ifname  total_skb_{rx,tx}_{bytes,packets}

Added event counters for pre/post handling.
Added extra ctrl_*() pid/uid debugging.

Change-Id: Id84345d544ad1dd5f63e3842cab229e71d339297
Signed-off-by: JP Abgrall <jpa@google.com>
2013-07-01 13:40:45 -07:00
JP Abgrall
2b11da7d0a netfilter: xt_IDLETIMER: Add new netlink msg type
Send notifications when the label becomes active after an idle period.
Send netlink message notifications in addition to sysfs notifications.
Using a uevent with
  subsystem=xt_idletimer
  INTERFACE=...
  STATE={active,inactive}

This is backport from common android-3.0
commit: beb914e987
with uevent support instead of a new netlink message type.

Change-Id: I31677ef00c94b5f82c8457e5bf9e5e584c23c523
Signed-off-by: Ashish Sharma <ashishsharma@google.com>
Signed-off-by: JP Abgrall <jpa@google.com>
2013-07-01 13:40:45 -07:00
JP Abgrall
8b041edbc7 netfilter: xt_qtaguid: fix ipv6 protocol lookup
When updating the stats for a given uid it would incorrectly assume
IPV4 and pick up the wrong protocol when IPV6.

Change-Id: Iea4a635012b4123bf7aa93809011b7b2040bb3d5
Signed-off-by: JP Abgrall <jpa@google.com>
2013-07-01 13:40:42 -07:00
JP Abgrall
cc53cb8db6 netfilter: qtaguid: initialize a local var to keep compiler happy.
There was a case that might have seemed like new_tag_stat was not
initialized and actually used.
Added comment explaining why it was impossible, and a BUG()
in case the logic gets changed.

Change-Id: I1eddd1b6f754c08a3bf89f7e9427e5dce1dfb081
Signed-off-by: JP Abgrall <jpa@google.com>
2013-07-01 13:40:42 -07:00
Ashish Sharma
19d2b31c93 bridge: Have tx_bytes count headers like rx_bytes.
Since rx_bytes accounting does not include Ethernet Headers in
br_input.c, excluding ETH_HLEN on the transmit path for consistent
measurement of packet length on both the Tx and Rx chains.

The clean way would be for Rx to include the eth header, but the
skb len has already been adjusted by the time the br code sees the skb.
This is only a temporary workaround until we can completely ignore or
cleanly fix the skb->len handling.

Change-Id: I910de95a4686b2119da7f1f326e2154ef31f9972
Signed-off-by: Ashish Sharma <ashishsharma@google.com>
2013-07-01 13:40:37 -07:00
JP Abgrall
ef08cfe9ce netfilter: ipv6: fix crash caused by ipv6_find_hdr()
When calling:
    ipv6_find_hdr(skb, &thoff, -1, NULL)
on a fragmented packet, thoff would be left with a random
value causing callers to read random memory offsets with:
    skb_header_pointer(skb, thoff, ...)

Now we force ipv6_find_hdr() to return a failure in this case.
Calling:
  ipv6_find_hdr(skb, &thoff, -1, &fragoff)
will set fragoff as expected, and not return a failure.

Change-Id: Ib474e8a4267dd2b300feca325811330329684a88
Signed-off-by: JP Abgrall <jpa@google.com>
2013-07-01 13:40:36 -07:00