Introduce fib6_nh structure and move nexthop related data from
rt6_info and rt6_info.dst to fib6_nh. References to dev, gateway or
lwtstate from a FIB lookup perspective are converted to use fib6_nh;
datapath references to dst version are left as is.
Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The RTN_ type for IPv6 FIB entries is currently embedded in rt6i_flags
and dst.error. Since dst is going to be removed, it can no longer be
relied on for FIB dumps so save the route type as fib6_type.
fc_type is set in current users based on the algorithm in rt6_fill_node:
- rt6i_flags contains RTF_LOCAL: fc_type = RTN_LOCAL
- rt6i_flags contains RTF_ANYCAST: fc_type = RTN_ANYCAST
- else fc_type = RTN_UNICAST
Similarly, fib6_type is set in the rt6_info templates based on the
RTF_REJECT section of rt6_fill_node converting dst.error to RTN type.
Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pass network namespace reference into route add, delete and get
functions.
Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pass net namespace to fib6_update_sernum. It can not be marked const
as fib6_new_sernum will change ipv6.fib6_sernum.
Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Move logic of fib_convert_metrics into ip_metrics_convert. This allows
the code that converts netlink attributes into metrics struct to be
re-used in a later patch by IPv6.
This is mostly a code move with the following changes to variable names:
- fi->fib_net becomes net
- fc_mx and fc_mx_len are passed as inputs pulled from fib_config
- metrics array is passed as an input from fi->fib_metrics->metrics
Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The docs for the spi_imx platform data still refer to a -32 offset used to
specify a native chip select. This was removed in commit 602c8f4485
("spi: imx: fix use of native chip-selects with devicetree") and no
longer works as documented. Update documentation.
The macro MXC_SPI_CS() is no longer is needed.
If a board uses all native chip selects, then it's not necessary to
specify a chip select array at all, as all native is the default (this is
how device-tree configured SPI masters work too). Most of the spi-imx
platform data users have their chip select arrays removed by this patch.
This patch also fixes a bug in mx31moboard introduced in the '602 commit.
When that board was updated in commit 901f26bce6 ("ARM: imx: set
correct chip_select in platform setup") to reflect the SPI change, only
SPI bus 2 was updated and SPI bus 1 was left with non-sequential chip
selects. The mc13783 spi device on bus 1 had its chip select updated as
if it were on bus 2.
CC: Sascha Hauer <kernel@pengutronix.de>
CC: Fabio Estevam <fabio.estevam@nxp.com>
Acked-by: Greg Ungerer <gerg@linux-m68k.org>
Reviewed-by: Oleksij Rempel <o.rempel@pengutronix.de>
Signed-off-by: Trent Piepho <tpiepho@impinj.com>
Signed-off-by: Shawn Guo <shawnguo@kernel.org>
The only thing it does is block module unload while work is posted from
rdma_resolve_ip().
However, this is not the right place to do this. The users of
rdma_resolve_ip() must ensure their own module does not unload until
rdma_resolve_ip() calls the callback, or until rdma_addr_cancel() is
called.
Similarly callers to rdma_addr_find_l2_eth_by_grh() must ensure their
module does not unload while they are calling code.
The only two users are already safe, so there is no need for this.
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
In some scenarios, user need do some time-consuming or sleepable
operations under the hardware spinlock protection for synchronization
between the multiple subsystems.
For example, there is one PMIC efuse on Spreadtrum platform, which
need to be accessed under one hardware lock. But during the hardware
lock protection, the efuse operation is time-consuming to almost 5 ms,
so we can not disable the interrupts or preemption so long in this case.
Thus we can introduce one new mode to indicate that we just acquire the
hardware lock and do not disable interrupts or preemption, meanwhile we
should force user to protect the hardware lock with mutex or spinlock to
avoid dead-lock.
Signed-off-by: Baolin Wang <baolin.wang@linaro.org>
Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Like tos inherit, ttl inherit should also means inherit the inner protocol's
ttl values, which actually not implemented in vxlan yet.
But we could not treat ttl == 0 as "use the inner TTL", because that would be
used also when the "ttl" option is not specified and that would be a behavior
change, and breaking real use cases.
So add a different attribute IFLA_VXLAN_TTL_INHERIT when "ttl inherit" is
specified with ip cmd.
Reported-by: Jianlin Shi <jishi@redhat.com>
Suggested-by: Jiri Benc <jbenc@redhat.com>
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The statistics such as InHdrErrors should be counted on the ingress
netdev rather than on the dev from the dst, which is the egress.
Signed-off-by: Stephen Suryaputra <ssuryaextr@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
BPF core gets access to __inet6_bind via ipv6_bpf_stub_impl, so it is
not invoked directly outside of af_inet6.c. Make it static and move
inet6_bind after to avoid forward declaration.
Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Give topology clients more access to the topology data by passing index,
pcm, link_config and dai_driver to clients. This allows clients to fully
instantiate and track topology objects.
The SOF driver is the first user of these new APIs and needs them to build
component topology driver and FW objects.
Signed-off-by: Liam Girdwood <liam.r.girdwood@linux.intel.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
Machine drivers statically define a number of DAI links that currently
cannot be changed or removed by topology. This means PCMs and platform
components cannot be changed by topology at runtime AND machine drivers
are tightly coupled to topology.
This patch allows topology to override the machine driver DAI link config
in order to reuse machine drivers with different topologies and platform
components. The patch supports :-
1) create new FE PCMs with a topology defined PCM ID.
2) destroy existing static FE PCMs
3) change the platform component driver.
4) assign any new HW params fixups.
The patch requires no changes to the machine drivers, but does add some
platform component flags that the platform component driver can assign
before loading topologies.
Signed-off-by: Liam Girdwood <liam.r.girdwood@linux.intel.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
Changing API ndo_xdp_xmit to take a struct xdp_frame instead of struct
xdp_buff. This brings xdp_return_frame and ndp_xdp_xmit in sync.
This builds towards changing the API further to become a bulk API,
because xdp_buff is not a queue-able object while xdp_frame is.
V4: Adjust for commit 59655a5b6c ("tuntap: XDP_TX can use native XDP")
V7: Adjust for commit d9314c474d ("i40e: add support for XDP_REDIRECT")
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Changing API xdp_return_frame() to take struct xdp_frame as argument,
seems like a natural choice. But there are some subtle performance
details here that needs extra care, which is a deliberate choice.
When de-referencing xdp_frame on a remote CPU during DMA-TX
completion, result in the cache-line is change to "Shared"
state. Later when the page is reused for RX, then this xdp_frame
cache-line is written, which change the state to "Modified".
This situation already happens (naturally) for, virtio_net, tun and
cpumap as the xdp_frame pointer is the queued object. In tun and
cpumap, the ptr_ring is used for efficiently transferring cache-lines
(with pointers) between CPUs. Thus, the only option is to
de-referencing xdp_frame.
It is only the ixgbe driver that had an optimization, in which it can
avoid doing the de-reference of xdp_frame. The driver already have
TX-ring queue, which (in case of remote DMA-TX completion) have to be
transferred between CPUs anyhow. In this data area, we stored a
struct xdp_mem_info and a data pointer, which allowed us to avoid
de-referencing xdp_frame.
To compensate for this, a prefetchw is used for telling the cache
coherency protocol about our access pattern. My benchmarks show that
this prefetchw is enough to compensate the ixgbe driver.
V7: Adjust for commit d9314c474d ("i40e: add support for XDP_REDIRECT")
V8: Adjust for commit bd658dda42 ("net/mlx5e: Separate dma base address
and offset in dma_sync call")
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
New allocator type MEM_TYPE_PAGE_POOL for page_pool usage.
The registered allocator page_pool pointer is not available directly
from xdp_rxq_info, but it could be (if needed). For now, the driver
should keep separate track of the page_pool pointer, which it should
use for RX-ring page allocation.
As suggested by Saeed, to maintain a symmetric API it is the drivers
responsibility to allocate/create and free/destroy the page_pool.
Thus, after the driver have called xdp_rxq_info_unreg(), it is drivers
responsibility to free the page_pool, but with a RCU free call. This
is done easily via the page_pool helper page_pool_destroy() (which
avoids touching any driver code during the RCU callback, which could
happen after the driver have been unloaded).
V8: address issues found by kbuild test robot
- Address sparse should be static warnings
- Allow xdp.o to be compiled without page_pool.o
V9: Remove inline from .c file, compiler knows best
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Need a fast page recycle mechanism for ndo_xdp_xmit API for returning
pages on DMA-TX completion time, which have good cross CPU
performance, given DMA-TX completion time can happen on a remote CPU.
Refurbish my page_pool code, that was presented[1] at MM-summit 2016.
Adapted page_pool code to not depend the page allocator and
integration into struct page. The DMA mapping feature is kept,
even-though it will not be activated/used in this patchset.
[1] http://people.netfilter.org/hawk/presentations/MM-summit2016/generic_page_pool_mm_summit2016.pdf
V2: Adjustments requested by Tariq
- Changed page_pool_create return codes, don't return NULL, only
ERR_PTR, as this simplifies err handling in drivers.
V4: many small improvements and cleanups
- Add DOC comment section, that can be used by kernel-doc
- Improve fallback mode, to work better with refcnt based recycling
e.g. remove a WARN as pointed out by Tariq
e.g. quicker fallback if ptr_ring is empty.
V5: Fixed SPDX license as pointed out by Alexei
V6: Adjustments requested by Eric Dumazet
- Adjust ____cacheline_aligned_in_smp usage/placement
- Move rcu_head in struct page_pool
- Free pages quicker on destroy, minimize resources delayed an RCU period
- Remove code for forward/backward compat ABI interface
V8: Issues found by kbuild test robot
- Address sparse should be static warnings
- Only compile+link when a driver use/select page_pool,
mlx5 selects CONFIG_PAGE_POOL, although its first used in two patches
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Use the IDA infrastructure for getting a cyclic increasing ID number,
that is used for keeping track of each registered allocator per
RX-queue xdp_rxq_info. Instead of using the IDR infrastructure, which
uses a radix tree, use a dynamic rhashtable, for creating ID to
pointer lookup table, because this is faster.
The problem that is being solved here is that, the xdp_rxq_info
pointer (stored in xdp_buff) cannot be used directly, as the
guaranteed lifetime is too short. The info is needed on a
(potentially) remote CPU during DMA-TX completion time . In an
xdp_frame the xdp_mem_info is stored, when it got converted from an
xdp_buff, which is sufficient for the simple page refcnt based recycle
schemes.
For more advanced allocators there is a need to store a pointer to the
registered allocator. Thus, there is a need to guard the lifetime or
validity of the allocator pointer, which is done through this
rhashtable ID map to pointer. The removal and validity of of the
allocator and helper struct xdp_mem_allocator is guarded by RCU. The
allocator will be created by the driver, and registered with
xdp_rxq_info_reg_mem_model().
It is up-to debate who is responsible for freeing the allocator
pointer or invoking the allocator destructor function. In any case,
this must happen via RCU freeing.
Use the IDA infrastructure for getting a cyclic increasing ID number,
that is used for keeping track of each registered allocator per
RX-queue xdp_rxq_info.
V4: Per req of Jason Wang
- Use xdp_rxq_info_reg_mem_model() in all drivers implementing
XDP_REDIRECT, even-though it's not strictly necessary when
allocator==NULL for type MEM_TYPE_PAGE_SHARED (given it's zero).
V6: Per req of Alex Duyck
- Introduce rhashtable_lookup() call in later patch
V8: Address sparse should be static warnings (from kbuild test robot)
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The generic xdp_frame format, was inspired by the cpumap own internal
xdp_pkt format. It is now time to convert it over to the generic
xdp_frame format. The cpumap needs one extra field dev_rx.
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The tuntap driver invented it's own driver specific way of queuing
XDP packets, by storing the xdp_buff information in the top of
the XDP frame data.
Convert it over to use the more generic xdp_frame structure. The
main problem with the in-driver method is that the xdp_rxq_info pointer
cannot be trused/used when dequeueing the frame.
V3: Remove check based on feedback from Jason
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This is needed to convert drivers tuntap and virtio_net.
This is a generalization of what is done inside cpumap, which will be
converted later.
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This is done to prepare for the next patch, and it is also
nice to move this XDP related struct out of filter.h.
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Introduce an xdp_return_frame API, and convert over cpumap as
the first user, given it have queued XDP frame structure to leverage.
V3: Cleanup and remove C99 style comments, pointed out by Alex Duyck.
V6: Remove comment that id will be added later (Req by Alex Duyck)
V8: Rename enum mem_type to xdp_mem_type (found by kbuild test robot)
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This is the sync up with the canonical definition of the sound
protocol in Xen:
1. Protocol version was referenced in the protocol description,
but missed its definition. Fixed by adding a constant
for current protocol version.
2. Some of the request descriptions have "reserved" fields
missed: fixed by adding corresponding entries.
3. Extend the size of the requests and responses to 64 octets.
Bump protocol version to 2.
4. Add explicit back and front synchronization
In order to provide explicit synchronization between backend and
frontend the following changes are introduced in the protocol:
- add new ring buffer for sending asynchronous events from
backend to frontend to report number of bytes played by the
frontend (XENSND_EVT_CUR_POS)
- introduce trigger events for playback control: start/stop/pause/resume
- add "req-" prefix to event-channel and ring-ref to unify naming
of the Xen event channels for requests and events
5. Add explicit back and front parameter negotiation
In order to provide explicit stream parameter negotiation between
backend and frontend the following changes are introduced in the protocol:
add XENSND_OP_HW_PARAM_QUERY request to read/update
configuration space for the parameters given: request passes
desired parameter's intervals/masks and the response to this request
returns allowed min/max intervals/masks to be used.
Signed-off-by: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
Signed-off-by: Oleksandr Grytsov <oleksandr_grytsov@epam.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
This is something that bothered us from a long time. When hid-input
doesn't know how to map a usage, it uses *_MISC. But there is something
else which increments the usage if the evdev code is already used.
This leads to few issues:
- some devices may have their ABS_X mapped to ABS_Y if they export a bad
set of usages (see the DragonRise joysticks IIRC -> fixed in a specific
HID driver)
- *_MISC + N might (will) conflict with other defined axes (my Logitech
H800 exports some multitouch axes because of that)
- this prevents to freely add some new evdev usages, because "hey, my
headset will now report ABS_COFFEE, and it's not coffee capable".
So let's try to kill this nonsense, and hope we won't break too many
devices.
I my headset case, the ABS_MISC axes are created because of some
proprietary usages, so we might not break that many devices.
For backward compatibility, a quirk HID_QUIRK_INCREMENT_USAGE_ON_DUPLICATE
is created and can be applied to any device that needs this behavior.
Signed-off-by: Benjamin Tissoires <benjamin.tissoires@redhat.com>
Acked-by: Peter Hutterer <peter.hutterer@who-t.net>
Acked-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
We might need to do some actions before the shadow variable is freed.
For example, we might need to remove it from a list or free some data
that it points to.
This is already possible now. The user can get the shadow variable
by klp_shadow_get(), do the necessary actions, and then call
klp_shadow_free().
This patch allows to do it a more elegant way. The user could implement
the needed actions in a callback that is passed to klp_shadow_free()
as a parameter. The callback usually does reverse operations to
the constructor callback that can be called by klp_shadow_*alloc().
It is especially useful for klp_shadow_free_all(). There we need to do
these extra actions for each found shadow variable with the given ID.
Note that the memory used by the shadow variable itself is still released
later by rcu callback. It is needed to protect internal structures that
keep all shadow variables. But the destructor is called immediately.
The shadow variable must not be access anyway after klp_shadow_free()
is called. The user is responsible to protect this any suitable way.
Be aware that the destructor is called under klp_shadow_lock. It is
the same as for the contructor in klp_shadow_alloc().
Signed-off-by: Petr Mladek <pmladek@suse.com>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Acked-by: Miroslav Benes <mbenes@suse.cz>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
The existing API allows to pass a sample data to initialize the shadow
data. It works well when the data are position independent. But it fails
miserably when we need to set a pointer to the shadow structure itself.
Unfortunately, we might need to initialize the pointer surprisingly
often because of struct list_head. It is even worse because the list
might be hidden in other common structures, for example, struct mutex,
struct wait_queue_head.
For example, this was needed to fix races in ALSA sequencer. It required
to add mutex into struct snd_seq_client. See commit b3defb791b
("ALSA: seq: Make ioctls race-free") and commit d15d662e89
("ALSA: seq: Fix racy pool initializations")
This patch makes the API more safe. A custom constructor function and data
are passed to klp_shadow_*alloc() functions instead of the sample data.
Note that ctor_data are no longer a template for shadow->data. It might
point to any data that might be necessary when the constructor is called.
Also note that the constructor is called under klp_shadow_lock. It is
an internal spin_lock that synchronizes alloc() vs. get() operations,
see klp_shadow_get_or_alloc(). On one hand, this adds a risk of ABBA
deadlocks. On the other hand, it allows to do some operations safely.
For example, we could add the new structure into an existing list.
This must be done only once when the structure is allocated.
Reported-by: Nicolai Stange <nstange@suse.de>
Signed-off-by: Petr Mladek <pmladek@suse.com>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Acked-by: Miroslav Benes <mbenes@suse.cz>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Those definitions used to be part of the original patch:
https://patchwork.kernel.org/patch/2815221/
But, somehow, nobody ever noticed until today. Years later,
Arnd discovered that mmp-camera driver doesn't build and make
it depend on BROKEN.
Add the missing bits here, in order to remove BROKEN dependency.
Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
Add support for the global clock controller found on MSM8998
based devices. This should allow most non-multimedia device
drivers to probe and control their clocks.
Signed-off-by: Joonwoo Park <joonwoop@codeaurora.org>
Signed-off-by: Imran Khan <kimran@codeaurora.org>
Signed-off-by: Rajendra Nayak <rnayak@codeaurora.org>
[bjorn: Specify regs for alpha_plls, fix white spaces and add binding]
Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
security_settime was a wrapper around security_settime64. There are no more
users of it. Therefore it can be removed. It was removed in:
commit 4eb1bca179 ("time: Use do_settimeofday64() internally")
Signed-off-by: Sargun Dhillon <sargun@sargun.me>
Signed-off-by: James Morris <james.morris@microsoft.com>
This tracepoint was replaced by inet_sock_set_state in 563e0bb and not
used anywhere in the kernel anymore. Remove it.
Signed-off-by: Andrey Ignatov <rdna@fb.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch adds missing values for the max read request size.
E.g. network driver r8169 uses a value of 4K.
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Make lib/textsearch.c usable as kernel-doc.
Add textsearch() function family to kernel-api documentation.
Fix kernel-doc warnings in <linux/textsearch.h>:
../include/linux/textsearch.h:65: warning: Incorrect use of kernel-doc format:
* get_next_block - fetch next block of data
../include/linux/textsearch.h:82: warning: Incorrect use of kernel-doc format:
* finish - finalize/clean a series of get_next_block() calls
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Some networks can make sure TCP payload can exactly fit 4KB pages,
with well chosen MSS/MTU and architectures.
Implement mmap() system call so that applications can avoid
copying data without complex splice() games.
Note that a successful mmap( X bytes) on TCP socket is consuming
bytes, as if recvmsg() has been done. (tp->copied += X)
Only PROT_READ mappings are accepted, as skb page frags
are fundamentally shared and read only.
If tcp_mmap() finds data that is not a full page, or a patch of
urgent data, -EINVAL is returned, no bytes are consumed.
Application must fallback to recvmsg() to read the problematic sequence.
mmap() wont block, regardless of socket being in blocking or
non-blocking mode. If not enough bytes are in receive queue,
mmap() would return -EAGAIN, or -EIO if socket is in a state
where no other bytes can be added into receive queue.
An application might use SO_RCVLOWAT, poll() and/or ioctl( FIONREAD)
to efficiently use mmap()
On the sender side, MSG_EOR might help to clearly separate unaligned
headers and 4K-aligned chunks if necessary.
Tested:
mlx4 (cx-3) 40Gbit NIC, with tcp_mmap program provided in following patch.
MTU set to 4168 (4096 TCP payload, 40 bytes IPv6 header, 32 bytes TCP header)
Without mmap() (tcp_mmap -s)
received 32768 MB (0 % mmap'ed) in 8.13342 s, 33.7961 Gbit,
cpu usage user:0.034 sys:3.778, 116.333 usec per MB, 63062 c-switches
received 32768 MB (0 % mmap'ed) in 8.14501 s, 33.748 Gbit,
cpu usage user:0.029 sys:3.997, 122.864 usec per MB, 61903 c-switches
received 32768 MB (0 % mmap'ed) in 8.11723 s, 33.8635 Gbit,
cpu usage user:0.048 sys:3.964, 122.437 usec per MB, 62983 c-switches
received 32768 MB (0 % mmap'ed) in 8.39189 s, 32.7552 Gbit,
cpu usage user:0.038 sys:4.181, 128.754 usec per MB, 55834 c-switches
With mmap() on receiver (tcp_mmap -s -z)
received 32768 MB (100 % mmap'ed) in 8.03083 s, 34.2278 Gbit,
cpu usage user:0.024 sys:1.466, 45.4712 usec per MB, 65479 c-switches
received 32768 MB (100 % mmap'ed) in 7.98805 s, 34.4111 Gbit,
cpu usage user:0.026 sys:1.401, 43.5486 usec per MB, 65447 c-switches
received 32768 MB (100 % mmap'ed) in 7.98377 s, 34.4296 Gbit,
cpu usage user:0.028 sys:1.452, 45.166 usec per MB, 65496 c-switches
received 32768 MB (99.9969 % mmap'ed) in 8.01838 s, 34.281 Gbit,
cpu usage user:0.02 sys:1.446, 44.7388 usec per MB, 65505 c-switches
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
SO_RCVLOWAT is properly handled in tcp_poll(), so that POLLIN is only
generated when enough bytes are available in receive queue, after
David change (commit c7004482e8 "tcp: Respect SO_RCVLOWAT in tcp_poll().")
But TCP still calls sk->sk_data_ready() for each chunk added in receive
queue, meaning thread is awaken, and goes back to sleep shortly after.
Tested:
tcp_mmap test program, receiving 32768 MB of data with SO_RCVLOWAT set to 512KB
-> Should get ~2 wakeups (c-switches) per MB, regardless of how many
(tiny or big) packets were received.
High speed (mostly full size GRO packets)
received 32768 MB (100 % mmap'ed) in 8.03112 s, 34.2266 Gbit,
cpu usage user:0.037 sys:1.404, 43.9758 usec per MB, 65497 c-switches
received 32768 MB (99.9954 % mmap'ed) in 7.98453 s, 34.4263 Gbit,
cpu usage user:0.03 sys:1.422, 44.3115 usec per MB, 65485 c-switches
Low speed (sender is ratelimited and sends 1-MSS at a time, so GRO is not helping)
received 22474.5 MB (100 % mmap'ed) in 6015.35 s, 0.0313414 Gbit,
cpu usage user:0.05 sys:1.586, 72.7952 usec per MB, 44950 c-switches
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Applications might use SO_RCVLOWAT on TCP socket hoping to receive
one [E]POLLIN event only when a given amount of bytes are ready in socket
receive queue.
Problem is that receive autotuning is not aware of this constraint,
meaning sk_rcvbuf might be too small to allow all bytes to be stored.
Add a new (struct proto_ops)->set_rcvlowat method so that a protocol
can override the default setsockopt(SO_RCVLOWAT) behavior.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
If CONFIG_RASPBERRYPI_FIRMWARE=n:
drivers/gpio/gpio-raspberrypi-exp.c: In function ‘rpi_exp_gpio_get_polarity’:
drivers/gpio/gpio-raspberrypi-exp.c:71: warning: ‘get.polarity’ is used uninitialized in this function
drivers/gpio/gpio-raspberrypi-exp.c: In function ‘rpi_exp_gpio_get_direction’:
drivers/gpio/gpio-raspberrypi-exp.c:150: warning: ‘get.direction’ is used uninitialized in this function
The dummy firmware interface functions return 0, which means success,
causing subsequent code to make use of the never initialized output
parameter.
Fix this by making the dummy functions return an error code (-ENOSYS)
instead.
Note that this assumes the firmware always fills in the requested data
in the CONFIG_RASPBERRYPI_FIRMWARE=y case.
Fixes: d45f1a563b ("staging: vc04_services: fix up rpi firmware functions")
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Mike Rapoport says:
These patches convert files in Documentation/vm to ReST format, add an
initial index and link it to the top level documentation.
There are no contents changes in the documentation, except few spelling
fixes. The relatively large diffstat stems from the indentation and
paragraph wrapping changes.
I've tried to keep the formatting as consistent as possible, but I could
miss some places that needed markup and add some markup where it was not
necessary.
[jc: significant conflicts in vm/hmm.rst]
Looks like these functions don't do anything in the mainline kernel so
we can just drop it.
Note that we must now also remove ir-rx51 pdata as it relies on the dummy
platform data that does not do anything. And ir-rx51 is calling a pdata
callback that doesn't do anything without checking if it exists first.
For configuring device specific minimal latencies, the interface to use
is pm_qos_add_request(). For an example, see what was done in commit
9834ffd1ec ("ASoC: omap-mcbsp: Add PM QoS support for McBSP to prevent
glitches"). I've added some comments to ir-rx51 so people using it can
add pm_qos support and test it.
Cc: Ivaylo Dimitrov <ivo.g.dimitrov.75@gmail.com>
Cc: Kevin Hilman <khilman@kernel.org>
Cc: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Cc: Tomi Valkeinen <tomi.valkeinen@ti.com>
Acked-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
Signed-off-by: Tony Lindgren <tony@atomide.com>