linux

mirror of https://github.com/hardkernel/linux.git synced 2026-06-05 18:41:58 +09:00

Author	SHA1	Message	Date
Jiri Olsa	b0c7f4ddbf	perf tools: Fix struct comm_str removal crash [ Upstream commit `46b3722cc7` ] We occasionaly hit following assert failure in 'perf top', when processing the /proc info in multiple threads. perf: ...include/linux/refcount.h:109: refcount_inc: Assertion `!(!refcount_inc_not_zero(r))' failed. The gdb backtrace looks like this: [Switching to Thread 0x7ffff11ba700 (LWP 13749)] 0x00007ffff50839fb in raise () from /lib64/libc.so.6 (gdb) #0 0x00007ffff50839fb in raise () from /lib64/libc.so.6 #1 0x00007ffff5085800 in abort () from /lib64/libc.so.6 #2 0x00007ffff507c0da in __assert_fail_base () from /lib64/libc.so.6 #3 0x00007ffff507c152 in __assert_fail () from /lib64/libc.so.6 #4 0x0000000000535373 in refcount_inc (r=0x7fffdc009be0) at ...include/linux/refcount.h:109 #5 0x00000000005354f1 in comm_str__get (cs=0x7fffdc009bc0) at util/comm.c:24 #6 0x00000000005356bd in __comm_str__findnew (str=0x7fffd000b260 ":2", root=0xbed5c0 <comm_str_root>) at util/comm.c:72 #7 0x000000000053579e in comm_str__findnew (str=0x7fffd000b260 ":2", root=0xbed5c0 <comm_str_root>) at util/comm.c:95 #8 0x000000000053582e in comm__new (str=0x7fffd000b260 ":2", timestamp=0, exec=false) at util/comm.c:111 #9 0x00000000005363bc in thread__new (pid=2, tid=2) at util/thread.c:57 #10 0x0000000000523da0 in ____machine__findnew_thread (machine=0xbfde38, threads=0xbfdf28, pid=2, tid=2, create=true) at util/machine.c:457 #11 0x0000000000523eb4 in __machine__findnew_thread (machine=0xbfde38, ... The failing assertion is this one: REFCOUNT_WARN(!refcount_inc_not_zero(r), ... The problem is that we keep global comm_str_root list, which is accessed by multiple threads during the 'perf top' startup and following 2 paths can race: thread 1: ... thread__new comm__new comm_str__findnew down_write(&comm_str_lock); __comm_str__findnew comm_str__get thread 2: ... comm__override or comm__free comm_str__put refcount_dec_and_test down_write(&comm_str_lock); rb_erase(&cs->rb_node, &comm_str_root); Because thread 2 first decrements the refcnt and only after then it removes the struct comm_str from the list, the thread 1 can find this object on the list with refcnt equls to 0 and hit the assert. This patch fixes the thread 1 __comm_str__findnew path, by ignoring objects that already dropped the refcnt to 0. For the rest of the objects we take the refcnt before comparing its name and release it afterwards with comm_str__put, which can also release the object completely. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Lukasz Odzioba <lukasz.odzioba@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Wang Nan <wangnan0@huawei.com> Cc: kernel-team@lge.com Link: http://lkml.kernel.org/r/20180720101740.GA27176@krava Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-09-26 08:38:01 +02:00
Dan Carpenter	3cfa558660	fbdev: omapfb: off by one in omapfb_register_client() [ Upstream commit `5ec1ec35b2` ] The omapfb_register_client[] array has OMAPFB_PLANE_NUM elements so the > should be >= or we are one element beyond the end of the array. Fixes: `8b08cf2b64` ("OMAP: add TI OMAP framebuffer driver") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Cc: Imre Deak <imre.deak@solidboot.com> Signed-off-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-09-26 08:38:01 +02:00
Jiri Olsa	d38d272592	perf tools: Synthesize GROUP_DESC feature in pipe mode [ Upstream commit `e8fedff1cc` ] Stephan reported, that pipe mode does not carry the group information and thus the piped report won't display the grouped output for following command: # perf record -e '{cycles,instructions,branches}' -a sleep 4 \| perf report It has no idea about the group setup, so it will display events separately: # Overhead Command Shared Object ... # ........ ............... ....................... # 6.71% swapper [kernel.kallsyms] 2.28% offlineimap libpython2.7.so.1.0 0.78% perf [kernel.kallsyms] ... Fix GROUP_DESC feature record to be synthesized in pipe mode, so the report output is grouped if there are groups defined in record: # Overhead Command Shared ... # ........................ ............... ....... # 7.57% 0.16% 0.30% swapper [kernel 1.87% 3.15% 2.46% offlineimap libpyth 1.33% 0.00% 0.00% perf [kernel ... Reported-by: Stephane Eranian <eranian@google.com> Signed-off-by: Jiri Olsa <jolsa@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Tested-by: Stephane Eranian <eranian@google.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: David Carrillo-Cisneros <davidcc@google.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20180712135202.14774-1-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-09-26 08:38:01 +02:00
Bob Peterson	d074912d2e	gfs2: Don't reject a supposedly full bitmap if we have blocks reserved [ Upstream commit `e79e0e1428` ] Before this patch, you could get into situations like this: 1. Process 1 searches for X free blocks, finds them, makes a reservation 2. Process 2 searches for free blocks in the same rgrp, but now the bitmap is full because process 1's reservation is skipped over. So it marks the bitmap as GBF_FULL. 3. Process 1 tries to allocate blocks from its own reservation, but since the GBF_FULL bit is set, it skips over the rgrp and searches elsewhere, thus not using its own reservation. This patch adds an additional check to allow processes to use their own reservations. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-09-26 08:38:01 +02:00
Thomas Richter	b435dd667b	perf test: Fix subtest number when showing results [ Upstream commit `9ef0112442` ] Perf test 40 for example has several subtests numbered 1-4 when displaying the start of the subtest. When the subtest results are displayed the subtests are numbered 0-3. Use this command to generate trace output: [root@s35lp76 perf]# ./perf test -Fv 40 2>/tmp/bpf1 Fix this by adjusting the subtest number when show the subtest result. Output before: [root@s35lp76 perf]# egrep '(^40\.[0-4]\| subtest [0-4]:)' /tmp/bpf1 40.1: Basic BPF filtering : BPF filter subtest 0: Ok 40.2: BPF pinning : BPF filter subtest 1: Ok 40.3: BPF prologue generation : BPF filter subtest 2: Ok 40.4: BPF relocation checker : BPF filter subtest 3: Ok [root@s35lp76 perf]# Output after: root@s35lp76 ~]# egrep '(^40\.[0-4]\| subtest [0-4]:)' /tmp/bpf1 40.1: Basic BPF filtering : BPF filter subtest 1: Ok 40.2: BPF pinning : BPF filter subtest 2: Ok 40.3: BPF prologue generation : BPF filter subtest 3: Ok 40.4: BPF relocation checker : BPF filter subtest 4: Ok [root@s35lp76 ~]# Signed-off-by: Thomas Richter <tmricht@linux.ibm.com> Reviewed-by: Hendrik Brueckner <brueckner@linux.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Link: http://lkml.kernel.org/r/20180724134858.100644-1-tmricht@linux.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-09-26 08:38:00 +02:00
Todor Tomov	f86f6ebc1b	media: ov5645: Supported external clock is 24MHz [ Upstream commit `4adb0a0432` ] The external clock frequency was set to 23.88MHz by mistake because of a platform which cannot get closer to 24MHz. The supported by the driver external clock is 24MHz so set it correctly and also fix the values of the pixel clock and link clock. However allow 1% tolerance to the external clock as this difference is small enough to be insignificant. Signed-off-by: Todor Tomov <todor.tomov@linaro.org> Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com> Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-09-26 08:38:00 +02:00
Randy Dunlap	28b6561183	mtd/maps: fix solutionengine.c printk format warnings [ Upstream commit `1d25e3eeed` ] Fix 2 printk format warnings (this driver is currently only used by arch/sh/) by using "%pap" instead of "%lx". Fixes these build warnings: ../drivers/mtd/maps/solutionengine.c: In function 'init_soleng_maps': ../include/linux/kern_levels.h:5:18: warning: format '%lx' expects argument of type 'long unsigned int', but argument 2 has type 'resource_size_t' {aka 'unsigned int'} [-Wformat=] ../drivers/mtd/maps/solutionengine.c:62:54: note: format string is defined here printk(KERN_NOTICE "Solution Engine: Flash at 0x%08lx, EPROM at 0x%08lx\n", ~~~~^ %08x ../include/linux/kern_levels.h:5:18: warning: format '%lx' expects argument of type 'long unsigned int', but argument 3 has type 'resource_size_t' {aka 'unsigned int'} [-Wformat=] ../drivers/mtd/maps/solutionengine.c:62:72: note: format string is defined here printk(KERN_NOTICE "Solution Engine: Flash at 0x%08lx, EPROM at 0x%08lx\n", ~~~~^ %08x Cc: David Woodhouse <dwmw2@infradead.org> Cc: Brian Norris <computersforpeace@gmail.com> Cc: Boris Brezillon <boris.brezillon@bootlin.com> Cc: Marek Vasut <marek.vasut@gmail.com> Cc: Richard Weinberger <richard@nod.at> Cc: linux-mtd@lists.infradead.org Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Cc: Rich Felker <dalias@libc.org> Cc: linux-sh@vger.kernel.org Cc: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com> Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Boris Brezillon <boris.brezillon@bootlin.com> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-09-26 08:38:00 +02:00
Wei Yongjun	745cb5eb3c	IB/ipoib: Fix error return code in ipoib_dev_init() [ Upstream commit `99a7e2bf70` ] Fix to return a negative error code from the ipoib_neigh_hash_init() error handling case instead of 0, as done elsewhere in this function. Fixes: `515ed4f3aa` ("IB/IPoIB: Separate control and data related initializations") Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com> Reviewed-by: Yuval Shaia <yuval.shaia@oracle.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-09-26 08:38:00 +02:00
Mike Snitzer	030f2ad6ce	block: allow max_discard_segments to be stacked [ Upstream commit `42c9cdfe1e` ] Set max_discard_segments to USHRT_MAX in blk_set_stacking_limits() so that blk_stack_limits() can stack up this limit for stacked devices. before: $ cat /sys/block/nvme0n1/queue/max_discard_segments 256 $ cat /sys/block/dm-0/queue/max_discard_segments 1 after: $ cat /sys/block/nvme0n1/queue/max_discard_segments 256 $ cat /sys/block/dm-0/queue/max_discard_segments 256 Fixes: `1e739730c5` ("block: optionally merge discontiguous discard bios into a single request") Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-09-26 08:38:00 +02:00
Zhu Yanjun	394df59143	IB/rxe: Drop QP0 silently [ Upstream commit `536ca245c5` ] According to "Annex A16: RDMA over Converged Ethernet (RoCE)": A16.4.3 MANAGEMENT INTERFACES As defined in the base specification, a special Queue Pair, QP0 is defined solely for communication between subnet manager(s) and subnet management agents. Since such an IB-defined subnet management architecture is outside the scope of this annex, it follows that there is also no requirement that a port which conforms to this annex be associated with a QP0. Thus, for end nodes designed to conform to this annex, the concept of QP0 is undefined and unused for any port connected to an Ethernet network. CA16-8: A packet arriving at a RoCE port containing a BTH with the destination QP field set to QP0 shall be silently dropped. Signed-off-by: Zhu Yanjun <yanjun.zhu@oracle.com> Acked-by: Moni Shoua <monis@mellanox.com> Reviewed-by: Yuval Shaia <yuval.shaia@oracle.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-09-26 08:38:00 +02:00
Hans Verkuil	5b253f7420	media: videobuf2-core: check for q->error in vb2_core_qbuf() [ Upstream commit `b509d733d3` ] The vb2_core_qbuf() function didn't check if q->error was set. It is checked in __buf_prepare(), but that function isn't called if the buffer was already prepared before with VIDIOC_PREPARE_BUF. So check it at the start of vb2_core_qbuf() as well. Signed-off-by: Hans Verkuil <hans.verkuil@cisco.com> Acked-by: Sakari Ailus <sakari.ailus@linux.intel.com> Signed-off-by: Hans Verkuil <hans.verkuil@cisco.com> Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-09-26 08:38:00 +02:00
Felix Fietkau	9b43283036	MIPS: ath79: fix system restart [ Upstream commit `f8a7bfe1cb` ] This patch disables irq on reboot to fix hang issues that were observed due to pending interrupts. Signed-off-by: Felix Fietkau <nbd@nbd.name> Signed-off-by: John Crispin <john@phrozen.org> Signed-off-by: Paul Burton <paul.burton@mips.com> Patchwork: https://patchwork.linux-mips.org/patch/19913/ Cc: James Hogan <jhogan@kernel.org> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: linux-mips@linux-mips.org Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-09-26 08:38:00 +02:00
John Keeping	e1cfd4533f	dmaengine: pl330: fix irq race with terminate_all [ Upstream commit `e49756544a` ] In pl330_update() when checking if a channel has been aborted, the channel's lock is not taken, only the overall pl330_dmac lock. But in pl330_terminate_all() the aborted flag (req_running==-1) is set under the channel lock and not the pl330_dmac lock. With threaded interrupts, this leads to a potential race: pl330_terminate_all pl330_update ------------------- ------------ lock channel entry lock pl330 _stop channel unlock pl330 lock pl330 check req_running != -1 req_running = -1 _start channel Signed-off-by: John Keeping <john@metanate.com> Signed-off-by: Vinod Koul <vkoul@kernel.org> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-09-26 08:38:00 +02:00
Krzysztof Ha?asa	58119f9bd9	media: tw686x: Fix oops on buffer alloc failure [ Upstream commit `5a1a2f63d8` ] The error path currently calls tw686x_video_free() which requires vc->dev to be initialized, causing a NULL dereference on uninitizalized channels. Fix this by setting the vc->dev fields for all the channels first. Fixes: `f8afaa8dbc` ("[media] tw686x: Introduce an interface to support multiple DMA modes") Signed-off-by: Krzysztof Ha?asa <khalasa@piap.pl> Signed-off-by: Hans Verkuil <hans.verkuil@cisco.com> Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-09-26 08:38:00 +02:00
Masahiro Yamada	ee83ce188e	kbuild: add .DELETE_ON_ERROR special target [ Upstream commit `9c2af1c737` ] If Make gets a fatal signal while a shell is executing, it may delete the target file that the recipe was supposed to update. This is needed to make sure that it is remade from scratch when Make is next run; if Make is interrupted after the recipe has begun to write the target file, it results in an incomplete file whose time stamp is newer than that of the prerequisites files. Make automatically deletes the incomplete file on interrupt unless the target is marked .PRECIOUS. The situation is just the same as when the shell fails for some reasons. Usually when a recipe line fails, if it has changed the target file at all, the file is corrupted, or at least it is not completely updated. Yet the file’s time stamp says that it is now up to date, so the next time Make runs, it will not try to update that file. However, Make does not cater to delete the incomplete target file in this case. We need to add .DELETE_ON_ERROR somewhere in the Makefile to request it. scripts/Kbuild.include seems a suitable place to add it because it is included from almost all sub-makes. Please note .DELETE_ON_ERROR is not effective for phony targets. The external module building should never ever touch the kernel tree. The following recipe fails if include/generated/autoconf.h is missing. However, include/config/auto.conf is not deleted since it is a phony target. PHONY += include/config/auto.conf include/config/auto.conf: $(Q)test -e include/generated/autoconf.h -a -e $@ \|\| ( \ echo >&2; \ echo >&2 " ERROR: Kernel configuration is invalid."; \ echo >&2 " include/generated/autoconf.h or $@ are missing.";\ echo >&2 " Run 'make oldconfig && make prepare' on kernel src to fix it."; \ echo >&2 ; \ /bin/false) Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-09-26 08:37:59 +02:00
Rajan Vaja	62e442fdbc	clk: clk-fixed-factor: Clear OF_POPULATED flag in case of failure [ Upstream commit `f6dab4233d` ] Fixed factor clock has two initializations at of_clk_init() time and during platform driver probe. Before of_clk_init() call, node is marked as populated and so its probe never gets called. During of_clk_init() fixed factor clock registration may fail if any of its parent clock is not registered. In this case, it doesn't get chance to retry registration from probe. Clear OF_POPULATED flag if fixed factor clock registration fails so that clock registration is attempted again from probe. Signed-off-by: Rajan Vaja <rajan.vaja@xilinx.com> Signed-off-by: Stephen Boyd <sboyd@kernel.org> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-09-26 08:37:59 +02:00
Mikko Perttunen	d8e7792fae	clk: core: Potentially free connection id [ Upstream commit `365f7a89c8` ] Patch "clk: core: Copy connection id" made it so that the connector id 'con_id' is kstrdup_const()ed to cater to drivers that pass non-constant connection ids. The patch added the corresponding kfree_const to __clk_free_clk(), but struct clk's can be freed also via __clk_put(). Add the kfree_const call to __clk_put() and add comments to both functions to remind that the logic in them should be kept in sync. Fixes: `253160a8ad` ("clk: core: Copy connection id") Signed-off-by: Mikko Perttunen <mperttunen@nvidia.com> Reviewed-by: Leonard Crestez <leonard.crestez@nxp.com> Signed-off-by: Stephen Boyd <sboyd@kernel.org> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-09-26 08:37:59 +02:00
Nicholas Mc Guire	45c800f555	clk: imx6ul: fix missing of_node_put() [ Upstream commit `11177e7a7a` ] of_find_compatible_node() is returning a device node with refcount incremented and must be explicitly decremented after the last use which is right after the us in of_iomap() here. Signed-off-by: Nicholas Mc Guire <hofrat@osadl.org> Fixes: `787b4271a6` ("clk: imx: add imx6ul clk tree support") Signed-off-by: Stephen Boyd <sboyd@kernel.org> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-09-26 08:37:59 +02:00
Andreas Gruenbacher	0fe570942c	gfs2: Special-case rindex for gfs2_grow [ Upstream commit `776125785a` ] To speed up the common case of appending to a file, gfs2_write_alloc_required presumes that writing beyond the end of a file will always require additional blocks to be allocated. This assumption is incorrect for preallocates files, but there are no negative consequences as long as some space is still left on the filesystem. One special file that always has some space preallocated beyond the end of the file is the rindex: when growing a filesystem, gfs2_grow adds one or more new resource groups and appends records describing those resource groups to the rindex; the preallocated space ensures that this is always possible. However, when a filesystem is completely full, gfs2_write_alloc_required will indicate that an additional allocation is required, and appending the next record to the rindex will fail even though space for that record has already been preallocated. To fix that, skip the incorrect optimization in gfs2_write_alloc_required, but for the rindex only. Other writes to preallocated space beyond the end of the file are still allowed to fail on completely full filesystems. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Reviewed-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-09-26 08:37:59 +02:00
YueHaibing	36eb78a6ce	amd-xgbe: use dma_mapping_error to check map errors [ Upstream commit `b24dbfe9ce` ] The dma_mapping_error() returns true or false, but we want to return -ENOMEM if there was an error. Fixes: `174fd2597b` ("amd-xgbe: Implement split header receive support") Signed-off-by: YueHaibing <yuehaibing@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-09-26 08:37:59 +02:00
YueHaibing	318f224d12	xfrm: fix 'passing zero to ERR_PTR()' warning [ Upstream commit `934ffce134` ] Fix a static code checker warning: net/xfrm/xfrm_policy.c:1836 xfrm_resolve_and_create_bundle() warn: passing zero to 'ERR_PTR' xfrm_tmpl_resolve return 0 just means no xdst found, return NULL instead of passing zero to ERR_PTR. Fixes: `d809ec8955` ("xfrm: do not assume that template resolving always returns xfrms") Signed-off-by: YueHaibing <yuehaibing@huawei.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-09-26 08:37:59 +02:00
Takashi Iwai	a51e519d5b	ALSA: usb-audio: Fix multiple definitions in AU0828_DEVICE() macro [ Upstream commit `bd1cd0eb2c` ] AU0828_DEVICE() macro in quirks-table.h uses USB_DEVICE_VENDOR_SPEC() for expanding idVendor and idProduct fields. However, the latter macro adds also match_flags and bInterfaceClass, which are different from the values AU0828_DEVICE() macro sets after that. For fixing them, just expand idVendor and idProduct fields manually in AU0828_DEVICE(). This fixes sparse warnings like: sound/usb/quirks-table.h:2892:1: warning: Initializer entry defined twice Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-09-26 08:37:59 +02:00
Takashi Iwai	f402334e5d	ALSA: msnd: Fix the default sample sizes [ Upstream commit `7c500f9ea1` ] The default sample sizes set by msnd driver are bogus; it sets ALSA PCM format, not the actual bit width. Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-09-26 08:37:59 +02:00
Jean-Philippe Brucker	918cad16b4	iommu/io-pgtable-arm-v7s: Abort allocation when table address overflows the PTE [ Upstream commit `29859aeb8a` ] When run on a 64-bit system in selftest, the v7s driver may obtain page table with physical addresses larger than 32-bit. Level-2 tables are 1KB and are are allocated with slab, which doesn't accept the GFP_DMA32 flag. Currently map() truncates the address written in the PTE, causing iova_to_phys() or unmap() to access invalid memory. Kasan reports it as a use-after-free. To avoid any nasty surprise, test if the physical address fits in a PTE before returning a new table. 32-bit systems, which are the main users of this page table format, shouldn't see any difference. Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-09-26 08:37:59 +02:00
Miao Zhong	ea4b3539ab	iommu/arm-smmu-v3: sync the OVACKFLG to PRIQ consumer register [ Upstream commit `0d535967ac` ] When PRI queue occurs overflow, driver should update the OVACKFLG to the PRIQ consumer register, otherwise subsequent PRI requests will not be processed. Cc: Will Deacon <will.deacon@arm.com> Cc: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Miao Zhong <zhongmiao@hisilicon.com> Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-09-26 08:37:58 +02:00
Erich E. Hoover	a574b059c0	usb: dwc3: change stream event enable bit back to 13 [ Upstream commit `9a7faac365` ] Commit `ff3f0789b3` ("usb: dwc3: use BIT() macro where possible") changed DWC3_DEPCFG_STREAM_EVENT_EN from bit 13 to bit 12. Spotted this cleanup typo while looking at diffs between 4.9.35 and 4.14.16 for a separate issue. Fixes: `ff3f0789b3` ("usb: dwc3: use BIT() macro where possible") Signed-off-by: Erich E. Hoover <ehoover@sweptlaser.com> Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-09-26 08:37:58 +02:00
Takashi Iwai	4b2a6ecd21	hv/netvsc: Fix NULL dereference at single queue mode fallback commit `b19b46346f` upstream. The recent commit `916c5e1413` ("hv/netvsc: fix handling of fallback to single queue mode") tried to fix the fallback behavior to a single queue mode, but it changed the function to return zero incorrectly, while the function should return an object pointer. Eventually this leads to a NULL dereference at the callers that expect non-NULL value. Fix it by returning the proper net_device object. Fixes: `916c5e1413` ("hv/netvsc: fix handling of fallback to single queue mode") Signed-off-by: Takashi Iwai <tiwai@suse.de> Reviewed-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: David S. Miller <davem@davemloft.net> Cc: Alakesh Haloi <alakeshh@amazon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-09-26 08:37:58 +02:00
Vincent Whitchurch	effa7afc52	tcp: really ignore MSG_ZEROCOPY if no SO_ZEROCOPY [ Upstream commit `5cf4a8532c` ] According to the documentation in msg_zerocopy.rst, the SO_ZEROCOPY flag was introduced because send(2) ignores unknown message flags and any legacy application which was accidentally passing the equivalent of MSG_ZEROCOPY earlier should not see any new behaviour. Before commit `f214f915e7` ("tcp: enable MSG_ZEROCOPY"), a send(2) call which passed the equivalent of MSG_ZEROCOPY without setting SO_ZEROCOPY would succeed. However, after that commit, it fails with -ENOBUFS. So it appears that the SO_ZEROCOPY flag fails to fulfill its intended purpose. Fix it. Fixes: `f214f915e7` ("tcp: enable MSG_ZEROCOPY") Signed-off-by: Vincent Whitchurch <vincent.whitchurch@axis.com> Acked-by: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-09-26 08:37:58 +02:00
Haishuang Yan	1beb52cea6	erspan: return PACKET_REJECT when the appropriate tunnel is not found [ Upstream commit `5a64506b5c` ] If erspan tunnel hasn't been established, we'd better send icmp port unreachable message after receive erspan packets. Fixes: `84e54fe0a5` ("gre: introduce native tunnel support for ERSPAN") Cc: William Tu <u9012063@gmail.com> Signed-off-by: Haishuang Yan <yanhaishuang@cmss.chinamobile.com> Acked-by: William Tu <u9012063@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-09-26 08:37:58 +02:00
Haishuang Yan	456191a855	erspan: fix error handling for erspan tunnel [ Upstream commit `51dc63e391` ] When processing icmp unreachable message for erspan tunnel, tunnel id should be erspan_net_id instead of ipgre_net_id. Fixes: `84e54fe0a5` ("gre: introduce native tunnel support for ERSPAN") Cc: William Tu <u9012063@gmail.com> Signed-off-by: Haishuang Yan <yanhaishuang@cmss.chinamobile.com> Acked-by: William Tu <u9012063@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-09-26 08:37:58 +02:00
Vakul Garg	04f625fc5a	net/tls: Set count of SG entries if sk_alloc_sg returns -ENOSPC [ Upstream commit `52ea992cfa` ] tls_sw_sendmsg() allocates plaintext and encrypted SG entries using function sk_alloc_sg(). In case the number of SG entries hit MAX_SKB_FRAGS, sk_alloc_sg() returns -ENOSPC and sets the variable for current SG index to '0'. This leads to calling of function tls_push_record() with 'sg_encrypted_num_elem = 0' and later causes kernel crash. To fix this, set the number of SG elements to the number of elements in plaintext/encrypted SG arrays in case sk_alloc_sg() returns -ENOSPC. Fixes: `3c4d755915` ("tls: kernel TLS support") Signed-off-by: Vakul Garg <vakul.garg@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-09-26 08:37:58 +02:00
Raed Salem	1de5a95668	net/mlx5: E-Switch, Fix memory leak when creating switchdev mode FDB tables [ Upstream commit `c88a026e01` ] The memory allocated for the slow path table flow group input structure was not freed upon successful return, fix that. Fixes: `1967ce6ea5` ("net/mlx5: E-Switch, Refactor fast path FDB table creation in switchdev mode") Signed-off-by: Raed Salem <raeds@mellanox.com> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-09-26 08:37:58 +02:00
Jack Morgenstein	f8ce022004	net/mlx5: Fix debugfs cleanup in the device init/remove flow [ Upstream commit `5df816e7f4` ] When initializing the device (procedure init_one), the driver calls mlx5_pci_init to perform pci initialization. As part of this initialization, mlx5_pci_init creates a debugfs directory. If this creation fails, init_one aborts, returning failure to the caller (which is the probe method caller). The main reason for such a failure to occur is if the debugfs directory already exists. This can happen if the last time mlx5_pci_close was called, debugfs_remove (silently) failed due to the debugfs directory not being empty. Guarantee that such a debugfs_remove failure will not occur by instead calling debugfs_remove_recursive in procedure mlx5_pci_close. Fixes: `59211bd3b6` ("net/mlx5: Split the load/unload flow into hardware and software flows") Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Reviewed-by: Daniel Jurgens <danielj@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-09-26 08:37:58 +02:00
Huy Nguyen	571f1f6862	net/mlx5: Check for error in mlx5_attach_interface [ Upstream commit `47bc94b822` ] Currently, mlx5_attach_interface does not check for error after calling intf->attach or intf->add. When these two calls fails, the client is not initialized and will cause issues such as kernel panic on invalid address in the teardown path (mlx5_detach_interface) Fixes: `737a234bb6` ("net/mlx5: Introduce attach/detach to interface API") Signed-off-by: Huy Nguyen <huyn@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-09-26 08:37:58 +02:00
Cong Wang	5ff9c51cbd	rds: fix two RCU related problems [ Upstream commit `cc4dfb7f70` ] When a rds sock is bound, it is inserted into the bind_hash_table which is protected by RCU. But when releasing rds sock, after it is removed from this hash table, it is freed immediately without respecting RCU grace period. This could cause some use-after-free as reported by syzbot. Mark the rds sock with SOCK_RCU_FREE before inserting it into the bind_hash_table, so that it would be always freed after a RCU grace period. The other problem is in rds_find_bound(), the rds sock could be freed in between rhashtable_lookup_fast() and rds_sock_addref(), so we need to extend RCU read lock protection in rds_find_bound() to close this race condition. Reported-and-tested-by: syzbot+8967084bcac563795dc6@syzkaller.appspotmail.com Reported-by: syzbot+93a5839deb355537440f@syzkaller.appspotmail.com Cc: Sowmini Varadhan <sowmini.varadhan@oracle.com> Cc: Santosh Shilimkar <santosh.shilimkar@oracle.com> Cc: rds-devel@oss.oracle.com Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Acked-by: Santosh Shilimkar <santosh.shilimkar@oarcle.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-09-26 08:37:57 +02:00
Stefan Wahren	e4df3c97c3	net: qca_spi: Fix race condition in spi transfers [ Upstream commit `e65a9e480e` ] With performance optimization the spi transfer and messages of basic register operations like qcaspi_read_register moved into the private driver structure. But they weren't protected against mutual access (e.g. between driver kthread and ethtool). So dumping the QCA7000 registers via ethtool during network traffic could make spi_sync hang forever, because the completion in spi_message is overwritten. So revert the optimization completely. Fixes: `291ab06ecf` ("net: qualcomm: new Ethernet over SPI driver for QCA700") Signed-off-by: Stefan Wahren <stefan.wahren@i2se.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-09-26 08:37:57 +02:00
Jack Morgenstein	47f74ff002	net/mlx5: Fix use-after-free in self-healing flow [ Upstream commit `76d5581c87` ] When the mlx5 health mechanism detects a problem while the driver is in the middle of init_one or remove_one, the driver needs to prevent the health mechanism from scheduling future work; if future work is scheduled, there is a problem with use-after-free: the system WQ tries to run the work item (which has been freed) at the scheduled future time. Prevent this by disabling work item scheduling in the health mechanism when the driver is in the middle of init_one() or remove_one(). Fixes: `e126ba97db` ("mlx5: Add driver for Mellanox Connect-IB adapters") Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Reviewed-by: Feras Daoud <ferasda@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-09-26 08:37:57 +02:00
Petr Oros	799d8f1f0d	be2net: Fix memory leak in be_cmd_get_profile_config() [ Upstream commit `9d7f19dc46` ] DMA allocated memory is lost in be_cmd_get_profile_config() when we call it with non-NULL port_res parameter. Signed-off-by: Petr Oros <poros@redhat.com> Reviewed-by: Ivan Vecera <ivecera@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-09-26 08:37:57 +02:00
Greg Kroah-Hartman	1244bbb3e9	Linux 4.14.71	2018-09-19 22:43:49 +02:00
Linus Torvalds	06274364ed	mm: get rid of vmacache_flush_all() entirely commit `7a9cdebdcc` upstream. Jann Horn points out that the vmacache_flush_all() function is not only potentially expensive, it's buggy too. It also happens to be entirely unnecessary, because the sequence number overflow case can be avoided by simply making the sequence number be 64-bit. That doesn't even grow the data structures in question, because the other adjacent fields are already 64-bit. So simplify the whole thing by just making the sequence number overflow case go away entirely, which gets rid of all the complications and makes the code faster too. Win-win. [ Oleg Nesterov points out that the VMACACHE_FULL_FLUSHES statistics also just goes away entirely with this ] Reported-by: Jann Horn <jannh@google.com> Suggested-by: Will Deacon <will.deacon@arm.com> Acked-by: Davidlohr Bueso <dave@stgolabs.net> Cc: Oleg Nesterov <oleg@redhat.com> Cc: stable@kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-09-19 22:43:48 +02:00
Ian Kent	8b34a7b14e	autofs: fix autofs_sbi() does not check super block type commit `0633da48f0` upstream. autofs_sbi() does not check the superblock magic number to verify it has been given an autofs super block. Backport Note: autofs4 has been renamed to autofs upstream. As a result the upstream patch does not apply cleanly onto 4.14.y. Link: http://lkml.kernel.org/r/153475422934.17131.7563724552005298277.stgit@pluto.themaw.net Reported-by: <syzbot+87c3c541582e56943277@syzkaller.appspotmail.com> Signed-off-by: Ian Kent <raven@themaw.net> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Zubin Mithra <zsm@chromium.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-09-19 22:43:48 +02:00
Jason Wang	daf0ca743b	tuntap: fix use after free during release commit `7063efd33b` upstream. After commit `b196d88aba` ("tun: fix use after free for ptr_ring") we need clean up tx ring during release(). But unfortunately, it tries to do the cleanup blindly after socket were destroyed which will lead another use-after-free. Fix this by doing the cleanup before dropping the last reference of the socket in __tun_detach(). Backport Note :- Upstream commit moves the ptr_ring_cleanup call from tun_chr_close to __tun_detach. Upstream applied that patch after replacing skb_array with ptr_ring. This patch moves the skb_array_cleanup call from tun_chr_close to __tun_detach. Reported-by: Andrei Vagin <avagin@virtuozzo.com> Acked-by: Andrei Vagin <avagin@virtuozzo.com> Fixes: `b196d88aba` ("tun: fix use after free for ptr_ring") Signed-off-by: Jason Wang <jasowang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Zubin Mithra <zsm@chromium.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-09-19 22:43:48 +02:00
Jason Wang	ab75811f71	tun: fix use after free for ptr_ring commit `b196d88aba` upstream. We used to initialize ptr_ring during TUNSETIFF, this is because its size depends on the tx_queue_len of netdevice. And we try to clean it up when socket were detached from netdevice. A race were spotted when trying to do uninit during a read which will lead a use after free for pointer ring. Solving this by always initialize a zero size ptr_ring in open() and do resizing during TUNSETIFF, and then we can safely do cleanup during close(). With this, there's no need for the workaround that was introduced by commit `4df0bfc799` ("tun: fix a memory leak for tfile->tx_array"). Backport Note :- Comparison with the upstream patch: [1] A "semantic revert" of the changes made in 4df0bfc799("tun: fix a memory leak for tfile->tx_array"). `4df0bfc799` was applied upstream, and then skb array was changed to use ptr_ring. The upstream patch then removes the changes introduced by `4df0bfc799`. This backport does the same; "revert" the changes made by `4df0bfc799`. [2] xdp_rxq_info_unreg() being called in relevant locations As xdp_rxq_info related patches are not present in 4.14, these changes are not needed in the backport. [3] An instance of ptr_ring_init needs to be replaced by skb_array_init Inside tun_attach() [4] ptr_ring_cleanup needs to be replaced by skb_array_cleanup Inside tun_chr_close() Note that the backport for `7063efd33b` ("tuntap: fix use after free during release") needs to be applied on top of this patch. Reported-by: syzbot+e8b902c3c3fadf0a9dba@syzkaller.appspotmail.com Cc: Eric Dumazet <eric.dumazet@gmail.com> Cc: Cong Wang <xiyou.wangcong@gmail.com> Cc: Michael S. Tsirkin <mst@redhat.com> Fixes: `1576d98605` ("tun: switch to use skb array for tx") Signed-off-by: Jason Wang <jasowang@redhat.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Zubin Mithra <zsm@chromium.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-09-19 22:43:48 +02:00
Wei Yongjun	8626c40a30	mtd: ubi: wl: Fix error return code in ubi_wl_init() commit `7233982ade` upstream. Fix to return error code -ENOMEM from the kmem_cache_alloc() error handling case instead of 0, as done elsewhere in this function. Fixes: `f78e5623f4` ("ubi: fastmap: Erase outdated anchor PEBs during attach") Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com> Reviewed-by: Boris Brezillon <boris.brezillon@free-electrons.com> Signed-off-by: Richard Weinberger <richard@nod.at> Cc: Ben Hutchings <ben.hutchings@codethink.co.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-09-19 22:43:48 +02:00
Taehee Yoo	08fb833b40	ip: frags: fix crash in ip_do_fragment() commit `5d407b071d` upstream A kernel crash occurrs when defragmented packet is fragmented in ip_do_fragment(). In defragment routine, skb_orphan() is called and skb->ip_defrag_offset is set. but skb->sk and skb->ip_defrag_offset are same union member. so that frag->sk is not NULL. Hence crash occurrs in skb->sk check routine in ip_do_fragment() when defragmented packet is fragmented. test commands: %iptables -t nat -I POSTROUTING -j MASQUERADE %hping3 192.168.4.2 -s 1000 -p 2000 -d 60000 splat looks like: [ 261.069429] kernel BUG at net/ipv4/ip_output.c:636! [ 261.075753] invalid opcode: 0000 [#1] SMP DEBUG_PAGEALLOC KASAN PTI [ 261.083854] CPU: 1 PID: 1349 Comm: hping3 Not tainted 4.19.0-rc2+ #3 [ 261.100977] RIP: 0010:ip_do_fragment+0x1613/0x2600 [ 261.106945] Code: e8 e2 38 e3 fe 4c 8b 44 24 18 48 8b 74 24 08 e9 92 f6 ff ff 80 3c 02 00 0f 85 da 07 00 00 48 8b b5 d0 00 00 00 e9 25 f6 ff ff <0f> 0b 0f 0b 44 8b 54 24 58 4c 8b 4c 24 18 4c 8b 5c 24 60 4c 8b 6c [ 261.127015] RSP: 0018:ffff8801031cf2c0 EFLAGS: 00010202 [ 261.134156] RAX: 1ffff1002297537b RBX: ffffed0020639e6e RCX: 0000000000000004 [ 261.142156] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff880114ba9bd8 [ 261.150157] RBP: ffff880114ba8a40 R08: ffffed0022975395 R09: ffffed0022975395 [ 261.158157] R10: 0000000000000001 R11: ffffed0022975394 R12: ffff880114ba9ca4 [ 261.166159] R13: 0000000000000010 R14: ffff880114ba9bc0 R15: dffffc0000000000 [ 261.174169] FS: 00007fbae2199700(0000) GS:ffff88011b400000(0000) knlGS:0000000000000000 [ 261.183012] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 261.189013] CR2: 00005579244fe000 CR3: 0000000119bf4000 CR4: 00000000001006e0 [ 261.198158] Call Trace: [ 261.199018] ? dst_output+0x180/0x180 [ 261.205011] ? save_trace+0x300/0x300 [ 261.209018] ? ip_copy_metadata+0xb00/0xb00 [ 261.213034] ? sched_clock_local+0xd4/0x140 [ 261.218158] ? kill_l4proto+0x120/0x120 [nf_conntrack] [ 261.223014] ? rt_cpu_seq_stop+0x10/0x10 [ 261.227014] ? find_held_lock+0x39/0x1c0 [ 261.233008] ip_finish_output+0x51d/0xb50 [ 261.237006] ? ip_fragment.constprop.56+0x220/0x220 [ 261.243011] ? nf_ct_l4proto_register_one+0x5b0/0x5b0 [nf_conntrack] [ 261.250152] ? rcu_is_watching+0x77/0x120 [ 261.255010] ? nf_nat_ipv4_out+0x1e/0x2b0 [nf_nat_ipv4] [ 261.261033] ? nf_hook_slow+0xb1/0x160 [ 261.265007] ip_output+0x1c7/0x710 [ 261.269005] ? ip_mc_output+0x13f0/0x13f0 [ 261.273002] ? __local_bh_enable_ip+0xe9/0x1b0 [ 261.278152] ? ip_fragment.constprop.56+0x220/0x220 [ 261.282996] ? nf_hook_slow+0xb1/0x160 [ 261.287007] raw_sendmsg+0x21f9/0x4420 [ 261.291008] ? dst_output+0x180/0x180 [ 261.297003] ? sched_clock_cpu+0x126/0x170 [ 261.301003] ? find_held_lock+0x39/0x1c0 [ 261.306155] ? stop_critical_timings+0x420/0x420 [ 261.311004] ? check_flags.part.36+0x450/0x450 [ 261.315005] ? _raw_spin_unlock_irq+0x29/0x40 [ 261.320995] ? _raw_spin_unlock_irq+0x29/0x40 [ 261.326142] ? cyc2ns_read_end+0x10/0x10 [ 261.330139] ? raw_bind+0x280/0x280 [ 261.334138] ? sched_clock_cpu+0x126/0x170 [ 261.338995] ? check_flags.part.36+0x450/0x450 [ 261.342991] ? __lock_acquire+0x4500/0x4500 [ 261.348994] ? inet_sendmsg+0x11c/0x500 [ 261.352989] ? dst_output+0x180/0x180 [ 261.357012] inet_sendmsg+0x11c/0x500 [ ... ] v2: - clear skb->sk at reassembly routine.(Eric Dumarzet) Fixes: `fa0f527358` ("ip: use rb trees for IP frag queue.") Suggested-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Taehee Yoo <ap420073@gmail.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-09-19 22:43:48 +02:00
Peter Oskolkov	b3a0c61b73	ip: process in-order fragments efficiently This patch changes the runtime behavior of IP defrag queue: incoming in-order fragments are added to the end of the current list/"run" of in-order fragments at the tail. On some workloads, UDP stream performance is substantially improved: RX: ./udp_stream -F 10 -T 2 -l 60 TX: ./udp_stream -c -H <host> -F 10 -T 5 -l 60 with this patchset applied on a 10Gbps receiver: throughput=9524.18 throughput_units=Mbit/s upstream (net-next): throughput=4608.93 throughput_units=Mbit/s Reported-by: Willem de Bruijn <willemb@google.com> Signed-off-by: Peter Oskolkov <posk@google.com> Cc: Eric Dumazet <edumazet@google.com> Cc: Florian Westphal <fw@strlen.de> Signed-off-by: David S. Miller <davem@davemloft.net> (cherry picked from commit `a4fd284a1f`) Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-09-19 22:43:48 +02:00
Peter Oskolkov	c91f27fb57	ip: add helpers to process in-order fragments faster. This patch introduces several helper functions/macros that will be used in the follow-up patch. No runtime changes yet. The new logic (fully implemented in the second patch) is as follows: * Nodes in the rb-tree will now contain not single fragments, but lists of consecutive fragments ("runs"). * At each point in time, the current "active" run at the tail is maintained/tracked. Fragments that arrive in-order, adjacent to the previous tail fragment, are added to this tail run without triggering the re-balancing of the rb-tree. * If a fragment arrives out of order with the offset _before_ the tail run, it is inserted into the rb-tree as a single fragment. * If a fragment arrives after the current tail fragment (with a gap), it starts a new "tail" run, as is inserted into the rb-tree at the end as the head of the new run. skb->cb is used to store additional information needed here (suggested by Eric Dumazet). Reported-by: Willem de Bruijn <willemb@google.com> Signed-off-by: Peter Oskolkov <posk@google.com> Cc: Eric Dumazet <edumazet@google.com> Cc: Florian Westphal <fw@strlen.de> Signed-off-by: David S. Miller <davem@davemloft.net> (cherry picked from commit `353c9cb360`) Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-09-19 22:43:48 +02:00
Dan Carpenter	04b28f406e	ipv4: frags: precedence bug in ip_expire() We accidentally removed the parentheses here, but they are required because '!' has higher precedence than '&'. Fixes: `fa0f527358` ("ip: use rb trees for IP frag queue.") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net> (cherry picked from commit `70837ffe30`) Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-09-19 22:43:48 +02:00
Eric Dumazet	6b921536f1	net: sk_buff rbnode reorg commit `bffa72cf7f` upstream skb->rbnode shares space with skb->next, skb->prev and skb->tstamp Current uses (TCP receive ofo queue and netem) need to save/restore tstamp, while skb->dev is either NULL (TCP) or a constant for a given queue (netem). Since we plan using an RB tree for TCP retransmit queue to speedup SACK processing with large BDP, this patch exchanges skb->dev and skb->tstamp. This saves some overhead in both TCP and netem. v2: removes the swtstamp field from struct tcp_skb_cb Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Soheil Hassas Yeganeh <soheil@google.com> Cc: Wei Wang <weiwan@google.com> Cc: Willem de Bruijn <willemb@google.com> Acked-by: Soheil Hassas Yeganeh <soheil@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-09-19 22:43:47 +02:00
Eric Dumazet	37c7cc80b1	net: add rb_to_skb() and other rb tree helpers Geeralize private netem_rb_to_skb() TCP rtx queue will soon be converted to rb-tree, so we will need skb_rbtree_walk() helpers. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> (cherry picked from commit `18a4c0eab2`) Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-09-19 22:43:47 +02:00

1 2 3 4 5 ...

715131 Commits