linux

mirror of https://github.com/hardkernel/linux.git synced 2026-06-10 04:48:04 +09:00

Author	SHA1	Message	Date
Peter Zijlstra	6b02ab68ec	UPSTREAM: sched/fair: Fix and optimize the fork() path The task_fork_fair() callback already calls __set_task_cpu() and takes rq->lock. If we move the sched_class::task_fork callback in sched_fork() under the existing p->pi_lock, right after its set_task_cpu() call, we can avoid doing two such calls and omit the IRQ disabling on the rq->lock. Change to __set_task_cpu() to skip the migration bits, this is a new task, not a migration. Similarly, make wake_up_new_task() use __set_task_cpu() for the same reason, the task hasn't actually migrated as it hasn't ever ran. This cures the problem of calling migrate_task_rq_fair(), which does remove_entity_from_load_avg() on tasks that have never been added to the load avg to begin with. This bug would result in transiently messed up load_avg values, averaged out after a few dozen milliseconds. This is probably the reason why this bug was not found for such a long time. Reported-by: Vincent Guittot <vincent.guittot@linaro.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org> (cherry picked from commit `e210bffd39`) Change-Id: Icbddbaa6e8c1071859673d8685bc3f38955cf144 Signed-off-by: Brendan Jackman <brendan.jackman@arm.com> Signed-off-by: Chris Redpath <chris.redpath@arm.com>	2017-10-27 13:30:32 +01:00
Chris Redpath	792510d9b3	BACKPORT: sched/fair: Make it possible to account fair load avg consistently While set_task_rq_fair() is introduced in mainline by commit `ad936d8658` ("sched/fair: Make it possible to account fair load avg consistently"), the function results to be introduced here by the backport of commit `09a43ace1f` ("sched/fair: Propagate load during synchronous attach/detach"). The problem (apart from the confusion introduced by the backport) is actually that set_task_rq_fair() is currently not called at all. Fix the problem by backporting again commit `ad936d8658` ("sched/fair: Make it possible to account fair load avg consistently"). Original change log: The current code accounts for the time a task was absent from the fair class (per ATTACH_AGE_LOAD). However it does not work correctly when a task got migrated or moved to another cgroup while outside of the fair class. This patch tries to address that by aging on migration. We locklessly read the 'last_update_time' stamp from both the old and new cfs_rq, ages the load upto the old time, and sets it to the new time. These timestamps should in general not be more than 1 tick apart from one another, so there is a definite bound on things. Signed-off-by: Byungchul Park <byungchul.park@lge.com> [ Changelog, a few edits and !SMP build fix ] Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1445616981-29904-2-git-send-email-byungchul.park@lge.com Signed-off-by: Ingo Molnar <mingo@kernel.org> (cherry-picked from `ad936d8658`) Signed-off-by: Juri Lelli <juri.lelli@arm.com> Signed-off-by: Chris Redpath <chris.redpath@arm.com> Change-Id: I17294ab0ada3901d35895014715fd60952949358 Signed-off-by: Brendan Jackman <brendan.jackman@arm.com>	2017-10-27 13:30:32 +01:00
Chris Redpath	fac311be26	cpufreq/sched: Consider max cpu capacity when choosing frequencies When using schedfreq on cpus with max capacity significantly smaller than 1024, the tick update uses non-normalised capacities - this leads to selecting an incorrect OPP as we were scaling the frequency as if the max capacity achievable was 1024 rather than the max for that particular cpu or group. This could result in a cpu being stuck at the lowest OPP and unable to generate enough utilisation to climb out if the max capacity is significantly smaller than 1024. Instead, normalize the capacity to be in the range 0-1024 in the tick so that when we later select a frequency, we get the correct one. Also comments updated to be clearer about what is needed. Change-Id: Id84391c7ac015311002ada21813a353ee13bee60 Signed-off-by: Chris Redpath <chris.redpath@arm.com>	2017-10-27 13:30:32 +01:00
Martijn Coenen	c8bc3e3a3e	ANDROID: binder: show high watermark of alloc->pages. Show the high watermark of the index into the alloc->pages array, to facilitate sizing the buffer on a per-process basis. Change-Id: I2b40cd16628e0ee45216c51dc9b3c5b0c862032e Signed-off-by: Martijn Coenen <maco@android.com>	2017-10-26 15:16:08 +02:00
Martijn Coenen	95317055df	ANDROID: binder: Add thread->process_todo flag. This flag determines whether the thread should currently process the work in the thread->todo worklist. The prime usecase for this is improving the performance of synchronous transactions: all synchronous transactions post a BR_TRANSACTION_COMPLETE to the calling thread, but there's no reason to return that command to userspace right away - userspace anyway needs to wait for the reply. Likewise, a synchronous transaction that contains a binder object can cause a BC_ACQUIRE/BC_INCREFS to be returned to userspace; since the caller must anyway hold a strong/weak ref for the duration of the call, postponing these commands until the reply comes in is not a problem. Note that this flag is not used to determine whether a thread can handle process work; a thread should never pick up process work when thread work is still pending. Before patch: ------------------------------------------------------------------ Benchmark Time CPU Iterations ------------------------------------------------------------------ BM_sendVec_binderize/4 45959 ns 20288 ns 34351 BM_sendVec_binderize/8 45603 ns 20080 ns 34909 BM_sendVec_binderize/16 45528 ns 20113 ns 34863 BM_sendVec_binderize/32 45551 ns 20122 ns 34881 BM_sendVec_binderize/64 45701 ns 20183 ns 34864 BM_sendVec_binderize/128 45824 ns 20250 ns 34576 BM_sendVec_binderize/256 45695 ns 20171 ns 34759 BM_sendVec_binderize/512 45743 ns 20211 ns 34489 BM_sendVec_binderize/1024 46169 ns 20430 ns 34081 After patch: ------------------------------------------------------------------ Benchmark Time CPU Iterations ------------------------------------------------------------------ BM_sendVec_binderize/4 42939 ns 17262 ns 40653 BM_sendVec_binderize/8 42823 ns 17243 ns 40671 BM_sendVec_binderize/16 42898 ns 17243 ns 40594 BM_sendVec_binderize/32 42838 ns 17267 ns 40527 BM_sendVec_binderize/64 42854 ns 17249 ns 40379 BM_sendVec_binderize/128 42881 ns 17288 ns 40427 BM_sendVec_binderize/256 42917 ns 17297 ns 40429 BM_sendVec_binderize/512 43184 ns 17395 ns 40411 BM_sendVec_binderize/1024 43119 ns 17357 ns 40432 Signed-off-by: Martijn Coenen <maco@android.com> Change-Id: Ia70287066d62aba64e98ac44ff1214e37ca75693	2017-10-26 15:16:08 +02:00
Kevin Brodsky	cc005cde42	UPSTREAM: arm64: compat: Remove leftover variable declaration (cherry picked from commit `82d24d114f`) Commit `a1d5ebaf8c` ("arm64: big-endian: don't treat code as data when copying sigret code") moved the 32-bit sigreturn trampoline code from the aarch32_sigret_code array to kuser32.S. The commit removed the array definition from signal32.c, but not its declaration in signal32.h. Remove the leftover declaration. Signed-off-by: Kevin Brodsky <kevin.brodsky@arm.com> Signed-off-by: Mark Salyzyn <salyzyn@android.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Bug: 20045882 Bug: 63737556 Change-Id: Ic8a5f0e367f0ecd5c5ddd9e3885d0285f91cf89e	2017-10-25 21:22:23 +00:00
Chris Redpath	4f8767d1ca	ANDROID: sched/fair: Select correct capacity state for energy_diff The util returned from group_max_util is not capped at the max util present in the group, so it can be larger than the capacity stored in the array. Ensure that when this happens, we always use the last entry in the array to fetch energy from. Tested with synthetics on Juno board. Bug: 38159576 Change-Id: I89fb52fb7e68fa3e682e308acc232596672d03f7 Signed-off-by: Chris Redpath <chris.redpath@arm.com>	2017-10-25 19:28:51 +00:00
Dmitry Shmidt	89805266af	Revert "UPSTREAM: efi/libstub/arm64: Set -fpie when building the EFI stub" It break boot with UEFI bootloader This reverts commit `2f2860a504`.	2017-10-24 12:42:10 -07:00
Leo Yan	774481506a	cpufreq: schedutil: clamp util to CPU maximum capacity The code is to get the CPU util by accumulate different scheduling classes and when the total util value is larger than CPU capacity then it clamps util to CPU maximum capacity. So we can get correct util value when use PELT signal but if with WALT signal it misses to clamp util value. On the other hand, WALT doesn't accumulate different class utilization but it needs to applying boost margin for WALT signal the CPU util value is possible to be larger than CPU capacity; so this patch is to always clamp util to CPU maximum capacity. Change-Id: I05481ddbf20246bb9be15b6bd21b6ec039015ea8 Signed-off-by: Leo Yan <leo.yan@linaro.org>	2017-10-24 15:49:27 +00:00
Sherry Yang	a7e1d33c7a	FROMLIST: android: binder: Fix null ptr dereference in debug msg (from https://patchwork.kernel.org/patch/9990323/) Don't access next->data in kernel debug message when the next buffer is null. Bug: 36007193 Change-Id: Ib8240d7e9a7087a2256e88c0ae84b9df0f2d0224 Acked-by: Arve Hjønnevåg <arve@android.com> Signed-off-by: Sherry Yang <sherryy@android.com>	2017-10-23 19:10:08 +00:00
Sherry Yang	c9391ba6ee	FROMLIST: android: binder: Change binder_shrinker to static (from https://patchwork.kernel.org/patch/9990321/) binder_shrinker struct is not used anywhere outside of binder_alloc.c and should be static. Bug: 63926541 Change-Id: I7a13d4ddbaaf3721cddfe1d860e34c7be80dd082 Acked-by: Arve Hjønnevåg <arve@android.com> Signed-off-by: Sherry Yang <sherryy@android.com>	2017-10-23 19:09:37 +00:00
Chris Redpath	d34d2c97ae	cpufreq/sched: Use cpu max freq rather than policy max When we convert capacity into frequency, we used policy->max to get the max freq of the cpu. Since this can be changed by userspace policy or thermal events, we are potentially asking for a lower frequency than the utilization demands. Change over to using cpuinfo.max which is the max freq supported by that cpu rather than the currently-chosen max. Frequency granted still honours the max policy. Tested by setting a userspace policy and observing the relevant vars in a trace. In this instance, we ask for around 1ghz instead of 620MHz. freq_new=1013512 unfixed_freq_new=624487 capacity=546 cpuinfo_max=1900800 policy_max=1171200 Change-Id: I8c5694db42243c6fb78bb9be9046b06ac81295e7 Signed-off-by: Chris Redpath <chris.redpath@arm.com>	2017-10-23 12:30:46 +01:00
Greg Kroah-Hartman	89074de67a	Merge 4.4.94 into android-4.4 Changes in 4.4.94 percpu: make this_cpu_generic_read() atomic w.r.t. interrupts drm/dp/mst: save vcpi with payloads MIPS: Fix minimum alignment requirement of IRQ stack sctp: potential read out of bounds in sctp_ulpevent_type_enabled() bpf/verifier: reject BPF_ALU64\|BPF_END udpv6: Fix the checksum computation when HW checksum does not apply ip6_gre: skb_push ipv6hdr before packing the header in ip6gre_header net: emac: Fix napi poll list corruption packet: hold bind lock when rebinding to fanout hook bpf: one perf event close won't free bpf program attached by another perf event isdn/i4l: fetch the ppp_write buffer in one shot vti: fix use after free in vti_tunnel_xmit/vti6_tnl_xmit l2tp: Avoid schedule while atomic in exit_net l2tp: fix race condition in l2tp_tunnel_delete tun: bail out from tun_get_user() if the skb is empty packet: in packet_do_bind, test fanout with bind_lock held packet: only test po->has_vnet_hdr once in packet_snd net: Set sk_prot_creator when cloning sockets to the right proto tipc: use only positive error codes in messages Revert "bsg-lib: don't free job in bsg_prepare_job" locking/lockdep: Add nest_lock integrity test watchdog: kempld: fix gcc-4.3 build irqchip/crossbar: Fix incorrect type of local variables mac80211_hwsim: check HWSIM_ATTR_RADIO_NAME length mac80211: fix power saving clients handling in iwlwifi net/mlx4_en: fix overflow in mlx4_en_init_timestamp() netfilter: nf_ct_expect: Change __nf_ct_expect_check() return value. iio: adc: xilinx: Fix error handling Btrfs: send, fix failure to rename top level inode due to name collision f2fs: do not wait for writeback in write_begin md/linear: shutup lockdep warnning sparc64: Migrate hvcons irq to panicked cpu net/mlx4_core: Fix VF overwrite of module param which disables DMFS on new probed PFs crypto: xts - Add ECB dependency ocfs2/dlmglue: prepare tracking logic to avoid recursive cluster lock slub: do not merge cache if slub_debug contains a never-merge flag scsi: scsi_dh_emc: return success in clariion_std_inquiry() net: mvpp2: release reference to txq_cpu[] entry after unmapping i2c: at91: ensure state is restored after suspending ceph: clean up unsafe d_parent accesses in build_dentry_path uapi: fix linux/rds.h userspace compilation errors uapi: fix linux/mroute6.h userspace compilation errors target/iscsi: Fix unsolicited data seq_end_offset calculation nfsd/callback: Cleanup callback cred on shutdown cpufreq: CPPC: add ACPI_PROCESSOR dependency Revert "tty: goldfish: Fix a parameter of a call to free_irq" Linux 4.4.94 Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>	2017-10-22 08:09:11 +02:00
Greg Kroah-Hartman	af9a9a7bed	Linux 4.4.94	2017-10-21 17:09:07 +02:00
Greg Kroah-Hartman	401231d063	Revert "tty: goldfish: Fix a parameter of a call to free_irq" This reverts commit `01b3db29ba` which is commit `1a5c2d1de7` upstream. Ben writes: This fixes a bug introduced in 4.6 by commit `465893e188` "tty: goldfish: support platform_device with id -1". For earlier kernel versions, it introduces a bug. So let's drop it. Reported-by: Ben Hutchings <ben.hutchings@codethink.co.uk> Cc: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Cc: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org	2017-10-21 17:09:06 +02:00
Arnd Bergmann	cdbbea7809	cpufreq: CPPC: add ACPI_PROCESSOR dependency [ Upstream commit `a578884fa0` ] Without the Kconfig dependency, we can get this warning: warning: ACPI_CPPC_CPUFREQ selects ACPI_CPPC_LIB which has unmet direct dependencies (ACPI && ACPI_PROCESSOR) Fixes: `5477fb3bd1` (ACPI / CPPC: Add a CPUFreq driver for use with CPPC) Signed-off-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-10-21 17:09:06 +02:00
Kinglong Mee	c2c6f43e02	nfsd/callback: Cleanup callback cred on shutdown [ Upstream commit `f7d1ddbe76` ] The rpccred gotten from rpc_lookup_machine_cred() should be put when state is shutdown. Signed-off-by: Kinglong Mee <kinglongmee@gmail.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-10-21 17:09:06 +02:00
Varun Prakash	429a4ac589	target/iscsi: Fix unsolicited data seq_end_offset calculation [ Upstream commit `4d65491c26` ] In case of unsolicited data for the first sequence seq_end_offset must be set to minimum of total data length and FirstBurstLength, so do not add cmd->write_data_done to the min of total data length and FirstBurstLength. This patch avoids that with ImmediateData=Yes, InitialR2T=No, MaxXmitDataSegmentLength < FirstBurstLength that a WRITE command with IO size above FirstBurstLength triggers sequence error messages, for example Set following parameters on target (linux-4.8.12) ImmediateData = Yes InitialR2T = No MaxXmitDataSegmentLength = 8k FirstBurstLength = 64k Log in from Open iSCSI initiator and execute dd if=/dev/zero of=/dev/sdb bs=128k count=1 oflag=direct Error messages on target Command ITT: 0x00000035 with Offset: 65536, Length: 8192 outside of Sequence 73728:131072 while DataSequenceInOrder=Yes. Command ITT: 0x00000035, received DataSN: 0x00000001 higher than expected 0x00000000. Unable to perform within-command recovery while ERL=0. Signed-off-by: Varun Prakash <varun@chelsio.com> [ bvanassche: Use min() instead of open-coding it / edited patch description ] Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-10-21 17:09:06 +02:00
Dmitry V. Levin	823ba64c57	uapi: fix linux/mroute6.h userspace compilation errors [ Upstream commit `72aa107df6` ] Include <linux/in6.h> to fix the following linux/mroute6.h userspace compilation errors: /usr/include/linux/mroute6.h:80:22: error: field 'mf6cc_origin' has incomplete type struct sockaddr_in6 mf6cc_origin; /* Origin of mcast / /usr/include/linux/mroute6.h:81:22: error: field 'mf6cc_mcastgrp' has incomplete type struct sockaddr_in6 mf6cc_mcastgrp; / Group in question */ /usr/include/linux/mroute6.h:91:22: error: field 'src' has incomplete type struct sockaddr_in6 src; /usr/include/linux/mroute6.h:92:22: error: field 'grp' has incomplete type struct sockaddr_in6 grp; /usr/include/linux/mroute6.h:132:18: error: field 'im6_src' has incomplete type struct in6_addr im6_src, im6_dst; /usr/include/linux/mroute6.h:132:27: error: field 'im6_dst' has incomplete type struct in6_addr im6_src, im6_dst; Signed-off-by: Dmitry V. Levin <ldv@altlinux.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-10-21 17:09:06 +02:00
Dmitry V. Levin	028a419869	uapi: fix linux/rds.h userspace compilation errors [ Upstream commit `feb0869d90` ] Consistently use types from linux/types.h to fix the following linux/rds.h userspace compilation errors: /usr/include/linux/rds.h:106:2: error: unknown type name 'uint8_t' uint8_t name[32]; /usr/include/linux/rds.h:107:2: error: unknown type name 'uint64_t' uint64_t value; /usr/include/linux/rds.h:117:2: error: unknown type name 'uint64_t' uint64_t next_tx_seq; /usr/include/linux/rds.h:118:2: error: unknown type name 'uint64_t' uint64_t next_rx_seq; /usr/include/linux/rds.h:121:2: error: unknown type name 'uint8_t' uint8_t transport[TRANSNAMSIZ]; /* null term ascii */ /usr/include/linux/rds.h:122:2: error: unknown type name 'uint8_t' uint8_t flags; /usr/include/linux/rds.h:129:2: error: unknown type name 'uint64_t' uint64_t seq; /usr/include/linux/rds.h:130:2: error: unknown type name 'uint32_t' uint32_t len; /usr/include/linux/rds.h:135:2: error: unknown type name 'uint8_t' uint8_t flags; /usr/include/linux/rds.h:139:2: error: unknown type name 'uint32_t' uint32_t sndbuf; /usr/include/linux/rds.h:144:2: error: unknown type name 'uint32_t' uint32_t rcvbuf; /usr/include/linux/rds.h:145:2: error: unknown type name 'uint64_t' uint64_t inum; /usr/include/linux/rds.h:153:2: error: unknown type name 'uint64_t' uint64_t hdr_rem; /usr/include/linux/rds.h:154:2: error: unknown type name 'uint64_t' uint64_t data_rem; /usr/include/linux/rds.h:155:2: error: unknown type name 'uint32_t' uint32_t last_sent_nxt; /usr/include/linux/rds.h:156:2: error: unknown type name 'uint32_t' uint32_t last_expected_una; /usr/include/linux/rds.h:157:2: error: unknown type name 'uint32_t' uint32_t last_seen_una; /usr/include/linux/rds.h:164:2: error: unknown type name 'uint8_t' uint8_t src_gid[RDS_IB_GID_LEN]; /usr/include/linux/rds.h:165:2: error: unknown type name 'uint8_t' uint8_t dst_gid[RDS_IB_GID_LEN]; /usr/include/linux/rds.h:167:2: error: unknown type name 'uint32_t' uint32_t max_send_wr; /usr/include/linux/rds.h:168:2: error: unknown type name 'uint32_t' uint32_t max_recv_wr; /usr/include/linux/rds.h:169:2: error: unknown type name 'uint32_t' uint32_t max_send_sge; /usr/include/linux/rds.h:170:2: error: unknown type name 'uint32_t' uint32_t rdma_mr_max; /usr/include/linux/rds.h:171:2: error: unknown type name 'uint32_t' uint32_t rdma_mr_size; /usr/include/linux/rds.h:212:9: error: unknown type name 'uint64_t' typedef uint64_t rds_rdma_cookie_t; /usr/include/linux/rds.h:215:2: error: unknown type name 'uint64_t' uint64_t addr; /usr/include/linux/rds.h:216:2: error: unknown type name 'uint64_t' uint64_t bytes; /usr/include/linux/rds.h:221:2: error: unknown type name 'uint64_t' uint64_t cookie_addr; /usr/include/linux/rds.h:222:2: error: unknown type name 'uint64_t' uint64_t flags; /usr/include/linux/rds.h:228:2: error: unknown type name 'uint64_t' uint64_t cookie_addr; /usr/include/linux/rds.h:229:2: error: unknown type name 'uint64_t' uint64_t flags; /usr/include/linux/rds.h:234:2: error: unknown type name 'uint64_t' uint64_t flags; /usr/include/linux/rds.h:240:2: error: unknown type name 'uint64_t' uint64_t local_vec_addr; /usr/include/linux/rds.h:241:2: error: unknown type name 'uint64_t' uint64_t nr_local; /usr/include/linux/rds.h:242:2: error: unknown type name 'uint64_t' uint64_t flags; /usr/include/linux/rds.h:243:2: error: unknown type name 'uint64_t' uint64_t user_token; /usr/include/linux/rds.h:248:2: error: unknown type name 'uint64_t' uint64_t local_addr; /usr/include/linux/rds.h:249:2: error: unknown type name 'uint64_t' uint64_t remote_addr; /usr/include/linux/rds.h:252:4: error: unknown type name 'uint64_t' uint64_t compare; /usr/include/linux/rds.h:253:4: error: unknown type name 'uint64_t' uint64_t swap; /usr/include/linux/rds.h:256:4: error: unknown type name 'uint64_t' uint64_t add; /usr/include/linux/rds.h:259:4: error: unknown type name 'uint64_t' uint64_t compare; /usr/include/linux/rds.h:260:4: error: unknown type name 'uint64_t' uint64_t swap; /usr/include/linux/rds.h:261:4: error: unknown type name 'uint64_t' uint64_t compare_mask; /usr/include/linux/rds.h:262:4: error: unknown type name 'uint64_t' uint64_t swap_mask; /usr/include/linux/rds.h:265:4: error: unknown type name 'uint64_t' uint64_t add; /usr/include/linux/rds.h:266:4: error: unknown type name 'uint64_t' uint64_t nocarry_mask; /usr/include/linux/rds.h:269:2: error: unknown type name 'uint64_t' uint64_t flags; /usr/include/linux/rds.h:270:2: error: unknown type name 'uint64_t' uint64_t user_token; /usr/include/linux/rds.h:274:2: error: unknown type name 'uint64_t' uint64_t user_token; /usr/include/linux/rds.h:275:2: error: unknown type name 'int32_t' int32_t status; Signed-off-by: Dmitry V. Levin <ldv@altlinux.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-10-21 17:09:06 +02:00
Jeff Layton	c7a20ed295	ceph: clean up unsafe d_parent accesses in build_dentry_path [ Upstream commit `c6b0b656ca` ] While we hold a reference to the dentry when build_dentry_path is called, we could end up racing with a rename that changes d_parent. Handle that situation correctly, by using the rcu_read_lock to ensure that the parent dentry and inode stick around long enough to safely check ceph_snap and ceph_ino. Link: http://tracker.ceph.com/issues/18148 Signed-off-by: Jeff Layton <jlayton@redhat.com> Reviewed-by: Yan, Zheng <zyan@redhat.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-10-21 17:09:06 +02:00
Alexandre Belloni	c128baf6a1	i2c: at91: ensure state is restored after suspending [ Upstream commit `e3ccc921b7` ] When going to suspend, the I2C registers may be lost because the power to VDDcore is cut. Restore them when resuming. Signed-off-by: Alexandre Belloni <alexandre.belloni@free-electrons.com> Acked-by: Ludovic Desroches <ludovic.desroches@microchip.com> Signed-off-by: Wolfram Sang <wsa@the-dreams.de> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-10-21 17:09:06 +02:00
Thomas Petazzoni	d7ecae7266	net: mvpp2: release reference to txq_cpu[] entry after unmapping [ Upstream commit `36fb7435b6` ] The mvpp2_txq_bufs_free() function is called upon TX completion to DMA unmap TX buffers, and free the corresponding SKBs. It gets the references to the SKB to free and the DMA buffer to unmap from a per-CPU txq_pcpu data structure. However, the code currently increments the pointer to the next entry before doing the DMA unmap and freeing the SKB. It does not cause any visible problem because for a given SKB the TX completion is guaranteed to take place on the CPU where the TX was started. However, it is much more logical to increment the pointer to the next entry once the current entry has been completely unmapped/released. Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com> Acked-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-10-21 17:09:05 +02:00
Dan Carpenter	693e6513b2	scsi: scsi_dh_emc: return success in clariion_std_inquiry() [ Upstream commit `4d7d39a18b` ] We accidentally return an uninitialized variable on success. Fixes: `b6ff1b14cd` ("[SCSI] scsi_dh: Update EMC handler") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Reviewed-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-10-21 17:09:05 +02:00
Grygorii Maistrenko	9ac38e30f2	slub: do not merge cache if slub_debug contains a never-merge flag [ Upstream commit `c6e28895a4` ] In case CONFIG_SLUB_DEBUG_ON=n, find_mergeable() gets debug features from commandline but never checks if there are features from the SLAB_NEVER_MERGE set. As a result selected by slub_debug caches are always mergeable if they have been created without a custom constructor set or without one of the SLAB_* debug features on. This moves the SLAB_NEVER_MERGE check below the flags update from commandline to make sure it won't merge the slab cache if one of the debug features is on. Link: http://lkml.kernel.org/r/20170101124451.GA4740@lp-laptop-d Signed-off-by: Grygorii Maistrenko <grygoriimkd@gmail.com> Reviewed-by: Pekka Enberg <penberg@kernel.org> Acked-by: David Rientjes <rientjes@google.com> Acked-by: Christoph Lameter <cl@linux.com> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-10-21 17:09:05 +02:00
Eric Ren	315689d2e2	ocfs2/dlmglue: prepare tracking logic to avoid recursive cluster lock [ Upstream commit `439a36b8ef` ] We are in the situation that we have to avoid recursive cluster locking, but there is no way to check if a cluster lock has been taken by a precess already. Mostly, we can avoid recursive locking by writing code carefully. However, we found that it's very hard to handle the routines that are invoked directly by vfs code. For instance: const struct inode_operations ocfs2_file_iops = { .permission = ocfs2_permission, .get_acl = ocfs2_iop_get_acl, .set_acl = ocfs2_iop_set_acl, }; Both ocfs2_permission() and ocfs2_iop_get_acl() call ocfs2_inode_lock(PR): do_sys_open may_open inode_permission ocfs2_permission ocfs2_inode_lock() <=== first time generic_permission get_acl ocfs2_iop_get_acl ocfs2_inode_lock() <=== recursive one A deadlock will occur if a remote EX request comes in between two of ocfs2_inode_lock(). Briefly describe how the deadlock is formed: On one hand, OCFS2_LOCK_BLOCKED flag of this lockres is set in BAST(ocfs2_generic_handle_bast) when downconvert is started on behalf of the remote EX lock request. Another hand, the recursive cluster lock (the second one) will be blocked in in __ocfs2_cluster_lock() because of OCFS2_LOCK_BLOCKED. But, the downconvert never complete, why? because there is no chance for the first cluster lock on this node to be unlocked - we block ourselves in the code path. The idea to fix this issue is mostly taken from gfs2 code. 1. introduce a new field: struct ocfs2_lock_res.l_holders, to keep track of the processes' pid who has taken the cluster lock of this lock resource; 2. introduce a new flag for ocfs2_inode_lock_full: OCFS2_META_LOCK_GETBH; it means just getting back disk inode bh for us if we've got cluster lock. 3. export a helper: ocfs2_is_locked_by_me() is used to check if we have got the cluster lock in the upper code path. The tracking logic should be used by some of the ocfs2 vfs's callbacks, to solve the recursive locking issue cuased by the fact that vfs routines can call into each other. The performance penalty of processing the holder list should only be seen at a few cases where the tracking logic is used, such as get/set acl. You may ask what if the first time we got a PR lock, and the second time we want a EX lock? fortunately, this case never happens in the real world, as far as I can see, including permission check, (get\|set)_(acl\|attr), and the gfs2 code also do so. [sfr@canb.auug.org.au remove some inlines] Link: http://lkml.kernel.org/r/20170117100948.11657-2-zren@suse.com Signed-off-by: Eric Ren <zren@suse.com> Reviewed-by: Junxiao Bi <junxiao.bi@oracle.com> Reviewed-by: Joseph Qi <jiangqi903@gmail.com> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Cc: Mark Fasheh <mfasheh@versity.com> Cc: Joel Becker <jlbec@evilplan.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-10-21 17:09:05 +02:00
Milan Broz	d3335f5653	crypto: xts - Add ECB dependency [ Upstream commit `12cb3a1c41` ] Since the commit `f1c131b454` crypto: xts - Convert to skcipher the XTS mode is based on ECB, so the mode must select ECB otherwise it can fail to initialize. Signed-off-by: Milan Broz <gmazyland@gmail.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-10-21 17:09:05 +02:00
Majd Dibbiny	02744a55ed	net/mlx4_core: Fix VF overwrite of module param which disables DMFS on new probed PFs [ Upstream commit `95f1ba9a24` ] In the VF driver, module parameter mlx4_log_num_mgm_entry_size was mistakenly overwritten -- and in a manner which overrode the device-managed flow steering option encoded in the parameter. log_num_mgm_entry_size is a global module parameter which affects all ConnectX-3 PFs installed on that host. If a VF changes log_num_mgm_entry_size, this will affect all PFs which are probed subsequent to the change (by disabling DMFS for those PFs). Fixes: `3c439b5586` ("mlx4_core: Allow choosing flow steering mode") Signed-off-by: Majd Dibbiny <majd@mellanox.com> Reviewed-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-10-21 17:09:05 +02:00
Vijay Kumar	7bf94b9595	sparc64: Migrate hvcons irq to panicked cpu [ Upstream commit `7dd4fcf5b7` ] On panic, all other CPUs are stopped except the one which had hit panic. To keep console alive, we need to migrate hvcons irq to panicked CPU. Signed-off-by: Vijay Kumar <vijay.ac.kumar@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-10-21 17:09:05 +02:00
Shaohua Li	d14591e83b	md/linear: shutup lockdep warnning [ Upstream commit `d939cdfde3` ] Commit 03a9e24(md linear: fix a race between linear_add() and linear_congested()) introduces the warnning. Acked-by: Coly Li <colyli@suse.de> Signed-off-by: Shaohua Li <shli@fb.com> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-10-21 17:09:05 +02:00
Jaegeuk Kim	48ca88f935	f2fs: do not wait for writeback in write_begin [ Upstream commit `86d54795c9` ] Otherwise we can get livelock like below. [79880.428136] dbench D 0 18405 18404 0x00000000 [79880.428139] Call Trace: [79880.428142] __schedule+0x219/0x6b0 [79880.428144] schedule+0x36/0x80 [79880.428147] schedule_timeout+0x243/0x2e0 [79880.428152] ? update_sd_lb_stats+0x16b/0x5f0 [79880.428155] ? ktime_get+0x3c/0xb0 [79880.428157] io_schedule_timeout+0xa6/0x110 [79880.428161] __lock_page+0xf7/0x130 [79880.428164] ? unlock_page+0x30/0x30 [79880.428167] pagecache_get_page+0x16b/0x250 [79880.428171] grab_cache_page_write_begin+0x20/0x40 [79880.428182] f2fs_write_begin+0xa2/0xdb0 [f2fs] [79880.428192] ? f2fs_mark_inode_dirty_sync+0x16/0x30 [f2fs] [79880.428197] ? kmem_cache_free+0x79/0x200 [79880.428203] ? __mark_inode_dirty+0x17f/0x360 [79880.428206] generic_perform_write+0xbb/0x190 [79880.428213] ? file_update_time+0xa4/0xf0 [79880.428217] __generic_file_write_iter+0x19b/0x1e0 [79880.428226] f2fs_file_write_iter+0x9c/0x180 [f2fs] [79880.428231] __vfs_write+0xc5/0x140 [79880.428235] vfs_write+0xb2/0x1b0 [79880.428238] SyS_write+0x46/0xa0 [79880.428242] entry_SYSCALL_64_fastpath+0x1e/0xad Fixes: cae96a5c8ab6 ("f2fs: check io submission more precisely") Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-10-21 17:09:05 +02:00
Robbie Ko	3109615b52	Btrfs: send, fix failure to rename top level inode due to name collision [ Upstream commit `4dd9920d99` ] Under certain situations, an incremental send operation can fail due to a premature attempt to create a new top level inode (a direct child of the subvolume/snapshot root) whose name collides with another inode that was removed from the send snapshot. Consider the following example scenario. Parent snapshot: . (ino 256, gen 8) \|---- a1/ (ino 257, gen 9) \|---- a2/ (ino 258, gen 9) Send snapshot: . (ino 256, gen 3) \|---- a2/ (ino 257, gen 7) In this scenario, when receiving the incremental send stream, the btrfs receive command fails like this (ran in verbose mode, -vv argument): rmdir a1 mkfile o257-7-0 rename o257-7-0 -> a2 ERROR: rename o257-7-0 -> a2 failed: Is a directory What happens when computing the incremental send stream is: 1) An operation to remove the directory with inode number 257 and generation 9 is issued. 2) An operation to create the inode with number 257 and generation 7 is issued. This creates the inode with an orphanized name of "o257-7-0". 3) An operation rename the new inode 257 to its final name, "a2", is issued. This is incorrect because inode 258, which has the same name and it's a child of the same parent (root inode 256), was not yet processed and therefore no rmdir operation for it was yet issued. The rename operation is issued because we fail to detect that the name of the new inode 257 collides with inode 258, because their parent, a subvolume/snapshot root (inode 256) has a different generation in both snapshots. So fix this by ignoring the generation value of a parent directory that matches a root inode (number 256) when we are checking if the name of the inode currently being processed collides with the name of some other inode that was not yet processed. We can achieve this scenario of different inodes with the same number but different generation values either by mounting a filesystem with the inode cache option (-o inode_cache) or by creating and sending snapshots across different filesystems, like in the following example: $ mkfs.btrfs -f /dev/sdb $ mount /dev/sdb /mnt $ mkdir /mnt/a1 $ mkdir /mnt/a2 $ btrfs subvolume snapshot -r /mnt /mnt/snap1 $ btrfs send /mnt/snap1 -f /tmp/1.snap $ umount /mnt $ mkfs.btrfs -f /dev/sdc $ mount /dev/sdc /mnt $ touch /mnt/a2 $ btrfs subvolume snapshot -r /mnt /mnt/snap2 $ btrfs receive /mnt -f /tmp/1.snap # Take note that once the filesystem is created, its current # generation has value 7 so the inode from the second snapshot has # a generation value of 7. And after receiving the first snapshot # the filesystem is at a generation value of 10, because the call to # create the second snapshot bumps the generation to 8 (the snapshot # creation ioctl does a transaction commit), the receive command calls # the snapshot creation ioctl to create the first snapshot, which bumps # the filesystem's generation to 9, and finally when the receive # operation finishes it calls an ioctl to transition the first snapshot # (snap1) from RW mode to RO mode, which does another transaction commit # and bumps the filesystem's generation to 10. $ rm -f /tmp/1.snap $ btrfs send /mnt/snap1 -f /tmp/1.snap $ btrfs send -p /mnt/snap1 /mnt/snap2 -f /tmp/2.snap $ umount /mnt $ mkfs.btrfs -f /dev/sdd $ mount /dev/sdd /mnt $ btrfs receive /mnt /tmp/1.snap # Receive of snapshot snap2 used to fail. $ btrfs receive /mnt /tmp/2.snap Signed-off-by: Robbie Ko <robbieko@synology.com> Reviewed-by: Filipe Manana <fdmanana@suse.com> [Rewrote changelog to be more precise and clear] Signed-off-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-10-21 17:09:04 +02:00
Christophe JAILLET	4d134d830e	iio: adc: xilinx: Fix error handling [ Upstream commit `ca1c39ef76` ] Reorder error handling labels in order to match the way resources have been allocated. Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Acked-by: Lars-Peter Clausen <lars@metafoo.de> Signed-off-by: Jonathan Cameron <jic23@kernel.org> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-10-21 17:09:04 +02:00
Jarno Rajahalme	5c65ed5c07	netfilter: nf_ct_expect: Change __nf_ct_expect_check() return value. [ Upstream commit `4b86c459c7` ] Commit `4dee62b1b9` ("netfilter: nf_ct_expect: nf_ct_expect_insert() returns void") inadvertently changed the successful return value of nf_ct_expect_related_report() from 0 to 1 due to __nf_ct_expect_check() returning 1 on success. Prevent this regression in the future by changing the return value of __nf_ct_expect_check() to 0 on success. Signed-off-by: Jarno Rajahalme <jarno@ovn.org> Acked-by: Joe Stringer <joe@ovn.org> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-10-21 17:09:04 +02:00
Eric Dumazet	743a3ce1e0	net/mlx4_en: fix overflow in mlx4_en_init_timestamp() [ Upstream commit `47d3a07528` ] The cited commit makes a great job of finding optimal shift/multiplier values assuming a 10 seconds wrap around, but forgot to change the overflow_period computation. It overflows in cyclecounter_cyc2ns(), and the final result is 804 ms, which is silly. Lets simply use 5 seconds, no need to recompute this, given how it is supposed to work. Later, we will use a timer instead of a work queue, since the new RX allocation schem will no longer need mlx4_en_recover_from_oom() and the service_task firing every 250 ms. Fixes: `31c128b66e` ("net/mlx4_en: Choose time-stamping shift value according to HW frequency") Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Tariq Toukan <tariqt@mellanox.com> Cc: Eugenia Emantayev <eugenia@mellanox.com> Reviewed-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-10-21 17:09:04 +02:00
Emmanuel Grumbach	7ed668eeb8	mac80211: fix power saving clients handling in iwlwifi [ Upstream commit `d98937f4ea` ] iwlwifi now supports RSS and can't let mac80211 track the PS state based on the Rx frames since they can come out of order. iwlwifi is now advertising AP_LINK_PS, and uses explicit notifications to teach mac80211 about the PS state of the stations and the PS poll / uAPSD trigger frames coming our way from the peers. Because of that, the TIM stopped being maintained in mac80211. I tried to fix this in commit `c68df2e7be` ("mac80211: allow using AP_LINK_PS with mac80211-generated TIM IE") but that was later reverted by Felix in commit `6c18a6b4e7` ("Revert "mac80211: allow using AP_LINK_PS with mac80211-generated TIM IE") since it broke drivers that do not implement set_tim. Since none of the drivers that set AP_LINK_PS have the set_tim() handler set besides iwlwifi, I can bail out in __sta_info_recalc_tim if AP_LINK_PS AND .set_tim is not implemented. Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-10-21 17:09:04 +02:00
Johannes Berg	3e8c1a04d3	mac80211_hwsim: check HWSIM_ATTR_RADIO_NAME length [ Upstream commit `ff4dd73dd2` ] Unfortunately, the nla policy was defined to have HWSIM_ATTR_RADIO_NAME as an NLA_STRING, rather than NLA_NUL_STRING, so we can't use it as a NUL-terminated string in the kernel. Rather than break the API, kasprintf() the string to a new buffer to guarantee NUL termination. Reported-by: Andrew Zaborowski <andrew.zaborowski@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-10-21 17:09:04 +02:00
Franck Demathieu	4a464dacc2	irqchip/crossbar: Fix incorrect type of local variables [ Upstream commit `b28ace1266` ] The max and entry variables are unsigned according to the dt-bindings. Fix following 3 sparse issues (-Wtypesign): drivers/irqchip/irq-crossbar.c:222:52: warning: incorrect type in argument 3 (different signedness) drivers/irqchip/irq-crossbar.c:222:52: expected unsigned int [usertype] out_value drivers/irqchip/irq-crossbar.c:222:52: got int <noident> drivers/irqchip/irq-crossbar.c:245:56: warning: incorrect type in argument 4 (different signedness) drivers/irqchip/irq-crossbar.c:245:56: expected unsigned int [usertype] out_value drivers/irqchip/irq-crossbar.c:245:56: got int <noident> drivers/irqchip/irq-crossbar.c:263:56: warning: incorrect type in argument 4 (different signedness) drivers/irqchip/irq-crossbar.c:263:56: expected unsigned int [usertype] out_value drivers/irqchip/irq-crossbar.c:263:56: got int <noident> Signed-off-by: Franck Demathieu <fdemathieu@gmail.com> Cc: marc.zyngier@arm.com Cc: jason@lakedaemon.net Link: http://lkml.kernel.org/r/20170223094855.6546-1-fdemathieu@gmail.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-10-21 17:09:04 +02:00
Arnd Bergmann	7e53f0390d	watchdog: kempld: fix gcc-4.3 build [ Upstream commit `3736d4eb6a` ] gcc-4.3 can't decide whether the constant value in kempld_prescaler[PRESCALER_21] is built-time constant or not, and gets confused by the logic in do_div(): drivers/watchdog/kempld_wdt.o: In function `kempld_wdt_set_stage_timeout': kempld_wdt.c:(.text.kempld_wdt_set_stage_timeout+0x130): undefined reference to `__aeabi_uldivmod' This adds a call to ACCESS_ONCE() to force it to not consider it to be constant, and leaves the more efficient normal case in place for modern compilers, using an #ifdef to annotate why we do this hack. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-10-21 17:09:04 +02:00
Peter Zijlstra	28eab3db72	locking/lockdep: Add nest_lock integrity test [ Upstream commit `7fb4a2cea6` ] Boqun reported that hlock->references can overflow. Add a debug test for that to generate a clear error when this happens. Without this, lockdep is likely to report a mysterious failure on unlock. Reported-by: Boqun Feng <boqun.feng@gmail.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Nicolai Hähnle <Nicolai.Haehnle@amd.com> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-10-21 17:09:03 +02:00
Greg Kroah-Hartman	d44e463c94	Revert "bsg-lib: don't free job in bsg_prepare_job" This reverts commit `668cee82cd` which was commit `f507b54dcc` upstream. Ben reports: That function doesn't exist here (it was introduced in 4.13). Instead, this backport has modified bsg_create_job(), creating a leak. Please revert this on the 3.18, 4.4 and 4.9 stable branches. So I'm dropping it from here. Reported-by: Ben Hutchings <ben.hutchings@codethink.co.uk> Cc: Christoph Hellwig <hch@lst.de> Cc: Ming Lei <ming.lei@redhat.com> Cc: Jens Axboe <axboe@kernel.dk> Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org	2017-10-21 17:09:03 +02:00
Parthasarathy Bhuvaragan	01e3e63151	tipc: use only positive error codes in messages [ Upstream commit `aad06212d3` ] In commit `e3a77561e7` ("tipc: split up function tipc_msg_eval()"), we have updated the function tipc_msg_lookup_dest() to set the error codes to negative values at destination lookup failures. Thus when the function sets the error code to -TIPC_ERR_NO_NAME, its inserted into the 4 bit error field of the message header as 0xf instead of TIPC_ERR_NO_NAME (1). The value 0xf is an unknown error code. In this commit, we set only positive error code. Fixes: `e3a77561e7` ("tipc: split up function tipc_msg_eval()") Signed-off-by: Parthasarathy Bhuvaragan <parthasarathy.bhuvaragan@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-10-21 17:09:03 +02:00
Christoph Paasch	685699703a	net: Set sk_prot_creator when cloning sockets to the right proto [ Upstream commit `9d538fa60b` ] sk->sk_prot and sk->sk_prot_creator can differ when the app uses IPV6_ADDRFORM (transforming an IPv6-socket to an IPv4-one). Which is why sk_prot_creator is there to make sure that sk_prot_free() does the kmem_cache_free() on the right kmem_cache slab. Now, if such a socket gets transformed back to a listening socket (using connect() with AF_UNSPEC) we will allocate an IPv4 tcp_sock through sk_clone_lock() when a new connection comes in. But sk_prot_creator will still point to the IPv6 kmem_cache (as everything got copied in sk_clone_lock()). When freeing, we will thus put this memory back into the IPv6 kmem_cache although it was allocated in the IPv4 cache. I have seen memory corruption happening because of this. With slub-debugging and MEMCG_KMEM enabled this gives the warning "cache_from_obj: Wrong slab cache. TCPv6 but object is from TCP" A C-program to trigger this: void main(void) { int fd = socket(AF_INET6, SOCK_STREAM, IPPROTO_TCP); int new_fd, newest_fd, client_fd; struct sockaddr_in6 bind_addr; struct sockaddr_in bind_addr4, client_addr1, client_addr2; struct sockaddr unsp; int val; memset(&bind_addr, 0, sizeof(bind_addr)); bind_addr.sin6_family = AF_INET6; bind_addr.sin6_port = ntohs(42424); memset(&client_addr1, 0, sizeof(client_addr1)); client_addr1.sin_family = AF_INET; client_addr1.sin_port = ntohs(42424); client_addr1.sin_addr.s_addr = inet_addr("127.0.0.1"); memset(&client_addr2, 0, sizeof(client_addr2)); client_addr2.sin_family = AF_INET; client_addr2.sin_port = ntohs(42421); client_addr2.sin_addr.s_addr = inet_addr("127.0.0.1"); memset(&unsp, 0, sizeof(unsp)); unsp.sa_family = AF_UNSPEC; bind(fd, (struct sockaddr )&bind_addr, sizeof(bind_addr)); listen(fd, 5); client_fd = socket(AF_INET, SOCK_STREAM, IPPROTO_TCP); connect(client_fd, (struct sockaddr )&client_addr1, sizeof(client_addr1)); new_fd = accept(fd, NULL, NULL); close(fd); val = AF_INET; setsockopt(new_fd, SOL_IPV6, IPV6_ADDRFORM, &val, sizeof(val)); connect(new_fd, &unsp, sizeof(unsp)); memset(&bind_addr4, 0, sizeof(bind_addr4)); bind_addr4.sin_family = AF_INET; bind_addr4.sin_port = ntohs(42421); bind(new_fd, (struct sockaddr )&bind_addr4, sizeof(bind_addr4)); listen(new_fd, 5); client_fd = socket(AF_INET, SOCK_STREAM, IPPROTO_TCP); connect(client_fd, (struct sockaddr )&client_addr2, sizeof(client_addr2)); newest_fd = accept(new_fd, NULL, NULL); close(new_fd); close(client_fd); close(new_fd); } As far as I can see, this bug has been there since the beginning of the git-days. Signed-off-by: Christoph Paasch <cpaasch@apple.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-10-21 17:09:03 +02:00
Willem de Bruijn	1299f7e17e	packet: only test po->has_vnet_hdr once in packet_snd [ Upstream commit `da7c956101` ] Packet socket option po->has_vnet_hdr can be updated concurrently with other operations if no ring is attached. Do not test the option twice in packet_snd, as the value may change in between calls. A race on setsockopt disable may cause a packet > mtu to be sent without having GSO options set. Fixes: `bfd5f4a3d6` ("packet: Add GSO/csum offload support.") Signed-off-by: Willem de Bruijn <willemb@google.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-10-21 17:09:03 +02:00
Willem de Bruijn	1b6c80e797	packet: in packet_do_bind, test fanout with bind_lock held [ Upstream commit `4971613c16` ] Once a socket has po->fanout set, it remains a member of the group until it is destroyed. The prot_hook must be constant and identical across sockets in the group. If fanout_add races with packet_do_bind between the test of po->fanout and taking the lock, the bind call may make type or dev inconsistent with that of the fanout group. Hold po->bind_lock when testing po->fanout to avoid this race. I had to introduce artificial delay (local_bh_enable) to actually observe the race. Fixes: `dc99f60069` ("packet: Add fanout support.") Signed-off-by: Willem de Bruijn <willemb@google.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-10-21 17:09:03 +02:00
Alexander Potapenko	ee534927f0	tun: bail out from tun_get_user() if the skb is empty [ Upstream commit `2580c4c17a` ] KMSAN (https://github.com/google/kmsan) reported accessing uninitialized skb->data[0] in the case the skb is empty (i.e. skb->len is 0): ================================================ BUG: KMSAN: use of uninitialized memory in tun_get_user+0x19ba/0x3770 CPU: 0 PID: 3051 Comm: probe Not tainted 4.13.0+ #3140 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011 Call Trace: ... __msan_warning_32+0x66/0xb0 mm/kmsan/kmsan_instr.c:477 tun_get_user+0x19ba/0x3770 drivers/net/tun.c:1301 tun_chr_write_iter+0x19f/0x300 drivers/net/tun.c:1365 call_write_iter ./include/linux/fs.h:1743 new_sync_write fs/read_write.c:457 __vfs_write+0x6c3/0x7f0 fs/read_write.c:470 vfs_write+0x3e4/0x770 fs/read_write.c:518 SYSC_write+0x12f/0x2b0 fs/read_write.c:565 SyS_write+0x55/0x80 fs/read_write.c:557 do_syscall_64+0x242/0x330 arch/x86/entry/common.c:284 entry_SYSCALL64_slow_path+0x25/0x25 arch/x86/entry/entry_64.S:245 ... origin: ... kmsan_poison_shadow+0x6e/0xc0 mm/kmsan/kmsan.c:211 slab_alloc_node mm/slub.c:2732 __kmalloc_node_track_caller+0x351/0x370 mm/slub.c:4351 __kmalloc_reserve net/core/skbuff.c:138 __alloc_skb+0x26a/0x810 net/core/skbuff.c:231 alloc_skb ./include/linux/skbuff.h:903 alloc_skb_with_frags+0x1d7/0xc80 net/core/skbuff.c:4756 sock_alloc_send_pskb+0xabf/0xfe0 net/core/sock.c:2037 tun_alloc_skb drivers/net/tun.c:1144 tun_get_user+0x9a8/0x3770 drivers/net/tun.c:1274 tun_chr_write_iter+0x19f/0x300 drivers/net/tun.c:1365 call_write_iter ./include/linux/fs.h:1743 new_sync_write fs/read_write.c:457 __vfs_write+0x6c3/0x7f0 fs/read_write.c:470 vfs_write+0x3e4/0x770 fs/read_write.c:518 SYSC_write+0x12f/0x2b0 fs/read_write.c:565 SyS_write+0x55/0x80 fs/read_write.c:557 do_syscall_64+0x242/0x330 arch/x86/entry/common.c:284 return_from_SYSCALL_64+0x0/0x6a arch/x86/entry/entry_64.S:245 ================================================ Make sure tun_get_user() doesn't touch skb->data[0] unless there is actual data. C reproducer below: ========================== // autogenerated by syzkaller (http://github.com/google/syzkaller) #define _GNU_SOURCE #include <fcntl.h> #include <linux/if_tun.h> #include <netinet/ip.h> #include <net/if.h> #include <string.h> #include <sys/ioctl.h> int main() { int sock = socket(PF_INET, SOCK_STREAM, IPPROTO_IP); int tun_fd = open("/dev/net/tun", O_RDWR); struct ifreq req; memset(&req, 0, sizeof(struct ifreq)); strcpy((char*)&req.ifr_name, "gre0"); req.ifr_flags = IFF_UP \| IFF_MULTICAST; ioctl(tun_fd, TUNSETIFF, &req); ioctl(sock, SIOCSIFFLAGS, "gre0"); write(tun_fd, "hi", 0); return 0; } ========================== Signed-off-by: Alexander Potapenko <glider@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-10-21 17:09:03 +02:00
Sabrina Dubroca	b5f689d94b	l2tp: fix race condition in l2tp_tunnel_delete [ Upstream commit `62b982eeb4` ] If we try to delete the same tunnel twice, the first delete operation does a lookup (l2tp_tunnel_get), finds the tunnel, calls l2tp_tunnel_delete, which queues it for deletion by l2tp_tunnel_del_work. The second delete operation also finds the tunnel and calls l2tp_tunnel_delete. If the workqueue has already fired and started running l2tp_tunnel_del_work, then l2tp_tunnel_delete will queue the same tunnel a second time, and try to free the socket again. Add a dead flag to prevent firing the workqueue twice. Then we can remove the check of queue_work's result that was meant to prevent that race but doesn't. Reproducer: ip l2tp add tunnel tunnel_id 3000 peer_tunnel_id 4000 local 192.168.0.2 remote 192.168.0.1 encap udp udp_sport 5000 udp_dport 6000 ip l2tp add session name l2tp1 tunnel_id 3000 session_id 1000 peer_session_id 2000 ip link set l2tp1 up ip l2tp del tunnel tunnel_id 3000 ip l2tp del tunnel tunnel_id 3000 Fixes: `f8ccac0e44` ("l2tp: put tunnel socket release on a workqueue") Reported-by: Jianlin Shi <jishi@redhat.com> Signed-off-by: Sabrina Dubroca <sd@queasysnail.net> Acked-by: Guillaume Nault <g.nault@alphalink.fr> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-10-21 17:09:03 +02:00
Ridge Kennedy	110cf3dd4b	l2tp: Avoid schedule while atomic in exit_net [ Upstream commit `12d656af4e` ] While destroying a network namespace that contains a L2TP tunnel a "BUG: scheduling while atomic" can be observed. Enabling lockdep shows that this is happening because l2tp_exit_net() is calling l2tp_tunnel_closeall() (via l2tp_tunnel_delete()) from within an RCU critical section. l2tp_exit_net() takes rcu_read_lock_bh() << list_for_each_entry_rcu() >> l2tp_tunnel_delete() l2tp_tunnel_closeall() __l2tp_session_unhash() synchronize_rcu() << Illegal inside RCU critical section >> BUG: sleeping function called from invalid context in_atomic(): 1, irqs_disabled(): 0, pid: 86, name: kworker/u16:2 INFO: lockdep is turned off. CPU: 2 PID: 86 Comm: kworker/u16:2 Tainted: G W O 4.4.6-at1 #2 Hardware name: Xen HVM domU, BIOS 4.6.1-xs125300 05/09/2016 Workqueue: netns cleanup_net 0000000000000000 ffff880202417b90 ffffffff812b0013 ffff880202410ac0 ffffffff81870de8 ffff880202417bb8 ffffffff8107aee8 ffffffff81870de8 0000000000000c51 0000000000000000 ffff880202417be0 ffffffff8107b024 Call Trace: [<ffffffff812b0013>] dump_stack+0x85/0xc2 [<ffffffff8107aee8>] ___might_sleep+0x148/0x240 [<ffffffff8107b024>] __might_sleep+0x44/0x80 [<ffffffff810b21bd>] synchronize_sched+0x2d/0xe0 [<ffffffff8109be6d>] ? trace_hardirqs_on+0xd/0x10 [<ffffffff8105c7bb>] ? __local_bh_enable_ip+0x6b/0xc0 [<ffffffff816a1b00>] ? _raw_spin_unlock_bh+0x30/0x40 [<ffffffff81667482>] __l2tp_session_unhash+0x172/0x220 [<ffffffff81667397>] ? __l2tp_session_unhash+0x87/0x220 [<ffffffff8166888b>] l2tp_tunnel_closeall+0x9b/0x140 [<ffffffff81668c74>] l2tp_tunnel_delete+0x14/0x60 [<ffffffff81668dd0>] l2tp_exit_net+0x110/0x270 [<ffffffff81668d5c>] ? l2tp_exit_net+0x9c/0x270 [<ffffffff815001c3>] ops_exit_list.isra.6+0x33/0x60 [<ffffffff81501166>] cleanup_net+0x1b6/0x280 ... This bug can easily be reproduced with a few steps: $ sudo unshare -n bash # Create a shell in a new namespace # ip link set lo up # ip addr add 127.0.0.1 dev lo # ip l2tp add tunnel remote 127.0.0.1 local 127.0.0.1 tunnel_id 1 \ peer_tunnel_id 1 udp_sport 50000 udp_dport 50000 # ip l2tp add session name foo tunnel_id 1 session_id 1 \ peer_session_id 1 # ip link set foo up # exit # Exit the shell, in turn exiting the namespace $ dmesg ... [942121.089216] BUG: scheduling while atomic: kworker/u16:3/13872/0x00000200 ... To fix this, move the call to l2tp_tunnel_closeall() out of the RCU critical section, and instead call it from l2tp_tunnel_del_work(), which is running from the l2tp_wq workqueue. Fixes: `2b551c6e7d` ("l2tp: close sessions before initiating tunnel delete") Signed-off-by: Ridge Kennedy <ridge.kennedy@alliedtelesis.co.nz> Acked-by: Guillaume Nault <g.nault@alphalink.fr> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-10-21 17:09:03 +02:00
Alexey Kodanev	93040aa178	vti: fix use after free in vti_tunnel_xmit/vti6_tnl_xmit [ Upstream commit `36f6ee22d2` ] When running LTP IPsec tests, KASan might report: BUG: KASAN: use-after-free in vti_tunnel_xmit+0xeee/0xff0 [ip_vti] Read of size 4 at addr ffff880dc6ad1980 by task swapper/0/0 ... Call Trace: <IRQ> dump_stack+0x63/0x89 print_address_description+0x7c/0x290 kasan_report+0x28d/0x370 ? vti_tunnel_xmit+0xeee/0xff0 [ip_vti] __asan_report_load4_noabort+0x19/0x20 vti_tunnel_xmit+0xeee/0xff0 [ip_vti] ? vti_init_net+0x190/0x190 [ip_vti] ? save_stack_trace+0x1b/0x20 ? save_stack+0x46/0xd0 dev_hard_start_xmit+0x147/0x510 ? icmp_echo.part.24+0x1f0/0x210 __dev_queue_xmit+0x1394/0x1c60 ... Freed by task 0: save_stack_trace+0x1b/0x20 save_stack+0x46/0xd0 kasan_slab_free+0x70/0xc0 kmem_cache_free+0x81/0x1e0 kfree_skbmem+0xb1/0xe0 kfree_skb+0x75/0x170 kfree_skb_list+0x3e/0x60 __dev_queue_xmit+0x1298/0x1c60 dev_queue_xmit+0x10/0x20 neigh_resolve_output+0x3a8/0x740 ip_finish_output2+0x5c0/0xe70 ip_finish_output+0x4ba/0x680 ip_output+0x1c1/0x3a0 xfrm_output_resume+0xc65/0x13d0 xfrm_output+0x1e4/0x380 xfrm4_output_finish+0x5c/0x70 Can be fixed if we get skb->len before dst_output(). Fixes: `b9959fd3b0` ("vti: switch to new ip tunnel code") Fixes: `22e1b23daf` ("vti6: Support inter address family tunneling.") Signed-off-by: Alexey Kodanev <alexey.kodanev@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-10-21 17:09:02 +02:00
Meng Xu	d9cb4dc022	isdn/i4l: fetch the ppp_write buffer in one shot [ Upstream commit `02388bf87f` ] In isdn_ppp_write(), the header (i.e., protobuf) of the buffer is fetched twice from userspace. The first fetch is used to peek at the protocol of the message and reset the huptimer if necessary; while the second fetch copies in the whole buffer. However, given that buf resides in userspace memory, a user process can race to change its memory content across fetches. By doing so, we can either avoid resetting the huptimer for any type of packets (by first setting proto to PPP_LCP and later change to the actual type) or force resetting the huptimer for LCP packets. This patch changes this double-fetch behavior into two single fetches decided by condition (lp->isdn_device < 0 \|\| lp->isdn_channel <0). A more detailed discussion can be found at https://marc.info/?l=linux-kernel&m=150586376926123&w=2 Signed-off-by: Meng Xu <mengxu.gatech@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-10-21 17:09:02 +02:00

1 2 3 4 5 ...

570097 Commits