linux

mirror of https://github.com/hardkernel/linux.git synced 2026-04-03 11:43:03 +09:00

Author	SHA1	Message	Date
Peter Zijlstra	4b9300bf83	UPSTREAM: sched/core: Add missing update_rq_clock() in post_init_entity_util_avg() Address this rq-clock update bug: WARNING: CPU: 0 PID: 0 at ../kernel/sched/sched.h:797 post_init_entity_util_avg() rq->clock_update_flags < RQCF_ACT_SKIP Call Trace: __warn() post_init_entity_util_avg() wake_up_new_task() _do_fork() kernel_thread() rest_init() start_kernel() Change-Id: I0d8b0c83b19447dc53ebd397bb14f756cb97d8a3 Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org> (cherry picked from commit `4126bad671`) Signed-off-by: Quentin Perret <quentin.perret@arm.com>	2017-11-02 18:20:00 +00:00
Vincent Guittot	4604331553	UPSTREAM: sched/fair: Fix task group initialization The moves of tasks are now propagated down to root and the utilization of cfs_rq reflects reality so it doesn't need to be estimated at init. Change-Id: I496f26d22cbbe38d78d683f7eeb389b997eb9035 Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Dietmar Eggemann <dietmar.eggemann@arm.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Morten.Rasmussen@arm.com Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: bsegall@google.com Cc: kernellwp@gmail.com Cc: pjt@google.com Cc: yuyang.du@intel.com Link: http://lkml.kernel.org/r/1478598827-32372-7-git-send-email-vincent.guittot@linaro.org Signed-off-by: Ingo Molnar <mingo@kernel.org> (cherry picked from commit `d03266910a`) Signed-off-by: Quentin Perret <quentin.perret@arm.com>	2017-11-02 18:19:24 +00:00
Chris Redpath	d870d26fb3	cpufreq/sched: Consider max cpu capacity when choosing frequencies When using schedfreq on cpus with max capacity significantly smaller than 1024, the tick update uses non-normalised capacities - this leads to selecting an incorrect OPP as we were scaling the frequency as if the max capacity achievable was 1024 rather than the max for that particular cpu or group. This could result in a cpu being stuck at the lowest OPP and unable to generate enough utilisation to climb out if the max capacity is significantly smaller than 1024. Instead, normalize the capacity to be in the range 0-1024 in the tick so that when we later select a frequency, we get the correct one. Also comments updated to be clearer about what is needed. Change-Id: Id84391c7ac015311002ada21813a353ee13bee60 Signed-off-by: Chris Redpath <chris.redpath@arm.com> (cherry picked from commit bc3b0db024f3d0b323e7d93994655e9e3d9f8d68) Signed-off-by: Quentin Perret <quentin.perret@arm.com>	2017-11-02 11:15:14 -07:00
Chris Redpath	22f1127850	cpufreq/sched: Use cpu max freq rather than policy max When we convert capacity into frequency, we used policy->max to get the max freq of the cpu. Since this can be changed by userspace policy or thermal events, we are potentially asking for a lower frequency than the utilization demands. Change over to using cpuinfo.max which is the max freq supported by that cpu rather than the currently-chosen max. Frequency granted still honours the max policy. Tested by setting a userspace policy and observing the relevant vars in a trace. In this instance, we ask for around 1ghz instead of 620MHz. freq_new=1013512 unfixed_freq_new=624487 capacity=546 cpuinfo_max=1900800 policy_max=1171200 Change-Id: I8c5694db42243c6fb78bb9be9046b06ac81295e7 Signed-off-by: Chris Redpath <chris.redpath@arm.com> (cherry picked from commit 114c6ceb2e5b0e30442e2ba5f78cfaedf76bf1f9) [trivial cherry-pick issue] Signed-off-by: Quentin Perret <quentin.perret@arm.com>	2017-11-02 11:15:14 -07:00
Dietmar Eggemann	dad887ffaf	sched/fair: remove erroneous RCU_LOCKDEP_WARN from start_cpu() Fixes: https://bugs.linaro.org/show_bug.cgi?id=3075 Change-Id: I62d714fc4b9366a9b2535649aa92d1edc840cf94 Reported-by: Naresh Kamboju <naresh.kamboju@linaro.org> Signed-off-by: Dietmar Eggemann <dietmar.eggemann@arm.com> Signed-off-by: Brendan Jackman <brendan.jackman@arm.com> Signed-off-by: Chris Redpath <chris.redpath@arm.com> (cherry picked from commit 2617b1391f69dfbd6ffc340c235059107745aac2) Signed-off-by: Quentin Perret <quentin.perret@arm.com>	2017-11-02 11:15:14 -07:00
Quentin Perret	5783838362	Merge branch 'android-4.9' into android-4.9-eas-dev Change-Id: Ic2303d3a53ca8dba9306fdc4f26ee77662e7534f Signed-off-by: Quentin Perret <quentin.perret@arm.com>	2017-11-01 10:34:19 +00:00
Abhilash Kesavan	3861f0b0f1	ANDROID: sched/walt: Fix divide by zero error in cpufreq notifier On systems using ACPI cpufreq, at boot-time only CPU0 max_freq has been populated when load_scale_factor/capacity recomputation is triggered on all possible cpus. This leads to the following divide by zero panic when rq->max_freq for any cpu other than CPU0 is accessed: [ 4.827597] divide error: 0000 [#1] PREEMPT SMP [ 4.827604] [<ffffffff862ad88e>] ? notifier_call_chain+0x37/0x57 [ 4.827607] [<ffffffff862adaf3>] ? __blocking_notifier_call_chain+0x46/0x5c [ 4.827611] [<ffffffff8684b9f4>] ? cpufreq_set_policy+0xa6/0x2dc [ 4.827614] [<ffffffff869e4d55>] ? cpufreq_init_policy+0xba/0xde [ 4.827617] [<ffffffff8684bd4c>] ? cpufreq_update_policy+0x122/0x122 [ 4.827620] [<ffffffff8684c51c>] ? cpufreq_online+0x5a7/0x64e [ 4.827623] [<ffffffff8684c63b>] ? cpufreq_add_dev+0x78/0xed [ 4.827627] [<ffffffff866ce412>] ? subsys_interface_register+0xd6/0x117 [ 4.827630] [<ffffffff8684b410>] ? cpufreq_register_driver+0x168/0x2b2 [ 4.827632] [<ffffffff8684b410>] ? cpufreq_register_driver+0x168/0x2b2 [ 4.827636] [<ffffffff86f83b2e>] ? cpufreq_gov_dbs_init+0xc/0xc [ 4.827639] [<ffffffff86f83d43>] ? acpi_cpufreq_init+0x215/0x282 [ 4.827641] [<ffffffff86f83b2e>] ? cpufreq_gov_dbs_init+0xc/0xc [ 4.827645] [<ffffffff8620046e>] ? do_one_initcall+0x9c/0x12e [ 4.827649] [<ffffffff86f3d058>] ? kernel_init_freeable+0x190/0x22b [ 4.827651] [<ffffffff869e59d9>] ? rest_init+0x7c/0x7c [ 4.827653] [<ffffffff869e59e3>] ? kernel_init+0xa/0xf3 [ 4.827656] [<ffffffff869ed482>] ? ret_from_fork+0x22/0x30 On ACPI based systems, for CPUs with uninitialized rq->max_freq skip the re-calculation. Reference discussion on a similar issue here: https://www.spinics.net/lists/arm-kernel/msg557612.html Change-Id: I5025e3f6db671e9e72369f85708d3b0482d5dad2 Signed-off-by: Abhilash Kesavan <abhilash.kesavan@intel.com>	2017-10-27 16:04:52 +00:00
Chris Redpath	185507e31e	ANDROID: sched/fair: Select correct capacity state for energy_diff The util returned from group_max_util is not capped at the max util present in the group, so it can be larger than the capacity stored in the array. Ensure that when this happens, we always use the last entry in the array to fetch energy from. Tested with synthetics on Juno board. Change-Id: I89fb52fb7e68fa3e682e308acc232596672d03f7 Signed-off-by: Chris Redpath <chris.redpath@arm.com>	2017-10-26 15:17:49 +01:00
Rafael J. Wysocki	53d56d4130	BACKPORT: cpufreq: schedutil: Use policy-dependent transition delays Make the schedutil governor take the initial (default) value of the rate_limit_us sysfs attribute from the (new) transition_delay_us policy parameter (to be set by the scaling driver). That will allow scaling drivers to make schedutil use smaller default values of rate_limit_us and reduce the default average time interval between consecutive frequency changes. Make intel_pstate set transition_delay_us to 500. BACKPORT: Modified to support the separate up_rate_limit_us and down_rate_limit_us (upstream just has a single rate_limit_us). Also dropped the changes for intel_pstate as there's a merge conflict. Change-Id: I62a8543879a4d8582cdcb31ebd55607705d1c8b1 Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> (cherry picked from commit `1b72e7fd30`) Signed-off-by: Brendan Jackman <brendan.jackman@arm.com>	2017-10-26 02:26:34 +00:00
Greg Kroah-Hartman	f108c7d9b5	Merge 4.9.58 into android-4.9 Changes in 4.9.58 MIPS: Fix minimum alignment requirement of IRQ stack Revert "bsg-lib: don't free job in bsg_prepare_job" xen-netback: Use GFP_ATOMIC to allocate hash locking/lockdep: Add nest_lock integrity test watchdog: kempld: fix gcc-4.3 build irqchip/crossbar: Fix incorrect type of local variables initramfs: finish fput() before accessing any binary from initramfs mac80211_hwsim: check HWSIM_ATTR_RADIO_NAME length ALSA: hda: Add Geminilake HDMI codec ID qed: Don't use attention PTT for configuring BW mac80211: fix power saving clients handling in iwlwifi net/mlx4_en: fix overflow in mlx4_en_init_timestamp() staging: vchiq_2835_arm: Make cache-line-size a required DT property netfilter: nf_ct_expect: Change __nf_ct_expect_check() return value. iio: adc: xilinx: Fix error handling f2fs: do SSR for data when there is enough free space sched/fair: Update rq clock before changing a task's CPU affinity Btrfs: send, fix failure to rename top level inode due to name collision f2fs: do not wait for writeback in write_begin md/linear: shutup lockdep warnning sparc64: Migrate hvcons irq to panicked cpu net/mlx4_core: Fix VF overwrite of module param which disables DMFS on new probed PFs crypto: xts - Add ECB dependency mm/memory_hotplug: set magic number to page->freelist instead of page->lru.next ocfs2/dlmglue: prepare tracking logic to avoid recursive cluster lock slub: do not merge cache if slub_debug contains a never-merge flag scsi: scsi_dh_emc: return success in clariion_std_inquiry() ASoC: mediatek: add I2C dependency for CS42XX8 drm/amdgpu: refuse to reserve io mem for split VRAM buffers net: mvpp2: release reference to txq_cpu[] entry after unmapping qede: Prevent index problems in loopback test qed: Reserve doorbell BAR space for present CPUs qed: Read queue state before releasing buffer i2c: at91: ensure state is restored after suspending ceph: don't update_dentry_lease unless we actually got one ceph: fix bogus endianness change in ceph_ioctl_set_layout ceph: clean up unsafe d_parent accesses in build_dentry_path uapi: fix linux/rds.h userspace compilation errors uapi: fix linux/mroute6.h userspace compilation errors IB/hfi1: Use static CTLE with Preset 6 for integrated HFIs IB/hfi1: Allocate context data on memory node target/iscsi: Fix unsolicited data seq_end_offset calculation hrtimer: Catch invalid clockids again nfsd/callback: Cleanup callback cred on shutdown powerpc/perf: Add restrictions to PMC5 in power9 DD1 drm/nouveau/gr/gf100-: fix ccache error logging regulator: core: Resolve supplies before disabling unused regulators btmrvl: avoid double-disable_irq() race EDAC, mce_amd: Print IPID and Syndrome on a separate line cpufreq: CPPC: add ACPI_PROCESSOR dependency usb: dwc3: gadget: Correct ISOC DATA PIDs for short packets Linux 4.9.58 Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>	2017-10-23 09:35:27 +02:00
Marc Zyngier	0c92e73293	hrtimer: Catch invalid clockids again [ Upstream commit `336a9cde10` ] commit `82e88ff1ea` ("hrtimer: Revert CLOCK_MONOTONIC_RAW support") removed unfortunately a sanity check in the hrtimer code which was part of that MONOTONIC_RAW patch series. It would have caught the bogus usage of CLOCK_MONOTONIC_RAW in the wireless code. So bring it back. It is way too easy to take any random clockid and feed it to the hrtimer subsystem. At best, it gets mapped to a monotonic base, but it would be better to just catch illegal values as early as possible. Detect invalid clockids, map them to CLOCK_MONOTONIC and emit a warning. [ tglx: Replaced the BUG by a WARN and gracefully map to CLOCK_MONOTONIC ] Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Cc: Tomasz Nowicki <tn@semihalf.com> Cc: Christoffer Dall <christoffer.dall@linaro.org> Link: http://lkml.kernel.org/r/1452879670-16133-3-git-send-email-marc.zyngier@arm.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-10-21 17:21:38 +02:00
Wanpeng Li	ab3d531745	sched/fair: Update rq clock before changing a task's CPU affinity [ Upstream commit `a499c3ead8` ] This is triggered during boot when CONFIG_SCHED_DEBUG is enabled: ------------[ cut here ]------------ WARNING: CPU: 6 PID: 81 at kernel/sched/sched.h:812 set_next_entity+0x11d/0x380 rq->clock_update_flags < RQCF_ACT_SKIP CPU: 6 PID: 81 Comm: torture_shuffle Not tainted 4.10.0+ #1 Hardware name: LENOVO ThinkCentre M8500t-N000/SHARKBAY, BIOS FBKTC1AUS 02/16/2016 Call Trace: dump_stack+0x85/0xc2 __warn+0xcb/0xf0 warn_slowpath_fmt+0x5f/0x80 set_next_entity+0x11d/0x380 set_curr_task_fair+0x2b/0x60 do_set_cpus_allowed+0x139/0x180 __set_cpus_allowed_ptr+0x113/0x260 set_cpus_allowed_ptr+0x10/0x20 torture_shuffle+0xfd/0x180 kthread+0x10f/0x150 ? torture_shutdown_init+0x60/0x60 ? kthread_create_on_node+0x60/0x60 ret_from_fork+0x31/0x40 ---[ end trace dd94d92344cea9c6 ]--- The task is running && !queued, so there is no rq clock update before calling set_curr_task(). This patch fixes it by updating rq clock after holding rq->lock/pi_lock just as what other dequeue + put_prev + enqueue + set_curr story does. Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Matt Fleming <matt@codeblueprint.co.uk> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1487749975-5994-1-git-send-email-wanpeng.li@hotmail.com Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-10-21 17:21:35 +02:00
Peter Zijlstra	8b0be545de	locking/lockdep: Add nest_lock integrity test [ Upstream commit `7fb4a2cea6` ] Boqun reported that hlock->references can overflow. Add a debug test for that to generate a clear error when this happens. Without this, lockdep is likely to report a mysterious failure on unlock. Reported-by: Boqun Feng <boqun.feng@gmail.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Nicolai Hähnle <Nicolai.Haehnle@amd.com> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-10-21 17:21:33 +02:00
Greg Kroah-Hartman	b86d2b1467	Merge 4.9.57 into android-4.9 Changes in 4.9.57 ext4: in ext4_seek_{hole,data}, return -ENXIO for negative offsets CIFS: Reconnect expired SMB sessions nl80211: Define policy for packet pattern attributes rcu: Allow for page faults in NMI handlers USB: dummy-hcd: Fix deadlock caused by disconnect detection MIPS: math-emu: Remove pr_err() calls from fpu_emu() dmaengine: edma: Align the memcpy acnt array size with the transfer dmaengine: ti-dma-crossbar: Fix possible race condition with dma_inuse HID: usbhid: fix out-of-bounds bug crypto: shash - Fix zero-length shash ahash digest crash KVM: MMU: always terminate page walks at level 1 KVM: nVMX: fix guest CR4 loading when emulating L2 to L1 exit usb: renesas_usbhs: Fix DMAC sequence for receiving zero-length packet pinctrl/amd: Fix build dependency on pinmux code iommu/amd: Finish TLB flush in amd_iommu_unmap() device property: Track owner device of device property fs/mpage.c: fix mpage_writepage() for pages with buffers ALSA: usb-audio: Kill stray URB at exiting ALSA: seq: Fix use-after-free at creating a port ALSA: seq: Fix copy_from_user() call inside lock ALSA: caiaq: Fix stray URB at probe error path ALSA: line6: Fix missing initialization before error path ALSA: line6: Fix leftover URB at error-path during probe drm/i915/edp: Get the Panel Power Off timestamp after panel is off drm/i915: Read timings from the correct transcoder in intel_crtc_mode_get() drm/i915/bios: parse DDI ports also for CHV for HDMI DDC pin and DP AUX channel usb: gadget: configfs: Fix memory leak of interface directory data usb: gadget: composite: Fix use-after-free in usb_composite_overwrite_options direct-io: Prevent NULL pointer access in submit_page_section fix unbalanced page refcounting in bio_map_user_iov more bio_map_user_iov() leak fixes bio_copy_user_iov(): don't ignore ->iov_offset USB: serial: ftdi_sio: add id for Cypress WICED dev board USB: serial: cp210x: add support for ELV TFD500 USB: serial: option: add support for TP-Link LTE module USB: serial: qcserial: add Dell DW5818, DW5819 USB: serial: console: fix use-after-free after failed setup x86/alternatives: Fix alt_max_short macro to really be a max() KVM: nVMX: update last_nonleaf_level when initializing nested EPT Linux 4.9.57 Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>	2017-10-18 09:45:37 +02:00
Paul E. McKenney	97535791d8	rcu: Allow for page faults in NMI handlers commit `28585a8326` upstream. A number of architecture invoke rcu_irq_enter() on exception entry in order to allow RCU read-side critical sections in the exception handler when the exception is from an idle or nohz_full CPU. This works, at least unless the exception happens in an NMI handler. In that case, rcu_nmi_enter() would already have exited the extended quiescent state, which would mean that rcu_irq_enter() would (incorrectly) cause RCU to think that it is again in an extended quiescent state. This will in turn result in lockdep splats in response to later RCU read-side critical sections. This commit therefore causes rcu_irq_enter() and rcu_irq_exit() to take no action if there is an rcu_nmi_enter() in effect, thus avoiding the unscheduled return to RCU quiescent state. This in turn should make the kernel safe for on-demand RCU voyeurism. Link: http://lkml.kernel.org/r/20170922211022.GA18084@linux.vnet.ibm.com Fixes: `0be964be0` ("module: Sanitize RCU usage and locking") Reported-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-10-18 09:35:38 +02:00
Brendan Jackman	55f81c1c68	UPSTREAM: sched/fair: Fix usage of find_idlest_group() when the local group is idlest find_idlest_group() returns NULL when the local group is idlest. The caller then continues the find_idlest_group() search at a lower level of the current CPU's sched_domain hierarchy. find_idlest_group_cpu() is not consulted and, crucially, @new_cpu is not updated. This means the search is pointless and we return @prev_cpu from select_task_rq_fair(). This is fixed by initialising @new_cpu to @cpu instead of @prev_cpu. Change-Id: I8644a35179b00d4833befe237aa443ec840046ec Signed-off-by: Brendan Jackman <brendan.jackman@arm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Josef Bacik <jbacik@fb.com> Reviewed-by: Vincent Guittot <vincent.guittot@linaro.org> Cc: Dietmar Eggemann <dietmar.eggemann@arm.com> Cc: Josef Bacik <josef@toxicpanda.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Morten Rasmussen <morten.rasmussen@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/20171005114516.18617-6-brendan.jackman@arm.com Signed-off-by: Ingo Molnar <mingo@kernel.org> (cherry picked from commit `93f50f9024`) Signed-off-by: Brendan Jackman <brendan.jackman@arm.com>	2017-10-13 13:31:44 +01:00
Brendan Jackman	a8c3c11c0e	UPSTREAM: sched/fair: Fix usage of find_idlest_group() when no groups are allowed When 'p' is not allowed on any of the CPUs in the sched_domain, we currently return NULL from find_idlest_group(), and pointlessly continue the search on lower sched_domain levels (where 'p' is also not allowed) before returning prev_cpu regardless (as we have not updated new_cpu). Add an explicit check for this case, and add a comment to find_idlest_group(). Now when find_idlest_group() returns NULL, it always means that the local group is allowed and idlest. Change-Id: I900538644e62dab68f1cfddb59d41846d991ce09 Signed-off-by: Brendan Jackman <brendan.jackman@arm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Vincent Guittot <vincent.guittot@linaro.org> Reviewed-by: Josef Bacik <jbacik@fb.com> Cc: Dietmar Eggemann <dietmar.eggemann@arm.com> Cc: Josef Bacik <josef@toxicpanda.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Morten Rasmussen <morten.rasmussen@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/20171005114516.18617-5-brendan.jackman@arm.com Signed-off-by: Ingo Molnar <mingo@kernel.org> (cherry picked from commit `6fee85ccbc`) Signed-off-by: Brendan Jackman <brendan.jackman@arm.com>	2017-10-13 13:31:43 +01:00
Brendan Jackman	13c4d4fdca	UPSTREAM: sched/fair: Fix find_idlest_group() when local group is not allowed When the local group is not allowed we do not modify this_*_load from their initial value of 0. That means that the load checks at the end of find_idlest_group cause us to incorrectly return NULL. Fixing the initial values to ULONG_MAX means we will instead return the idlest remote group in that case. Change-Id: I4cb6269a653205b1d1ba6dce1ca42359faf85362 Signed-off-by: Brendan Jackman <brendan.jackman@arm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Vincent Guittot <vincent.guittot@linaro.org> Reviewed-by: Josef Bacik <jbacik@fb.com> Cc: Dietmar Eggemann <dietmar.eggemann@arm.com> Cc: Josef Bacik <josef@toxicpanda.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Morten Rasmussen <morten.rasmussen@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/20171005114516.18617-4-brendan.jackman@arm.com Signed-off-by: Ingo Molnar <mingo@kernel.org> (cherry picked from commit `0d10ab952e`) Signed-off-by: Brendan Jackman <brendan.jackman@arm.com>	2017-10-13 13:31:42 +01:00
Brendan Jackman	da6485cee9	UPSTREAM: sched/fair: Remove unnecessary comparison with -1 Since commit: `83a0a96a5f` ("sched/fair: Leverage the idle state info when choosing the "idlest" cpu") find_idlest_group_cpu() (formerly find_idlest_cpu) no longer returns -1, so we can simplify the checking of the return value in find_idlest_cpu(). Change-Id: I43828722863cf286971c92fea1f3019f00b8e627 Signed-off-by: Brendan Jackman <brendan.jackman@arm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Josef Bacik <jbacik@fb.com> Reviewed-by: Vincent Guittot <vincent.guittot@linaro.org> Cc: Dietmar Eggemann <dietmar.eggemann@arm.com> Cc: Josef Bacik <josef@toxicpanda.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Morten Rasmussen <morten.rasmussen@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/20171005114516.18617-3-brendan.jackman@arm.com Signed-off-by: Ingo Molnar <mingo@kernel.org> (cherry picked from commit `e90381eaec`) Signed-off-by: Brendan Jackman <brendan.jackman@arm.com>	2017-10-13 13:31:42 +01:00
Brendan Jackman	6cc75c9bd2	UPSTREAM: sched/fair: Move select_task_rq_fair() slow-path into its own function In preparation for changes that would otherwise require adding a new level of indentation to the while(sd) loop, create a new function find_idlest_cpu() which contains this loop, and rename the existing find_idlest_cpu() to find_idlest_group_cpu(). Code inside the while(sd) loop is unchanged. @new_cpu is added as a variable in the new function, with the same initial value as the @new_cpu in select_task_rq_fair(). Change-Id: Ic08cc71eb4d0408b8987b2b15fcb7326956d7e81 Suggested-by: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Brendan Jackman <brendan.jackman@arm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Josef Bacik <jbacik@fb.com> Reviewed-by: Vincent Guittot <vincent.guittot@linaro.org> Cc: Dietmar Eggemann <dietmar.eggemann@arm.com> Cc: Josef Bacik <josef@toxicpanda.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Morten Rasmussen <morten.rasmussen@arm.com> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/20171005114516.18617-2-brendan.jackman@arm.com Signed-off-by: Ingo Molnar <mingo@kernel.org> (cherry picked from commit `18bd1b4bd5`) Signed-off-by: Brendan Jackman <brendan.jackman@arm.com>	2017-10-13 13:31:41 +01:00
Brendan Jackman	e019d8f5b5	UPSTREAM: sched/fair: Force balancing on NOHZ balance if local group has capacity The "goto force_balance" here is intended to mitigate the fact that avg_load calculations can result in bad placement decisions when priority is asymmetrical. The original commit that adds it: `fab476228b` ("sched: Force balancing on newidle balance if local group has capacity") explains: Under certain situations, such as a niced down task (i.e. nice = -15) in the presence of nr_cpus NICE0 tasks, the niced task lands on a sched group and kicks away other tasks because of its large weight. This leads to sub-optimal utilization of the machine. Even though the sched group has capacity, it does not pull tasks because sds.this_load >> sds.max_load, and f_b_g() returns NULL. A similar but inverted issue also affects ARM big.LITTLE (asymmetrical CPU capacity) systems - consider 8 always-running, same-priority tasks on a system with 4 "big" and 4 "little" CPUs. Suppose that 5 of them end up on the "big" CPUs (which will be represented by one sched_group in the DIE sched_domain) and 3 on the "little" (the other sched_group in DIE), leaving one CPU unused. Because the "big" group has a higher group_capacity its avg_load may not present an imbalance that would cause migrating a task to the idle "little". The force_balance case here solves the problem but currently only for CPU_NEWLY_IDLE balances, which in theory might never happen on the unused CPU. Including CPU_IDLE in the force_balance case means there's an upper bound on the time before we can attempt to solve the underutilization: after DIE's sd->balance_interval has passed the next nohz balance kick will help us out. Change-Id: I4cbc8ced11e6a56f98ec67633e67e09cbb515268 Signed-off-by: Brendan Jackman <brendan.jackman@arm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Morten Rasmussen <morten.rasmussen@arm.com> Cc: Paul Turner <pjt@google.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/20170807163900.25180-1-brendan.jackman@arm.com Signed-off-by: Ingo Molnar <mingo@kernel.org> (cherry picked from commit `583ffd99d7`) Signed-off-by: Brendan Jackman <brendan.jackman@arm.com>	2017-10-13 13:31:28 +01:00
Vincent Guittot	08c53a651c	UPSTREAM: sched: use load_avg for selecting idlest group find_idlest_group() only compares the runnable_load_avg when looking for the least loaded group. But on fork intensive use case like hackbench where tasks blocked quickly after the fork, this can lead to selecting the same CPU instead of other CPUs, which have similar runnable load but a lower load_avg. When the runnable_load_avg of 2 CPUs are close, we now take into account the amount of blocked load as a 2nd selection factor. There is now 3 zones for the runnable_load of the rq: -[0 .. (runnable_load - imbalance)] : Select the new rq which has significantly less runnable_load -](runnable_load - imbalance) .. (runnable_load + imbalance)[ : The runnable loads are close so we use load_avg to chose between the 2 rq -[(runnable_load + imbalance) .. ULONG_MAX] : Keep the current rq which has significantly less runnable_load The scale factor that is currently used for comparing runnable_load, doesn't work well with small value. As an example, the use of a scaling factor fails as soon as this_runnable_load == 0 because we always select local rq even if min_runnable_load is only 1, which doesn't really make sense because they are just the same. So instead of scaling factor, we use an absolute margin for runnable_load to detect CPUs with similar runnable_load and we keep using scaling factor for blocked load. For use case like hackbench, this enable the scheduler to select different CPUs during the fork sequence and to spread tasks across the system. Tests have been done on a Hikey board (ARM based octo cores) for several kernel. The result below gives min, max, avg and stdev values of 18 runs with each configuration. The v4.8+patches configuration also includes the changes below which is part of the proposal made by Peter to ensure that the clock will be up to date when the fork task will be attached to the rq. @@ -2568,6 +2568,7 @@ void wake_up_new_task(struct task_struct *p) __set_task_cpu(p, select_task_rq(p, task_cpu(p), SD_BALANCE_FORK, 0)); #endif rq = __task_rq_lock(p, &rf); + update_rq_clock(rq); post_init_entity_util_avg(&p->se); activate_task(rq, p, 0); hackbench -P -g 1 `ea86cb4b76` `7dc603c902` v4.8 v4.8+patches min 0.049 0.050 0.051 0,048 avg 0.057 0.057(0%) 0.057(0%) 0,055(+5%) max 0.066 0.068 0.070 0,063 stdev +/-9% +/-9% +/-8% +/-9% Change-Id: I8fe40bbd106454282b1963db825705f3cf377b9e Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org> Tested-by: Matt Fleming <matt@codeblueprint.co.uk> Reviewed-by: Matt Fleming <matt@codeblueprint.co.uk>	2017-10-12 16:41:43 -07:00
Vincent Guittot	ff634002ac	UPSTREAM: sched: fix find_idlest_group for fork During fork, the utilization of a task is init once the rq has been selected because the current utilization level of the rq is used to set the utilization of the fork task. As the task's utilization is still null at this step of the fork sequence, it doesn't make sense to look for some spare capacity that can fit the task's utilization. Furthermore, I can see perf regressions for the test "hackbench -P -g 1" because the least loaded policy is always bypassed and tasks are not spread during fork. With this patch and the fix below, we are back to same performances as for v4.8. The fix below is only a temporary one used for the test until a smarter solution is found because we can't simply remove the test which is useful for others benchmarks @@ -5708,13 +5708,6 @@ static int select_idle_cpu(struct task_struct p, struct sched_domain sd, int t avg_cost = this_sd->avg_scan_cost; - /* - * Due to large variance we need a large fuzz factor; hackbench in - * particularly is sensitive here. - */ - if ((avg_idle / 512) < avg_cost) - return -1; - time = local_clock(); for_each_cpu_wrap(cpu, sched_domain_span(sd), target, wrap) { Change-Id: If7dd1c2a81fd74419733b8ecff519ae8cb8eb32d Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org> Acked-by: Morten Rasmussen <morten.rasmussen@arm.com> Tested-by: Matt Fleming <matt@codeblueprint.co.uk> Reviewed-by: Matt Fleming <matt@codeblueprint.co.uk>	2017-10-12 16:41:35 -07:00
Greg Kroah-Hartman	cdbe07ad26	Merge 4.9.55 into android-4.9 Changes in 4.9.55 USB: gadgetfs: Fix crash caused by inadequate synchronization USB: gadgetfs: fix copy_to_user while holding spinlock usb: gadget: udc: atmel: set vbus irqflags explicitly usb: gadget: udc: renesas_usb3: fix for no-data control transfer usb: gadget: udc: renesas_usb3: fix Pn_RAMMAP.Pn_MPKT value usb: gadget: udc: renesas_usb3: Fix return value of usb3_write_pipe() usb-storage: unusual_devs entry to fix write-access regression for Seagate external drives usb-storage: fix bogus hardware error messages for ATA pass-thru devices usb: renesas_usbhs: fix the BCLR setting condition for non-DCP pipe usb: renesas_usbhs: fix usbhsf_fifo_clear() for RX direction ALSA: usb-audio: Check out-of-bounds access by corrupted buffer descriptor usb: pci-quirks.c: Corrected timeout values used in handshake USB: cdc-wdm: ignore -EPIPE from GetEncapsulatedResponse USB: dummy-hcd: fix connection failures (wrong speed) USB: dummy-hcd: fix infinite-loop resubmission bug USB: dummy-hcd: Fix erroneous synchronization change USB: devio: Don't corrupt user memory usb: gadget: mass_storage: set msg_registered after msg registered USB: g_mass_storage: Fix deadlock when driver is unbound USB: uas: fix bug in handling of alternate settings USB: core: harden cdc_parse_cdc_header usb: Increase quirk delay for USB devices USB: fix out-of-bounds in usb_set_configuration xhci: fix finding correct bus_state structure for USB 3.1 hosts xhci: Fix sleeping with spin_lock_irq() held in ASmedia 1042A workaround xhci: set missing SuperSpeedPlus Link Protocol bit in roothub descriptor Revert "xhci: Limit USB2 port wake support for AMD Promontory hosts" iio: adc: twl4030: Fix an error handling path in 'twl4030_madc_probe()' iio: adc: twl4030: Disable the vusb3v1 rugulator in the error handling path of 'twl4030_madc_probe()' iio: ad_sigma_delta: Implement a dedicated reset function staging: iio: ad7192: Fix - use the dedicated reset function avoiding dma from stack. iio: core: Return error for failed read_reg IIO: BME280: Updates to Humidity readings need ctrl_reg write! iio: ad7793: Fix the serial interface reset iio: adc: mcp320x: Fix readout of negative voltages iio: adc: mcp320x: Fix oops on module unload uwb: properly check kthread_run return value uwb: ensure that endpoint is interrupt staging: vchiq_2835_arm: Fix NULL ptr dereference in free_pagelist mm, oom_reaper: skip mm structs with mmu notifiers lib/ratelimit.c: use deferred printk() version lsm: fix smack_inode_removexattr and xattr_getsecurity memleak ALSA: compress: Remove unused variable Revert "ALSA: echoaudio: purge contradictions between dimension matrix members and total number of members" ALSA: usx2y: Suppress kernel warning at page allocation failures mlxsw: spectrum: Prevent mirred-related crash on removal net: sched: fix use-after-free in tcf_action_destroy and tcf_del_walker sctp: potential read out of bounds in sctp_ulpevent_type_enabled() tcp: update skb->skb_mstamp more carefully bpf/verifier: reject BPF_ALU64\|BPF_END tcp: fix data delivery rate udpv6: Fix the checksum computation when HW checksum does not apply ip6_gre: skb_push ipv6hdr before packing the header in ip6gre_header net: phy: Fix mask value write on gmii2rgmii converter speed register ip6_tunnel: do not allow loading ip6_tunnel if ipv6 is disabled in cmdline net/sched: cls_matchall: fix crash when used with classful qdisc tcp: fastopen: fix on syn-data transmit failure net: emac: Fix napi poll list corruption packet: hold bind lock when rebinding to fanout hook bpf: one perf event close won't free bpf program attached by another perf event isdn/i4l: fetch the ppp_write buffer in one shot net_sched: always reset qdisc backlog in qdisc_reset() net: qcom/emac: specify the correct size when mapping a DMA buffer vti: fix use after free in vti_tunnel_xmit/vti6_tnl_xmit l2tp: Avoid schedule while atomic in exit_net l2tp: fix race condition in l2tp_tunnel_delete tun: bail out from tun_get_user() if the skb is empty net: dsa: Fix network device registration order packet: in packet_do_bind, test fanout with bind_lock held packet: only test po->has_vnet_hdr once in packet_snd net: Set sk_prot_creator when cloning sockets to the right proto netlink: do not proceed if dump's start() errs ip6_gre: ip6gre_tap device should keep dst ip6_tunnel: update mtu properly for ARPHRD_ETHER tunnel device in tx path tipc: use only positive error codes in messages net: rtnetlink: fix info leak in RTM_GETSTATS call socket, bpf: fix possible use after free powerpc/64s: Use emergency stack for kernel TM Bad Thing program checks powerpc/tm: Fix illegal TM state in signal handler percpu: make this_cpu_generic_read() atomic w.r.t. interrupts driver core: platform: Don't read past the end of "driver_override" buffer Drivers: hv: fcopy: restore correct transfer length stm class: Fix a use-after-free ftrace: Fix kmemleak in unregister_ftrace_graph HID: i2c-hid: allocate hid buffers for real worst case HID: wacom: leds: Don't try to control the EKR's read-only LEDs HID: wacom: Always increment hdev refcount within wacom_get_hdev_data HID: wacom: bits shifted too much for 9th and 10th buttons rocker: fix rocker_tlv_put_* functions for KASAN netlink: fix nla_put_{u8,u16,u32} for KASAN iwlwifi: mvm: use IWL_HCMD_NOCOPY for MCAST_FILTER_CMD iwlwifi: add workaround to disable wide channels in 5GHz scsi: sd: Do not override max_sectors_kb sysfs setting brcmfmac: add length check in brcmf_cfg80211_escan_handler() brcmfmac: setup passive scan if requested by user-space drm/i915/bios: ignore HDMI on port A nvme-pci: Use PCI bus address for data/queues in CMB mmc: core: add driver strength selection when selecting hs400es sched/cpuset/pm: Fix cpuset vs. suspend-resume bugs vfs: deny copy_file_range() for non regular files ext4: fix data corruption for mmap writes ext4: Don't clear SGID when inheriting ACLs ext4: don't allow encrypted operations without keys f2fs: don't allow encrypted operations without keys KVM: x86: fix singlestepping over syscall Linux 4.9.55 Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>	2017-10-12 22:31:24 +02:00
Peter Zijlstra	ba15518c26	sched/cpuset/pm: Fix cpuset vs. suspend-resume bugs commit `50e7663233` upstream. Cpusets vs. suspend-resume is _completely_ broken. And it got noticed because it now resulted in non-cpuset usage breaking too. On suspend cpuset_cpu_inactive() doesn't call into cpuset_update_active_cpus() because it doesn't want to move tasks about, there is no need, all tasks are frozen and won't run again until after we've resumed everything. But this means that when we finally do call into cpuset_update_active_cpus() after resuming the last frozen cpu in cpuset_cpu_active(), the top_cpuset will not have any difference with the cpu_active_mask and this it will not in fact do _anything_. So the cpuset configuration will not be restored. This was largely hidden because we would unconditionally create identity domains and mobile users would not in fact use cpusets much. And servers what do use cpusets tend to not suspend-resume much. An addition problem is that we'd not in fact wait for the cpuset work to finish before resuming the tasks, allowing spurious migrations outside of the specified domains. Fix the rebuild by introducing cpuset_force_rebuild() and fix the ordering with cpuset_wait_for_hotplug(). Reported-by: Andy Lutomirski <luto@kernel.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: <stable@vger.kernel.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rafael J. Wysocki <rjw@rjwysocki.net> Cc: Tejun Heo <tj@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Fixes: `deb7aa308e` ("cpuset: reorganize CPU / memory hotplug handling") Link: http://lkml.kernel.org/r/20170907091338.orwxrqkbfkki3c24@hirez.programming.kicks-ass.net Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Mike Galbraith <efault@gmx.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-10-12 11:51:25 +02:00
Shu Wang	a3ec104976	ftrace: Fix kmemleak in unregister_ftrace_graph commit `2b0b8499ae` upstream. The trampoline allocated by function tracer was overwriten by function_graph tracer, and caused a memory leak. The save_global_trampoline should have saved the previous trampoline in register_ftrace_graph() and restored it in unregister_ftrace_graph(). But as it is implemented, save_global_trampoline was only used in unregister_ftrace_graph as default value 0, and it overwrote the previous trampoline's value. Causing the previous allocated trampoline to be lost. kmmeleak backtrace: kmemleak_vmalloc+0x77/0xc0 __vmalloc_node_range+0x1b5/0x2c0 module_alloc+0x7c/0xd0 arch_ftrace_update_trampoline+0xb5/0x290 ftrace_startup+0x78/0x210 register_ftrace_function+0x8b/0xd0 function_trace_init+0x4f/0x80 tracing_set_tracer+0xe6/0x170 tracing_set_trace_write+0x90/0xd0 __vfs_write+0x37/0x170 vfs_write+0xb2/0x1b0 SyS_write+0x55/0xc0 do_syscall_64+0x67/0x180 return_from_SYSCALL_64+0x0/0x6a [ Looking further into this, I found that this was left over from when the function and function graph tracers shared the same ftrace_ops. But in commit `5f151b2401` ("ftrace: Fix function_profiler and function tracer together"), the two were separated, and the save_global_trampoline no longer was necessary (and it may have been broken back then too). -- Steven Rostedt ] Link: http://lkml.kernel.org/r/20170912021454.5976-1-shuwang@redhat.com Fixes: `5f151b2401` ("ftrace: Fix function_profiler and function tracer together") Signed-off-by: Shu Wang <shuwang@redhat.com> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-10-12 11:51:24 +02:00
Yonghong Song	0dee549f79	bpf: one perf event close won't free bpf program attached by another perf event [ Upstream commit `ec9dd352d5` ] This patch fixes a bug exhibited by the following scenario: 1. fd1 = perf_event_open with attr.config = ID1 2. attach bpf program prog1 to fd1 3. fd2 = perf_event_open with attr.config = ID1 <this will be successful> 4. user program closes fd2 and prog1 is detached from the tracepoint. 5. user program with fd1 does not work properly as tracepoint no output any more. The issue happens at step 4. Multiple perf_event_open can be called successfully, but only one bpf prog pointer in the tp_event. In the current logic, any fd release for the same tp_event will free the tp_event->prog. The fix is to free tp_event->prog only when the closing fd corresponds to the one which registered the program. Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-10-12 11:51:21 +02:00
Edward Cree	e159492b3c	bpf/verifier: reject BPF_ALU64\|BPF_END [ Upstream commit `e67b8a685c` ] Neither ___bpf_prog_run nor the JITs accept it. Also adds a new test case. Fixes: `17a5267067` ("bpf: verifier (add verifier core)") Signed-off-by: Edward Cree <ecree@solarflare.com> Acked-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-10-12 11:51:20 +02:00
Joel Fernandes	2b3a26c86b	FROMLIST: tracing: Add support for preempt and irq enable/disable events Preempt and irq trace events can be used for tracing the start and end of an atomic section which can be used by a trace viewer like systrace to graphically view the start and end of an atomic section and correlate them with latencies and scheduling issues. This also serves as a prelude to using synthetic events or probes to rewrite the preempt and irqsoff tracers, along with numerous benefits of using trace events features for these events. Change-Id: I4f992714c0161a415b17a03aa14cce823d8ca3bb Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Peter Zilstra <peterz@infradead.org> Cc: kernel-team@android.com Link: https://patchwork.kernel.org/patch/9988157/ Signed-off-by: Joel Fernandes <joelaf@google.com>	2017-10-06 16:20:25 -07:00
Joel Fernandes	a7e410c135	FROMLIST: tracing: Prepare to add preempt and irq trace events In preparation of adding irqsoff and preemptsoff enable and disable trace events, move required functions and code to make it easier to add these events in a later patch. This patch is just code movement and no functional change. Change-Id: I587d411da5efbc4959bcccd7a05c7a66c231e1e0 Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: kernel-team@android.com Link: https://patchwork.kernel.org/patch/9988159/ Signed-off-by: Joel Fernandes <joelaf@google.com>	2017-10-06 16:19:52 -07:00
Greg Kroah-Hartman	379e3b2a6d	Merge 4.9.53 into android-4.9 Changes in 4.9.53 cifs: release cifs root_cred after exit_cifs cifs: release auth_key.response for reconnect. fs/proc: Report eip/esp in /prod/PID/stat for coredumping mac80211: fix VLAN handling with TXQs mac80211_hwsim: Use proper TX power mac80211: flush hw_roc_start work before cancelling the ROC genirq: Make sparse_irq_lock protect what it should protect KVM: PPC: Book3S: Fix race and leak in kvm_vm_ioctl_create_spapr_tce() KVM: PPC: Book3S HV: Protect updates to spapr_tce_tables list tracing: Fix trace_pipe behavior for instance traces tracing: Erase irqsoff trace with empty write md/raid5: fix a race condition in stripe batch md/raid5: preserve STRIPE_ON_UNPLUG_LIST in break_stripe_batch_list scsi: scsi_transport_iscsi: fix the issue that iscsi_if_rx doesn't parse nlmsg properly drm/radeon: disable hard reset in hibernate for APUs crypto: drbg - fix freeing of resources crypto: talitos - Don't provide setkey for non hmac hashing algs. crypto: talitos - fix sha224 crypto: talitos - fix hashing security/keys: properly zero out sensitive key material in big_key security/keys: rewrite all of big_key crypto KEYS: fix writing past end of user-supplied buffer in keyring_read() KEYS: prevent creating a different user's keyrings KEYS: prevent KEYCTL_READ on negative key powerpc/pseries: Fix parent_dn reference leak in add_dt_node() powerpc/tm: Flush TM only if CPU has TM feature powerpc/ftrace: Pass the correct stack pointer for DYNAMIC_FTRACE_WITH_REGS s390/mm: fix write access check in gup_huge_pmd() PM: core: Fix device_pm_check_callbacks() Fix SMB3.1.1 guest authentication to Samba SMB3: Warn user if trying to sign connection that authenticated as guest SMB: Validate negotiate (to protect against downgrade) even if signing off SMB3: Don't ignore O_SYNC/O_DSYNC and O_DIRECT flags vfs: Return -ENXIO for negative SEEK_HOLE / SEEK_DATA offsets nl80211: check for the required netlink attributes presence bsg-lib: don't free job in bsg_prepare_job iw_cxgb4: remove the stid on listen create failure iw_cxgb4: put ep reference in pass_accept_req() selftests/seccomp: Support glibc 2.26 siginfo_t.h seccomp: fix the usage of get/put_seccomp_filter() in seccomp_get_filter() arm64: Make sure SPsel is always set arm64: fault: Route pte translation faults via do_translation_fault KVM: VMX: extract __pi_post_block KVM: VMX: avoid double list add with VT-d posted interrupts KVM: VMX: simplify and fix vmx_vcpu_pi_load kvm/x86: Handle async PF in RCU read-side critical sections KVM: VMX: Do not BUG() on out-of-bounds guest IRQ kvm: nVMX: Don't allow L2 to access the hardware CR8 xfs: validate bdev support for DAX inode flag etnaviv: fix gem object list corruption PCI: Fix race condition with driver_override btrfs: fix NULL pointer dereference from free_reloc_roots() btrfs: propagate error to btrfs_cmp_data_prepare caller btrfs: prevent to set invalid default subvolid x86/mm: Fix fault error path using unsafe vma pointer x86/fpu: Don't let userspace set bogus xcomp_bv gfs2: Fix debugfs glocks dump timer/sysclt: Restrict timer migration sysctl values to 0 and 1 KVM: VMX: do not change SN bit in vmx_update_pi_irte() KVM: VMX: remove WARN_ON_ONCE in kvm_vcpu_trigger_posted_interrupt cxl: Fix driver use count KVM: VMX: use cmpxchg64 video: fbdev: aty: do not leak uninitialized padding in clk to userspace swiotlb-xen: implement xen_swiotlb_dma_mmap callback Linux 4.9.53 Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>	2017-10-05 10:37:37 +02:00
Myungho Jung	4c00015385	timer/sysclt: Restrict timer migration sysctl values to 0 and 1 commit `b94bf594cf` upstream. timer_migration sysctl acts as a boolean switch, so the allowed values should be restricted to 0 and 1. Add the necessary extra fields to the sysctl table entry to enforce that. [ tglx: Rewrote changelog ] Signed-off-by: Myungho Jung <mhjungk@gmail.com> Link: http://lkml.kernel.org/r/1492640690-3550-1-git-send-email-mhjungk@gmail.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Kazuhiro Hayashi <kazuhiro3.hayashi@toshiba.co.jp> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-10-05 09:44:04 +02:00
Oleg Nesterov	be69c4c00a	seccomp: fix the usage of get/put_seccomp_filter() in seccomp_get_filter() commit `66a733ea6b` upstream. As Chris explains, get_seccomp_filter() and put_seccomp_filter() can end up using different filters. Once we drop ->siglock it is possible for task->seccomp.filter to have been replaced by SECCOMP_FILTER_FLAG_TSYNC. Fixes: `f8e529ed94` ("seccomp, ptrace: add support for dumping seccomp filters") Reported-by: Chris Salls <chrissalls5@gmail.com> Signed-off-by: Oleg Nesterov <oleg@redhat.com> [tycho: add __get_seccomp_filter vs. open coding refcount_inc()] Signed-off-by: Tycho Andersen <tycho@docker.com> [kees: tweak commit log] Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-10-05 09:44:02 +02:00
Bo Yan	5fb4be27da	tracing: Erase irqsoff trace with empty write commit `8dd33bcb70` upstream. One convenient way to erase trace is "echo > trace". However, this is currently broken if the current tracer is irqsoff tracer. This is because irqsoff tracer use max_buffer as the default trace buffer. Set the max_buffer as the one to be cleared when it's the trace buffer currently in use. Link: http://lkml.kernel.org/r/1505754215-29411-1-git-send-email-byan@nvidia.com Cc: <mingo@redhat.com> Fixes: `4acd4d00f` ("tracing: give easy way to clear trace buffer") Signed-off-by: Bo Yan <byan@nvidia.com> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-10-05 09:43:59 +02:00
Tahsin Erdogan	97d402e6ee	tracing: Fix trace_pipe behavior for instance traces commit `75df6e688c` upstream. When reading data from trace_pipe, tracing_wait_pipe() performs a check to see if tracing has been turned off after some data was read. Currently, this check always looks at global trace state, but it should be checking the trace instance where trace_pipe is located at. Because of this bug, cat instances/i1/trace_pipe in the following script will immediately exit instead of waiting for data: cd /sys/kernel/debug/tracing echo 0 > tracing_on mkdir -p instances/i1 echo 1 > instances/i1/tracing_on echo 1 > instances/i1/events/sched/sched_process_exec/enable cat instances/i1/trace_pipe Link: http://lkml.kernel.org/r/20170917102348.1615-1-tahsin@google.com Fixes: `10246fa35d` ("tracing: give easy way to clear trace buffer") Signed-off-by: Tahsin Erdogan <tahsin@google.com> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-10-05 09:43:59 +02:00
Thomas Gleixner	3d5960c8c6	genirq: Make sparse_irq_lock protect what it should protect commit `12ac1d0f6c` upstream. for_each_active_irq() iterates the sparse irq allocation bitmap. The caller must hold sparse_irq_lock. Several code pathes expect that an active bit in the sparse bitmap also has a valid interrupt descriptor. Unfortunately that's not true. The (de)allocation is a two step process, which holds the sparse_irq_lock only across the queue/remove from the radix tree and the set/clear in the allocation bitmap. If a iteration locks sparse_irq_lock between the two steps, then it might see an active bit but the corresponding irq descriptor is NULL. If that is dereferenced unconditionally, then the kernel oopses. Of course, all iterator sites could be audited and fixed, but.... There is no reason why the sparse_irq_lock needs to be dropped between the two steps, in fact the code becomes simpler when the mutex is held across both and the semantics become more straight forward, so future problems of missing NULL pointer checks in the iteration are avoided and all existing sites are fixed in one go. Expand the lock held sections so both operations are covered and the bitmap and the radixtree are in sync. Fixes: `a05a900a51` ("genirq: Make sparse_lock a mutex") Reported-and-tested-by: Huang Ying <ying.huang@intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-10-05 09:43:58 +02:00
Steve Muckle	96a28fcc7c	ANDROID: add script to fetch android kernel config fragments The Android kernel config fragments now live in a separate repository. To prevent others from having to search for this location, add a script to fetch and unpack the fragments. Update .gitignore to include these fragments. Change-Id: If2d4a59b86e4573b0a9b3190025dfe4191870b46 Signed-off-by: Steve Muckle <smuckle@google.com>	2017-10-03 17:19:26 +00:00
Greg Kroah-Hartman	c30c69c76c	Merge 4.9.52 into android-4.9 Changes in 4.9.52 SUNRPC: Refactor svc_set_num_threads() NFSv4: Fix callback server shutdown mm: prevent double decrease of nr_reserved_highatomic orangefs: Don't clear SGID when inheriting ACLs IB/{qib, hfi1}: Avoid flow control testing for RDMA write operation drm/sun4i: Implement drm_driver lastclose to restore fbdev console IB/addr: Fix setting source address in addr6_resolve() tty: improve tty_insert_flip_char() fast path tty: improve tty_insert_flip_char() slow path tty: fix __tty_insert_flip_char regression pinctrl/amd: save pin registers over suspend/resume Input: i8042 - add Gigabyte P57 to the keyboard reset table MIPS: math-emu: <MAX\|MAXA\|MIN\|MINA>.<D\|S>: Fix quiet NaN propagation MIPS: math-emu: <MAX\|MAXA\|MIN\|MINA>.<D\|S>: Fix cases of both inputs zero MIPS: math-emu: <MAX\|MIN>.<D\|S>: Fix cases of both inputs negative MIPS: math-emu: <MAXA\|MINA>.<D\|S>: Fix cases of input values with opposite signs MIPS: math-emu: <MAXA\|MINA>.<D\|S>: Fix cases of both infinite inputs MIPS: math-emu: MINA.<D\|S>: Fix some cases of infinity and zero inputs MIPS: math-emu: Handle zero accumulator case in MADDF and MSUBF separately MIPS: math-emu: <MADDF\|MSUBF>.<D\|S>: Fix NaN propagation MIPS: math-emu: <MADDF\|MSUBF>.<D\|S>: Fix some cases of infinite inputs MIPS: math-emu: <MADDF\|MSUBF>.<D\|S>: Fix some cases of zero inputs MIPS: math-emu: <MADDF\|MSUBF>.<D\|S>: Clean up "maddf_flags" enumeration MIPS: math-emu: <MADDF\|MSUBF>.S: Fix accuracy (32-bit case) MIPS: math-emu: <MADDF\|MSUBF>.D: Fix accuracy (64-bit case) crypto: ccp - Fix XTS-AES-128 support on v5 CCPs crypto: AF_ALG - remove SGL terminator indicator when chaining ext4: fix incorrect quotaoff if the quota feature is enabled ext4: fix quota inconsistency during orphan cleanup for read-only mounts powerpc: Fix DAR reporting when alignment handler faults block: Relax a check in blk_start_queue() md/bitmap: disable bitmap_resize for file-backed bitmaps. skd: Avoid that module unloading triggers a use-after-free skd: Submit requests to firmware before triggering the doorbell scsi: zfcp: fix queuecommand for scsi_eh commands when DIX enabled scsi: zfcp: add handling for FCP_RESID_OVER to the fcp ingress path scsi: zfcp: fix capping of unsuccessful GPN_FT SAN response trace records scsi: zfcp: fix passing fsf_req to SCSI trace on TMF to correlate with HBA scsi: zfcp: fix missing trace records for early returns in TMF eh handlers scsi: zfcp: fix payload with full FCP_RSP IU in SCSI trace records scsi: zfcp: trace HBA FSF response by default on dismiss or timedout late response scsi: zfcp: trace high part of "new" 64 bit SCSI LUN scsi: megaraid_sas: set minimum value of resetwaittime to be 1 secs scsi: megaraid_sas: Check valid aen class range to avoid kernel panic scsi: megaraid_sas: Return pended IOCTLs with cmd_status MFI_STAT_WRONG_STATE in case adapter is dead scsi: storvsc: fix memory leak on ring buffer busy scsi: sg: remove 'save_scat_len' scsi: sg: use standard lists for sg_requests scsi: sg: off by one in sg_ioctl() scsi: sg: factor out sg_fill_request_table() scsi: sg: fixup infoleak when using SG_GET_REQUEST_TABLE scsi: qla2xxx: Correction to vha->vref_count timeout scsi: qla2xxx: Fix an integer overflow in sysfs code ftrace: Fix selftest goto location on error ftrace: Fix memleak when unregistering dynamic ops when tracing disabled tracing: Add barrier to trace_printk() buffer nesting modification tracing: Apply trace_clock changes to instance max buffer ARC: Re-enable MMU upon Machine Check exception PCI: shpchp: Enable bridge bus mastering if MSI is enabled PCI: pciehp: Report power fault only once until we clear it net/netfilter/nf_conntrack_core: Fix net_conntrack_lock() s390/mm: fix local TLB flushing vs. detach of an mm address space s390/mm: fix race on mm->context.flush_mm media: v4l2-compat-ioctl32: Fix timespec conversion media: uvcvideo: Prevent heap overflow when accessing mapped controls PM / devfreq: Fix memory leak when fail to register device bcache: initialize dirty stripes in flash_dev_run() bcache: Fix leak of bdev reference bcache: do not subtract sectors_to_gc for bypassed IO bcache: correct cache_dirty_target in __update_writeback_rate() bcache: Correct return value for sysfs attach errors bcache: fix for gc and write-back race bcache: fix bch_hprint crash and improve output Linux 4.9.52 Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>	2017-09-27 14:56:06 +02:00
Baohong Liu	cf052336d0	tracing: Apply trace_clock changes to instance max buffer commit `170b3b1050` upstream. Currently trace_clock timestamps are applied to both regular and max buffers only for global trace. For instance trace, trace_clock timestamps are applied only to regular buffer. But, regular and max buffers can be swapped, for example, following a snapshot. So, for instance trace, bad timestamps can be seen following a snapshot. Let's apply trace_clock timestamps to instance max buffer as well. Link: http://lkml.kernel.org/r/ebdb168d0be042dcdf51f81e696b17fabe3609c1.1504642143.git.tom.zanussi@linux.intel.com Fixes: `277ba0446` ("tracing: Add interface to allow multiple trace buffers") Signed-off-by: Baohong Liu <baohong.liu@intel.com> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-09-27 14:39:23 +02:00
Steven Rostedt (VMware)	96cf918df4	tracing: Add barrier to trace_printk() buffer nesting modification commit `3d9622c12c` upstream. trace_printk() uses 4 buffers, one for each context (normal, softirq, irq and NMI), such that it does not need to worry about one context preempting the other. There's a nesting counter that gets incremented to figure out which buffer to use. If the context gets preempted by another context which calls trace_printk() it will increment the counter and use the next buffer, and restore the counter when it is finished. The problem is that gcc may optimize the modification of the buffer nesting counter and it may not be incremented in memory before the buffer is used. If this happens, and the context gets interrupted by another context, it could pick the same buffer and corrupt the one that is being used. Compiler barriers need to be added after the nesting variable is incremented and before it is decremented to prevent usage of the context buffers by more than one context at the same time. Cc: Andy Lutomirski <luto@kernel.org> Fixes: `e2ace00117` ("tracing: Choose static tp_printk buffer by explicit nesting count") Hat-tip-to: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-09-27 14:39:23 +02:00
Steven Rostedt (VMware)	100553e197	ftrace: Fix memleak when unregistering dynamic ops when tracing disabled commit `edb096e007` upstream. If function tracing is disabled by the user via the function-trace option or the proc sysctl file, and a ftrace_ops that was allocated on the heap is unregistered, then the shutdown code exits out without doing the proper clean up. This was found via kmemleak and running the ftrace selftests, as one of the tests unregisters with function tracing disabled. # cat kmemleak unreferenced object 0xffffffffa0020000 (size 4096): comm "swapper/0", pid 1, jiffies 4294668889 (age 569.209s) hex dump (first 32 bytes): 55 ff 74 24 10 55 48 89 e5 ff 74 24 18 55 48 89 U.t$.UH...t$.UH. e5 48 81 ec a8 00 00 00 48 89 44 24 50 48 89 4c .H......H.D$PH.L backtrace: [<ffffffff81d64665>] kmemleak_vmalloc+0x85/0xf0 [<ffffffff81355631>] __vmalloc_node_range+0x281/0x3e0 [<ffffffff8109697f>] module_alloc+0x4f/0x90 [<ffffffff81091170>] arch_ftrace_update_trampoline+0x160/0x420 [<ffffffff81249947>] ftrace_startup+0xe7/0x300 [<ffffffff81249bd2>] register_ftrace_function+0x72/0x90 [<ffffffff81263786>] trace_selftest_ops+0x204/0x397 [<ffffffff82bb8971>] trace_selftest_startup_function+0x394/0x624 [<ffffffff81263a75>] run_tracer_selftest+0x15c/0x1d7 [<ffffffff82bb83f1>] init_trace_selftests+0x75/0x192 [<ffffffff81002230>] do_one_initcall+0x90/0x1e2 [<ffffffff82b7d620>] kernel_init_freeable+0x350/0x3fe [<ffffffff81d61ec3>] kernel_init+0x13/0x122 [<ffffffff81d72c6a>] ret_from_fork+0x2a/0x40 [<ffffffffffffffff>] 0xffffffffffffffff Fixes: `12cce594fa` ("ftrace/x86: Allow !CONFIG_PREEMPT dynamic ops to use allocated trampolines") Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-09-27 14:39:23 +02:00
Steven Rostedt (VMware)	df865f86b0	ftrace: Fix selftest goto location on error commit `46320a6acc` upstream. In the second iteration of trace_selftest_ops(), the error goto label is wrong in the case where trace_selftest_test_global_cnt is off. In the case of error, it leaks the dynamic ops that was allocated. Fixes: `95950c2e` ("ftrace: Add self-tests for multiple function trace users") Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-09-27 14:39:23 +02:00
Vincent Guittot	0b4a2f1d16	UPSTREAM: sched/fair: Fix FTQ noise bench regression commit `bc4278987e` upstream. A regression of the FTQ noise has been reported by Ying Huang, on the following hardware: 8 threads Intel(R) Core(TM)i7-4770 CPU @ 3.40GHz with 8G memory ... which was caused by this commit: commit `4e5160766f` ("sched/fair: Propagate asynchrous detach") The only part of the patch that can increase the noise is the update of blocked load of group entity in update_blocked_averages(). We can optimize this call and skip the update of group entity if its load and utilization are already null and there is no pending propagation of load in the task group. This optimization partly restores the noise score. A more agressive optimization has been tried but has shown worse score. Reported-by: ying.huang@linux.intel.com Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: dietmar.eggemann@arm.com Cc: ying.huang@intel.com Fixes: `4e5160766f` ("sched/fair: Propagate asynchrous detach") Link: http://lkml.kernel.org/r/1489758442-2877-1-git-send-email-vincent.guittot@linaro.org [ Fixed typos, improved layout. ] Signed-off-by: Ingo Molnar <mingo@kernel.org> Change-Id: Iba93bbda71125ce14519aa30b1beac2bf3c52e1e Fixes: Change-Id: I5d1a9f273111c0bb7503ca3c9ad2406d12867cb5 ("UPSTREAM: sched/fair: Propagate asynchrous detach") Signed-off-by: Amit Pundir <amit.pundir@linaro.org>	2017-09-23 01:23:27 +00:00
Steve Muckle	faf1269d4b	ANDROID: configs: remove config fragments The kernel config fragments for Android have moved into their own repository located at https://android.googlesource.com/kernel/configs/ Bug: 63994171 Change-Id: I837bac54cb5c90e6a6eb0f6f0ad5c90588c1a46a Signed-off-by: Steve Muckle <smuckle@google.com>	2017-09-15 16:17:26 -07:00
Greg Kroah-Hartman	f7d2974f34	Merge 4.9.50 into android-4.9 Changes in 4.9.50 mtd: nand: mxc: Fix mxc_v1 ooblayout mtd: nand: qcom: fix read failure without complete bootchain mtd: nand: qcom: fix config error for BCH nvme-fabrics: generate spec-compliant UUID NQNs btrfs: resume qgroup rescan on rw remount selftests/x86/fsgsbase: Test selectors 1, 2, and 3 mm/memory.c: fix mem_cgroup_oom_disable() call missing locktorture: Fix potential memory leak with rw lock test ALSA: msnd: Optimize / harden DSP and MIDI loops Bluetooth: Properly check L2CAP config option output buffer length ARM64: dts: marvell: armada-37xx: Fix GIC maintenance interrupt ARM: 8692/1: mm: abort uaccess retries upon fatal signal NFS: Fix 2 use after free issues in the I/O code NFS: Sync the correct byte range during synchronous writes xfs: XFS_IS_REALTIME_INODE() should be false if no rt device present Linux 4.9.50 Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>	2017-09-14 09:14:13 -07:00
Yang Shi	d21f3eaa09	locktorture: Fix potential memory leak with rw lock test commit `f4dbba5919` upstream. When running locktorture module with the below commands with kmemleak enabled: $ modprobe locktorture torture_type=rw_lock_irq $ rmmod locktorture The below kmemleak got caught: root@10:~# echo scan > /sys/kernel/debug/kmemleak [ 323.197029] kmemleak: 2 new suspected memory leaks (see /sys/kernel/debug/kmemleak) root@10:~# cat /sys/kernel/debug/kmemleak unreferenced object 0xffffffc07592d500 (size 128): comm "modprobe", pid 368, jiffies 4294924118 (age 205.824s) hex dump (first 32 bytes): 00 00 00 00 00 00 00 00 c3 7b 02 00 00 00 00 00 .........{...... 00 00 00 00 00 00 00 00 d7 9b 02 00 00 00 00 00 ................ backtrace: [<ffffff80081e5a88>] create_object+0x110/0x288 [<ffffff80086c6078>] kmemleak_alloc+0x58/0xa0 [<ffffff80081d5acc>] __kmalloc+0x234/0x318 [<ffffff80006fa130>] 0xffffff80006fa130 [<ffffff8008083ae4>] do_one_initcall+0x44/0x138 [<ffffff800817e28c>] do_init_module+0x68/0x1cc [<ffffff800811c848>] load_module+0x1a68/0x22e0 [<ffffff800811d340>] SyS_finit_module+0xe0/0xf0 [<ffffff80080836f0>] el0_svc_naked+0x24/0x28 [<ffffffffffffffff>] 0xffffffffffffffff unreferenced object 0xffffffc07592d480 (size 128): comm "modprobe", pid 368, jiffies 4294924118 (age 205.824s) hex dump (first 32 bytes): 00 00 00 00 00 00 00 00 3b 6f 01 00 00 00 00 00 ........;o...... 00 00 00 00 00 00 00 00 23 6a 01 00 00 00 00 00 ........#j...... backtrace: [<ffffff80081e5a88>] create_object+0x110/0x288 [<ffffff80086c6078>] kmemleak_alloc+0x58/0xa0 [<ffffff80081d5acc>] __kmalloc+0x234/0x318 [<ffffff80006fa22c>] 0xffffff80006fa22c [<ffffff8008083ae4>] do_one_initcall+0x44/0x138 [<ffffff800817e28c>] do_init_module+0x68/0x1cc [<ffffff800811c848>] load_module+0x1a68/0x22e0 [<ffffff800811d340>] SyS_finit_module+0xe0/0xf0 [<ffffff80080836f0>] el0_svc_naked+0x24/0x28 [<ffffffffffffffff>] 0xffffffffffffffff It is because cxt.lwsa and cxt.lrsa don't get freed in module_exit, so free them in lock_torture_cleanup() and free writer_tasks if reader_tasks is failed at memory allocation. Signed-off-by: Yang Shi <yang.shi@linaro.org> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Reviewed-by: Josh Triplett <josh@joshtriplett.org> Cc: 石洋 <yang.s@alibaba-inc.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-09-13 14:13:36 -07:00
Steve Muckle	9983305173	ANDROID: configs: require SYNC_FILE CONFIG_SYNC_FILE is required in the absence of CONFIG_SYNC. Change-Id: Ia52488a273c9e7b3080defe8cea19afc4ad21aaa Signed-off-by: Steve Muckle <smuckle@google.com>	2017-09-07 08:47:51 +00:00
Greg Kroah-Hartman	85e1c0178a	Merge 4.9.48 into android-4.9 Changes in 4.9.48 irqchip: mips-gic: SYNC after enabling GIC region i2c: ismt: Don't duplicate the receive length for block reads i2c: ismt: Return EMSGSIZE for block reads with bogus length crypto: algif_skcipher - only call put_page on referenced and used pages mm, uprobes: fix multiple free of ->uprobes_state.xol_area mm, madvise: ensure poisoned pages are removed from per-cpu lists ceph: fix readpage from fscache cpumask: fix spurious cpumask_of_node() on non-NUMA multi-node configs cpuset: Fix incorrect memory_pressure control file mapping alpha: uapi: Add support for __SANE_USERSPACE_TYPES__ CIFS: Fix maximum SMB2 header size CIFS: remove endian related sparse warning wl1251: add a missing spin_lock_init() lib/mpi: kunmap after finishing accessing buffer xfrm: policy: check policy direction value drm/ttm: Fix accounting error when fail to get pages for pool kvm: arm/arm64: Force reading uncached stage2 PGD epoll: fix race between ep_poll_callback(POLLFREE) and ep_free()/ep_remove() Linux 4.9.48 Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>	2017-09-07 10:32:23 +02:00
Waiman Long	309e4dbfaf	cpuset: Fix incorrect memory_pressure control file mapping commit `1c08c22c87` upstream. The memory_pressure control file was incorrectly set up without a private value (0, by default). As a result, this control file was treated like memory_migrate on read. By adding back the FILE_MEMORY_PRESSURE private value, the correct memory pressure value will be returned. Signed-off-by: Waiman Long <longman@redhat.com> Signed-off-by: Tejun Heo <tj@kernel.org> Fixes: `7dbdb199d3` ("cgroup: replace cftype->mode with CFTYPE_WORLD_WRITABLE") Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-09-07 08:35:40 +02:00
Eric Biggers	17c564f629	mm, uprobes: fix multiple free of ->uprobes_state.xol_area commit `355627f518` upstream. Commit `7c05126793` ("mm, fork: make dup_mmap wait for mmap_sem for write killable") made it possible to kill a forking task while it is waiting to acquire its ->mmap_sem for write, in dup_mmap(). However, it was overlooked that this introduced an new error path before the new mm_struct's ->uprobes_state.xol_area has been set to NULL after being copied from the old mm_struct by the memcpy in dup_mm(). For a task that has previously hit a uprobe tracepoint, this resulted in the 'struct xol_area' being freed multiple times if the task was killed at just the right time while forking. Fix it by setting ->uprobes_state.xol_area to NULL in mm_init() rather than in uprobe_dup_mmap(). With CONFIG_UPROBE_EVENTS=y, the bug can be reproduced by the same C program given by commit `2b7e8665b4` ("fork: fix incorrect fput of ->exe_file causing use-after-free"), provided that a uprobe tracepoint has been set on the fork_thread() function. For example: $ gcc reproducer.c -o reproducer -lpthread $ nm reproducer \| grep fork_thread 0000000000400719 t fork_thread $ echo "p $PWD/reproducer:0x719" > /sys/kernel/debug/tracing/uprobe_events $ echo 1 > /sys/kernel/debug/tracing/events/uprobes/enable $ ./reproducer Here is the use-after-free reported by KASAN: BUG: KASAN: use-after-free in uprobe_clear_state+0x1c4/0x200 Read of size 8 at addr ffff8800320a8b88 by task reproducer/198 CPU: 1 PID: 198 Comm: reproducer Not tainted 4.13.0-rc7-00015-g36fde05f3fb5 #255 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-20170228_101828-anatol 04/01/2014 Call Trace: dump_stack+0xdb/0x185 print_address_description+0x7e/0x290 kasan_report+0x23b/0x350 __asan_report_load8_noabort+0x19/0x20 uprobe_clear_state+0x1c4/0x200 mmput+0xd6/0x360 do_exit+0x740/0x1670 do_group_exit+0x13f/0x380 get_signal+0x597/0x17d0 do_signal+0x99/0x1df0 exit_to_usermode_loop+0x166/0x1e0 syscall_return_slowpath+0x258/0x2c0 entry_SYSCALL_64_fastpath+0xbc/0xbe ... Allocated by task 199: save_stack_trace+0x1b/0x20 kasan_kmalloc+0xfc/0x180 kmem_cache_alloc_trace+0xf3/0x330 __create_xol_area+0x10f/0x780 uprobe_notify_resume+0x1674/0x2210 exit_to_usermode_loop+0x150/0x1e0 prepare_exit_to_usermode+0x14b/0x180 retint_user+0x8/0x20 Freed by task 199: save_stack_trace+0x1b/0x20 kasan_slab_free+0xa8/0x1a0 kfree+0xba/0x210 uprobe_clear_state+0x151/0x200 mmput+0xd6/0x360 copy_process.part.8+0x605f/0x65d0 _do_fork+0x1a5/0xbd0 SyS_clone+0x19/0x20 do_syscall_64+0x22f/0x660 return_from_SYSCALL_64+0x0/0x7a Note: without KASAN, you may instead see a "Bad page state" message, or simply a general protection fault. Link: http://lkml.kernel.org/r/20170830033303.17927-1-ebiggers3@gmail.com Fixes: `7c05126793` ("mm, fork: make dup_mmap wait for mmap_sem for write killable") Signed-off-by: Eric Biggers <ebiggers@google.com> Reported-by: Oleg Nesterov <oleg@redhat.com> Acked-by: Oleg Nesterov <oleg@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Konstantin Khlebnikov <koct9i@gmail.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-09-07 08:35:39 +02:00

1 2 3 4 5 ...

23929 Commits