linux

mirror of https://github.com/hardkernel/linux.git synced 2026-04-02 19:23:01 +09:00

Author	SHA1	Message	Date
John Stultz	5311c740c0	UPSTREAM: time: Clean up CLOCK_MONOTONIC_RAW time handling (cherry pick from commit `fc6eead7c1`) Now that we fixed the sub-ns handling for CLOCK_MONOTONIC_RAW, remove the duplicitive tk->raw_time.tv_nsec, which can be stored in tk->tkr_raw.xtime_nsec (similarly to how its handled for monotonic time). Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@kernel.org> Cc: Miroslav Lichvar <mlichvar@redhat.com> Cc: Richard Cochran <richardcochran@gmail.com> Cc: Prarit Bhargava <prarit@redhat.com> Cc: Stephen Boyd <stephen.boyd@linaro.org> Cc: Kevin Brodsky <kevin.brodsky@arm.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Daniel Mentz <danielmentz@google.com> Tested-by: Daniel Mentz <danielmentz@google.com> Signed-off-by: John Stultz <john.stultz@linaro.org> Bug: 20045882 Bug: 63737556 Change-Id: I243827d21b08703a09d2d2fe738a9258be224582	2017-11-29 13:44:16 -08:00
Greg Kroah-Hartman	f108c7d9b5	Merge 4.9.58 into android-4.9 Changes in 4.9.58 MIPS: Fix minimum alignment requirement of IRQ stack Revert "bsg-lib: don't free job in bsg_prepare_job" xen-netback: Use GFP_ATOMIC to allocate hash locking/lockdep: Add nest_lock integrity test watchdog: kempld: fix gcc-4.3 build irqchip/crossbar: Fix incorrect type of local variables initramfs: finish fput() before accessing any binary from initramfs mac80211_hwsim: check HWSIM_ATTR_RADIO_NAME length ALSA: hda: Add Geminilake HDMI codec ID qed: Don't use attention PTT for configuring BW mac80211: fix power saving clients handling in iwlwifi net/mlx4_en: fix overflow in mlx4_en_init_timestamp() staging: vchiq_2835_arm: Make cache-line-size a required DT property netfilter: nf_ct_expect: Change __nf_ct_expect_check() return value. iio: adc: xilinx: Fix error handling f2fs: do SSR for data when there is enough free space sched/fair: Update rq clock before changing a task's CPU affinity Btrfs: send, fix failure to rename top level inode due to name collision f2fs: do not wait for writeback in write_begin md/linear: shutup lockdep warnning sparc64: Migrate hvcons irq to panicked cpu net/mlx4_core: Fix VF overwrite of module param which disables DMFS on new probed PFs crypto: xts - Add ECB dependency mm/memory_hotplug: set magic number to page->freelist instead of page->lru.next ocfs2/dlmglue: prepare tracking logic to avoid recursive cluster lock slub: do not merge cache if slub_debug contains a never-merge flag scsi: scsi_dh_emc: return success in clariion_std_inquiry() ASoC: mediatek: add I2C dependency for CS42XX8 drm/amdgpu: refuse to reserve io mem for split VRAM buffers net: mvpp2: release reference to txq_cpu[] entry after unmapping qede: Prevent index problems in loopback test qed: Reserve doorbell BAR space for present CPUs qed: Read queue state before releasing buffer i2c: at91: ensure state is restored after suspending ceph: don't update_dentry_lease unless we actually got one ceph: fix bogus endianness change in ceph_ioctl_set_layout ceph: clean up unsafe d_parent accesses in build_dentry_path uapi: fix linux/rds.h userspace compilation errors uapi: fix linux/mroute6.h userspace compilation errors IB/hfi1: Use static CTLE with Preset 6 for integrated HFIs IB/hfi1: Allocate context data on memory node target/iscsi: Fix unsolicited data seq_end_offset calculation hrtimer: Catch invalid clockids again nfsd/callback: Cleanup callback cred on shutdown powerpc/perf: Add restrictions to PMC5 in power9 DD1 drm/nouveau/gr/gf100-: fix ccache error logging regulator: core: Resolve supplies before disabling unused regulators btmrvl: avoid double-disable_irq() race EDAC, mce_amd: Print IPID and Syndrome on a separate line cpufreq: CPPC: add ACPI_PROCESSOR dependency usb: dwc3: gadget: Correct ISOC DATA PIDs for short packets Linux 4.9.58 Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>	2017-10-23 09:35:27 +02:00
Marc Zyngier	0c92e73293	hrtimer: Catch invalid clockids again [ Upstream commit `336a9cde10` ] commit `82e88ff1ea` ("hrtimer: Revert CLOCK_MONOTONIC_RAW support") removed unfortunately a sanity check in the hrtimer code which was part of that MONOTONIC_RAW patch series. It would have caught the bogus usage of CLOCK_MONOTONIC_RAW in the wireless code. So bring it back. It is way too easy to take any random clockid and feed it to the hrtimer subsystem. At best, it gets mapped to a monotonic base, but it would be better to just catch illegal values as early as possible. Detect invalid clockids, map them to CLOCK_MONOTONIC and emit a warning. [ tglx: Replaced the BUG by a WARN and gracefully map to CLOCK_MONOTONIC ] Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Cc: Tomasz Nowicki <tn@semihalf.com> Cc: Christoffer Dall <christoffer.dall@linaro.org> Link: http://lkml.kernel.org/r/1452879670-16133-3-git-send-email-marc.zyngier@arm.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-10-21 17:21:38 +02:00
Greg Kroah-Hartman	379e3b2a6d	Merge 4.9.53 into android-4.9 Changes in 4.9.53 cifs: release cifs root_cred after exit_cifs cifs: release auth_key.response for reconnect. fs/proc: Report eip/esp in /prod/PID/stat for coredumping mac80211: fix VLAN handling with TXQs mac80211_hwsim: Use proper TX power mac80211: flush hw_roc_start work before cancelling the ROC genirq: Make sparse_irq_lock protect what it should protect KVM: PPC: Book3S: Fix race and leak in kvm_vm_ioctl_create_spapr_tce() KVM: PPC: Book3S HV: Protect updates to spapr_tce_tables list tracing: Fix trace_pipe behavior for instance traces tracing: Erase irqsoff trace with empty write md/raid5: fix a race condition in stripe batch md/raid5: preserve STRIPE_ON_UNPLUG_LIST in break_stripe_batch_list scsi: scsi_transport_iscsi: fix the issue that iscsi_if_rx doesn't parse nlmsg properly drm/radeon: disable hard reset in hibernate for APUs crypto: drbg - fix freeing of resources crypto: talitos - Don't provide setkey for non hmac hashing algs. crypto: talitos - fix sha224 crypto: talitos - fix hashing security/keys: properly zero out sensitive key material in big_key security/keys: rewrite all of big_key crypto KEYS: fix writing past end of user-supplied buffer in keyring_read() KEYS: prevent creating a different user's keyrings KEYS: prevent KEYCTL_READ on negative key powerpc/pseries: Fix parent_dn reference leak in add_dt_node() powerpc/tm: Flush TM only if CPU has TM feature powerpc/ftrace: Pass the correct stack pointer for DYNAMIC_FTRACE_WITH_REGS s390/mm: fix write access check in gup_huge_pmd() PM: core: Fix device_pm_check_callbacks() Fix SMB3.1.1 guest authentication to Samba SMB3: Warn user if trying to sign connection that authenticated as guest SMB: Validate negotiate (to protect against downgrade) even if signing off SMB3: Don't ignore O_SYNC/O_DSYNC and O_DIRECT flags vfs: Return -ENXIO for negative SEEK_HOLE / SEEK_DATA offsets nl80211: check for the required netlink attributes presence bsg-lib: don't free job in bsg_prepare_job iw_cxgb4: remove the stid on listen create failure iw_cxgb4: put ep reference in pass_accept_req() selftests/seccomp: Support glibc 2.26 siginfo_t.h seccomp: fix the usage of get/put_seccomp_filter() in seccomp_get_filter() arm64: Make sure SPsel is always set arm64: fault: Route pte translation faults via do_translation_fault KVM: VMX: extract __pi_post_block KVM: VMX: avoid double list add with VT-d posted interrupts KVM: VMX: simplify and fix vmx_vcpu_pi_load kvm/x86: Handle async PF in RCU read-side critical sections KVM: VMX: Do not BUG() on out-of-bounds guest IRQ kvm: nVMX: Don't allow L2 to access the hardware CR8 xfs: validate bdev support for DAX inode flag etnaviv: fix gem object list corruption PCI: Fix race condition with driver_override btrfs: fix NULL pointer dereference from free_reloc_roots() btrfs: propagate error to btrfs_cmp_data_prepare caller btrfs: prevent to set invalid default subvolid x86/mm: Fix fault error path using unsafe vma pointer x86/fpu: Don't let userspace set bogus xcomp_bv gfs2: Fix debugfs glocks dump timer/sysclt: Restrict timer migration sysctl values to 0 and 1 KVM: VMX: do not change SN bit in vmx_update_pi_irte() KVM: VMX: remove WARN_ON_ONCE in kvm_vcpu_trigger_posted_interrupt cxl: Fix driver use count KVM: VMX: use cmpxchg64 video: fbdev: aty: do not leak uninitialized padding in clk to userspace swiotlb-xen: implement xen_swiotlb_dma_mmap callback Linux 4.9.53 Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>	2017-10-05 10:37:37 +02:00
Myungho Jung	4c00015385	timer/sysclt: Restrict timer migration sysctl values to 0 and 1 commit `b94bf594cf` upstream. timer_migration sysctl acts as a boolean switch, so the allowed values should be restricted to 0 and 1. Add the necessary extra fields to the sysctl table entry to enforce that. [ tglx: Rewrote changelog ] Signed-off-by: Myungho Jung <mhjungk@gmail.com> Link: http://lkml.kernel.org/r/1492640690-3550-1-git-send-email-mhjungk@gmail.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Kazuhiro Hayashi <kazuhiro3.hayashi@toshiba.co.jp> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-10-05 09:44:04 +02:00
Greg Kroah-Hartman	a3840b1234	Merge 4.9.46 into android-4.9 Changes in 4.9.46 sparc64: remove unnecessary log message af_key: do not use GFP_KERNEL in atomic contexts dccp: purge write queue in dccp_destroy_sock() dccp: defer ccid_hc_tx_delete() at dismantle time ipv4: fix NULL dereference in free_fib_info_rcu() net_sched/sfq: update hierarchical backlog when drop packet net_sched: remove warning from qdisc_hash_add bpf: fix bpf_trace_printk on 32 bit archs openvswitch: fix skb_panic due to the incorrect actions attrlen ptr_ring: use kmalloc_array() ipv4: better IP_MAX_MTU enforcement nfp: fix infinite loop on umapping cleanup sctp: fully initialize the IPv6 address in sctp_v6_to_addr() tipc: fix use-after-free ipv6: reset fn->rr_ptr when replacing route ipv6: repair fib6 tree in failure case tcp: when rearming RTO, if RTO time is in past then fire RTO ASAP net/mlx4_core: Enable 4K UAR if SRIOV module parameter is not enabled irda: do not leak initialized list.dev to userspace net: sched: fix NULL pointer dereference when action calls some targets net_sched: fix order of queue length updates in qdisc_replace() bpf, verifier: add additional patterns to evaluate_reg_imm_alu bpf: adjust verifier heuristics bpf, verifier: fix alu ops against map_value{, _adj} register types bpf: fix mixed signed/unsigned derived min/max value bounds bpf/verifier: fix min/max handling in BPF_SUB Input: trackpoint - add new trackpoint firmware ID Input: elan_i2c - add ELAN0602 ACPI ID to support Lenovo Yoga310 Input: ALPS - fix two-finger scroll breakage in right side on ALPS touchpad KVM: s390: sthyi: fix sthyi inline assembly KVM: s390: sthyi: fix specification exception detection KVM: x86: block guest protection keys unless the host has them enabled ALSA: usb-audio: Add delay quirk for H650e/Jabra 550a USB headsets ALSA: core: Fix unexpected error at replacing user TLV ALSA: hda - Add stereo mic quirk for Lenovo G50-70 (17aa:3978) ALSA: firewire: fix NULL pointer dereference when releasing uninitialized data of iso-resource ARCv2: PAE40: Explicitly set MSB counterpart of SLC region ops addresses mm, shmem: fix handling /sys/kernel/mm/transparent_hugepage/shmem_enabled i2c: designware: Fix system suspend mm/madvise.c: fix freeing of locked page with MADV_FREE fork: fix incorrect fput of ->exe_file causing use-after-free mm/memblock.c: reversed logic in memblock_discard() drm: Release driver tracking before making the object available again drm/atomic: If the atomic check fails, return its value first drm: rcar-du: Fix crash in encoder failure error path drm: rcar-du: Fix display timing controller parameter drm: rcar-du: Fix H/V sync signal polarity configuration tracing: Call clear_boot_tracer() at lateinit_sync tracing: Fix kmemleak in tracing_map_array_free() tracing: Fix freeing of filter in create_filter() when set_str is false kbuild: linker script do not match C names unless LD_DEAD_CODE_DATA_ELIMINATION is configured cifs: Fix df output for users with quota limits cifs: return ENAMETOOLONG for overlong names in cifs_open()/cifs_lookup() nfsd: Limit end of page list when decoding NFSv4 WRITE ftrace: Check for null ret_stack on profile function graph entry function perf/core: Fix group {cpu,task} validation perf probe: Fix --funcs to show correct symbols for offline module perf/x86/intel/rapl: Make package handling more robust timers: Fix excessive granularity of new timers after a nohz idle x86/mm: Fix use-after-free of ldt_struct net: sunrpc: svcsock: fix NULL-pointer exception Revert "leds: handle suspend/resume in heartbeat trigger" netfilter: nat: fix src map lookup Bluetooth: hidp: fix possible might sleep error in hidp_session_thread Bluetooth: cmtp: fix possible might sleep error in cmtp_session Bluetooth: bnep: fix possible might sleep error in bnep_session Revert "android: binder: Sanity check at binder ioctl" binder: use group leader instead of open thread binder: Use wake up hint for synchronous transactions. ANDROID: binder: fix proc->tsk check. iio: imu: adis16480: Fix acceleration scale factor for adis16480 iio: hid-sensor-trigger: Fix the race with user space powering up sensors staging: rtl8188eu: add RNX-N150NUB support Clarify (and fix) MAX_LFS_FILESIZE macros ntb_transport: fix qp count bug ntb_transport: fix bug calculating num_qps_mw NTB: ntb_test: fix bug printing ntb_perf results ntb: no sleep in ntb_async_tx_submit ntb: ntb_test: ensure the link is up before trying to configure the mws ntb: transport shouldn't disable link due to bogus values in SPADs ACPI: ioapic: Clear on-stack resource before using it ACPI / APEI: Add missing synchronize_rcu() on NOTIFY_SCI removal ACPI: EC: Fix regression related to wrong ECDT initialization order powerpc/mm: Ensure cpumask update is ordered Linux 4.9.46 Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>	2017-08-30 15:24:10 +02:00
Nicholas Piggin	70b3fd5ce2	timers: Fix excessive granularity of new timers after a nohz idle commit `2fe59f507a` upstream. When a timer base is idle, it is forwarded when a new timer is added to ensure that granularity does not become excessive. When not idle, the timer tick is expected to increment the base. However there are several problems: - If an existing timer is modified, the base is forwarded only after the index is calculated. - The base is not forwarded by add_timer_on. - There is a window after a timer is restarted from a nohz idle, after it is marked not-idle and before the timer tick on this CPU, where a timer may be added but the ancient base does not get forwarded. These result in excessive granularity (a 1 jiffy timeout can blow out to 100s of jiffies), which cause the rcu lockup detector to trigger, among other things. Fix this by keeping track of whether the timer base has been idle since it was last run or forwarded, and if so then forward it before adding a new timer. There is still a case where mod_timer optimises the case of a pending timer mod with the same expiry time, where the timer can see excessive granularity relative to the new, shorter interval. A comment is added, but it's not changed because it is an important fastpath for networking. This has been tested and found to fix the RCU softlockup messages. Testing was also done with tracing to measure requested versus achieved wakeup latencies for all non-deferrable timers in an idle system (with no lockup watchdogs running). Wakeup latency relative to absolute latency is calculated (note this suffers from round-up skew at low absolute times) and analysed: max avg std upstream 506.0 1.20 4.68 patched 2.0 1.08 0.15 The bug was noticed due to the lockup detector Kconfig changes dropping it out of people's .configs and resulting in larger base clk skew When the lockup detectors are enabled, no CPU can go idle for longer than 4 seconds, which limits the granularity errors. Sub-optimal timer behaviour is observable on a smaller scale in that case: max avg std upstream 9.0 1.05 0.19 patched 2.0 1.04 0.11 Fixes: Fixes: `a683f390b9` ("timers: Forward the wheel clock whenever possible") Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Tested-by: David Miller <davem@davemloft.net> Cc: dzickus@redhat.com Cc: sfr@canb.auug.org.au Cc: mpe@ellerman.id.au Cc: Stephen Boyd <sboyd@codeaurora.org> Cc: linuxarm@huawei.com Cc: abdhalee@linux.vnet.ibm.com Cc: John Stultz <john.stultz@linaro.org> Cc: akpm@linux-foundation.org Cc: paulmck@linux.vnet.ibm.com Cc: torvalds@linux-foundation.org Link: http://lkml.kernel.org/r/20170822084348.21436-1-npiggin@gmail.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-08-30 10:21:51 +02:00
Greg Kroah-Hartman	02f29ab1b9	Merge 4.9.42 into android-4.9 Changes in 4.9.42 parisc: Handle vma's whose context is not current in flush_cache_range cgroup: create dfl_root files on subsys registration cgroup: fix error return value from cgroup_subtree_control() libata: array underflow in ata_find_dev() workqueue: restore WQ_UNBOUND/max_active==1 to be ordered iwlwifi: dvm: prevent an out of bounds access brcmfmac: fix memleak due to calling brcmf_sdiod_sgtable_alloc() twice NFSv4: Fix EXCHANGE_ID corrupt verifier issue mmc: sdhci-of-at91: force card detect value for non removable devices device property: Make dev_fwnode() public mmc: core: Fix access to HS400-ES devices mm, mprotect: flush TLB if potentially racing with a parallel reclaim leaving stale TLB entries cpuset: fix a deadlock due to incomplete patching of cpusets_enabled() ALSA: hda - Fix speaker output from VAIO VPCL14M1R drm/amdgpu: Fix undue fallthroughs in golden registers initialization ASoC: do not close shared backend dailink KVM: async_pf: make rcu irq exit if not triggered from idle task mm/page_alloc: Remove kernel address exposure in free_reserved_area() timers: Fix overflow in get_next_timer_interrupt powerpc/tm: Fix saving of TM SPRs in core dump powerpc/64: Fix __check_irq_replay missing decrementer interrupt iommu/amd: Enable ga_log_intr when enabling guest_mode gpiolib: skip unwanted events, don't convert them to opposite edge ext4: fix SEEK_HOLE/SEEK_DATA for blocksize < pagesize ext4: fix overflow caused by missing cast in ext4_resize_fs() ARM: dts: armada-38x: Fix irq type for pca955 ARM: dts: tango4: Request RGMII RX and TX clock delays media: platform: davinci: return -EINVAL for VPFE_CMD_S_CCDC_RAW_PARAMS ioctl iscsi-target: Fix initial login PDU asynchronous socket close OOPs mmc: dw_mmc: Use device_property_read instead of of_property_read mmc: core: Use device_property_read instead of of_property_read media: lirc: LIRC_GET_REC_RESOLUTION should return microseconds f2fs: sanity check checkpoint segno and blkoff Btrfs: fix early ENOSPC due to delalloc saa7164: fix double fetch PCIe access condition tcp_bbr: cut pacing rate only if filled pipe tcp_bbr: introduce bbr_bw_to_pacing_rate() helper tcp_bbr: introduce bbr_init_pacing_rate_from_rtt() helper tcp_bbr: remove sk_pacing_rate=0 transient during init tcp_bbr: init pacing rate on first RTT sample ipv4: ipv6: initialize treq->txhash in cookie_v[46]_check() net: Zero terminate ifr_name in dev_ifname(). ipv6: avoid overflow of offset in ip6_find_1stfragopt net: dsa: b53: Add missing ARL entries for BCM53125 ipv4: initialize fib_trie prior to register_netdev_notifier call. rtnetlink: allocate more memory for dev_set_mac_address() mcs7780: Fix initialization when CONFIG_VMAP_STACK is enabled openvswitch: fix potential out of bound access in parse_ct packet: fix use-after-free in prb_retire_rx_blk_timer_expired() ipv6: Don't increase IPSTATS_MIB_FRAGFAILS twice in ip6_fragment() net: ethernet: nb8800: Handle all 4 RGMII modes identically dccp: fix a memleak that dccp_ipv6 doesn't put reqsk properly dccp: fix a memleak that dccp_ipv4 doesn't put reqsk properly dccp: fix a memleak for dccp_feat_init err process sctp: don't dereference ptr before leaving _sctp_walk_{params, errors}() sctp: fix the check for _sctp_walk_params and _sctp_walk_errors net/mlx5: Consider tx_enabled in all modes on remap net/mlx5: Fix command bad flow on command entry allocation failure net/mlx5e: Fix outer_header_zero() check size net/mlx5e: Fix wrong delay calculation for overflow check scheduling net/mlx5e: Schedule overflow check work to mlx5e workqueue net: phy: Correctly process PHY_HALTED in phy_stop_machine() xen-netback: correctly schedule rate-limited queues sparc64: Measure receiver forward progress to avoid send mondo timeout sparc64: Fix exception handling in UltraSPARC-III memcpy. wext: handle NULL extra data in iwe_stream_add_point better sh_eth: fix EESIPR values for SH77{34\|63} sh_eth: R8A7740 supports packet shecksumming net: phy: dp83867: fix irq generation tg3: Fix race condition in tg3_get_stats64(). x86/boot: Add missing declaration of string functions spi: spi-axi: Free resources on error path ASoC: rt5645: set sel_i2s_pre_div1 to 2 netfilter: use fwmark_reflect in nf_send_reset phy state machine: failsafe leave invalid RUNNING state ipv4: make tcp_notsent_lowat sysctl knob behave as true unsigned int clk/samsung: exynos542x: mark some clocks as critical scsi: qla2xxx: Get mutex lock before checking optrom_state drm/virtio: fix framebuffer sparse warning ARM: dts: sun8i: Support DTB build for NanoPi M1 ARM: dts: sunxi: Change node name for pwrseq pin on Olinuxino-lime2-emmc iw_cxgb4: do not send RX_DATA_ACK CPLs after close/abort nbd: blk_mq_init_queue returns an error code on failure, not NULL virtio_blk: fix panic in initialization error path ARM: 8632/1: ftrace: fix syscall name matching mm, slab: make sure that KMALLOC_MAX_SIZE will fit into MAX_ORDER lib/Kconfig.debug: fix frv build failure signal: protect SIGNAL_UNKILLABLE from unintentional clearing. mm: don't dereference struct page fields of invalid pages net/mlx5: E-Switch, Re-enable RoCE on mode change only after FDB destroy ipv4: Should use consistent conditional judgement for ip fragment in __ip_append_data and ip_finish_output net: account for current skb length when deciding about UFO net: phy: Fix PHY unbind crash workqueue: implicit ordered attribute should be overridable Linux 4.9.42 Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>	2017-08-11 13:55:02 -07:00
Matija Glavinic Pecotic	9ef8b23b94	timers: Fix overflow in get_next_timer_interrupt commit `34f41c0316` upstream. For e.g. HZ=100, timer being 430 jiffies in the future, and 32 bit unsigned int, there is an overflow on unsigned int right-hand side of the expression which results with wrong values being returned. Type cast the multiplier to 64bit to avoid that issue. Fixes: `46c8f0b077` ("timers: Fix get_next_timer_interrupt() computation") Signed-off-by: Matija Glavinic Pecotic <matija.glavinic-pecotic.ext@nokia.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Alexander Sverdlin <alexander.sverdlin@nokia.com> Cc: khilman@baylibre.com Cc: akpm@linux-foundation.org Link: http://lkml.kernel.org/r/a7900f04-2a21-c9fd-67be-ab334d459ee5@nokia.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-08-11 08:49:30 -07:00
Chris Redpath	595ae4adc5	UPSTREAM: cpufreq: schedutil: Avoid reducing frequency of busy CPUs prematurely The way the schedutil governor uses the PELT metric causes it to underestimate the CPU utilization in some cases. That can be easily demonstrated by running kernel compilation on a Sandy Bridge Intel processor, running turbostat in parallel with it and looking at the values written to the MSR_IA32_PERF_CTL register. Namely, the expected result would be that when all CPUs were 100% busy, all of them would be requested to run in the maximum P-state, but observation shows that this clearly isn't the case. The CPUs run in the maximum P-state for a while and then are requested to run slower and go back to the maximum P-state after a while again. That causes the actual frequency of the processor to visibly oscillate below the sustainable maximum in a jittery fashion which clearly is not desirable. That has been attributed to CPU utilization metric updates on task migration that cause the total utilization value for the CPU to be reduced by the utilization of the migrated task. If that happens, the schedutil governor may see a CPU utilization reduction and will attempt to reduce the CPU frequency accordingly right away. That may be premature, though, for example if the system is generally busy and there are other runnable tasks waiting to be run on that CPU already. This is unlikely to be an issue on systems where cpufreq policies are shared between multiple CPUs, because in those cases the policy utilization is computed as the maximum of the CPU utilization values over the whole policy and if that turns out to be low, reducing the frequency for the policy most likely is a good idea anyway. On systems with one CPU per policy, however, it may affect performance adversely and even lead to increased energy consumption in some cases. On those systems it may be addressed by taking another utilization metric into consideration, like whether or not the CPU whose frequency is about to be reduced has been idle recently, because if that's not the case, the CPU is likely to be busy in the near future and its frequency should not be reduced. To that end, use the counter of idle calls in the timekeeping code. Namely, make the schedutil governor look at that counter for the current CPU every time before its frequency is about to be reduced. If the counter has not changed since the previous iteration of the governor computations for that CPU, the CPU has been busy for all that time and its frequency should not be decreased, so if the new frequency would be lower than the one set previously, the governor will skip the frequency update. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Reviewed-by: Joel Fernandes <joelaf@google.com> (cherry picked from commit `b7eaf1aab9`) (simple CPUFREQ_RT_DL vs CPUFREQ_DL usage conflicts) Signed-off-by: Chris Redpath <chris.redpath@arm.com> Change-Id: I531ec02c052944ee07a904dc2a25c59948ee762b Signed-off-by: Quentin Perret <quentin.perret@arm.com>	2017-08-03 15:30:48 -07:00
Greg Kroah-Hartman	9ae2c670d8	Merge 4.9.40 into android-4.9 Changes in 4.9.40 disable new gcc-7.1.1 warnings for now ir-core: fix gcc-7 warning on bool arithmetic dm mpath: cleanup -Wbool-operation warning in choose_pgpath() s5p-jpeg: don't return a random width/height thermal: max77620: fix device-node reference imbalance thermal: cpu_cooling: Avoid accessing potentially freed structures ath9k: fix tx99 use after free ath9k: fix tx99 bus error ath9k: fix an invalid pointer dereference in ath9k_rng_stop() NFC: fix broken device allocation NFC: nfcmrvl_uart: add missing tty-device sanity check NFC: nfcmrvl: do not use device-managed resources NFC: nfcmrvl: use nfc-device for firmware download NFC: nfcmrvl: fix firmware-management initialisation nfc: Ensure presence of required attributes in the activate_target handler nfc: Fix the sockaddr length sanitization in llcp_sock_connect NFC: Add sockaddr length checks before accessing sa_family in bind handlers perf intel-pt: Move decoder error setting into one condition perf intel-pt: Improve sample timestamp perf intel-pt: Fix missing stack clear perf intel-pt: Ensure IP is zero when state is INTEL_PT_STATE_NO_IP perf intel-pt: Fix last_ip usage perf intel-pt: Ensure never to set 'last_ip' when packet 'count' is zero perf intel-pt: Use FUP always when scanning for an IP perf intel-pt: Clear FUP flag on error Bluetooth: use constant time memory comparison for secret values wlcore: fix 64K page support btrfs: Don't clear SGID when inheriting ACLs igb: Explicitly select page 0 at initialization ASoC: compress: Derive substream from stream based on direction PM / Domains: Fix unsafe iteration over modified list of device links PM / Domains: Fix unsafe iteration over modified list of domain providers PM / Domains: Fix unsafe iteration over modified list of domains scsi: ses: do not add a device to an enclosure if enclosure_add_links() fails. scsi: Add STARGET_CREATED_REMOVE state to scsi_target_state iscsi-target: Add login_keys_workaround attribute for non RFC initiators xen/scsiback: Fix a TMR related use-after-free powerpc/pseries: Fix passing of pp0 in updatepp() and updateboltedpp() powerpc/64: Fix atomic64_inc_not_zero() to return an int powerpc: Fix emulation of mcrf in emulate_step() powerpc: Fix emulation of mfocrf in emulate_step() powerpc/asm: Mark cr0 as clobbered in mftb() powerpc/mm/radix: Properly clear process table entry af_key: Fix sadb_x_ipsecrequest parsing PCI: Work around poweroff & suspend-to-RAM issue on Macbook Pro 11 PCI: rockchip: Use normal register bank for config accessors PCI/PM: Restore the status of PCI devices across hibernation ipvs: SNAT packet replies only for NATed connections xhci: fix 20000ms port resume timeout xhci: Fix NULL pointer dereference when cleaning up streams for removed host xhci: Bad Ethernet performance plugged in ASM1042A host mxl111sf: Fix driver to use heap allocate buffers for USB messages usb: storage: return on error to avoid a null pointer dereference USB: cdc-acm: add device-id for quirky printer usb: renesas_usbhs: fix usbhsc_resume() for !USBHSF_RUNTIME_PWCTRL usb: renesas_usbhs: gadget: disable all eps when the driver stops md: don't use flush_signals in userspace processes x86/xen: allow userspace access during hypercalls cx88: Fix regression in initial video standard setting libnvdimm, btt: fix btt_rw_page not returning errors libnvdimm: fix badblock range handling of ARS range ext2: Don't clear SGID when inheriting ACLs Raid5 should update rdev->sectors after reshape s390/syscalls: Fix out of bounds arguments access drm/amd/amdgpu: Return error if initiating read out of range on vram drm/radeon/ci: disable mclk switching for high refresh rates (v2) drm/radeon: Fix eDP for single-display iMac10,1 (v2) ipmi: use rcu lock around call to intf->handlers->sender() ipmi:ssif: Add missing unlock in error branch xfs: Don't clear SGID when inheriting ACLs f2fs: sanity check size of nat and sit cache f2fs: Don't clear SGID when inheriting ACLs drm/ttm: Fix use-after-free in ttm_bo_clean_mm ovl: drop CAP_SYS_RESOURCE from saved mounter's credentials vfio: Fix group release deadlock vfio: New external user group/file match nvme-rdma: remove race conditions from IB signalling ftrace: Fix uninitialized variable in match_records() MIPS: Fix mips_atomic_set() retry condition MIPS: Fix mips_atomic_set() with EVA MIPS: Negate error syscall return in trace ubifs: Don't leak kernel memory to the MTD ACPI / EC: Drop EC noirq hooks to fix a regression Revert "ACPI / EC: Enable event freeze mode..." to fix a regression x86/acpi: Prevent out of bound access caused by broken ACPI tables x86/ioapic: Pass the correct data to unmask_ioapic_irq() MIPS: Fix MIPS I ISA /proc/cpuinfo reporting MIPS: Save static registers before sysmips MIPS: Actually decode JALX in `__compute_return_epc_for_insn' MIPS: Fix unaligned PC interpretation in `compute_return_epc' MIPS: math-emu: Prevent wrong ISA mode instruction emulation MIPS: Send SIGILL for BPOSGE32 in `__compute_return_epc_for_insn' MIPS: Rename `sigill_r6' to `sigill_r2r6' in `__compute_return_epc_for_insn' MIPS: Send SIGILL for linked branches in `__compute_return_epc_for_insn' MIPS: Send SIGILL for R6 branches in `__compute_return_epc_for_insn' MIPS: Fix a typo: s/preset/present/ in r2-to-r6 emulation error message Input: i8042 - fix crash at boot time IB/iser: Fix connection teardown race condition IB/core: Namespace is mandatory input for address resolution sunrpc: use constant time memory comparison for mac NFS: only invalidate dentrys that are clearly invalid. udf: Fix deadlock between writeback and udf_setsize() target: Fix COMPARE_AND_WRITE caw_sem leak during se_cmd quiesce iser-target: Avoid isert_conn->cm_id dereference in isert_login_recv_done perf annotate: Fix broken arrow at row 0 connecting jmp instruction to its target Revert "perf/core: Drop kernel samples even though :u is specified" staging: rtl8188eu: add TL-WN722N v2 support staging: comedi: ni_mio_common: fix AO timer off-by-one regression staging: sm750fb: avoid conflicting vesafb staging: lustre: ko2iblnd: check copy_from_iter/copy_to_iter return code ceph: fix race in concurrent readdir RDMA/core: Initialize port_num in qp_attr drm/mst: Fix error handling during MST sideband message reception drm/mst: Avoid dereferencing a NULL mstb in drm_dp_mst_handle_up_req() drm/mst: Avoid processing partially received up/down message transactions mlx5: Avoid that mlx5_ib_sg_to_klms() overflows the klms[] array hfsplus: Don't clear SGID when inheriting ACLs ovl: fix random return value on mount acpi/nfit: Fix memory corruption/Unregister mce decoder on failure of: device: Export of_device_{get_modalias, uvent_modalias} to modules spmi: Include OF based modalias in device uevent reiserfs: Don't clear SGID when inheriting ACLs PM / Domains: defer dev_pm_domain_set() until genpd->attach_dev succeeds if present tracing: Fix kmemleak in instance_rmdir alarmtimer: don't rate limit one-shot timers Linux 4.9.40 Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>	2017-07-27 15:24:43 -07:00
Greg Hackmann	91af5f04cd	alarmtimer: don't rate limit one-shot timers Commit `ff86bf0c65` ("alarmtimer: Rate limit periodic intervals") sets a minimum bound on the alarm timer interval. This minimum bound shouldn't be applied if the interval is 0. Otherwise, one-shot timers will be converted into periodic ones. Fixes: `ff86bf0c65` ("alarmtimer: Rate limit periodic intervals") Reported-by: Ben Fennema <fennema@google.com> Signed-off-by: Greg Hackmann <ghackmann@google.com> Cc: stable@vger.kernel.org Cc: John Stultz <john.stultz@linaro.org> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-07-27 15:08:08 -07:00
Greg Kroah-Hartman	75d78c7eda	Merge 4.9.35 into android-4.9 Changes in 4.9.35 clk: sunxi-ng: a31: Correct lcd1-ch1 clock register offset xen/blkback: fix disconnect while I/Os in flight xen-blkback: don't leak stack data via response ring ALSA: firewire-lib: Fix stall of process context at packet error ALSA: pcm: Don't treat NULL chmap as a fatal error fs/exec.c: account for argv/envp pointers powerpc/perf: Fix oops when kthread execs user process autofs: sanity check status reported with AUTOFS_DEV_IOCTL_FAIL lib/cmdline.c: fix get_options() overflow while parsing ranges perf/x86/intel: Add 1G DTLB load/store miss support for SKL KVM: s390: gaccess: fix real-space designation asce handling for gmap shadows KVM: PPC: Book3S HV: Preserve userspace HTM state properly KVM: PPC: Book3S HV: Context-switch EBB registers properly CIFS: Improve readdir verbosity cxgb4: notify uP to route ctrlq compl to rdma rspq HID: Add quirk for Dell PIXART OEM mouse signal: Only reschedule timers on signals timers have sent powerpc/kprobes: Pause function_graph tracing during jprobes handling powerpc/64s: Handle data breakpoints in Radix mode Input: i8042 - add Fujitsu Lifebook AH544 to notimeout list brcmfmac: add parameter to pass error code in firmware callback brcmfmac: use firmware callback upon failure to load brcmfmac: unbind all devices upon failure in firmware callback time: Fix clock->read(clock) race around clocksource changes time: Fix CLOCK_MONOTONIC_RAW sub-nanosecond accounting arm64/vdso: Fix nsec handling for CLOCK_MONOTONIC_RAW target: Fix kref->refcount underflow in transport_cmd_finish_abort iscsi-target: Fix delayed logout processing greater than SECONDS_FOR_LOGOUT_COMP iscsi-target: Reject immediate data underflow larger than SCSI transfer length drm/radeon: add a PX quirk for another K53TK variant drm/radeon: add a quirk for Toshiba Satellite L20-183 drm/amdgpu/atom: fix ps allocation size for EnableDispPowerGating drm/amdgpu: adjust default display clock rxrpc: Fix several cases where a padded len isn't checked in ticket decode of: Add check to of_scan_flat_dt() before accessing initial_boot_params mtd: spi-nor: fix spansion quad enable usb: gadget: f_fs: avoid out of bounds access on comp_desc rt2x00: avoid introducing a USB dependency in the rt2x00lib module net: phy: Initialize mdio clock at probe function dmaengine: bcm2835: Fix cyclic DMA period splitting spi: double time out tolerance net: phy: fix marvell phy status reading jump label: fix passing kbuild_cflags when checking for asm goto support brcmfmac: fix uninitialized warning in brcmf_usb_probe_phase2() Linux 4.9.35 Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>	2017-06-29 14:26:07 +02:00
John Stultz	a53bfdda06	time: Fix CLOCK_MONOTONIC_RAW sub-nanosecond accounting commit `3d88d56c58` upstream. Due to how the MONOTONIC_RAW accumulation logic was handled, there is the potential for a 1ns discontinuity when we do accumulations. This small discontinuity has for the most part gone un-noticed, but since ARM64 enabled CLOCK_MONOTONIC_RAW in their vDSO clock_gettime implementation, we've seen failures with the inconsistency-check test in kselftest. This patch addresses the issue by using the same sub-ns accumulation handling that CLOCK_MONOTONIC uses, which avoids the issue for in-kernel users. Since the ARM64 vDSO implementation has its own clock_gettime calculation logic, this patch reduces the frequency of errors, but failures are still seen. The ARM64 vDSO will need to be updated to include the sub-nanosecond xtime_nsec values in its calculation for this issue to be completely fixed. Signed-off-by: John Stultz <john.stultz@linaro.org> Tested-by: Daniel Mentz <danielmentz@google.com> Cc: Prarit Bhargava <prarit@redhat.com> Cc: Kevin Brodsky <kevin.brodsky@arm.com> Cc: Richard Cochran <richardcochran@gmail.com> Cc: Stephen Boyd <stephen.boyd@linaro.org> Cc: Will Deacon <will.deacon@arm.com> Cc: Miroslav Lichvar <mlichvar@redhat.com> Link: http://lkml.kernel.org/r/1496965462-20003-3-git-send-email-john.stultz@linaro.org Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-29 13:00:30 +02:00
John Stultz	02a37ccd63	time: Fix clock->read(clock) race around clocksource changes commit `ceea5e3771` upstream. In tests, which excercise switching of clocksources, a NULL pointer dereference can be observed on AMR64 platforms in the clocksource read() function: u64 clocksource_mmio_readl_down(struct clocksource *c) { return ~(u64)readl_relaxed(to_mmio_clksrc(c)->reg) & c->mask; } This is called from the core timekeeping code via: cycle_now = tkr->read(tkr->clock); tkr->read is the cached tkr->clock->read() function pointer. When the clocksource is changed then tkr->clock and tkr->read are updated sequentially. The code above results in a sequential load operation of tkr->read and tkr->clock as well. If the store to tkr->clock hits between the loads of tkr->read and tkr->clock, then the old read() function is called with the new clock pointer. As a consequence the read() function dereferences a different data structure and the resulting 'reg' pointer can point anywhere including NULL. This problem was introduced when the timekeeping code was switched over to use struct tk_read_base. Before that, it was theoretically possible as well when the compiler decided to reload clock in the code sequence: now = tk->clock->read(tk->clock); Add a helper function which avoids the issue by reading tk_read_base->clock once into a local variable clk and then issue the read function via clk->read(clk). This guarantees that the read() function always gets the proper clocksource pointer handed in. Since there is now no use for the tkr.read pointer, this patch also removes it, and to address stopping the fast timekeeper during suspend/resume, it introduces a dummy clocksource to use rather then just a dummy read function. Signed-off-by: John Stultz <john.stultz@linaro.org> Acked-by: Ingo Molnar <mingo@kernel.org> Cc: Prarit Bhargava <prarit@redhat.com> Cc: Richard Cochran <richardcochran@gmail.com> Cc: Stephen Boyd <stephen.boyd@linaro.org> Cc: Miroslav Lichvar <mlichvar@redhat.com> Cc: Daniel Mentz <danielmentz@google.com> Link: http://lkml.kernel.org/r/1496965462-20003-2-git-send-email-john.stultz@linaro.org Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-29 13:00:30 +02:00
Greg Kroah-Hartman	7172a93a70	Merge 4.9.34 into android-4.9 Changes in 4.9.34 fs: pass on flags in compat_writev configfs: Fix race between create_link and configfs_rmdir can: gs_usb: fix memory leak in gs_cmd_reset() ila_xlat: add missing hash secret initialization cpufreq: conservative: Allow down_threshold to take values from 1 to 10 vb2: Fix an off by one error in 'vb2_plane_vaddr' mac80211: don't look at the PM bit of BAR frames mac80211/wpa: use constant time memory comparison for MACs drm/amdgpu: Fix overflow of watermark calcs at > 4k resolutions. drm/i915: Fix GVT-g PVINFO version compatibility check usb: musb: dsps: keep VBUS on for host-only mode mac80211: fix CSA in IBSS mode mac80211: fix packet statistics for fast-RX mac80211: fix IBSS presp allocation size mac80211: strictly check mesh address extension mode mac80211: fix dropped counter in multiqueue RX mac80211: don't send SMPS action frame in AP mode when not needed drm/mediatek: fix mtk_hdmi_setup_vendor_specific_infoframe mistake drm/vc4: Fix OOPSes from trying to cache a partially constructed BO. serial: efm32: Fix parity management in 'efm32_uart_console_get_options()' serial: sh-sci: Fix late enablement of AUTORTS x86/mm/32: Set the '__vmalloc_start_set' flag in initmem_init() mfd: omap-usb-tll: Fix inverted bit use for USB TLL mode staging: rtl8188eu: prevent an underflow in rtw_check_beacon_data() staging: iio: tsl2x7x_core: Fix standard deviation calculation iio: st_pressure: Fix data sign iio: proximity: as3935: recalibrate RCO after resume iio: adc: ti_am335x_adc: allocating too much in probe IB/mlx5: Fix kernel to user leak prevention logic usb: gadget: udc: renesas_usb3: fix pm_runtime functions calling usb: gadget: udc: renesas_usb3: fix deadlock by spinlock usb: gadget: udc: renesas_usb3: lock for PN_ registers access USB: hub: fix SS max number of ports usb: core: fix potential memory leak in error path during hcd creation USB: usbip: fix nonconforming hub descriptor pvrusb2: reduce stack usage pvr2_eeprom_analyze() USB: gadget: dummy_hcd: fix hub-descriptor removable fields usb: r8a66597-hcd: select a different endpoint on timeout usb: r8a66597-hcd: decrease timeout ath10k: fix napi crash during rmmod when probe firmware fails misc: mic: double free on ioctl error path drivers/misc/c2port/c2port-duramar2150.c: checking for NULL instead of IS_ERR() usb: xhci: Fix USB 3.1 supported protocol parsing usb: xhci: ASMedia ASM1042A chipset need shorts TX quirk USB: gadget: fix GPF in gadgetfs USB: gadgetfs, dummy-hcd, net2280: fix locking for callbacks mm/memory-failure.c: use compound_head() flags for huge pages swap: cond_resched in swap_cgroup_prepare() iio: imu: inv_mpu6050: add accel lpf setting for chip >= MPU6500 sched/core: Idle_task_exit() shouldn't use switch_mm_irqs_off() genirq: Release resources in __setup_irq() error path alarmtimer: Prevent overflow of relative timers usb: gadget: composite: Fix function used to free memory usb: dwc3: exynos fix axius clock error path to do cleanup MIPS: Fix bnezc/jialc return address calculation MIPS: .its targets depend on vmlinux vTPM: Fix missing NULL check crypto: Work around deallocated stack frame reference gcc bug on sparc. alarmtimer: Rate limit periodic intervals mm: larger stack guard gap, between vmas Allow stack to grow up to address space limit mm: fix new crash in unmapped_area_topdown() Linux 4.9.34 Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>	2017-06-27 10:43:56 +02:00
Thomas Gleixner	04651048c7	alarmtimer: Rate limit periodic intervals commit `ff86bf0c65` upstream. The alarmtimer code has another source of potentially rearming itself too fast. Interval timers with a very samll interval have a similar CPU hog effect as the previously fixed overflow issue. The reason is that alarmtimers do not implement the normal protection against this kind of problem which the other posix timer use: timer expires -> queue signal -> deliver signal -> rearm timer This scheme brings the rearming under scheduler control and prevents permanently firing timers which hog the CPU. Bringing this scheme to the alarm timer code is a major overhaul because it lacks all the necessary mechanisms completely. So for a quick fix limit the interval to one jiffie. This is not problematic in practice as alarmtimers are usually backed by an RTC for suspend which have 1 second resolution. It could be therefor argued that the resolution of this clock should be set to 1 second in general, but that's outside the scope of this fix. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Kostya Serebryany <kcc@google.com> Cc: syzkaller <syzkaller@googlegroups.com> Cc: John Stultz <john.stultz@linaro.org> Cc: Dmitry Vyukov <dvyukov@google.com> Link: http://lkml.kernel.org/r/20170530211655.896767100@linutronix.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-24 07:11:17 +02:00
Thomas Gleixner	8ee7f06f4d	alarmtimer: Prevent overflow of relative timers commit `f4781e76f9` upstream. Andrey reported a alartimer related RCU stall while fuzzing the kernel with syzkaller. The reason for this is an overflow in ktime_add() which brings the resulting time into negative space and causes immediate expiry of the timer. The following rearm with a small interval does not bring the timer back into positive space due to the same issue. This results in a permanent firing alarmtimer which hogs the CPU. Use ktime_add_safe() instead which detects the overflow and clamps the result to KTIME_SEC_MAX. Reported-by: Andrey Konovalov <andreyknvl@google.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Kostya Serebryany <kcc@google.com> Cc: syzkaller <syzkaller@googlegroups.com> Cc: John Stultz <john.stultz@linaro.org> Cc: Dmitry Vyukov <dvyukov@google.com> Link: http://lkml.kernel.org/r/20170530211655.802921648@linutronix.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-24 07:11:17 +02:00
Todd Kjos	0455ac9d3c	Merge branch 'upstream-linux-4.9.y' into android-4.9	2017-02-23 14:29:44 -08:00
Sergey Senozhatsky	215d4d62cc	timekeeping: Use deferred printk() in debug code commit `f222449c9d` upstream. We cannot do printk() from tk_debug_account_sleep_time(), because tk_debug_account_sleep_time() is called under tk_core seq lock. The reason why printk() is unsafe there is that console_sem may invoke scheduler (up()->wake_up_process()->activate_task()), which, in turn, can return back to timekeeping code, for instance, via get_time()->ktime_get(), deadlocking the system on tk_core seq lock. [ 48.950592] ====================================================== [ 48.950622] [ INFO: possible circular locking dependency detected ] [ 48.950622] 4.10.0-rc7-next-20170213+ #101 Not tainted [ 48.950622] ------------------------------------------------------- [ 48.950622] kworker/0:0/3 is trying to acquire lock: [ 48.950653] (tk_core){----..}, at: [<c01cc624>] retrigger_next_event+0x4c/0x90 [ 48.950683] but task is already holding lock: [ 48.950683] (hrtimer_bases.lock){-.-...}, at: [<c01cc610>] retrigger_next_event+0x38/0x90 [ 48.950714] which lock already depends on the new lock. [ 48.950714] the existing dependency chain (in reverse order) is: [ 48.950714] -> #5 (hrtimer_bases.lock){-.-...}: [ 48.950744] _raw_spin_lock_irqsave+0x50/0x64 [ 48.950775] lock_hrtimer_base+0x28/0x58 [ 48.950775] hrtimer_start_range_ns+0x20/0x5c8 [ 48.950775] __enqueue_rt_entity+0x320/0x360 [ 48.950805] enqueue_rt_entity+0x2c/0x44 [ 48.950805] enqueue_task_rt+0x24/0x94 [ 48.950836] ttwu_do_activate+0x54/0xc0 [ 48.950836] try_to_wake_up+0x248/0x5c8 [ 48.950836] __setup_irq+0x420/0x5f0 [ 48.950836] request_threaded_irq+0xdc/0x184 [ 48.950866] devm_request_threaded_irq+0x58/0xa4 [ 48.950866] omap_i2c_probe+0x530/0x6a0 [ 48.950897] platform_drv_probe+0x50/0xb0 [ 48.950897] driver_probe_device+0x1f8/0x2cc [ 48.950897] __driver_attach+0xc0/0xc4 [ 48.950927] bus_for_each_dev+0x6c/0xa0 [ 48.950927] bus_add_driver+0x100/0x210 [ 48.950927] driver_register+0x78/0xf4 [ 48.950958] do_one_initcall+0x3c/0x16c [ 48.950958] kernel_init_freeable+0x20c/0x2d8 [ 48.950958] kernel_init+0x8/0x110 [ 48.950988] ret_from_fork+0x14/0x24 [ 48.950988] -> #4 (&rt_b->rt_runtime_lock){-.-...}: [ 48.951019] _raw_spin_lock+0x40/0x50 [ 48.951019] rq_offline_rt+0x9c/0x2bc [ 48.951019] set_rq_offline.part.2+0x2c/0x58 [ 48.951049] rq_attach_root+0x134/0x144 [ 48.951049] cpu_attach_domain+0x18c/0x6f4 [ 48.951049] build_sched_domains+0xba4/0xd80 [ 48.951080] sched_init_smp+0x68/0x10c [ 48.951080] kernel_init_freeable+0x160/0x2d8 [ 48.951080] kernel_init+0x8/0x110 [ 48.951080] ret_from_fork+0x14/0x24 [ 48.951110] -> #3 (&rq->lock){-.-.-.}: [ 48.951110] _raw_spin_lock+0x40/0x50 [ 48.951141] task_fork_fair+0x30/0x124 [ 48.951141] sched_fork+0x194/0x2e0 [ 48.951141] copy_process.part.5+0x448/0x1a20 [ 48.951171] _do_fork+0x98/0x7e8 [ 48.951171] kernel_thread+0x2c/0x34 [ 48.951171] rest_init+0x1c/0x18c [ 48.951202] start_kernel+0x35c/0x3d4 [ 48.951202] 0x8000807c [ 48.951202] -> #2 (&p->pi_lock){-.-.-.}: [ 48.951232] _raw_spin_lock_irqsave+0x50/0x64 [ 48.951232] try_to_wake_up+0x30/0x5c8 [ 48.951232] up+0x4c/0x60 [ 48.951263] __up_console_sem+0x2c/0x58 [ 48.951263] console_unlock+0x3b4/0x650 [ 48.951263] vprintk_emit+0x270/0x474 [ 48.951293] vprintk_default+0x20/0x28 [ 48.951293] printk+0x20/0x30 [ 48.951324] kauditd_hold_skb+0x94/0xb8 [ 48.951324] kauditd_thread+0x1a4/0x56c [ 48.951324] kthread+0x104/0x148 [ 48.951354] ret_from_fork+0x14/0x24 [ 48.951354] -> #1 ((console_sem).lock){-.....}: [ 48.951385] _raw_spin_lock_irqsave+0x50/0x64 [ 48.951385] down_trylock+0xc/0x2c [ 48.951385] __down_trylock_console_sem+0x24/0x80 [ 48.951385] console_trylock+0x10/0x8c [ 48.951416] vprintk_emit+0x264/0x474 [ 48.951416] vprintk_default+0x20/0x28 [ 48.951416] printk+0x20/0x30 [ 48.951446] tk_debug_account_sleep_time+0x5c/0x70 [ 48.951446] __timekeeping_inject_sleeptime.constprop.3+0x170/0x1a0 [ 48.951446] timekeeping_resume+0x218/0x23c [ 48.951477] syscore_resume+0x94/0x42c [ 48.951477] suspend_enter+0x554/0x9b4 [ 48.951477] suspend_devices_and_enter+0xd8/0x4b4 [ 48.951507] enter_state+0x934/0xbd4 [ 48.951507] pm_suspend+0x14/0x70 [ 48.951507] state_store+0x68/0xc8 [ 48.951538] kernfs_fop_write+0xf4/0x1f8 [ 48.951538] __vfs_write+0x1c/0x114 [ 48.951538] vfs_write+0xa0/0x168 [ 48.951568] SyS_write+0x3c/0x90 [ 48.951568] __sys_trace_return+0x0/0x10 [ 48.951568] -> #0 (tk_core){----..}: [ 48.951599] lock_acquire+0xe0/0x294 [ 48.951599] ktime_get_update_offsets_now+0x5c/0x1d4 [ 48.951629] retrigger_next_event+0x4c/0x90 [ 48.951629] on_each_cpu+0x40/0x7c [ 48.951629] clock_was_set_work+0x14/0x20 [ 48.951660] process_one_work+0x2b4/0x808 [ 48.951660] worker_thread+0x3c/0x550 [ 48.951660] kthread+0x104/0x148 [ 48.951690] ret_from_fork+0x14/0x24 [ 48.951690] other info that might help us debug this: [ 48.951690] Chain exists of: tk_core --> &rt_b->rt_runtime_lock --> hrtimer_bases.lock [ 48.951721] Possible unsafe locking scenario: [ 48.951721] CPU0 CPU1 [ 48.951721] ---- ---- [ 48.951721] lock(hrtimer_bases.lock); [ 48.951751] lock(&rt_b->rt_runtime_lock); [ 48.951751] lock(hrtimer_bases.lock); [ 48.951751] lock(tk_core); [ 48.951782] * DEADLOCK * [ 48.951782] 3 locks held by kworker/0:0/3: [ 48.951782] #0: ("events"){.+.+.+}, at: [<c0156590>] process_one_work+0x1f8/0x808 [ 48.951812] #1: (hrtimer_work){+.+...}, at: [<c0156590>] process_one_work+0x1f8/0x808 [ 48.951843] #2: (hrtimer_bases.lock){-.-...}, at: [<c01cc610>] retrigger_next_event+0x38/0x90 [ 48.951843] stack backtrace: [ 48.951873] CPU: 0 PID: 3 Comm: kworker/0:0 Not tainted 4.10.0-rc7-next-20170213+ [ 48.951904] Workqueue: events clock_was_set_work [ 48.951904] [<c0110208>] (unwind_backtrace) from [<c010c224>] (show_stack+0x10/0x14) [ 48.951934] [<c010c224>] (show_stack) from [<c04ca6c0>] (dump_stack+0xac/0xe0) [ 48.951934] [<c04ca6c0>] (dump_stack) from [<c019b5cc>] (print_circular_bug+0x1d0/0x308) [ 48.951965] [<c019b5cc>] (print_circular_bug) from [<c019d2a8>] (validate_chain+0xf50/0x1324) [ 48.951965] [<c019d2a8>] (validate_chain) from [<c019ec18>] (__lock_acquire+0x468/0x7e8) [ 48.951995] [<c019ec18>] (__lock_acquire) from [<c019f634>] (lock_acquire+0xe0/0x294) [ 48.951995] [<c019f634>] (lock_acquire) from [<c01d0ea0>] (ktime_get_update_offsets_now+0x5c/0x1d4) [ 48.952026] [<c01d0ea0>] (ktime_get_update_offsets_now) from [<c01cc624>] (retrigger_next_event+0x4c/0x90) [ 48.952026] [<c01cc624>] (retrigger_next_event) from [<c01e4e24>] (on_each_cpu+0x40/0x7c) [ 48.952056] [<c01e4e24>] (on_each_cpu) from [<c01cafc4>] (clock_was_set_work+0x14/0x20) [ 48.952056] [<c01cafc4>] (clock_was_set_work) from [<c015664c>] (process_one_work+0x2b4/0x808) [ 48.952087] [<c015664c>] (process_one_work) from [<c0157774>] (worker_thread+0x3c/0x550) [ 48.952087] [<c0157774>] (worker_thread) from [<c015d644>] (kthread+0x104/0x148) [ 48.952087] [<c015d644>] (kthread) from [<c0107830>] (ret_from_fork+0x14/0x24) Replace printk() with printk_deferred(), which does not call into the scheduler. Fixes: `0bf43f15db` ("timekeeping: Prints the amounts of time spent during suspend") Reported-and-tested-by: Tony Lindgren <tony@atomide.com> Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> Cc: Petr Mladek <pmladek@suse.com> Cc: Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: "Rafael J . Wysocki" <rjw@rjwysocki.net> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: John Stultz <john.stultz@linaro.org> Link: http://lkml.kernel.org/r/20170215044332.30449-1-sergey.senozhatsky@gmail.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-02-23 17:44:36 +01:00
Dmitry Shmidt	cd08287396	Merge tag 'v4.9.6' into android-4.9 This is the 4.9.6 stable release Change-Id: I318df4b9d706d50c13fe3969d734117c25fc94bc	2017-01-31 13:55:27 -08:00
Joel Fernandes	72408a6328	UPSTREAM: timekeeping: Add a fast and NMI safe boot clock This boot clock can be used as a tracing clock and will account for suspend time. To keep it NMI safe since we're accessing from tracing, we're not using a separate timekeeper with updates to monotonic clock and boot offset protected with seqlocks. This has the following minor side effects: (1) Its possible that a timestamp be taken after the boot offset is updated but before the timekeeper is updated. If this happens, the new boot offset is added to the old timekeeping making the clock appear to update slightly earlier: CPU 0 CPU 1 timekeeping_inject_sleeptime64() __timekeeping_inject_sleeptime(tk, delta); timestamp(); timekeeping_update(tk, TK_CLEAR_NTP...); (2) On 32-bit systems, the 64-bit boot offset (tk->offs_boot) may be partially updated. Since the tk->offs_boot update is a rare event, this should be a rare occurrence which postprocessing should be able to handle. Bug: b/33184060 Change-Id: If79be2ed9d7a25ac39805b1fd81743026fc96575 Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: Richard Cochran <richardcochran@gmail.com> Cc: Prarit Bhargava <prarit@redhat.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Joel Fernandes <joelaf@google.com> Signed-off-by: John Stultz <john.stultz@linaro.org>	2017-01-27 13:55:47 -08:00
Thomas Gleixner	cf365b1173	tick/broadcast: Prevent NULL pointer dereference commit `c1a9eeb938` upstream. When a disfunctional timer, e.g. dummy timer, is installed, the tick core tries to setup the broadcast timer. If no broadcast device is installed, the kernel crashes with a NULL pointer dereference in tick_broadcast_setup_oneshot() because the function has no sanity check. Reported-by: Mason <slash.tmp@free.fr> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Anna-Maria Gleixner <anna-maria@linutronix.de> Cc: Richard Cochran <rcochran@linutronix.de> Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Cc: Daniel Lezcano <daniel.lezcano@linaro.org> Cc: Peter Zijlstra <peterz@infradead.org>, Cc: Sebastian Frias <sf84@laposte.net> Cc: Thibaud Cornic <thibaud_cornic@sigmadesigns.com> Cc: Robin Murphy <robin.murphy@arm.com> Link: http://lkml.kernel.org/r/1147ef90-7877-e4d2-bb2b-5c4fa8d3144b@free.fr Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-01-12 11:39:45 +01:00
Thomas Gleixner	ca22975afa	timekeeping_Force_unsigned_clocksource_to_nanoseconds_conversion commit `9c1645727b` upstream. The clocksource delta to nanoseconds conversion is using signed math, but the delta is unsigned. This makes the conversion space smaller than necessary and in case of a multiplication overflow the conversion can become negative. The conversion is done with scaled math: s64 nsec_delta = ((s64)clkdelta * clk->mult) >> clk->shift; Shifting a signed integer right obvioulsy preserves the sign, which has interesting consequences: - Time jumps backwards - __iter_div_u64_rem() which is used in one of the calling code pathes will take forever to piecewise calculate the seconds/nanoseconds part. This has been reported by several people with different scenarios: David observed that when stopping a VM with a debugger: "It was essentially the stopped by debugger case. I forget exactly why, but the guest was being explicitly stopped from outside, it wasn't just scheduling lag. I think it was something in the vicinity of 10 minutes stopped." When lifting the stop the machine went dead. The stopped by debugger case is not really interesting, but nevertheless it would be a good thing not to die completely. But this was also observed on a live system by Liav: "When the OS is too overloaded, delta will get a high enough value for the msb of the sum delta * tkr->mult + tkr->xtime_nsec to be set, and so after the shift the nsec variable will gain a value similar to 0xffffffffff000000." Unfortunately this has been reintroduced recently with commit `6bd58f09e1` ("time: Add cycles to nanoseconds translation"). It had been fixed a year ago already in commit `35a4933a89` ("time: Avoid signed overflow in timekeeping_get_ns()"). Though it's not surprising that the issue has been reintroduced because the function itself and the whole call chain uses s64 for the result and the propagation of it. The change in this recent commit is subtle: s64 nsec; - nsec = (d * m + n) >> s: + nsec = d * m + n; + nsec >>= s; d being type of cycle_t adds another level of obfuscation. This wouldn't have happened if the previous change to unsigned computation would have made the 'nsec' variable u64 right away and a follow up patch had cleaned up the whole call chain. There have been patches submitted which basically did a revert of the above patch leaving everything else unchanged as signed. Back to square one. This spawned a admittedly pointless discussion about potential users which rely on the unsigned behaviour until someone pointed out that it had been fixed before. The changelogs of said patches added further confusion as they made finally false claims about the consequences for eventual users which expect signed results. Despite delta being cycle_t, aka. u64, it's very well possible to hand in a signed negative value and the signed computation will happily return the correct result. But nobody actually sat down and analyzed the code which was added as user after the propably unintended signed conversion. Though in sensitive code like this it's better to analyze it proper and make sure that nothing relies on this than hunting the subtle wreckage half a year later. After analyzing all call chains it stands that no caller can hand in a negative value (which actually would work due to the s64 cast) and rely on the signed math to do the right thing. Change the conversion function to unsigned math. The conversion of all call chains is done in a follow up patch. This solves the starvation issue, which was caused by the negative result, but it does not solve the underlying problem. It merily procrastinates it. When the timekeeper update is deferred long enough that the unsigned multiplication overflows, then time going backwards is observable again. It does neither solve the issue of clocksources with a small counter width which will wrap around possibly several times and cause random time stamps to be generated. But those are usually not found on systems used for virtualization, so this is likely a non issue. I took the liberty to claim authorship for this simply because analyzing all callsites and writing the changelog took substantially more time than just making the simple s/s64/u64/ change and ignore the rest. Fixes: `6bd58f09e1` ("time: Add cycles to nanoseconds translation") Reported-by: David Gibson <david@gibson.dropbear.id.au> Reported-by: Liav Rehana <liavr@mellanox.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Parit Bhargava <prarit@redhat.com> Cc: Laurent Vivier <lvivier@redhat.com> Cc: "Christopher S. Hall" <christopher.s.hall@intel.com> Cc: Chris Metcalf <cmetcalf@mellanox.com> Cc: Richard Cochran <richardcochran@gmail.com> Cc: John Stultz <john.stultz@linaro.org> Link: http://lkml.kernel.org/r/20161208204228.688545601@linutronix.de Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-01-09 08:32:17 +01:00
Thomas Gleixner	6bad6bccf2	timers: Prevent base clock corruption when forwarding When a timer is enqueued we try to forward the timer base clock. This mechanism has two issues: 1) Forwarding a remote base unlocked The forwarding function is called from get_target_base() with the current timer base lock held. But if the new target base is a different base than the current base (can happen with NOHZ, sigh!) then the forwarding is done on an unlocked base. This can lead to corruption of base->clk. Solution is simple: Invoke the forwarding after the target base is locked. 2) Possible corruption due to jiffies advancing This is similar to the issue in get_net_timer_interrupt() which was fixed in the previous patch. jiffies can advance between check and assignement and therefore advancing base->clk beyond the next expiry value. So we need to read jiffies into a local variable once and do the checks and assignment with the local copy. Fixes: a683f390b93f("timers: Forward the wheel clock whenever possible") Reported-by: Ashton Holmes <scoopta@gmail.com> Reported-by: Michael Thayer <michael.thayer@oracle.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Michal Necasek <michal.necasek@oracle.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: knut.osmundsen@oracle.com Cc: stable@vger.kernel.org Cc: stern@rowland.harvard.edu Cc: rt@linutronix.de Link: http://lkml.kernel.org/r/20161022110552.253640125@linutronix.de Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2016-10-25 16:32:50 +02:00
Thomas Gleixner	041ad7bc75	timers: Prevent base clock rewind when forwarding clock Ashton and Michael reported, that kernel versions 4.8 and later suffer from USB timeouts which are caused by the timer wheel rework. This is caused by a bug in the base clock forwarding mechanism, which leads to timers expiring early. The scenario which leads to this is: run_timers() while (jiffies >= base->clk) { collect_expired_timers(); base->clk++; expire_timers(); } So base->clk = jiffies + 1. Now the cpu goes idle: idle() get_next_timer_interrupt() nextevt = __next_time_interrupt(); if (time_after(nextevt, base->clk)) base->clk = jiffies; jiffies has not advanced since run_timers(), so this assignment effectively decrements base->clk by one. base->clk is the index into the timer wheel arrays. So let's assume the following state after the base->clk increment in run_timers(): jiffies = 0 base->clk = 1 A timer gets enqueued with an expiry delta of 63 ticks (which is the case with the USB timeout and HZ=250) so the resulting bucket index is: base->clk + delta = 1 + 63 = 64 The timer goes into the first wheel level. The array size is 64 so it ends up in bucket 0, which is correct as it takes 63 ticks to advance base->clk to index into bucket 0 again. If the cpu goes idle before jiffies advance, then the bug in the forwarding mechanism sets base->clk back to 0, so the next invocation of run_timers() at the next tick will index into bucket 0 and therefore expire the timer 62 ticks too early. Instead of blindly setting base->clk to jiffies we must make the forwarding conditional on jiffies > base->clk, but we cannot use jiffies for this as we might run into the following issue: if (time_after(jiffies, base->clk) { if (time_after(nextevt, base->clk)) base->clk = jiffies; jiffies can increment between the check and the assigment far enough to advance beyond nextevt. So we need to use a stable value for checking. get_next_timer_interrupt() has the basej argument which is the jiffies value snapshot taken in the calling code. So we can just that. Thanks to Ashton for bisecting and providing trace data! Fixes: `a683f390b9` ("timers: Forward the wheel clock whenever possible") Reported-by: Ashton Holmes <scoopta@gmail.com> Reported-by: Michael Thayer <michael.thayer@oracle.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Michal Necasek <michal.necasek@oracle.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: knut.osmundsen@oracle.com Cc: stable@vger.kernel.org Cc: stern@rowland.harvard.edu Cc: rt@linutronix.de Link: http://lkml.kernel.org/r/20161022110552.175308322@linutronix.de Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2016-10-25 16:32:50 +02:00
Thomas Gleixner	4da9152a43	timers: Lock base for same bucket optimization Linus stumbled over the unlocked modification of the timer expiry value in mod_timer() which is an optimization for timers which stay in the same bucket - due to the bucket granularity - despite their expiry time getting updated. The optimization itself still makes sense even if we take the lock, because in case that the bucket stays the same, we avoid the pointless queue/enqueue dance. Make the check and the modification of timer->expires protected by the base lock and shuffle the remaining code around so we can keep the lock held when we actually have to requeue the timer to a different bucket. Fixes: `f00c0afdfa` ("timers: Implement optimization for same expiry time in mod_timer()") Reported-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/alpine.DEB.2.20.1610241711220.4983@nanos Cc: stable@vger.kernel.org Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org>	2016-10-25 16:27:39 +02:00
Thomas Gleixner	b831275a35	timers: Plug locking race vs. timer migration Linus noticed that lock_timer_base() lacks a READ_ONCE() for accessing the timer flags. As a consequence the compiler is allowed to reload the flags between the initial check for TIMER_MIGRATION and the following timer base computation and the spin lock of the base. While this has not been observed (yet), we need to make sure that it never happens. Fixes: `0eeda71bc3` ("timer: Replace timer base by a cpu index") Reported-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/alpine.DEB.2.20.1610241711220.4983@nanos Cc: stable@vger.kernel.org Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org>	2016-10-25 16:27:39 +02:00
Tobias Klauser	54e23845e9	alarmtimer: Remove unused but set variable Remove the set but unused variable base in alarm_clock_get to fix the following warning when building with 'W=1': kernel/time/alarmtimer.c: In function ‘alarm_timer_create’: kernel/time/alarmtimer.c:545:21: warning: variable ‘base’ set but not used [-Wunused-but-set-variable] Signed-off-by: Tobias Klauser <tklauser@distanz.ch> Cc: John Stultz <john.stultz@linaro.org> Link: http://lkml.kernel.org/r/20161017094702.10873-1-tklauser@distanz.ch Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2016-10-17 11:59:37 +02:00
Linus Torvalds	9ffc66941d	Merge tag 'gcc-plugins-v4.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux Pull gcc plugins update from Kees Cook: "This adds a new gcc plugin named "latent_entropy". It is designed to extract as much possible uncertainty from a running system at boot time as possible, hoping to capitalize on any possible variation in CPU operation (due to runtime data differences, hardware differences, SMP ordering, thermal timing variation, cache behavior, etc). At the very least, this plugin is a much more comprehensive example for how to manipulate kernel code using the gcc plugin internals" * tag 'gcc-plugins-v4.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: latent_entropy: Mark functions with __latent_entropy gcc-plugins: Add latent_entropy plugin	2016-10-15 10:03:15 -07:00
Emese Revfy	0766f788eb	latent_entropy: Mark functions with __latent_entropy The __latent_entropy gcc attribute can be used only on functions and variables. If it is on a function then the plugin will instrument it for gathering control-flow entropy. If the attribute is on a variable then the plugin will initialize it with random contents. The variable must be an integer, an integer array type or a structure with integer fields. These specific functions have been selected because they are init functions (to help gather boot-time entropy), are called at unpredictable times, or they have variable loops, each of which provide some level of latent entropy. Signed-off-by: Emese Revfy <re.emese@gmail.com> [kees: expanded commit message] Signed-off-by: Kees Cook <keescook@chromium.org>	2016-10-10 14:51:45 -07:00
John Stultz	58bfea9532	timekeeping: Fix __ktime_get_fast_ns() regression In commit `27727df240` ("Avoid taking lock in NMI path with CONFIG_DEBUG_TIMEKEEPING"), I changed the logic to open-code the timekeeping_get_ns() function, but I forgot to include the unit conversion from cycles to nanoseconds, breaking the function's output, which impacts users like perf. This results in bogus perf timestamps like: swapper 0 [000] 253.427536: 111111111 cpu-clock: ffffffff810a0de6 native_safe_halt+0x6 ([kernel.kallsyms]) swapper 0 [000] 254.426573: 111111111 cpu-clock: ffffffff810a0de6 native_safe_halt+0x6 ([kernel.kallsyms]) swapper 0 [000] 254.426687: 111111111 cpu-clock: ffffffff810a0de6 native_safe_halt+0x6 ([kernel.kallsyms]) swapper 0 [000] 254.426800: 111111111 cpu-clock: ffffffff810a0de6 native_safe_halt+0x6 ([kernel.kallsyms]) swapper 0 [000] 254.426905: 111111111 cpu-clock: ffffffff810a0de6 native_safe_halt+0x6 ([kernel.kallsyms]) swapper 0 [000] 254.427022: 111111111 cpu-clock: ffffffff810a0de6 native_safe_halt+0x6 ([kernel.kallsyms]) swapper 0 [000] 254.427127: 111111111 cpu-clock: ffffffff810a0de6 native_safe_halt+0x6 ([kernel.kallsyms]) swapper 0 [000] 254.427239: 111111111 cpu-clock: ffffffff810a0de6 native_safe_halt+0x6 ([kernel.kallsyms]) swapper 0 [000] 254.427346: 111111111 cpu-clock: ffffffff810a0de6 native_safe_halt+0x6 ([kernel.kallsyms]) swapper 0 [000] 254.427463: 111111111 cpu-clock: ffffffff810a0de6 native_safe_halt+0x6 ([kernel.kallsyms]) swapper 0 [000] 255.426572: 111111111 cpu-clock: ffffffff810a0de6 native_safe_halt+0x6 ([kernel.kallsyms]) Instead of more reasonable expected timestamps like: swapper 0 [000] 39.953768: 111111111 cpu-clock: ffffffff810a0de6 native_safe_halt+0x6 ([kernel.kallsyms]) swapper 0 [000] 40.064839: 111111111 cpu-clock: ffffffff810a0de6 native_safe_halt+0x6 ([kernel.kallsyms]) swapper 0 [000] 40.175956: 111111111 cpu-clock: ffffffff810a0de6 native_safe_halt+0x6 ([kernel.kallsyms]) swapper 0 [000] 40.287103: 111111111 cpu-clock: ffffffff810a0de6 native_safe_halt+0x6 ([kernel.kallsyms]) swapper 0 [000] 40.398217: 111111111 cpu-clock: ffffffff810a0de6 native_safe_halt+0x6 ([kernel.kallsyms]) swapper 0 [000] 40.509324: 111111111 cpu-clock: ffffffff810a0de6 native_safe_halt+0x6 ([kernel.kallsyms]) swapper 0 [000] 40.620437: 111111111 cpu-clock: ffffffff810a0de6 native_safe_halt+0x6 ([kernel.kallsyms]) swapper 0 [000] 40.731546: 111111111 cpu-clock: ffffffff810a0de6 native_safe_halt+0x6 ([kernel.kallsyms]) swapper 0 [000] 40.842654: 111111111 cpu-clock: ffffffff810a0de6 native_safe_halt+0x6 ([kernel.kallsyms]) swapper 0 [000] 40.953772: 111111111 cpu-clock: ffffffff810a0de6 native_safe_halt+0x6 ([kernel.kallsyms]) swapper 0 [000] 41.064881: 111111111 cpu-clock: ffffffff810a0de6 native_safe_halt+0x6 ([kernel.kallsyms]) Add the proper use of timekeeping_delta_to_ns() to convert the cycle delta to nanoseconds as needed. Thanks to Brendan and Alexei for finding this quickly after the v4.8 release. Unfortunately the problematic commit has landed in some -stable trees so they'll need this fix as well. Many apologies for this mistake. I'll be looking to add a perf-clock sanity test to the kselftest timers tests soon. Fixes: `27727df240` "timekeeping: Avoid taking lock in NMI path with CONFIG_DEBUG_TIMEKEEPING" Reported-by: Brendan Gregg <bgregg@netflix.com> Reported-by: Alexei Starovoitov <alexei.starovoitov@gmail.com> Tested-and-reviewed-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Signed-off-by: John Stultz <john.stultz@linaro.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: stable <stable@vger.kernel.org> Cc: Steven Rostedt <rostedt@goodmis.org> Link: http://lkml.kernel.org/r/1475636148-26539-1-git-send-email-john.stultz@linaro.org Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2016-10-05 15:44:46 +02:00
Wanpeng Li	57ccdf449f	tick/nohz: Prevent stopping the tick on an offline CPU can_stop_full_tick() has no check for offline cpus. So it allows to stop the tick on an offline cpu from the interrupt return path, which is wrong and subsequently makes irq_work_needs_cpu() warn about being called for an offline cpu. Commit `f7ea0fd639` ("tick: Don't invoke tick_nohz_stop_sched_tick() if the cpu is offline") added prevention for can_stop_idle_tick(), but forgot to do the same in can_stop_full_tick(). Add it. [ tglx: Massaged changelog ] Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> Link: http://lkml.kernel.org/r/1473245473-4463-1-git-send-email-wanpeng.li@hotmail.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2016-09-13 17:53:52 +02:00
Ingo Molnar	950d8381d9	Merge branch 'linus' into timers/core, to refresh the branch Signed-off-by: Ingo Molnar <mingo@kernel.org>	2016-09-08 14:05:16 +02:00
Wanpeng Li	08d0725992	tick/nohz: Fix softlockup on scheduler stalls in kvm guest tick_nohz_start_idle() is prevented to be called if the idle tick can't be stopped since commit `1f3b0f8243` ("tick/nohz: Optimize nohz idle enter"). As a result, after suspend/resume the host machine, full dynticks kvm guest will softlockup: NMI watchdog: BUG: soft lockup - CPU#0 stuck for 26s! [swapper/0:0] Call Trace: default_idle+0x31/0x1a0 arch_cpu_idle+0xf/0x20 default_idle_call+0x2a/0x50 cpu_startup_entry+0x39b/0x4d0 rest_init+0x138/0x140 ? rest_init+0x5/0x140 start_kernel+0x4c1/0x4ce ? set_init_arg+0x55/0x55 ? early_idt_handler_array+0x120/0x120 x86_64_start_reservations+0x24/0x26 x86_64_start_kernel+0x142/0x14f In addition, cat /proc/stat \| grep cpu in guest or host: cpu 398 16 5049 15754 5490 0 1 46 0 0 cpu0 206 5 450 0 0 0 1 14 0 0 cpu1 81 0 3937 3149 1514 0 0 9 0 0 cpu2 45 6 332 6052 2243 0 0 11 0 0 cpu3 65 2 328 6552 1732 0 0 11 0 0 The idle and iowait states are weird 0 for cpu0(housekeeping). The bug is present in both guest and host kernels, and they both have cpu0's idle and iowait states issue, however, host kernel's suspend/resume path etc will touch watchdog to avoid the softlockup. - The watchdog will not be touched in tick_nohz_stop_idle path (need be touched since the scheduler stall is expected) if idle_active flags are not detected. - The idle and iowait states will not be accounted when exit idle loop (resched or interrupt) if idle start time and idle_active flags are not set. This patch fixes it by reverting commit `1f3b0f8243` since can't stop idle tick doesn't mean can't be idle. Fixes: `1f3b0f8243` ("tick/nohz: Optimize nohz idle enter") Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com> Cc: Sanjeev Yadav<sanjeev.yadav@spreadtrum.com> Cc: Gaurav Jindal<gaurav.jindal@spreadtrum.com> Cc: stable@vger.kernel.org Cc: kvm@vger.kernel.org Cc: Radim Krčmář <rkrcmar@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Paolo Bonzini <pbonzini@redhat.com> Link: http://lkml.kernel.org/r/1472798303-4154-1-git-send-email-wanpeng.li@hotmail.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2016-09-02 10:25:40 +02:00
Vegard Nossum	979515c564	time: Avoid undefined behaviour in ktime_add_safe() I ran into this: ================================================================================ UBSAN: Undefined behaviour in kernel/time/hrtimer.c:310:16 signed integer overflow: 9223372036854775807 + 50000 cannot be represented in type 'long long int' CPU: 2 PID: 4798 Comm: trinity-c2 Not tainted 4.8.0-rc1+ #91 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.9.3-0-ge2fc41e-prebuilt.qemu-project.org 04/01/2014 0000000000000000 ffff88010ce6fb88 ffffffff82344740 0000000041b58ab3 ffffffff84f97a20 ffffffff82344694 ffff88010ce6fbb0 ffff88010ce6fb60 000000000000c350 ffff88010ce6f968 dffffc0000000000 ffffffff857bc320 Call Trace: [<ffffffff82344740>] dump_stack+0xac/0xfc [<ffffffff82344694>] ? _atomic_dec_and_lock+0xc4/0xc4 [<ffffffff8242df78>] ubsan_epilogue+0xd/0x8a [<ffffffff8242e6b4>] handle_overflow+0x202/0x23d [<ffffffff8242e4b2>] ? val_to_string.constprop.6+0x11e/0x11e [<ffffffff8236df71>] ? timerqueue_add+0x151/0x410 [<ffffffff81485c48>] ? hrtimer_start_range_ns+0x3b8/0x1380 [<ffffffff81795631>] ? memset+0x31/0x40 [<ffffffff8242e6fd>] __ubsan_handle_add_overflow+0xe/0x10 [<ffffffff81488ac9>] hrtimer_nanosleep+0x5d9/0x790 [<ffffffff814884f0>] ? hrtimer_init_sleeper+0x80/0x80 [<ffffffff813a9ffb>] ? __might_sleep+0x5b/0x260 [<ffffffff8148be10>] common_nsleep+0x20/0x30 [<ffffffff814906c7>] SyS_clock_nanosleep+0x197/0x210 [<ffffffff81490530>] ? SyS_clock_getres+0x150/0x150 [<ffffffff823c7113>] ? __this_cpu_preempt_check+0x13/0x20 [<ffffffff8162ef60>] ? __context_tracking_exit.part.3+0x30/0x1b0 [<ffffffff81490530>] ? SyS_clock_getres+0x150/0x150 [<ffffffff81007bd3>] do_syscall_64+0x1b3/0x4b0 [<ffffffff845f85aa>] entry_SYSCALL64_slow_path+0x25/0x25 ================================================================================ Add a new ktime_add_unsafe() helper which doesn't check for overflow, but doesn't throw a UBSAN warning when it does overflow either. Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@kernel.org> Cc: Richard Cochran <richardcochran@gmail.com> Cc: Prarit Bhargava <prarit@redhat.com> Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com> Signed-off-by: John Stultz <john.stultz@linaro.org>	2016-08-31 14:43:36 -07:00
Vegard Nossum	469e857f37	time: Avoid undefined behaviour in timespec64_add_safe() I ran into this: ================================================================================ UBSAN: Undefined behaviour in kernel/time/time.c:783:2 signed integer overflow: 5273 + 9223372036854771711 cannot be represented in type 'long int' CPU: 0 PID: 17363 Comm: trinity-c0 Not tainted 4.8.0-rc1+ #88 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.9.3-0-ge2fc41e-prebuilt.qemu-project.org 04/01/2014 0000000000000000 ffff88011457f8f0 ffffffff82344f50 0000000041b58ab3 ffffffff84f98080 ffffffff82344ea4 ffff88011457f918 ffff88011457f8c8 ffff88011457f8e0 7fffffffffffefff ffff88011457f6d8 dffffc0000000000 Call Trace: [<ffffffff82344f50>] dump_stack+0xac/0xfc [<ffffffff82344ea4>] ? _atomic_dec_and_lock+0xc4/0xc4 [<ffffffff8242f4c8>] ubsan_epilogue+0xd/0x8a [<ffffffff8242fc04>] handle_overflow+0x202/0x23d [<ffffffff8242fa02>] ? val_to_string.constprop.6+0x11e/0x11e [<ffffffff823c7837>] ? debug_smp_processor_id+0x17/0x20 [<ffffffff8131b581>] ? __sigqueue_free.part.13+0x51/0x70 [<ffffffff8146d4e0>] ? rcu_is_watching+0x110/0x110 [<ffffffff8242fc4d>] __ubsan_handle_add_overflow+0xe/0x10 [<ffffffff81476ef8>] timespec64_add_safe+0x298/0x340 [<ffffffff81476c60>] ? timespec_add_safe+0x330/0x330 [<ffffffff812f7990>] ? wait_noreap_copyout+0x1d0/0x1d0 [<ffffffff8184bf18>] poll_select_set_timeout+0xf8/0x170 [<ffffffff8184be20>] ? poll_schedule_timeout+0x2b0/0x2b0 [<ffffffff813aa9bb>] ? __might_sleep+0x5b/0x260 [<ffffffff833c8a87>] __sys_recvmmsg+0x107/0x790 [<ffffffff833c8980>] ? SyS_recvmsg+0x20/0x20 [<ffffffff81486378>] ? hrtimer_start_range_ns+0x3b8/0x1380 [<ffffffff845f8bfb>] ? _raw_spin_unlock_irqrestore+0x3b/0x60 [<ffffffff8148bcea>] ? do_setitimer+0x39a/0x8e0 [<ffffffff813aa9bb>] ? __might_sleep+0x5b/0x260 [<ffffffff833c9110>] ? __sys_recvmmsg+0x790/0x790 [<ffffffff833c91e9>] SyS_recvmmsg+0xd9/0x160 [<ffffffff833c9110>] ? __sys_recvmmsg+0x790/0x790 [<ffffffff823c7853>] ? __this_cpu_preempt_check+0x13/0x20 [<ffffffff8162f680>] ? __context_tracking_exit.part.3+0x30/0x1b0 [<ffffffff833c9110>] ? __sys_recvmmsg+0x790/0x790 [<ffffffff81007bd3>] do_syscall_64+0x1b3/0x4b0 [<ffffffff845f936a>] entry_SYSCALL64_slow_path+0x25/0x25 ================================================================================ Line 783 is this: 783 set_normalized_timespec64(&res, lhs.tv_sec + rhs.tv_sec, 784 lhs.tv_nsec + rhs.tv_nsec); In other words, since lhs.tv_sec and rhs.tv_sec are both time64_t, this is a signed addition which will cause undefined behaviour on overflow. Note that this is not currently a huge concern since the kernel should be built with -fno-strict-overflow by default, but could be a problem in the future, a problem with older compilers, or other compilers than gcc. The easiest way to avoid the overflow is to cast one of the arguments to unsigned (so the addition will be done using unsigned arithmetic). Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@kernel.org> Cc: Richard Cochran <richardcochran@gmail.com> Cc: Prarit Bhargava <prarit@redhat.com> Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com> Signed-off-by: John Stultz <john.stultz@linaro.org>	2016-08-31 14:43:35 -07:00
Ruchi Kandoi	0bf43f15db	timekeeping: Prints the amounts of time spent during suspend In addition to keeping a histogram of suspend times, also print out the time spent in suspend to dmesg. This helps to keep track of suspend time while debugging using kernel logs. Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@kernel.org> Cc: Richard Cochran <richardcochran@gmail.com> Cc: Prarit Bhargava <prarit@redhat.com> Signed-off-by: Ruchi Kandoi <kandoiruchi@google.com> [jstultz: Tweaked commit message] Signed-off-by: John Stultz <john.stultz@linaro.org>	2016-08-31 14:43:34 -07:00
Kyle Walker	36374583f9	clocksource: Defer override invalidation unless clock is unstable Clocksources don't get the VALID_FOR_HRES flag until they have been checked by a watchdog. However, when using an override, the clocksource_select logic will clear the override value if the clocksource is not marked VALID_FOR_HRES during that inititial check. When using the boot arguments clocksource=<foo>, this selection can run before the watchdog, and can cause the override to be incorrectly cleared. To address this condition, the override_name is only invalidated for unstable clocksources. Otherwise, the override is left intact until after the watchdog has validated the clocksource as stable/unstable. Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@kernel.org> Cc: Richard Cochran <richardcochran@gmail.com> Cc: Prarit Bhargava <prarit@redhat.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Kyle Walker <kwalker@redhat.com> Signed-off-by: John Stultz <john.stultz@linaro.org>	2016-08-31 14:43:33 -07:00
Pratyush Patel	b4d90e9f1e	hrtimer: Spelling fixes Fix a minor spelling error. Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@kernel.org> Cc: Richard Cochran <richardcochran@gmail.com> Cc: Prarit Bhargava <prarit@redhat.com> Signed-off-by: Pratyush Patel <pratyushpatel.1995@gmail.com> [jstultz: Added commit message] Signed-off-by: John Stultz <john.stultz@linaro.org>	2016-08-31 14:43:20 -07:00
John Stultz	a4f8f6667f	timekeeping: Cap array access in timekeeping_debug It was reported that hibernation could fail on the 2nd attempt, where the system hangs at hibernate() -> syscore_resume() -> i8237A_resume() -> claim_dma_lock(), because the lock has already been taken. However there is actually no other process would like to grab this lock on that problematic platform. Further investigation showed that the problem is triggered by setting /sys/power/pm_trace to 1 before the 1st hibernation. Since once pm_trace is enabled, the rtc becomes unmeaningful after suspend, and meanwhile some BIOSes would like to adjust the 'invalid' RTC (e.g, smaller than 1970) to the release date of that motherboard during POST stage, thus after resumed, it may seem that the system had a significant long sleep time which is a completely meaningless value. Then in timekeeping_resume -> tk_debug_account_sleep_time, if the bit31 of the sleep time happened to be set to 1, fls() returns 32 and we add 1 to sleep_time_bin[32], which causes an out of bounds array access and therefor memory being overwritten. As depicted by System.map: 0xffffffff81c9d080 b sleep_time_bin 0xffffffff81c9d100 B dma_spin_lock the dma_spin_lock.val is set to 1, which caused this problem. This patch adds a sanity check in tk_debug_account_sleep_time() to ensure we don't index past the sleep_time_bin array. [jstultz: Problem diagnosed and original patch by Chen Yu, I've solved the issue slightly differently, but borrowed his excelent explanation of the issue here.] Fixes: `5c83545f24` "power: Add option to log time spent in suspend" Reported-by: Janek Kozicki <cosurgi@gmail.com> Reported-by: Chen Yu <yu.c.chen@intel.com> Signed-off-by: John Stultz <john.stultz@linaro.org> Cc: linux-pm@vger.kernel.org Cc: Peter Zijlstra <peterz@infradead.org> Cc: Xunlei Pang <xpang@redhat.com> Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net> Cc: stable <stable@vger.kernel.org> Cc: Zhang Rui <rui.zhang@intel.com> Link: http://lkml.kernel.org/r/1471993702-29148-3-git-send-email-john.stultz@linaro.org Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2016-08-24 09:34:32 +02:00
John Stultz	27727df240	timekeeping: Avoid taking lock in NMI path with CONFIG_DEBUG_TIMEKEEPING When I added some extra sanity checking in timekeeping_get_ns() under CONFIG_DEBUG_TIMEKEEPING, I missed that the NMI safe __ktime_get_fast_ns() method was using timekeeping_get_ns(). Thus the locking added to the debug checks broke the NMI-safety of __ktime_get_fast_ns(). This patch open-codes the timekeeping_get_ns() logic for __ktime_get_fast_ns(), so can avoid any deadlocks in NMI. Fixes: `4ca22c2648` "timekeeping: Add warnings when overflows or underflows are observed" Reported-by: Steven Rostedt <rostedt@goodmis.org> Reported-by: Peter Zijlstra <peterz@infradead.org> Signed-off-by: John Stultz <john.stultz@linaro.org> Cc: stable <stable@vger.kernel.org> Link: http://lkml.kernel.org/r/1471993702-29148-2-git-send-email-john.stultz@linaro.org Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2016-08-24 09:34:31 +02:00
Chris Metcalf	46c8f0b077	timers: Fix get_next_timer_interrupt() computation The tick_nohz_stop_sched_tick() routine is not properly canceling the sched timer when nothing is pending, because get_next_timer_interrupt() is no longer returning KTIME_MAX in that case. This causes periodic interrupts when none are needed. When determining the next interrupt time, we first use __next_timer_interrupt() to get the first expiring timer in the timer wheel. If no timer is found, we return the base clock value plus NEXT_TIMER_MAX_DELTA to indicate there is no timer in the timer wheel. Back in get_next_timer_interrupt(), we set the "expires" value by converting the timer wheel expiry (in ticks) to a nsec value. But we don't want to do this if the timer wheel expiry value indicates no timer; we want to return KTIME_MAX. Prior to commit `500462a9de` ("timers: Switch to a non-cascading wheel") we checked base->active_timers to see if any timers were active, and if not, we didn't touch the expiry value and so properly returned KTIME_MAX. Now we don't have active_timers. To fix this, we now just check the timer wheel expiry value to see if it is "now + NEXT_TIMER_MAX_DELTA", and if it is, we don't try to compute a new value based on it, but instead simply let the KTIME_MAX value in expires remain. Fixes: `500462a9de` "timers: Switch to a non-cascading wheel" Signed-off-by: Chris Metcalf <cmetcalf@mellanox.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Christoph Lameter <cl@linux.com> Cc: John Stultz <john.stultz@linaro.org> Link: http://lkml.kernel.org/r/1470688147-22287-1-git-send-email-cmetcalf@mellanox.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2016-08-09 09:31:55 +02:00
Linus Torvalds	a6408f6cb6	Merge branch 'smp-hotplug-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull smp hotplug updates from Thomas Gleixner: "This is the next part of the hotplug rework. - Convert all notifiers with a priority assigned - Convert all CPU_STARTING/DYING notifiers The final removal of the STARTING/DYING infrastructure will happen when the merge window closes. Another 700 hundred line of unpenetrable maze gone :)" * 'smp-hotplug-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (70 commits) timers/core: Correct callback order during CPU hot plug leds/trigger/cpu: Move from CPU_STARTING to ONLINE level powerpc/numa: Convert to hotplug state machine arm/perf: Fix hotplug state machine conversion irqchip/armada: Avoid unused function warnings ARC/time: Convert to hotplug state machine clocksource/atlas7: Convert to hotplug state machine clocksource/armada-370-xp: Convert to hotplug state machine clocksource/exynos_mct: Convert to hotplug state machine clocksource/arm_global_timer: Convert to hotplug state machine rcu: Convert rcutree to hotplug state machine KVM/arm/arm64/vgic-new: Convert to hotplug state machine smp/cfd: Convert core to hotplug state machine x86/x2apic: Convert to CPU hotplug state machine profile: Convert to hotplug state machine timers/core: Convert to hotplug state machine hrtimer: Convert to hotplug state machine x86/tboot: Convert to hotplug state machine arm64/armv8 deprecated: Convert to hotplug state machine hwtracing/coresight-etm4x: Convert to hotplug state machine ...	2016-07-29 13:55:30 -07:00
Linus Torvalds	55392c4c06	Merge branch 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull timer updates from Thomas Gleixner: "This update provides the following changes: - The rework of the timer wheel which addresses the shortcomings of the current wheel (cascading, slow search for next expiring timer, etc). That's the first major change of the wheel in almost 20 years since Finn implemted it. - A large overhaul of the clocksource drivers init functions to consolidate the Device Tree initialization - Some more Y2038 updates - A capability fix for timerfd - Yet another clock chip driver - The usual pile of updates, comment improvements all over the place" * 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (130 commits) tick/nohz: Optimize nohz idle enter clockevents: Make clockevents_subsys static clocksource/drivers/time-armada-370-xp: Fix return value check timers: Implement optimization for same expiry time in mod_timer() timers: Split out index calculation timers: Only wake softirq if necessary timers: Forward the wheel clock whenever possible timers/nohz: Remove pointless tick_nohz_kick_tick() function timers: Optimize collect_expired_timers() for NOHZ timers: Move __run_timers() function timers: Remove set_timer_slack() leftovers timers: Switch to a non-cascading wheel timers: Reduce the CPU index space to 256k timers: Give a few structs and members proper names hlist: Add hlist_is_singular_node() helper signals: Use hrtimer for sigtimedwait() timers: Remove the deprecated mod_timer_pinned() API timers, net/ipv4/inet: Initialize connection request timers as pinned timers, drivers/tty/mips_ejtag: Initialize the poll timer as pinned timers, drivers/tty/metag_da: Initialize the poll timer as pinned ...	2016-07-25 20:43:12 -07:00
Linus Torvalds	25a0dc4be8	Merge tag 'staging-4.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging Pull staging and IIO driver updates from Greg KH: "Here is the big Staging and IIO driver update for 4.8-rc1. We ended up adding more code than removing, again, but it's not all that bad. Lots of cleanups all over the staging tree, and new IIO drivers, full details in the shortlog. All of these have been in linux-next for a while with no reported issues" * tag 'staging-4.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging: (417 commits) drivers:iio:accel:mma8452: removed unwanted return statements drivers:iio:accel:mma8452: added cleanup provision in case of failure. iio: Add iio.git tree to MAINTAINERS iio:st_pressure: clean useless static channel initializers iio:st_pressure:lps22hb: temperature support iio:st_pressure:lps22hb: open drain support iio:st_pressure: temperature triggered buffering iio:st_pressure: document sampling gains iio:st_pressure: align storagebits on power of 2 iio:st_sensors: align on storagebits boundaries staging:iio:lis3l02dq drop separate driver iio: accel: st_accel: Add lis3l02dq support iio: adc: add missing of_node references to iio_dev iio: adc: ti-ads1015: add indio_dev->dev.of_node reference iio: potentiometer: Fix typo in Kconfig iio: potentiometer: mcp4531: Add device tree binding iio: potentiometer: mcp4531: Add device tree binding documentation iio: potentiometer: mcp4531: Add support for MCP454x, MCP456x, MCP464x and MCP466x iio:imu:mpu6050: icm20608 initial support iio: adc: max1363: Add device tree binding ...	2016-07-24 16:55:23 -07:00
Gaurav Jindal	1f3b0f8243	tick/nohz: Optimize nohz idle enter tick_nohz_start_idle is called before checking whether the idle tick can be stopped. If the tick cannot be stopped, calling tick_nohz_start_idle() is pointless and just wasting CPU cycles. Only invoke tick_nohz_start_idle() when can_stop_idle_tick() returns true. A short one minute observation of the effect on ARM64 shows a reduction of calls by 1.5% thus optimizing the idle entry sequence. [tglx: Massaged changelog ] Co-developed-by: Sanjeev Yadav<sanjeev.yadav@spreadtrum.com> Signed-off-by: Gaurav Jindal<gaurav.jindal@spreadtrum.com> Link: http://lkml.kernel.org/r/20160714120416.GB21099@gaurav.jindal@spreadtrum.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2016-07-19 13:48:24 +02:00
Ben Dooks	775be50626	clockevents: Make clockevents_subsys static The clockevents_subsys struct is used for sysfs support and is not declared or used outside the file it is defined in. Fix the following warning by making it static: kernel/time/clockevents.c:648:17: warning: symbol 'clockevents_subsys' was not declared. Should it be static? Signed-off-by: Ben Dooks <ben.dooks@codethink.co.uk> Cc: linux-kernel@lists.codethink.co.uk Link: http://lkml.kernel.org/r/1466178974-7105-1-git-send-email-ben.dooks@codethink.co.uk Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2016-07-19 10:48:06 +02:00
Richard Cochran	24f73b9971	timers/core: Convert to hotplug state machine When tearing down, call timers_dead_cpu() before notify_dead(). There is a hidden dependency between: - timers - block multiqueue - rcutree If timers_dead_cpu() comes later than blk_mq_queue_reinit_notify() that latter function causes a RCU stall. Signed-off-by: Richard Cochran <rcochran@linutronix.de> Signed-off-by: Anna-Maria Gleixner <anna-maria@linutronix.de> Reviewed-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Cc: John Stultz <john.stultz@linaro.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: rt@linutronix.de Link: http://lkml.kernel.org/r/20160713153337.566790058@linutronix.de Signed-off-by: Ingo Molnar <mingo@kernel.org>	2016-07-15 10:41:42 +02:00
Thomas Gleixner	27590dc17b	hrtimer: Convert to hotplug state machine Split out the clockevents callbacks instead of piggybacking them on hrtimers. This gets rid of a POST_DEAD user. See commit: `54e88fad22` ("sched: Make sure timers have migrated before killing the migration_thread") We just move the callback state to the proper place in the state machine. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Anna-Maria Gleixner <anna-maria@linutronix.de> Reviewed-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: rt@linutronix.de Link: http://lkml.kernel.org/r/20160713153337.485419196@linutronix.de Signed-off-by: Ingo Molnar <mingo@kernel.org>	2016-07-15 10:41:37 +02:00

1 2 3 4 5 ...

1378 Commits