Commit Graph

721742 Commits

Author SHA1 Message Date
Andy Zhou
cd8a6c3369 openvswitch: Add meter action support
Implements OVS kernel meter action support.

Signed-off-by: Andy Zhou <azhou@ovn.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-13 10:37:07 +09:00
Andy Zhou
96fbc13d7e openvswitch: Add meter infrastructure
OVS kernel datapath so far does not support Openflow meter action.
This is the first stab at adding kernel datapath meter support.
This implementation supports only drop band type.

Signed-off-by: Andy Zhou <azhou@ovn.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-13 10:37:07 +09:00
Andy Zhou
9602c01e57 openvswitch: export get_dp() API.
Later patches will invoke get_dp() outside of datapath.c. Export it.

Signed-off-by: Andy Zhou <azhou@ovn.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-13 10:37:07 +09:00
Andy Zhou
5794040647 openvswitch: Add meter netlink definitions
Meter has its own netlink family. Define netlink messages and attributes
for communicating with the user space programs.

Signed-off-by: Andy Zhou <azhou@ovn.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-13 10:37:07 +09:00
David S. Miller
aef1e0d5dd Merge branch 'dsa-b53-Support-prepended-Broadcom-tags'
Florian Fainelli says:

====================
net: dsa: b53: Support prepended Broadcom tags

This patch series adds support for prepended 4-bytes Broadcom tags that we
already support. This type of tag will typically be used when interfaced to
a SoC like BCM58xx (NorthStar Plus) which supports a Flow Accelerator (WIP).
In that case, we need to support a slightly different tagging format.

The first patch does a bit of re-factoring and passes a port index to
the get_tag_protocol() function since at least two different drivers need
that type of information (mt7530, b53) to support tagging or not.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-13 10:34:55 +09:00
Florian Fainelli
1160603960 net: dsa: b53: Support prepended Broadcom tags
On BCM58xx devices (Northstar Plus), there is an accelerator attached to
port 8 which would only work if we use prepended Broadcom tags. Resolve
that difference in our get_tag_protocol() function by setting the
appropriate tagging protocol in that case. We need to change
b53_brcm_hdr_setup() a little bit now since we can deal with two types
of Broadcom tags.

Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-13 10:34:54 +09:00
Florian Fainelli
b74b70c449 net: dsa: Support prepended Broadcom tag
Add a new type: DSA_TAG_PROTO_PREPEND which allows us to support for the
4-bytes Broadcom tag that we already support, but in a format where it
is pre-pended to the packet instead of located between the MAC SA and
the Ethertyper (DSA_TAG_PROTO_BRCM).

Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-13 10:34:54 +09:00
Florian Fainelli
f7c39e3d1e net: dsa: tag_brcm: Prepare for supporting prepended tag
In preparation for supporting the same Broadcom tag format, but instead
of inserted between the MAC SA and EtherType, prepended to the Ethernet
frame, restructure the code a little bit to make that possible and take
an offset parameter.

Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-13 10:34:54 +09:00
Florian Fainelli
5ed4e3eb02 net: dsa: Pass a port to get_tag_protocol()
A number of drivers want to check whether the configured CPU port is a
possible configuration for enabling tagging, pass down the CPU port
number so they verify that.

Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-13 10:34:54 +09:00
Andrew Morton
ee9d3429c0 net/sched/sch_red.c: work around gcc-4.4.4 anon union initializer issue
gcc-4.4.4 (at lest) has issues with initializers and anonymous unions:

net/sched/sch_red.c: In function 'red_dump_offload':
net/sched/sch_red.c:282: error: unknown field 'stats' specified in initializer
net/sched/sch_red.c:282: warning: initialization makes integer from pointer without a cast
net/sched/sch_red.c:283: error: unknown field 'stats' specified in initializer
net/sched/sch_red.c:283: warning: initialization makes integer from pointer without a cast
net/sched/sch_red.c: In function 'red_dump_stats':
net/sched/sch_red.c:352: error: unknown field 'xstats' specified in initializer
net/sched/sch_red.c:352: warning: initialization makes integer from pointer without a cast

Work around this.

Fixes: 602f3baf22 ("net_sch: red: Add offload ability to RED qdisc")
Cc: Nogah Frankel <nogahf@mellanox.com>
Cc: Jiri Pirko <jiri@mellanox.com>
Cc: Simon Horman <simon.horman@netronome.com>
Cc: David S. Miller <davem@davemloft.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-13 10:33:07 +09:00
Slava Shwartsman
a1b8714593 net/mlx4: Use Kconfig flag to remove support of old gen2 Mellanox devices
Since Mellanox focus is on newer adapters, we would like to have the
ability to disable the support for old gen2 adapters.

This can be done by turning off the MLX4_CORE_GEN2 Kconfig flag.
We keep it turned on by default.

Signed-off-by: Slava Shwartsman <slavash@mellanox.com>
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-13 10:27:51 +09:00
Jason A. Donenfeld
0642840b8b af_netlink: ensure that NLMSG_DONE never fails in dumps
The way people generally use netlink_dump is that they fill in the skb
as much as possible, breaking when nla_put returns an error. Then, they
get called again and start filling out the next skb, and again, and so
forth. The mechanism at work here is the ability for the iterative
dumping function to detect when the skb is filled up and not fill it
past the brim, waiting for a fresh skb for the rest of the data.

However, if the attributes are small and nicely packed, it is possible
that a dump callback function successfully fills in attributes until the
skb is of size 4080 (libmnl's default page-sized receive buffer size).
The dump function completes, satisfied, and then, if it happens to be
that this is actually the last skb, and no further ones are to be sent,
then netlink_dump will add on the NLMSG_DONE part:

  nlh = nlmsg_put_answer(skb, cb, NLMSG_DONE, sizeof(len), NLM_F_MULTI);

It is very important that netlink_dump does this, of course. However, in
this example, that call to nlmsg_put_answer will fail, because the
previous filling by the dump function did not leave it enough room. And
how could it possibly have done so? All of the nla_put variety of
functions simply check to see if the skb has enough tailroom,
independent of the context it is in.

In order to keep the important assumptions of all netlink dump users, it
is therefore important to give them an skb that has this end part of the
tail already reserved, so that the call to nlmsg_put_answer does not
fail. Otherwise, library authors are forced to find some bizarre sized
receive buffer that has a large modulo relative to the common sizes of
messages received, which is ugly and buggy.

This patch thus saves the NLMSG_DONE for an additional message, for the
case that things are dangerously close to the brim. This requires
keeping track of the errno from ->dump() across calls.

Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-13 10:17:13 +09:00
David S. Miller
907a4425f7 Merge branch 'netem-add-nsec-scheduling-and-slot-feature'
Dave Taht says:

====================
netem: add nsec scheduling and slot feature

This patch series converts netem away from the old "ticks" interface and
userspace API, and adds support for a new "slot" feature intended to
emulate bursty macs such as WiFi and LTE better.

Changes since v2:
Use u64 for packet_len_sched_time()
Use simpler max(time_to_send,q->slot.slot_next)

Changes since v1:
Always pass new nanosecond APIs to userspace
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-13 10:15:47 +09:00
Dave Taht
836af83b54 netem: support delivering packets in delayed time slots
Slotting is a crude approximation of the behaviors of shared media such
as cable, wifi, and LTE, which gather up a bunch of packets within a
varying delay window and deliver them, relative to that, nearly all at
once.

It works within the existing loss, duplication, jitter and delay
parameters of netem. Some amount of inherent latency must be specified,
regardless.

The new "slot" parameter specifies a minimum and maximum delay between
transmission attempts.

The "bytes" and "packets" parameters can be used to limit the amount of
information transferred per slot.

Examples of use:

tc qdisc add dev eth0 root netem delay 200us \
         slot 800us 10ms bytes 64k packets 42

A more correct example, using stacked netem instances and a packet limit
to emulate a tail drop wifi queue with slots and variable packet
delivery, with a 200Mbit isochronous underlying rate, and 20ms path
delay:

tc qdisc add dev eth0 root handle 1: netem delay 20ms rate 200mbit \
         limit 10000
tc qdisc add dev eth0 parent 1:1 handle 10:1 netem delay 200us \
         slot 800us 10ms bytes 64k packets 42 limit 512

Signed-off-by: Dave Taht <dave.taht@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-13 10:15:47 +09:00
Dave Taht
99803171ef netem: add uapi to express delay and jitter in nanoseconds
netem userspace has long relied on a horrible /proc/net/psched hack
to translate the current notion of "ticks" to nanoseconds.

Expressing latency and jitter instead, in well defined nanoseconds,
increases the dynamic range of emulated delays and jitter in netem.

It will also ease a transition where reducing a tick to nsec
equivalence would constrain the max delay in prior versions of
netem to only 4.3 seconds.

Signed-off-by: Dave Taht <dave.taht@gmail.com>
Suggested-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-13 10:15:47 +09:00
Dave Taht
112f9cb656 netem: convert to qdisc_watchdog_schedule_ns
Upgrade the internal netem scheduler to use nanoseconds rather than
ticks throughout.

Convert to and from the std "ticks" userspace api automatically,
while allowing for finer grained scheduling to take place.

Signed-off-by: Dave Taht <dave.taht@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-13 10:15:47 +09:00
Francesco Ruggeri
338d182fa5 ipv6: try not to take rtnl_lock in ip6mr_sk_done
Avoid traversing the list of mr6_tables (which requires the
rtnl_lock) in ip6mr_sk_done(), when we know in advance that
a match will not be found.
This can happen when rawv6_close()/ip6mr_sk_done() is invoked
on non-mroute6 sockets.
This patch helps reduce rtnl_lock contention when destroying
a large number of net namespaces, each having a non-mroute6
raw socket.

v2: same patch, only fixed subject line and expanded comment.

Signed-off-by: Francesco Ruggeri <fruggeri@arista.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-13 10:13:04 +09:00
Colin Ian King
07842561a8 net: realtek: r8169: remove redundant assignment to giga_ctrl
The variable giga_ctrl is being assigned to zero however this is
never read and hence the assignment is redundant, so remove it.
Cleans up clang warning:

drivers/net/ethernet/realtek/r8169.c:1978:3: warning: Value stored
to 'giga_ctrl' is never read

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-13 10:02:33 +09:00
Linus Torvalds
277642dcca modules: make sysfs attribute files readable by owner only
This code goes back to the historical bitkeeper tree commit 3f7b0672086
("Module section offsets in /sys/module"), where Jonathan Corbet wanted
to show people how to debug loadable modules.

See

    https://lwn.net/Articles/88052/

from June 2004.

To expose the required load address information, Jonathan added the
sections subdirectory for every module in /sys/modules, and made them
S_IRUGO - readable by everybody.

It was a more innocent time, plus those S_IRxxx macro names are a lot
more confusing than the octal numbers are, so maybe it wasn't even
intentional.  But here we are, thirteen years later, and I'll just change
it to S_IRUSR instead.

Let's see if anybody even notices.

Cc: Jonathan Corbet <corbet@lwn.net>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-11-12 17:00:53 -08:00
Egil Hjelmeland
4b33709d89 net: dsa: lan9303: Documentation: Add missing word "Mbps"
Signed-off-by: Egil Hjelmeland <privat@egil-hjelmeland.no>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-13 09:59:29 +09:00
Egil Hjelmeland
30482e4e28 net: dsa: lan9303: Fix lan9303_alr_del_port()
Fix embarrassing bug in lan9303_alr_del_port(): Instead of zeroing
entr->mac_addr, I destroyed the next cache entry. Affected .port_fdb_del and
.port_mdb_del.

Fixes: 0620427ea0 ("net: dsa: lan9303: Add fdb/mdb manipulation")
Signed-off-by: Egil Hjelmeland <privat@egil-hjelmeland.no>
Reviewed-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-13 09:59:07 +09:00
Rafael J. Wysocki
990a848d53 Merge branches 'pm-devfreq' and 'pm-tools'
* pm-devfreq:
  PM / devfreq: Define the constant governor name
  PM / devfreq: Remove unneeded conditional statement
  PM / devfreq: Show the all available frequencies
  PM / devfreq: Change return type of devfreq_set_freq_table()
  PM / devfreq: Use the available min/max frequency
  Revert "PM / devfreq: Add show_one macro to delete the duplicate code"
  PM / devfreq: Set min/max_freq when adding the devfreq device

* pm-tools:
  tools/power/cpupower: add libcpupower.so.0.0.1 to .gitignore
  tools/power/cpupower: Add 64 bit library detection
  MAINTAINERS: add maintainer for tools/power/cpupower
  cpupower: Fix no-rounding MHz frequency output
2017-11-13 01:41:39 +01:00
Rafael J. Wysocki
1efef68262 Merge branch 'pm-core'
* pm-core:
  ACPI / PM: Take SMART_SUSPEND driver flag into account
  PCI / PM: Take SMART_SUSPEND driver flag into account
  PCI / PM: Drop unnecessary invocations of pcibios_pm_ops callbacks
  PM / core: Add SMART_SUSPEND driver flag
  PCI / PM: Use the NEVER_SKIP driver flag
  PM / core: Add NEVER_SKIP and SMART_PREPARE driver flags
  PM / core: Convert timers to use timer_setup()
  PM / core: Fix kerneldoc comments of four functions
  PM / core: Drop legacy class suspend/resume operations
2017-11-13 01:41:26 +01:00
Rafael J. Wysocki
05d658b5b5 Merge branch 'pm-sleep'
* pm-sleep:
  freezer: Fix typo in freezable_schedule_timeout() comment
  PM / s2idle: Clear the events_check_enabled flag
  PM / sleep: Remove pm_complete_with_resume_check()
  PM: ARM: locomo: Drop suspend and resume bus type callbacks
  PM: Use a more common logging style
  PM: Document rules on using pm_runtime_resume() in system suspend callbacks
2017-11-13 01:41:20 +01:00
Rafael J. Wysocki
794c33555f Merge branch 'acpi-pm'
* acpi-pm:
  ACPI / PM: Fix acpi_pm_notifier_lock vs flush_workqueue() deadlock
  ACPI / LPSS: Consolidate runtime PM and system sleep handling
  ACPI / PM: Combine device suspend routines
  ACPI / LPIT: Add Low Power Idle Table (LPIT) support
  ACPI / PM: Split code validating need for runtime resume in ->prepare()
  ACPI / PM: Restore acpi_subsys_complete()
  ACPI / PM: Combine two identical device resume routines
  ACPI / PM: Remove stale function header
2017-11-13 01:41:11 +01:00
Rafael J. Wysocki
28da43956b Merge branches 'pm-cpufreq-sched' and 'pm-opp'
* pm-cpufreq-sched:
  cpufreq: schedutil: Reset cached_raw_freq when not in sync with next_freq

* pm-opp:
  PM / OPP: Add dev_pm_opp_{un}register_get_pstate_helper()
  PM / OPP: Support updating performance state of device's power domain
  PM / OPP: add missing of_node_put() for of_get_cpu_node()
  PM / OPP: Rename dev_pm_opp_register_put_opp_helper()
  PM / OPP: Add missing of_node_put(np)
  PM / OPP: Move error message to debug level
  PM / OPP: Use snprintf() to avoid kasprintf() and kfree()
  PM / OPP: Move the OPP directory out of power/
2017-11-13 01:40:52 +01:00
Rafael J. Wysocki
eb5fcc3134 Merge branches 'acpi-ec', 'acpi-button', 'acpi-sysfs', 'acpi-lpss' and 'acpi-cppc'
* acpi-ec:
  ACPI / EC: Fix regression related to triggering source of EC event handling

* acpi-button:
  ACPI / button: Delay acpi_lid_initialize_state() until first user space open

* acpi-sysfs:
  ACPI / sysfs: Make function param_set_trace_method_name() static

* acpi-lpss:
  ACPI / LPSS: Remove redundant initialization of clk

* acpi-cppc:
  ACPI / CPPC: Make CPPC ACPI driver aware of PCC subspace IDs
  mailbox: PCC: Move the MAX_PCC_SUBSPACES definition to header file
2017-11-13 01:37:55 +01:00
Rafael J. Wysocki
85595ada6c Merge branches 'acpi-pmic', 'acpi-apei' and 'acpi-x86'
* acpi-pmic:
  ACPI / PMIC: Add TI PMIC TPS68470 operation region driver

* acpi-apei:
  APEI / ERST: use 64-bit timestamps
  ACPI / APEI: Remove arch_apei_flush_tlb_one()
  arm64: mm: Remove arch_apei_flush_tlb_one()
  ACPI / APEI: Remove ghes_ioremap_area
  ACPI / APEI: Replace ioremap_page_range() with fixmap
  ACPI / APEI: remove the unused dead-code for SEA/NMI notification type
  ACPI / APEI: adjust a local variable type in ghes_ioremap_pfn_irq()

* acpi-x86:
  ACPI / x86: Extend KIOX000A quirk to cover all affected BIOS versions
2017-11-13 01:37:17 +01:00
Rafael J. Wysocki
60764eb379 Merge branch 'acpica'
* acpica:
  ACPICA: Update version to 20170831
  ACPICA: Update acpi_get_timer for 64-bit interface to acpi_hw_read
  ACPICA: String conversions: Update to add new behaviors
  ACPICA: String conversions: Cleanup/format comments. No functional changes
  ACPICA: Restructure/cleanup all string-to-integer conversion functions
  ACPICA: Header support for the PDTT ACPI table
  ACPICA: acpiexec: Add testability of deferred table verification
  ACPICA: Hardware: Enable 64-bit support of hardware accesses
2017-11-13 01:36:58 +01:00
Rafael J. Wysocki
60af981c78 Merge branch 'pm-cpufreq'
* pm-cpufreq: (22 commits)
  cpufreq: stats: Handle the case when trans_table goes beyond PAGE_SIZE
  cpufreq: arm_big_little: make cpufreq_arm_bL_ops structures const
  cpufreq: arm_big_little: make function arguments and structure pointer const
  cpufreq: pxa: convert to clock API
  cpufreq: speedstep-lib: mark expected switch fall-through
  cpufreq: ti-cpufreq: add missing of_node_put()
  cpufreq: dt: Remove support for Exynos4212 SoCs
  cpufreq: imx6q: Move speed grading check to cpufreq driver
  cpufreq: ti-cpufreq: kfree opp_data when failure
  cpufreq: SPEAr: pr_err() strings should end with newlines
  cpufreq: powernow-k8: pr_err() strings should end with newlines
  cpufreq: dt-platdev: drop socionext,uniphier-ld6b from whitelist
  arm64: wire cpu-invariant accounting support up to the task scheduler
  arm64: wire frequency-invariant accounting support up to the task scheduler
  arm: wire cpu-invariant accounting support up to the task scheduler
  arm: wire frequency-invariant accounting support up to the task scheduler
  drivers base/arch_topology: allow inlining cpu-invariant accounting support
  drivers base/arch_topology: provide frequency-invariant accounting support
  cpufreq: dt: invoke frequency-invariance setter function
  cpufreq: arm_big_little: invoke frequency-invariance setter function
  ...
2017-11-13 01:34:49 +01:00
Rafael J. Wysocki
622ade3a2f Merge branch 'pm-cpuidle'
* pm-cpuidle:
  intel_idle: Graceful probe failure when MWAIT is disabled
  cpuidle: Avoid assignment in if () argument
  cpuidle: Clean up cpuidle_enable_device() error handling a bit
  cpuidle: ladder: Add per CPU PM QoS resume latency support
  ARM: cpuidle: Refactor rollback operations if init fails
  ARM: cpuidle: Correct driver unregistration if init fails
  intel_idle: replace conditionals with static_cpu_has(X86_FEATURE_ARAT)
  cpuidle: fix broadcast control when broadcast can not be entered

 Conflicts:
	drivers/idle/intel_idle.c
2017-11-13 01:34:14 +01:00
Rafael J. Wysocki
4762573b93 Merge branch 'pm-qos'
* pm-qos:
  PM / QoS: Fix device resume latency framework
  PM / QoS: Drop PM_QOS_FLAG_REMOTE_WAKEUP
2017-11-13 01:33:48 +01:00
Rafael J. Wysocki
29aaf90875 Merge branch 'pm-domains'
* pm-domains:
  PM / Domains: Fix genpd to deal with drivers returning 1 from ->prepare()
  PM / domains: Rework governor code to be more consistent
  PM / Domains: Remove gpd_dev_ops.active_wakeup() callback
  soc: rockchip: power-domain: Use GENPD_FLAG_ACTIVE_WAKEUP
  soc: mediatek: Use GENPD_FLAG_ACTIVE_WAKEUP
  ARM: shmobile: pm-rmobile: Use GENPD_FLAG_ACTIVE_WAKEUP
  PM / Domains: Allow genpd users to specify default active wakeup behavior
  PM / Domains: Add support to select performance-state of domains
  PM / Domains: Rename genpd internals from pm_genpd_* to genpd_*
2017-11-13 01:33:35 +01:00
Linus Torvalds
9d5604101e Merge branch 'kallsyms-restrictions'
Merge /proc/kallsyms pointer value restrictions.

Instead of using %pK, and making it about root access (at the wrong
time, no less), make the whole choice of whether to show the actual
pointer value be very explicit to the kallsyms code.

In particular, we can now default to not doing so, and yet avoid
annoying kernel profiling by actually looking at whether kernel
profiling is allowed or not (by default it is not).

This is all mostly preparation for the real "let's stop leaking kernel
addresses" work that Tobin Harding is working on.

Small steps.

* kallsyms-restrictions:
  stop using '%pK' for /proc/kallsyms pointer values
2017-11-12 16:33:33 -08:00
Rafael J. Wysocki
040e8a4a4c Merge branches 'pm-pci', 'pm-avs' and 'pm-docs'
* pm-pci:
  PCI / PM: Add dev_dbg() to print device suspend power states
  PCI / PM: Do not resume any devices in pci_pm_prepare()

* pm-avs:
  PM / AVS: Use %pS printk format for direct addresses

* pm-docs:
  PM: docs: Fix formatting typo in devices.rst
2017-11-13 01:32:25 +01:00
Naveen N. Rao
46725b17f1 powerpc/signal: Properly handle return value from uprobe_deny_signal()
When a uprobe is installed on an instruction that we currently do not
emulate, we copy the instruction into a xol buffer and single step
that instruction. If that instruction generates a fault, we abort the
single stepping before invoking the signal handler. Once the signal
handler is done, the uprobe trap is hit again since the instruction is
retried and the process repeats.

We use uprobe_deny_signal() to detect if the xol instruction triggered
a signal. If so, we clear TIF_SIGPENDING and set TIF_UPROBE so that the
signal is not handled until after the single stepping is aborted. In
this case, uprobe_deny_signal() returns true and get_signal() ends up
returning 0. However, in do_signal(), we are not looking at the return
value, but depending on ksig.sig for further action, all with an
uninitialized ksig that is not touched in this scenario. Fix the same
by initializing ksig.sig to 0.

Fixes: 129b69df9c ("powerpc: Use get_signal() signal_setup_done()")
Cc: stable@vger.kernel.org # v3.17+
Reported-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-11-13 10:53:05 +11:00
Michal Suchanek
dcdc46794b powerpc/fadump: use kstrtoint to handle sysfs store
Currently sysfs store handlers in fadump use if buf[0] == 'char'.

This means input "100foo" is interpreted as '1' and "01" as '0'.

Change to kstrtoint so leading zeroes and the like is handled in
expected way.

Signed-off-by: Michal Suchanek <msuchanek@suse.de>
Acked-by: Hari Bathini <hbathini@linux.vnet.ibm.com>
Signed-off-by: Michal Suchanek <a class="moz-txt-link-rfc2396E" href="mailto:msuchanek@suse.de">&lt;msuchanek@suse.de&gt;</a></pre>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-11-13 10:51:38 +11:00
Dan Williams
0e7f074145 acpi, nfit: validate commands against the device type
Fix occasions in acpi_nfit_ctl where we check the command type without
validating whether we are parsing dimm vs bus level commands. Where the
command numbers alias between dimms and bus we can make the wrong
assumption just checking the raw command number. For example, with a
simple nfit_test mock up of the clear-error command we trigger the
following:

    BUG: unable to handle kernel NULL pointer dereference at 0000000000000094
    IP: acpi_nfit_ctl+0x29b/0x930 [nfit]
    [..]
    Call Trace:
     nfit_test_probe+0xb85/0xc09 [nfit_test]
     platform_drv_probe+0x3b/0xa0
     ? platform_drv_probe+0x3b/0xa0
     driver_probe_device+0x29c/0x450
     ? test_alloc+0x180/0x180 [nfit_test]
     __driver_attach+0xe3/0xf0
     ? driver_probe_device+0x450/0x450
     bus_for_each_dev+0x73/0xc0
     driver_attach+0x1e/0x20
     bus_add_driver+0x173/0x270
     driver_register+0x60/0xe0
     __platform_driver_register+0x36/0x40
     nfit_test_init+0x2a1/0x1000 [nfit_test]

Fixes: 4b27db7e26 ("acpi, nfit: add support for the _LSI, _LSR, and...")
Reported-by: Vishal Verma <vishal.l.verma@intel.com>
Tested-by: Vishal Verma <vishal.l.verma@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2017-11-12 15:18:57 -08:00
Rasmus Villemoes
ffc661c99f genirq: Fix type of shifting literal 1 in __setup_irq()
If ffz() ever returns a value >= 31 then the following shift is undefined
behaviour because the literal 1 which gets shifted is treated as signed
integer.

In practice, the bug is probably harmless, since the first undefined shift
count is 31 which results - ignoring UB - in (int)(0x80000000). This gets
sign extended so bit 32-63 will be set as well and all subsequent
__setup_irq() calls would just end up hitting the -EBUSY branch.

However, a sufficiently aggressive optimizer may use the UB of 1<<31
to decide that doesn't happen, and hence elide the sign-extension
code, so that subsequent calls can indeed get ffz > 31.

In any case, the right thing to do is to make the literal 1UL.

[ tglx: For this to happen a single interrupt would have to be shared by 32
  	devices. Hardware like that does not exist and would have way more
  	problems than that. ]

Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lkml.kernel.org/r/20171030213548.16831-1-linux@rasmusvillemoes.dk
2017-11-12 23:25:40 +01:00
Rasmus Villemoes
306eb5a38d irqdomain: Drop pointless NULL check in virq_debug_show_one
data has been already derefenced unconditionally, so it's pointless to do a
NULL pointer check on it afterwards. Drop it.

[ tglx: Depersonify changelog. ]

Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Link: https://lkml.kernel.org/r/20171112212904.28574-1-linux@rasmusvillemoes.dk
2017-11-12 23:25:40 +01:00
Wen Yaxng
6714796edc genirq/proc: Return proper error code when irq_set_affinity() fails
write_irq_affinity() returns the number of written bytes, which means
success, unconditionally whether the actual irq_set_affinity() call
succeeded or not.

Add proper error handling and pass the error code returned from
irq_set_affinity() back to user space in case of failure.

[ tglx: Fixed coding style and massaged changelog ]

Signed-off-by: Wen Yang <wen.yang99@zte.com.cn>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Jiang Biao <jiang.biao2@zte.com.cn>
Cc: zhong.weidong@zte.com.cn
Link: https://lkml.kernel.org/r/1510106103-184761-1-git-send-email-wen.yang99@zte.com.cn
2017-11-12 23:25:39 +01:00
Randy Dunlap
ba1029c9cb modpost: detect modules without a MODULE_LICENSE
Partially revert commit 2fa3656829 ("kbuild: soften MODULE_LICENSE
check") so that modpost detects modules that do not have a
MODULE_LICENSE.

Sam's commit also changed the fatal error to a warning, which I am
leaving as is.

This gives advance notice of when a module has no license and will taint
the kernel if the module is loaded.

This produces the following warnings on x86_64 allmodconfig:

    MODPOST 6520 modules
  WARNING: modpost: missing MODULE_LICENSE() in drivers/auxdisplay/img-ascii-lcd.o
  WARNING: modpost: missing MODULE_LICENSE() in drivers/gpio/gpio-ath79.o
  WARNING: modpost: missing MODULE_LICENSE() in drivers/gpio/gpio-iop.o
  WARNING: modpost: missing MODULE_LICENSE() in drivers/iio/accel/kxsd9-i2c.o
  WARNING: modpost: missing MODULE_LICENSE() in drivers/iio/adc/qcom-vadc-common.o
  WARNING: modpost: missing MODULE_LICENSE() in drivers/media/platform/mtk-vcodec/mtk-vcodec-common.o
  WARNING: modpost: missing MODULE_LICENSE() in drivers/media/platform/soc_camera/soc_scale_crop.o
  WARNING: modpost: missing MODULE_LICENSE() in drivers/mtd/nand/denali_pci.o
  WARNING: modpost: missing MODULE_LICENSE() in drivers/net/phy/cortina.o
  WARNING: modpost: missing MODULE_LICENSE() in drivers/pinctrl/pxa/pinctrl-pxa2xx.o
  WARNING: modpost: missing MODULE_LICENSE() in drivers/power/reset/zx-reboot.o
  WARNING: modpost: missing MODULE_LICENSE() in drivers/rpmsg/qcom_glink_native.o
  WARNING: modpost: missing MODULE_LICENSE() in drivers/staging/comedi/drivers/ni_atmio.o
  WARNING: modpost: missing MODULE_LICENSE() in net/9p/9pnet_xen.o
  WARNING: modpost: missing MODULE_LICENSE() in sound/soc/codecs/snd-soc-pcm512x-spi.o

Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-11-12 13:47:40 -08:00
Oliver O'Halloran
6c44741d75 powerpc/lib: Implement UACCESS_FLUSHCACHE API
Implement the architecture specific portitions of the UACCESS_FLUSHCACHE
API. This provides functions for the copy_user_flushcache iterator that
ensure that when the copy is finished the destination buffer contains
a copy of the original and that the destination buffer is clean in the
processor caches.

Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-11-13 08:00:31 +11:00
Oliver O'Halloran
32ce3862af powerpc/lib: Implement PMEM API
Implement the architecture specific cache maintence functions that make
up the "PMEM API". Currently the writeback and invalidate functions
are the same since the function of the DCBST (data cache block store)
instruction is typically interpreted as "writeback to the point of
coherency" rather than to memory. As a result implementing the API
requires a full cache flush rather than just a cache write back. This
will probably change in the not-too-distant future.

Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-11-13 08:00:30 +11:00
Alistair Popple
1b2c2b1238 powerpc/powernv/npu: Don't explicitly flush nmmu tlb
The nest mmu required an explicit flush as a tlbi would not flush it in the
same way as the core. However an alternate firmware fix exists which should
eliminate the need for this flush, so instead add a device-tree property
(ibm,nmmu-flush) on the NVLink2 PHB to enable it only if required.

Signed-off-by: Alistair Popple <alistair@popple.id.au>
Reviewed-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-11-13 08:00:30 +11:00
Alistair Popple
2a31ad093b powerpc/powernv/npu: Use flush_all_mm() instead of flush_tlb_mm()
With the optimisations introduced by commit a46cc7a908 ("powerpc/mm/radix:
Improve TLB/PWC flushes"), flush_tlb_mm() no longer flushes the page walk
cache with radix. Switch to using flush_all_mm() to ensure the pwc and tlb
are properly flushed on the nmmu.

Signed-off-by: Alistair Popple <alistair@popple.id.au>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-11-13 08:00:29 +11:00
Vaidyanathan Srinivasan
8d4e10e9ed powerpc/powernv/idle: Round up latency and residency values
On PowerNV platforms, firmware provides exit latency and
target residency for each of the idle states in nano
seconds.  Cpuidle framework expects the values in micro
seconds.  Round up to nearest micro seconds to avoid errors
in cases where the values are defined as fractional micro
seconds.

Default idle state of 'snooze' has exit latency of zero.  If
other states have fractional micro second exit latency, they
would get rounded down to zero micro second and make cpuidle
framework choose deeper idle state when snooze loop is the
right choice.

Reported-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com>
Reviewed-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-11-13 08:00:29 +11:00
Linus Torvalds
bebc6082da Linux 4.14 2017-11-12 10:46:13 -08:00
Linus Torvalds
152bbb43b3 Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fixes from Thomas Gleixner:
 "A set of small fixes:

   - make KGDB work again which got broken by the conversion of WARN()
     to #UD. The WARN fixup needs to run before the notifier callchain,
     otherwise KGDB tries to handle it and crashes.

   - disable KASAN in the ORC unwinder to prevent false positive KASAN
     warnings

   - prevent default mapping above 47bit when 5 level page tables are
     enabled

   - make the delay calibration optimization work correctly, which had
     the conditionals the wrong way around and was operating on data
     which was not yet updated.

   - remove the bogus X86_TRAP_BP trap init from the default IDT init
     table, which broke 32bit int3 handling by overwriting the correct
     int3 setup.

   - replace this_cpu* with boot_cpu_data access in the preemptible
     oprofile init code"

* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/debug: Handle warnings before the notifier chain, to fix KGDB crash
  x86/mm: Fix ELF_ET_DYN_BASE for 5-level paging
  x86/idt: Remove X86_TRAP_BP initialization in idt_setup_traps()
  x86/oprofile/ppro: Do not use __this_cpu*() in preemptible context
  x86/unwind: Disable KASAN checking in the ORC unwinder
  x86/smpboot: Make optimization of delay calibration work correctly
2017-11-12 10:12:41 -08:00
Linus Torvalds
69581c7472 Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf tool fixes from Thomas Gleixner:
 "A small set of fixes for perf tool:

   - synchronize the i915 drm header to avoid the 'out of date' warning

   - make sure that perf trace cleans up its temporary files on exit

   - unbreak the build with newer flex versions

   - add missing braces in the eBPF parsing rules"

* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  tooling/headers: Sync the tools/include/uapi/drm/i915_drm.h UAPI header
  perf trace: Call machine__exit() at exit
  perf tools: Fix eBPF event specification parsing
  perf tools: Add "reject" option for parse-events.l
2017-11-12 09:43:53 -08:00