linux

mirror of https://github.com/hardkernel/linux.git synced 2026-05-09 21:55:48 +09:00

Author	SHA1	Message	Date
David S. Miller	7eca9cc539	Merge tag 'rxrpc-rewrite-20170607-v2' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs David Howells says: ==================== rxrpc: Tx length parameter Here's a set of patches that allows someone initiating a client call with AF_RXRPC to indicate upfront the total amount of data that will be transmitted. This will allow AF_RXRPC to encrypt directly from source buffer to packet rather than having to copy into the buffer and only encrypt when it's full (the encrypted portion of the packet starts with a length and so we can't encrypt until we know what the length will be). The three patches are: (1) Provide a means of finding out what control message types are actually supported. EINVAL is reported if an unsupported cmsg type is seen, so we don't want to set the new cmsg unless we know it will be accepted. (2) Consolidate some stuff into a struct to reduce the parameter count on the function that parses the cmsg buffer. (3) Introduce the RXRPC_TX_LENGTH cmsg. This can be provided on the first sendmsg() that contributes data to a client call request or a service call reply. If provided, the user must provide exactly that amount of data or an error will be incurred. Changes in version 2: (*) struct rxrpc_send_params::tx_total_len should be s64 not u64. Thanks to Julia Lawall for reporting this. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2017-06-08 11:41:41 -04:00
Jiri Pirko	a5fcf8a6c9	net: propagate tc filter chain index down the ndo_setup_tc call We need to push the chain index down to the drivers, so they have the information to which chain the rule belongs. For now, no driver supports multichain offload, so only chain 0 is supported. This is needed to prevent chain squashes during offload for now. Later this will be used to implement multichain offload. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-06-08 09:55:53 -04:00
David Howells	e754eba685	rxrpc: Provide a cmsg to specify the amount of Tx data for a call Provide a control message that can be specified on the first sendmsg() of a client call or the first sendmsg() of a service response to indicate the total length of the data to be transmitted for that call. Currently, because the length of the payload of an encrypted DATA packet is encrypted in front of the data, the packet cannot be encrypted until we know how much data it will hold. By specifying the length at the beginning of the transmit phase, each DATA packet length can be set before we start loading data from userspace (where several sendmsg() calls may contribute to a particular packet). An error will be returned if too little or too much data is presented in the Tx phase. Signed-off-by: David Howells <dhowells@redhat.com>	2017-06-07 17:15:46 +01:00
David Howells	515559ca21	rxrpc: Provide a getsockopt call to query what cmsgs types are supported Provide a getsockopt() call that can query what cmsg types are supported by AF_RXRPC.	2017-06-07 17:15:46 +01:00
David S. Miller	216fe8f021	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Just some simple overlapping changes in marvell PHY driver and the DSA core code. Signed-off-by: David S. Miller <davem@davemloft.net>	2017-06-06 22:20:08 -04:00
Russell King	c125ca0918	net: phy: add XAUI and 10GBASE-KR PHY connection types XAUI allows XGMII to reach an extended distance by using a XGXS layer at each end of the MAC to PHY link, operating over four Serdes lanes. 10GBASE-KR is a single lane Serdes backplane ethernet connection method with autonegotiation on the link. Some PHYs use this to connect to the ethernet interface at 10G speeds, switching to other connection types when utilising slower speeds. 10GBASE-KR is also used for XFI and SFI to connect to XFP and SFP fiber modules. Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-06-06 21:14:13 -04:00
Russell King	002ba7058a	net: phy: hook up clause 45 autonegotiation restart genphy_restart_aneg() can only restart autonegotiation on clause 22 PHYs. Add a phy_restart_aneg() function which selects between the clause 22 and clause 45 restart functionality depending on the PHY type and whether the Clause 45 PHY supports the Clause 22 register set. Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-06-06 21:14:13 -04:00
Russell King	5acde34a5a	net: phy: add 802.3 clause 45 support to phylib Add generic helpers for 802.3 clause 45 PHYs for >= 10Gbps support. Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-06-06 21:14:13 -04:00
Linus Torvalds	b29794ec95	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Pull networking fixes from David Miller: 1) Made TCP congestion control documentation match current reality, from Anmol Sarma. 2) Various build warning and failure fixes from Arnd Bergmann. 3) Fix SKB list leak in ipv6_gso_segment(). 4) Use after free in ravb driver, from Eugeniu Rosca. 5) Don't use udp_poll() in ping protocol driver, from Eric Dumazet. 6) Don't crash in PCI error recovery of cxgb4 driver, from Guilherme Piccoli. 7) _SRC_NAT_DONE_BIT needs to be cleared using atomics, from Liping Zhang. 8) Use after free in vxlan deletion, from Mark Bloch. 9) Fix ordering of NAPI poll enabled in ethoc driver, from Max Filippov. 10) Fix stmmac hangs with TSO, from Niklas Cassel. 11) Fix crash in CALIPSO ipv6, from Richard Haines. 12) Clear nh_flags properly on mpls link up. From Roopa Prabhu. 13) Fix regression in sk_err socket error queue handling, noticed by ping applications. From Soheil Hassas Yeganeh. 14) Update mlx4/mlx5 MAINTAINERS information. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (78 commits) net: stmmac: fix a broken u32 less than zero check net: stmmac: fix completely hung TX when using TSO net: ethoc: enable NAPI before poll may be scheduled net: bridge: fix a null pointer dereference in br_afspec ravb: Fix use-after-free on `ifconfig eth0 down` net/ipv6: Fix CALIPSO causing GPF with datagram support net: stmmac: ensure jumbo_frm error return is correctly checked for -ve value Revert "sit: reload iphdr in ipip6_rcv" i40e/i40evf: proper update of the page_offset field i40e: Fix state flags for bit set and clean operations of PF iwlwifi: fix host command memory leaks iwlwifi: fix min API version for 7265D, 3168, 8000 and 8265 iwlwifi: mvm: clear new beacon command template struct iwlwifi: mvm: don't fail when removing a key from an inexisting sta iwlwifi: pcie: only use d0i3 in suspend/resume if system_pm is set to d0i3 iwlwifi: mvm: fix firmware debug restart recording iwlwifi: tt: move ucode_loaded check under mutex iwlwifi: mvm: support ibss in dqa mode iwlwifi: mvm: Fix command queue number on d0i3 flow iwlwifi: mvm: rs: start using LQ command color ...	2017-06-06 14:30:17 -07:00
David Rientjes	abb2ea7dfd	compiler, clang: suppress warning for unused static inline functions GCC explicitly does not warn for unused static inline functions for -Wunused-function. The manual states: Warn whenever a static function is declared but not defined or a non-inline static function is unused. Clang does warn for static inline functions that are unused. It turns out that suppressing the warnings avoids potentially complex #ifdef directives, which also reduces LOC. Suppress the warning for clang. Signed-off-by: David Rientjes <rientjes@google.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2017-06-06 14:09:22 -07:00
Martin KaFai Lau	1e27097690	bpf: Add BPF_OBJ_GET_INFO_BY_FD A single BPF_OBJ_GET_INFO_BY_FD cmd is used to obtain the info for both bpf_prog and bpf_map. The kernel can figure out the fd is associated with a bpf_prog or bpf_map. The suggested struct bpf_prog_info and struct bpf_map_info are not meant to be a complete list and it is not the goal of this patch. New fields can be added in the future patch. The focus of this patch is to create the interface, BPF_OBJ_GET_INFO_BY_FD cmd for exposing the bpf_prog's and bpf_map's info. The obj's info, which will be extended (and get bigger) over time, is separated from the bpf_attr to avoid bloating the bpf_attr. Signed-off-by: Martin KaFai Lau <kafai@fb.com> Acked-by: Alexei Starovoitov <ast@fb.com> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-06-06 15:41:24 -04:00
Martin KaFai Lau	783d28dd11	bpf: Add jited_len to struct bpf_prog Add jited_len to struct bpf_prog. It will be useful for the struct bpf_prog_info which will be added in the later patch. Signed-off-by: Martin KaFai Lau <kafai@fb.com> Acked-by: Alexei Starovoitov <ast@fb.com> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-06-06 15:41:24 -04:00
Martin KaFai Lau	f3f1c054c2	bpf: Introduce bpf_map ID This patch generates an unique ID for each created bpf_map. The approach is similar to the earlier patch for bpf_prog ID. It is worth to note that the bpf_map's ID and bpf_prog's ID are in two independent ID spaces and both have the same valid range: [1, INT_MAX). Signed-off-by: Martin KaFai Lau <kafai@fb.com> Acked-by: Alexei Starovoitov <ast@fb.com> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-06-06 15:41:22 -04:00
Martin KaFai Lau	dc4bb0e235	bpf: Introduce bpf_prog ID This patch generates an unique ID for each BPF_PROG_LOAD-ed prog. It is worth to note that each BPF_PROG_LOAD-ed prog will have a different ID even they have the same bpf instructions. The ID is generated by the existing idr_alloc_cyclic(). The ID is ranged from [1, INT_MAX). It is allocated in cyclic manner, so an ID will get reused every 2 billion BPF_PROG_LOAD. The bpf_prog_alloc_id() is done after bpf_prog_select_runtime() because the jit process may have allocated a new prog. Hence, we need to ensure the value of pointer 'prog' will not be changed any more before storing the prog to the prog_idr. After bpf_prog_select_runtime(), the prog is read-only. Hence, the id is stored in 'struct bpf_prog_aux'. Signed-off-by: Martin KaFai Lau <kafai@fb.com> Acked-by: Alexei Starovoitov <ast@fb.com> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-06-06 15:41:22 -04:00
yuval.shaia@oracle.com	f8fe997546	net: phy: Delete unused function phy_ethtool_gset It's unused, so remove it. Signed-off-by: Yuval Shaia <yuval.shaia@oracle.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-06-06 15:12:28 -04:00
David S. Miller	bb36314054	Merge tag 'rxrpc-rewrite-20170606' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs David Howells says: ==================== rxrpc: Support service upgrade Here's a set of patches that allow AF_RXRPC to support the AuriStor service upgrade facility. This allows the server to change the service ID requested to an upgraded service if the client requests it upon the initiation of a connection. This is used by the AuriStor AFS-compatible servers to implement IPv6 handling and improved facilities by providing improved volume location, volume, protection, file and cache management services. Note that certain parts of the AFS protocol carry hard-coded IPv4 addresses. The reason AuriStor does it this way is that probing the improved service ID first will not incur an ABORT or any other response on some servers if the server is not listening on it - and so one have to employ a timeout. This is implemented in the server by allowing an AF_RXRPC server to call bind() twice on a socket to allow it to listen on two service IDs and then call setsockopt() to instruct the server to upgrade one into the other if the client requests it (by setting userStatus to 1 on the first DATA packet on a connection). If the upgrade occurs, all further operations on that connection are done with the new service ID. AF_RXRPC has to handle this automatically as connections are not exposed to userspace. Clients can request this facility by setting an RXRPC_UPGRADE_SERVICE command in the sendmsg() control buffer and then observing the resultant service ID in the msg_addr returned by recvmsg(). This should only be used to probe the service. Clients should then use the returned service ID in all subsequent communications with that server. Note that the kernel will not retain this information should the connection expire from its cache. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2017-06-06 12:05:57 -04:00
Linus Torvalds	ba7b2387ad	Merge branch 'for-4.12-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup Pull cgroup fixes from Tejun Heo: "Two cgroup fixes. One to address RCU delay of cpuset removal affecting userland visible behaviors. The other fixes a race condition between controller disable and cgroup removal" * 'for-4.12-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup: cpuset: consider dying css as offline cgroup: Prevent kill_css() from being called more than once	2017-06-05 15:37:03 -07:00
yuval.shaia@oracle.com	82c01a84d5	net/{mii, smsc}: Make mii_ethtool_get_link_ksettings and smc_netdev_get_ecmd return void Make return value void since functions never returns meaningfull value. Signed-off-by: Yuval Shaia <yuval.shaia@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-06-05 11:00:42 -04:00
David Howells	4e255721d1	rxrpc: Add service upgrade support for client connections Make it possible for a client to use AuriStor's service upgrade facility. The client does this by adding an RXRPC_UPGRADE_SERVICE control message to the first sendmsg() of a call. This takes no parameters. When recvmsg() starts returning data from the call, the service ID field in the returned msg_name will reflect the result of the upgrade attempt. If the upgrade was ignored, srx_service will match what was set in the sendmsg(); if the upgrade happened the srx_service will be altered to indicate the service the server upgraded to. Note that: (1) The choice of upgrade service is up to the server (2) Further client calls to the same server that would share a connection are blocked if an upgrade probe is in progress. (3) This should only be used to probe the service. Clients should then use the returned service ID in all subsequent communications with that server (and not set the upgrade). Note that the kernel will not retain this information should the connection expire from its cache. (4) If a server that supports upgrading is replaced by one that doesn't, whilst a connection is live, and if the replacement is running, say, OpenAFS 1.6.4 or older or an older IBM AFS, then the replacement server will not respond to packets sent to the upgraded connection. At this point, calls will time out and the server must be reprobed. Signed-off-by: David Howells <dhowells@redhat.com>	2017-06-05 14:30:49 +01:00
David Howells	4722974d90	rxrpc: Implement service upgrade Implement AuriStor's service upgrade facility. There are three problems that this is meant to deal with: (1) Various of the standard AFS RPC calls have IPv4 addresses in their requests and/or replies - but there's no room for including IPv6 addresses. (2) Definition of IPv6-specific RPC operations in the standard operation sets has not yet been achieved. (3) One could envision the creation a new service on the same port that as the original service. The new service could implement improved operations - and the client could try this first, falling back to the original service if it's not there. Unfortunately, certain servers ignore packets addressed to a service they don't implement and don't respond in any way - not even with an ABORT. This means that the client must then wait for the call timeout to occur. What service upgrade does is to see if the connection is marked as being 'upgradeable' and if so, change the service ID in the server and thus the request and reply formats. Note that the upgrade isn't mandatory - a server that supports only the original call set will ignore the upgrade request. In the protocol, the procedure is then as follows: (1) To request an upgrade, the first DATA packet in a new connection must have the userStatus set to 1 (this is normally 0). The userStatus value is normally ignored by the server. (2) If the server doesn't support upgrading, the reply packets will contain the same service ID as for the first request packet. (3) If the server does support upgrading, all future reply packets on that connection will contain the new service ID and the new service ID will be applied to all further calls on that connection as well. (4) The RPC op used to probe the upgrade must take the same request data as the shadow call in the upgrade set (but may return a different reply). GetCapability RPC ops were added to all standard sets for just this purpose. Ops where the request formats differ cannot be used for probing. (5) The client must wait for completion of the probe before sending any further RPC ops to the same destination. It should then use the service ID that recvmsg() reported back in all future calls. (6) The shadow service must have call definitions for all the operation IDs defined by the original service. To support service upgrading, a server should: (1) Call bind() twice on its AF_RXRPC socket before calling listen(). Each bind() should supply a different service ID, but the transport addresses must be the same. This allows the server to receive requests with either service ID. (2) Enable automatic upgrading by calling setsockopt(), specifying RXRPC_UPGRADEABLE_SERVICE and passing in a two-member array of unsigned shorts as the argument: unsigned short optval[2]; This specifies a pair of service IDs. They must be different and must match the service IDs bound to the socket. Member 0 is the service ID to upgrade from and member 1 is the service ID to upgrade to. Signed-off-by: David Howells <dhowells@redhat.com>	2017-06-05 14:30:49 +01:00
Talat Batheesh	6dc06c08be	net/mlx4: Fix the check in attaching steering rules Our previous patch (cited below) introduced a regression for RAW Eth QPs. Fix it by checking if the QP number provided by user-space exists, hence allowing steering rules to be added for valid QPs only. Fixes: `89c557687a` ("net/mlx4_en: Avoid adding steering rules with invalid ring") Reported-by: Or Gerlitz <gerlitz.or@gmail.com> Signed-off-by: Talat Batheesh <talatb@mellanox.com> Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Acked-by: Or Gerlitz <ogerlitz@mellanox.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-06-04 23:10:05 -04:00
Mintz, Yuval	cbb8a12c08	qed: VF XDP support The final addition on the qed front - - VFs would now require their PFs to provide multiple CIDs - Based on the availability of connections from PF, determine whether XDP is feasible and share it with qede via dev_info. Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-06-04 23:08:31 -04:00
Mintz, Yuval	08bc8f15e6	qed: Multiple qzone queues for VFs This adds the infrastructure for supporting VFs that want to open multiple transmission queues on the same queue-zone. At this point, there are no VFs that actually request this functionality, but later patches would remedy that. a. VF and PF would communicate the capability during ACQUIRE; Legacy VFs would continue on behaving as they do today b. PF would communicate number of supported CIDs to the VF and would enforce said limitation c. Whenever VF passes a request for a given queue configuration it would also pass an associated index within said queue-zone Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-06-04 23:08:31 -04:00
Mintz, Yuval	f604b17d7f	qed*: L2 interface to use the SB structures directly Part of an effort of a cleaner seperation between qed and the protocol drivers, the L2 interface is to use the SB structure for initialization purposes opaquely. Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-06-04 23:08:30 -04:00
Jason A. Donenfeld	48a1df6533	skbuff: return -EMSGSIZE in skb_to_sgvec to prevent overflow This is a defense-in-depth measure in response to bugs like `4d6fa57b4d` ("macsec: avoid heap overflow in skb_to_sgvec"). There's not only a potential overflow of sglist items, but also a stack overflow potential, so we fix this by limiting the amount of recursion this function is allowed to do. Not actually providing a bounded base case is a future disaster that we can easily avoid here. As a small matter of house keeping, we take this opportunity to move the documentation comment over the actual function the documentation is for. While this could be implemented by using an explicit stack of skbuffs, when implementing this, the function complexity increased considerably, and I don't think such complexity and bloat is actually worth it. So, instead I built this and tested it on x86, x86_64, ARM, ARM64, and MIPS, and measured the stack usage there. I also reverted the recent MIPS changes that give it a separate IRQ stack, so that I could experience some worst-case situations. I found that limiting it to 24 layers deep yielded a good stack usage with room for safety, as well as being much deeper than any driver actually ever creates. Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Cc: Steffen Klassert <steffen.klassert@secunet.com> Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: "David S. Miller" <davem@davemloft.net> Cc: David Howells <dhowells@redhat.com> Cc: Sabrina Dubroca <sd@queasysnail.net> Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Jason Wang <jasowang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-06-04 23:01:47 -04:00
Alexei Starovoitov	f91840a32d	perf, bpf: Add BPF support to all perf_event types Allow BPF_PROG_TYPE_PERF_EVENT program types to attach to all perf_event types, including HW_CACHE, RAW, and dynamic pmu events. Only tracepoint/kprobe events are treated differently which require BPF_PROG_TYPE_TRACEPOINT/BPF_PROG_TYPE_KPROBE program types accordingly. Also add support for reading all event counters using bpf_perf_event_read() helper. Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-06-04 21:58:01 -04:00
Linus Torvalds	55cbdaf639	Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma Pull rdma fixes from Doug Ledford: "For the most part this is just a minor -rc cycle for the rdma subsystem. Even given that this is all of the -rc patches since the merge window closed, it's still only about 25 patches: - Multiple i40iw, nes, iw_cxgb4, hfi1, qib, mlx4, mlx5 fixes - A few upper layer protocol fixes (IPoIB, iSER, SRP) - A modest number of core fixes" * tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma: (26 commits) RDMA/SA: Fix kernel panic in CMA request handler flow RDMA/umem: Fix missing mmap_sem in get umem ODP call RDMA/core: not to set page dirty bit if it's already set. RDMA/uverbs: Declare local function static and add brackets to sizeof RDMA/netlink: Reduce exposure of RDMA netlink functions RDMA/srp: Fix NULL deref at srp_destroy_qp() RDMA/IPoIB: Limit the ipoib_dev_uninit_default scope RDMA/IPoIB: Replace netdev_priv with ipoib_priv for ipoib_get_link_ksettings RDMA/qedr: add null check before pointer dereference RDMA/mlx5: set UMR wqe fence according to HCA cap net/mlx5: Define interface bits for fencing UMR wqe RDMA/mlx4: Fix MAD tunneling when SRIOV is enabled RDMA/qib,hfi1: Fix MR reference count leak on write with immediate RDMA/hfi1: Defer setting VL15 credits to link-up interrupt RDMA/hfi1: change PCI bar addr assignments to Linux API functions RDMA/hfi1: fix array termination by appending NULL to attr array RDMA/iw_cxgb4: fix the calculation of ipv6 header size RDMA/iw_cxgb4: calculate t4_eq_status_entries properly RDMA/iw_cxgb4: Avoid touch after free error in ARP failure handlers RDMA/nes: ACK MPA Reply frame ...	2017-06-04 10:41:32 -07:00
Michal Hocko	864b9a393d	mm: consider memblock reservations for deferred memory initialization sizing We have seen an early OOM killer invocation on ppc64 systems with crashkernel=4096M: kthreadd invoked oom-killer: gfp_mask=0x16040c0(GFP_KERNEL\|__GFP_COMP\|__GFP_NOTRACK), nodemask=7, order=0, oom_score_adj=0 kthreadd cpuset=/ mems_allowed=7 CPU: 0 PID: 2 Comm: kthreadd Not tainted 4.4.68-1.gd7fe927-default #1 Call Trace: dump_stack+0xb0/0xf0 (unreliable) dump_header+0xb0/0x258 out_of_memory+0x5f0/0x640 __alloc_pages_nodemask+0xa8c/0xc80 kmem_getpages+0x84/0x1a0 fallback_alloc+0x2a4/0x320 kmem_cache_alloc_node+0xc0/0x2e0 copy_process.isra.25+0x260/0x1b30 _do_fork+0x94/0x470 kernel_thread+0x48/0x60 kthreadd+0x264/0x330 ret_from_kernel_thread+0x5c/0xa4 Mem-Info: active_anon:0 inactive_anon:0 isolated_anon:0 active_file:0 inactive_file:0 isolated_file:0 unevictable:0 dirty:0 writeback:0 unstable:0 slab_reclaimable:5 slab_unreclaimable:73 mapped:0 shmem:0 pagetables:0 bounce:0 free:0 free_pcp:0 free_cma:0 Node 7 DMA free:0kB min:0kB low:0kB high:0kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:52428800kB managed:110016kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:320kB slab_unreclaimable:4672kB kernel_stack:1152kB pagetables:0kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes lowmem_reserve[]: 0 0 0 0 Node 7 DMA: 064kB 0128kB 0256kB 0512kB 01024kB 02048kB 04096kB 08192kB 0*16384kB = 0kB 0 total pagecache pages 0 pages in swap cache Swap cache stats: add 0, delete 0, find 0/0 Free swap = 0kB Total swap = 0kB 819200 pages RAM 0 pages HighMem/MovableOnly 817481 pages reserved 0 pages cma reserved 0 pages hwpoisoned the reason is that the managed memory is too low (only 110MB) while the rest of the the 50GB is still waiting for the deferred intialization to be done. update_defer_init estimates the initial memoty to initialize to 2GB at least but it doesn't consider any memory allocated in that range. In this particular case we've had Reserving 4096MB of memory at 128MB for crashkernel (System RAM: 51200MB) so the low 2GB is mostly depleted. Fix this by considering memblock allocations in the initial static initialization estimation. Move the max_initialise to reset_deferred_meminit and implement a simple memblock_reserved_memory helper which iterates all reserved blocks and sums the size of all that start below the given address. The cumulative size is than added on top of the initial estimation. This is still not ideal because reset_deferred_meminit doesn't consider holes and so reservation might be above the initial estimation whihch we ignore but let's make the logic simpler until we really need to handle more complicated cases. Fixes: `3a80a7fa79` ("mm: meminit: initialise a subset of struct pages if CONFIG_DEFERRED_STRUCT_PAGE_INIT is set") Link: http://lkml.kernel.org/r/20170531104010.GI27783@dhcp22.suse.cz Signed-off-by: Michal Hocko <mhocko@suse.com> Acked-by: Mel Gorman <mgorman@suse.de> Tested-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Cc: <stable@vger.kernel.org> [4.2+] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2017-06-02 15:07:38 -07:00
James Morse	9a291a7c94	mm/hugetlb: report -EHWPOISON not -EFAULT when FOLL_HWPOISON is specified KVM uses get_user_pages() to resolve its stage2 faults. KVM sets the FOLL_HWPOISON flag causing faultin_page() to return -EHWPOISON when it finds a VM_FAULT_HWPOISON. KVM handles these hwpoison pages as a special case. (check_user_page_hwpoison()) When huge pages are involved, this doesn't work so well. get_user_pages() calls follow_hugetlb_page(), which stops early if it receives VM_FAULT_HWPOISON from hugetlb_fault(), eventually returning -EFAULT to the caller. The step to map this to -EHWPOISON based on the FOLL_ flags is missing. The hwpoison special case is skipped, and -EFAULT is returned to user-space, causing Qemu or kvmtool to exit. Instead, move this VM_FAULT_ to errno mapping code into a header file and use it from faultin_page() and follow_hugetlb_page(). With this, KVM works as expected. This isn't a problem for arm64 today as we haven't enabled MEMORY_FAILURE, but I can't see any reason this doesn't happen on x86 too, so I think this should be a fix. This doesn't apply earlier than stable's v4.11.1 due to all sorts of cleanup. [james.morse@arm.com: add vm_fault_to_errno() call to faultin_page()] suggested. Link: http://lkml.kernel.org/r/20170525171035.16359-1-james.morse@arm.com [akpm@linux-foundation.org: coding-style fixes] Link: http://lkml.kernel.org/r/20170524160900.28786-1-james.morse@arm.com Signed-off-by: James Morse <james.morse@arm.com> Acked-by: Punit Agrawal <punit.agrawal@arm.com> Acked-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Cc: "Kirill A . Shutemov" <kirill.shutemov@linux.intel.com> Cc: <stable@vger.kernel.org> [4.11.1+] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2017-06-02 15:07:38 -07:00
Matthias Kaehlcke	60b0a8c3d2	frv: declare jiffies to be located in the .data section Commit `7c30f352c8` ("jiffies.h: declare jiffies and jiffies_64 with ____cacheline_aligned_in_smp") removed a section specification from the jiffies declaration that caused conflicts on some platforms. Unfortunately this change broke the build for frv: kernel/built-in.o: In function `__do_softirq': (.text+0x6460): relocation truncated to fit: R_FRV_GPREL12 against symbol `jiffies' defined in ABS section in .tmp_vmlinux1 kernel/built-in.o: In function `__do_softirq': (.text+0x6574): relocation truncated to fit: R_FRV_GPREL12 against symbol `jiffies' defined in ABS section in .tmp_vmlinux1 kernel/built-in.o: In function `pwq_activate_delayed_work': workqueue.c:(.text+0x15b9c): relocation truncated to fit: R_FRV_GPREL12 against symbol `jiffies' defined in ABS section in .tmp_vmlinux1 ... Add __jiffy_arch_data to the declaration of jiffies and use it on frv to include the section specification. For all other platforms __jiffy_arch_data (currently) has no effect. Fixes: `7c30f352c8` ("jiffies.h: declare jiffies and jiffies_64 with ____cacheline_aligned_in_smp") Link: http://lkml.kernel.org/r/20170516221333.177280-1-mka@chromium.org Signed-off-by: Matthias Kaehlcke <mka@chromium.org> Reported-by: Guenter Roeck <linux@roeck-us.net> Tested-by: Guenter Roeck <linux@roeck-us.net> Reviewed-by: David Howells <dhowells@redhat.com> Cc: Sudip Mukherjee <sudipm.mukherjee@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2017-06-02 15:07:37 -07:00
Michal Hocko	1bde33e051	include/linux/gfp.h: fix ___GFP_NOLOCKDEP value Igor Stoppa has noticed that __GFP_NOLOCKDEP can use a lower bit. At the time commit `7e7844226f` ("lockdep: allow to disable reclaim lockup detection") was written we still had __GFP_OTHER_NODE but I have removed it in commit `41b6167e8f` ("mm: get rid of __GFP_OTHER_NODE") and forgot to lower the bit value. The current value is outside of __GFP_BITS_SHIFT so it cannot be used actually. Fixes: `7e7844226f` ("lockdep: allow to disable reclaim lockup detection") Signed-off-by: Michal Hocko <mhocko@suse.com> Reported-by: Igor Stoppa <igor.stoppa@nokia.com> Acked-by: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2017-06-02 15:07:37 -07:00
David S. Miller	6e7da286e3	Merge branch '40GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue Jeff Kirsher says: ==================== 40GbE Intel Wired LAN Driver Updates 2017-06-01 This series contains updates to i40e, i40evf and the "new" AVF virtchnl. This is the introduction of the Intel(R) Ethernet Adaptive Virtual Function driver code and device ID, as presented at the NetDEV 1.2 conference in 2016. http://netdevconf.org/1.2/session.html?anjali-singhai The idea is to convert the interface between the i40evf driver and the parent i40e PF driver to be generic, as the i40evf driver should in the future be able to run on top of other Intel PF drivers, and negotiate any features beyond a "base expected" set. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2017-06-02 13:47:29 -04:00
Mintz, Yuval	dc4528e9e8	qed: Add support for changing iSCSI mac Enhance API between qedi and qed, allowing qedi to inform device's firmware when the iSCSI mac is to be changed. Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-06-02 10:33:03 -04:00
Mintz, Yuval	20675b37ee	qed: Support NVM-image reading API Storage drivers require images from the nvram in boot-from-SAN scenarios. This provides the necessary API between qed and the protocol drivers to perform such reads. Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-06-02 10:33:03 -04:00
Mintz, Yuval	3c5da94278	qed: Share additional information with qedf Share several new tidbits with qedf: - wwpn & wwnn - Absolute pf-id [this one is actually meant for qedi as well] - Number of available CQs While we're at it, now that qedf will be aware of the available CQs we can add some validation on the inputs it provides. Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-06-02 10:33:03 -04:00
Zhang Shengju	3a5f8997dc	team: add macro MODULE_ALIAS_TEAM_MODE for team mode alias Add a new macro MODULE_ALIAS_TEAM_MODE to unify and simplify the declaration of team mode alias. Signed-off-by: Zhang Shengju <zhangshengju@cmss.chinamobile.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-06-02 10:20:49 -04:00
Linus Torvalds	3b1e342be2	Merge tag 'nfsd-4.12-1' of git://linux-nfs.org/~bfields/linux Pull nfsd fixes from Bruce Fields: "Revert patch accidentally included in the merge window pull request, and fix a crash that was likely a result of buggy client behavior" * tag 'nfsd-4.12-1' of git://linux-nfs.org/~bfields/linux: nfsd4: fix null dereference on replay nfsd: Revert "nfsd: check for oversized NFSv2/v3 arguments"	2017-06-01 16:24:48 -07:00
Sridhar Samudrala	73556269aa	virtchnl: Add compile time static asserts to validate structure sizes This uses preprocessor tricks to make sure that a divide by zero occurs if a struct changes size outside the expected number of bytes. Signed-off-by: Sridhar Samudrala <sridhar.samudrala@intel.com> Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-06-01 14:26:23 -07:00
Sridhar Samudrala	a33c83c435	virtchnl: Add pad fields to a couple of structures This removes holes and makes structure sizes consistent across 32 and 64 bit builds. Signed-off-by: Sridhar Samudrala <sridhar.samudrala@intel.com> Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-06-01 14:23:55 -07:00
Jesse Brandeburg	735e35c56b	i40e/virtchnl: move function to virtchnl This moves a function that is needed for the virtchnl interface from the i40e PF driver over to the virtchnl.h file. It was manually verified that the function in question is unchanged except for the function name and function header, which explains the slight difference in the number of lines removed/added. Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-06-01 14:22:53 -07:00
Jesse Brandeburg	ff3f4cc267	virtchnl: finish conversion to virtchnl interface This patch implements the complete version of the virtchnl.h file with final renames, and fixes the related code in i40e and i40evf. It also expands comments, and adds details on the usage of certain fields. In addition, due to the changes a couple of casts are needed to prevent errors found by sparse after renaming some fields. Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-06-01 14:21:27 -07:00
Jesse Brandeburg	f0adc6e831	i40evf/virtchnl: whitespace cleanups This patch fixes up a bunch of whitespace issues introduced by the previous automated change of name from i40e to virtchnl. Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-06-01 14:19:14 -07:00
Jesse Brandeburg	764430ce6f	i40e/virtchnl: refactor code for validate checks This change updates the arguments passed to the validate function and fixes the caller, as well as uses the new return values added to virtchnl.h One other minor tweak, remove a duplicate set to zero of valid_len. This is in preparation for moving the function to virtchnl.h. Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-06-01 14:17:02 -07:00
Jesse Brandeburg	eedcfef85b	virtchnl: convert to new macros As part of the conversion, change the arguments to VF_IS_V1[01] macros and move them to virtchnl.h Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-06-01 14:15:21 -07:00
Jesse Brandeburg	310a2ad92e	virtchnl: rename i40e to generic virtchnl This morphs all the i40e and i40evf references to/in virtchnl.h to be generic, using only automated methods. Updates all the callers to use the new names. A followup patch provides separate clean ups for messy line conversions from these "automatic" changes, to make them more reviewable. Was executed with the following sed script: sed -i -f transform_script drivers/net/ethernet/intel/i40e/i40e_client.c sed -i -f transform_script drivers/net/ethernet/intel/i40e/i40e_prototype.h sed -i -f transform_script drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c sed -i -f transform_script drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.h sed -i -f transform_script drivers/net/ethernet/intel/i40evf/i40e_common.c sed -i -f transform_script drivers/net/ethernet/intel/i40evf/i40e_prototype.h sed -i -f transform_script drivers/net/ethernet/intel/i40evf/i40evf.h sed -i -f transform_script drivers/net/ethernet/intel/i40evf/i40evf_client.c sed -i -f transform_script drivers/net/ethernet/intel/i40evf/i40evf_main.c sed -i -f transform_script drivers/net/ethernet/intel/i40evf/i40evf_virtchnl.c sed -i -f transform_script include/linux/avf/virtchnl.h transform_script: ----8<---- s/I40E_VIRTCHNL_SUPPORTED_QTYPES/SAVE_ME_SUPPORTED_QTYPES/g s/I40E_VIRTCHNL_VF_CAP/SAVE_ME_VF_CAP/g s/I40E_VIRTCHNL_/VIRTCHNL_/g s/i40e_virtchnl_/virtchnl_/g s/i40e_vfr_/virtchnl_vfr_/g s/I40E_VFR_/VIRTCHNL_VFR_/g s/VIRTCHNL_OP_ADD_ETHER_ADDRESS/VIRTCHNL_OP_ADD_ETH_ADDR/g s/VIRTCHNL_OP_DEL_ETHER_ADDRESS/VIRTCHNL_OP_DEL_ETH_ADDR/g s/VIRTCHNL_OP_FCOE/VIRTCHNL_OP_RSVD/g s/SAVE_ME_SUPPORTED_QTYPES/I40E_VIRTCHNL_SUPPORTED_QTYPES/g s/SAVE_ME_VF_CAP/I40E_VIRTCHNL_VF_CAP/g ----8<---- Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-06-01 14:08:53 -07:00
Max Gurtovoy	1410a90ae4	net/mlx5: Define interface bits for fencing UMR wqe HW can implement UMR wqe re-transmission in various ways. Thus, add HCA cap to distinguish the needed fence for UMR to make sure that the wqe wouldn't fail on mkey checks. Signed-off-by: Max Gurtovoy <maxg@mellanox.com> Acked-by: Leon Romanovsky <leon@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-06-01 17:05:04 -04:00
Jesse Brandeburg	681bdf80cf	i40e/i40evf: create and use new unified header file This moves a header for i40evf to include/linux/avf/virtchnl.h. The directory name AVF is an acronym for the Intel(R) Adaptive Virtual Function. This first step creates the new file, which is a rename of drivers/net/ethernet/intel/i40evf/i40e_virtchnl.h to include/linux/avf/virtchnl.h, and should show up in git as a rename when using git log --follow. To keep things building after the move, the changes to the i40evf driver are made to point to the new include file location. Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-06-01 14:04:42 -07:00
LABBE Corentin	9f93ac8d40	net-next: stmmac: Add dwmac-sun8i The dwmac-sun8i is a heavy hacked version of stmmac hardware by allwinner. In fact the only common part is the descriptor management and the first register function. Signed-off-by: Corentin Labbe <clabbe.montjoie@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-06-01 14:53:04 -04:00
LABBE Corentin	ec33d71de7	net-next: stmmac: add optional setup function Instead of adding more ifthen logic for adding a new mac_device_info setup function, it is easier to add a function pointer to the function needed. Signed-off-by: Corentin Labbe <clabbe.montjoie@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-06-01 14:53:03 -04:00
Mintz, Yuval	726fdbe9fa	qed: Encapsulate interrupt counters in struct We already have an API struct that contains interrupt-related numbers. Use it to encapsulate all information relating to the status of SBs as (used\|free). Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-06-01 12:17:18 -04:00

1 2 3 4 5 ...

55717 Commits