linux

mirror of https://github.com/hardkernel/linux.git synced 2026-06-07 19:30:30 +09:00

Author	SHA1	Message	Date
Taehee Yoo	0a5eca2c94	gtp: fix Illegal context switch in RCU read-side critical section. [ Upstream commit `3f167e1921` ] ipv4_pdp_add() is called in RCU read-side critical section. So GFP_KERNEL should not be used in the function. This patch make ipv4_pdp_add() to use GFP_ATOMIC instead of GFP_KERNEL. Test commands: gtp-link add gtp1 & gtp-tunnel add gtp1 v1 100 200 1.1.1.1 2.2.2.2 Splat looks like: [ 130.618881] ============================= [ 130.626382] WARNING: suspicious RCU usage [ 130.626994] 5.2.0-rc6+ #50 Not tainted [ 130.627622] ----------------------------- [ 130.628223] ./include/linux/rcupdate.h:266 Illegal context switch in RCU read-side critical section! [ 130.629684] [ 130.629684] other info that might help us debug this: [ 130.629684] [ 130.631022] [ 130.631022] rcu_scheduler_active = 2, debug_locks = 1 [ 130.632136] 4 locks held by gtp-tunnel/1025: [ 130.632925] #0: 000000002b93c8b7 (cb_lock){++++}, at: genl_rcv+0x15/0x40 [ 130.634159] #1: 00000000f17bc999 (genl_mutex){+.+.}, at: genl_rcv_msg+0xfb/0x130 [ 130.635487] #2: 00000000c644ed8e (rtnl_mutex){+.+.}, at: gtp_genl_new_pdp+0x18c/0x1150 [gtp] [ 130.636936] #3: 0000000007a1cde7 (rcu_read_lock){....}, at: gtp_genl_new_pdp+0x187/0x1150 [gtp] [ 130.638348] [ 130.638348] stack backtrace: [ 130.639062] CPU: 1 PID: 1025 Comm: gtp-tunnel Not tainted 5.2.0-rc6+ #50 [ 130.641318] Call Trace: [ 130.641707] dump_stack+0x7c/0xbb [ 130.642252] ___might_sleep+0x2c0/0x3b0 [ 130.642862] kmem_cache_alloc_trace+0x1cd/0x2b0 [ 130.643591] gtp_genl_new_pdp+0x6c5/0x1150 [gtp] [ 130.644371] genl_family_rcv_msg+0x63a/0x1030 [ 130.645074] ? mutex_lock_io_nested+0x1090/0x1090 [ 130.645845] ? genl_unregister_family+0x630/0x630 [ 130.646592] ? debug_show_all_locks+0x2d0/0x2d0 [ 130.647293] ? check_flags.part.40+0x440/0x440 [ 130.648099] genl_rcv_msg+0xa3/0x130 [ ... ] Fixes: `459aa660eb` ("gtp: add initial driver for datapath of GPRS Tunneling Protocol (GTP-U)") Signed-off-by: Taehee Yoo <ap420073@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>	2019-07-26 09:14:17 +02:00
Taehee Yoo	e117a04133	gtp: fix suspicious RCU usage [ Upstream commit `e198987e7d` ] gtp_encap_enable_socket() and gtp_encap_destroy() are not protected by rcu_read_lock(). and it's not safe to write sk->sk_user_data. This patch make these functions to use lock_sock() instead of rcu_dereference_sk_user_data(). Test commands: gtp-link add gtp1 Splat looks like: [ 83.238315] ============================= [ 83.239127] WARNING: suspicious RCU usage [ 83.239702] 5.2.0-rc6+ #49 Not tainted [ 83.240268] ----------------------------- [ 83.241205] drivers/net/gtp.c:799 suspicious rcu_dereference_check() usage! [ 83.243828] [ 83.243828] other info that might help us debug this: [ 83.243828] [ 83.246325] [ 83.246325] rcu_scheduler_active = 2, debug_locks = 1 [ 83.247314] 1 lock held by gtp-link/1008: [ 83.248523] #0: 0000000017772c7f (rtnl_mutex){+.+.}, at: __rtnl_newlink+0x5f5/0x11b0 [ 83.251503] [ 83.251503] stack backtrace: [ 83.252173] CPU: 0 PID: 1008 Comm: gtp-link Not tainted 5.2.0-rc6+ #49 [ 83.253271] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006 [ 83.254562] Call Trace: [ 83.254995] dump_stack+0x7c/0xbb [ 83.255567] gtp_encap_enable_socket+0x2df/0x360 [gtp] [ 83.256415] ? gtp_find_dev+0x1a0/0x1a0 [gtp] [ 83.257161] ? memset+0x1f/0x40 [ 83.257843] gtp_newlink+0x90/0xa21 [gtp] [ 83.258497] ? __netlink_ns_capable+0xc3/0xf0 [ 83.259260] __rtnl_newlink+0xb9f/0x11b0 [ 83.260022] ? rtnl_link_unregister+0x230/0x230 [ ... ] Fixes: `1e3a3abd8b` ("gtp: make GTP sockets in gtp_newlink optional") Signed-off-by: Taehee Yoo <ap420073@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>	2019-07-26 09:14:17 +02:00
csonsino	202de90df2	Bluetooth: validate BLE connection interval updates [ Upstream commit `c49a8682fc` ] Problem: The Linux Bluetooth stack yields complete control over the BLE connection interval to the remote device. The Linux Bluetooth stack provides access to the BLE connection interval min and max values through /sys/kernel/debug/bluetooth/hci0/ conn_min_interval and /sys/kernel/debug/bluetooth/hci0/conn_max_interval. These values are used for initial BLE connections, but the remote device has the ability to request a connection parameter update. In the event that the remote side requests to change the connection interval, the Linux kernel currently only validates that the desired value is within the acceptable range in the Bluetooth specification (6 - 3200, corresponding to 7.5ms - 4000ms). There is currently no validation that the desired value requested by the remote device is within the min/max limits specified in the conn_min_interval/conn_max_interval configurations. This essentially leads to Linux yielding complete control over the connection interval to the remote device. The proposed patch adds a verification step to the connection parameter update mechanism, ensuring that the desired value is within the min/max bounds of the current connection. If the desired value is outside of the current connection min/max values, then the connection parameter update request is rejected and the negative response is returned to the remote device. Recall that the initial connection is established using the local conn_min_interval/conn_max_interval values, so this allows the Linux administrator to retain control over the BLE connection interval. The one downside that I see is that the current default Linux values for conn_min_interval and conn_max_interval typically correspond to 30ms and 50ms respectively. If this change were accepted, then it is feasible that some devices would no longer be able to negotiate to their desired connection interval values. This might be remedied by setting the default Linux conn_min_interval and conn_max_interval values to the widest supported range (6 - 3200 / 7.5ms - 4000ms). This could lead to the same behavior as the current implementation, where the remote device could request to change the connection interval value to any value that is permitted by the Bluetooth specification, and Linux would accept the desired value. Signed-off-by: Carey Sonsino <csonsino@gmail.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2019-07-26 09:14:17 +02:00
Taehee Yoo	ca33af18b5	gtp: add missing gtp_encap_disable_sock() in gtp_encap_enable() [ Upstream commit `e30155fd23` ] If an invalid role is sent from user space, gtp_encap_enable() will fail. Then, it should call gtp_encap_disable_sock() but current code doesn't. It makes memory leak. Fixes: `91ed81f9ab` ("gtp: support SGSN-side tunnels") Signed-off-by: Taehee Yoo <ap420073@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>	2019-07-26 09:14:17 +02:00
Matias Karhumaa	0fdb922d0e	Bluetooth: Check state in l2cap_disconnect_rsp [ Upstream commit `28261da8a2` ] Because of both sides doing L2CAP disconnection at the same time, it was possible to receive L2CAP Disconnection Response with CID that was already freed. That caused problems if CID was already reused and L2CAP Connection Request with same CID was sent out. Before this patch kernel deleted channel context regardless of the state of the channel. Example where leftover Disconnection Response (frame #402) causes local device to delete L2CAP channel which was not yet connected. This in turn confuses remote device's stack because same CID is re-used without properly disconnecting. Btmon capture before patch: snip > ACL Data RX: Handle 43 flags 0x02 dlen 8 #394 [hci1] 10.748949 Channel: 65 len 4 [PSM 3 mode 0] {chan 2} RFCOMM: Disconnect (DISC) (0x43) Address: 0x03 cr 1 dlci 0x00 Control: 0x53 poll/final 1 Length: 0 FCS: 0xfd < ACL Data TX: Handle 43 flags 0x00 dlen 8 #395 [hci1] 10.749062 Channel: 65 len 4 [PSM 3 mode 0] {chan 2} RFCOMM: Unnumbered Ack (UA) (0x63) Address: 0x03 cr 1 dlci 0x00 Control: 0x73 poll/final 1 Length: 0 FCS: 0xd7 < ACL Data TX: Handle 43 flags 0x00 dlen 12 #396 [hci1] 10.749073 L2CAP: Disconnection Request (0x06) ident 17 len 4 Destination CID: 65 Source CID: 65 > HCI Event: Number of Completed Packets (0x13) plen 5 #397 [hci1] 10.752391 Num handles: 1 Handle: 43 Count: 1 > HCI Event: Number of Completed Packets (0x13) plen 5 #398 [hci1] 10.753394 Num handles: 1 Handle: 43 Count: 1 > ACL Data RX: Handle 43 flags 0x02 dlen 12 #399 [hci1] 10.756499 L2CAP: Disconnection Request (0x06) ident 26 len 4 Destination CID: 65 Source CID: 65 < ACL Data TX: Handle 43 flags 0x00 dlen 12 #400 [hci1] 10.756548 L2CAP: Disconnection Response (0x07) ident 26 len 4 Destination CID: 65 Source CID: 65 < ACL Data TX: Handle 43 flags 0x00 dlen 12 #401 [hci1] 10.757459 L2CAP: Connection Request (0x02) ident 18 len 4 PSM: 1 (0x0001) Source CID: 65 > ACL Data RX: Handle 43 flags 0x02 dlen 12 #402 [hci1] 10.759148 L2CAP: Disconnection Response (0x07) ident 17 len 4 Destination CID: 65 Source CID: 65 = bluetoothd: 00:1E:AB:4C:56:54: error updating services: Input/o.. 10.759447 > HCI Event: Number of Completed Packets (0x13) plen 5 #403 [hci1] 10.759386 Num handles: 1 Handle: 43 Count: 1 > ACL Data RX: Handle 43 flags 0x02 dlen 12 #404 [hci1] 10.760397 L2CAP: Connection Request (0x02) ident 27 len 4 PSM: 3 (0x0003) Source CID: 65 < ACL Data TX: Handle 43 flags 0x00 dlen 16 #405 [hci1] 10.760441 L2CAP: Connection Response (0x03) ident 27 len 8 Destination CID: 65 Source CID: 65 Result: Connection successful (0x0000) Status: No further information available (0x0000) < ACL Data TX: Handle 43 flags 0x00 dlen 27 #406 [hci1] 10.760449 L2CAP: Configure Request (0x04) ident 19 len 19 Destination CID: 65 Flags: 0x0000 Option: Maximum Transmission Unit (0x01) [mandatory] MTU: 1013 Option: Retransmission and Flow Control (0x04) [mandatory] Mode: Basic (0x00) TX window size: 0 Max transmit: 0 Retransmission timeout: 0 Monitor timeout: 0 Maximum PDU size: 0 > HCI Event: Number of Completed Packets (0x13) plen 5 #407 [hci1] 10.761399 Num handles: 1 Handle: 43 Count: 1 > ACL Data RX: Handle 43 flags 0x02 dlen 16 #408 [hci1] 10.762942 L2CAP: Connection Response (0x03) ident 18 len 8 Destination CID: 66 Source CID: 65 Result: Connection successful (0x0000) Status: No further information available (0x0000) snip Similar case after the patch: snip > ACL Data RX: Handle 43 flags 0x02 dlen 8 #22702 [hci0] 1664.411056 Channel: 65 len 4 [PSM 3 mode 0] {chan 3} RFCOMM: Disconnect (DISC) (0x43) Address: 0x03 cr 1 dlci 0x00 Control: 0x53 poll/final 1 Length: 0 FCS: 0xfd < ACL Data TX: Handle 43 flags 0x00 dlen 8 #22703 [hci0] 1664.411136 Channel: 65 len 4 [PSM 3 mode 0] {chan 3} RFCOMM: Unnumbered Ack (UA) (0x63) Address: 0x03 cr 1 dlci 0x00 Control: 0x73 poll/final 1 Length: 0 FCS: 0xd7 < ACL Data TX: Handle 43 flags 0x00 dlen 12 #22704 [hci0] 1664.411143 L2CAP: Disconnection Request (0x06) ident 11 len 4 Destination CID: 65 Source CID: 65 > HCI Event: Number of Completed Pac.. (0x13) plen 5 #22705 [hci0] 1664.414009 Num handles: 1 Handle: 43 Count: 1 > HCI Event: Number of Completed Pac.. (0x13) plen 5 #22706 [hci0] 1664.415007 Num handles: 1 Handle: 43 Count: 1 > ACL Data RX: Handle 43 flags 0x02 dlen 12 #22707 [hci0] 1664.418674 L2CAP: Disconnection Request (0x06) ident 17 len 4 Destination CID: 65 Source CID: 65 < ACL Data TX: Handle 43 flags 0x00 dlen 12 #22708 [hci0] 1664.418762 L2CAP: Disconnection Response (0x07) ident 17 len 4 Destination CID: 65 Source CID: 65 < ACL Data TX: Handle 43 flags 0x00 dlen 12 #22709 [hci0] 1664.421073 L2CAP: Connection Request (0x02) ident 12 len 4 PSM: 1 (0x0001) Source CID: 65 > ACL Data RX: Handle 43 flags 0x02 dlen 12 #22710 [hci0] 1664.421371 L2CAP: Disconnection Response (0x07) ident 11 len 4 Destination CID: 65 Source CID: 65 > HCI Event: Number of Completed Pac.. (0x13) plen 5 #22711 [hci0] 1664.424082 Num handles: 1 Handle: 43 Count: 1 > HCI Event: Number of Completed Pac.. (0x13) plen 5 #22712 [hci0] 1664.425040 Num handles: 1 Handle: 43 Count: 1 > ACL Data RX: Handle 43 flags 0x02 dlen 12 #22713 [hci0] 1664.426103 L2CAP: Connection Request (0x02) ident 18 len 4 PSM: 3 (0x0003) Source CID: 65 < ACL Data TX: Handle 43 flags 0x00 dlen 16 #22714 [hci0] 1664.426186 L2CAP: Connection Response (0x03) ident 18 len 8 Destination CID: 66 Source CID: 65 Result: Connection successful (0x0000) Status: No further information available (0x0000) < ACL Data TX: Handle 43 flags 0x00 dlen 27 #22715 [hci0] 1664.426196 L2CAP: Configure Request (0x04) ident 13 len 19 Destination CID: 65 Flags: 0x0000 Option: Maximum Transmission Unit (0x01) [mandatory] MTU: 1013 Option: Retransmission and Flow Control (0x04) [mandatory] Mode: Basic (0x00) TX window size: 0 Max transmit: 0 Retransmission timeout: 0 Monitor timeout: 0 Maximum PDU size: 0 > ACL Data RX: Handle 43 flags 0x02 dlen 16 #22716 [hci0] 1664.428804 L2CAP: Connection Response (0x03) ident 12 len 8 Destination CID: 66 Source CID: 65 Result: Connection successful (0x0000) Status: No further information available (0x0000) snip Fix is to check that channel is in state BT_DISCONN before deleting the channel. This bug was found while fuzzing Bluez's OBEX implementation using Synopsys Defensics. Reported-by: Matti Kamunen <matti.kamunen@synopsys.com> Reported-by: Ari Timonen <ari.timonen@synopsys.com> Signed-off-by: Matias Karhumaa <matias.karhumaa@gmail.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2019-07-26 09:14:16 +02:00
Seeteena Thoufeek	3b57b7a3a8	perf tests: Fix record+probe_libc_inet_pton.sh for powerpc64 [ Upstream commit `bff5a556c1` ] 'probe libc's inet_pton & backtrace it with ping' testcase sometimes fails on powerpc because distro ping binary does not have symbol information and thus it prints "[unknown]" function name in the backtrace. Accept "[unknown]" as valid function name for powerpc as well. # perf test -v "probe libc's inet_pton & backtrace it with ping" Before: 59: probe libc's inet_pton & backtrace it with ping : --- start --- test child forked, pid 79695 ping 79718 [077] 96483.787025: probe_libc:inet_pton: (7fff83a754c8) 7fff83a754c8 __GI___inet_pton+0x8 (/usr/lib64/power9/libc-2.28.so) 7fff83a2b7a0 gaih_inet.constprop.7+0x1020 (/usr/lib64/power9/libc-2.28.so) 7fff83a2c170 getaddrinfo+0x160 (/usr/lib64/power9/libc-2.28.so) 1171830f4 [unknown] (/usr/bin/ping) FAIL: expected backtrace entry ".\+0x[[:xdigit:]]+[[:space:]]$./bin/ping.*$$" got "1171830f4 [unknown] (/usr/bin/ping)" test child finished with -1 ---- end ---- probe libc's inet_pton & backtrace it with ping: FAILED! After: 59: probe libc's inet_pton & backtrace it with ping : --- start --- test child forked, pid 79085 ping 79108 [045] 96400.214177: probe_libc:inet_pton: (7fffbb9654c8) 7fffbb9654c8 __GI___inet_pton+0x8 (/usr/lib64/power9/libc-2.28.so) 7fffbb91b7a0 gaih_inet.constprop.7+0x1020 (/usr/lib64/power9/libc-2.28.so) 7fffbb91c170 getaddrinfo+0x160 (/usr/lib64/power9/libc-2.28.so) 132e830f4 [unknown] (/usr/bin/ping) test child finished with 0 ---- end ---- probe libc's inet_pton & backtrace it with ping: Ok Signed-off-by: Seeteena Thoufeek <s1seetee@linux.vnet.ibm.com> Reviewed-by: Kim Phillips <kim.phillips@amd.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Hendrik Brueckner <brueckner@linux.ibm.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sandipan Das <sandipan@linux.ibm.com> Fixes: `1632936480` ("perf tests: Fix record+probe_libc_inet_pton.sh without ping's debuginfo") Link: http://lkml.kernel.org/r/1561630614-3216-1-git-send-email-s1seetee@linux.vnet.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2019-07-26 09:14:16 +02:00
Josua Mayer	c814f618b7	Bluetooth: 6lowpan: search for destination address in all peers [ Upstream commit `b188b03270` ] Handle overlooked case where the target address is assigned to a peer and neither route nor gateway exist. For one peer, no checks are performed to see if it is meant to receive packets for a given address. As soon as there is a second peer however, checks are performed to deal with routes and gateways for handling complex setups with multiple hops to a target address. This logic assumed that no route and no gateway imply that the destination address can not be reached, which is false in case of a direct peer. Acked-by: Jukka Rissanen <jukka.rissanen@linux.intel.com> Tested-by: Michael Scott <mike@foundries.io> Signed-off-by: Josua Mayer <josua.mayer@jm0.eu> Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2019-07-26 09:14:16 +02:00
João Paulo Rechi Vita	c82c4910e9	Bluetooth: Add new 13d3:3501 QCA_ROME device [ Upstream commit `881cec4f6b` ] Without the QCA ROME setup routine this adapter fails to establish a SCO connection. T: Bus=01 Lev=01 Prnt=01 Port=04 Cnt=01 Dev#= 2 Spd=12 MxCh= 0 D: Ver= 1.10 Cls=e0(wlcon) Sub=01 Prot=01 MxPS=64 #Cfgs= 1 P: Vendor=13d3 ProdID=3501 Rev=00.01 C: #Ifs= 2 Cfg#= 1 Atr=e0 MxPwr=100mA I: If#=0x0 Alt= 0 #EPs= 3 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb I: If#=0x1 Alt= 0 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb Signed-off-by: João Paulo Rechi Vita <jprvita@endlessm.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2019-07-26 09:14:16 +02:00
João Paulo Rechi Vita	1cbce19bd6	Bluetooth: Add new 13d3:3491 QCA_ROME device [ Upstream commit `44d34af2e4` ] Without the QCA ROME setup routine this adapter fails to establish a SCO connection. T: Bus=01 Lev=01 Prnt=01 Port=08 Cnt=01 Dev#= 2 Spd=12 MxCh= 0 D: Ver= 1.10 Cls=e0(wlcon) Sub=01 Prot=01 MxPS=64 #Cfgs= 1 P: Vendor=13d3 ProdID=3491 Rev=00.01 C: #Ifs= 2 Cfg#= 1 Atr=e0 MxPwr=100mA I: If#=0x0 Alt= 0 #EPs= 3 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb I: If#=0x1 Alt= 0 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb Signed-off-by: João Paulo Rechi Vita <jprvita@endlessm.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2019-07-26 09:14:16 +02:00
Tomas Bortoli	578658df21	Bluetooth: hci_bcsp: Fix memory leak in rx_skb [ Upstream commit `4ce9146e03` ] Syzkaller found that it is possible to provoke a memory leak by never freeing rx_skb in struct bcsp_struct. Fix by freeing in bcsp_close() Signed-off-by: Tomas Bortoli <tomasbortoli@gmail.com> Reported-by: syzbot+98162c885993b72f19c4@syzkaller.appspotmail.com Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2019-07-26 09:14:16 +02:00
Jiri Olsa	9d47bd2175	tools: bpftool: Fix json dump crash on powerpc [ Upstream commit `aa52bcbe0e` ] Michael reported crash with by bpf program in json mode on powerpc: # bpftool prog -p dump jited id 14 [{ "name": "0xd00000000a9aa760", "insns": [{ "pc": "0x0", "operation": "nop", "operands": [null ] },{ "pc": "0x4", "operation": "nop", "operands": [null ] },{ "pc": "0x8", "operation": "mflr", Segmentation fault (core dumped) The code is assuming char pointers in format, which is not always true at least for powerpc. Fixing this by dumping the whole string into buffer based on its format. Please note that libopcodes code does not check return values from fprintf callback, but as per Jakub suggestion returning -1 on allocation failure so we do the best effort to propagate the error. Fixes: `107f041212` ("tools: bpftool: add JSON output for `bpftool prog dump jited *` command") Reported-by: Michael Petlan <mpetlan@redhat.com> Signed-off-by: Jiri Olsa <jolsa@kernel.org> Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Sasha Levin <sashal@kernel.org>	2019-07-26 09:14:16 +02:00
Geert Uytterhoeven	2ad04d31bb	gpiolib: Fix references to gpiod_[gs]et_*value_cansleep() variants [ Upstream commit `3285170f28` ] Commit `372e722ea4` ("gpiolib: use descriptors internally") renamed the functions to use a "gpiod" prefix, and commit `79a9becda8` ("gpiolib: export descriptor-based GPIO interface") introduced the "raw" variants, but both changes forgot to update the comments. Readd a similar reference to gpiod_set_value(), which was accidentally removed by commit `1e77fc8211` ("gpio: Add missing open drain/source handling to gpiod_set_value_cansleep()"). Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Link: https://lore.kernel.org/r/20190701142738.25219-1-geert+renesas@glider.be Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2019-07-26 09:14:16 +02:00
Cong Wang	157d1c7a1a	bonding: validate ip header before check IPPROTO_IGMP [ Upstream commit `9d1bc24b52` ] bond_xmit_roundrobin() checks for IGMP packets but it parses the IP header even before checking skb->protocol. We should validate the IP header with pskb_may_pull() before using iph->protocol. Reported-and-tested-by: syzbot+e5be16aa39ad6e755391@syzkaller.appspotmail.com Fixes: `a2fd940f4c` ("bonding: fix broken multicast with round-robin mode") Cc: Jay Vosburgh <j.vosburgh@gmail.com> Cc: Veaceslav Falico <vfalico@gmail.com> Cc: Andy Gospodarek <andy@greyhouse.net> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>	2019-07-26 09:14:15 +02:00
Jiri Benc	88f751b066	selftests: bpf: fix inlines in test_lwt_seg6local [ Upstream commit `11aca65ec4` ] Selftests are reporting this failure in test_lwt_seg6local.sh: + ip netns exec ns2 ip -6 route add fb00::6 encap bpf in obj test_lwt_seg6local.o sec encap_srh dev veth2 Error fetching program/map! Failed to parse eBPF program: Operation not permitted The problem is __attribute__((always_inline)) alone is not enough to prevent clang from inserting those functions in .text. In that case, .text is not marked as relocateable. See the output of objdump -h test_lwt_seg6local.o: Idx Name Size VMA LMA File off Algn 0 .text 00003530 0000000000000000 0000000000000000 00000040 2**3 CONTENTS, ALLOC, LOAD, READONLY, CODE This causes the iproute bpf loader to fail in bpf_fetch_prog_sec: bpf_has_call_data returns true but bpf_fetch_prog_relo fails as there's no relocateable .text section in the file. To fix this, convert to 'static __always_inline'. v2: Use 'static __always_inline' instead of 'static inline __attribute__((always_inline))' Fixes: `c99a84eac0` ("selftests/bpf: test for seg6local End.BPF action") Signed-off-by: Jiri Benc <jbenc@redhat.com> Acked-by: Yonghong Song <yhs@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Sasha Levin <sashal@kernel.org>	2019-07-26 09:14:15 +02:00
Leo Yan	ef5b204336	bpf, libbpf, smatch: Fix potential NULL pointer dereference [ Upstream commit `33bae185f7` ] Based on the following report from Smatch, fix the potential NULL pointer dereference check: tools/lib/bpf/libbpf.c:3493 bpf_prog_load_xattr() warn: variable dereferenced before check 'attr' (see line 3483) 3479 int bpf_prog_load_xattr(const struct bpf_prog_load_attr attr, 3480 struct bpf_object pobj, int prog_fd) 3481 { 3482 struct bpf_object_open_attr open_attr = { 3483 .file = attr->file, 3484 .prog_type = attr->prog_type, ^^^^^^ 3485 }; At the head of function, it directly access 'attr' without checking if it's NULL pointer. This patch moves the values assignment after validating 'attr' and 'attr->file'. Signed-off-by: Leo Yan <leo.yan@linaro.org> Acked-by: Yonghong Song <yhs@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Sasha Levin <sashal@kernel.org>	2019-07-26 09:14:15 +02:00
David Howells	0f2f2cebe6	rxrpc: Fix oops in tracepoint [ Upstream commit `99f0eae653` ] If the rxrpc_eproto tracepoint is enabled, an oops will be cause by the trace line that rxrpc_extract_header() tries to emit when a protocol error occurs (typically because the packet is short) because the call argument is NULL. Fix this by using ?: to assume 0 as the debug_id if call is NULL. This can then be induced by: echo -e '\0\0\0\0\0\0\0\0' \| ncat -4u --send-only <addr> 20001 where addr has the following program running on it: #include <stdio.h> #include <stdlib.h> #include <string.h> #include <unistd.h> #include <sys/socket.h> #include <arpa/inet.h> #include <linux/rxrpc.h> int main(void) { struct sockaddr_rxrpc srx; int fd; memset(&srx, 0, sizeof(srx)); srx.srx_family = AF_RXRPC; srx.srx_service = 0; srx.transport_type = AF_INET; srx.transport_len = sizeof(srx.transport.sin); srx.transport.sin.sin_family = AF_INET; srx.transport.sin.sin_port = htons(0x4e21); fd = socket(AF_RXRPC, SOCK_DGRAM, AF_INET6); bind(fd, (struct sockaddr *)&srx, sizeof(srx)); sleep(20); return 0; } It results in the following oops. BUG: kernel NULL pointer dereference, address: 0000000000000340 #PF: supervisor read access in kernel mode #PF: error_code(0x0000) - not-present page ... RIP: 0010:trace_event_raw_event_rxrpc_rx_eproto+0x47/0xac ... Call Trace: <IRQ> rxrpc_extract_header+0x86/0x171 ? rcu_read_lock_sched_held+0x5d/0x63 ? rxrpc_new_skb+0xd4/0x109 rxrpc_input_packet+0xef/0x14fc ? rxrpc_input_data+0x986/0x986 udp_queue_rcv_one_skb+0xbf/0x3d0 udp_unicast_rcv_skb.isra.8+0x64/0x71 ip_protocol_deliver_rcu+0xe4/0x1b4 ip_local_deliver+0xf0/0x154 __netif_receive_skb_one_core+0x50/0x6c netif_receive_skb_internal+0x26b/0x2e9 napi_gro_receive+0xf8/0x1da rtl8169_poll+0x303/0x4c4 net_rx_action+0x10e/0x333 __do_softirq+0x1a5/0x38f irq_exit+0x54/0xc4 do_IRQ+0xda/0xf8 common_interrupt+0xf/0xf </IRQ> ... ? cpuidle_enter_state+0x23c/0x34d cpuidle_enter+0x2a/0x36 do_idle+0x163/0x1ea cpu_startup_entry+0x1d/0x1f start_secondary+0x157/0x172 secondary_startup_64+0xa4/0xb0 Fixes: `a25e21f0bc` ("rxrpc, afs: Use debug_ids rather than pointers in traces") Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Marc Dionne <marc.dionne@auristor.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>	2019-07-26 09:14:15 +02:00
Phong Tran	ca37b9a746	net: usb: asix: init MAC address buffers [ Upstream commit `78226f6eaa` ] This is for fixing bug KMSAN: uninit-value in ax88772_bind Tested by https://groups.google.com/d/msg/syzkaller-bugs/aFQurGotng4/eB_HlNhhCwAJ Reported-by: syzbot+8a3fc6674bbc3978ed4e@syzkaller.appspotmail.com syzbot found the following crash on: HEAD commit: f75e4cfe kmsan: use kmsan_handle_urb() in urb.c git tree: kmsan console output: https://syzkaller.appspot.com/x/log.txt?x=136d720ea00000 kernel config: https://syzkaller.appspot.com/x/.config?x=602468164ccdc30a dashboard link: https://syzkaller.appspot.com/bug?extid=8a3fc6674bbc3978ed4e compiler: clang version 9.0.0 (/home/glider/llvm/clang 06d00afa61eef8f7f501ebdb4e8612ea43ec2d78) syz repro: https://syzkaller.appspot.com/x/repro.syz?x=12788316a00000 C reproducer: https://syzkaller.appspot.com/x/repro.c?x=120359aaa00000 ================================================================== BUG: KMSAN: uninit-value in is_valid_ether_addr include/linux/etherdevice.h:200 [inline] BUG: KMSAN: uninit-value in asix_set_netdev_dev_addr drivers/net/usb/asix_devices.c:73 [inline] BUG: KMSAN: uninit-value in ax88772_bind+0x93d/0x11e0 drivers/net/usb/asix_devices.c:724 CPU: 0 PID: 3348 Comm: kworker/0:2 Not tainted 5.1.0+ #1 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Workqueue: usb_hub_wq hub_event Call Trace: __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0x191/0x1f0 lib/dump_stack.c:113 kmsan_report+0x130/0x2a0 mm/kmsan/kmsan.c:622 __msan_warning+0x75/0xe0 mm/kmsan/kmsan_instr.c:310 is_valid_ether_addr include/linux/etherdevice.h:200 [inline] asix_set_netdev_dev_addr drivers/net/usb/asix_devices.c:73 [inline] ax88772_bind+0x93d/0x11e0 drivers/net/usb/asix_devices.c:724 usbnet_probe+0x10f5/0x3940 drivers/net/usb/usbnet.c:1728 usb_probe_interface+0xd66/0x1320 drivers/usb/core/driver.c:361 really_probe+0xdae/0x1d80 drivers/base/dd.c:513 driver_probe_device+0x1b3/0x4f0 drivers/base/dd.c:671 __device_attach_driver+0x5b8/0x790 drivers/base/dd.c:778 bus_for_each_drv+0x28e/0x3b0 drivers/base/bus.c:454 __device_attach+0x454/0x730 drivers/base/dd.c:844 device_initial_probe+0x4a/0x60 drivers/base/dd.c:891 bus_probe_device+0x137/0x390 drivers/base/bus.c:514 device_add+0x288d/0x30e0 drivers/base/core.c:2106 usb_set_configuration+0x30dc/0x3750 drivers/usb/core/message.c:2027 generic_probe+0xe7/0x280 drivers/usb/core/generic.c:210 usb_probe_device+0x14c/0x200 drivers/usb/core/driver.c:266 really_probe+0xdae/0x1d80 drivers/base/dd.c:513 driver_probe_device+0x1b3/0x4f0 drivers/base/dd.c:671 __device_attach_driver+0x5b8/0x790 drivers/base/dd.c:778 bus_for_each_drv+0x28e/0x3b0 drivers/base/bus.c:454 __device_attach+0x454/0x730 drivers/base/dd.c:844 device_initial_probe+0x4a/0x60 drivers/base/dd.c:891 bus_probe_device+0x137/0x390 drivers/base/bus.c:514 device_add+0x288d/0x30e0 drivers/base/core.c:2106 usb_new_device+0x23e5/0x2ff0 drivers/usb/core/hub.c:2534 hub_port_connect drivers/usb/core/hub.c:5089 [inline] hub_port_connect_change drivers/usb/core/hub.c:5204 [inline] port_event drivers/usb/core/hub.c:5350 [inline] hub_event+0x48d1/0x7290 drivers/usb/core/hub.c:5432 process_one_work+0x1572/0x1f00 kernel/workqueue.c:2269 process_scheduled_works kernel/workqueue.c:2331 [inline] worker_thread+0x189c/0x2460 kernel/workqueue.c:2417 kthread+0x4b5/0x4f0 kernel/kthread.c:254 ret_from_fork+0x35/0x40 arch/x86/entry/entry_64.S:355 Signed-off-by: Phong Tran <tranmanphong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>	2019-07-26 09:14:15 +02:00
Guilherme G. Piccoli	51216937c3	bnx2x: Prevent ptp_task to be rescheduled indefinitely [ Upstream commit `3c91f25c2f` ] Currently bnx2x ptp worker tries to read a register with timestamp information in case of TX packet timestamping and in case it fails, the routine reschedules itself indefinitely. This was reported as a kworker always at 100% of CPU usage, which was narrowed down to be bnx2x ptp_task. By following the ioctl handler, we could narrow down the problem to an NTP tool (chrony) requesting HW timestamping from bnx2x NIC with RX filter zeroed; this isn't reproducible for example with ptp4l (from linuxptp) since this tool requests a supported RX filter. It seems NIC FW timestamp mechanism cannot work well with RX_FILTER_NONE - driver's PTP filter init routine skips a register write to the adapter if there's not a supported filter request. This patch addresses the problem of bnx2x ptp thread's everlasting reschedule by retrying the register read 10 times; between the read attempts the thread sleeps for an increasing amount of time starting in 1ms to give FW some time to perform the timestamping. If it still fails after all retries, we bail out in order to prevent an unbound resource consumption from bnx2x. The patch also adds an ethtool statistic for accounting the skipped TX timestamp packets and it reduces the priority of timestamping error messages to prevent log flooding. The code was tested using both linuxptp and chrony. Reported-and-tested-by: Przemyslaw Hausman <przemyslaw.hausman@canonical.com> Suggested-by: Sudarsana Reddy Kalluru <skalluru@marvell.com> Signed-off-by: Guilherme G. Piccoli <gpiccoli@canonical.com> Acked-by: Sudarsana Reddy Kalluru <skalluru@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>	2019-07-26 09:14:15 +02:00
Andi Kleen	e358d2ab42	perf stat: Fix group lookup for metric group [ Upstream commit `2f87f33f42` ] The metric group code tries to find a group it added earlier in the evlist. Fix the lookup to handle groups with partially overlaps correctly. When a sub string match fails and we reset the match, we have to compare the first element again. I also renamed the find_evsel function to find_evsel_group to make its purpose clearer. With the earlier changes this fixes: Before: % perf stat -M UPI,IPC sleep 1 ... 1,032,922 uops_retired.retire_slots # 1.1 UPI 1,896,096 inst_retired.any 1,896,096 inst_retired.any 1,177,254 cpu_clk_unhalted.thread After: % perf stat -M UPI,IPC sleep 1 ... 1,013,193 uops_retired.retire_slots # 1.1 UPI 932,033 inst_retired.any 932,033 inst_retired.any # 0.9 IPC 1,091,245 cpu_clk_unhalted.thread Signed-off-by: Andi Kleen <ak@linux.intel.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Fixes: `b18f3e3650` ("perf stat: Support JSON metrics in perf stat") Link: http://lkml.kernel.org/r/20190624193711.35241-4-andi@firstfloor.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2019-07-26 09:14:15 +02:00
Andi Kleen	a64e018be7	perf stat: Make metric event lookup more robust [ Upstream commit `145c407c80` ] After setting up metric groups through the event parser, the metricgroup code looks them up again in the event list. Make sure we only look up events that haven't been used by some other metric. The data structures currently cannot handle more than one metric per event. This avoids problems with multiple events partially overlapping. Signed-off-by: Andi Kleen <ak@linux.intel.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Link: http://lkml.kernel.org/r/20190624193711.35241-2-andi@firstfloor.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2019-07-26 09:14:15 +02:00
Baruch Siach	7343178ccf	bpf: fix uapi bpf_prog_info fields alignment [ Upstream commit `0472301a28` ] Merge commit `1c8c5a9d38` ("Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next") undid the fix from commit `36f9814a49` ("bpf: fix uapi hole for 32 bit compat applications") by taking the gpl_compatible 1-bit field definition from commit `b85fab0e67` ("bpf: Add gpl_compatible flag to struct bpf_prog_info") as is. That breaks architectures with 16-bit alignment like m68k. Add 31-bit pad after gpl_compatible to restore alignment of following fields. Thanks to Dmitry V. Levin his analysis of this bug history. Signed-off-by: Baruch Siach <baruch@tkos.co.il> Acked-by: Song Liu <songliubraving@fb.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Sasha Levin <sashal@kernel.org>	2019-07-26 09:14:15 +02:00
Andrei Otcheretianski	af3790a46a	iwlwifi: mvm: Drop large non sta frames [ Upstream commit `ac70499ee9` ] In some buggy scenarios we could possible attempt to transmit frames larger than maximum MSDU size. Since our devices don't know how to handle this, it may result in asserts, hangs etc. This can happen, for example, when we receive a large multicast frame and try to transmit it back to the air in AP mode. Since in a legal scenario this should never happen, drop such frames and warn about it. Signed-off-by: Andrei Otcheretianski <andrei.otcheretianski@intel.com> Signed-off-by: Luca Coelho <luciano.coelho@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2019-07-26 09:14:14 +02:00
Vedang Patel	036184af23	igb: clear out skb->tstamp after reading the txtime [ Upstream commit `1e08511d5d` ] If a packet which is utilizing the launchtime feature (via SO_TXTIME socket option) also requests the hardware transmit timestamp, the hardware timestamp is not delivered to the userspace. This is because the value in skb->tstamp is mistaken as the software timestamp. Applications, like ptp4l, request a hardware timestamp by setting the SOF_TIMESTAMPING_TX_HARDWARE socket option. Whenever a new timestamp is detected by the driver (this work is done in igb_ptp_tx_work() which calls igb_ptp_tx_hwtstamps() in igb_ptp.c[1]), it will queue the timestamp in the ERR_QUEUE for the userspace to read. When the userspace is ready, it will issue a recvmsg() call to collect this timestamp. The problem is in this recvmsg() call. If the skb->tstamp is not cleared out, it will be interpreted as a software timestamp and the hardware tx timestamp will not be successfully sent to the userspace. Look at skb_is_swtx_tstamp() and the callee function __sock_recv_timestamp() in net/socket.c for more details. Signed-off-by: Vedang Patel <vedang.patel@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>	2019-07-26 09:14:14 +02:00
Maxime Chevallier	0024b12b77	net: mvpp2: prs: Don't override the sign bit in SRAM parser shift [ Upstream commit `8ec3ede559` ] The Header Parser allows identifying various fields in the packet headers, used for various kind of filtering and classification steps. This is a re-entrant process, where the offset in the packet header depends on the previous lookup results. This offset is represented in the SRAM results of the TCAM, as a shift to be operated. This shift can be negative in some cases, such as in IPv6 parsing. This commit prevents overriding the sign bit when setting the shift value, which could cause instabilities when parsing IPv6 flows. Fixes: `3f518509de` ("ethernet: Add new driver for Marvell Armada 375 network unit") Suggested-by: Alan Winkowski <walan@marvell.com> Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>	2019-07-26 09:14:14 +02:00
Wen Gong	05592b9b7f	ath10k: destroy sdio workqueue while remove sdio module [ Upstream commit `3ed39f8e74` ] The workqueue need to flush and destory while remove sdio module, otherwise it will have thread which is not destory after remove sdio modules. Tested with QCA6174 SDIO with firmware WLAN.RMH.4.4.1-00007-QCARMSWP-1. Signed-off-by: Wen Gong <wgong@codeaurora.org> Signed-off-by: Kalle Valo <kvalo@codeaurora.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2019-07-26 09:14:14 +02:00
Yunsheng Lin	26d86b29e8	net: hns3: add some error checking in hclge_tm module [ Upstream commit `04f25edb48` ] When hdev->tx_sch_mode is HCLGE_FLAG_VNET_BASE_SCH_MODE, the hclge_tm_schd_mode_vnet_base_cfg calls hclge_tm_pri_schd_mode_cfg with vport->vport_id as pri_id, which is used as index for hdev->tm_info.tc_info, it will cause out of bound access issue if vport_id is equal to or larger than HNAE3_MAX_TC. Also hardware only support maximum speed of HCLGE_ETHER_MAX_RATE. So this patch adds two checks for above cases. Fixes: `848440544b` ("net: hns3: Add support of TX Scheduler & Shaper to HNS3 driver") Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com> Signed-off-by: Peng Li <lipeng321@huawei.com> Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>	2019-07-26 09:14:14 +02:00
Yonglong Liu	ddfdbcccd7	net: hns3: fix a -Wformat-nonliteral compile warning [ Upstream commit `18d219b783` ] When setting -Wformat=2, there is a compiler warning like this: hclge_main.c:xxx:x: warning: format not a string literal and no format arguments [-Wformat-nonliteral] strs[i].desc); ^~~~ This patch adds missing format parameter "%s" to snprintf() to fix it. Fixes: `46a3df9f97` ("Add HNS3 Acceleration Engine & Compatibility Layer Support") Signed-off-by: Yonglong Liu <liuyonglong@huawei.com> Signed-off-by: Peng Li <lipeng321@huawei.com> Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>	2019-07-26 09:14:14 +02:00
Coly Li	95d0848094	bcache: fix potential deadlock in cached_def_free() [ Upstream commit `7e865eba00` ] When enable lockdep and reboot system with a writeback mode bcache device, the following potential deadlock warning is reported by lockdep engine. [ 101.536569][ T401] kworker/2:2/401 is trying to acquire lock: [ 101.538575][ T401] 00000000bbf6e6c7 ((wq_completion)bcache_writeback_wq){+.+.}, at: flush_workqueue+0x87/0x4c0 [ 101.542054][ T401] [ 101.542054][ T401] but task is already holding lock: [ 101.544587][ T401] 00000000f5f305b3 ((work_completion)(&cl->work)#2){+.+.}, at: process_one_work+0x21e/0x640 [ 101.548386][ T401] [ 101.548386][ T401] which lock already depends on the new lock. [ 101.548386][ T401] [ 101.551874][ T401] [ 101.551874][ T401] the existing dependency chain (in reverse order) is: [ 101.555000][ T401] [ 101.555000][ T401] -> #1 ((work_completion)(&cl->work)#2){+.+.}: [ 101.557860][ T401] process_one_work+0x277/0x640 [ 101.559661][ T401] worker_thread+0x39/0x3f0 [ 101.561340][ T401] kthread+0x125/0x140 [ 101.562963][ T401] ret_from_fork+0x3a/0x50 [ 101.564718][ T401] [ 101.564718][ T401] -> #0 ((wq_completion)bcache_writeback_wq){+.+.}: [ 101.567701][ T401] lock_acquire+0xb4/0x1c0 [ 101.569651][ T401] flush_workqueue+0xae/0x4c0 [ 101.571494][ T401] drain_workqueue+0xa9/0x180 [ 101.573234][ T401] destroy_workqueue+0x17/0x250 [ 101.575109][ T401] cached_dev_free+0x44/0x120 [bcache] [ 101.577304][ T401] process_one_work+0x2a4/0x640 [ 101.579357][ T401] worker_thread+0x39/0x3f0 [ 101.581055][ T401] kthread+0x125/0x140 [ 101.582709][ T401] ret_from_fork+0x3a/0x50 [ 101.584592][ T401] [ 101.584592][ T401] other info that might help us debug this: [ 101.584592][ T401] [ 101.588355][ T401] Possible unsafe locking scenario: [ 101.588355][ T401] [ 101.590974][ T401] CPU0 CPU1 [ 101.592889][ T401] ---- ---- [ 101.594743][ T401] lock((work_completion)(&cl->work)#2); [ 101.596785][ T401] lock((wq_completion)bcache_writeback_wq); [ 101.600072][ T401] lock((work_completion)(&cl->work)#2); [ 101.602971][ T401] lock((wq_completion)bcache_writeback_wq); [ 101.605255][ T401] [ 101.605255][ T401] * DEADLOCK * [ 101.605255][ T401] [ 101.608310][ T401] 2 locks held by kworker/2:2/401: [ 101.610208][ T401] #0: 00000000cf2c7d17 ((wq_completion)events){+.+.}, at: process_one_work+0x21e/0x640 [ 101.613709][ T401] #1: 00000000f5f305b3 ((work_completion)(&cl->work)#2){+.+.}, at: process_one_work+0x21e/0x640 [ 101.617480][ T401] [ 101.617480][ T401] stack backtrace: [ 101.619539][ T401] CPU: 2 PID: 401 Comm: kworker/2:2 Tainted: G W 5.2.0-rc4-lp151.20-default+ #1 [ 101.623225][ T401] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 04/13/2018 [ 101.627210][ T401] Workqueue: events cached_dev_free [bcache] [ 101.629239][ T401] Call Trace: [ 101.630360][ T401] dump_stack+0x85/0xcb [ 101.631777][ T401] print_circular_bug+0x19a/0x1f0 [ 101.633485][ T401] __lock_acquire+0x16cd/0x1850 [ 101.635184][ T401] ? __lock_acquire+0x6a8/0x1850 [ 101.636863][ T401] ? lock_acquire+0xb4/0x1c0 [ 101.638421][ T401] ? find_held_lock+0x34/0xa0 [ 101.640015][ T401] lock_acquire+0xb4/0x1c0 [ 101.641513][ T401] ? flush_workqueue+0x87/0x4c0 [ 101.643248][ T401] flush_workqueue+0xae/0x4c0 [ 101.644832][ T401] ? flush_workqueue+0x87/0x4c0 [ 101.646476][ T401] ? drain_workqueue+0xa9/0x180 [ 101.648303][ T401] drain_workqueue+0xa9/0x180 [ 101.649867][ T401] destroy_workqueue+0x17/0x250 [ 101.651503][ T401] cached_dev_free+0x44/0x120 [bcache] [ 101.653328][ T401] process_one_work+0x2a4/0x640 [ 101.655029][ T401] worker_thread+0x39/0x3f0 [ 101.656693][ T401] ? process_one_work+0x640/0x640 [ 101.658501][ T401] kthread+0x125/0x140 [ 101.660012][ T401] ? kthread_create_worker_on_cpu+0x70/0x70 [ 101.661985][ T401] ret_from_fork+0x3a/0x50 [ 101.691318][ T401] bcache: bcache_device_free() bcache0 stopped Here is how the above potential deadlock may happen in reboot/shutdown code path, 1) bcache_reboot() is called firstly in the reboot/shutdown code path, then in bcache_reboot(), bcache_device_stop() is called. 2) bcache_device_stop() sets BCACHE_DEV_CLOSING on d->falgs, then call closure_queue(&d->cl) to invoke cached_dev_flush(). And in turn cached_dev_flush() calls cached_dev_free() via closure_at() 3) In cached_dev_free(), after stopped writebach kthread dc->writeback_thread, the kwork dc->writeback_write_wq is stopping by destroy_workqueue(). 4) Inside destroy_workqueue(), drain_workqueue() is called. Inside drain_workqueue(), flush_workqueue() is called. Then wq->lockdep_map is acquired by lock_map_acquire() in flush_workqueue(). After the lock acquired the rest part of flush_workqueue() just wait for the workqueue to complete. 5) Now we look back at writeback thread routine bch_writeback_thread(), in the main while-loop, write_dirty() is called via continue_at() in read_dirty_submit(), which is called via continue_at() in while-loop level called function read_dirty(). Inside write_dirty() it may be re-called on workqueeu dc->writeback_write_wq via continue_at(). It means when the writeback kthread is stopped in cached_dev_free() there might be still one kworker queued on dc->writeback_write_wq to execute write_dirty() again. 6) Now this kworker is scheduled on dc->writeback_write_wq to run by process_one_work() (which is called by worker_thread()). Before calling the kwork routine, wq->lockdep_map is acquired. 7) But wq->lockdep_map is acquired already in step 4), so a A-A lock (lockdep terminology) scenario happens. Indeed on multiple cores syatem, the above deadlock is very rare to happen, just as the code comments in process_one_work() says, 2263 * AFAICT there is no possible deadlock scenario between the 2264 * flush_work() and complete() primitives (except for single-threaded 2265 * workqueues), so hiding them isn't a problem. But it is still good to fix such lockdep warning, even no one running bcache on single core system. The fix is simple. This patch solves the above potential deadlock by, - Do not destroy workqueue dc->writeback_write_wq in cached_dev_free(). - Flush and destroy dc->writeback_write_wq in writebach kthread routine bch_writeback_thread(), where after quit the thread main while-loop and before cached_dev_put() is called. By this fix, dc->writeback_write_wq will be stopped and destroy before the writeback kthread stopped, so the chance for a A-A locking on wq->lockdep_map is disappeared, such A-A deadlock won't happen any more. Signed-off-by: Coly Li <colyli@suse.de> Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Sasha Levin <sashal@kernel.org>	2019-07-26 09:14:13 +02:00
Coly Li	4b7758e9c4	bcache: check c->gc_thread by IS_ERR_OR_NULL in cache_set_flush() [ Upstream commit `b387e9b586` ] When system memory is in heavy pressure, bch_gc_thread_start() from run_cache_set() may fail due to out of memory. In such condition, c->gc_thread is assigned to -ENOMEM, not NULL pointer. Then in following failure code path bch_cache_set_error(), when cache_set_flush() gets called, the code piece to stop c->gc_thread is broken, if (!IS_ERR_OR_NULL(c->gc_thread)) kthread_stop(c->gc_thread); And KASAN catches such NULL pointer deference problem, with the warning information: [ 561.207881] ================================================================== [ 561.207900] BUG: KASAN: null-ptr-deref in kthread_stop+0x3b/0x440 [ 561.207904] Write of size 4 at addr 000000000000001c by task kworker/15:1/313 [ 561.207913] CPU: 15 PID: 313 Comm: kworker/15:1 Tainted: G W 5.0.0-vanilla+ #3 [ 561.207916] Hardware name: Lenovo ThinkSystem SR650 -[7X05CTO1WW]-/-[7X05CTO1WW]-, BIOS -[IVE136T-2.10]- 03/22/2019 [ 561.207935] Workqueue: events cache_set_flush [bcache] [ 561.207940] Call Trace: [ 561.207948] dump_stack+0x9a/0xeb [ 561.207955] ? kthread_stop+0x3b/0x440 [ 561.207960] ? kthread_stop+0x3b/0x440 [ 561.207965] kasan_report+0x176/0x192 [ 561.207973] ? kthread_stop+0x3b/0x440 [ 561.207981] kthread_stop+0x3b/0x440 [ 561.207995] cache_set_flush+0xd4/0x6d0 [bcache] [ 561.208008] process_one_work+0x856/0x1620 [ 561.208015] ? find_held_lock+0x39/0x1d0 [ 561.208028] ? drain_workqueue+0x380/0x380 [ 561.208048] worker_thread+0x87/0xb80 [ 561.208058] ? __kthread_parkme+0xb6/0x180 [ 561.208067] ? process_one_work+0x1620/0x1620 [ 561.208072] kthread+0x326/0x3e0 [ 561.208079] ? kthread_create_worker_on_cpu+0xc0/0xc0 [ 561.208090] ret_from_fork+0x3a/0x50 [ 561.208110] ================================================================== [ 561.208113] Disabling lock debugging due to kernel taint [ 561.208115] irq event stamp: 11800231 [ 561.208126] hardirqs last enabled at (11800231): [<ffffffff83008538>] do_syscall_64+0x18/0x410 [ 561.208127] BUG: unable to handle kernel NULL pointer dereference at 000000000000001c [ 561.208129] #PF error: [WRITE] [ 561.312253] hardirqs last disabled at (11800230): [<ffffffff830052ff>] trace_hardirqs_off_thunk+0x1a/0x1c [ 561.312259] softirqs last enabled at (11799832): [<ffffffff850005c7>] __do_softirq+0x5c7/0x8c3 [ 561.405975] PGD 0 P4D 0 [ 561.442494] softirqs last disabled at (11799821): [<ffffffff831add2c>] irq_exit+0x1ac/0x1e0 [ 561.791359] Oops: 0002 [#1] SMP KASAN NOPTI [ 561.791362] CPU: 15 PID: 313 Comm: kworker/15:1 Tainted: G B W 5.0.0-vanilla+ #3 [ 561.791363] Hardware name: Lenovo ThinkSystem SR650 -[7X05CTO1WW]-/-[7X05CTO1WW]-, BIOS -[IVE136T-2.10]- 03/22/2019 [ 561.791371] Workqueue: events cache_set_flush [bcache] [ 561.791374] RIP: 0010:kthread_stop+0x3b/0x440 [ 561.791376] Code: 00 00 65 8b 05 26 d5 e0 7c 89 c0 48 0f a3 05 ec aa df 02 0f 82 dc 02 00 00 4c 8d 63 20 be 04 00 00 00 4c 89 e7 e8 65 c5 53 00 <f0> ff 43 20 48 8d 7b 24 48 b8 00 00 00 00 00 fc ff df 48 89 fa 48 [ 561.791377] RSP: 0018:ffff88872fc8fd10 EFLAGS: 00010286 [ 561.838895] bcache: bch_count_io_errors() nvme0n1: IO error on writing btree. [ 561.838916] bcache: bch_count_io_errors() nvme0n1: IO error on writing btree. [ 561.838934] bcache: bch_count_io_errors() nvme0n1: IO error on writing btree. [ 561.838948] bcache: bch_count_io_errors() nvme0n1: IO error on writing btree. [ 561.838966] bcache: bch_count_io_errors() nvme0n1: IO error on writing btree. [ 561.838979] bcache: bch_count_io_errors() nvme0n1: IO error on writing btree. [ 561.838996] bcache: bch_count_io_errors() nvme0n1: IO error on writing btree. [ 563.067028] RAX: 0000000000000000 RBX: fffffffffffffffc RCX: ffffffff832dd314 [ 563.067030] RDX: 0000000000000000 RSI: 0000000000000004 RDI: 0000000000000297 [ 563.067032] RBP: ffff88872fc8fe88 R08: fffffbfff0b8213d R09: fffffbfff0b8213d [ 563.067034] R10: 0000000000000001 R11: fffffbfff0b8213c R12: 000000000000001c [ 563.408618] R13: ffff88dc61cc0f68 R14: ffff888102b94900 R15: ffff88dc61cc0f68 [ 563.408620] FS: 0000000000000000(0000) GS:ffff888f7dc00000(0000) knlGS:0000000000000000 [ 563.408622] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 563.408623] CR2: 000000000000001c CR3: 0000000f48a1a004 CR4: 00000000007606e0 [ 563.408625] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 563.408627] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 563.904795] bcache: bch_count_io_errors() nvme0n1: IO error on writing btree. [ 563.915796] PKRU: 55555554 [ 563.915797] Call Trace: [ 563.915807] cache_set_flush+0xd4/0x6d0 [bcache] [ 563.915812] process_one_work+0x856/0x1620 [ 564.001226] bcache: bch_count_io_errors() nvme0n1: IO error on writing btree. [ 564.033563] ? find_held_lock+0x39/0x1d0 [ 564.033567] ? drain_workqueue+0x380/0x380 [ 564.033574] worker_thread+0x87/0xb80 [ 564.062823] bcache: bch_count_io_errors() nvme0n1: IO error on writing btree. [ 564.118042] ? __kthread_parkme+0xb6/0x180 [ 564.118046] ? process_one_work+0x1620/0x1620 [ 564.118048] kthread+0x326/0x3e0 [ 564.118050] ? kthread_create_worker_on_cpu+0xc0/0xc0 [ 564.167066] bcache: bch_count_io_errors() nvme0n1: IO error on writing btree. [ 564.252441] ret_from_fork+0x3a/0x50 [ 564.252447] Modules linked in: msr rpcrdma sunrpc rdma_ucm ib_iser ib_umad rdma_cm ib_ipoib i40iw configfs iw_cm ib_cm libiscsi scsi_transport_iscsi mlx4_ib ib_uverbs mlx4_en ib_core nls_iso8859_1 nls_cp437 vfat fat intel_rapl skx_edac x86_pkg_temp_thermal coretemp iTCO_wdt iTCO_vendor_support crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel ses raid0 aesni_intel cdc_ether enclosure usbnet ipmi_ssif joydev aes_x86_64 i40e scsi_transport_sas mii bcache md_mod crypto_simd mei_me ioatdma crc64 ptp cryptd pcspkr i2c_i801 mlx4_core glue_helper pps_core mei lpc_ich dca wmi ipmi_si ipmi_devintf nd_pmem dax_pmem nd_btt ipmi_msghandler device_dax pcc_cpufreq button hid_generic usbhid mgag200 i2c_algo_bit drm_kms_helper syscopyarea sysfillrect xhci_pci sysimgblt fb_sys_fops xhci_hcd ttm megaraid_sas drm usbcore nfit libnvdimm sg dm_multipath dm_mod scsi_dh_rdac scsi_dh_emc scsi_dh_alua efivarfs [ 564.299390] bcache: bch_count_io_errors() nvme0n1: IO error on writing btree. [ 564.348360] CR2: 000000000000001c [ 564.348362] ---[ end trace b7f0e5cc7b2103b0 ]--- Therefore, it is not enough to only check whether c->gc_thread is NULL, we should use IS_ERR_OR_NULL() to check both NULL pointer and error value. This patch changes the above buggy code piece in this way, if (!IS_ERR_OR_NULL(c->gc_thread)) kthread_stop(c->gc_thread); Signed-off-by: Coly Li <colyli@suse.de> Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Sasha Levin <sashal@kernel.org>	2019-07-26 09:14:13 +02:00
Coly Li	81b88c05bc	bcache: acquire bch_register_lock later in cached_dev_free() [ Upstream commit `80265d8dfd` ] When enable lockdep engine, a lockdep warning can be observed when reboot or shutdown system, [ 3142.764557][ T1] bcache: bcache_reboot() Stopping all devices: [ 3142.776265][ T2649] [ 3142.777159][ T2649] ====================================================== [ 3142.780039][ T2649] WARNING: possible circular locking dependency detected [ 3142.782869][ T2649] 5.2.0-rc4-lp151.20-default+ #1 Tainted: G W [ 3142.785684][ T2649] ------------------------------------------------------ [ 3142.788479][ T2649] kworker/3:67/2649 is trying to acquire lock: [ 3142.790738][ T2649] 00000000aaf02291 ((wq_completion)bcache_writeback_wq){+.+.}, at: flush_workqueue+0x87/0x4c0 [ 3142.794678][ T2649] [ 3142.794678][ T2649] but task is already holding lock: [ 3142.797402][ T2649] 000000004fcf89c5 (&bch_register_lock){+.+.}, at: cached_dev_free+0x17/0x120 [bcache] [ 3142.801462][ T2649] [ 3142.801462][ T2649] which lock already depends on the new lock. [ 3142.801462][ T2649] [ 3142.805277][ T2649] [ 3142.805277][ T2649] the existing dependency chain (in reverse order) is: [ 3142.808902][ T2649] [ 3142.808902][ T2649] -> #2 (&bch_register_lock){+.+.}: [ 3142.812396][ T2649] __mutex_lock+0x7a/0x9d0 [ 3142.814184][ T2649] cached_dev_free+0x17/0x120 [bcache] [ 3142.816415][ T2649] process_one_work+0x2a4/0x640 [ 3142.818413][ T2649] worker_thread+0x39/0x3f0 [ 3142.820276][ T2649] kthread+0x125/0x140 [ 3142.822061][ T2649] ret_from_fork+0x3a/0x50 [ 3142.823965][ T2649] [ 3142.823965][ T2649] -> #1 ((work_completion)(&cl->work)#2){+.+.}: [ 3142.827244][ T2649] process_one_work+0x277/0x640 [ 3142.829160][ T2649] worker_thread+0x39/0x3f0 [ 3142.830958][ T2649] kthread+0x125/0x140 [ 3142.832674][ T2649] ret_from_fork+0x3a/0x50 [ 3142.834915][ T2649] [ 3142.834915][ T2649] -> #0 ((wq_completion)bcache_writeback_wq){+.+.}: [ 3142.838121][ T2649] lock_acquire+0xb4/0x1c0 [ 3142.840025][ T2649] flush_workqueue+0xae/0x4c0 [ 3142.842035][ T2649] drain_workqueue+0xa9/0x180 [ 3142.844042][ T2649] destroy_workqueue+0x17/0x250 [ 3142.846142][ T2649] cached_dev_free+0x52/0x120 [bcache] [ 3142.848530][ T2649] process_one_work+0x2a4/0x640 [ 3142.850663][ T2649] worker_thread+0x39/0x3f0 [ 3142.852464][ T2649] kthread+0x125/0x140 [ 3142.854106][ T2649] ret_from_fork+0x3a/0x50 [ 3142.855880][ T2649] [ 3142.855880][ T2649] other info that might help us debug this: [ 3142.855880][ T2649] [ 3142.859663][ T2649] Chain exists of: [ 3142.859663][ T2649] (wq_completion)bcache_writeback_wq --> (work_completion)(&cl->work)#2 --> &bch_register_lock [ 3142.859663][ T2649] [ 3142.865424][ T2649] Possible unsafe locking scenario: [ 3142.865424][ T2649] [ 3142.868022][ T2649] CPU0 CPU1 [ 3142.869885][ T2649] ---- ---- [ 3142.871751][ T2649] lock(&bch_register_lock); [ 3142.873379][ T2649] lock((work_completion)(&cl->work)#2); [ 3142.876399][ T2649] lock(&bch_register_lock); [ 3142.879727][ T2649] lock((wq_completion)bcache_writeback_wq); [ 3142.882064][ T2649] [ 3142.882064][ T2649] * DEADLOCK * [ 3142.882064][ T2649] [ 3142.885060][ T2649] 3 locks held by kworker/3:67/2649: [ 3142.887245][ T2649] #0: 00000000e774cdd0 ((wq_completion)events){+.+.}, at: process_one_work+0x21e/0x640 [ 3142.890815][ T2649] #1: 00000000f7df89da ((work_completion)(&cl->work)#2){+.+.}, at: process_one_work+0x21e/0x640 [ 3142.894884][ T2649] #2: 000000004fcf89c5 (&bch_register_lock){+.+.}, at: cached_dev_free+0x17/0x120 [bcache] [ 3142.898797][ T2649] [ 3142.898797][ T2649] stack backtrace: [ 3142.900961][ T2649] CPU: 3 PID: 2649 Comm: kworker/3:67 Tainted: G W 5.2.0-rc4-lp151.20-default+ #1 [ 3142.904789][ T2649] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 04/13/2018 [ 3142.909168][ T2649] Workqueue: events cached_dev_free [bcache] [ 3142.911422][ T2649] Call Trace: [ 3142.912656][ T2649] dump_stack+0x85/0xcb [ 3142.914181][ T2649] print_circular_bug+0x19a/0x1f0 [ 3142.916193][ T2649] __lock_acquire+0x16cd/0x1850 [ 3142.917936][ T2649] ? __lock_acquire+0x6a8/0x1850 [ 3142.919704][ T2649] ? lock_acquire+0xb4/0x1c0 [ 3142.921335][ T2649] ? find_held_lock+0x34/0xa0 [ 3142.923052][ T2649] lock_acquire+0xb4/0x1c0 [ 3142.924635][ T2649] ? flush_workqueue+0x87/0x4c0 [ 3142.926375][ T2649] flush_workqueue+0xae/0x4c0 [ 3142.928047][ T2649] ? flush_workqueue+0x87/0x4c0 [ 3142.929824][ T2649] ? drain_workqueue+0xa9/0x180 [ 3142.931686][ T2649] drain_workqueue+0xa9/0x180 [ 3142.933534][ T2649] destroy_workqueue+0x17/0x250 [ 3142.935787][ T2649] cached_dev_free+0x52/0x120 [bcache] [ 3142.937795][ T2649] process_one_work+0x2a4/0x640 [ 3142.939803][ T2649] worker_thread+0x39/0x3f0 [ 3142.941487][ T2649] ? process_one_work+0x640/0x640 [ 3142.943389][ T2649] kthread+0x125/0x140 [ 3142.944894][ T2649] ? kthread_create_worker_on_cpu+0x70/0x70 [ 3142.947744][ T2649] ret_from_fork+0x3a/0x50 [ 3142.970358][ T2649] bcache: bcache_device_free() bcache0 stopped Here is how the deadlock happens. 1) bcache_reboot() calls bcache_device_stop(), then inside bcache_device_stop() BCACHE_DEV_CLOSING bit is set on d->flags. Then closure_queue(&d->cl) is called to invoke cached_dev_flush(). 2) In cached_dev_flush(), cached_dev_free() is called by continu_at(). 3) In cached_dev_free(), when stopping the writeback kthread of the cached device by kthread_stop(), dc->writeback_thread will be waken up to quite the kthread while-loop, then cached_dev_put() is called in bch_writeback_thread(). 4) Calling cached_dev_put() in writeback kthread may drop dc->count to 0, then dc->detach kworker is scheduled, which is initialized as cached_dev_detach_finish(). 5) Inside cached_dev_detach_finish(), the last line of code is to call closure_put(&dc->disk.cl), which drops the last reference counter of closrure dc->disk.cl, then the callback cached_dev_flush() gets called. Now cached_dev_flush() is called for second time in the code path, the first time is in step 2). And again bch_register_lock will be acquired again, and a A-A lock (lockdep terminology) is happening. The root cause of the above A-A lock is in cached_dev_free(), mutex bch_register_lock is held before stopping writeback kthread and other kworkers. Fortunately now we have variable 'bcache_is_reboot', which may prevent device registration or unregistration during reboot/shutdown time, so it is unncessary to hold bch_register_lock such early now. This is how this patch fixes the reboot/shutdown time A-A lock issue: After moving mutex_lock(&bch_register_lock) to a later location where before atomic_read(&dc->running) in cached_dev_free(), such A-A lock problem can be solved without any reboot time registration race. Signed-off-by: Coly Li <colyli@suse.de> Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Sasha Levin <sashal@kernel.org>	2019-07-26 09:14:13 +02:00
Coly Li	d81080a0bc	bcache: check CACHE_SET_IO_DISABLE bit in bch_journal() [ Upstream commit `383ff2183a` ] When too many I/O errors happen on cache set and CACHE_SET_IO_DISABLE bit is set, bch_journal() may continue to work because the journaling bkey might be still in write set yet. The caller of bch_journal() may believe the journal still work but the truth is in-memory journal write set won't be written into cache device any more. This behavior may introduce potential inconsistent metadata status. This patch checks CACHE_SET_IO_DISABLE bit at the head of bch_journal(), if the bit is set, bch_journal() returns NULL immediately to notice caller to know journal does not work. Signed-off-by: Coly Li <colyli@suse.de> Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Sasha Levin <sashal@kernel.org>	2019-07-26 09:14:13 +02:00
Coly Li	57cfb755c3	bcache: check CACHE_SET_IO_DISABLE in allocator code [ Upstream commit `e775339e1a` ] If CACHE_SET_IO_DISABLE of a cache set flag is set by too many I/O errors, currently allocator routines can still continue allocate space which may introduce inconsistent metadata state. This patch checkes CACHE_SET_IO_DISABLE bit in following allocator routines, - bch_bucket_alloc() - __bch_bucket_alloc_set() Once CACHE_SET_IO_DISABLE is set on cache set, the allocator routines may reject allocation request earlier to avoid potential inconsistent metadata. Signed-off-by: Coly Li <colyli@suse.de> Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Sasha Levin <sashal@kernel.org>	2019-07-26 09:14:13 +02:00
Eiichi Tsukata	e78d1d2344	EDAC: Fix global-out-of-bounds write when setting edac_mc_poll_msec [ Upstream commit `d8655e7630` ] Commit `9da21b1509` ("EDAC: Poll timeout cannot be zero, p2") assumes edac_mc_poll_msec to be unsigned long, but the type of the variable still remained as int. Setting edac_mc_poll_msec can trigger out-of-bounds write. Reproducer: # echo 1001 > /sys/module/edac_core/parameters/edac_mc_poll_msec KASAN report: BUG: KASAN: global-out-of-bounds in edac_set_poll_msec+0x140/0x150 Write of size 8 at addr ffffffffb91b2d00 by task bash/1996 CPU: 1 PID: 1996 Comm: bash Not tainted 5.2.0-rc6+ #23 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.12.0-2.fc30 04/01/2014 Call Trace: dump_stack+0xca/0x13e print_address_description.cold+0x5/0x246 __kasan_report.cold+0x75/0x9a ? edac_set_poll_msec+0x140/0x150 kasan_report+0xe/0x20 edac_set_poll_msec+0x140/0x150 ? dimmdev_location_show+0x30/0x30 ? vfs_lock_file+0xe0/0xe0 ? _raw_spin_lock+0x87/0xe0 param_attr_store+0x1b5/0x310 ? param_array_set+0x4f0/0x4f0 module_attr_store+0x58/0x80 ? module_attr_show+0x80/0x80 sysfs_kf_write+0x13d/0x1a0 kernfs_fop_write+0x2bc/0x460 ? sysfs_kf_bin_read+0x270/0x270 ? kernfs_notify+0x1f0/0x1f0 __vfs_write+0x81/0x100 vfs_write+0x1e1/0x560 ksys_write+0x126/0x250 ? __ia32_sys_read+0xb0/0xb0 ? do_syscall_64+0x1f/0x390 do_syscall_64+0xc1/0x390 entry_SYSCALL_64_after_hwframe+0x49/0xbe RIP: 0033:0x7fa7caa5e970 Code: 73 01 c3 48 8b 0d 28 d5 2b 00 f7 d8 64 89 01 48 83 c8 ff c3 66 0f 1f 44 00 00 83 3d 99 2d 2c 00 00 75 10 b8 01 00 00 00 04 RSP: 002b:00007fff6acfdfe8 EFLAGS: 00000246 ORIG_RAX: 0000000000000001 RAX: ffffffffffffffda RBX: 0000000000000005 RCX: 00007fa7caa5e970 RDX: 0000000000000005 RSI: 0000000000e95c08 RDI: 0000000000000001 RBP: 0000000000e95c08 R08: 00007fa7cad1e760 R09: 00007fa7cb36a700 R10: 0000000000000073 R11: 0000000000000246 R12: 0000000000000005 R13: 0000000000000001 R14: 00007fa7cad1d600 R15: 0000000000000005 The buggy address belongs to the variable: edac_mc_poll_msec+0x0/0x40 Memory state around the buggy address: ffffffffb91b2c00: 00 00 00 00 fa fa fa fa 00 00 00 00 fa fa fa fa ffffffffb91b2c80: 00 00 00 00 fa fa fa fa 00 00 00 00 fa fa fa fa >ffffffffb91b2d00: 04 fa fa fa fa fa fa fa 04 fa fa fa fa fa fa fa ^ ffffffffb91b2d80: 04 fa fa fa fa fa fa fa 00 00 00 00 00 00 00 00 ffffffffb91b2e00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 Fix it by changing the type of edac_mc_poll_msec to unsigned int. The reason why this patch adopts unsigned int rather than unsigned long is msecs_to_jiffies() assumes arg to be unsigned int. We can avoid integer conversion bugs and unsigned int will be large enough for edac_mc_poll_msec. Reviewed-by: James Morse <james.morse@arm.com> Fixes: `9da21b1509` ("EDAC: Poll timeout cannot be zero, p2") Signed-off-by: Eiichi Tsukata <devel@etsukata.com> Signed-off-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2019-07-26 09:14:13 +02:00
Ahmad Masri	e54cc89e6f	wil6210: drop old event after wmi_call timeout [ Upstream commit `1a27600311` ] This change fixes a rare race condition of handling WMI events after wmi_call expires. wmi_recv_cmd immediately handles an event when reply_buf is defined and a wmi_call is waiting for the event. However, in case the wmi_call has already timed-out, there will be no waiting/running wmi_call and the event will be queued in WMI queue and will be handled later in wmi_event_handle. Meanwhile, a new similar wmi_call for the same command and event may be issued. In this case, when handling the queued event we got WARN_ON printed. Fixing this case as a valid timeout and drop the unexpected event. Signed-off-by: Ahmad Masri <amasri@codeaurora.org> Signed-off-by: Maya Erez <merez@codeaurora.org> Signed-off-by: Kalle Valo <kvalo@codeaurora.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2019-07-26 09:14:13 +02:00
Arnd Bergmann	0388597d06	crypto: asymmetric_keys - select CRYPTO_HASH where needed [ Upstream commit `90acc0653d` ] Build testing with some core crypto options disabled revealed a few modules that are missing CRYPTO_HASH: crypto/asymmetric_keys/x509_public_key.o: In function `x509_get_sig_params': x509_public_key.c:(.text+0x4c7): undefined reference to `crypto_alloc_shash' x509_public_key.c:(.text+0x5e5): undefined reference to `crypto_shash_digest' crypto/asymmetric_keys/pkcs7_verify.o: In function `pkcs7_digest.isra.0': pkcs7_verify.c:(.text+0xab): undefined reference to `crypto_alloc_shash' pkcs7_verify.c:(.text+0x1b2): undefined reference to `crypto_shash_digest' pkcs7_verify.c:(.text+0x3c1): undefined reference to `crypto_shash_update' pkcs7_verify.c:(.text+0x411): undefined reference to `crypto_shash_finup' This normally doesn't show up in randconfig tests because there is a large number of other options that select CRYPTO_HASH. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Sasha Levin <sashal@kernel.org>	2019-07-26 09:14:13 +02:00
Arnd Bergmann	1dea395c9e	crypto: serpent - mark __serpent_setkey_sbox noinline [ Upstream commit `473971187d` ] The same bug that gcc hit in the past is apparently now showing up with clang, which decides to inline __serpent_setkey_sbox: crypto/serpent_generic.c:268:5: error: stack frame size of 2112 bytes in function '__serpent_setkey' [-Werror,-Wframe-larger-than=] Marking it 'noinline' reduces the stack usage from 2112 bytes to 192 and 96 bytes, respectively, and seems to generate more useful object code. Fixes: `c871c10e4e` ("crypto: serpent - improve __serpent_setkey with UBSAN") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Eric Biggers <ebiggers@kernel.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Sasha Levin <sashal@kernel.org>	2019-07-26 09:14:13 +02:00
Mauro S. M. Rodrigues	b346070c72	ixgbe: Check DDM existence in transceiver before access [ Upstream commit `655c914145` ] Some transceivers may comply with SFF-8472 but not implement the Digital Diagnostic Monitoring (DDM) interface described in it. The existence of such area is specified by bit 6 of byte 92, set to 1 if implemented. Currently, due to not checking this bit ixgbe fails trying to read SFP module's eeprom with the follow message: ethtool -m enP51p1s0f0 Cannot get Module EEPROM data: Input/output error Because it fails to read the additional 256 bytes in which it was assumed to exist the DDM data. This issue was noticed using a Mellanox Passive DAC PN 01FT738. The eeprom data was confirmed by Mellanox as correct and present in other Passive DACs in from other manufacturers. Signed-off-by: "Mauro S. M. Rodrigues" <maurosr@linux.vnet.ibm.com> Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2019-07-26 09:14:12 +02:00
Ferdinand Blomqvist	0340c621ec	rslib: Fix handling of of caller provided syndrome [ Upstream commit `ef4d6a8556` ] Check if the syndrome provided by the caller is zero, and act accordingly. Signed-off-by: Ferdinand Blomqvist <ferdinand.blomqvist@gmail.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lkml.kernel.org/r/20190620141039.9874-6-ferdinand.blomqvist@gmail.com Signed-off-by: Sasha Levin <sashal@kernel.org>	2019-07-26 09:14:12 +02:00
Ferdinand Blomqvist	8ba93c5944	rslib: Fix decoding of shortened codes [ Upstream commit `2034a42d17` ] The decoding of shortenend codes is broken. It only works as expected if there are no erasures. When decoding with erasures, Lambda (the error and erasure locator polynomial) is initialized from the given erasure positions. The pad parameter is not accounted for by the initialisation code, and hence Lambda is initialized from incorrect erasure positions. The fix is to adjust the erasure positions by the supplied pad. Signed-off-by: Ferdinand Blomqvist <ferdinand.blomqvist@gmail.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lkml.kernel.org/r/20190620141039.9874-3-ferdinand.blomqvist@gmail.com Signed-off-by: Sasha Levin <sashal@kernel.org>	2019-07-26 09:14:12 +02:00
Nathan Chancellor	dad0b17e4a	xsk: Properly terminate assignment in xskq_produce_flush_desc [ Upstream commit `f7019b7b0a` ] Clang warns: In file included from net/xdp/xsk_queue.c:10: net/xdp/xsk_queue.h:292:2: warning: expression result unused [-Wunused-value] WRITE_ONCE(q->ring->producer, q->prod_tail); ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ include/linux/compiler.h:284:6: note: expanded from macro 'WRITE_ONCE' __u.__val; \ ~~~ ^~~~~ 1 warning generated. The q->prod_tail assignment has a comma at the end, not a semi-colon. Fix that so clang no longer warns and everything works as expected. Fixes: `c497176cb2` ("xsk: add Rx receive functions and poll support") Link: https://github.com/ClangBuiltLinux/linux/issues/544 Signed-off-by: Nathan Chancellor <natechancellor@gmail.com> Acked-by: Nick Desaulniers <ndesaulniers@google.com> Acked-by: Jonathan Lemon <jonathan.lemon@gmail.com> Acked-by: Björn Töpel <bjorn.topel@intel.com> Acked-by: Song Liu <songliubraving@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Sasha Levin <sashal@kernel.org>	2019-07-26 09:14:12 +02:00
Marek Szyprowski	e69fac59c4	clocksource/drivers/exynos_mct: Increase priority over ARM arch timer [ Upstream commit `6282edb72b` ] Exynos SoCs based on CA7/CA15 have 2 timer interfaces: custom Exynos MCT (Multi Core Timer) and standard ARM Architected Timers. There are use cases, where both timer interfaces are used simultanously. One of such examples is using Exynos MCT for the main system timer and ARM Architected Timers for the KVM and virtualized guests (KVM requires arch timers). Exynos Multi-Core Timer driver (exynos_mct) must be however started before ARM Architected Timers (arch_timer), because they both share some common hardware blocks (global system counter) and turning on MCT is needed to get ARM Architected Timer working properly. To ensure selecting Exynos MCT as the main system timer, increase MCT timer rating. To ensure proper starting order of both timers during suspend/resume cycle, increase MCT hotplug priority over ARM Archictected Timers. Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com> Reviewed-by: Krzysztof Kozlowski <krzk@kernel.org> Reviewed-by: Chanwoo Choi <cw00.choi@samsung.com> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2019-07-26 09:14:12 +02:00
Tejun Heo	12e20eca89	libata: don't request sense data on !ZAC ATA devices [ Upstream commit `ca156e006a` ] ZAC support added sense data requesting on error for both ZAC and ATA devices. This seems to cause erratic error handling behaviors on some SSDs where the device reports sense data availability and then delivers the wrong content making EH take the wrong actions. The failure mode was sporadic on a LITE-ON ssd and couldn't be reliably reproduced. There is no value in requesting sense data from non-ZAC ATA devices while there's a significant risk of introducing EH misbehaviors which are difficult to reproduce and fix. Let's do the sense data dancing only for ZAC devices. Reviewed-by: Hannes Reinecke <hare@suse.com> Tested-by: Masato Suzuki <masato.suzuki@wdc.com> Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com> Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Sasha Levin <sashal@kernel.org>	2019-07-26 09:14:12 +02:00
Amadeusz Sławiński	6e6bc34f85	ASoC: Intel: hdac_hdmi: Set ops to NULL on remove [ Upstream commit `0f6ff78540` ] When we unload Skylake driver we may end up calling hdac_component_master_unbind(), it uses acomp->audio_ops, which we set in hdmi_codec_probe(), so we need to set it to NULL in hdmi_codec_remove(), otherwise we will dereference no longer existing pointer. Signed-off-by: Amadeusz Sławiński <amadeuszx.slawinski@linux.intel.com> Reviewed-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com> Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2019-07-26 09:14:12 +02:00
Kyle Meyer	1182ff2248	perf tools: Increase MAX_NR_CPUS and MAX_CACHES [ Upstream commit `9f94c7f947` ] Attempting to profile 1024 or more CPUs with perf causes two errors: perf record -a [ perf record: Woken up X times to write data ] way too many cpu caches.. [ perf record: Captured and wrote X MB perf.data (X samples) ] perf report -C 1024 Error: failed to set cpu bitmap Requested CPU 1024 too large. Consider raising MAX_NR_CPUS Increasing MAX_NR_CPUS from 1024 to 2048 and redefining MAX_CACHES as MAX_NR_CPUS * 4 returns normal functionality to perf: perf record -a [ perf record: Woken up X times to write data ] [ perf record: Captured and wrote X MB perf.data (X samples) ] perf report -C 1024 ... Signed-off-by: Kyle Meyer <kyle.meyer@hpe.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20190620193630.154025-1-meyerk@stormcage.eag.rdlabs.hpecorp.net Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2019-07-26 09:14:12 +02:00
Miaoqing Pan	7201cc227d	ath10k: fix PCIE device wake up failed [ Upstream commit `011d4111c8` ] Observed PCIE device wake up failed after ~120 iterations of soft-reboot test. The error message is "ath10k_pci 0000:01:00.0: failed to wake up device : -110" The call trace as below: ath10k_pci_probe -> ath10k_pci_force_wake -> ath10k_pci_wake_wait -> ath10k_pci_is_awake Once trigger the device to wake up, we will continuously check the RTC state until it returns RTC_STATE_V_ON or timeout. But for QCA99x0 chips, we use wrong value for RTC_STATE_V_ON. Occasionally, we get 0x7 on the fist read, we thought as a failure case, but actually is the right value, also verified with the spec. So fix the issue by changing RTC_STATE_V_ON from 0x5 to 0x7, passed ~2000 iterations. Tested HW: QCA9984 Signed-off-by: Miaoqing Pan <miaoqing@codeaurora.org> Signed-off-by: Kalle Valo <kvalo@codeaurora.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2019-07-26 09:14:12 +02:00
Claire Chang	8a808fadc9	ath10k: add missing error handling [ Upstream commit `4b553f3ca4` ] In function ath10k_sdio_mbox_rx_alloc() [sdio.c], ath10k_sdio_mbox_alloc_rx_pkt() is called without handling the error cases. This will make the driver think the allocation for skb is successful and try to access the skb. If we enable failslab, system will easily crash with NULL pointer dereferencing. Call trace of CONFIG_FAILSLAB: ath10k_sdio_irq_handler+0x570/0xa88 [ath10k_sdio] process_sdio_pending_irqs+0x4c/0x174 sdio_run_irqs+0x3c/0x64 sdio_irq_work+0x1c/0x28 Fixes: `d96db25d20` ("ath10k: add initial SDIO support") Signed-off-by: Claire Chang <tientzu@chromium.org> Reviewed-by: Brian Norris <briannorris@chromium.org> Signed-off-by: Kalle Valo <kvalo@codeaurora.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2019-07-26 09:14:12 +02:00
Julian Anastasov	fe2ceeb4cf	ipvs: fix tinfo memory leak in start_sync_thread [ Upstream commit `5db7c8b9f9` ] syzkaller reports for memory leak in start_sync_thread [1] As Eric points out, kthread may start and stop before the threadfn function is called, so there is no chance the data (tinfo in our case) to be released in thread. Fix this by releasing tinfo in the controlling code instead. [1] BUG: memory leak unreferenced object 0xffff8881206bf700 (size 32): comm "syz-executor761", pid 7268, jiffies 4294943441 (age 20.470s) hex dump (first 32 bytes): 00 40 7c 09 81 88 ff ff 80 45 b8 21 81 88 ff ff .@\|......E.!.... 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ backtrace: [<0000000057619e23>] kmemleak_alloc_recursive include/linux/kmemleak.h:55 [inline] [<0000000057619e23>] slab_post_alloc_hook mm/slab.h:439 [inline] [<0000000057619e23>] slab_alloc mm/slab.c:3326 [inline] [<0000000057619e23>] kmem_cache_alloc_trace+0x13d/0x280 mm/slab.c:3553 [<0000000086ce5479>] kmalloc include/linux/slab.h:547 [inline] [<0000000086ce5479>] start_sync_thread+0x5d2/0xe10 net/netfilter/ipvs/ip_vs_sync.c:1862 [<000000001a9229cc>] do_ip_vs_set_ctl+0x4c5/0x780 net/netfilter/ipvs/ip_vs_ctl.c:2402 [<00000000ece457c8>] nf_sockopt net/netfilter/nf_sockopt.c:106 [inline] [<00000000ece457c8>] nf_setsockopt+0x4c/0x80 net/netfilter/nf_sockopt.c:115 [<00000000942f62d4>] ip_setsockopt net/ipv4/ip_sockglue.c:1258 [inline] [<00000000942f62d4>] ip_setsockopt+0x9b/0xb0 net/ipv4/ip_sockglue.c:1238 [<00000000a56a8ffd>] udp_setsockopt+0x4e/0x90 net/ipv4/udp.c:2616 [<00000000fa895401>] sock_common_setsockopt+0x38/0x50 net/core/sock.c:3130 [<0000000095eef4cf>] __sys_setsockopt+0x98/0x120 net/socket.c:2078 [<000000009747cf88>] __do_sys_setsockopt net/socket.c:2089 [inline] [<000000009747cf88>] __se_sys_setsockopt net/socket.c:2086 [inline] [<000000009747cf88>] __x64_sys_setsockopt+0x26/0x30 net/socket.c:2086 [<00000000ded8ba80>] do_syscall_64+0x76/0x1a0 arch/x86/entry/common.c:301 [<00000000893b4ac8>] entry_SYSCALL_64_after_hwframe+0x44/0xa9 Reported-by: syzbot+7e2e50c8adfccd2e5041@syzkaller.appspotmail.com Suggested-by: Eric Biggers <ebiggers@kernel.org> Fixes: `998e7a7680` ("ipvs: Use kthread_run() instead of doing a double-fork via kernel_thread()") Signed-off-by: Julian Anastasov <ja@ssi.bg> Acked-by: Simon Horman <horms@verge.net.au> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2019-07-26 09:14:11 +02:00
Lorenzo Bianconi	20de38d282	mt7601u: fix possible memory leak when the device is disconnected [ Upstream commit `23377c200b` ] When the device is disconnected while passing traffic it is possible to receive out of order urbs causing a memory leak since the skb linked to the current tx urb is not removed. Fix the issue deallocating the skb cleaning up the tx ring. Moreover this patch fixes the following kernel warning [ 57.480771] usb 1-1: USB disconnect, device number 2 [ 57.483451] ------------[ cut here ]------------ [ 57.483462] TX urb mismatch [ 57.483481] WARNING: CPU: 1 PID: 32 at drivers/net/wireless/mediatek/mt7601u/dma.c:245 mt7601u_complete_tx+0x165/00 [ 57.483483] Modules linked in: [ 57.483496] CPU: 1 PID: 32 Comm: kworker/1:1 Not tainted 5.2.0-rc1+ #72 [ 57.483498] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.12.0-2.fc30 04/01/2014 [ 57.483502] Workqueue: usb_hub_wq hub_event [ 57.483507] RIP: 0010:mt7601u_complete_tx+0x165/0x1e0 [ 57.483510] Code: 8b b5 10 04 00 00 8b 8d 14 04 00 00 eb 8b 80 3d b1 cb e1 00 00 75 9e 48 c7 c7 a4 ea 05 82 c6 05 f [ 57.483513] RSP: 0000:ffffc900000a0d28 EFLAGS: 00010092 [ 57.483516] RAX: 000000000000000f RBX: ffff88802c0a62c0 RCX: ffffc900000a0c2c [ 57.483518] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffffffff810a8371 [ 57.483520] RBP: ffff88803ced6858 R08: 0000000000000000 R09: 0000000000000001 [ 57.483540] R10: 0000000000000002 R11: 0000000000000000 R12: 0000000000000046 [ 57.483542] R13: ffff88802c0a6c88 R14: ffff88803baab540 R15: ffff88803a0cc078 [ 57.483548] FS: 0000000000000000(0000) GS:ffff88803eb00000(0000) knlGS:0000000000000000 [ 57.483550] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 57.483552] CR2: 000055e7f6780100 CR3: 0000000028c86000 CR4: 00000000000006a0 [ 57.483554] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 57.483556] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 57.483559] Call Trace: [ 57.483561] <IRQ> [ 57.483565] __usb_hcd_giveback_urb+0x77/0xe0 [ 57.483570] xhci_giveback_urb_in_irq.isra.0+0x8b/0x140 [ 57.483574] handle_cmd_completion+0xf5b/0x12c0 [ 57.483577] xhci_irq+0x1f6/0x1810 [ 57.483581] ? lockdep_hardirqs_on+0x9e/0x180 [ 57.483584] ? _raw_spin_unlock_irq+0x24/0x30 [ 57.483588] __handle_irq_event_percpu+0x3a/0x260 [ 57.483592] handle_irq_event_percpu+0x1c/0x60 [ 57.483595] handle_irq_event+0x2f/0x4c [ 57.483599] handle_edge_irq+0x7e/0x1a0 [ 57.483603] handle_irq+0x17/0x20 [ 57.483607] do_IRQ+0x54/0x110 [ 57.483610] common_interrupt+0xf/0xf [ 57.483612] </IRQ> Acked-by: Jakub Kicinski <kubakici@wp.pl> Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Signed-off-by: Kalle Valo <kvalo@codeaurora.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2019-07-26 09:14:11 +02:00
Masahiro Yamada	0335778801	x86/build: Add 'set -e' to mkcapflags.sh to delete broken capflags.c [ Upstream commit `bc53d3d777` ] Without 'set -e', shell scripts continue running even after any error occurs. The missed 'set -e' is a typical bug in shell scripting. For example, when a disk space shortage occurs while this script is running, it actually ends up with generating a truncated capflags.c. Yet, mkcapflags.sh continues running and exits with 0. So, the build system assumes it has succeeded. It will not be re-generated in the next invocation of Make since its timestamp is newer than that of any of the source files. Add 'set -e' so that any error in this script is caught and propagated to the build system. Since `9c2af1c737` ("kbuild: add .DELETE_ON_ERROR special target"), make automatically deletes the target on any failure. So, the broken capflags.c will be deleted automatically. Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Borislav Petkov <bp@alien8.de> Link: https://lkml.kernel.org/r/20190625072622.17679-1-yamada.masahiro@socionext.com Signed-off-by: Sasha Levin <sashal@kernel.org>	2019-07-26 09:14:11 +02:00
Lorenzo Bianconi	3f7952b275	mt7601u: do not schedule rx_tasklet when the device has been disconnected [ Upstream commit `4079e8ccab` ] Do not schedule rx_tasklet when the usb dongle is disconnected. Moreover do not grub rx_lock in mt7601u_kill_rx since usb_poison_urb can run concurrently with urb completion and we can unlink urbs from rx ring in any order. This patch fixes the common kernel warning reported when the device is removed. [ 24.921354] usb 3-14: USB disconnect, device number 7 [ 24.921593] ------------[ cut here ]------------ [ 24.921594] RX urb mismatch [ 24.921675] WARNING: CPU: 4 PID: 163 at drivers/net/wireless/mediatek/mt7601u/dma.c:200 mt7601u_complete_rx+0xcb/0xd0 [mt7601u] [ 24.921769] CPU: 4 PID: 163 Comm: kworker/4:2 Tainted: G OE 4.19.31-041931-generic #201903231635 [ 24.921770] Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./Z97 Extreme4, BIOS P1.30 05/23/2014 [ 24.921782] Workqueue: usb_hub_wq hub_event [ 24.921797] RIP: 0010:mt7601u_complete_rx+0xcb/0xd0 [mt7601u] [ 24.921800] RSP: 0018:ffff9bd9cfd03d08 EFLAGS: 00010086 [ 24.921802] RAX: 0000000000000000 RBX: ffff9bd9bf043540 RCX: 0000000000000006 [ 24.921803] RDX: 0000000000000007 RSI: 0000000000000096 RDI: ffff9bd9cfd16420 [ 24.921804] RBP: ffff9bd9cfd03d28 R08: 0000000000000002 R09: 00000000000003a8 [ 24.921805] R10: 0000002f485fca34 R11: 0000000000000000 R12: ffff9bd9bf043c1c [ 24.921806] R13: ffff9bd9c62fa3c0 R14: 0000000000000082 R15: 0000000000000000 [ 24.921807] FS: 0000000000000000(0000) GS:ffff9bd9cfd00000(0000) knlGS:0000000000000000 [ 24.921808] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 24.921808] CR2: 00007fb2648b0000 CR3: 0000000142c0a004 CR4: 00000000001606e0 [ 24.921809] Call Trace: [ 24.921812] <IRQ> [ 24.921819] __usb_hcd_giveback_urb+0x8b/0x140 [ 24.921821] usb_hcd_giveback_urb+0xca/0xe0 [ 24.921828] xhci_giveback_urb_in_irq.isra.42+0x82/0xf0 [ 24.921834] handle_cmd_completion+0xe02/0x10d0 [ 24.921837] xhci_irq+0x274/0x4a0 [ 24.921838] xhci_msi_irq+0x11/0x20 [ 24.921851] __handle_irq_event_percpu+0x44/0x190 [ 24.921856] handle_irq_event_percpu+0x32/0x80 [ 24.921861] handle_irq_event+0x3b/0x5a [ 24.921867] handle_edge_irq+0x80/0x190 [ 24.921874] handle_irq+0x20/0x30 [ 24.921889] do_IRQ+0x4e/0xe0 [ 24.921891] common_interrupt+0xf/0xf [ 24.921892] </IRQ> [ 24.921900] RIP: 0010:usb_hcd_flush_endpoint+0x78/0x180 [ 24.921354] usb 3-14: USB disconnect, device number 7 Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Signed-off-by: Kalle Valo <kvalo@codeaurora.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2019-07-26 09:14:11 +02:00

1 2 3 4 5 ...

789869 Commits