Commit Graph

888494 Commits

Author SHA1 Message Date
Linus Torvalds
bf3f401db6 Merge tag 'staging-5.5-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging
Pull staging and IIO driver fixes from Greg KH:
 "Here are some small staging and iio driver fixes for 5.5-rc7

  All of them are for some small reported issues. Nothing major, full
  details in the shortlog.

  All have been in linux-next with no reported issues"

* tag 'staging-5.5-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging:
  staging: comedi: ni_routes: allow partial routing information
  staging: comedi: ni_routes: fix null dereference in ni_find_route_source()
  iio: light: vcnl4000: Fix scale for vcnl4040
  iio: buffer: align the size of scan bytes to size of the largest element
  iio: chemical: pms7003: fix unmet triggered buffer dependency
  iio: imu: st_lsm6dsx: Fix selection of ST_LSM6DS3_ID
  iio: adc: ad7124: Fix DT channel configuration
2020-01-18 12:06:09 -08:00
Linus Torvalds
c5fd2c5b8b Merge tag 'usb-5.5-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb
Pull USB driver fixes from Greg KH:
 "Here are some small USB driver and core fixes for 5.5-rc7

  There's one fix for hub wakeup issues and a number of small usb-serial
  driver fixes and device id updates.

  The hub fix has been in linux-next for a while with no reported
  issues, and the usb-serial ones have all passed 0-day with no
  problems"

* tag 'usb-5.5-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb:
  USB: serial: quatech2: handle unbound ports
  USB: serial: keyspan: handle unbound ports
  USB: serial: io_edgeport: add missing active-port sanity check
  USB: serial: io_edgeport: handle unbound ports on URB completion
  USB: serial: ch341: handle unbound port at reset_resume
  USB: serial: suppress driver bind attributes
  USB: serial: option: add support for Quectel RM500Q in QDL mode
  usb: core: hub: Improved device recognition on remote wakeup
  USB: serial: opticon: fix control-message timeouts
  USB: serial: option: Add support for Quectel RM500Q
  USB: serial: simple: Add Motorola Solutions TETRA MTP3xxx and MTP85xx
2020-01-18 12:02:33 -08:00
Linus Torvalds
25e73aadf2 Merge tag 'io_uring-5.5-2020-01-16' of git://git.kernel.dk/linux-block
Pull io_uring fixes form Jens Axboe:

 - Ensure ->result is always set when IO is retried (Bijan)

 - In conjunction with the above, fix a regression in polled IO issue
   when retried (me/Bijan)

 - Don't setup async context for read/write fixed, otherwise we may
   wrongly map the iovec on retry (me)

 - Cancel io-wq work if we fail getting mm reference (me)

 - Ensure dependent work is always initialized correctly (me)

 - Only allow original task to submit IO, don't allow it from a passed
   ring fd (me)

* tag 'io_uring-5.5-2020-01-16' of git://git.kernel.dk/linux-block:
  io_uring: only allow submit from owning task
  io_uring: ensure workqueue offload grabs ring mutex for poll list
  io_uring: clear req->result always before issuing a read/write request
  io_uring: be consistent in assigning next work from handler
  io-wq: cancel work if we fail getting a mm reference
  io_uring: don't setup async context for read/write fixed
2020-01-17 11:25:45 -08:00
Linus Torvalds
effaf90137 Merge tag 'for-5.5-rc6-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux
Pull btrfs fixes from David Sterba:
 "A few more fixes that have been in the works during last twp weeks.
  All have a user visible effect and are stable material:

   - scrub: properly update progress after calling cancel ioctl, calling
     'resume' would start from the beginning otherwise

   - fix subvolume reference removal, after moving out of the original
     path the reference is not recognized and will lead to transaction
     abort

   - fix reloc root lifetime checks, could lead to crashes when there's
     subvolume cleaning running in parallel

   - fix memory leak when quotas get disabled in the middle of extent
     accounting

   - fix transaction abort in case of balance being started on degraded
     mount on eg. RAID1"

* tag 'for-5.5-rc6-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
  btrfs: check rw_devices, not num_devices for balance
  Btrfs: always copy scrub arguments back to user space
  btrfs: relocation: fix reloc_root lifespan and access
  btrfs: fix memory leak in qgroup accounting
  btrfs: do not delete mismatched root refs
  btrfs: fix invalid removal of root ref
  btrfs: rework arguments of btrfs_unlink_subvol
2020-01-17 11:21:05 -08:00
Greg Kroah-Hartman
453495d4e7 Merge tag 'usb-serial-5.5-rc7' of https://git.kernel.org/pub/scm/linux/kernel/git/johan/usb-serial into usb-linus
Johan writes:

USB-serial fixes for 5.5-rc7

Here are a few fixes for issues related to unbound port devices which
could lead to NULL-pointer dereferences. Notably the bind attributes for
usb-serial (port) drivers are removed as almost none of the drivers can
handle individual ports going away once they've been bound.

Included are also some new device ids.

All but the unbound-port fixes have been in linux-next with no reported
issues.

Signed-off-by: Johan Hovold <johan@kernel.org>

* tag 'usb-serial-5.5-rc7' of https://git.kernel.org/pub/scm/linux/kernel/git/johan/usb-serial:
  USB: serial: quatech2: handle unbound ports
  USB: serial: keyspan: handle unbound ports
  USB: serial: io_edgeport: add missing active-port sanity check
  USB: serial: io_edgeport: handle unbound ports on URB completion
  USB: serial: ch341: handle unbound port at reset_resume
  USB: serial: suppress driver bind attributes
  USB: serial: option: add support for Quectel RM500Q in QDL mode
  USB: serial: opticon: fix control-message timeouts
  USB: serial: option: Add support for Quectel RM500Q
  USB: serial: simple: Add Motorola Solutions TETRA MTP3xxx and MTP85xx
2020-01-17 19:40:06 +01:00
Linus Torvalds
ab7541c3ad Merge tag 'fuse-fixes-5.5-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse
Pull fuse fix from Miklos Szeredi:
 "Fix a regression in the last release affecting the ftp module of the
  gvfs filesystem"

* tag 'fuse-fixes-5.5-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse:
  fuse: fix fuse_send_readpages() in the syncronous read case
2020-01-17 08:42:02 -08:00
Linus Torvalds
07d5ac6a12 Merge tag 'sound-5.5-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound
Pull sound fixes from Takashi Iwai:
 "This became bigger than I have hoped for rc7. But, the only large LOC
  is for stm32 fixes that are simple rewriting of register access
  helpers, while the rest are all nice and small fixes:

   - A few ASoC fixes for the remaining probe error handling bugs

   - ALSA sequencer core fix for racy proc file accesses

   - Revert the option rename of snd-hda-intel to make compatible again

   - Various device-specific fixes"

* tag 'sound-5.5-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
  ALSA: seq: Fix racy access for queue timer in proc read
  ALSA: usb-audio: fix sync-ep altsetting sanity check
  ASoC: msm8916-wcd-digital: Reset RX interpolation path after use
  ASoC: msm8916-wcd-analog: Fix MIC BIAS Internal1
  ASoC: cros_ec_codec: Make the device acpi compatible
  ASoC: sti: fix possible sleep-in-atomic
  ASoC: msm8916-wcd-analog: Fix selected events for MIC BIAS External1
  ASoC: hdac_hda: Fix error in driver removal after failed probe
  ASoC: SOF: Intel: fix HDA codec driver probe with multiple controllers
  ASoC: SOF: Intel: lower print level to dbg if we will reinit DSP
  ALSA: dice: fix fallback from protocol extension into limited functionality
  ALSA: firewire-tascam: fix corruption due to spin lock without restoration in SoftIRQ context
  ALSA: hda: Rename back to dmic_detect option
  ASoC: stm32: dfsdm: fix 16 bits record
  ASoC: stm32: sai: fix possible circular locking
  ASoC: Fix NULL dereference at freeing
  ASoC: Intel: bytcht_es8316: Fix Irbis NB41 netbook quirk
  ASoC: rt5640: Fix NULL dereference on module unload
2020-01-17 08:38:35 -08:00
Johan Hovold
9715a43eea USB: serial: quatech2: handle unbound ports
Check for NULL port data in the modem- and line-status handlers to avoid
dereferencing a NULL pointer in the unlikely case where a port device
isn't bound to a driver (e.g. after an allocation failure on port
probe).

Note that the other (stubbed) event handlers qt2_process_xmit_empty()
and qt2_process_flush() would need similar sanity checks in case they
are ever implemented.

Fixes: f7a33e608d ("USB: serial: add quatech2 usb to serial driver")
Cc: stable <stable@vger.kernel.org>     # 3.5
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Johan Hovold <johan@kernel.org>
2020-01-17 16:22:59 +01:00
Johan Hovold
3018dd3fa1 USB: serial: keyspan: handle unbound ports
Check for NULL port data in the control URB completion handlers to avoid
dereferencing a NULL pointer in the unlikely case where a port device
isn't bound to a driver (e.g. after an allocation failure on port
probe()).

Fixes: 0ca1268e10 ("USB Serial Keyspan: add support for USA-49WG & USA-28XG")
Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Cc: stable <stable@vger.kernel.org>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Johan Hovold <johan@kernel.org>
2020-01-17 16:22:58 +01:00
Johan Hovold
1568c58d11 USB: serial: io_edgeport: add missing active-port sanity check
The driver receives the active port number from the device, but never
made sure that the port number was valid. This could lead to a
NULL-pointer dereference or memory corruption in case a device sends
data for an invalid port.

Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Cc: stable <stable@vger.kernel.org>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Johan Hovold <johan@kernel.org>
2020-01-17 16:22:57 +01:00
Johan Hovold
e37d1aeda7 USB: serial: io_edgeport: handle unbound ports on URB completion
Check for NULL port data in the shared interrupt and bulk completion
callbacks to avoid dereferencing a NULL pointer in case a device sends
data for a port device which isn't bound to a driver (e.g. due to a
malicious device having unexpected endpoints or after an allocation
failure on port probe).

Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Cc: stable <stable@vger.kernel.org>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Johan Hovold <johan@kernel.org>
2020-01-17 16:22:57 +01:00
Johan Hovold
4d5ef53f75 USB: serial: ch341: handle unbound port at reset_resume
Check for NULL port data in reset_resume() to avoid dereferencing a NULL
pointer in case the port device isn't bound to a driver (e.g. after a
failed control request at port probe).

Fixes: 1ded7ea47b ("USB: ch341 serial: fix port number changed after resume")
Cc: stable <stable@vger.kernel.org>     # 2.6.30
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Johan Hovold <johan@kernel.org>
2020-01-17 16:22:45 +01:00
Josef Bacik
b35cf1f0bf btrfs: check rw_devices, not num_devices for balance
The fstest btrfs/154 reports

  [ 8675.381709] BTRFS: Transaction aborted (error -28)
  [ 8675.383302] WARNING: CPU: 1 PID: 31900 at fs/btrfs/block-group.c:2038 btrfs_create_pending_block_groups+0x1e0/0x1f0 [btrfs]
  [ 8675.390925] CPU: 1 PID: 31900 Comm: btrfs Not tainted 5.5.0-rc6-default+ #935
  [ 8675.392780] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.0-59-gc9ba527-rebuilt.opensuse.org 04/01/2014
  [ 8675.395452] RIP: 0010:btrfs_create_pending_block_groups+0x1e0/0x1f0 [btrfs]
  [ 8675.402672] RSP: 0018:ffffb2090888fb00 EFLAGS: 00010286
  [ 8675.404413] RAX: 0000000000000000 RBX: ffff92026dfa91c8 RCX: 0000000000000001
  [ 8675.406609] RDX: 0000000000000000 RSI: ffffffff8e100899 RDI: ffffffff8e100971
  [ 8675.408775] RBP: ffff920247c61660 R08: 0000000000000000 R09: 0000000000000000
  [ 8675.410978] R10: 0000000000000000 R11: 0000000000000000 R12: 00000000ffffffe4
  [ 8675.412647] R13: ffff92026db74000 R14: ffff920247c616b8 R15: ffff92026dfbc000
  [ 8675.413994] FS:  00007fd5e57248c0(0000) GS:ffff92027d800000(0000) knlGS:0000000000000000
  [ 8675.416146] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  [ 8675.417833] CR2: 0000564aa51682d8 CR3: 000000006dcbc004 CR4: 0000000000160ee0
  [ 8675.419801] Call Trace:
  [ 8675.420742]  btrfs_start_dirty_block_groups+0x355/0x480 [btrfs]
  [ 8675.422600]  btrfs_commit_transaction+0xc8/0xaf0 [btrfs]
  [ 8675.424335]  reset_balance_state+0x14a/0x190 [btrfs]
  [ 8675.425824]  btrfs_balance.cold+0xe7/0x154 [btrfs]
  [ 8675.427313]  ? kmem_cache_alloc_trace+0x235/0x2c0
  [ 8675.428663]  btrfs_ioctl_balance+0x298/0x350 [btrfs]
  [ 8675.430285]  btrfs_ioctl+0x466/0x2550 [btrfs]
  [ 8675.431788]  ? mem_cgroup_charge_statistics+0x51/0xf0
  [ 8675.433487]  ? mem_cgroup_commit_charge+0x56/0x400
  [ 8675.435122]  ? do_raw_spin_unlock+0x4b/0xc0
  [ 8675.436618]  ? _raw_spin_unlock+0x1f/0x30
  [ 8675.438093]  ? __handle_mm_fault+0x499/0x740
  [ 8675.439619]  ? do_vfs_ioctl+0x56e/0x770
  [ 8675.441034]  do_vfs_ioctl+0x56e/0x770
  [ 8675.442411]  ksys_ioctl+0x3a/0x70
  [ 8675.443718]  ? trace_hardirqs_off_thunk+0x1a/0x1c
  [ 8675.445333]  __x64_sys_ioctl+0x16/0x20
  [ 8675.446705]  do_syscall_64+0x50/0x210
  [ 8675.448059]  entry_SYSCALL_64_after_hwframe+0x49/0xbe
  [ 8675.479187] BTRFS: error (device vdb) in btrfs_create_pending_block_groups:2038: errno=-28 No space left

We now use btrfs_can_overcommit() to see if we can flip a block group
read only.  Before this would fail because we weren't taking into
account the usable un-allocated space for allocating chunks.  With my
patches we were allowed to do the balance, which is technically correct.

The test is trying to start balance on degraded mount.  So now we're
trying to allocate a chunk and cannot because we want to allocate a
RAID1 chunk, but there's only 1 device that's available for usage.  This
results in an ENOSPC.

But we shouldn't even be making it this far, we don't have enough
devices to restripe.  The problem is we're using btrfs_num_devices(),
that also includes missing devices. That's not actually what we want, we
need to use rw_devices.

The chunk_mutex is not needed here, rw_devices changes only in device
add, remove or replace, all are excluded by EXCL_OP mechanism.

Fixes: e4d8ec0f65 ("Btrfs: implement online profile changing")
CC: stable@vger.kernel.org # 4.4+
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Reviewed-by: David Sterba <dsterba@suse.com>
[ add stacktrace, update changelog, drop chunk_mutex ]
Signed-off-by: David Sterba <dsterba@suse.com>
2020-01-17 15:40:54 +01:00
Filipe Manana
5afe6ce748 Btrfs: always copy scrub arguments back to user space
If scrub returns an error we are not copying back the scrub arguments
structure to user space. This prevents user space to know how much
progress scrub has done if an error happened - this includes -ECANCELED
which is returned when users ask for scrub to stop. A particular use
case, which is used in btrfs-progs, is to resume scrub after it is
canceled, in that case it relies on checking the progress from the scrub
arguments structure and then use that progress in a call to resume
scrub.

So fix this by always copying the scrub arguments structure to user
space, overwriting the value returned to user space with -EFAULT only if
copying the structure failed to let user space know that either that
copying did not happen, and therefore the structure is stale, or it
happened partially and the structure is probably not valid and corrupt
due to the partial copy.

Reported-by: Graham Cobb <g.btrfs@cobb.uk.net>
Link: https://lore.kernel.org/linux-btrfs/d0a97688-78be-08de-ca7d-bcb4c7fb397e@cobb.uk.net/
Fixes: 06fe39ab15 ("Btrfs: do not overwrite scrub error with fault error in scrub ioctl")
CC: stable@vger.kernel.org # 5.1+
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Reviewed-by: Qu Wenruo <wqu@suse.com>
Tested-by: Graham Cobb <g.btrfs@cobb.uk.net>
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2020-01-17 15:28:52 +01:00
Linus Torvalds
13b2668d6f Merge tag 'gpio-v5.5-4' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio
Pull GPIO fixes from Linus Walleij:
 "This reverts the GPIOLIB_IRQCHIP in the ThunderX driver.

  ThunderX is a piece of Arm-based server chip. I converted the driver
  to hierarchical gpiochip without access to real silicon and failed
  miserably since I didn't take MSI's into account.

  Kevin Hao helpfully stepped in and fixed it properly, let's revert it
  for v5.5 and put the proper conversion into v5.6"

* tag 'gpio-v5.5-4' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio:
  Revert "gpio: thunderx: Switch to GPIOLIB_IRQCHIP"
2020-01-17 06:03:11 -08:00
Linus Torvalds
5ffdff81cf Merge tag 'block-5.5-2020-01-16' of git://git.kernel.dk/linux-block
Pull block fixes from Jens Axboe:
 "Three fixes that should go into this release:

   - The 32-bit segment size fix that I mentioned last week (Ming)

   - Use uint for the block size (Mikulas)

   - A null_blk zone write handling fix (Damien)"

* tag 'block-5.5-2020-01-16' of git://git.kernel.dk/linux-block:
  block: fix an integer overflow in logical block size
  null_blk: Fix zone write handling
  block: fix get_max_segment_size() overflow on 32bit arch
2020-01-17 05:54:18 -08:00
Johan Hovold
fdb838efa3 USB: serial: suppress driver bind attributes
USB-serial drivers must not be unbound from their ports before the
corresponding USB driver is unbound from the parent interface so
suppress the bind and unbind attributes.

Unbinding a serial driver while it's port is open is a sure way to
trigger a crash as any driver state is released on unbind while port
hangup is handled on the parent USB interface level. Drivers for
multiport devices where ports share a resource such as an interrupt
endpoint also generally cannot handle individual ports going away.

Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Cc: stable <stable@vger.kernel.org>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Johan Hovold <johan@kernel.org>
2020-01-17 11:11:26 +01:00
Jens Axboe
44d282796f io_uring: only allow submit from owning task
If the credentials or the mm doesn't match, don't allow the task to
submit anything on behalf of this ring. The task that owns the ring can
pass the file descriptor to another task, but we don't want to allow
that task to submit an SQE that then assumes the ring mm and creds if
it needs to go async.

Cc: stable@vger.kernel.org
Suggested-by: Stefan Metzmacher <metze@samba.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-01-16 21:43:24 -07:00
Linus Torvalds
575966e080 Merge tag 'armsoc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc
Pull ARM SoC fixes from Olof Johansson:
 "I've been sitting on these longer than I meant, so the patch count is
  a bit higher than ideal for this part of the release. There's also
  some reverts of double-applied patches that brings the diffstat up a
  bit.

  With that said, the biggest changes are:

   - Revert of duplicate i2c device addition on two Aspeed (BMC)
     Devicetrees.

   - Move of two device nodes that got applied to the wrong part of the
     tree on ASpeed G6.

   - Regulator fix for Beaglebone X15 (adding 12/5V supplies)

   - Use interrupts for keys on Amlogic SM1 to avoid missed polls

  In addition to that, there is a collection of smaller DT fixes:

   - Power supply assignment fixes for i.MX6

   - Fix of interrupt line for magnetometer on i.MX8 Librem5 devkit

   - Build fixlets (selects) for davinci/omap2+

   - More interrupt number fixes for Stratix10, Amlogic SM1, etc.

   - ... and more similar fixes across different platforms

  And some non-DT stuff:

   - optee fix to register multiple shared pages properly

   - Clock calculation fixes for MMP3

   - Clock fixes for OMAP as well"

* tag 'armsoc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: (42 commits)
  MAINTAINERS: Add myself as the co-maintainer for Actions Semi platforms
  ARM: dts: imx7: Fix Toradex Colibri iMX7S 256MB NAND flash support
  ARM: dts: imx6sll-evk: Remove incorrect power supply assignment
  ARM: dts: imx6sl-evk: Remove incorrect power supply assignment
  ARM: dts: imx6sx-sdb: Remove incorrect power supply assignment
  ARM: dts: imx6qdl-sabresd: Remove incorrect power supply assignment
  ARM: dts: imx6q-icore-mipi: Use 1.5 version of i.Core MX6DL
  ARM: omap2plus: select RESET_CONTROLLER
  ARM: davinci: select CONFIG_RESET_CONTROLLER
  ARM: dts: aspeed: rainier: Fix fan fault and presence
  ARM: dts: aspeed: rainier: Remove duplicate i2c busses
  ARM: dts: aspeed: tacoma: Remove duplicate flash nodes
  ARM: dts: aspeed: tacoma: Remove duplicate i2c busses
  ARM: dts: aspeed: tacoma: Fix fsi master node
  ARM: dts: aspeed-g6: Fix FSI master location
  ARM: dts: mmp3: Fix the TWSI ranges
  clk: mmp2: Fix the order of timer mux parents
  ARM: mmp: do not divide the clock rate
  arm64: dts: rockchip: Fix IR on Beelink A1
  optee: Fix multi page dynamic shm pool alloc
  ...
2020-01-16 19:42:08 -08:00
Linus Torvalds
ef64753c19 Merge tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux
Pull clk fixes from Stephen Boyd:
 "Second collection of clk fixes for the next release.

  This one includes a fix for PM on TI SoCs with sysc devices and fixes
  a bunch of clks that are stuck always enabled on Qualcomm SDM845 SoCs.

  Allwinner SoCs get the usual set of fixes too, mostly correcting
  drivers to have the right bits that match the hardware.

  There's also a Samsung and Tegra fix in here to mark a clk critical
  and avoid a double free.

  And finally there's a fix for critical clks that silences a big
  warning splat about trying to enable a clk that couldn't even be
  prepared"

* tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux:
  clk: ti: dra7-atl: Remove pm_runtime_irq_safe()
  clk: qcom: gcc-sdm845: Add missing flag to votable GDSCs
  clk: sunxi-ng: h6-r: Fix AR100/R_APB2 parent order
  clk: sunxi-ng: h6-r: Simplify R_APB1 clock definition
  clk: sunxi-ng: sun8i-r: Fix divider on APB0 clock
  clk: Don't try to enable critical clocks if prepare failed
  clk: tegra: Fix double-free in tegra_clk_init()
  clk: samsung: exynos5420: Keep top G3D clocks enabled
  clk: sunxi-ng: r40: Allow setting parent rate for external clock outputs
  clk: sunxi-ng: v3s: Fix incorrect number of hw_clks.
2020-01-16 19:25:11 -08:00
Linus Torvalds
f4353c3e2a Merge tag 'pm-5.5-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull power management fix from Rafael Wysocki:
 "Fix a coding mistake in the teo cpuidle governor causing data to be
  written beyond the last array element (Ikjoon Jang)"

* tag 'pm-5.5-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  cpuidle: teo: Fix intervals[] array indexing bug
2020-01-16 15:55:30 -08:00
Manivannan Sadhasivam
70db729fe1 MAINTAINERS: Add myself as the co-maintainer for Actions Semi platforms
Since I've been doing the maintainership work for couple of cycles, we've
decided to add myself as the co-maintainer along with Andreas.

Link: https://lore.kernel.org/r/20200114084348.25659-2-manivannan.sadhasivam@linaro.org
Cc: "Andreas Färber" <afaerber@suse.de>
Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Acked-by: Andreas Färber <afaerber@suse.de>
Signed-off-by: Olof Johansson <olof@lixom.net>
2020-01-16 15:49:19 -08:00
Linus Torvalds
0c99ee44b8 Merge tag 'tag-chrome-platform-fixes-for-v5.5-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/chrome-platform/linux
Pull chrome platform fix from Benson Leung:
 "One fix in the wilco_ec keyboard backlight driver to allow the EC
  driver to continue loading in the absence of a backlight module"

* tag 'tag-chrome-platform-fixes-for-v5.5-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/chrome-platform/linux:
  platform/chrome: wilco_ec: Fix keyboard backlight probing
2020-01-16 10:26:40 -08:00
Reinhard Speyerer
f3eaabbfd0 USB: serial: option: add support for Quectel RM500Q in QDL mode
Add support for Quectel RM500Q in QDL mode.

T:  Bus=02 Lev=01 Prnt=01 Port=00 Cnt=01 Dev#= 24 Spd=480  MxCh= 0
D:  Ver= 2.10 Cls=00(>ifc ) Sub=00 Prot=00 MxPS=64 #Cfgs=  1
P:  Vendor=2c7c ProdID=0800 Rev= 0.00
S:  Manufacturer=Qualcomm CDMA Technologies MSM
S:  Product=QUSB_BULK_SN:xxxxxxxx
S:  SerialNumber=xxxxxxxx
C:* #Ifs= 1 Cfg#= 1 Atr=a0 MxPwr=  2mA
I:* If#= 0 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=ff Prot=10 Driver=option
E:  Ad=81(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
E:  Ad=01(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms

It is assumed that the ZLP flag required for other Qualcomm-based
5G devices also applies to Quectel RM500Q.

Signed-off-by: Reinhard Speyerer <rspmn@arcor.de>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Johan Hovold <johan@kernel.org>
2020-01-16 16:54:34 +01:00
Takashi Iwai
e5dbdcb312 Merge tag 'asoc-fix-v5.5-rc6' of https://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound into for-linus
ASoC: Fixes for v5.5

This is mostly driver specific fixes, plus an error handling fix
in the core.  There is a rather large diffstat for the stm32 SAI
driver, this is a very large but mostly mechanical update which
wraps every register access in the driver to allow a fix to the
locking which avoids circular locks, the active change is much
smaller and more reasonably sized.
2020-01-16 14:14:26 +01:00
Miklos Szeredi
7df1e988c7 fuse: fix fuse_send_readpages() in the syncronous read case
Buffered read in fuse normally goes via:

 -> generic_file_buffered_read()
   -> fuse_readpages()
     -> fuse_send_readpages()
       ->fuse_simple_request() [called since v5.4]

In the case of a read request, fuse_simple_request() will return a
non-negative bytecount on success or a negative error value.  A positive
bytecount was taken to be an error and the PG_error flag set on the page.
This resulted in generic_file_buffered_read() falling back to ->readpage(),
which would repeat the read request and succeed.  Because of the repeated
read succeeding the bug was not detected with regression tests or other use
cases.

The FTP module in GVFS however fails the second read due to the
non-seekable nature of FTP downloads.

Fix by checking and ignoring positive return value from
fuse_simple_request().

Reported-by: Ondrej Holy <oholy@redhat.com>
Link: https://gitlab.gnome.org/GNOME/gvfs/issues/441
Fixes: 134831e36b ("fuse: convert readpages to simple api")
Cc: <stable@vger.kernel.org> # v5.4
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2020-01-16 11:09:36 +01:00
Jens Axboe
11ba820bf1 io_uring: ensure workqueue offload grabs ring mutex for poll list
A previous commit moved the locking for the async sqthread, but didn't
take into account that the io-wq workers still need it. We can't use
req->in_async for this anymore as both the sqthread and io-wq workers
set it, gate the need for locking on io_wq_current_is_worker() instead.

Fixes: 8a4955ff1c ("io_uring: sqthread should grab ctx->uring_lock for submissions")
Reported-by: Bijan Mottahedeh <bijan.mottahedeh@oracle.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-01-15 21:51:17 -07:00
Mikulas Patocka
ad6bf88a6c block: fix an integer overflow in logical block size
Logical block size has type unsigned short. That means that it can be at
most 32768. However, there are architectures that can run with 64k pages
(for example arm64) and on these architectures, it may be possible to
create block devices with 64k block size.

For exmaple (run this on an architecture with 64k pages):

Mount will fail with this error because it tries to read the superblock using 2-sector
access:
  device-mapper: writecache: I/O is not aligned, sector 2, size 1024, block size 65536
  EXT4-fs (dm-0): unable to read superblock

This patch changes the logical block size from unsigned short to unsigned
int to avoid the overflow.

Cc: stable@vger.kernel.org
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-01-15 21:43:09 -07:00
Bijan Mottahedeh
797f3f535d io_uring: clear req->result always before issuing a read/write request
req->result is cleared when io_issue_sqe() calls io_read/write_pre()
routines.  Those routines however are not called when the sqe
argument is NULL, which is the case when io_issue_sqe() is called from
io_wq_submit_work().  io_issue_sqe() may then examine a stale result if
a polled request had previously failed with -EAGAIN:

        if (ctx->flags & IORING_SETUP_IOPOLL) {
                if (req->result == -EAGAIN)
                        return -EAGAIN;

                io_iopoll_req_issued(req);
        }

and in turn cause a subsequently completed request to be re-issued in
io_wq_submit_work().

Signed-off-by: Bijan Mottahedeh <bijan.mottahedeh@oracle.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-01-15 21:36:13 -07:00
Takashi Iwai
60adcfde92 ALSA: seq: Fix racy access for queue timer in proc read
snd_seq_info_timer_read() reads the information of the timer assigned
for each queue, but it's done in a racy way which may lead to UAF as
spotted by syzkaller.

This patch applies the missing q->timer_mutex lock while accessing the
timer object as well as a slight code change to adapt the standard
coding style.

Reported-by: syzbot+2b2ef983f973e5c40943@syzkaller.appspotmail.com
Cc: <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/20200115203733.26530-1-tiwai@suse.de
Signed-off-by: Takashi Iwai <tiwai@suse.de>
2020-01-15 21:38:18 +01:00
Jari Ruusu
f5ae2ea634 Fix built-in early-load Intel microcode alignment
Intel Software Developer's Manual, volume 3, chapter 9.11.6 says:

 "Note that the microcode update must be aligned on a 16-byte boundary
  and the size of the microcode update must be 1-KByte granular"

When early-load Intel microcode is loaded from initramfs, userspace tool
'iucode_tool' has already 16-byte aligned those microcode bits in that
initramfs image.  Image that was created something like this:

 iucode_tool --write-earlyfw=FOO.cpio microcode-files...

However, when early-load Intel microcode is loaded from built-in
firmware BLOB using CONFIG_EXTRA_FIRMWARE= kernel config option, that
16-byte alignment is not guaranteed.

Fix this by forcing all built-in firmware BLOBs to 16-byte alignment.

[ If we end up having other firmware with much bigger alignment
  requirements, we might need to introduce some method for the firmware
  to specify it, this is the minimal "just increase the alignment a bit
  to account for this one special case" patch    - Linus ]

Signed-off-by: Jari Ruusu <jari.ruusu@gmail.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Luis Chamberlain <mcgrof@kernel.org>
Cc: stable@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-01-15 11:50:37 -08:00
Linus Torvalds
a4feff2264 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/lftan/nios2
Pull arch/nios2 fixlet from Ley Foon Tan:
 "Update my nios2 maintainer email"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/lftan/nios2:
  MAINTAINERS: Update Ley Foon Tan's email address
2020-01-15 11:33:53 -08:00
Linus Torvalds
51d6981751 Merge tag 'platform-drivers-x86-v5.5-3' of git://git.infradead.org/linux-platform-drivers-x86
Pull x86 platform driver fixes from Andy Shevchenko:

 - Fix keyboard brightness control for ASUS laptops

 - Better handling parameters of GPD pocket fan module to avoid
   thermal shock

 - Add IDs to PMC platform driver to support Intel Comet Lake

 - Fix potential dead lock in Mellanox TM FIFO driver and ABI
   documentation

* tag 'platform-drivers-x86-v5.5-3' of git://git.infradead.org/linux-platform-drivers-x86:
  Documentation/ABI: Add missed attribute for mlxreg-io sysfs interfaces
  Documentation/ABI: Fix documentation inconsistency for mlxreg-io sysfs interfaces
  platform/x86: asus-wmi: Fix keyboard brightness cannot be set to 0
  platform/x86: intel_pmc_core: update Comet Lake platform driver
  platform/x86: GPD pocket fan: Allow somewhat lower/higher temperature limits
  platform/x86: GPD pocket fan: Use default values when wrong modparams are given
  platform/mellanox: fix potential deadlock in the tmfifo driver
  platform/x86: intel-ips: Use the correct style for SPDX License Identifier
2020-01-15 11:30:50 -08:00
Linus Torvalds
0174cb6ce9 Merge branch 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6
Pull crypto fix from Herbert Xu:
 "This fixes a build problem for the hisilicon driver"

* 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
  crypto: hisilicon/sec2 - Use atomics instead of __sync
2020-01-15 10:21:34 -08:00
Linus Torvalds
84bf39461e Merge branch 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull vfs fixes from Al Viro:
 "Fixes for mountpoint_last() bugs (by converting to use of
  lookup_last()) and an autofs regression fix from this cycle (caused by
  follow_managed() breakage introduced in barrier fixes series)"

* 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  fix autofs regression caused by follow_managed() changes
  reimplement path_mountpoint() with less magic
2020-01-15 09:58:14 -08:00
Damien Le Moal
16c731fed6 null_blk: Fix zone write handling
null_zone_write() only allows writing empty and implicitly opened zones.
Writing to closed and explicitly opened zones must also be allowed and
the zone condition must be transitioned to implicit open if the zone
is not explicitly opened already.

Fixes: da644b2cc1 ("null_blk: add zone open, close, and finish support")
Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-01-15 08:18:39 -07:00
Ian Abbott
9fea3a40f6 staging: comedi: ni_routes: allow partial routing information
This patch fixes a regression on setting up asynchronous commands to use
external trigger sources when board-specific routing information is
missing.

`ni_find_device_routes()` (called via `ni_assign_device_routes()`) finds
the table of register values for the device family and the set of valid
routes for the specific board.  If both are found,
`tables->route_values` is set to point to the table of register values
for the device family and `tables->valid_routes` is set to point to the
list of valid routes for the specific board.  If either is not found,
both `tables->route_values` and `tables->valid_routes` are left set at
their initial null values (initialized by `ni_assign_device_routes()`)
and the function returns `-ENODATA`.

Returning an error results in some routing functionality being disabled.
Unfortunately, leaving `table->route_values` set to `NULL` also breaks
the setting up of asynchronous commands that are configured to use
external trigger sources.  Calls to `ni_check_trigger_arg()` or
`ni_check_trigger_arg_roffs()` while checking the asynchronous command
set-up would result in a null pointer dereference if
`table->route_values` is `NULL`.  The null pointer dereference is fixed
in another patch, but it now results in failure to set up the
asynchronous command.  That is a regression from the behavior prior to
commit 347e244884 ("staging: comedi: tio: implement global tio/ctr
routing") and commit 56d0b826d3 ("staging: comedi: ni_mio_common:
implement new routing for TRIG_EXT").

Change `ni_find_device_routes()` to set `tables->route_values` and/or
`tables->valid_routes` to valid information even if the other one can
only be set to `NULL` due to missing information.  The function will
still return an error in that case.  This should result in
`tables->valid_routes` being valid for all currently supported device
families even if the board-specific routing information is missing.
That should be enough to fix the regression on setting up asynchronous
commands to use external triggers for boards with missing routing
information.

Fixes: 347e244884 ("staging: comedi: tio: implement global tio/ctr routing")
Fixes: 56d0b826d3 ("staging: comedi: ni_mio_common: implement new routing for TRIG_EXT").
Cc: <stable@vger.kernel.org> # 4.20+
Cc: Spencer E. Olson <olsonse@umich.edu>
Signed-off-by: Ian Abbott <abbotti@mev.co.uk>
Link: https://lore.kernel.org/r/20200114182532.132058-3-abbotti@mev.co.uk
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-01-15 13:30:09 +01:00
Ian Abbott
01e20b664f staging: comedi: ni_routes: fix null dereference in ni_find_route_source()
In `ni_find_route_source()`, `tables->route_values` gets dereferenced.
However it is possible that `tables->route_values` is `NULL`, leading to
a null pointer dereference.  `tables->route_values` will be `NULL` if
the call to `ni_assign_device_routes()` during board initialization
returned an error due to missing device family routing information or
missing board-specific routing information.  For example, there is
currently no board-specific routing information provided for the
PCIe-6251 board and several other boards, so those are affected by this
bug.

The bug is triggered when `ni_find_route_source()` is called via
`ni_check_trigger_arg()` or `ni_check_trigger_arg_roffs()` when checking
the arguments for setting up asynchronous commands.  Fix it by returning
`-EINVAL` if `tables->route_values` is `NULL`.

Even with this fix, setting up asynchronous commands to use external
trigger sources for boards with missing routing information will still
fail gracefully.  Since `ni_find_route_source()` only depends on the
device family routing information, it would be better if that was made
available even if the board-specific routing information is missing.
That will be addressed by another patch.

Fixes: 4bb90c87ab ("staging: comedi: add interface to ni routing table information")
Cc: <stable@vger.kernel.org> # 4.20+
Cc: Spencer E. Olson <olsonse@umich.edu>
Signed-off-by: Ian Abbott <abbotti@mev.co.uk>
Link: https://lore.kernel.org/r/20200114182532.132058-2-abbotti@mev.co.uk
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-01-15 13:30:08 +01:00
Keiya Nobuta
9c06ac4c83 usb: core: hub: Improved device recognition on remote wakeup
If hub_activate() is called before D+ has stabilized after remote
wakeup, the following situation might occur:

         __      ___________________
        /  \    /
D+   __/    \__/

Hub  _______________________________
          |  ^   ^           ^
          |  |   |           |
Host _____v__|___|___________|______
          |  |   |           |
          |  |   |           \-- Interrupt Transfer (*3)
          |  |    \-- ClearPortFeature (*2)
          |   \-- GetPortStatus (*1)
          \-- Host detects remote wakeup

- D+ goes high, Host starts running by remote wakeup
- D+ is not stable, goes low
- Host requests GetPortStatus at (*1) and gets the following hub status:
  - Current Connect Status bit is 0
  - Connect Status Change bit is 1
- D+ stabilizes, goes high
- Host requests ClearPortFeature and thus Connect Status Change bit is
  cleared at (*2)
- After waiting 100 ms, Host starts the Interrupt Transfer at (*3)
- Since the Connect Status Change bit is 0, Hub returns NAK.

In this case, port_event() is not called in hub_event() and Host cannot
recognize device. To solve this issue, flag change_bits even if only
Connect Status Change bit is 1 when got in the first GetPortStatus.

This issue occurs rarely because it only if D+ changes during a very
short time between GetPortStatus and ClearPortFeature. However, it is
fatal if it occurs in embedded system.

Signed-off-by: Keiya Nobuta <nobuta.keiya@fujitsu.com>
Cc: stable <stable@vger.kernel.org>
Acked-by: Alan Stern <stern@rowland.harvard.edu>
Link: https://lore.kernel.org/r/20200109051448.28150-1-nobuta.keiya@fujitsu.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-01-15 13:14:28 +01:00
Kevin Hao
a564ac35d6 Revert "gpio: thunderx: Switch to GPIOLIB_IRQCHIP"
This reverts commit a7fc89f9d5 because
there are some bugs in this commit, and we don't have a simple way to
fix these bugs. So revert this commit to make the thunderx gpio work
on the stable kernel at least. We will switch to GPIOLIB_IRQCHIP
for thunderx gpio by following patches.

Fixes: a7fc89f9d5 ("gpio: thunderx: Switch to GPIOLIB_IRQCHIP")
Signed-off-by: Kevin Hao <haokexin@gmail.com>
Link: https://lore.kernel.org/r/20200114082821.14015-2-haokexin@gmail.com
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
2020-01-15 11:17:21 +01:00
Al Viro
508c877276 fix autofs regression caused by follow_managed() changes
we need to reload ->d_flags after the call of ->d_manage() - the thing
might've been called with dentry still negative and have the damn thing
turned positive while we'd waited.

Fixes: d41efb522e "fs/namei.c: pull positivity check into follow_managed()"
Reported-by: Ian Kent <raven@themaw.net>
Tested-by: Ian Kent <raven@themaw.net>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2020-01-15 01:36:46 -05:00
Al Viro
c64cd6e34e reimplement path_mountpoint() with less magic
... and get rid of a bunch of bugs in it.  Background:
the reason for path_mountpoint() is that umount() really doesn't
want attempts to revalidate the root of what it's trying to umount.
The thing we want to avoid actually happen from complete_walk();
solution was to do something parallel to normal path_lookupat()
and it both went overboard and got the boilerplate subtly
(and not so subtly) wrong.

A better solution is to do pretty much what the normal path_lookupat()
does, but instead of complete_walk() do unlazy_walk().  All it takes
to avoid that ->d_weak_revalidate() call...  mountpoint_last() goes
away, along with everything it got wrong, and so does the magic around
LOOKUP_NO_REVAL.

Another source of bugs is that when we traverse mounts at the final
location (and we need to do that - umount . expects to get whatever's
overmounting ., if any, out of the lookup) we really ought to take
care of ->d_manage() - as it is, manual umount of autofs automount
in progress can lead to unpleasant surprises for the daemon.  Easily
solved by using handle_lookup_down() instead of follow_mount().

Tested-by: Ian Kent <raven@themaw.net>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2020-01-15 01:36:06 -05:00
Jens Axboe
78912934f4 io_uring: be consistent in assigning next work from handler
If we pass back dependent work in case of links, we need to always
ensure that we call the link setup and work prep handler. If not, we
might be missing some setup for the next work item.

Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-01-14 22:09:06 -07:00
Jens Axboe
e0bbb3461a io-wq: cancel work if we fail getting a mm reference
If we require mm and user context, mark the request for cancellation
if we fail to acquire the desired mm.

Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-01-14 22:06:11 -07:00
Ley Foon Tan
051d75d3bb MAINTAINERS: Update Ley Foon Tan's email address
@altera.com email is going to removed. Change to @intel.com email.

Signed-off-by: Ley Foon Tan <ley.foon.tan@intel.com>
2020-01-15 11:11:22 +08:00
Linus Torvalds
95e20af9fb Merge tag 'nfs-for-5.5-2' of git://git.linux-nfs.org/projects/anna/linux-nfs
Pull NFS client bugfixes from Anna Schumaker:
 "Three NFS over RDMA fixes for bugs Chuck found that can be hit during
  device removal:

   - Fix create_qp crash on device unload

   - Fix completion wait during device removal

   - Fix oops in receive handler after device removal"

* tag 'nfs-for-5.5-2' of git://git.linux-nfs.org/projects/anna/linux-nfs:
  xprtrdma: Fix oops in Receive handler after device removal
  xprtrdma: Fix completion wait during device removal
  xprtrdma: Fix create_qp crash on device unload
2020-01-14 13:33:14 -08:00
Ming Lei
4a2f704eb2 block: fix get_max_segment_size() overflow on 32bit arch
Commit 429120f3df starts to take account of segment's start dma address
when computing max segment size, and data type of 'unsigned long'
is used to do that. However, the segment mask may be 0xffffffff, so
the figured out segment size may be overflowed in case of zero physical
address on 32bit arch.

Fix the issue by returning queue_max_segment_size() directly when that
happens.

Fixes: 429120f3df ("block: fix splitting segments on boundary masks")
Reported-by: Guenter Roeck <linux@roeck-us.net>
Tested-by: Guenter Roeck <linux@roeck-us.net>
Cc: Christoph Hellwig <hch@lst.de>
Tested-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-01-14 13:37:40 -07:00
Chuck Lever
671c450b6f xprtrdma: Fix oops in Receive handler after device removal
Since v5.4, a device removal occasionally triggered this oops:

Dec  2 17:13:53 manet kernel: BUG: unable to handle page fault for address: 0000000c00000219
Dec  2 17:13:53 manet kernel: #PF: supervisor read access in kernel mode
Dec  2 17:13:53 manet kernel: #PF: error_code(0x0000) - not-present page
Dec  2 17:13:53 manet kernel: PGD 0 P4D 0
Dec  2 17:13:53 manet kernel: Oops: 0000 [#1] SMP
Dec  2 17:13:53 manet kernel: CPU: 2 PID: 468 Comm: kworker/2:1H Tainted: G        W         5.4.0-00050-g53717e43af61 #883
Dec  2 17:13:53 manet kernel: Hardware name: Supermicro SYS-6028R-T/X10DRi, BIOS 1.1a 10/16/2015
Dec  2 17:13:53 manet kernel: Workqueue: ib-comp-wq ib_cq_poll_work [ib_core]
Dec  2 17:13:53 manet kernel: RIP: 0010:rpcrdma_wc_receive+0x7c/0xf6 [rpcrdma]
Dec  2 17:13:53 manet kernel: Code: 6d 8b 43 14 89 c1 89 45 78 48 89 4d 40 8b 43 2c 89 45 14 8b 43 20 89 45 18 48 8b 45 20 8b 53 14 48 8b 30 48 8b 40 10 48 8b 38 <48> 8b 87 18 02 00 00 48 85 c0 75 18 48 8b 05 1e 24 c4 e1 48 85 c0
Dec  2 17:13:53 manet kernel: RSP: 0018:ffffc900035dfe00 EFLAGS: 00010246
Dec  2 17:13:53 manet kernel: RAX: ffff888467290000 RBX: ffff88846c638400 RCX: 0000000000000048
Dec  2 17:13:53 manet kernel: RDX: 0000000000000048 RSI: 00000000f942e000 RDI: 0000000c00000001
Dec  2 17:13:53 manet kernel: RBP: ffff888467611b00 R08: ffff888464e4a3c4 R09: 0000000000000000
Dec  2 17:13:53 manet kernel: R10: ffffc900035dfc88 R11: fefefefefefefeff R12: ffff888865af4428
Dec  2 17:13:53 manet kernel: R13: ffff888466023000 R14: ffff88846c63f000 R15: 0000000000000010
Dec  2 17:13:53 manet kernel: FS:  0000000000000000(0000) GS:ffff88846fa80000(0000) knlGS:0000000000000000
Dec  2 17:13:53 manet kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Dec  2 17:13:53 manet kernel: CR2: 0000000c00000219 CR3: 0000000002009002 CR4: 00000000001606e0
Dec  2 17:13:53 manet kernel: Call Trace:
Dec  2 17:13:53 manet kernel: __ib_process_cq+0x5c/0x14e [ib_core]
Dec  2 17:13:53 manet kernel: ib_cq_poll_work+0x26/0x70 [ib_core]
Dec  2 17:13:53 manet kernel: process_one_work+0x19d/0x2cd
Dec  2 17:13:53 manet kernel: ? cancel_delayed_work_sync+0xf/0xf
Dec  2 17:13:53 manet kernel: worker_thread+0x1a6/0x25a
Dec  2 17:13:53 manet kernel: ? cancel_delayed_work_sync+0xf/0xf
Dec  2 17:13:53 manet kernel: kthread+0xf4/0xf9
Dec  2 17:13:53 manet kernel: ? kthread_queue_delayed_work+0x74/0x74
Dec  2 17:13:53 manet kernel: ret_from_fork+0x24/0x30

The proximal cause is that this rpcrdma_rep has a rr_rdmabuf that
is still pointing to the old ib_device, which has been freed. The
only way that is possible is if this rpcrdma_rep was not destroyed
by rpcrdma_ia_remove.

Debugging showed that was indeed the case: this rpcrdma_rep was
still in use by a completing RPC at the time of the device removal,
and thus wasn't on the rep free list. So, it was not found by
rpcrdma_reps_destroy().

The fix is to introduce a list of all rpcrdma_reps so that they all
can be found when a device is removed. That list is used to perform
only regbuf DMA unmapping, replacing that call to
rpcrdma_reps_destroy().

Meanwhile, to prevent corruption of this list, I've moved the
destruction of temp rpcrdma_rep objects to rpcrdma_post_recvs().
rpcrdma_xprt_drain() ensures that post_recvs (and thus rep_destroy) is
not invoked while rpcrdma_reps_unmap is walking rb_all_reps, thus
protecting the rb_all_reps list.

Fixes: b0b227f071 ("xprtrdma: Use an llist to manage free rpcrdma_reps")
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2020-01-14 13:30:24 -05:00
Chuck Lever
13cb886c59 xprtrdma: Fix completion wait during device removal
I've found that on occasion, "rmmod <dev>" will hang while if an NFS
is under load.

Ensure that ri_remove_done is initialized only just before the
transport is woken up to force a close. This avoids the completion
possibly getting initialized again while the CM event handler is
waiting for a wake-up.

Fixes: bebd031866 ("xprtrdma: Support unplugging an HCA from under an NFS mount")
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2020-01-14 13:30:24 -05:00
Chuck Lever
b32b9ed493 xprtrdma: Fix create_qp crash on device unload
On device re-insertion, the RDMA device driver crashes trying to set
up a new QP:

Nov 27 16:32:06 manet kernel: BUG: kernel NULL pointer dereference, address: 00000000000001c0
Nov 27 16:32:06 manet kernel: #PF: supervisor write access in kernel mode
Nov 27 16:32:06 manet kernel: #PF: error_code(0x0002) - not-present page
Nov 27 16:32:06 manet kernel: PGD 0 P4D 0
Nov 27 16:32:06 manet kernel: Oops: 0002 [#1] SMP
Nov 27 16:32:06 manet kernel: CPU: 1 PID: 345 Comm: kworker/u28:0 Tainted: G        W         5.4.0 #852
Nov 27 16:32:06 manet kernel: Hardware name: Supermicro SYS-6028R-T/X10DRi, BIOS 1.1a 10/16/2015
Nov 27 16:32:06 manet kernel: Workqueue: xprtiod xprt_rdma_connect_worker [rpcrdma]
Nov 27 16:32:06 manet kernel: RIP: 0010:atomic_try_cmpxchg+0x2/0x12
Nov 27 16:32:06 manet kernel: Code: ff ff 48 8b 04 24 5a c3 c6 07 00 0f 1f 40 00 c3 31 c0 48 81 ff 08 09 68 81 72 0c 31 c0 48 81 ff 83 0c 68 81 0f 92 c0 c3 8b 06 <f0> 0f b1 17 0f 94 c2 84 d2 75 02 89 06 88 d0 c3 53 ba 01 00 00 00
Nov 27 16:32:06 manet kernel: RSP: 0018:ffffc900035abbf0 EFLAGS: 00010046
Nov 27 16:32:06 manet kernel: RAX: 0000000000000000 RBX: 00000000000001c0 RCX: 0000000000000000
Nov 27 16:32:06 manet kernel: RDX: 0000000000000001 RSI: ffffc900035abbfc RDI: 00000000000001c0
Nov 27 16:32:06 manet kernel: RBP: ffffc900035abde0 R08: 000000000000000e R09: ffffffffffffc000
Nov 27 16:32:06 manet kernel: R10: 0000000000000000 R11: 000000000002e800 R12: ffff88886169d9f8
Nov 27 16:32:06 manet kernel: R13: ffff88886169d9f4 R14: 0000000000000246 R15: 0000000000000000
Nov 27 16:32:06 manet kernel: FS:  0000000000000000(0000) GS:ffff88846fa40000(0000) knlGS:0000000000000000
Nov 27 16:32:06 manet kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Nov 27 16:32:06 manet kernel: CR2: 00000000000001c0 CR3: 0000000002009006 CR4: 00000000001606e0
Nov 27 16:32:06 manet kernel: Call Trace:
Nov 27 16:32:06 manet kernel: do_raw_spin_lock+0x2f/0x5a
Nov 27 16:32:06 manet kernel: create_qp_common.isra.47+0x856/0xadf [mlx4_ib]
Nov 27 16:32:06 manet kernel: ? slab_post_alloc_hook.isra.60+0xa/0x1a
Nov 27 16:32:06 manet kernel: ? __kmalloc+0x125/0x139
Nov 27 16:32:06 manet kernel: mlx4_ib_create_qp+0x57f/0x972 [mlx4_ib]

The fix is to copy the qp_init_attr struct that was just created by
rpcrdma_ep_create() instead of using the one from the previous
connection instance.

Fixes: 98ef77d1aa ("xprtrdma: Send Queue size grows after a reconnect")
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2020-01-14 13:30:24 -05:00