Commit Graph

950968 Commits

Author SHA1 Message Date
Linus Torvalds
6f0306d1bf Merge tag 'usb-5.9-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb
Pull USB fixes from Greg KH:
 "Let's try this again...  Here are some USB fixes for 5.9-rc3.

  This differs from the previous pull request for this release in that
  the usb gadget patch now does not break some systems, and actually
  does what it was intended to do. Many thanks to Marek Szyprowski for
  quickly noticing and testing the patch from Andy Shevchenko to resolve
  this issue.

  Additionally, some more new USB quirks have been added to get some new
  devices to work properly based on user reports.

  Other than that, the patches are all here, and they contain:

   - usb gadget driver fixes

   - xhci driver fixes

   - typec fixes

   - new quirks and ids

   - fixes for USB patches that went into 5.9-rc1.

  All of these have been tested in linux-next with no reported issues"

* tag 'usb-5.9-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: (33 commits)
  usb: storage: Add unusual_uas entry for Sony PSZ drives
  USB: Ignore UAS for JMicron JMS567 ATA/ATAPI Bridge
  usb: host: ohci-exynos: Fix error handling in exynos_ohci_probe()
  USB: gadget: u_f: Unbreak offset calculation in VLAs
  USB: quirks: Ignore duplicate endpoint on Sound Devices MixPre-D
  usb: typec: tcpm: Fix Fix source hard reset response for TDA 2.3.1.1 and TDA 2.3.1.2 failures
  USB: PHY: JZ4770: Fix static checker warning.
  USB: gadget: f_ncm: add bounds checks to ncm_unwrap_ntb()
  USB: gadget: u_f: add overflow checks to VLA macros
  xhci: Always restore EP_SOFT_CLEAR_TOGGLE even if ep reset failed
  xhci: Do warm-reset when both CAS and XDEV_RESUME are set
  usb: host: xhci: fix ep context print mismatch in debugfs
  usb: uas: Add quirk for PNY Pro Elite
  tools: usb: move to tools buildsystem
  USB: Fix device driver race
  USB: Also match device drivers using the ->match vfunc
  usb: host: xhci-tegra: fix tegra_xusb_get_phy()
  usb: host: xhci-tegra: otg usb2/usb3 port init
  usb: hcd: Fix use after free in usb_hcd_pci_remove()
  usb: typec: ucsi: Hold con->lock for the entire duration of ucsi_register_port()
  ...
2020-08-30 10:51:03 -07:00
Linus Torvalds
42df60fcdf Merge tag 'edac_urgent_for_v5.9_rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras
Pull EDAC fix from Borislav Petkov:
 "A fix to properly clear ghes_edac driver state on driver remove so
  that a subsequent load can probe the system properly (Shiju Jose)"

* tag 'edac_urgent_for_v5.9_rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras:
  EDAC/ghes: Fix NULL pointer dereference in ghes_edac_register()
2020-08-30 10:47:23 -07:00
Linus Torvalds
c4011283a7 Merge tag 'dma-mapping-5.9-2' of git://git.infradead.org/users/hch/dma-mapping
Pull dma-mapping fix from Christoph Hellwig:
 "Fix a possibly uninitialized variable (Dan Carpenter)"

* tag 'dma-mapping-5.9-2' of git://git.infradead.org/users/hch/dma-mapping:
  dma-pool: Fix an uninitialized variable bug in atomic_pool_expand()
2020-08-30 10:37:18 -07:00
Thomas Gleixner
784a083037 genirq/matrix: Deal with the sillyness of for_each_cpu() on UP
Most of the CPU mask operations behave the same way, but for_each_cpu() and
it's variants ignore the cpumask argument and claim that CPU0 is always in
the mask. This is historical, inconsistent and annoying behaviour.

The matrix allocator uses for_each_cpu() and can be called on UP with an
empty cpumask. The calling code does not expect that this succeeds but
until commit e027fffff7 ("x86/irq: Unbreak interrupt affinity setting")
this went unnoticed. That commit added a WARN_ON() to catch cases which
move an interrupt from one vector to another on the same CPU. The warning
triggers on UP.

Add a check for the cpumask being empty to prevent this.

Fixes: 2f75d9e1c9 ("genirq: Implement bitmap matrix allocator")
Reported-by: kernel test robot <rong.a.chen@intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@vger.kernel.org
2020-08-30 19:17:28 +02:00
Linus Torvalds
1127b219ce Merge tag 'fallthrough-fixes-5.9-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gustavoars/linux
Pull fallthrough fixes from Gustavo A. R. Silva:
 "Fix some minor issues introduced by the recent treewide fallthrough
  conversions:

   - Fix identation issue

   - Fix erroneous fallthrough annotation

   - Remove unnecessary fallthrough annotation

   - Fix code comment changed by fallthrough conversion"

* tag 'fallthrough-fixes-5.9-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gustavoars/linux:
  arm64/cpuinfo: Remove unnecessary fallthrough annotation
  media: dib0700: Fix identation issue in dib8096_set_param_override()
  afs: Remove erroneous fallthough annotation
  iio: dpot-dac: fix code comment in dpot_dac_read_raw()
2020-08-29 14:21:58 -07:00
Linus Torvalds
0a4c56c80f fsldma: fix very broken 32-bit ppc ioread64 functionality
Commit ef91bb196b ("kernel.h: Silence sparse warning in
lower_32_bits") caused new warnings to show in the fsldma driver, but
that commit was not to blame: it only exposed some very incorrect code
that tried to take the low 32 bits of an address.

That made no sense for multiple reasons, the most notable one being that
that code was intentionally limited to only 32-bit ppc builds, so "only
low 32 bits of an address" was completely nonsensical.  There were no
high bits to mask off to begin with.

But even more importantly fropm a correctness standpoint, turning the
address into an integer then caused the subsequent address arithmetic to
be completely wrong too, and the "+1" actually incremented the address
by one, rather than by four.

Which again was incorrect, since the code was reading two 32-bit values
and trying to make a 64-bit end result of it all.  Surprisingly, the
iowrite64() did not suffer from the same odd and incorrect model.

This code has never worked, but it's questionable whether anybody cared:
of the two users that actually read the 64-bit value (by way of some C
preprocessor hackery and eventually the 'get_cdar()' inline function),
one of them explicitly ignored the value, and the other one might just
happen to work despite the incorrect value being read.

This patch at least makes it not fail the build any more, and makes the
logic superficially sane.  Whether it makes any difference to the code
_working_ or not shall remain a mystery.

Compile-tested-by: Guenter Roeck <linux@roeck-us.net>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-08-29 13:50:56 -07:00
Linus Torvalds
e77aee1326 Merge branch 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux
Pull i2c fixes from Wolfram Sang:
 "A core fix for ACPI matching and two driver bugfixes"

* 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
  i2c: iproc: Fix shifting 31 bits
  i2c: rcar: in slave mode, clear NACK earlier
  i2c: acpi: Remove dead code, i.e. i2c_acpi_match_device()
  i2c: core: Don't fail PRP0001 enumeration when no ID table exist
2020-08-29 13:13:44 -07:00
Linus Torvalds
1b46b921b0 Merge tag 's390-5.9-4' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux
Pull s390 fixes from Vasily Gorbik:

 - Disable preemption trace in percpu macros since the lockdep code
   itself uses percpu variables now and it causes recursions.

 - Fix kernel space 4-level paging broken by recent vmem rework.

* tag 's390-5.9-4' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
  s390/vmem: fix vmem_add_range for 4-level paging
  s390: don't trace preemption in percpu macros
2020-08-29 12:51:58 -07:00
Linus Torvalds
c8b5563abe Merge tag 'for-linus-5.9-rc3-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip
Pull xen fixes from Juergen Gross:
 "Two fixes for Xen: one needed for ongoing work to support virtio with
  Xen, and one for a corner case in IRQ handling with Xen"

* tag 'for-linus-5.9-rc3-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
  arm/xen: Add misuse warning to virt_to_gfn
  xen/xenbus: Fix granting of vmalloc'd memory
  XEN uses irqdesc::irq_data_common::handler_data to store a per interrupt XEN data pointer which contains XEN specific information.
2020-08-29 12:44:30 -07:00
Linus Torvalds
e4cad138aa Merge tag 'hwmon-for-v5.9-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging
Pull hwmon fixes from Guenter Roeck:

 - Fix tempeerature scale in gsc-hwmon driver

 - Fix divide by 0 error in nct7904 driver

 - Drop non-existing attribute from pmbus/isl68137 driver

 - Fix status check in applesmc driver

* tag 'hwmon-for-v5.9-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging:
  hwmon: (gsc-hwmon) Scale temperature to millidegrees
  hwmon: (applesmc) check status earlier.
  hwmon: (nct7904) Correct divide by 0
  hwmon: (pmbus/isl68137) remove READ_TEMPERATURE_1 telemetry for RAA228228
2020-08-29 12:37:00 -07:00
Jens Axboe
5d220bcd37 Merge branch 'nvme-5.9-rc' of git://git.infradead.org/nvme into block-5.9
Pull NVMe fixes from Sagi:

"- instance leak and io boundary fixes from Keith
 - fc locking fix from Christophe
 - various tcp/rdma reset during traffic fixes from Me
 - pci use-after-free fix from Tong
 - tcp target null deref fix from Ziye"

* 'nvme-5.9-rc' of git://git.infradead.org/nvme:
  nvme-pci: cancel nvme device request before disabling
  nvme: only use power of two io boundaries
  nvme: fix controller instance leak
  nvmet-fc: Fix a missed _irqsave version of spin_lock in 'nvmet_fc_fod_op_done()'
  nvme: Fix NULL dereference for pci nvme controllers
  nvme-rdma: fix reset hang if controller died in the middle of a reset
  nvme-rdma: fix timeout handler
  nvme-rdma: serialize controller teardown sequences
  nvme-tcp: fix reset hang if controller died in the middle of a reset
  nvme-tcp: fix timeout handler
  nvme-tcp: serialize controller teardown sequences
  nvme: have nvme_wait_freeze_timeout return if it timed out
  nvme-fabrics: don't check state NVME_CTRL_NEW for request acceptance
  nvmet-tcp: Fix NULL dereference when a connect data comes in h2cdata pdu
2020-08-29 10:54:11 -06:00
Balazs Scheidler
67407a406d netfilter: nft_socket: add wildcard support
Add NFT_SOCKET_WILDCARD to match to wildcard socket listener.

Signed-off-by: Balazs Scheidler <bazsi77@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2020-08-29 13:04:44 +02:00
Florian Westphal
c46172147e netfilter: conntrack: do not auto-delete clash entries on reply
Its possible that we have more than one packet with the same ct tuple
simultaneously, e.g. when an application emits n packets on same UDP
socket from multiple threads.

NAT rules might be applied to those packets. With the right set of rules,
n packets will be mapped to m destinations, where at least two packets end
up with the same destination.

When this happens, the existing clash resolution may merge the skb that
is processed after the first has been received with the identical tuple
already in hash table.

However, its possible that this identical tuple is a NAT_CLASH tuple.
In that case the second skb will be sent, but no reply can be received
since the reply that is processed first removes the NAT_CLASH tuple.

Do not auto-delete, this gives a 1 second window for replies to be passed
back to originator.

Packets that are coming later (udp stream case) will not be affected:
they match the original ct entry, not a NAT_CLASH one.

Also prevent NAT_CLASH entries from getting offloaded.

Fixes: 6a757c07e5 ("netfilter: conntrack: allow insertion of clashing entries")
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2020-08-29 13:03:06 +02:00
Dan Crawford
15cbff3fbb ALSA: hda - Fix silent audio output and corrupted input on MSI X570-A PRO
Following Christian Lachner's patch for Gigabyte X570-based motherboards,
also patch the MSI X570-A PRO motherboard; the ALC1220 codec requires the
same workaround for Clevo laptops to enforce the DAC/mixer connection
path. Set up a quirk entry for that.

I suspect most if all X570 motherboards will require similar patches.

[ The entries reordered in the SSID order -- tiwai ]

Related buglink: https://bugzilla.kernel.org/show_bug.cgi?id=205275
Signed-off-by: Dan Crawford <dnlcrwfrd@gmail.com>
Cc: <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/20200829024946.5691-1-dnlcrwfrd@gmail.com
Signed-off-by: Takashi Iwai <tiwai@suse.de>
2020-08-29 08:48:52 +02:00
Tong Zhang
7ad92f656b nvme-pci: cancel nvme device request before disabling
This patch addresses an irq free warning and null pointer dereference
error problem when nvme devices got timeout error during initialization.
This problem happens when nvme_timeout() function is called while
nvme_reset_work() is still in execution. This patch fixed the problem by
setting flag of the problematic request to NVME_REQ_CANCELLED before
calling nvme_dev_disable() to make sure __nvme_submit_sync_cmd() returns
an error code and let nvme_submit_sync_cmd() fail gracefully.
The following is console output.

[   62.472097] nvme nvme0: I/O 13 QID 0 timeout, disable controller
[   62.488796] nvme nvme0: could not set timestamp (881)
[   62.494888] ------------[ cut here ]------------
[   62.495142] Trying to free already-free IRQ 11
[   62.495366] WARNING: CPU: 0 PID: 7 at kernel/irq/manage.c:1751 free_irq+0x1f7/0x370
[   62.495742] Modules linked in:
[   62.495902] CPU: 0 PID: 7 Comm: kworker/u4:0 Not tainted 5.8.0+ #8
[   62.496206] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.13.0-48-gd9c812dda519-p4
[   62.496772] Workqueue: nvme-reset-wq nvme_reset_work
[   62.497019] RIP: 0010:free_irq+0x1f7/0x370
[   62.497223] Code: e8 ce 49 11 00 48 83 c4 08 4c 89 e0 5b 5d 41 5c 41 5d 41 5e 41 5f c3 44 89 f6 48 c70
[   62.498133] RSP: 0000:ffffa96800043d40 EFLAGS: 00010086
[   62.498391] RAX: 0000000000000000 RBX: ffff9b87fc458400 RCX: 0000000000000000
[   62.498741] RDX: 0000000000000001 RSI: 0000000000000096 RDI: ffffffff9693d72c
[   62.499091] RBP: ffff9b87fd4c8f60 R08: ffffa96800043bfd R09: 0000000000000163
[   62.499440] R10: ffffa96800043bf8 R11: ffffa96800043bfd R12: ffff9b87fd4c8e00
[   62.499790] R13: ffff9b87fd4c8ea4 R14: 000000000000000b R15: ffff9b87fd76b000
[   62.500140] FS:  0000000000000000(0000) GS:ffff9b87fdc00000(0000) knlGS:0000000000000000
[   62.500534] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   62.500816] CR2: 0000000000000000 CR3: 000000003aa0a000 CR4: 00000000000006f0
[   62.501165] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[   62.501515] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[   62.501864] Call Trace:
[   62.501993]  pci_free_irq+0x13/0x20
[   62.502167]  nvme_reset_work+0x5d0/0x12a0
[   62.502369]  ? update_load_avg+0x59/0x580
[   62.502569]  ? ttwu_queue_wakelist+0xa8/0xc0
[   62.502780]  ? try_to_wake_up+0x1a2/0x450
[   62.502979]  process_one_work+0x1d2/0x390
[   62.503179]  worker_thread+0x45/0x3b0
[   62.503361]  ? process_one_work+0x390/0x390
[   62.503568]  kthread+0xf9/0x130
[   62.503726]  ? kthread_park+0x80/0x80
[   62.503911]  ret_from_fork+0x22/0x30
[   62.504090] ---[ end trace de9ed4a70f8d71e2 ]---
[  123.912275] nvme nvme0: I/O 12 QID 0 timeout, disable controller
[  123.914670] nvme nvme0: 1/0/0 default/read/poll queues
[  123.916310] BUG: kernel NULL pointer dereference, address: 0000000000000000
[  123.917469] #PF: supervisor write access in kernel mode
[  123.917725] #PF: error_code(0x0002) - not-present page
[  123.917976] PGD 0 P4D 0
[  123.918109] Oops: 0002 [#1] SMP PTI
[  123.918283] CPU: 0 PID: 7 Comm: kworker/u4:0 Tainted: G        W         5.8.0+ #8
[  123.918650] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.13.0-48-gd9c812dda519-p4
[  123.919219] Workqueue: nvme-reset-wq nvme_reset_work
[  123.919469] RIP: 0010:__blk_mq_alloc_map_and_request+0x21/0x80
[  123.919757] Code: 66 0f 1f 84 00 00 00 00 00 41 55 41 54 55 48 63 ee 53 48 8b 47 68 89 ee 48 89 fb 8b4
[  123.920657] RSP: 0000:ffffa96800043d40 EFLAGS: 00010286
[  123.920912] RAX: ffff9b87fc4fee40 RBX: ffff9b87fc8cb008 RCX: 0000000000000000
[  123.921258] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff9b87fc618000
[  123.921602] RBP: 0000000000000000 R08: ffff9b87fdc2c4a0 R09: ffff9b87fc616000
[  123.921949] R10: 0000000000000000 R11: ffff9b87fffd1500 R12: 0000000000000000
[  123.922295] R13: 0000000000000000 R14: ffff9b87fc8cb200 R15: ffff9b87fc8cb000
[  123.922641] FS:  0000000000000000(0000) GS:ffff9b87fdc00000(0000) knlGS:0000000000000000
[  123.923032] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  123.923312] CR2: 0000000000000000 CR3: 000000003aa0a000 CR4: 00000000000006f0
[  123.923660] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  123.924007] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[  123.924353] Call Trace:
[  123.924479]  blk_mq_alloc_tag_set+0x137/0x2a0
[  123.924694]  nvme_reset_work+0xed6/0x12a0
[  123.924898]  process_one_work+0x1d2/0x390
[  123.925099]  worker_thread+0x45/0x3b0
[  123.925280]  ? process_one_work+0x390/0x390
[  123.925486]  kthread+0xf9/0x130
[  123.925642]  ? kthread_park+0x80/0x80
[  123.925825]  ret_from_fork+0x22/0x30
[  123.926004] Modules linked in:
[  123.926158] CR2: 0000000000000000
[  123.926322] ---[ end trace de9ed4a70f8d71e3 ]---
[  123.926549] RIP: 0010:__blk_mq_alloc_map_and_request+0x21/0x80
[  123.926832] Code: 66 0f 1f 84 00 00 00 00 00 41 55 41 54 55 48 63 ee 53 48 8b 47 68 89 ee 48 89 fb 8b4
[  123.927734] RSP: 0000:ffffa96800043d40 EFLAGS: 00010286
[  123.927989] RAX: ffff9b87fc4fee40 RBX: ffff9b87fc8cb008 RCX: 0000000000000000
[  123.928336] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff9b87fc618000
[  123.928679] RBP: 0000000000000000 R08: ffff9b87fdc2c4a0 R09: ffff9b87fc616000
[  123.929025] R10: 0000000000000000 R11: ffff9b87fffd1500 R12: 0000000000000000
[  123.929370] R13: 0000000000000000 R14: ffff9b87fc8cb200 R15: ffff9b87fc8cb000
[  123.929715] FS:  0000000000000000(0000) GS:ffff9b87fdc00000(0000) knlGS:0000000000000000
[  123.930106] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  123.930384] CR2: 0000000000000000 CR3: 000000003aa0a000 CR4: 00000000000006f0
[  123.930731] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  123.931077] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400

Co-developed-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Tong Zhang <ztong0001@gmail.com>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
2020-08-28 16:43:57 -07:00
Keith Busch
e83d776f9f nvme: only use power of two io boundaries
The kernel requires a power of two for boundaries because that's the
only way it can efficiently split commands that cross them. A
controller, however, may report a non-power of two boundary.

The driver had been rounding the controller's value to one the kernel
can use, but splitting on the wrong boundary provides no benefit on the
device side, and incurs additional submission overhead from non-optimal
splits.

Don't provide any boundary hint if the controller's value can't be used
and log a warning when first scanning a disk's unreported IO boundary.
Since the chunk sector logic has grown, move it to a separate function.

Cc: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Keith Busch <kbusch@kernel.org>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
2020-08-28 16:43:57 -07:00
Keith Busch
192f6c29bb nvme: fix controller instance leak
If the driver has to unbind from the controller for an early failure
before the subsystem has been set up, there won't be a subsystem holding
the controller's instance, so the controller needs to free its own
instance in this case.

Fixes: 733e4b69d5 ("nvme: Assign subsys instance from first ctrl")
Signed-off-by: Keith Busch <kbusch@kernel.org>
Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
2020-08-28 16:43:57 -07:00
Christophe JAILLET
70e37988db nvmet-fc: Fix a missed _irqsave version of spin_lock in 'nvmet_fc_fod_op_done()'
The way 'spin_lock()' and 'spin_lock_irqsave()' are used is not consistent
in this function.

Use 'spin_lock_irqsave()' also here, as there is no guarantee that
interruptions are disabled at that point, according to surrounding code.

Fixes: a97ec51b37 ("nvmet_fc: Rework target side abort handling")
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
2020-08-28 16:43:57 -07:00
Sagi Grimberg
7cd49f7576 nvme: Fix NULL dereference for pci nvme controllers
PCIe controllers do not have fabric opts, verify they exist before
showing ctrl_loss_tmo or reconnect_delay attributes.

Fixes: 764075fdcb ("nvme: expose reconnect_delay and ctrl_loss_tmo via sysfs")
Reported-by: Tobias Markus <tobias@markus-regensburg.de>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
2020-08-28 16:43:57 -07:00
Sagi Grimberg
2362acb678 nvme-rdma: fix reset hang if controller died in the middle of a reset
If the controller becomes unresponsive in the middle of a reset, we
will hang because we are waiting for the freeze to complete, but that
cannot happen since we have commands that are inflight holding the
q_usage_counter, and we can't blindly fail requests that times out.

So give a timeout and if we cannot wait for queue freeze before
unfreezing, fail and have the error handling take care how to
proceed (either schedule a reconnect of remove the controller).

Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
2020-08-28 16:43:57 -07:00
Sagi Grimberg
0475a8dcbc nvme-rdma: fix timeout handler
When a request times out in a LIVE state, we simply trigger error
recovery and let the error recovery handle the request cancellation,
however when a request times out in a non LIVE state, we make sure to
complete it immediately as it might block controller setup or teardown
and prevent forward progress.

However tearing down the entire set of I/O and admin queues causes
freeze/unfreeze imbalance (q->mq_freeze_depth) because and is really
an overkill to what we actually need, which is to just fence controller
teardown that may be running, stop the queue, and cancel the request if
it is not already completed.

Now that we have the controller teardown_lock, we can safely serialize
request cancellation. This addresses a hang caused by calling extra
queue freeze on controller namespaces, causing unfreeze to not complete
correctly.

Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
2020-08-28 16:43:57 -07:00
Sagi Grimberg
5110f40241 nvme-rdma: serialize controller teardown sequences
In the timeout handler we may need to complete a request because the
request that timed out may be an I/O that is a part of a serial sequence
of controller teardown or initialization. In order to complete the
request, we need to fence any other context that may compete with us
and complete the request that is timing out.

In this case, we could have a potential double completion in case
a hard-irq or a different competing context triggered error recovery
and is running inflight request cancellation concurrently with the
timeout handler.

Protect using a ctrl teardown_lock to serialize contexts that may
complete a cancelled request due to error recovery or a reset.

Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
2020-08-28 16:43:57 -07:00
Sagi Grimberg
e5c01f4f7f nvme-tcp: fix reset hang if controller died in the middle of a reset
If the controller becomes unresponsive in the middle of a reset, we will
hang because we are waiting for the freeze to complete, but that cannot
happen since we have commands that are inflight holding the
q_usage_counter, and we can't blindly fail requests that times out.

So give a timeout and if we cannot wait for queue freeze before
unfreezing, fail and have the error handling take care how to proceed
(either schedule a reconnect of remove the controller).

Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
2020-08-28 16:43:57 -07:00
Sagi Grimberg
236187c4ed nvme-tcp: fix timeout handler
When a request times out in a LIVE state, we simply trigger error
recovery and let the error recovery handle the request cancellation,
however when a request times out in a non LIVE state, we make sure to
complete it immediately as it might block controller setup or teardown
and prevent forward progress.

However tearing down the entire set of I/O and admin queues causes
freeze/unfreeze imbalance (q->mq_freeze_depth) because and is really
an overkill to what we actually need, which is to just fence controller
teardown that may be running, stop the queue, and cancel the request if
it is not already completed.

Now that we have the controller teardown_lock, we can safely serialize
request cancellation. This addresses a hang caused by calling extra
queue freeze on controller namespaces, causing unfreeze to not complete
correctly.

Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
2020-08-28 16:43:57 -07:00
Sagi Grimberg
d4d61470ae nvme-tcp: serialize controller teardown sequences
In the timeout handler we may need to complete a request because the
request that timed out may be an I/O that is a part of a serial sequence
of controller teardown or initialization. In order to complete the
request, we need to fence any other context that may compete with us
and complete the request that is timing out.

In this case, we could have a potential double completion in case
a hard-irq or a different competing context triggered error recovery
and is running inflight request cancellation concurrently with the
timeout handler.

Protect using a ctrl teardown_lock to serialize contexts that may
complete a cancelled request due to error recovery or a reset.

Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
2020-08-28 16:43:56 -07:00
Sagi Grimberg
7cf0d7c0f3 nvme: have nvme_wait_freeze_timeout return if it timed out
Users can detect if the wait has completed or not and take appropriate
actions based on this information (e.g. weather to continue
initialization or rather fail and schedule another initialization
attempt).

Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
2020-08-28 16:43:56 -07:00
Sagi Grimberg
d7144f5c4c nvme-fabrics: don't check state NVME_CTRL_NEW for request acceptance
NVME_CTRL_NEW should never see any I/O, because in order to start
initialization it has to transition to NVME_CTRL_CONNECTING and from
there it will never return to this state.

Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
2020-08-28 16:43:56 -07:00
Ziye Yang
a6ce7d7b4a nvmet-tcp: Fix NULL dereference when a connect data comes in h2cdata pdu
When handling commands without in-capsule data, we assign the ttag
assuming we already have the queue commands array allocated (based
on the queue size information in the connect data payload). However
if the connect itself did not send the connect data in-capsule we
have yet to allocate the queue commands,and we will assign a bogus
ttag and suffer a NULL dereference when we receive the corresponding
h2cdata pdu.

Fix this by checking if we already allocated commands before
dereferencing it when handling h2cdata, if we didn't, its for sure a
connect and we should use the preallocated connect command.

Signed-off-by: Ziye Yang <ziye.yang@intel.com>
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
2020-08-28 16:43:56 -07:00
Linus Torvalds
4d41ead6ea Merge tag 'block-5.9-2020-08-28' of git://git.kernel.dk/linux-block
Pull block fixes from Jens Axboe:

 - nbd timeout fix (Hou)

 - device size fix for loop LOOP_CONFIGURE (Martijn)

 - MD pull from Song with raid5 stripe size fix (Yufen)

* tag 'block-5.9-2020-08-28' of git://git.kernel.dk/linux-block:
  md/raid5: make sure stripe_size as power of two
  loop: Set correct device size when using LOOP_CONFIGURE
  nbd: restore default timeout when setting it to zero
2020-08-28 16:38:29 -07:00
Linus Torvalds
24148d8648 Merge tag 'io_uring-5.9-2020-08-28' of git://git.kernel.dk/linux-block
Pull io_uring fixes from Jens Axboe:
 "A few fixes in here, all based on reports and test cases from folks
  using it. Most of it is stable material as well:

   - Hashed work cancelation fix (Pavel)

   - poll wakeup signalfd fix

   - memlock accounting fix

   - nonblocking poll retry fix

   - ensure we never return -ERESTARTSYS for reads

   - ensure offset == -1 is consistent with preadv2() as documented

   - IOPOLL -EAGAIN handling fixes

   - remove useless task_work bounce for block based -EAGAIN retry"

* tag 'io_uring-5.9-2020-08-28' of git://git.kernel.dk/linux-block:
  io_uring: don't bounce block based -EAGAIN retry off task_work
  io_uring: fix IOPOLL -EAGAIN retries
  io_uring: clear req->result on IOPOLL re-issue
  io_uring: make offset == -1 consistent with preadv2/pwritev2
  io_uring: ensure read requests go through -ERESTART* transformation
  io_uring: don't use poll handler if file can't be nonblocking read/written
  io_uring: fix imbalanced sqo_mm accounting
  io_uring: revert consumed iov_iter bytes on error
  io-wq: fix hang after cancelling pending hashed work
  io_uring: don't recurse on tsk->sighand->siglock with signalfd
2020-08-28 16:23:16 -07:00
David S. Miller
c8146fe292 Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf
Daniel Borkmann says:

====================
pull-request: bpf 2020-08-28

The following pull-request contains BPF updates for your *net* tree.

We've added 4 non-merge commits during the last 4 day(s) which contain
a total of 4 files changed, 7 insertions(+), 4 deletions(-).

The main changes are:

1) Fix out of bounds access for BPF_OBJ_GET_INFO_BY_FD retrieval, from Yonghong Song.

2) Fix wrong __user annotation in bpf_stats sysctl handler, from Tobias Klauser.

3) Few fixes for BPF selftest scripting in test_{progs,maps}, from Jesper Dangaard Brouer.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-28 16:12:48 -07:00
Linus Torvalds
005c53447a Merge tag 'devprop-5.9-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull device properties framework fix from Rafael Wysocki:
 "Prevent the promotion of the secondary firmware node of a device to
  the primary one from leaking a pointer (Heikki Krogerus)"

* tag 'devprop-5.9-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  device property: Fix the secondary firmware node handling in set_primary_fwnode()
2020-08-28 13:23:31 -07:00
Linus Torvalds
0b2f18e7ae Merge tag 'acpi-5.9-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull ACPI fixes from Rafael Wysocki:
 "These fix two recent issues in the ACPI memory mappings management
  code and tighten up error handling in the ACPI driver for AMD SoCs
  (APD).

  Specifics:

   - Avoid redundant rounding to the page size in acpi_os_map_iomem() to
     address a recently introduced issue with the EFI memory map
     permission check on ARM64 (Ard Biesheuvel).

   - Fix acpi_release_memory() to wait until the memory mappings
     released by it have been really unmapped (Rafael Wysocki).

   - Make the ACPI driver for AMD SoCs (APD) check the return value of
     acpi_dev_get_property() to avoid failures in the cases when the
     device property under inspection is missing (Furquan Shaikh)"

* tag 'acpi-5.9-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  ACPI: OSL: Prevent acpi_release_memory() from returning too early
  ACPI: ioremap: avoid redundant rounding to OS page size
  ACPI: SoC: APD: Check return value of acpi_dev_get_property()
2020-08-28 13:17:14 -07:00
Linus Torvalds
326e311b84 Merge tag 'pm-5.9-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull power management fixes from Rafael Wysocki:
 "These fix the recently added Tegra194 cpufreq driver and the handling
  of devices using runtime PM during system-wide suspend, improve the
  intel_pstate driver documentation and clean up the cpufreq core.

  Specifics:

   - Make the recently added Tegra194 cpufreq driver use
     read_cpuid_mpir() instead of cpu_logical_map() to avoid exporting
     logical_cpu_map (Sumit Gupta).

   - Drop the automatic system wakeup event reporting for devices with
     pending runtime-resume requests during system-wide suspend to avoid
     spurious aborts of the suspend flow (Rafael Wysocki).

   - Fix build warning in the intel_pstate driver documentation and
     improve the wording in there (Randy Dunlap).

   - Clean up two pieces of code in the cpufreq core (Viresh Kumar)"

* tag 'pm-5.9-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  cpufreq: Use WARN_ON_ONCE() for invalid relation
  cpufreq: No need to verify cpufreq_driver in show_scaling_cur_freq()
  PM: sleep: core: Fix the handling of pending runtime resume requests
  Documentation: fix pm/intel_pstate build warning and wording
  cpufreq: replace cpu_logical_map() with read_cpuid_mpir()
2020-08-28 13:12:09 -07:00
Daniel Borkmann
10496f261e Merge branch 'bpf-sleepable'
Alexei Starovoitov says:

====================
v2->v3:
- switched to minimal allowlist approach. Essentially that means that syscall
  entry, few btrfs allow_error_inject functions, should_fail_bio(), and two LSM
  hooks: file_mprotect and bprm_committed_creds are the only hooks that allow
  attaching of sleepable BPF programs. When comprehensive analysis of LSM hooks
  will be done this allowlist will be extended.
- added patch 1 that fixes prototypes of two mm functions to reliably work with
  error injection. It's also necessary for resolve_btfids tool to recognize
  these two funcs, but that's secondary.

v1->v2:
- split fmod_ret fix into separate patch
- added denylist

v1:
This patch set introduces the minimal viable support for sleepable bpf programs.
In this patch only fentry/fexit/fmod_ret and lsm progs can be sleepable.
Only array and pre-allocated hash and lru maps allowed.

Here is 'perf report' difference of sleepable vs non-sleepable:
   3.86%  bench     [k] __srcu_read_unlock
   3.22%  bench     [k] __srcu_read_lock
   0.92%  bench     [k] bpf_prog_740d4210cdcd99a3_bench_trigger_fentry_sleep
   0.50%  bench     [k] bpf_trampoline_10297
   0.26%  bench     [k] __bpf_prog_exit_sleepable
   0.21%  bench     [k] __bpf_prog_enter_sleepable
vs
   0.88%  bench     [k] bpf_prog_740d4210cdcd99a3_bench_trigger_fentry
   0.84%  bench     [k] bpf_trampoline_10297
   0.13%  bench     [k] __bpf_prog_enter
   0.12%  bench     [k] __bpf_prog_exit
vs
   0.79%  bench     [k] bpf_prog_740d4210cdcd99a3_bench_trigger_fentry_sleep
   0.72%  bench     [k] bpf_trampoline_10381
   0.31%  bench     [k] __bpf_prog_exit_sleepable
   0.29%  bench     [k] __bpf_prog_enter_sleepable

Sleepable vs non-sleepable program invocation overhead is only marginally higher
due to rcu_trace. srcu approach is much slower.
====================

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2020-08-28 21:20:38 +02:00
Alexei Starovoitov
e68a144547 selftests/bpf: Add sleepable tests
Modify few tests to sanity test sleepable bpf functionality.

Running 'bench trig-fentry-sleep' vs 'bench trig-fentry' and 'perf report':
sleepable with SRCU:
   3.86%  bench     [k] __srcu_read_unlock
   3.22%  bench     [k] __srcu_read_lock
   0.92%  bench     [k] bpf_prog_740d4210cdcd99a3_bench_trigger_fentry_sleep
   0.50%  bench     [k] bpf_trampoline_10297
   0.26%  bench     [k] __bpf_prog_exit_sleepable
   0.21%  bench     [k] __bpf_prog_enter_sleepable

sleepable with RCU_TRACE:
   0.79%  bench     [k] bpf_prog_740d4210cdcd99a3_bench_trigger_fentry_sleep
   0.72%  bench     [k] bpf_trampoline_10381
   0.31%  bench     [k] __bpf_prog_exit_sleepable
   0.29%  bench     [k] __bpf_prog_enter_sleepable

non-sleepable with RCU:
   0.88%  bench     [k] bpf_prog_740d4210cdcd99a3_bench_trigger_fentry
   0.84%  bench     [k] bpf_trampoline_10297
   0.13%  bench     [k] __bpf_prog_enter
   0.12%  bench     [k] __bpf_prog_exit

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: KP Singh <kpsingh@google.com>
Link: https://lore.kernel.org/bpf/20200827220114.69225-6-alexei.starovoitov@gmail.com
2020-08-28 21:20:33 +02:00
Alexei Starovoitov
2b288740a1 libbpf: Support sleepable progs
Pass request to load program as sleepable via ".s" suffix in the section name.
If it happens in the future that all map types and helpers are allowed with
BPF_F_SLEEPABLE flag "fmod_ret/" and "lsm/" can be aliased to "fmod_ret.s/" and
"lsm.s/" to make all lsm and fmod_ret programs sleepable by default. The fentry
and fexit programs would always need to have sleepable vs non-sleepable
distinction, since not all fentry/fexit progs will be attached to sleepable
kernel functions.

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: KP Singh <kpsingh@google.com>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/20200827220114.69225-5-alexei.starovoitov@gmail.com
2020-08-28 21:20:33 +02:00
Alexei Starovoitov
07be4c4a3e bpf: Add bpf_copy_from_user() helper.
Sleepable BPF programs can now use copy_from_user() to access user memory.

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Acked-by: KP Singh <kpsingh@google.com>
Link: https://lore.kernel.org/bpf/20200827220114.69225-4-alexei.starovoitov@gmail.com
2020-08-28 21:20:33 +02:00
Alexei Starovoitov
1e6c62a882 bpf: Introduce sleepable BPF programs
Introduce sleepable BPF programs that can request such property for themselves
via BPF_F_SLEEPABLE flag at program load time. In such case they will be able
to use helpers like bpf_copy_from_user() that might sleep. At present only
fentry/fexit/fmod_ret and lsm programs can request to be sleepable and only
when they are attached to kernel functions that are known to allow sleeping.

The non-sleepable programs are relying on implicit rcu_read_lock() and
migrate_disable() to protect life time of programs, maps that they use and
per-cpu kernel structures used to pass info between bpf programs and the
kernel. The sleepable programs cannot be enclosed into rcu_read_lock().
migrate_disable() maps to preempt_disable() in non-RT kernels, so the progs
should not be enclosed in migrate_disable() as well. Therefore
rcu_read_lock_trace is used to protect the life time of sleepable progs.

There are many networking and tracing program types. In many cases the
'struct bpf_prog *' pointer itself is rcu protected within some other kernel
data structure and the kernel code is using rcu_dereference() to load that
program pointer and call BPF_PROG_RUN() on it. All these cases are not touched.
Instead sleepable bpf programs are allowed with bpf trampoline only. The
program pointers are hard-coded into generated assembly of bpf trampoline and
synchronize_rcu_tasks_trace() is used to protect the life time of the program.
The same trampoline can hold both sleepable and non-sleepable progs.

When rcu_read_lock_trace is held it means that some sleepable bpf program is
running from bpf trampoline. Those programs can use bpf arrays and preallocated
hash/lru maps. These map types are waiting on programs to complete via
synchronize_rcu_tasks_trace();

Updates to trampoline now has to do synchronize_rcu_tasks_trace() and
synchronize_rcu_tasks() to wait for sleepable progs to finish and for
trampoline assembly to finish.

This is the first step of introducing sleepable progs. Eventually dynamically
allocated hash maps can be allowed and networking program types can become
sleepable too.

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Acked-by: KP Singh <kpsingh@google.com>
Link: https://lore.kernel.org/bpf/20200827220114.69225-3-alexei.starovoitov@gmail.com
2020-08-28 21:20:33 +02:00
Alexei Starovoitov
76cd61739f mm/error_inject: Fix allow_error_inject function signatures.
'static' and 'static noinline' function attributes make no guarantees that
gcc/clang won't optimize them. The compiler may decide to inline 'static'
function and in such case ALLOW_ERROR_INJECT becomes meaningless. The compiler
could have inlined __add_to_page_cache_locked() in one callsite and didn't
inline in another. In such case injecting errors into it would cause
unpredictable behavior. It's worse with 'static noinline' which won't be
inlined, but it still can be optimized. Like the compiler may decide to remove
one argument or constant propagate the value depending on the callsite.

To avoid such issues make sure that these functions are global noinline.

Fixes: af3b854492 ("mm/page_alloc.c: allow error injection")
Fixes: cfcbfb1382 ("mm/filemap.c: enable error injection at add_to_page_cache()")
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Link: https://lore.kernel.org/bpf/20200827220114.69225-2-alexei.starovoitov@gmail.com
2020-08-28 21:20:32 +02:00
Rafael J. Wysocki
4f31d53c21 Merge branch 'acpi-mm'
* acpi-mm:
  ACPI: OSL: Prevent acpi_release_memory() from returning too early
  ACPI: ioremap: avoid redundant rounding to OS page size
2020-08-28 21:17:56 +02:00
Rafael J. Wysocki
ef7d960403 Merge branch 'pm-cpufreq'
* pm-cpufreq:
  cpufreq: Use WARN_ON_ONCE() for invalid relation
  cpufreq: No need to verify cpufreq_driver in show_scaling_cur_freq()
  Documentation: fix pm/intel_pstate build warning and wording
  cpufreq: replace cpu_logical_map() with read_cpuid_mpir()
2020-08-28 20:58:16 +02:00
Linus Torvalds
96d454cd2c Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull arm64 fixes from Catalin Marinas:

 - Fix kernel build with the integrated LLVM assembler which doesn't see
   the -Wa,-march option.

 - Fix "make vdso_install" when COMPAT_VDSO is disabled.

 - Make KVM more robust if the AT S1E1R instruction triggers an
   exception (architecture corner cases).

* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
  KVM: arm64: Set HCR_EL2.PTW to prevent AT taking synchronous exception
  KVM: arm64: Survive synchronous exceptions caused by AT instructions
  KVM: arm64: Add kvm_extable for vaxorcism code
  arm64: vdso32: make vdso32 install conditional
  arm64: use a common .arch preamble for inline assembly
2020-08-28 11:37:33 -07:00
Herbert Xu
ef91bb196b kernel.h: Silence sparse warning in lower_32_bits
I keep getting sparse warnings in crypto such as:

  CHECK   drivers/crypto/ccree/cc_hash.c
   drivers/crypto/ccree/cc_hash.c:49:9: warning: cast truncates bits from constant value (47b5481dbefa4fa4 becomes befa4fa4)
   drivers/crypto/ccree/cc_hash.c:49:26: warning: cast truncates bits from constant value (db0c2e0d64f98fa7 becomes 64f98fa7)
   [.. many more ..]

This patch removes the warning by adding a mask to keep sparse
happy.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-08-28 11:21:20 -07:00
Fabian Frederick
67afbda696 selftests: netfilter: add command usage
Avoid bad command arguments.
Based on tools/power/cpupower/bench/cpufreq-bench_plot.sh

Signed-off-by: Fabian Frederick <fabf@skynet.be>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2020-08-28 20:12:00 +02:00
Fabian Frederick
2f4bba4ef7 selftests: netfilter: simplify command testing
Fix some shellcheck SC2181 warnings:
"Check exit code directly with e.g. 'if mycmd;', not indirectly with
$?." as suggested by Stefano Brivio.

Signed-off-by: Fabian Frederick <fabf@skynet.be>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2020-08-28 20:11:59 +02:00
Fabian Frederick
d721b68654 selftests: netfilter: remove unused variable in make_file()
'who' variable was not used in make_file()
Problem found using Shellcheck

Signed-off-by: Fabian Frederick <fabf@skynet.be>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2020-08-28 20:11:59 +02:00
Fabian Frederick
a7bf670ebe selftests: netfilter: exit on invalid parameters
exit script with comments when parameters are wrong during address
addition. No need for a message when trying to change MTU with lower
values: output is self-explanatory.
Use short testing sequence to avoid shellcheck warnings
(suggested by Stefano Brivio).

Signed-off-by: Fabian Frederick <fabf@skynet.be>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2020-08-28 20:11:59 +02:00
Fabian Frederick
da2f849e89 selftests: netfilter: fix header example
nft_flowtable.sh is made for bash not sh.
Also give values which not return "RTNETLINK answers: Invalid argument"

Signed-off-by: Fabian Frederick <fabf@skynet.be>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2020-08-28 20:11:58 +02:00
Pablo Neira Ayuso
ee92118355 netfilter: nfnetlink: nfnetlink_unicast() reports EAGAIN instead of ENOBUFS
Frontend callback reports EAGAIN to nfnetlink to retry a command, this
is used to signal that module autoloading is required. Unfortunately,
nlmsg_unicast() reports EAGAIN in case the receiver socket buffer gets
full, so it enters a busy-loop.

This patch updates nfnetlink_unicast() to turn EAGAIN into ENOBUFS and
to use nlmsg_unicast(). Remove the flags field in nfnetlink_unicast()
since this is always MSG_DONTWAIT in the existing code which is exactly
what nlmsg_unicast() passes to netlink_unicast() as parameter.

Fixes: 96518518cc ("netfilter: add nftables")
Reported-by: Phil Sutter <phil@nwl.cc>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2020-08-28 20:11:58 +02:00