Commit Graph

858756 Commits

Author SHA1 Message Date
Bjorn Helgaas
7db4af43c9 PCI: Use dev_printk() when possible
Use dev_printk() when possible.  This makes messages more consistent with
other device-related messages and, in some cases, adds useful information.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2019-05-09 07:49:49 -05:00
Mao Han
3213486f2e csky: Add support for perf unwind-libdw
This patch add support for DWARF register mappings and libdw registers
initialization, which is used by perf callchain analyzing, eg:

perf record --call-graph=dwarf <COMMAND>

Here is elfutils csky backend patch set:
https://sourceware.org/ml/elfutils-devel/2019-q2/msg00007.html

Signed-off-by: Mao Han <han_mao@c-sky.com>
Signed-off-by: Guo Ren <ren_guo@c-sky.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Arnd Bergmann <arnd@arnd.de>
2019-05-09 20:36:42 +08:00
Daniel Drake
1d25724b41 drm/i915/fbc: disable framebuffer compression on GeminiLake
On many (all?) the Gemini Lake systems we work with, there is frequent
momentary graphical corruption at the top of the screen, and it seems
that disabling framebuffer compression can avoid this.

The ticket was reported 6 months ago and has already affected a
multitude of users, without any real progress being made. So, lets
disable framebuffer compression on GeminiLake until a solution is found.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108085
Fixes: fd7d6c5c8f ("drm/i915: enable FBC on gen9+ too")
Cc: Paulo Zanoni <paulo.r.zanoni@intel.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Cc: <stable@vger.kernel.org> # v4.11+
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Signed-off-by: Daniel Drake <drake@endlessm.com>
Signed-off-by: Jian-Hong Pan <jian-hong@endlessm.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190423092810.28359-1-jian-hong@endlessm.com
2019-05-09 15:30:42 +03:00
Amir Goldstein
4d8e7055a4 fsnotify: fix unlink performance regression
__fsnotify_parent() has an optimization in place to avoid unneeded
take_dentry_name_snapshot().  When fsnotify_nameremove() was changed
not to call __fsnotify_parent(), we left out the optimization.
Kernel test robot reported a 5% performance regression in concurrent
unlink() workload.

Reported-by: kernel test robot <rong.a.chen@intel.com>
Link: https://lore.kernel.org/lkml/20190505062153.GG29809@shao2-debian/
Link: https://lore.kernel.org/linux-fsdevel/20190104090357.GD22409@quack2.suse.cz/
Fixes: 5f02a87763 ("fsnotify: annotate directory entry modification events")
CC: stable@vger.kernel.org
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Jan Kara <jack@suse.cz>
2019-05-09 12:44:00 +02:00
Daniel Vetter
094aa54f0f drm: Some ocd in drm_file.c
Move the open helper around to avoid the forward decl, and give
drm_setup a drm_legacy_ prefix since it's all legacy stuff in there.

v2: Move drm_legacy_setup into drm_legacy_misc.c (Chris). The
counterpart in the form of drm_legacy_dev_reinit is there already too,
plus it fits perfectly into Dave's work of making DRIVER_LEGACY code
compile-time optional.

Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20190502135603.20413-1-daniel.vetter@ffwll.ch
2019-05-09 11:40:20 +02:00
Filipe Manana
72bd2323ec Btrfs: do not abort transaction at btrfs_update_root() after failure to COW path
Currently when we fail to COW a path at btrfs_update_root() we end up
always aborting the transaction. However all the current callers of
btrfs_update_root() are able to deal with errors returned from it, many do
end up aborting the transaction themselves (directly or not, such as the
transaction commit path), other BUG_ON() or just gracefully cancel whatever
they were doing.

When syncing the fsync log, we call btrfs_update_root() through
tree-log.c:update_log_root(), and if it returns an -ENOSPC error, the log
sync code does not abort the transaction, instead it gracefully handles
the error and returns -EAGAIN to the fsync handler, so that it falls back
to a transaction commit. Any other error different from -ENOSPC, makes the
log sync code abort the transaction.

So remove the transaction abort from btrfs_update_log() when we fail to
COW a path to update the root item, so that if an -ENOSPC failure happens
we avoid aborting the current transaction and have a chance of the fsync
succeeding after falling back to a transaction commit.

Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=203413
Fixes: 79787eaab4 ("btrfs: replace many BUG_ONs with proper error handling")
Cc: stable@vger.kernel.org # 4.4+
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Reviewed-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2019-05-09 11:25:27 +02:00
Josef Bacik
d7400ee1b4 btrfs: use the existing reserved items for our first prop for inheritance
We're now reserving an extra items worth of space for property
inheritance.  We only have one property at the moment so this covers us,
but if we add more in the future this will allow us to not get bitten by
the extra space reservation.  If we do add more properties in the future
we should re-visit how we calculate the space reservation needs by the
callers.

Reviewed-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
[ refreshed on top of prop/xattr cleanups ]
Signed-off-by: David Sterba <dsterba@suse.com>
2019-05-09 11:18:14 +02:00
Daniel Drake
2420a0b179 x86/tsc: Set LAPIC timer period to crystal clock frequency
The APIC timer calibration (calibrate_APIC_timer()) can be skipped
in cases where we know the APIC timer frequency. On Intel SoCs,
we believe that the APIC is fed by the crystal clock; this would make
sense, and the crystal clock frequency has been verified against the
APIC timer calibration result on ApolloLake, GeminiLake, Kabylake,
CoffeeLake, WhiskeyLake and AmberLake.

Set lapic_timer_period based on the crystal clock frequency
accordingly.

APIC timer calibration would normally be skipped on modern CPUs
by nature of the TSC deadline timer being used instead,
however this change is still potentially useful, e.g. if the
TSC deadline timer has been disabled with a kernel parameter.
calibrate_APIC_timer() uses the legacy timer, but we are seeing
new platforms that omit such legacy functionality, so avoiding
such codepaths is becoming more important.

Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Daniel Drake <drake@endlessm.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: len.brown@intel.com
Cc: linux@endlessm.com
Cc: rafael.j.wysocki@intel.com
Link: http://lkml.kernel.org/r/20190509055417.13152-3-drake@endlessm.com
Link: https://lkml.kernel.org/r/20190419083533.32388-1-drake@endlessm.com
Link: https://lkml.kernel.org/r/alpine.DEB.2.21.1904031206440.1967@nanos.tec.linutronix.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-05-09 11:06:49 +02:00
Daniel Drake
52ae346bd2 x86/apic: Rename 'lapic_timer_frequency' to 'lapic_timer_period'
This variable is a period unit (number of clock cycles per jiffy),
not a frequency (which is number of cycles per second).

Give it a more appropriate name.

Suggested-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Daniel Drake <drake@endlessm.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: len.brown@intel.com
Cc: linux@endlessm.com
Cc: rafael.j.wysocki@intel.com
Link: http://lkml.kernel.org/r/20190509055417.13152-2-drake@endlessm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-05-09 11:06:49 +02:00
Daniel Drake
604dc9170f x86/tsc: Use CPUID.0x16 to calculate missing crystal frequency
native_calibrate_tsc() had a data mapping Intel CPU families
and crystal clock speed, but hardcoded tables are not ideal, and this
approach was already problematic at least in the Skylake X case, as
seen in commit:

  b511203093 ("x86/tsc: Fix erroneous TSC rate on Skylake Xeon")

By examining CPUID data from http://instlatx64.atw.hu/ and units
in the lab, we have found that 3 different scenarios need to be dealt
with, and we can eliminate most of the hardcoded data using an approach a
little more advanced than before:

 1. ApolloLake, GeminiLake, CannonLake (and presumably all new chipsets
    from this point) report the crystal frequency directly via CPUID.0x15.
    That's definitive data that we can rely upon.

 2. Skylake, Kabylake and all variants of those two chipsets report a
    crystal frequency of zero, however we can calculate the crystal clock
    speed by condidering data from CPUID.0x16.

    This method correctly distinguishes between the two crystal clock
    frequencies present on different Skylake X variants that caused
    headaches before.

    As the calculations do not quite match the previously-hardcoded values
    in some cases (e.g. 23913043Hz instead of 24MHz), TSC refinement is
    enabled on all platforms where we had to calculate the crystal
    frequency in this way.

 3. Denverton (GOLDMONT_X) reports a crystal frequency of zero and does
    not support CPUID.0x16, so we leave this entry hardcoded.

Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Daniel Drake <drake@endlessm.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: len.brown@intel.com
Cc: linux@endlessm.com
Cc: rafael.j.wysocki@intel.com
Link: http://lkml.kernel.org/r/20190509055417.13152-1-drake@endlessm.com
Link: https://lkml.kernel.org/r/20190419083533.32388-1-drake@endlessm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-05-09 11:06:48 +02:00
Miroslav Lichvar
fdc6bae940 ntp: Allow TAI-UTC offset to be set to zero
The ADJ_TAI adjtimex mode sets the TAI-UTC offset of the system clock.
It is typically set by NTP/PTP implementations and it is automatically
updated by the kernel on leap seconds. The initial value is zero (which
applications may interpret as unknown), but this value cannot be set by
adjtimex. This limitation seems to go back to the original "nanokernel"
implementation by David Mills.

Change the ADJ_TAI check to accept zero as a valid TAI-UTC offset in
order to allow setting it back to the initial value.

Fixes: 153b5d054a ("ntp: support for TAI")
Suggested-by: Ondrej Mosnacek <omosnace@redhat.com>
Signed-off-by: Miroslav Lichvar <mlichvar@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: John Stultz <john.stultz@linaro.org>
Cc: Richard Cochran <richardcochran@gmail.com>
Cc: Prarit Bhargava <prarit@redhat.com>
Link: https://lkml.kernel.org/r/20190417084833.7401-1-mlichvar@redhat.com
2019-05-09 10:46:58 +02:00
Dave Hansen
5a28fc94c9 x86/mpx, mm/core: Fix recursive munmap() corruption
This is a bit of a mess, to put it mildly.  But, it's a bug
that only seems to have showed up in 4.20 but wasn't noticed
until now, because nobody uses MPX.

MPX has the arch_unmap() hook inside of munmap() because MPX
uses bounds tables that protect other areas of memory.  When
memory is unmapped, there is also a need to unmap the MPX
bounds tables.  Barring this, unused bounds tables can eat 80%
of the address space.

But, the recursive do_munmap() that gets called vi arch_unmap()
wreaks havoc with __do_munmap()'s state.  It can result in
freeing populated page tables, accessing bogus VMA state,
double-freed VMAs and more.

See the "long story" further below for the gory details.

To fix this, call arch_unmap() before __do_unmap() has a chance
to do anything meaningful.  Also, remove the 'vma' argument
and force the MPX code to do its own, independent VMA lookup.

== UML / unicore32 impact ==

Remove unused 'vma' argument to arch_unmap().  No functional
change.

I compile tested this on UML but not unicore32.

== powerpc impact ==

powerpc uses arch_unmap() well to watch for munmap() on the
VDSO and zeroes out 'current->mm->context.vdso_base'.  Moving
arch_unmap() makes this happen earlier in __do_munmap().  But,
'vdso_base' seems to only be used in perf and in the signal
delivery that happens near the return to userspace.  I can not
find any likely impact to powerpc, other than the zeroing
happening a little earlier.

powerpc does not use the 'vma' argument and is unaffected by
its removal.

I compile-tested a 64-bit powerpc defconfig.

== x86 impact ==

For the common success case this is functionally identical to
what was there before.  For the munmap() failure case, it's
possible that some MPX tables will be zapped for memory that
continues to be in use.  But, this is an extraordinarily
unlikely scenario and the harm would be that MPX provides no
protection since the bounds table got reset (zeroed).

I can't imagine anyone doing this:

	ptr = mmap();
	// use ptr
	ret = munmap(ptr);
	if (ret)
		// oh, there was an error, I'll
		// keep using ptr.

Because if you're doing munmap(), you are *done* with the
memory.  There's probably no good data in there _anyway_.

This passes the original reproducer from Richard Biener as
well as the existing mpx selftests/.

The long story:

munmap() has a couple of pieces:

 1. Find the affected VMA(s)
 2. Split the start/end one(s) if neceesary
 3. Pull the VMAs out of the rbtree
 4. Actually zap the memory via unmap_region(), including
    freeing page tables (or queueing them to be freed).
 5. Fix up some of the accounting (like fput()) and actually
    free the VMA itself.

This specific ordering was actually introduced by:

  dd2283f260 ("mm: mmap: zap pages with read mmap_sem in munmap")

during the 4.20 merge window.  The previous __do_munmap() code
was actually safe because the only thing after arch_unmap() was
remove_vma_list().  arch_unmap() could not see 'vma' in the
rbtree because it was detached, so it is not even capable of
doing operations unsafe for remove_vma_list()'s use of 'vma'.

Richard Biener reported a test that shows this in dmesg:

  [1216548.787498] BUG: Bad rss-counter state mm:0000000017ce560b idx:1 val:551
  [1216548.787500] BUG: non-zero pgtables_bytes on freeing mm: 24576

What triggered this was the recursive do_munmap() called via
arch_unmap().  It was freeing page tables that has not been
properly zapped.

But, the problem was bigger than this.  For one, arch_unmap()
can free VMAs.  But, the calling __do_munmap() has variables
that *point* to VMAs and obviously can't handle them just
getting freed while the pointer is still in use.

I tried a couple of things here.  First, I tried to fix the page
table freeing problem in isolation, but I then found the VMA
issue.  I also tried having the MPX code return a flag if it
modified the rbtree which would force __do_munmap() to re-walk
to restart.  That spiralled out of control in complexity pretty
fast.

Just moving arch_unmap() and accepting that the bonkers failure
case might eat some bounds tables seems like the simplest viable
fix.

This was also reported in the following kernel bugzilla entry:

  https://bugzilla.kernel.org/show_bug.cgi?id=203123

There are some reports that this commit triggered this bug:

  dd2283f260 ("mm: mmap: zap pages with read mmap_sem in munmap")

While that commit certainly made the issues easier to hit, I believe
the fundamental issue has been with us as long as MPX itself, thus
the Fixes: tag below is for one of the original MPX commits.

[ mingo: Minor edits to the changelog and the patch. ]

Reported-by: Richard Biener <rguenther@suse.de>
Reported-by: H.J. Lu <hjl.tools@gmail.com>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Reviewed-by Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Yang Shi <yang.shi@linux.alibaba.com>
Acked-by: Michael Ellerman <mpe@ellerman.id.au>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Anton Ivanov <anton.ivanov@cambridgegreys.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Guan Xuetao <gxt@pku.edu.cn>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Jeff Dike <jdike@addtoit.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Richard Weinberger <richard@nod.at>
Cc: Rik van Riel <riel@surriel.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: linux-arch@vger.kernel.org
Cc: linux-mm@kvack.org
Cc: linux-um@lists.infradead.org
Cc: linuxppc-dev@lists.ozlabs.org
Cc: stable@vger.kernel.org
Fixes: dd2283f260 ("mm: mmap: zap pages with read mmap_sem in munmap")
Link: http://lkml.kernel.org/r/20190419194747.5E1AD6DC@viggo.jf.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-05-09 10:37:17 +02:00
Maarten Lankhorst
752c4f3c1d Merge remote-tracking branch 'drm/drm-next' into drm-misc-next
Requested for backmerging airlied's drm-legacy cleanup.

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
2019-05-09 10:19:03 +02:00
Ramalingam C
c16fd9be70 drm/hdcp: gathering hdcp related code into drm_hdcp.c
Considering the significant size of hdcp related code in drm, all
hdcp related codes are moved into separate file called drm_hdcp.c.

v2:
  Rebased.
v2:
  Rebased.

Signed-off-by: Ramalingam C <ramalingam.c@intel.com>
Suggested-by: Daniel Vetter <daniel@ffwll.ch>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Acked-by: Dave Airlie <airlied@gmail.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20190507162745.25600-7-ramalingam.c@intel.com
2019-05-09 09:44:41 +02:00
Ramalingam C
f26ae6a652 drm/i915: SRM revocation check for HDCP1.4 and 2.2
DRM HDCP SRM revocation check services are used from I915 for HDCP1.4
and 2.2 revocation check during the respective authentication flow.

v2:
  Rebased.
v3:
  %s/*_ksvs_revocated/*_check_ksvs_revoked [Daniel]
  unwanted noise is removed.

Signed-off-by: Ramalingam C <ramalingam.c@intel.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20190507162745.25600-6-ramalingam.c@intel.com
2019-05-09 09:44:41 +02:00
Ramalingam C
6498bf5800 drm: revocation check at drm subsystem
On every hdcp revocation check request SRM is read from fw file
/lib/firmware/display_hdcp_srm.bin

SRM table is parsed and stored at drm_hdcp.c, with functions exported
for the services for revocation check from drivers (which
implements the HDCP authentication)

This patch handles the HDCP1.4 and 2.2 versions of SRM table.

v2:
  moved the uAPI to request_firmware_direct() [Daniel]
v3:
  kdoc added. [Daniel]
  srm_header unified and bit field definitions are removed. [Daniel]
  locking improved. [Daniel]
  vrl length violation is fixed. [Daniel]
v4:
  s/__swab16/be16_to_cpu [Daniel]
  be24_to_cpu is done through a global func [Daniel]
  Unused variables are removed. [Daniel]
  unchecked return values are dropped from static funcs [Daniel]

Signed-off-by: Ramalingam C <ramalingam.c@intel.com>
Acked-by: Satyeshwar Singh <satyeshwar.singh@intel.com>
Reviewed-by: Daniel Vetter <daniel@ffwll.ch>
Acked-by: Dave Airlie <airlied@gmail.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20190507162745.25600-5-ramalingam.c@intel.com
2019-05-09 09:44:41 +02:00
Ramalingam C
0de655cae4 drm: generic fn converting be24 to cpu and vice versa
Existing functions for converting a 3bytes(be24) of big endian value
into u32 of little endian and vice versa are renamed as

s/drm_hdcp2_seq_num_to_u32/drm_hdcp_be24_to_cpu
s/drm_hdcp2_u32_to_seq_num/drm_hdcp_cpu_to_be24

Signed-off-by: Ramalingam C <ramalingam.c@intel.com>
Suggested-by: Daniel Vetter <daniel@ffwll.ch>
cc: Tomas Winkler <tomas.winkler@intel.com>
Acked-by: Dave Airlie <airlied@gmail.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20190507162745.25600-4-ramalingam.c@intel.com
2019-05-09 09:44:41 +02:00
Ramalingam C
43318c0ae3 drm/i915: debugfs: HDCP2.2 capability read
Adding the HDCP2.2 capability of HDCP src and sink info into debugfs
entry "i915_hdcp_sink_capability"

This helps the userspace tests to skip the HDCP2.2 test on non HDCP2.2
sinks.

v2:
  Rebased.

Signed-off-by: Ramalingam C <ramalingam.c@intel.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20190507162745.25600-3-ramalingam.c@intel.com
2019-05-09 09:44:41 +02:00
Ramalingam C
585b000de2 drm: move content protection property to mode_config
Content protection property is created once and stored in
drm_mode_config. And attached to all HDCP capable connectors.

Signed-off-by: Ramalingam C <ramalingam.c@intel.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Acked-by: Dave Airlie <airlied@gmail.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20190507162745.25600-2-ramalingam.c@intel.com
2019-05-09 09:44:41 +02:00
Pablo Neira Ayuso
c6c9c0596c netfilter: nf_tables: remove NFT_CT_TIMEOUT
Never used anywhere in the code.

Fixes: 7e0b2b57f0 ("netfilter: nft_ct: add ct timeout support")
Reported-by: Stéphane Veyret <sveyret@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2019-05-09 08:55:41 +02:00
Florian Westphal
680f6af533 netfilter: ebtables: CONFIG_COMPAT: reject trailing data after last rule
If userspace provides a rule blob with trailing data after last target,
we trigger a splat, then convert ruleset to 64bit format (with trailing
data), then pass that to do_replace_finish() which then returns -EINVAL.

Erroring out right away avoids the splat plus unneeded translation and
error unwind.

Fixes: 81e675c227 ("netfilter: ebtables: add CONFIG_COMPAT support")
Reported-by: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2019-05-09 08:54:49 +02:00
Al Viro
05883eee85 do_move_mount(): fix an unsafe use of is_anon_ns()
What triggers it is a race between mount --move and umount -l
of the source; we should reject it (the source is parentless *and*
not the root of anon namespace at that), but the check for namespace
being an anon one is broken in that case - is_anon_ns() needs
ns to be non-NULL.  Better fixed here than in is_anon_ns(), since
the rest of the callers is guaranteed to get a non-NULL argument...

Reported-by: syzbot+494c7ddf66acac0ad747@syzkaller.appspotmail.com
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2019-05-09 02:32:50 -04:00
Marek Behun
8fbbfd966e mailbox: Add support for Armada 37xx rWTM mailbox
This adds support for the mailbox via which the kernel can communicate
with the firmware running on the secure processor of the Armada 37xx
SOC.

The rWTM secure processor has access to internal eFuses and
cryptographic circuits, such as the Entropy Bit Generator to generate
true random numbers.

Signed-off-by: Marek Behun <marek.behun@nic.cz>
Signed-off-by: Jassi Brar <jaswinder.singh@linaro.org>
2019-05-09 00:41:00 -05:00
Marek Behun
004c35cd8e dt-bindings: mailbox: Document armada-3700-rwtm-mailbox binding
This adds device tree binding documentation for the rWTM BIU mailbox
driver on the Armada 37xx SOC (EspressoBin, Turris Mox).

Signed-off-by: Marek Behun <marek.behun@nic.cz>
Reviewed-by: Rob Herring <robh@kernel.org>
Signed-off-by: Jassi Brar <jaswinder.singh@linaro.org>
2019-05-09 00:41:00 -05:00
Fabien Dessenne
68a1c8485c mailbox: stm32-ipcc: check invalid irq
On failure of_irq_get() returns a negative value or zero, which is
not handled as an error in the existing implementation.
Instead of using this API, use platform_get_irq() that returns
exclusively a negative value on failure.
Also, do not output an error log in case of defer probe error.

Signed-off-by: Fabien Dessenne <fabien.dessenne@st.com>
Signed-off-by: Jassi Brar <jaswinder.singh@linaro.org>
2019-05-09 00:41:00 -05:00
Anson Huang
0c40e631cd mailbox: imx: use devm_platform_ioremap_resource() to simplify code
Use the new helper devm_platform_ioremap_resource() which wraps the
platform_get_resource() and devm_ioremap_resource() together, to
simplify the code.

Signed-off-by: Anson Huang <Anson.Huang@nxp.com>
Reviewed-by: Mukesh Ojha <mojha@codeaurora.org>
Signed-off-by: Jassi Brar <jaswinder.singh@linaro.org>
2019-05-09 00:40:49 -05:00
Herbert Xu
cbc22b0621 Revert "crypto: caam/jr - Remove extra memory barrier during job ring dequeue"
This reverts commit bbfcac5ff5.

It caused a crash regression on powerpc:

https://lore.kernel.org/linux-crypto/87pnp2aflz.fsf@concordia.ellerman.id.au/

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2019-05-09 13:20:48 +08:00
Iuliana Prodan
8c65d35435 crypto: caam - fix caam_dump_sg that iterates through scatterlist
Fix caam_dump_sg by correctly determining the next scatterlist
entry in the list.

Fixes: 5ecf8ef910 ("crypto: caam - fix sg dump")
Signed-off-by: Iuliana Prodan <iuliana.prodan@nxp.com>
Reviewed-by: Horia Geantă <horia.geanta@nxp.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2019-05-09 13:17:56 +08:00
Herbert Xu
24586b5fea crypto: caam - fix DKP detection logic
The detection for DKP (Derived Key Protocol) relied on the value
of the setkey function.  This was broken by the recent change which
added des3_aead_setkey.

This patch fixes this by introducing a new flag for DKP and setting
that where needed.

Fixes: 1b52c40919 ("crypto: caam - Forbid 2-key 3DES in FIPS mode")
Reported-by: Horia Geantă <horia.geanta@nxp.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Tested-by: Horia Geantă <horia.geanta@nxp.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2019-05-09 13:17:56 +08:00
Atul Gupta
0816ecf48f MAINTAINERS: Maintainer for Chelsio crypto driver
Modified the maintainer name

Signed-off-by: Atul Gupta <atul.gupta@chelsio.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2019-05-09 13:17:23 +08:00
Atul Gupta
0a4491d3fe crypto: chelsio - count incomplete block in IV
The partial block should count as one and appropriately appended
to IV. eg 499B for AES CTR should count 32 block than 31 and
correct count value is updated in iv out.

Signed-off-by: Atul Gupta <atul.gupta@chelsio.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2019-05-09 13:17:23 +08:00
Atul Gupta
33ddc108c5 crypto: chelsio - Fix softlockup with heavy I/O
removed un-necessary lock_chcr_dev to protect device state
DETACH. lock is not required to protect I/O count

Signed-off-by: Atul Gupta <atul.gupta@chelsio.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2019-05-09 13:17:22 +08:00
Atul Gupta
b4f9166430 crypto: chelsio - Fix NULL pointer dereference
Do not request FW to generate cidx update if there is less
space in tx queue to post new request.
SGE DBP 1 pidx increment too large
BUG: unable to handle kernel NULL pointer dereference at
0000000000000124
SGE error for queue 101

Signed-off-by: Atul Gupta <atul.gupta@chelsio.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2019-05-09 13:17:22 +08:00
Takashi Iwai
ed97c988bd Merge tag 'asoc-v5.2-5' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound into for-linus
ASoC: Fixes for v5.2

A bunch of driver specific fixes that came in since the initial pull
request for v5.2, mainly warning fixes for the newly added Sound Open
Firmware code which people appeared to only start looking at after I'd
sent the pull request.
2019-05-09 07:13:40 +02:00
Linus Torvalds
a2d635decb Merge tag 'drm-next-2019-05-09' of git://anongit.freedesktop.org/drm/drm
Pull drm updates from Dave Airlie:
 "This has two exciting community drivers for ARM Mali accelerators.
  Since ARM has never been open source friendly on the GPU side of the
  house, the community has had to create open source drivers for the
  Mali GPUs. Lima covers the older t4xx and panfrost the newer 6xx/7xx
  series. Well done to all involved and hopefully this will help ARM
  head in the right direction.

  There is also now the ability if you don't have any of the legacy
  drivers enabled (pre-KMS) to remove all the pre-KMS support code from
  the core drm, this saves 10% or so in codesize on my machine.

  i915 also enable Icelake/Elkhart Lake Gen11 GPUs by default, vboxvideo
  moves out of staging.

  There are also some rcar-du patches which crossover with media tree
  but all should be acked by Mauro.

  Summary:

  uapi changes:
   - Colorspace connector property
   - fourcc - new YUV formts
   - timeline sync objects initially merged
   - expose FB_DAMAGE_CLIPS to atomic userspace

  new drivers:
   - vboxvideo: moved out of staging
   - aspeed: ASPEED SoC BMC chip display support
   - lima: ARM Mali4xx GPU acceleration driver support
   - panfrost: ARM Mali6xx/7xx Midgard/Bitfrost acceleration driver support

  core:
   - component helper docs
   - unplugging fixes
   - devm device init
   - MIPI/DSI rate control
   - shmem backed gem objects
   - connector, display_info, edid_quirks cleanups
   - dma_buf fence chain support
   - 64-bit dma-fence seqno comparison fixes
   - move initial fb config code to core
   - gem fence array helpers for Lima
   - ability to remove legacy support code if no drivers requires it (removes 10% of drm.ko size)
   - lease fixes

  ttm:
   - unified DRM_FILE_PAGE_OFFSET handling
   - Account for kernel allocations in kernel zone only

  panel:
   - OSD070T1718-19TS panel support
   - panel-tpo-td028ttec1 backlight support
   - Ronbo RB070D30 MIPI/DSI
   - Feiyang FY07024DI26A30-D MIPI-DSI panel
   - Rocktech jh057n00900 MIPI-DSI panel

  i915:
   - Comet Lake (Gen9) PCI IDs
   - Updated Icelake PCI IDs
   - Elkhartlake (Gen11) support
   - DP MST property addtions
   - plane and watermark fixes
   - Icelake port sync and VEBOX disable fixes
   - struct_mutex usage reduction
   - Icelake gamma fix
   - GuC reset fixes
   - make mmap more asynchronous
   - sound display power well race fixes
   - DDI/MIPI-DSI clocks for Icelake
   - Icelake RPS frequency changing support
   - Icelake workarounds

  amdgpu:
   - Use HMM for userptr
   - vega20 experimental smu11 support
   - RAS support for vega20
   - BACO support for vega12 + fixes for vega20
   - reworked IH interrupt handling
   - amdkfd RAS support
   - Freesync improvements
   - initial timeline sync object support
   - DC Z ordering fixes
   - NV12 planes support
   - colorspace properties for planes=
   - eDP opts if eDP already initialized

  nouveau:
   - misc fixes

  etnaviv:
   - misc fixes

  msm:
   - GPU zap shader support expansion
   - robustness ABI addition

  exynos:
   - Logging cleanups

  tegra:
   - Shared reset fix
   - CPU cache maintenance fix

  cirrus:
   - driver rewritten using simple helpers

  meson:
   - G12A support

  vmwgfx:
   - Resource dirtying management improvements
   - Userspace logging improvements

  virtio:
   - PRIME fixes

  rockchip:
   - rk3066 hdmi support

  sun4i:
   - DSI burst mode support

  vc4:
   - load tracker to detect underflow

  v3d:
   - v3d v4.2 support

  malidp:
   - initial Mali D71 support in komeda driver

  tfp410:
   - omap related improvement

  omapdrm:
   - drm bridge/panel support
   - drop some omap specific panels

  rcar-du:
   - Display writeback support"

* tag 'drm-next-2019-05-09' of git://anongit.freedesktop.org/drm/drm: (1507 commits)
  drm/msm/a6xx: No zap shader is not an error
  drm/cma-helper: Fix drm_gem_cma_free_object()
  drm: Fix timestamp docs for variable refresh properties.
  drm/komeda: Mark the local functions as static
  drm/komeda: Fixed warning: Function parameter or member not described
  drm/komeda: Expose bus_width to Komeda-CORE
  drm/komeda: Add sysfs attribute: core_id and config_id
  drm: add non-desktop quirk for Valve HMDs
  drm/panfrost: Show stored feature registers
  drm/panfrost: Don't scream about deferred probe
  drm/panfrost: Disable PM on probe failure
  drm/panfrost: Set DMA masks earlier
  drm/panfrost: Add sanity checks to submit IOCTL
  drm/etnaviv: initialize idle mask before querying the HW db
  drm: introduce a capability flag for syncobj timeline support
  drm: report consistent errors when checking syncobj capibility
  drm/nouveau/nouveau: forward error generated while resuming objects tree
  drm/nouveau/fb/ramgk104: fix spelling mistake "sucessfully" -> "successfully"
  drm/nouveau/i2c: Disable i2c bus access after ->fini()
  drm/nouveau: Remove duplicate ACPI_VIDEO_NOTIFY_PROBE definition
  ...
2019-05-08 21:35:19 -07:00
Michael Ellerman
8150a153c0 powerpc/64s: Use early_mmu_has_feature() in set_kuap()
When implementing the KUAP support on Radix we fixed one case where
mmu_has_feature() was being called too early in boot via
__put_user_size().

However since then some new code in linux-next has created a new path
via which we can end up calling mmu_has_feature() too early.

On P9 this leads to crashes early in boot if we have both PPC_KUAP and
CONFIG_JUMP_LABEL_FEATURE_CHECK_DEBUG enabled. Our early boot code
calls printk() which calls probe_kernel_read(), that does a
__copy_from_user_inatomic() which calls into set_kuap() and that uses
mmu_has_feature().

At that point in boot we haven't patched MMU features yet so the debug
code in mmu_has_feature() complains, and calls printk(). At that point
we recurse, eg:

  ...
  dump_stack+0xdc
  probe_kernel_read+0x1a4
  check_pointer+0x58
  ...
  printk+0x40
  dump_stack_print_info+0xbc
  dump_stack+0x8
  probe_kernel_read+0x1a4
  probe_kernel_read+0x19c
  check_pointer+0x58
  ...
  printk+0x40
  cpufeatures_process_feature+0xc8
  scan_cpufeatures_subnodes+0x380
  of_scan_flat_dt_subnodes+0xb4
  dt_cpu_ftrs_scan_callback+0x158
  of_scan_flat_dt+0xf0
  dt_cpu_ftrs_scan+0x3c
  early_init_devtree+0x360
  early_setup+0x9c

And so on for infinity, symptom is a dead system.

Even more fun is what happens when using the hash MMU (ie. p8 or p9
with Radix disabled), and when we don't have
CONFIG_JUMP_LABEL_FEATURE_CHECK_DEBUG enabled. With the debug disabled
we don't check if static keys have been initialised, we just rely on
the jump label. But the jump label defaults to true so we just whack
the AMR even though Radix is not enabled.

Clearing the AMR is fine, but after we've done the user copy we write
(0b11 << 62) into AMR. When using hash that makes all pages with key
zero no longer readable or writable. All kernel pages implicitly have
key zero, and so all of a sudden the kernel can't read or write any of
its memory. Again dead system.

In the medium term we have several options for fixing this.
probe_kernel_read() doesn't need to touch AMR at all, it's not doing a
user access after all, but it uses __copy_from_user_inatomic() just
because it's easy, we could fix that.

It would also be safe to default to not writing to the AMR during
early boot, until we've detected features. But it's not clear that
flipping all the MMU features to static_key_false won't introduce
other bugs.

But for now just switch to early_mmu_has_feature() in set_kuap(), that
avoids all the problems with jump labels. It adds the overhead of a
global lookup and test, but that's probably trivial compared to the
writes to the AMR anyway.

Fixes: 890274c2dc ("powerpc/64s: Implement KUAP for Radix MMU")
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Reviewed-by: Russell Currey <ruscur@russell.cc>
2019-05-09 14:28:56 +10:00
Chao Yu
c9c8ed50d9 f2fs: fix to avoid potential race on sbi->unusable_block_count access/update
Use sbi.stat_lock to protect sbi->unusable_block_count accesss/udpate, in
order to avoid potential race on it.

Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2019-05-08 21:23:13 -07:00
Chao Yu
d764834378 f2fs: add tracepoint for f2fs_filemap_fault()
This patch adds tracepoint for f2fs_filemap_fault().

Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2019-05-08 21:23:13 -07:00
Chao Yu
93770ab7a6 f2fs: introduce DATA_GENERIC_ENHANCE
Previously, f2fs_is_valid_blkaddr(, blkaddr, DATA_GENERIC) will check
whether @blkaddr locates in main area or not.

That check is weak, since the block address in range of main area can
point to the address which is not valid in segment info table, and we
can not detect such condition, we may suffer worse corruption as system
continues running.

So this patch introduce DATA_GENERIC_ENHANCE to enhance the sanity check
which trigger SIT bitmap check rather than only range check.

This patch did below changes as wel:
- set SBI_NEED_FSCK in f2fs_is_valid_blkaddr().
- get rid of is_valid_data_blkaddr() to avoid panic if blkaddr is invalid.
- introduce verify_fio_blkaddr() to wrap fio {new,old}_blkaddr validation check.
- spread blkaddr check in:
 * f2fs_get_node_info()
 * __read_out_blkaddrs()
 * f2fs_submit_page_read()
 * ra_data_block()
 * do_recover_data()

This patch can fix bug reported from bugzilla below:

https://bugzilla.kernel.org/show_bug.cgi?id=203215
https://bugzilla.kernel.org/show_bug.cgi?id=203223
https://bugzilla.kernel.org/show_bug.cgi?id=203231
https://bugzilla.kernel.org/show_bug.cgi?id=203235
https://bugzilla.kernel.org/show_bug.cgi?id=203241

= Update by Jaegeuk Kim =

DATA_GENERIC_ENHANCE enhanced to validate block addresses on read/write paths.
But, xfstest/generic/446 compalins some generated kernel messages saying invalid
bitmap was detected when reading a block. The reaons is, when we get the
block addresses from extent_cache, there is no lock to synchronize it from
truncating the blocks in parallel.

Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2019-05-08 21:23:13 -07:00
Chao Yu
896285ad02 f2fs: fix to handle error in f2fs_disable_checkpoint()
In f2fs_disable_checkpoint(), it needs to detect and propagate error
number returned from f2fs_write_checkpoint().

Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2019-05-08 21:23:13 -07:00
Chengguang Xu
d5d5f0c0c9 f2fs: remove redundant check in f2fs_file_write_iter()
We have already checked flag IOCB_DIRECT in the sanity
check of flag IOCB_NOWAIT, so don't have to check it
again here.

Signed-off-by: Chengguang Xu <cgxu519@gmx.com>
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2019-05-08 21:23:12 -07:00
Chao Yu
f5a131bb23 f2fs: fix to be aware of readonly device in write_checkpoint()
As Park Ju Hyung reported:

Probably unrelated but a similar issue:
Warning appears upon unmounting a corrupted R/O f2fs loop image.

Should be a trivial issue to fix as well :)

[ 2373.758424] ------------[ cut here ]------------
[ 2373.758428] generic_make_request: Trying to write to read-only
block-device loop1 (partno 0)
[ 2373.758455] WARNING: CPU: 1 PID: 13950 at block/blk-core.c:2174
generic_make_request_checks+0x590/0x630
[ 2373.758556] CPU: 1 PID: 13950 Comm: umount Tainted: G           O
   4.19.35-zen+ #1
[ 2373.758558] Hardware name: System manufacturer System Product
Name/ROG MAXIMUS X HERO (WI-FI AC), BIOS 1704 09/14/2018
[ 2373.758564] RIP: 0010:generic_make_request_checks+0x590/0x630
[ 2373.758567] Code: 5c 03 00 00 48 8d 74 24 08 48 89 df c6 05 b5 cd
36 01 01 e8 c2 90 01 00 48 89 c6 44 89 ea 48 c7 c7 98 64 59 82 e8 d5
9b a7 ff <0f> 0b 48 8b 7b 08 e9 f2 fa ff ff 41 8b 86 98 02 00 00 49 8b
16 89
[ 2373.758570] RSP: 0018:ffff8882bdb43950 EFLAGS: 00010282
[ 2373.758573] RAX: 0000000000000050 RBX: ffff8887244c6700 RCX: 0000000000000006
[ 2373.758575] RDX: 0000000000000007 RSI: 0000000000000086 RDI: ffff88884ec56340
[ 2373.758577] RBP: ffff888849c426c0 R08: 0000000000000004 R09: 00000000000003ba
[ 2373.758579] R10: 0000000000000001 R11: 0000000000000029 R12: 0000000000001000
[ 2373.758581] R13: 0000000000000000 R14: ffff888844a2e800 R15: ffff8882bdb43ac0
[ 2373.758584] FS:  00007fc0d114f8c0(0000) GS:ffff88884ec40000(0000)
knlGS:0000000000000000
[ 2373.758586] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 2373.758588] CR2: 00007fc0d1ad12c0 CR3: 00000002bdb82003 CR4: 00000000003606e0
[ 2373.758590] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 2373.758592] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 2373.758593] Call Trace:
[ 2373.758602]  ? generic_make_request+0x46/0x3d0
[ 2373.758608]  ? wait_woken+0x80/0x80
[ 2373.758612]  ? mempool_alloc+0xb7/0x1a0
[ 2373.758618]  ? submit_bio+0x30/0x110
[ 2373.758622]  ? bvec_alloc+0x7c/0xd0
[ 2373.758628]  ? __submit_merged_bio+0x68/0x390
[ 2373.758633]  ? f2fs_submit_page_write+0x1bb/0x7f0
[ 2373.758638]  ? f2fs_do_write_meta_page+0x7f/0x160
[ 2373.758642]  ? __f2fs_write_meta_page+0x70/0x140
[ 2373.758647]  ? f2fs_sync_meta_pages+0x140/0x250
[ 2373.758653]  ? f2fs_write_checkpoint+0x5c5/0x17b0
[ 2373.758657]  ? f2fs_sync_fs+0x9c/0x110
[ 2373.758664]  ? sync_filesystem+0x66/0x80
[ 2373.758667]  ? generic_shutdown_super+0x1d/0x100
[ 2373.758670]  ? kill_block_super+0x1c/0x40
[ 2373.758674]  ? kill_f2fs_super+0x64/0xb0
[ 2373.758678]  ? deactivate_locked_super+0x2d/0xb0
[ 2373.758682]  ? cleanup_mnt+0x65/0xa0
[ 2373.758688]  ? task_work_run+0x7f/0xa0
[ 2373.758693]  ? exit_to_usermode_loop+0x9c/0xa0
[ 2373.758698]  ? do_syscall_64+0xc7/0xf0
[ 2373.758703]  ? entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 2373.758706] ---[ end trace 5d3639907c56271b ]---
[ 2373.758780] print_req_error: I/O error, dev loop1, sector 143048
[ 2373.758800] print_req_error: I/O error, dev loop1, sector 152200
[ 2373.758808] print_req_error: I/O error, dev loop1, sector 8192
[ 2373.758819] print_req_error: I/O error, dev loop1, sector 12272

This patch adds to detect readonly device in write_checkpoint() to avoid
trigger write IOs on it.

Reported-by: Park Ju Hyung <qkrwngud825@gmail.com>
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2019-05-08 21:23:12 -07:00
Chao Yu
b61af314c9 f2fs: fix to skip recovery on readonly device
As Park Ju Hyung reported in mailing list:

https://sourceforge.net/p/linux-f2fs/mailman/message/36639787/

generic_make_request: Trying to write to read-only block-device loop0 (partno 0)
WARNING: CPU: 0 PID: 23437 at block/blk-core.c:2174 generic_make_request_checks+0x594/0x630

 generic_make_request+0x46/0x3d0
 submit_bio+0x30/0x110
 __submit_merged_bio+0x68/0x390
 f2fs_submit_page_write+0x1bb/0x7f0
 f2fs_do_write_meta_page+0x7f/0x160
 __f2fs_write_meta_page+0x70/0x140
 f2fs_sync_meta_pages+0x140/0x250
 f2fs_write_checkpoint+0x5c5/0x17b0
 f2fs_sync_fs+0x9c/0x110
 sync_filesystem+0x66/0x80
 f2fs_recover_fsync_data+0x790/0xa30
 f2fs_fill_super+0xe4e/0x1980
 mount_bdev+0x518/0x610
 mount_fs+0x34/0x13f
 vfs_kern_mount.part.11+0x4f/0x120
 do_mount+0x2d1/0xe40
 __x64_sys_mount+0xbf/0xe0
 do_syscall_64+0x4a/0xf0
 entry_SYSCALL_64_after_hwframe+0x44/0xa9

print_req_error: I/O error, dev loop0, sector 4096

If block device is readonly, we should never trigger write IO from
filesystem layer, but previously, orphan and journal recovery didn't
consider such condition, result in triggering above warning, fix it.

Reported-by: Park Ju Hyung <qkrwngud825@gmail.com>
Tested-by: Park Ju Hyung <qkrwngud825@gmail.com>
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2019-05-08 21:23:12 -07:00
Chao Yu
f824deb54b f2fs: fix to consider multiple device for readonly check
This patch introduce f2fs_hw_is_readonly() to check whether lower
device is readonly or not, it adapts multiple device scenario.

Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2019-05-08 21:23:12 -07:00
Chao Yu
b471eb99e6 f2fs: relocate chksum_offset for large_nat_bitmap feature
For large_nat_bitmap feature, there is a design flaw:

Previous:

struct f2fs_checkpoint layout:
+--------------------------+  0x0000
| checkpoint_ver           |
| ......                   |
| checksum_offset          |------+
| ......                   |      |
| sit_nat_version_bitmap[] |<-----|-------+
| ......                   |      |       |
| checksum_value           |<-----+       |
+--------------------------+  0x1000      |
|                          |      nat_bitmap + sit_bitmap
| payload blocks           |              |
|                          |              |
+--------------------------|<-------------+

Obviously, if nat_bitmap size + sit_bitmap size is larger than
MAX_BITMAP_SIZE_IN_CKPT, nat_bitmap or sit_bitmap may overlap
checkpoint checksum's position, once checkpoint() is triggered
from kernel, nat or sit bitmap will be damaged by checksum field.

In order to fix this, let's relocate checksum_value's position
to the head of sit_nat_version_bitmap as below, then nat/sit
bitmap and chksum value update will become safe.

After:

struct f2fs_checkpoint layout:
+--------------------------+  0x0000
| checkpoint_ver           |
| ......                   |
| checksum_offset          |------+
| ......                   |      |
| sit_nat_version_bitmap[] |<-----+
| ......                   |<-------------+
|                          |              |
+--------------------------+  0x1000      |
|                          |      nat_bitmap + sit_bitmap
| payload blocks           |              |
|                          |              |
+--------------------------|<-------------+

Related report and discussion:

https://sourceforge.net/p/linux-f2fs/mailman/message/36642346/

Reported-by: Park Ju Hyung <qkrwngud825@gmail.com>
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2019-05-08 21:23:11 -07:00
Chao Yu
d7eb8f1cdf f2fs: allow unfixed f2fs_checkpoint.checksum_offset
Previously, f2fs_checkpoint.checksum_offset points fixed position of
f2fs_checkpoint structure:

"#define CP_CHKSUM_OFFSET	4092"

It is unnecessary, and it breaks the consecutiveness of nat and sit
bitmap stored across checkpoint park block and payload blocks.

This patch allows f2fs to handle unfixed .checksum_offset.

In addition, for the case checksum value is stored in the middle of
checkpoint park, calculating checksum value with superposition method
like we did for inode_checksum.

Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2019-05-08 21:23:11 -07:00
Youngjun Yoo
3a912b7732 f2fs: Replace spaces with tab
Modify coding style
ERROR: code indent should use tabs where possible

Signed-off-by: Youngjun Yoo <youngjun.willow@gmail.com>
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2019-05-08 21:23:11 -07:00
Youngjun Yoo
c456362b91 f2fs: insert space before the open parenthesis '('
Modify coding style
ERROR: space required before the open parenthesis '('

Signed-off-by: Youngjun Yoo <youngjun.willow@gmail.com>
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2019-05-08 21:23:11 -07:00
Chao Yu
d02a6e6174 f2fs: allow address pointer number of dnode aligning to specified size
This patch expands scalability of dnode layout, it allows address pointer
number of dnode aligning to specified size (now, the size is one byte by
default), and later the number can align to compress cluster size
(1 << n bytes, n=[2,..)), it can avoid cluster acrossing two dnode, making
design of compress meta layout simple.

Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2019-05-08 21:23:10 -07:00
Chao Yu
2df0ab0457 f2fs: introduce f2fs_read_single_page() for cleanup
This patch introduces f2fs_read_single_page() to wrap core operations
of reading one page in f2fs_mpage_readpages().

In addition, if we failed in f2fs_mpage_readpages(), propagate error
number to f2fs_read_data_page(), for f2fs_read_data_pages() path,
always return success.

Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2019-05-08 21:23:10 -07:00