1068338 Commits

Author SHA1 Message Date
James Tai
d53de05681 ANDROID: GKI: Update RTK STB KMI symbol list
13 function symbol(s) added
  'int __traceiter_dwc3_writel(void*, void*, u32, u32)'
  'int class_compat_create_link(struct class_compat*, struct device*, struct device*)'
  'struct class_compat* class_compat_register(const char*)'
  'void class_compat_remove_link(struct class_compat*, struct device*, struct device*)'
  'void class_compat_unregister(struct class_compat*)'
  'int clk_set_phase(struct clk*, int)'
  'int device_attach(struct device*)'
  'int extcon_sync(struct extcon_dev*, unsigned int)'
  'struct gpio_desc* gpiod_get_from_of_node(const struct device_node*, const char*, int, enum gpiod_flags, const char*)'
  'struct pwm_device* pwm_get(struct device*, const char*)'
  'void usb_remove_phy(struct usb_phy*)'
  'struct usb_role_switch* usb_role_switch_find_by_fwnode(const struct fwnode_handle*)'
  'enum usb_role usb_role_switch_get_role(struct usb_role_switch*)'

1 variable symbol(s) added
  'struct tracepoint __tracepoint_dwc3_writel'

Bug: 289850528
Change-Id: I5c28af47c863f6bd3a40f0fe520a6dfc82a04630
Signed-off-by: James Tai <james.tai@realtek.com>
2023-07-06 23:47:49 +00:00
Jaegeuk Kim
0765cda329 UPSTREAM: f2fs: fix deadlock in i_xattr_sem and inode page lock
Thread #1:

[122554.641906][   T92]  f2fs_getxattr+0xd4/0x5fc
    -> waiting for f2fs_down_read(&F2FS_I(inode)->i_xattr_sem);

[122554.641927][   T92]  __f2fs_get_acl+0x50/0x284
[122554.641948][   T92]  f2fs_init_acl+0x84/0x54c
[122554.641969][   T92]  f2fs_init_inode_metadata+0x460/0x5f0
[122554.641990][   T92]  f2fs_add_inline_entry+0x11c/0x350
    -> Locked dir->inode_page by f2fs_get_node_page()

[122554.642009][   T92]  f2fs_do_add_link+0x100/0x1e4
[122554.642025][   T92]  f2fs_create+0xf4/0x22c
[122554.642047][   T92]  vfs_create+0x130/0x1f4

Thread #2:

[123996.386358][   T92]  __get_node_page+0x8c/0x504
    -> waiting for dir->inode_page lock

[123996.386383][   T92]  read_all_xattrs+0x11c/0x1f4
[123996.386405][   T92]  __f2fs_setxattr+0xcc/0x528
[123996.386424][   T92]  f2fs_setxattr+0x158/0x1f4
    -> f2fs_down_write(&F2FS_I(inode)->i_xattr_sem);

[123996.386443][   T92]  __f2fs_set_acl+0x328/0x430
[123996.386618][   T92]  f2fs_set_acl+0x38/0x50
[123996.386642][   T92]  posix_acl_chmod+0xc8/0x1c8
[123996.386669][   T92]  f2fs_setattr+0x5e0/0x6bc
[123996.386689][   T92]  notify_change+0x4d8/0x580
[123996.386717][   T92]  chmod_common+0xd8/0x184
[123996.386748][   T92]  do_fchmodat+0x60/0x124
[123996.386766][   T92]  __arm64_sys_fchmodat+0x28/0x3c
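
The two traces above describe a lock-order inversion: thread #1 holds the dir inode page lock and waits for i_xattr_sem, while thread #2 holds i_xattr_sem and waits for the inode page lock. A minimal userspace sketch of that condition follows. The enum values, struct lock_pair and is_abba() are hypothetical names used only for illustration; none of this is f2fs code.

```c
#include <stdbool.h>

/* Hypothetical lock identifiers, used only in this sketch. */
enum lock_id { INODE_PAGE_LOCK, XATTR_SEM };

/* What one thread currently holds and what it is blocked on. */
struct lock_pair {
	enum lock_id held;
	enum lock_id wanted;
};

/*
 * Two threads are in an ABBA deadlock when each one waits for
 * the lock the other one already holds.
 */
static bool is_abba(struct lock_pair a, struct lock_pair b)
{
	return a.held != a.wanted &&
	       a.held == b.wanted &&
	       a.wanted == b.held;
}
```

Passing { INODE_PAGE_LOCK, XATTR_SEM } for thread #1 and { XATTR_SEM, INODE_PAGE_LOCK } for thread #2 makes is_abba() return true, which matches the pair of traces.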

Bug: 280545073
Fixes: 27161f13e3 "f2fs: avoid race in between read xattr & write xattr"
Cc: <stable@vger.kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
(cherry picked from commit 82d8a4f642421ece594542e1fabc689dcb094b1a)
Change-Id: Iec383216e1887e11c69374d28e4ecdedda133919
2023-07-06 17:45:20 +00:00
Jaegeuk Kim
38fff8f312 Revert "FROMLIST: f2fs: remove i_xattr_sem to avoid deadlock and fix the original issue"
This reverts commit 21061b7d0f.

Let's use the upstream version.

Bug: 280545073
Signed-off-by: Jaegeuk Kim <jaegeuk@google.com>
Change-Id: Idcdc94d6bd6b6272535a49c8639517ef1bddb246
2023-07-05 17:27:51 -07:00
Zheng Wang
60a2ccabe2 UPSTREAM: usb: gadget: udc: renesas_usb3: Fix use after free bug in renesas_usb3_remove due to race condition
[ Upstream commit 2b947f8769 ]

In renesas_usb3_probe, role_work is bound with renesas_usb3_role_work.
renesas_usb3_start will be called to start the work.

If we remove the driver, which will call usbhs_remove, there may be
unfinished work. The possible sequence is as follows:

CPU0                  			CPU1

                    			 renesas_usb3_role_work
renesas_usb3_remove
usb_role_switch_unregister
device_unregister
kfree(sw)
//free usb3->role_sw
                    			 usb_role_switch_set_role
                    			 //use usb3->role_sw

The usb3->role_sw could be freed under such circumstance and then
used in usb_role_switch_set_role.

This bug was found by static analysis. And note that removing a
driver is a root-only operation, and should never happen in normal
case. But the root user may directly remove the device which
will also trigger the remove function.

Fix it by canceling the work before cleanup in the renesas_usb3_remove.

Bug: 289003615
Fixes: 39facfa01c ("usb: gadget: udc: renesas_usb3: Add register of usb role switch")
Signed-off-by: Zheng Wang <zyytlz.wz@163.com>
Reviewed-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
Link: https://lore.kernel.org/r/20230320062931.505170-1-zyytlz.wz@163.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
(cherry picked from commit df23805209)
Signed-off-by: Lee Jones <joneslee@google.com>
Change-Id: I79a1dbeba9a90ee5daf94648ef6a32207b283561
2023-07-04 15:08:48 +01:00
Zheng Wang
ebe7bbdffd UPSTREAM: media: rkvdec: fix use after free bug in rkvdec_remove
[ Upstream commit 3228cec23b ]

In rkvdec_probe, rkvdec->watchdog_work is bound with
rkvdec_watchdog_func. Then rkvdec_vp9_run may
be called to start the work.

If we remove the module, which will call rkvdec_remove
to clean up, there may be unfinished work.
The possible sequence is as follows, which will
cause a typical UAF bug.

Fix it by canceling the work before cleanup in rkvdec_remove.

CPU0                  CPU1

                    |rkvdec_watchdog_func
rkvdec_remove       |
 rkvdec_v4l2_cleanup|
  v4l2_m2m_release  |
    kfree(m2m_dev); |
                    |
                    | v4l2_m2m_get_curr_priv
                    |   m2m_dev->curr_ctx //use

Bug: 289003637
Fixes: cd33c83044 ("media: rkvdec: Add the rkvdec driver")
Signed-off-by: Zheng Wang <zyytlz.wz@163.com>
Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
(cherry picked from commit 6a17add9c6)
Signed-off-by: Lee Jones <joneslee@google.com>
Change-Id: Ibdf4667315d98ac1cd42545f61e271c291893edd
2023-07-04 09:44:15 +00:00
Zhang Zhengming
4d634bb7be UPSTREAM: relayfs: fix out-of-bounds access in relay_file_read
commit 43ec16f145 upstream.

There is a crash in relay_file_read, as the variable 'from'
points to the end of the last subbuf.

The oops looks something like:
pc : __arch_copy_to_user+0x180/0x310
lr : relay_file_read+0x20c/0x2c8
Call trace:
 __arch_copy_to_user+0x180/0x310
 full_proxy_read+0x68/0x98
 vfs_read+0xb0/0x1d0
 ksys_read+0x6c/0xf0
 __arm64_sys_read+0x20/0x28
 el0_svc_common.constprop.3+0x84/0x108
 do_el0_svc+0x74/0x90
 el0_svc+0x1c/0x28
 el0_sync_handler+0x88/0xb0
 el0_sync+0x148/0x180

We get the condition by analyzing the vmcore:

1). The last produced byte and the last consumed byte
    are both at the end of the last subbuf.

2). A softirq calls a function (e.g. __blk_add_trace)
    that writes to the relay buffer while a program is calling
    relay_file_read_avail().

        relay_file_read
                relay_file_read_avail
                        relay_file_read_consume(buf, 0, 0);
                        //interrupted by softirq who will write subbuf
                        ....
                        return 1;
                //read_start point to the end of the last subbuf
                read_start = relay_file_read_start_pos
                //avail is equal to subsize
                avail = relay_file_read_subbuf_avail
                //from  points to an invalid memory address
                from = buf->start + read_start
                //system is crashed
                copy_to_user(buffer, from, avail)
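
The failing combination above (read_start at the end of the last subbuf, avail equal to one subbuf) can be checked with a small userspace sketch. This only illustrates the bounds condition and is not the upstream fix; read_in_bounds() and its parameters are hypothetical names.

```c
#include <stddef.h>

/*
 * Returns 1 if copying `avail` bytes starting at offset `read_start`
 * stays inside a buffer of n_subbufs * subbuf_size bytes, else 0.
 */
static int read_in_bounds(size_t read_start, size_t avail,
			  size_t subbuf_size, size_t n_subbufs)
{
	size_t total = subbuf_size * n_subbufs;

	if (read_start > total)
		return 0;
	/* Compare against the remaining space to avoid overflow. */
	return avail <= total - read_start;
}
```

With four subbufs of 4096 bytes, a read_start of 4 * 4096 combined with an avail of 4096 is rejected, while a read_start of 3 * 4096 with the same avail is accepted.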

Bug: 288957094
Link: https://lkml.kernel.org/r/20230419040203.37676-1-zhang.zhengming@h3c.com
Fixes: 8d62fdebda ("relay file read: start-pos fix")
Signed-off-by: Zhang Zhengming <zhang.zhengming@h3c.com>
Reviewed-by: Zhao Lei <zhao_lei1@hoperun.com>
Reviewed-by: Zhou Kete <zhou.kete@h3c.com>
Reviewed-by: Pengcheng Yang <yangpc@wangsu.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit f6ee841ff2)
Signed-off-by: Lee Jones <joneslee@google.com>
Change-Id: Ibbdf65d8bf2268c3e8c09520f595167a2ed41e8b
2023-07-04 08:30:26 +01:00
Maciej Żenczykowski
b8cb7eb0b4 BACKPORT: revert "net: align SO_RCVMARK required privileges with SO_MARK"
This reverts commit 1f86123b97 ("net: align SO_RCVMARK required
privileges with SO_MARK") because the reasoning in the commit message
is not really correct:
  SO_RCVMARK is used for 'reading' incoming skb mark (via cmsg), as such
  it is more equivalent to 'getsockopt(SO_MARK)' which has no priv check
  and retrieves the socket mark, rather than 'setsockopt(SO_MARK)' which
  sets the socket mark and does require privs.

  Additionally incoming skb->mark may already be visible if
  sysctl_fwmark_reflect and/or sysctl_tcp_fwmark_accept are enabled.

  Furthermore, it is easier to block the getsockopt via bpf
  (either cgroup setsockopt hook, or via syscall filters)
  than to unblock it if it requires CAP_NET_RAW/ADMIN.

On Android the socket mark is (among other things) used to store
the network identifier a socket is bound to.  Setting it is privileged,
but retrieving it is not.  We'd like unprivileged userspace to be able
to read the network id of incoming packets (where mark is set via
iptables [to be moved to bpf])...

An alternative would be to add another sysctl to control whether
setting SO_RCVMARK is privileged or not.
(or even a MASK of which bits in the mark can be exposed)
But this seems like over-engineering...

Note: This is a non-trivial revert, due to later merged commit e42c7beee7
("bpf: net: Consider has_current_bpf_ctx() when testing capable() in sk_setsockopt()")
which changed both 'ns_capable' into 'sockopt_ns_capable' calls.

Bug: 254441685
Fixes: 1f86123b97 ("net: align SO_RCVMARK required privileges with SO_MARK")
Cc: Larysa Zaremba <larysa.zaremba@intel.com>
Cc: Simon Horman <simon.horman@corigine.com>
Cc: Paolo Abeni <pabeni@redhat.com>
Cc: Eyal Birger <eyal.birger@gmail.com>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Patrick Rohr <prohr@google.com>
Signed-off-by: Maciej Żenczykowski <maze@google.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Link: https://lore.kernel.org/r/20230618103130.51628-1-maze@google.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
(cherry picked from commit a9628e8877)
[Lee: Fixed trivial merge conflict - result is the same]
Signed-off-by: Lee Jones <joneslee@google.com>
Change-Id: Iee4d495734536509c1fc4db61879113a311e4033
2023-07-03 19:15:32 +00:00
Benjamin Berg
9b46997240 UPSTREAM: wifi: cfg80211: fix link del callback to call correct handler
The wrapper function was incorrectly calling the add handler instead of
the del handler. This had no negative side effect as the default
handlers are essentially identical.

Bug: 254441685
Fixes: f2a0290b2d ("wifi: cfg80211: add optional link add/remove callbacks")
Signed-off-by: Benjamin Berg <benjamin.berg@intel.com>
Signed-off-by: Gregory Greenman <gregory.greenman@intel.com>
Link: https://lore.kernel.org/r/20230608163202.ebd00e000459.Iaff7dc8d1cdecf77f53ea47a0e5080caa36ea02a@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
(cherry picked from commit 1ff56684fa)
Signed-off-by: Lee Jones <joneslee@google.com>
Change-Id: I8e3c2ff671437de0fdc3c40b9ee7c9eb7849eec9
2023-07-03 19:15:28 +00:00
Johannes Berg
dc11ed25f7 UPSTREAM: wifi: cfg80211: reject bad AP MLD address
When trying to authenticate, if the AP MLD address isn't
a valid address, mac80211 can throw a warning. Avoid that
by rejecting such addresses.

Bug: 254441685
Fixes: d648c23024 ("wifi: nl80211: support MLO in auth/assoc")
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Gregory Greenman <gregory.greenman@intel.com>
Link: https://lore.kernel.org/r/20230604120651.89188912bd1d.I8dbc6c8ee0cb766138803eec59508ef4ce477709@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
(cherry picked from commit 727073ca5e)
Signed-off-by: Lee Jones <joneslee@google.com>
Change-Id: I20872bbd7f27db4fcbcd0b2b40f1835a175247f8
2023-07-03 19:15:21 +00:00
Akihiko Odaki
2e6bf292f3 UPSTREAM: KVM: arm64: Populate fault info for watchpoint
When handling ESR_ELx_EC_WATCHPT_LOW, far_el2 member of struct
kvm_vcpu_fault_info will be copied to far member of struct
kvm_debug_exit_arch and exposed to the userspace. The userspace will
see stale values from older faults if the fault info does not get
populated.

Bug: 254441685
Fixes: 8fb2046180 ("KVM: arm64: Move early handlers to per-EC handlers")
Suggested-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20230530024651.10014-1-akihiko.odaki@daynix.com
Cc: stable@vger.kernel.org
(cherry picked from commit 811154e234)
Signed-off-by: Lee Jones <joneslee@google.com>
Change-Id: I3d6dfed43293fbcd60898943e41ef2e3f6697a9f
2023-07-03 19:15:21 +00:00
Michal Luczaj
c8a3a08497 UPSTREAM: KVM: Fix vcpu_array[0] races
In kvm_vm_ioctl_create_vcpu(), add vcpu to vcpu_array iff it's safe to
access vcpu via kvm_get_vcpu() and kvm_for_each_vcpu(), i.e. when there's
no failure path requiring vcpu removal and destruction. Such order is
important because vcpu_array accessors may end up referencing vcpu at
vcpu_array[0] even before online_vcpus is set to 1.

When online_vcpus=0, any call to kvm_get_vcpu() goes through
array_index_nospec() and ends with an attempt to xa_load(vcpu_array, 0):

	int num_vcpus = atomic_read(&kvm->online_vcpus);
	i = array_index_nospec(i, num_vcpus);
	return xa_load(&kvm->vcpu_array, i);

Similarly, when online_vcpus=0, a kvm_for_each_vcpu() does not iterate over
an "empty" range, but actually [0, ULONG_MAX]:

	xa_for_each_range(&kvm->vcpu_array, idx, vcpup, 0, \
			  (atomic_read(&kvm->online_vcpus) - 1))

In both cases, such online_vcpus=0 edge case, even if leading to
unnecessary calls to XArray API, should not be an issue; requesting
unpopulated indexes/ranges is handled by xa_load() and xa_for_each_range().
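
The [0, ULONG_MAX] range mentioned above follows from C integer conversion: the int expression online_vcpus - 1 evaluates to -1 when online_vcpus is 0, and converting -1 to unsigned long yields ULONG_MAX. A minimal userspace sketch, assuming the upper bound ends up in an unsigned long parameter (range_last() is a hypothetical helper, not a KVM function):

```c
#include <limits.h>

/*
 * Mirrors the upper-bound expression
 * `atomic_read(&kvm->online_vcpus) - 1` after it is converted
 * to an unsigned long argument.
 */
static unsigned long range_last(int online_vcpus)
{
	/* int arithmetic first, then conversion to unsigned long */
	return (unsigned long)(online_vcpus - 1);
}
```

range_last(0) evaluates to ULONG_MAX, whereas range_last(1) evaluates to 0, the index of the only online vCPU.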

However, this means that when the first vCPU is created and inserted in
vcpu_array *and* before online_vcpus is incremented, code calling
kvm_get_vcpu()/kvm_for_each_vcpu() already has access to that first vCPU.

This should not pose a problem assuming that once a vcpu is stored in
vcpu_array, it will remain there, but that's not the case:
kvm_vm_ioctl_create_vcpu() first inserts to vcpu_array, then requests a
file descriptor. If create_vcpu_fd() fails, newly inserted vcpu is removed
from the vcpu_array, then destroyed:

	vcpu->vcpu_idx = atomic_read(&kvm->online_vcpus);
	r = xa_insert(&kvm->vcpu_array, vcpu->vcpu_idx, vcpu, GFP_KERNEL_ACCOUNT);
	kvm_get_kvm(kvm);
	r = create_vcpu_fd(vcpu);
	if (r < 0) {
		xa_erase(&kvm->vcpu_array, vcpu->vcpu_idx);
		kvm_put_kvm_no_destroy(kvm);
		goto unlock_vcpu_destroy;
	}
	atomic_inc(&kvm->online_vcpus);

This results in a possible race condition when a reference to a vcpu is
acquired (via kvm_get_vcpu() or kvm_for_each_vcpu()) moments before said
vcpu is destroyed.

Bug: 254441685
Signed-off-by: Michal Luczaj <mhal@rbox.co>
Message-Id: <20230510140410.1093987-2-mhal@rbox.co>
Cc: stable@vger.kernel.org
Fixes: c5b0775491 ("KVM: Convert the kvm->vcpus array to a xarray", 2021-12-08)
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit afb2acb2e3)
Signed-off-by: Lee Jones <joneslee@google.com>
Change-Id: I79735110d2e95dddb8181c72716a24cd87736094
2023-07-03 19:15:21 +00:00
Arnd Bergmann
d18fa8c525 UPSTREAM: media: pvrusb2: fix DVB_CORE dependency
Now that DVB_CORE can be a loadable module, pvrusb2 can run into
a link error:

ld.lld: error: undefined symbol: dvb_module_probe
>>> referenced by pvrusb2-devattr.c
>>>               drivers/media/usb/pvrusb2/pvrusb2-devattr.o:(pvr2_lgdt3306a_attach) in archive vmlinux.a
ld.lld: error: undefined symbol: dvb_module_release
>>> referenced by pvrusb2-devattr.c
>>>               drivers/media/usb/pvrusb2/pvrusb2-devattr.o:(pvr2_dual_fe_attach) in archive vmlinux.a

Refine the Kconfig dependencies to avoid this case.

Bug: 254441685
Link: https://lore.kernel.org/linux-media/20230117171055.2714621-1-arnd@kernel.org
Fixes: 7655c342db ("media: Kconfig: Make DVB_CORE=m possible when MEDIA_SUPPORT=y")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Mauro Carvalho Chehab <mchehab@kernel.org>
(cherry picked from commit 53558de2b5)
Signed-off-by: Lee Jones <joneslee@google.com>
Change-Id: Ib47c2e59aff511becce09e81c71c9eeb01695a67
2023-07-03 19:15:16 +00:00
Mark Rutland
f4aace942a UPSTREAM: kasan: hw_tags: avoid invalid virt_to_page()
When booting with 'kasan.vmalloc=off', a kernel configured with support
for KASAN_HW_TAGS will explode at boot time due to bogus use of
virt_to_page() on a vmalloc address.  With CONFIG_DEBUG_VIRTUAL selected
this will be reported explicitly, and with or without CONFIG_DEBUG_VIRTUAL
the kernel will dereference a bogus address:

| ------------[ cut here ]------------
| virt_to_phys used for non-linear address: (____ptrval____) (0xffff800008000000)
| WARNING: CPU: 0 PID: 0 at arch/arm64/mm/physaddr.c:15 __virt_to_phys+0x78/0x80
| Modules linked in:
| CPU: 0 PID: 0 Comm: swapper/0 Not tainted 6.3.0-rc3-00073-g83865133300d-dirty #4
| Hardware name: linux,dummy-virt (DT)
| pstate: 600000c5 (nZCv daIF -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
| pc : __virt_to_phys+0x78/0x80
| lr : __virt_to_phys+0x78/0x80
| sp : ffffcd076afd3c80
| x29: ffffcd076afd3c80 x28: 0068000000000f07 x27: ffff800008000000
| x26: fffffbfff0000000 x25: fffffbffff000000 x24: ff00000000000000
| x23: ffffcd076ad3c000 x22: fffffc0000000000 x21: ffff800008000000
| x20: ffff800008004000 x19: ffff800008000000 x18: ffff800008004000
| x17: 666678302820295f x16: ffffffffffffffff x15: 0000000000000004
| x14: ffffcd076b009e88 x13: 0000000000000fff x12: 0000000000000003
| x11: 00000000ffffefff x10: c0000000ffffefff x9 : 0000000000000000
| x8 : 0000000000000000 x7 : 205d303030303030 x6 : 302e30202020205b
| x5 : ffffcd076b41d63f x4 : ffffcd076afd3827 x3 : 0000000000000000
| x2 : 0000000000000000 x1 : ffffcd076afd3a30 x0 : 000000000000004f
| Call trace:
|  __virt_to_phys+0x78/0x80
|  __kasan_unpoison_vmalloc+0xd4/0x478
|  __vmalloc_node_range+0x77c/0x7b8
|  __vmalloc_node+0x54/0x64
|  init_IRQ+0x94/0xc8
|  start_kernel+0x194/0x420
|  __primary_switched+0xbc/0xc4
| ---[ end trace 0000000000000000 ]---
| Unable to handle kernel paging request at virtual address 03fffacbe27b8000
| Mem abort info:
|   ESR = 0x0000000096000004
|   EC = 0x25: DABT (current EL), IL = 32 bits
|   SET = 0, FnV = 0
|   EA = 0, S1PTW = 0
|   FSC = 0x04: level 0 translation fault
| Data abort info:
|   ISV = 0, ISS = 0x00000004
|   CM = 0, WnR = 0
| swapper pgtable: 4k pages, 48-bit VAs, pgdp=0000000041bc5000
| [03fffacbe27b8000] pgd=0000000000000000, p4d=0000000000000000
| Internal error: Oops: 0000000096000004 [#1] PREEMPT SMP
| Modules linked in:
| CPU: 0 PID: 0 Comm: swapper/0 Tainted: G        W          6.3.0-rc3-00073-g83865133300d-dirty #4
| Hardware name: linux,dummy-virt (DT)
| pstate: 200000c5 (nzCv daIF -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
| pc : __kasan_unpoison_vmalloc+0xe4/0x478
| lr : __kasan_unpoison_vmalloc+0xd4/0x478
| sp : ffffcd076afd3ca0
| x29: ffffcd076afd3ca0 x28: 0068000000000f07 x27: ffff800008000000
| x26: 0000000000000000 x25: 03fffacbe27b8000 x24: ff00000000000000
| x23: ffffcd076ad3c000 x22: fffffc0000000000 x21: ffff800008000000
| x20: ffff800008004000 x19: ffff800008000000 x18: ffff800008004000
| x17: 666678302820295f x16: ffffffffffffffff x15: 0000000000000004
| x14: ffffcd076b009e88 x13: 0000000000000fff x12: 0000000000000001
| x11: 0000800008000000 x10: ffff800008000000 x9 : ffffb2f8dee00000
| x8 : 000ffffb2f8dee00 x7 : 205d303030303030 x6 : 302e30202020205b
| x5 : ffffcd076b41d63f x4 : ffffcd076afd3827 x3 : 0000000000000000
| x2 : 0000000000000000 x1 : ffffcd076afd3a30 x0 : ffffb2f8dee00000
| Call trace:
|  __kasan_unpoison_vmalloc+0xe4/0x478
|  __vmalloc_node_range+0x77c/0x7b8
|  __vmalloc_node+0x54/0x64
|  init_IRQ+0x94/0xc8
|  start_kernel+0x194/0x420
|  __primary_switched+0xbc/0xc4
| Code: d34cfc08 aa1f03fa 8b081b39 d503201f (f9400328)
| ---[ end trace 0000000000000000 ]---
| Kernel panic - not syncing: Attempted to kill the idle task!

This is because init_vmalloc_pages() erroneously calls virt_to_page() on
a vmalloc address, while virt_to_page() is only valid for addresses in
the linear/direct map. Since init_vmalloc_pages() expects virtual
addresses in the vmalloc range, it must use vmalloc_to_page() rather
than virt_to_page().

We call init_vmalloc_pages() from __kasan_unpoison_vmalloc(), where we
check !is_vmalloc_or_module_addr(), suggesting that we might encounter a
non-vmalloc address. Luckily, this never happens. By design, we only
call __kasan_unpoison_vmalloc() on pointers in the vmalloc area, and I
have verified that we don't violate that expectation. Given that,
is_vmalloc_or_module_addr() must always be true for any legitimate
argument to __kasan_unpoison_vmalloc().

Correct init_vmalloc_pages() to use vmalloc_to_page(), and remove the
redundant and misleading use of is_vmalloc_or_module_addr() in
__kasan_unpoison_vmalloc().

Bug: 254441685
Link: https://lkml.kernel.org/r/20230418164212.1775741-1-mark.rutland@arm.com
Fixes: 6c2f761dad ("kasan: fix zeroing vmalloc memory with HW_TAGS")
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Andrey Konovalov <andreyknvl@google.com>
Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Marco Elver <elver@google.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
(cherry picked from commit 29083fd84d)
Signed-off-by: Lee Jones <joneslee@google.com>
Change-Id: I64bebeea4b1625e8f648ef6f99b99cc1dd4e6faa
2023-07-03 19:15:11 +00:00
Alice Chao
8f4b51c499 UPSTREAM: scsi: ufs: core: mcq: Fix &hwq->cq_lock deadlock issue
When ufshcd_err_handler() is executed, a CQ event interrupt can enter and wait
for the same lock. This can happen in ufshcd_handle_mcq_cq_events() and
also in ufs_mtk_mcq_intr(). The following warning message will be generated
when &hwq->cq_lock is used in IRQ context with IRQ enabled. Use
ufshcd_mcq_poll_cqe_lock() with spin_lock_irqsave instead of spin_lock to
resolve the deadlock issue.

[name:lockdep&]WARNING: inconsistent lock state
[name:lockdep&]--------------------------------
[name:lockdep&]inconsistent {IN-HARDIRQ-W} -> {HARDIRQ-ON-W} usage.
[name:lockdep&]kworker/u16:4/260 [HC0[0]:SC0[0]:HE1:SE1] takes:
  ffffff8028444600 (&hwq->cq_lock){?.-.}-{2:2}, at:
ufshcd_mcq_poll_cqe_lock+0x30/0xe0
[name:lockdep&]{IN-HARDIRQ-W} state was registered at:
  lock_acquire+0x17c/0x33c
  _raw_spin_lock+0x5c/0x7c
  ufshcd_mcq_poll_cqe_lock+0x30/0xe0
  ufs_mtk_mcq_intr+0x60/0x1bc [ufs_mediatek_mod]
  __handle_irq_event_percpu+0x140/0x3ec
  handle_irq_event+0x50/0xd8
  handle_fasteoi_irq+0x148/0x2b0
  generic_handle_domain_irq+0x4c/0x6c
  gic_handle_irq+0x58/0x134
  call_on_irq_stack+0x40/0x74
  do_interrupt_handler+0x84/0xe4
  el1_interrupt+0x3c/0x78
<snip>

Possible unsafe locking scenario:
       CPU0
       ----
  lock(&hwq->cq_lock);
  <Interrupt>
    lock(&hwq->cq_lock);
  *** DEADLOCK ***
2 locks held by kworker/u16:4/260:

[name:lockdep&]
 stack backtrace:
CPU: 7 PID: 260 Comm: kworker/u16:4 Tainted: G S      W  OE
6.1.17-mainline-android14-2-g277223301adb #1
Workqueue: ufs_eh_wq_0 ufshcd_err_handler

 Call trace:
  dump_backtrace+0x10c/0x160
  show_stack+0x20/0x30
  dump_stack_lvl+0x98/0xd8
  dump_stack+0x20/0x60
  print_usage_bug+0x584/0x76c
  mark_lock_irq+0x488/0x510
  mark_lock+0x1ec/0x25c
  __lock_acquire+0x4d8/0xffc
  lock_acquire+0x17c/0x33c
  _raw_spin_lock+0x5c/0x7c
  ufshcd_mcq_poll_cqe_lock+0x30/0xe0
  ufshcd_poll+0x68/0x1b0
  ufshcd_transfer_req_compl+0x9c/0xc8
  ufshcd_err_handler+0x3bc/0xea0
  process_one_work+0x2f4/0x7e8
  worker_thread+0x234/0x450
  kthread+0x110/0x134
  ret_from_fork+0x10/0x20

Bug: 254441685
Fixes: ed975065c3 ("scsi: ufs: core: mcq: Add completion support in poll")
Reviewed-by: Can Guo <quic_cang@quicinc.com>
Reviewed-by: Stanley Chu <stanley.chu@mediatek.com>
Signed-off-by: Alice Chao <alice.chao@mediatek.com>
Link: https://lore.kernel.org/r/20230424080400.8955-1-alice.chao@mediatek.com
Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(cherry picked from commit 948afc6961)
Signed-off-by: Lee Jones <joneslee@google.com>
Change-Id: If4af26c78561e0fd3f92bd039976380617cc3550
2023-07-03 19:15:07 +00:00
Lee Jones
94fb13dc4f UPSTREAM: x86/mm: Avoid using set_pgd() outside of real PGD pages
commit d082d48737 upstream.

KPTI keeps around two PGDs: one for userspace and another for the
kernel. Among other things, set_pgd() contains infrastructure to
ensure that updates to the kernel PGD are reflected in the user PGD
as well.

One side-effect of this is that set_pgd() expects to be passed whole
pages.  Unfortunately, init_trampoline_kaslr() passes in a single entry:
'trampoline_pgd_entry'.

When KPTI is on, set_pgd() will update 'trampoline_pgd_entry' (an
8-Byte globally stored [.bss] variable) and will then proceed to
replicate that value into the non-existent neighboring user page
(located +4k away), leading to the corruption of other global [.bss]
stored variables.

Fix it by directly assigning 'trampoline_pgd_entry' and avoiding
set_pgd().

[ dhansen: tweak subject and changelog ]

Bug: 274115504
Fixes: 0925dda596 ("x86/mm/KASLR: Use only one PUD entry for real mode trampoline")
Suggested-by: Dave Hansen <dave.hansen@linux.intel.com>
Signed-off-by: Lee Jones <lee@kernel.org>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Cc: <stable@vger.kernel.org>
Link: https://lore.kernel.org/all/20230614163859.924309-1-lee@kernel.org/g
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit 364fdcbb03)
Signed-off-by: Lee Jones <joneslee@google.com>
Change-Id: Idc1fc494d7ccb4a8a3765e1f46482583b528a584
2023-07-03 19:14:28 +00:00
Pablo Neira Ayuso
759c5c3fc2 UPSTREAM: netfilter: nf_tables: incorrect error path handling with NFT_MSG_NEWRULE
[ Upstream commit 1240eb93f0 ]

In case of error when adding a new rule that refers to an anonymous set,
deactivate expressions via NFT_TRANS_PREPARE state, not NFT_TRANS_RELEASE.
Thus, the lookup expression marks anonymous sets as inactive in the next
generation to ensure it is not reachable in this transaction anymore and
decrement the set refcount as introduced by c1592a8994 ("netfilter:
nf_tables: deactivate anonymous set from preparation phase"). The abort
step takes care of undoing the anonymous set.

This is also consistent with rule deletion, where NFT_TRANS_PREPARE is
used. Note that this error path is exercised in the preparation step of
the commit protocol. This patch replaces nf_tables_rule_release() by the
deactivate and destroy calls, this time with NFT_TRANS_PREPARE.

Due to this incorrect error handling, it is possible to access a
dangling pointer to the anonymous set that remains in the transaction
list.

[1009.379054] BUG: KASAN: use-after-free in nft_set_lookup_global+0x147/0x1a0 [nf_tables]
[1009.379106] Read of size 8 at addr ffff88816c4c8020 by task nft-rule-add/137110
[1009.379116] CPU: 7 PID: 137110 Comm: nft-rule-add Not tainted 6.4.0-rc4+ #256
[1009.379128] Call Trace:
[1009.379132]  <TASK>
[1009.379135]  dump_stack_lvl+0x33/0x50
[1009.379146]  ? nft_set_lookup_global+0x147/0x1a0 [nf_tables]
[1009.379191]  print_address_description.constprop.0+0x27/0x300
[1009.379201]  kasan_report+0x107/0x120
[1009.379210]  ? nft_set_lookup_global+0x147/0x1a0 [nf_tables]
[1009.379255]  nft_set_lookup_global+0x147/0x1a0 [nf_tables]
[1009.379302]  nft_lookup_init+0xa5/0x270 [nf_tables]
[1009.379350]  nf_tables_newrule+0x698/0xe50 [nf_tables]
[1009.379397]  ? nf_tables_rule_release+0xe0/0xe0 [nf_tables]
[1009.379441]  ? kasan_unpoison+0x23/0x50
[1009.379450]  nfnetlink_rcv_batch+0x97c/0xd90 [nfnetlink]
[1009.379470]  ? nfnetlink_rcv_msg+0x480/0x480 [nfnetlink]
[1009.379485]  ? __alloc_skb+0xb8/0x1e0
[1009.379493]  ? __alloc_skb+0xb8/0x1e0
[1009.379502]  ? entry_SYSCALL_64_after_hwframe+0x46/0xb0
[1009.379509]  ? unwind_get_return_address+0x2a/0x40
[1009.379517]  ? write_profile+0xc0/0xc0
[1009.379524]  ? avc_lookup+0x8f/0xc0
[1009.379532]  ? __rcu_read_unlock+0x43/0x60

Bug: 289230343
Fixes: 958bee14d0 ("netfilter: nf_tables: use new transaction infrastructure to handle sets")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
(cherry picked from commit 4aaa3b730d)
Signed-off-by: Lee Jones <joneslee@google.com>
Change-Id: Ia62fea0e2c2c2cf944dde80751a9dfb85108e758
2023-07-03 19:13:36 +00:00
Hangyu Hua
be89d165e3 UPSTREAM: net/sched: flower: fix possible OOB write in fl_set_geneve_opt()
[ Upstream commit 4d56304e58 ]

If we send two TCA_FLOWER_KEY_ENC_OPTS_GENEVE packets and their total
size is 252 bytes(key->enc_opts.len = 252) then
key->enc_opts.len = opt->length = data_len / 4 = 0 when the third
TCA_FLOWER_KEY_ENC_OPTS_GENEVE packet enters fl_set_geneve_opt. This
bypasses the next bounds check and results in an out-of-bounds write.
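
One way to picture how a length field can be clobbered at offset 252 is the sketch below. It assumes a simplified layout in which a 255-byte option buffer is followed directly by a one-byte len field, and a 4-byte option header whose last byte carries the length. Both structs and both helpers are hypothetical stand-ins, not the kernel's definitions, and header_fits() only illustrates a possible bounds check rather than the actual fix.

```c
#include <stddef.h>
#include <stdint.h>

#define OPTS_MAX 255	/* assumed capacity of the option buffer */

/* Hypothetical, simplified stand-in for the tunnel option storage. */
struct enc_opts_sketch {
	uint8_t data[OPTS_MAX];
	uint8_t len;		/* placed right after data[] in this sketch */
};

/* Hypothetical 4-byte option header; the last byte holds the length. */
struct opt_hdr_sketch {
	uint16_t opt_class;
	uint8_t  type;
	uint8_t  length;
};

/* Offset of the header's length byte when the header starts at `used`. */
static size_t length_byte_offset(size_t used)
{
	return used + offsetof(struct opt_hdr_sketch, length);
}

/* Returns 1 when a whole header still fits inside data[] at `used`. */
static int header_fits(size_t used)
{
	return used <= OPTS_MAX - sizeof(struct opt_hdr_sketch);
}
```

Under these assumptions, a header placed at offset 252 has its length byte at offset 255, which is where len lives, so writing the option length overwrites len; header_fits(252) returns 0 and would reject that placement.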

Bug: 288660424
Fixes: 0a6e77784f ("net/sched: allow flower to match tunnel options")
Signed-off-by: Hangyu Hua <hbh25y@gmail.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Reviewed-by: Pieter Jansen van Vuuren <pieter.jansen-van-vuuren@amd.com>
Link: https://lore.kernel.org/r/20230531102805.27090-1-hbh25y@gmail.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
(cherry picked from commit 45f47d2cf1)
Signed-off-by: Lee Jones <joneslee@google.com>
Change-Id: I53c534b7d43f4c7da5a9f63556c79d35797aa598
2023-07-03 19:13:18 +00:00
Alex Williamson
4ae6b40b7c UPSTREAM: PCI/PM: Extend D3hot delay for NVIDIA HDA controllers
Assignment of NVIDIA Ampere-based GPUs has seen a regression since the
below referenced commit, where the reduced D3hot transition delay appears
to introduce a small window where a D3hot->D0 transition followed by a bus
reset can wedge the device.  The entire device is subsequently unavailable,
returning -1 on config space read and is unrecoverable without a host
reset.

This has been observed with RTX A2000 and A5000 GPU and audio functions
assigned to a Windows VM, where shutdown of the VM places the devices in
D3hot prior to vfio-pci performing a bus reset when userspace releases the
devices.  The issue has roughly a 2-3% chance of occurring per shutdown.

Restoring the HDA controller d3hot_delay to the effective value before the
below commit has been shown to resolve the issue.  NVIDIA confirms this
change should be safe for all of their HDA controllers.

Bug: 254441685
Fixes: 3e347969a5 ("PCI/PM: Reduce D3hot delay with usleep_range()")
Link: https://lore.kernel.org/r/20230413194042.605768-1-alex.williamson@redhat.com
Reported-by: Zhiyi Guo <zhguo@redhat.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Tarun Gupta <targupta@nvidia.com>
Cc: Abhishek Sahu <abhsahu@nvidia.com>
Cc: Tarun Gupta <targupta@nvidia.com>
(cherry picked from commit a5a6dd2624)
Signed-off-by: Lee Jones <joneslee@google.com>
Change-Id: Ie8bb6c852e041ce16b4f9086c42030dc24375602
2023-07-03 15:07:32 +01:00
Johannes Berg
738dfcc029 UPSTREAM: wifi: cfg80211: fix MLO connection ownership
When disconnecting from an MLO connection we need the AP
MLD address, not an arbitrary BSSID. Fix the code to do
that.

Bug: 254441685
Fixes: 9ecff10e82 ("wifi: nl80211: refactor BSS lookup in nl80211_associate()")
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Gregory Greenman <gregory.greenman@intel.com>
Link: https://lore.kernel.org/r/20230301115906.4c1b3b18980e.I008f070c7f3b8e8bde9278101ef9e40706a82902@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
(cherry picked from commit 96c0695083)
Signed-off-by: Lee Jones <joneslee@google.com>
Change-Id: I87b3b861644d89f9d4e7274a56cdc68657a1db40
2023-07-03 15:06:40 +01:00
Johannes Berg
d0e0e85d34 UPSTREAM: wifi: nl80211: fix NULL-ptr deref in offchan check
If, e.g. in AP mode, the link was already created by userspace
but not activated yet, it has a chandef but the chandef isn't
valid and has no channel. Check for this and ignore this link.

Bug: 254441685
Fixes: 7b0a0e3c3a ("wifi: cfg80211: do some rework towards MLO link APIs")
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Gregory Greenman <gregory.greenman@intel.com>
Link: https://lore.kernel.org/r/20230301115906.71bd4803fbb9.Iee39c0f6c2d3a59a8227674dc55d52e38b1090cf@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
(cherry picked from commit f624bb6fad)
Signed-off-by: Lee Jones <joneslee@google.com>
Change-Id: I65e12c70a6142b73e98fa716586271c1ddf4ad56
2023-07-03 15:06:34 +01:00
Asutosh Das
9e7678cc60 UPSTREAM: scsi: ufs: mcq: Use active_reqs to check busy in clock scaling
Multi Circular Queue doesn't use outstanding_reqs. However, the UFS clock
scaling functions use outstanding_reqs to determine if there are requests
pending. When MCQ is enabled, this check always returns false.

Hence use active_reqs to check if there are pending requests.

Bug: 254441685
Fixes: eacb139b77 ("scsi: ufs: core: mcq: Enable multi-circular queue")
Signed-off-by: Asutosh Das <quic_asutoshd@quicinc.com>
Reviewed-by: Manivannan Sadhasivam <mani@kernel.org>
Link: https://lore.kernel.org/r/a24e0d646aac70eae0fc5e05fac0c58bb7e6e680.1678317160.git.quic_asutoshd@quicinc.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(cherry picked from commit c6001025d5)
Signed-off-by: Lee Jones <joneslee@google.com>
Change-Id: I5b5b5ab61971c60980ed66d05b026fad7100fade
2023-07-03 15:06:27 +01:00
Asutosh Das
9d0d5eacda UPSTREAM: scsi: ufs: mcq: qcom: Clean the return path of ufs_qcom_mcq_config_resource()
Smatch static checker reported:
drivers/ufs/host/ufs-qcom.c:1469
ufs_qcom_mcq_config_resource() info: returning a literal zero is
cleaner

Fix the above warning by returning in place instead of a jump to a label.
Also remove the usage of devm_kfree() as it's unnecessary in this function.

Bug: 254441685
Fixes: c263b4ef73 ("scsi: ufs: core: mcq: Configure resource regions")
Reported-by: Dan Carpenter <error27@gmail.com>
Signed-off-by: Asutosh Das <quic_asutoshd@quicinc.com>
Link: https://lore.kernel.org/r/3ebd2582af74b81ef7b57149f57c6a3bf0963953.1677721229.git.quic_asutoshd@quicinc.com
Reviewed-by: Bjorn Andersson <andersson@kernel.org>
Reviewed-by: Manivannan Sadhasivam <mani@kernel.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(cherry picked from commit c9507eab9f)
Signed-off-by: Lee Jones <joneslee@google.com>
Change-Id: I1cd2d0455ed5c5ce8338a34de9ad7f1baf680a05
2023-07-03 15:06:22 +01:00
Asutosh Das
fa5c4a2186 UPSTREAM: scsi: ufs: mcq: qcom: Fix passing zero to PTR_ERR
Fix an error case in ufs_qcom_mcq_config_resource(), where the return value
is set to 0 before passing it to PTR_ERR.

This led to Smatch warning:

drivers/ufs/host/ufs-qcom.c:1455 ufs_qcom_mcq_config_resource() warn:
passing zero to 'PTR_ERR'
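
One way such a warning can arise is sketched below in plain userspace C. The ERR_PTR/PTR_ERR/IS_ERR helpers are simplified stand-ins for the kernel macros, and struct region, map_region(), config_buggy() and config_fixed() are hypothetical names for illustration only, not the ufs-qcom code.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified userspace stand-ins for the kernel's error-pointer helpers. */
#define MAX_ERRNO 4095

static inline void *ERR_PTR(long error) { return (void *)(intptr_t)error; }
static inline long PTR_ERR(const void *ptr) { return (long)(intptr_t)ptr; }
static inline int IS_ERR(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)(-MAX_ERRNO);
}

/* Hypothetical resource descriptor. */
struct region {
	void *base;
};

/* Hypothetical mapping helper that always fails with -ENOMEM. */
static void *map_region(void)
{
	return ERR_PTR(-ENOMEM);
}

/* Buggy pattern: the pointer is reset before PTR_ERR() reads it. */
static int config_buggy(struct region *r)
{
	r->base = map_region();
	if (IS_ERR(r->base)) {
		r->base = NULL;			/* clobbers the encoded errno */
		return (int)PTR_ERR(r->base);	/* PTR_ERR(NULL) is 0 */
	}
	return 0;
}

/* Fixed pattern: extract the errno first, then reset the pointer. */
static int config_fixed(struct region *r)
{
	r->base = map_region();
	if (IS_ERR(r->base)) {
		int ret = (int)PTR_ERR(r->base);

		r->base = NULL;
		return ret;
	}
	return 0;
}
```

In this sketch the buggy variant reports success even though the mapping failed, which is the kind of silent error loss the Smatch check is designed to flag.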

Bug: 254441685
Fixes: c263b4ef73 ("scsi: ufs: core: mcq: Configure resource regions")
Reported-by: Dan Carpenter <error27@gmail.com>
Signed-off-by: Asutosh Das <quic_asutoshd@quicinc.com>
Link: https://lore.kernel.org/r/94ca99b327af634799ce5f25d0112c28cd00970d.1677721072.git.quic_asutoshd@quicinc.com
Reviewed-by: Bjorn Andersson <andersson@kernel.org>
Reviewed-by: Manivannan Sadhasivam <mani@kernel.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(cherry picked from commit c8be073bd2)
Signed-off-by: Lee Jones <joneslee@google.com>
Change-Id: Icbc0e4ac1e30b70c33a7bde8f5e234c8b3bb1a45
2023-07-03 15:04:24 +01:00
Asutosh Das
63ab8dfd17 UPSTREAM: scsi: ufs: mcq: Fix incorrectly set queue depth
ufshcd_config_mcq() may change the can_queue value. The current code
invokes scsi_add_host() before ufshcd_config_mcq() so the tags are
limited to the original can_queue value.

Fix this by invoking scsi_add_host() after ufshcd_config_mcq().
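
The ordering problem can be illustrated with a small userspace model. This is only a sketch: struct host, add_host() and config_mcq() are hypothetical stand-ins for the SCSI host, scsi_add_host() and ufshcd_config_mcq(), and the assumption that registration sizes the tag set from can_queue at call time is taken from the description above.

```c
/* Hypothetical, heavily simplified host model; not the real struct Scsi_Host. */
struct host {
	int can_queue;	/* queue depth the driver advertises */
	int nr_tags;	/* tags sized when the host is registered */
};

/* Stand-in for host registration: snapshots can_queue at call time. */
static void add_host(struct host *h)
{
	h->nr_tags = h->can_queue;
}

/* Stand-in for MCQ configuration: may change can_queue. */
static void config_mcq(struct host *h, int mcq_depth)
{
	h->can_queue = mcq_depth;
}

/* Old order: register first, so the tag count misses the MCQ depth. */
static int probe_old_order(int initial_depth, int mcq_depth)
{
	struct host h = { .can_queue = initial_depth, .nr_tags = 0 };

	add_host(&h);
	config_mcq(&h, mcq_depth);
	return h.nr_tags;
}

/* New order: configure MCQ first, then register with the final depth. */
static int probe_new_order(int initial_depth, int mcq_depth)
{
	struct host h = { .can_queue = initial_depth, .nr_tags = 0 };

	config_mcq(&h, mcq_depth);
	add_host(&h);
	return h.nr_tags;
}
```

With an initial depth of 32 and an MCQ depth of 64, the old order leaves the model with 32 tags while the new order yields 64.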

Bug: 254441685
Link: https://lore.kernel.org/r/8840cea4a57b46dabce18acc39afc50ab826330f.1676567593.git.quic_asutoshd@quicinc.com
Fixes: 2468da61ea ("scsi: ufs: core: mcq: Configure operation and runtime interface")
Signed-off-by: Asutosh Das <quic_asutoshd@quicinc.com>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Reviewed-by: Manivannan Sadhasivam <mani@kernel.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(cherry picked from commit 2076f57f2c)
Signed-off-by: Lee Jones <joneslee@google.com>
Change-Id: I1bf99387e7d52f7739a64175ff48d30ac61ed39c
2023-07-03 15:04:11 +01:00
Eric Dumazet
6423bd5a46 UPSTREAM: net: use a bounce buffer for copying skb->mark
syzbot found arm64 builds would crash in sock_recv_mark()
when CONFIG_HARDENED_USERCOPY=y

x86 and powerpc are not detecting the issue because
they define user_access_begin.
This will be handled in a different patch,
because a check_object_size() is missing.

Only data from skb->cb[] can be copied directly to/from user space,
as explained in commit 79a8a642bf ("net: Whitelist
the skbuff_head_cache "cb" field")
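
The idea of a bounce buffer can be sketched in userspace as follows. This is a toy model, not the kernel implementation: struct fake_skb, copy_out() and the two recv_mark_* helpers are hypothetical, and the check only mimics the rule stated above that cb[] is the sole region of the object that may be copied out directly.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical miniature of an skb: only cb[] is whitelisted for copy-out. */
struct fake_skb {
	unsigned char cb[48];
	uint32_t mark;
};

/*
 * Toy check in the spirit of hardened usercopy: if the source lies inside
 * the object, the whole range must fall within cb[]; otherwise refuse.
 */
static int copy_out(void *dst, const struct fake_skb *skb,
		    const void *src, size_t len)
{
	uintptr_t obj = (uintptr_t)skb;
	uintptr_t p = (uintptr_t)src;

	if (p >= obj && p < obj + sizeof(*skb)) {
		size_t off = (size_t)(p - obj);
		size_t lo = offsetof(struct fake_skb, cb);
		size_t hi = lo + sizeof(skb->cb);

		if (off < lo || off + len > hi)
			return -EFAULT;
	}
	memcpy(dst, src, len);
	return 0;
}

/* Direct copy from skb->mark: rejected, the field is outside cb[]. */
static int recv_mark_direct(const struct fake_skb *skb, uint32_t *out)
{
	return copy_out(out, skb, &skb->mark, sizeof(skb->mark));
}

/* Bounce buffer: copy the field to a local first, then copy that out. */
static int recv_mark_bounced(const struct fake_skb *skb, uint32_t *out)
{
	uint32_t mark = skb->mark;

	return copy_out(out, skb, &mark, sizeof(mark));
}
```

The local variable costs one extra 4-byte copy but keeps the copy-out source outside the slab object, so no whitelist change is needed.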

syzbot report was:
usercopy: Kernel memory exposure attempt detected from SLUB object 'skbuff_head_cache' (offset 168, size 4)!
------------[ cut here ]------------
kernel BUG at mm/usercopy.c:102 !
Internal error: Oops - BUG: 00000000f2000800 [#1] PREEMPT SMP
Modules linked in:
CPU: 0 PID: 4410 Comm: syz-executor533 Not tainted 6.2.0-rc7-syzkaller-17907-g2d3827b3f393 #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/21/2023
pstate: 60400005 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
pc : usercopy_abort+0x90/0x94 mm/usercopy.c:90
lr : usercopy_abort+0x90/0x94 mm/usercopy.c:90
sp : ffff80000fb9b9a0
x29: ffff80000fb9b9b0 x28: ffff0000c6073400 x27: 0000000020001a00
x26: 0000000000000014 x25: ffff80000cf52000 x24: fffffc0000000000
x23: 05ffc00000000200 x22: fffffc000324bf80 x21: ffff0000c92fe1a8
x20: 0000000000000001 x19: 0000000000000004 x18: 0000000000000000
x17: 656a626f2042554c x16: ffff0000c6073dd0 x15: ffff80000dbd2118
x14: ffff0000c6073400 x13: 00000000ffffffff x12: ffff0000c6073400
x11: ff808000081bbb4c x10: 0000000000000000 x9 : 7b0572d7cc0ccf00
x8 : 7b0572d7cc0ccf00 x7 : ffff80000bf650d4 x6 : 0000000000000000
x5 : 0000000000000001 x4 : 0000000000000001 x3 : 0000000000000000
x2 : ffff0001fefbff08 x1 : 0000000100000000 x0 : 000000000000006c
Call trace:
usercopy_abort+0x90/0x94 mm/usercopy.c:90
__check_heap_object+0xa8/0x100 mm/slub.c:4761
check_heap_object mm/usercopy.c:196 [inline]
__check_object_size+0x208/0x6b8 mm/usercopy.c:251
check_object_size include/linux/thread_info.h:199 [inline]
__copy_to_user include/linux/uaccess.h:115 [inline]
put_cmsg+0x408/0x464 net/core/scm.c:238
sock_recv_mark net/socket.c:975 [inline]
__sock_recv_cmsgs+0x1fc/0x248 net/socket.c:984
sock_recv_cmsgs include/net/sock.h:2728 [inline]
packet_recvmsg+0x2d8/0x678 net/packet/af_packet.c:3482
____sys_recvmsg+0x110/0x3a0
___sys_recvmsg net/socket.c:2737 [inline]
__sys_recvmsg+0x194/0x210 net/socket.c:2767
__do_sys_recvmsg net/socket.c:2777 [inline]
__se_sys_recvmsg net/socket.c:2774 [inline]
__arm64_sys_recvmsg+0x2c/0x3c net/socket.c:2774
__invoke_syscall arch/arm64/kernel/syscall.c:38 [inline]
invoke_syscall+0x64/0x178 arch/arm64/kernel/syscall.c:52
el0_svc_common+0xbc/0x180 arch/arm64/kernel/syscall.c:142
do_el0_svc+0x48/0x110 arch/arm64/kernel/syscall.c:193
el0_svc+0x58/0x14c arch/arm64/kernel/entry-common.c:637
el0t_64_sync_handler+0x84/0xf0 arch/arm64/kernel/entry-common.c:655
el0t_64_sync+0x190/0x194 arch/arm64/kernel/entry.S:591
Code: 91388800 aa0903e1 f90003e8 94e6d752 (d4210000)

Bug: 254441685
Fixes: 6fd1d51cfa ("net: SO_RCVMARK socket option for SO_MARK with recvmsg()")
Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Erin MacNeil <lnx.erin@gmail.com>
Reviewed-by: Alexander Lobakin <alexandr.lobakin@intel.com>
Link: https://lore.kernel.org/r/20230213160059.3829741-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
(cherry picked from commit 2558b8039d)
Signed-off-by: Lee Jones <joneslee@google.com>
Change-Id: I5efc36c872cc640429a8ef538eb5ce043fc8dbb2
2023-07-03 15:04:00 +01:00
Jens Axboe
656563759a UPSTREAM: io_uring: hold uring mutex around poll removal
Snipped from commit 9ca9fb24d5 upstream.

While reworking the poll hashing in the v6.0 kernel, we ended up
grabbing the ctx->uring_lock in poll update/removal. This also fixed
a bug with linked timeouts racing with timeout expiry and poll
removal.

Bring back just the locking fix for that.

Bug: 289229683
Reported-and-tested-by: Querijn Voet <querijnqyn@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit 0e388fce7a)
Signed-off-by: Lee Jones <joneslee@google.com>
Change-Id: Ife3683f26b19af1887ae1c59d3bd8b4e1700c79a
2023-07-03 14:41:04 +01:00
Ulises Mendez Martinez
1f5a89e0cc ANDROID: Set arch attribute for allmodconfig builds
* This sets arch attribute for two builds:
  * kernel_x86_64_allmodconfig
  * kernel_arm_allmodconfig

Bug: 272164611
Change-Id: Ica02082ef53e1b08523b47b879716e94330fe5c4
Signed-off-by: Ulises Mendez Martinez <umendez@google.com>
2023-06-30 15:56:09 +00:00
Will Deacon
ceb26af319 ANDROID: KVM: arm64: Remove 'struct kvm_vcpu' from the KMI
With the addition of 'struct pkvm_module_ops' to the Android-14 KMI, we
are inadvertently exposing a number of internal KVM data structures via the
unused '__hyp_running_vcpu' member of 'struct kvm_cpu_context'.

Fix up the KMI by making this field a 'void *' for everybody other than
genksyms.

Cc: Matthias Männich <maennich@google.com>
Cc: Quentin Perret <qperret@google.com>
Cc: Vincent Donnefort <vdonnefort@google.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Bug: 288146090
Signed-off-by: Will Deacon <willdeacon@google.com>
Change-Id: I54b7fe055830e22e6118779617de2d9259501833
2023-06-29 15:29:48 +00:00
Marc Zyngier
aad223db39 UPSTREAM: KVM: arm64: Restore GICv2-on-GICv3 functionality
When reworking the vgic locking, the vgic distributor registration
got simplified, which was a very good cleanup. But just a tad too
radical, as we now register the *native* vgic only, ignoring the
GICv2-on-GICv3 that allows pre-historic VMs (or so I thought)
to run.

As it turns out, QEMU still defaults to GICv2 in some cases, and
this breaks Nathan's setup!

Fix it by propagating the *requested* vgic type rather than the
host's version.

Fixes: 59112e9c39 ("KVM: arm64: vgic: Fix a circular locking issue")
Reported-by: Nathan Chancellor <nathan@kernel.org>
Tested-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20230606221525.GA2269598@dev-arch.thelio-3990X
(cherry picked from commit 1caa71a7a6)
Signed-off-by: Will Deacon <willdeacon@google.com>
Bug: 278750073
Change-Id: I3a74c9de0afd9a38f4ca8dd5d4ce27d1937a5705
2023-06-29 14:02:33 +01:00
Jean-Philippe Brucker
2c17fbc0d9 UPSTREAM: KVM: arm64: vgic: Wrap vgic_its_create() with config_lock
vgic_its_create() changes the vgic state without holding the
config_lock, which triggers a lockdep warning in vgic_v4_init():

[  358.667941] WARNING: CPU: 3 PID: 178 at arch/arm64/kvm/vgic/vgic-v4.c:245 vgic_v4_init+0x15c/0x7a8
...
[  358.707410]  vgic_v4_init+0x15c/0x7a8
[  358.708550]  vgic_its_create+0x37c/0x4a4
[  358.709640]  kvm_vm_ioctl+0x1518/0x2d80
[  358.710688]  __arm64_sys_ioctl+0x7ac/0x1ba8
[  358.711960]  invoke_syscall.constprop.0+0x70/0x1e0
[  358.713245]  do_el0_svc+0xe4/0x2d4
[  358.714289]  el0_svc+0x44/0x8c
[  358.715329]  el0t_64_sync_handler+0xf4/0x120
[  358.716615]  el0t_64_sync+0x190/0x194

Wrap the whole of vgic_its_create() with config_lock since, in addition
to calling vgic_v4_init(), it also modifies the global kvm->arch.vgic
state.

Fixes: f003277311 ("KVM: arm64: Use config_lock to protect vgic state")
Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Reviewed-by: Oliver Upton <oliver.upton@linux.dev>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20230518100914.2837292-3-jean-philippe@linaro.org
(cherry picked from commit 9cf2f840c4)
Signed-off-by: Will Deacon <willdeacon@google.com>
Bug: 278750073
Change-Id: Id6319d5719181072b7202a814c71e9509c0ba865
2023-06-29 14:02:32 +01:00
Jean-Philippe Brucker
ec0944c324 UPSTREAM: KVM: arm64: vgic: Fix a circular locking issue
Lockdep reports a circular lock dependency between the srcu and the
config_lock:

[  262.179917] -> #1 (&kvm->srcu){.+.+}-{0:0}:
[  262.182010]        __synchronize_srcu+0xb0/0x224
[  262.183422]        synchronize_srcu_expedited+0x24/0x34
[  262.184554]        kvm_io_bus_register_dev+0x324/0x50c
[  262.185650]        vgic_register_redist_iodev+0x254/0x398
[  262.186740]        vgic_v3_set_redist_base+0x3b0/0x724
[  262.188087]        kvm_vgic_addr+0x364/0x600
[  262.189189]        vgic_set_common_attr+0x90/0x544
[  262.190278]        vgic_v3_set_attr+0x74/0x9c
[  262.191432]        kvm_device_ioctl+0x2a0/0x4e4
[  262.192515]        __arm64_sys_ioctl+0x7ac/0x1ba8
[  262.193612]        invoke_syscall.constprop.0+0x70/0x1e0
[  262.195006]        do_el0_svc+0xe4/0x2d4
[  262.195929]        el0_svc+0x44/0x8c
[  262.196917]        el0t_64_sync_handler+0xf4/0x120
[  262.198238]        el0t_64_sync+0x190/0x194
[  262.199224]
[  262.199224] -> #0 (&kvm->arch.config_lock){+.+.}-{3:3}:
[  262.201094]        __lock_acquire+0x2b70/0x626c
[  262.202245]        lock_acquire+0x454/0x778
[  262.203132]        __mutex_lock+0x190/0x8b4
[  262.204023]        mutex_lock_nested+0x24/0x30
[  262.205100]        vgic_mmio_write_v3_misc+0x5c/0x2a0
[  262.206178]        dispatch_mmio_write+0xd8/0x258
[  262.207498]        __kvm_io_bus_write+0x1e0/0x350
[  262.208582]        kvm_io_bus_write+0xe0/0x1cc
[  262.209653]        io_mem_abort+0x2ac/0x6d8
[  262.210569]        kvm_handle_guest_abort+0x9b8/0x1f88
[  262.211937]        handle_exit+0xc4/0x39c
[  262.212971]        kvm_arch_vcpu_ioctl_run+0x90c/0x1c04
[  262.214154]        kvm_vcpu_ioctl+0x450/0x12f8
[  262.215233]        __arm64_sys_ioctl+0x7ac/0x1ba8
[  262.216402]        invoke_syscall.constprop.0+0x70/0x1e0
[  262.217774]        do_el0_svc+0xe4/0x2d4
[  262.218758]        el0_svc+0x44/0x8c
[  262.219941]        el0t_64_sync_handler+0xf4/0x120
[  262.221110]        el0t_64_sync+0x190/0x194

Note that the current report, which can be triggered by the vgic_irq
kselftest, is a triple chain that includes slots_lock, but after
inverting the slots_lock/config_lock dependency, the actual problem
reported above remains.

In several places, the vgic code calls kvm_io_bus_register_dev(), which
synchronizes the srcu, while holding config_lock (#1). And the MMIO
handler takes the config_lock while holding the srcu read lock (#0).

Break dependency #1, by registering the distributor and redistributors
without holding config_lock. The ITS also uses kvm_io_bus_register_dev()
but already relies on slots_lock to serialize calls.

The distributor iodev is created on the first KVM_RUN call. Multiple
threads will race for vgic initialization, and only the first one will
see !vgic_ready() under the lock. To serialize those threads, rely on
slots_lock rather than config_lock.

Redistributors are created earlier, through KVM_DEV_ARM_VGIC_GRP_ADDR
ioctls and vCPU creation. Similarly, serialize the iodev creation with
slots_lock, and the rest with config_lock.

Fixes: f003277311 ("KVM: arm64: Use config_lock to protect vgic state")
Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Reviewed-by: Oliver Upton <oliver.upton@linux.dev>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20230518100914.2837292-2-jean-philippe@linaro.org
(cherry picked from commit 59112e9c39)
Signed-off-by: Will Deacon <willdeacon@google.com>
Bug: 278750073
Change-Id: Ib3b4846646f148af95746d786fc55b589b3217b6
2023-06-29 14:02:32 +01:00
Oliver Upton
e4b31e748a UPSTREAM: KVM: arm64: vgic: Don't acquire its_lock before config_lock
commit f003277311 ("KVM: arm64: Use config_lock to protect vgic
state") was meant to rectify a longstanding lock ordering issue in KVM
where the kvm->lock is taken while holding vcpu->mutex. As it so
happens, the aforementioned commit introduced yet another locking issue
by acquiring the its_lock before acquiring the config lock.

This is obviously wrong, especially considering that the lock ordering
is well documented in vgic.c. Reshuffle the locks once more to take the
config_lock before the its_lock. While at it, sprinkle in the lockdep
hinting that has become popular as of late to keep lockdep apprised of
our ordering.
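
The ordering rule can be illustrated with a toy rank checker in userspace C. This is only a sketch under stated assumptions: the ranks, struct lock_tracker and the track_* helpers are hypothetical, and real lockdep works differently (it records dependencies between lock classes rather than fixed ranks).

```c
#include <stdbool.h>

/* Hypothetical ranks: a lower-ranked lock must be taken first. */
enum lock_rank {
	RANK_CONFIG_LOCK = 1,
	RANK_ITS_LOCK = 2,
};

#define MAX_HELD 8

struct lock_tracker {
	enum lock_rank held[MAX_HELD];
	int depth;
	bool violation;
};

/* Record an acquisition; flag it if an equal or higher rank is already held. */
static void track_acquire(struct lock_tracker *t, enum lock_rank rank)
{
	if (t->depth > 0 && t->held[t->depth - 1] >= rank)
		t->violation = true;
	if (t->depth < MAX_HELD)
		t->held[t->depth++] = rank;
}

/* Release the most recently acquired lock (LIFO release is assumed). */
static void track_release(struct lock_tracker *t)
{
	if (t->depth > 0)
		t->depth--;
}

/* Wrong order: its_lock first, then config_lock. Returns true on violation. */
static bool order_its_then_config(void)
{
	struct lock_tracker t = { .depth = 0, .violation = false };

	track_acquire(&t, RANK_ITS_LOCK);
	track_acquire(&t, RANK_CONFIG_LOCK);
	track_release(&t);
	track_release(&t);
	return t.violation;
}

/* Documented order: config_lock first, then its_lock. */
static bool order_config_then_its(void)
{
	struct lock_tracker t = { .depth = 0, .violation = false };

	track_acquire(&t, RANK_CONFIG_LOCK);
	track_acquire(&t, RANK_ITS_LOCK);
	track_release(&t);
	track_release(&t);
	return t.violation;
}
```

Only the first sequence trips the checker, mirroring how a consistent acquisition order rules out an ABBA deadlock between the two locks.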

Cc: stable@vger.kernel.org
Fixes: f003277311 ("KVM: arm64: Use config_lock to protect vgic state")
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20230412062733.988229-1-oliver.upton@linux.dev
(cherry picked from commit 49e5d16b6f)
Signed-off-by: Will Deacon <willdeacon@google.com>
Bug: 278750073
Change-Id: If3a7d338bbcc490a7545ace0a8c039bb5e1dcbf0
2023-06-29 14:02:32 +01:00
Oliver Upton
b7e1f97ef7 BACKPORT: KVM: arm64: Avoid lock inversion when setting the VM register width
kvm->lock must be taken outside of the vcpu->mutex. Of course, the
locking documentation for KVM makes this abundantly clear. Nonetheless,
the locking order in KVM/arm64 has been wrong for quite a while; we
acquire the kvm->lock while holding the vcpu->mutex all over the shop.

All was seemingly fine until commit 42a90008f8 ("KVM: Ensure lockdep
knows about kvm->lock vs. vcpu->mutex ordering rule") caught us with our
pants down, leading to lockdep barfing:

 ======================================================
 WARNING: possible circular locking dependency detected
 6.2.0-rc7+ #19 Not tainted
 ------------------------------------------------------
 qemu-system-aar/859 is trying to acquire lock:
 ffff5aa69269eba0 (&host_kvm->lock){+.+.}-{3:3}, at: kvm_reset_vcpu+0x34/0x274

 but task is already holding lock:
 ffff5aa68768c0b8 (&vcpu->mutex){+.+.}-{3:3}, at: kvm_vcpu_ioctl+0x8c/0xba0

 which lock already depends on the new lock.

Add a dedicated lock to serialize writes to VM-scoped configuration from
the context of a vCPU. Protect the register width flags with the new
lock, thus avoiding the need to grab the kvm->lock while holding
vcpu->mutex in kvm_reset_vcpu().

Cc: stable@vger.kernel.org
Reported-by: Jeremy Linton <jeremy.linton@arm.com>
Link: https://lore.kernel.org/kvmarm/f6452cdd-65ff-34b8-bab0-5c06416da5f6@arm.com/
Tested-by: Jeremy Linton <jeremy.linton@arm.com>
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20230327164747.2466958-3-oliver.upton@linux.dev
(cherry picked from commit c43120afb5)
[willdeacon@: Fix context conflict with pKVM VM type check]
Signed-off-by: Will Deacon <willdeacon@google.com>
Bug: 278750073
Change-Id: I26d65f63a5e56399ffc4d1f74f62e0c15b37eea1
2023-06-29 14:02:32 +01:00
Oliver Upton
0c5ec70ec3 UPSTREAM: KVM: arm64: Avoid vcpu->mutex v. kvm->lock inversion in CPU_ON
KVM/arm64 had the lock ordering backwards on vcpu->mutex and kvm->lock
from the very beginning. One such example is the way vCPU resets are
handled: the kvm->lock is acquired while handling a guest CPU_ON PSCI
call.

Add a dedicated lock to serialize writes to kvm_vcpu_arch::{mp_state,
reset_state}. Promote all accessors of mp_state to {READ,WRITE}_ONCE()
as readers do not acquire the mp_state_lock. While at it, plug yet
another race by taking the mp_state_lock in the KVM_SET_MP_STATE ioctl
handler.

As changes to MP state are now guarded with a dedicated lock, drop the
kvm->lock acquisition from the PSCI CPU_ON path. Similarly, move the
reader of reset_state outside of the kvm->lock and instead protect it
with the mp_state_lock. Note that writes to reset_state::reset have been
demoted to regular stores as both readers and writers acquire the
mp_state_lock.

While the kvm->lock inversion still exists in kvm_reset_vcpu(), at least
now PSCI CPU_ON no longer depends on it for serializing vCPU reset.

Cc: stable@vger.kernel.org
Tested-by: Jeremy Linton <jeremy.linton@arm.com>
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20230327164747.2466958-2-oliver.upton@linux.dev
(cherry picked from commit 0acc7239c2)
Signed-off-by: Will Deacon <willdeacon@google.com>
Bug: 278750073
Change-Id: Iaec5533c5d73195eb5006262e4dcd84454cf5ebe
2023-06-29 14:02:31 +01:00
Oliver Upton
60266126b3 BACKPORT: KVM: arm64: Use config_lock to protect data ordered against KVM_RUN
There are various bits of VM-scoped data that can only be configured
before the first call to KVM_RUN, such as the hypercall bitmaps and
the PMU. As these fields are protected by the kvm->lock and accessed
while holding vcpu->mutex, this is yet another example of lock
inversion.

Change out the kvm->lock for kvm->arch.config_lock in all of these
instances. Opportunistically simplify the locking mechanics of the
PMU configuration by holding the config_lock for the entirety of
kvm_arm_pmu_v3_set_attr().

Note that this also addresses a couple of bugs. There is an unguarded
read of the PMU version in KVM_ARM_VCPU_PMU_V3_FILTER which could race
with KVM_ARM_VCPU_PMU_V3_SET_PMU. Additionally, until now writes to the
per-vCPU vPMU irq were not serialized VM-wide, meaning concurrent calls
to KVM_ARM_VCPU_PMU_V3_IRQ could lead to a false positive in
pmu_irq_is_valid().

Cc: stable@vger.kernel.org
Tested-by: Jeremy Linton <jeremy.linton@arm.com>
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20230327164747.2466958-4-oliver.upton@linux.dev
(cherry picked from commit 4bba7f7def)
[willdeacon@: Fixed context conflict with moved pkvm trap init]
Signed-off-by: Will Deacon <willdeacon@google.com>
Bug: 278750073
Change-Id: Ibafb1b975b48c854ab981c93f74de1ab582c314d
2023-06-29 14:02:31 +01:00
Oliver Upton
1536afa216 UPSTREAM: KVM: arm64: Use config_lock to protect vgic state
Almost all of the vgic state is VM-scoped but accessed from the context
of a vCPU. These accesses were serialized on the kvm->lock which cannot
be nested within a vcpu->mutex critical section.

Move over the vgic state to using the config_lock. Tweak the lock
ordering where necessary to ensure that the config_lock is acquired
after the vcpu->mutex. Acquire the config_lock in kvm_vgic_create() to
avoid a race between the converted flows and GIC creation. Where
necessary, continue to acquire kvm->lock to avoid a race with vCPU
creation (i.e. flows that use lock_all_vcpus()).

Finally, promote the locking expectations in comments to lockdep
assertions and update the locking documentation for the config_lock as
well as vcpu->mutex.

Cc: stable@vger.kernel.org
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20230327164747.2466958-5-oliver.upton@linux.dev
(cherry picked from commit f003277311)
Signed-off-by: Will Deacon <willdeacon@google.com>
Bug: 278750073
Change-Id: I20403cc5b0ba6baff6ca3dd3e8db6f337602821e
2023-06-29 14:02:31 +01:00
Gavin Shan
1d194af64a BACKPORT: KVM: arm64: Add helper vgic_write_guest_lock()
Currently, the unknown no-running-vcpu sites are reported when a
dirty page is tracked by mark_page_dirty_in_slot(). Until now, the
only known no-running-vcpu site is saving vgic/its tables through
KVM_DEV_ARM_{VGIC_GRP_CTRL, ITS_SAVE_TABLES} command on KVM device
"kvm-arm-vgic-its". Unfortunately, there are more unknown sites to
be handled and no-running-vcpu context will be allowed in these
sites: (1) KVM_DEV_ARM_{VGIC_GRP_CTRL, ITS_RESTORE_TABLES} command
on KVM device "kvm-arm-vgic-its" to restore vgic/its tables. The
vgic3 LPI pending status could be restored. (2) Save vgic3 pending
table through KVM_DEV_ARM_{VGIC_GRP_CTRL, VGIC_SAVE_PENDING_TABLES}
command on KVM device "kvm-arm-vgic-v3".

In order to handle those unknown cases, we need a unified helper
vgic_write_guest_lock(). struct vgic_dist::save_its_tables_in_progress
is also renamed to struct vgic_dist::save_tables_in_progress.

No functional change intended.

Suggested-by: Oliver Upton <oliver.upton@linux.dev>
Signed-off-by: Gavin Shan <gshan@redhat.com>
Reviewed-by: Oliver Upton <oliver.upton@linux.dev>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20230126235451.469087-3-gshan@redhat.com
(cherry picked from commit a23eaf9368)
[willdeacon@: Drop missing dirty-ring hunks]
Signed-off-by: Will Deacon <willdeacon@google.com>
Bug: 278750073
Change-Id: Ie0dbb02e4f0f360b7554030e67c80d20ac8c1ca3
2023-06-29 14:02:31 +01:00
t.feng
54b1b225ed UPSTREAM: ipvlan:Fix out-of-bounds caused by unclear skb->cb
[ Upstream commit 90cbed5247 ]

If the skb is enqueued to the qdisc, fq_skb_cb(skb)->time_to_send, which
actually lives in skb->cb, is changed, and IPCB(skb_in)->opt is later used
in __ip_options_echo. The memcpy there can then go out of bounds and lead
to a stack overflow.
We should clear skb->cb before ip_local_out or ip6_local_out.

v2:
1. clean the stack info
2. use IPCB/IP6CB instead of skb->cb

Crash on stable-5.10 (reproduced on a KASAN kernel).
Stack info:
[ 2203.651571] BUG: KASAN: stack-out-of-bounds in
__ip_options_echo+0x589/0x800
[ 2203.653327] Write of size 4 at addr ffff88811a388f27 by task
swapper/3/0
[ 2203.655460] CPU: 3 PID: 0 Comm: swapper/3 Kdump: loaded Not tainted
5.10.0-60.18.0.50.h856.kasan.eulerosv2r11.x86_64 #1
[ 2203.655466] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996),
BIOS rel-1.10.2-0-g5f4c7b1-20181220_000000-szxrtosci10000 04/01/2014
[ 2203.655475] Call Trace:
[ 2203.655481]  <IRQ>
[ 2203.655501]  dump_stack+0x9c/0xd3
[ 2203.655514]  print_address_description.constprop.0+0x19/0x170
[ 2203.655530]  __kasan_report.cold+0x6c/0x84
[ 2203.655586]  kasan_report+0x3a/0x50
[ 2203.655594]  check_memory_region+0xfd/0x1f0
[ 2203.655601]  memcpy+0x39/0x60
[ 2203.655608]  __ip_options_echo+0x589/0x800
[ 2203.655654]  __icmp_send+0x59a/0x960
[ 2203.655755]  nf_send_unreach+0x129/0x3d0 [nf_reject_ipv4]
[ 2203.655763]  reject_tg+0x77/0x1bf [ipt_REJECT]
[ 2203.655772]  ipt_do_table+0x691/0xa40 [ip_tables]
[ 2203.655821]  nf_hook_slow+0x69/0x100
[ 2203.655828]  __ip_local_out+0x21e/0x2b0
[ 2203.655857]  ip_local_out+0x28/0x90
[ 2203.655868]  ipvlan_process_v4_outbound+0x21e/0x260 [ipvlan]
[ 2203.655931]  ipvlan_xmit_mode_l3+0x3bd/0x400 [ipvlan]
[ 2203.655967]  ipvlan_queue_xmit+0xb3/0x190 [ipvlan]
[ 2203.655977]  ipvlan_start_xmit+0x2e/0xb0 [ipvlan]
[ 2203.655984]  xmit_one.constprop.0+0xe1/0x280
[ 2203.655992]  dev_hard_start_xmit+0x62/0x100
[ 2203.656000]  sch_direct_xmit+0x215/0x640
[ 2203.656028]  __qdisc_run+0x153/0x1f0
[ 2203.656069]  __dev_queue_xmit+0x77f/0x1030
[ 2203.656173]  ip_finish_output2+0x59b/0xc20
[ 2203.656244]  __ip_finish_output.part.0+0x318/0x3d0
[ 2203.656312]  ip_finish_output+0x168/0x190
[ 2203.656320]  ip_output+0x12d/0x220
[ 2203.656357]  __ip_queue_xmit+0x392/0x880
[ 2203.656380]  __tcp_transmit_skb+0x1088/0x11c0
[ 2203.656436]  __tcp_retransmit_skb+0x475/0xa30
[ 2203.656505]  tcp_retransmit_skb+0x2d/0x190
[ 2203.656512]  tcp_retransmit_timer+0x3af/0x9a0
[ 2203.656519]  tcp_write_timer_handler+0x3ba/0x510
[ 2203.656529]  tcp_write_timer+0x55/0x180
[ 2203.656542]  call_timer_fn+0x3f/0x1d0
[ 2203.656555]  expire_timers+0x160/0x200
[ 2203.656562]  run_timer_softirq+0x1f4/0x480
[ 2203.656606]  __do_softirq+0xfd/0x402
[ 2203.656613]  asm_call_irq_on_stack+0x12/0x20
[ 2203.656617]  </IRQ>
[ 2203.656623]  do_softirq_own_stack+0x37/0x50
[ 2203.656631]  irq_exit_rcu+0x134/0x1a0
[ 2203.656639]  sysvec_apic_timer_interrupt+0x36/0x80
[ 2203.656646]  asm_sysvec_apic_timer_interrupt+0x12/0x20
[ 2203.656654] RIP: 0010:default_idle+0x13/0x20
[ 2203.656663] Code: 89 f0 5d 41 5c 41 5d 41 5e c3 cc cc cc cc cc cc cc
cc cc cc cc cc cc 0f 1f 44 00 00 0f 1f 44 00 00 0f 00 2d 9f 32 57 00 fb
f4 <c3> cc cc cc cc 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 41 54 be 08
[ 2203.656668] RSP: 0018:ffff88810036fe78 EFLAGS: 00000256
[ 2203.656676] RAX: ffffffffaf2a87f0 RBX: ffff888100360000 RCX:
ffffffffaf290191
[ 2203.656681] RDX: 0000000000098b5e RSI: 0000000000000004 RDI:
ffff88811a3c4f60
[ 2203.656686] RBP: 0000000000000000 R08: 0000000000000001 R09:
ffff88811a3c4f63
[ 2203.656690] R10: ffffed10234789ec R11: 0000000000000001 R12:
0000000000000003
[ 2203.656695] R13: ffff888100360000 R14: 0000000000000000 R15:
0000000000000000
[ 2203.656729]  default_idle_call+0x5a/0x150
[ 2203.656735]  cpuidle_idle_call+0x1c6/0x220
[ 2203.656780]  do_idle+0xab/0x100
[ 2203.656786]  cpu_startup_entry+0x19/0x20
[ 2203.656793]  secondary_startup_64_no_verify+0xc2/0xcb

[ 2203.657409] The buggy address belongs to the page:
[ 2203.658648] page:0000000027a9842f refcount:1 mapcount:0
mapping:0000000000000000 index:0x0 pfn:0x11a388
[ 2203.658665] flags:
0x17ffffc0001000(reserved|node=0|zone=2|lastcpupid=0x1fffff)
[ 2203.658675] raw: 0017ffffc0001000 ffffea000468e208 ffffea000468e208
0000000000000000
[ 2203.658682] raw: 0000000000000000 0000000000000000 00000001ffffffff
0000000000000000
[ 2203.658686] page dumped because: kasan: bad access detected

To reproduce (ipvlan with IPVLAN_MODE_L3):
Env setting:
=======================================================
modprobe ipvlan ipvlan_default_mode=1
sysctl net.ipv4.conf.eth0.forwarding=1
iptables -t nat -A POSTROUTING -s 20.0.0.0/255.255.255.0 -o eth0 -j MASQUERADE
ip link add gw link eth0 type ipvlan
ip -4 addr add 20.0.0.254/24 dev gw
ip netns add net1
ip link add ipv1 link eth0 type ipvlan
ip link set ipv1 netns net1
ip netns exec net1 ip link set ipv1 up
ip netns exec net1 ip -4 addr add 20.0.0.4/24 dev ipv1
ip netns exec net1 route add default gw 20.0.0.254
ip netns exec net1 tc qdisc add dev ipv1 root netem loss 10%
ifconfig gw up
iptables -t filter -A OUTPUT -p tcp --dport 8888 -j REJECT --reject-with icmp-port-unreachable
=======================================================
Then execute this shell loop (curl any address that eth0 can reach):

for((i=1;i<=100000;i++))
do
        ip netns exec net1 curl x.x.x.x:8888
done
=======================================================

Bug: 289225588
Fixes: 2ad7bf3638 ("ipvlan: Initial check-in of the IPVLAN driver.")
Signed-off-by: "t.feng" <fengtao40@huawei.com>
Suggested-by: Florian Westphal <fw@strlen.de>
Reviewed-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
(cherry picked from commit 610a433810)
Signed-off-by: Lee Jones <joneslee@google.com>
Change-Id: I08a12f6e3b1614210867cd23e9071918dc380faf
2023-06-28 17:36:46 +01:00
Lee Jones
b31675307e UPSTREAM: net/sched: cls_u32: Fix reference counter leak leading to overflow
[ Upstream commit 04c55383fa ]

In the event of a failure in tcf_change_indev(), u32_set_parms() will
immediately return without decrementing the recently incremented
reference counter.  If this happens enough times, the counter will
roll over and the reference will be freed, leading to a double free which can be
used to do 'bad things'.

In order to prevent this, move the point of possible failure above the
point where the reference counter is incremented.  Also save any
meaningful return values to be applied to the return data at the
appropriate point in time.
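
The reordering described above can be sketched in userspace C. This is a simplified illustration, not the cls_u32 code: struct table, change_indev() and the two set_parms_* functions are hypothetical stand-ins for the reference-counted object, tcf_change_indev() and u32_set_parms().

```c
#include <errno.h>

/* Hypothetical reference-counted object. */
struct table {
	int refcnt;
};

/* Stand-in for a step that can fail (tcf_change_indev() in the commit). */
static int change_indev(int should_fail)
{
	return should_fail ? -EINVAL : 0;
}

/* Leaky ordering: the reference is taken before the fallible step. */
static int set_parms_leaky(struct table *t, int should_fail)
{
	int ret;

	t->refcnt++;
	ret = change_indev(should_fail);
	if (ret < 0)
		return ret;	/* early return leaves the count incremented */
	return 0;
}

/* Fixed ordering: run the fallible step first, then take the reference. */
static int set_parms_fixed(struct table *t, int should_fail)
{
	int ret = change_indev(should_fail);

	if (ret < 0)
		return ret;
	t->refcnt++;
	return 0;
}
```

In the leaky variant every failed call leaves one extra reference behind, so repeated failures can eventually wrap the counter. In the fixed variant a failed call leaves the count unchanged.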

This issue was caught with KASAN.

Bug: 273251569
Fixes: 705c709126 ("net: sched: cls_u32: no need to call tcf_exts_change for newly allocated struct")
Suggested-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Lee Jones <lee@kernel.org>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
(cherry picked from commit 07f9cc229b)
Signed-off-by: Lee Jones <joneslee@google.com>
Change-Id: I95524bfda9a08a40b3d54515e528419dba18dc55
2023-06-26 12:13:49 +00:00
Wanwei Jiang
eda34db29b ANDROID: GKI: Update symbol list for Amlogic
1 function symbol(s) added
  'int netif_set_xps_queue(struct net_device*, const struct cpumask*, u16)'

3 variable symbol(s) added
  'struct static_key_false rfs_needed'
  'u32 rps_cpu_mask'
  'struct rps_sock_flow_table* rps_sock_flow_table'

Bug: 288824236
Change-Id: I5334879e0879bc7c70f7bfc61387a63cca2b8a5b
Signed-off-by: Wanwei Jiang <wanwei.jiang@amlogic.com>
2023-06-26 11:53:35 +00:00
Ulises Mendez Martinez
d8eb5e7ca9 ANDROID: db845c: Fix build when using --kgdb
* CONFIG_WATCHDOG is disabled when compiling with
  --kgdb option, hence the list of modules produced is
  adjusted conditionally.

Bug: 270320056
Change-Id: I0eafb118836e6a31dc3b0392ab7d60b5597b9367
Signed-off-by: Ulises Mendez Martinez <umendez@google.com>
2023-06-23 11:38:47 +00:00
Yifan Hong
d40f3254b6 FROMLIST: kheaders: dereferences the source tree
When the kernel is built inside a sandbox container,
a forest of symlinks to the source files may be
created in the container. In this case, the generated
kheaders.tar.xz should follow these symlinks
to access the source files, instead of packing
the symlinks themselves.
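
The difference between packing symlinks and packing the files they point to can be sketched in Python. This is an analogy using the standard `tarfile` module and its `dereference` option, not the kheaders build script itself; the file names (`real.h`, `x.h`, the `headers` archive prefix) are made up for illustration.

```python
import os
import tarfile
import tempfile


def pack(src_dir, out_path, follow_symlinks):
    """Archive src_dir and report how each member was stored."""
    # dereference=True stores the content of a symlink's target
    # instead of the symlink itself.
    with tarfile.open(out_path, "w:xz", dereference=follow_symlinks) as tar:
        tar.add(src_dir, arcname="headers")
    with tarfile.open(out_path) as tar:
        return {
            m.name: "symlink" if m.issym() else "file" if m.isfile() else "dir"
            for m in tar.getmembers()
        }


with tempfile.TemporaryDirectory() as tmp:
    # A real source file outside the tree that gets packed.
    real = os.path.join(tmp, "real.h")
    with open(real, "w") as f:
        f.write("#define X 1\n")
    # The "sandbox" tree contains only a symlink to it.
    tree = os.path.join(tmp, "tree")
    os.mkdir(tree)
    os.symlink(real, os.path.join(tree, "x.h"))

    kept = pack(tree, os.path.join(tmp, "links.tar.xz"), follow_symlinks=False)
    followed = pack(tree, os.path.join(tmp, "files.tar.xz"), follow_symlinks=True)

print(kept["headers/x.h"], followed["headers/x.h"])
```

Without dereferencing, the archive carries a link that is useless outside the sandbox; with it, the archive carries the header content.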

Test: manual (add kheaders_data.tar.xz to the output,
  then examine the contents)
Bug: 276339429
Fixes: b0acbba3f489 ("Revert "Revert "Revert "FROMLIST: kheaders: Follow symlinks to source files."""")
Link: https://lore.kernel.org/lkml/20230420010029.2702543-1-elsk@google.com/
Signed-off-by: Yifan Hong <elsk@google.com>
(cherry picked from https://android-review.googlesource.com/q/commit:28fa7afc424f3dc53358c0e9b080433d78f0cd54)
Merged-In: Ie4db22dfa13d05fdccb3ad8f4fae2fe3fead994e
Change-Id: Ie4db22dfa13d05fdccb3ad8f4fae2fe3fead994e
2023-06-23 09:24:00 +00:00
Jaegeuk Kim
2ebd113814 FROMLIST: f2fs: remove i_xattr_sem to avoid deadlock and fix the original issue
This reverts commit 27161f13e3 "f2fs: avoid race in between read xattr & write xattr".

That introduced a deadlock case:

Thread #1:

[122554.641906][   T92]  f2fs_getxattr+0xd4/0x5fc
    -> waiting for f2fs_down_read(&F2FS_I(inode)->i_xattr_sem);

[122554.641927][   T92]  __f2fs_get_acl+0x50/0x284
[122554.641948][   T92]  f2fs_init_acl+0x84/0x54c
[122554.641969][   T92]  f2fs_init_inode_metadata+0x460/0x5f0
[122554.641990][   T92]  f2fs_add_inline_entry+0x11c/0x350
    -> Locked dir->inode_page by f2fs_get_node_page()

[122554.642009][   T92]  f2fs_do_add_link+0x100/0x1e4
[122554.642025][   T92]  f2fs_create+0xf4/0x22c
[122554.642047][   T92]  vfs_create+0x130/0x1f4

Thread #2:

[123996.386358][   T92]  __get_node_page+0x8c/0x504
    -> waiting for dir->inode_page lock

[123996.386383][   T92]  read_all_xattrs+0x11c/0x1f4
[123996.386405][   T92]  __f2fs_setxattr+0xcc/0x528
[123996.386424][   T92]  f2fs_setxattr+0x158/0x1f4
    -> f2fs_down_write(&F2FS_I(inode)->i_xattr_sem);

[123996.386443][   T92]  __f2fs_set_acl+0x328/0x430
[123996.386618][   T92]  f2fs_set_acl+0x38/0x50
[123996.386642][   T92]  posix_acl_chmod+0xc8/0x1c8
[123996.386669][   T92]  f2fs_setattr+0x5e0/0x6bc
[123996.386689][   T92]  notify_change+0x4d8/0x580
[123996.386717][   T92]  chmod_common+0xd8/0x184
[123996.386748][   T92]  do_fchmodat+0x60/0x124
[123996.386766][   T92]  __arm64_sys_fchmodat+0x28/0x3c
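
The two traces above form a lock-order inversion: each thread holds the lock the other is waiting for. A minimal Python sketch of that wait cycle follows. It is a simplified model that assumes each thread holds one lock and waits for one lock, and `find_deadlock` is a hypothetical helper, not an f2fs or kernel function.

```python
def find_deadlock(threads):
    """threads maps name -> (held_lock, wanted_lock).

    Returns the thread names forming a wait cycle, or None.
    """
    owner = {held: name for name, (held, _) in threads.items()}
    for start in threads:
        seen = []
        current = start
        # Follow "waits for the owner of the wanted lock" edges.
        while current is not None and current not in seen:
            seen.append(current)
            wanted = threads[current][1]
            current = owner.get(wanted)
        if current == start:
            return seen
    return None


threads = {
    # Thread #1: locked dir->inode_page, waiting for i_xattr_sem.
    "thread1": ("dir->inode_page", "i_xattr_sem"),
    # Thread #2: took i_xattr_sem, waiting for dir->inode_page.
    "thread2": ("i_xattr_sem", "dir->inode_page"),
}
print(find_deadlock(threads))
```

Removing either edge of the cycle, as this commit does by dropping i_xattr_sem, makes the helper return None in this model.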

Let's look back at the original issue.

Thread A:                                       Thread B:
-f2fs_getxattr
   -lookup_all_xattrs
      -xnid = F2FS_I(inode)->i_xattr_nid;
                                                -f2fs_setxattr
                                                    -__f2fs_setxattr
                                                        -write_all_xattrs
                                                            -truncate_xattr_node
                                                                  ...  ...
                                                -write_checkpoint
                                                                  ...  ...
                                                -alloc_nid   <- nid reuse
          -get_node_page
              -f2fs_bug_on  <- nid != node_footer->nid

I think we don't need to truncate xattr pages eagerly, since doing so introduces
many data races without much benefit.

Bug: 280545073
Cc: <stable@vger.kernel.org>
Link: https://lore.kernel.org/linux-f2fs-devel/20230613233940.3643362-1-jaegeuk@kernel.org/T/#u
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Change-Id: Ifdbaf7defa50b479d82d2c945aa9d48e2e2317ed
2023-06-23 09:08:35 +00:00
Ulises Mendez Martinez
258f11319b ANDROID: db845c: Local define for db845c targets
Generally, DAMP is a best practice in Bazel. In this
specific case, it helps with:

* Better target discoverability and auto-completion.
* It's possible to use `select` for KGDB fixes later on
  without breaking name expectations.

Bug: 256196368
Bug: 270320056
Change-Id: I300404a9b2b4b7c6569145a942ecb445d23e8e9a
Signed-off-by: Ulises Mendez Martinez <umendez@google.com>
2023-06-23 09:07:26 +00:00
Qais Yousef
2af5c43333 ANDROID: Update the ABI symbol list
Adding the following symbols:
  - push_cpu_stop
  - stop_one_cpu_nowait
  - __traceiter_android_rvh_can_migrate_task
  - __traceiter_android_rvh_find_lowest_rq
  - __tracepoint_android_rvh_can_migrate_task
  - __tracepoint_android_rvh_find_lowest_rq

Bug: 286099809
Change-Id: Iae8ce08e13215aa985577bf83c69750416924c42
Signed-off-by: Qais Yousef <qyousef@google.com>
2023-06-22 21:29:55 +00:00
Qais Yousef
5af00d8531 ANDROID: Export cpu_push_stop
This enables handling of RT tasks that are stuck on the wrong CPU after their
uclamp_min value changes.

Bug: 286099809
Signed-off-by: Qais Yousef <qyousef@google.com>
Change-Id: Ie223d34df6f21640e38b123d2dc3e674ee7c5e79
2023-06-22 21:29:55 +00:00
Robin Peng
3c328a636a ANDROID: Update the ABI symbol list
Adding the following symbols:
  - __traceiter_android_rvh_can_migrate_task
  - __tracepoint_android_rvh_can_migrate_task

Bug: 268296149
Change-Id: Id7a19cd76037db162e592dea80c9865b09eb4a4d
Signed-off-by: Robin Peng <robinpeng@google.com>
2023-06-22 21:13:14 +00:00
Ulises Mendez Martinez
bdd2312e95 ANDROID: rockpi4: Fix build when using --kgdb
* CONFIG_WATCHDOG is disabled when compiling with
  --kgdb option, hence the list of modules produced is
  adjusted conditionally on its value.

Bug: 270320056
Change-Id: I4db55fdf6b91a65209d2e0ae3bbb5f384c7eca22
Signed-off-by: Ulises Mendez Martinez <umendez@google.com>
2023-06-22 13:43:15 +00:00
Yifan Hong
d1601b50e6 ANDROID: kleaf: android/gki_system_dlkm_modules is generated.
modules.bzl is the source of truth for the list of GKI
modules. There is no need to keep two lists.

Test: TH
Bug: 287697703
Signed-off-by: Yifan Hong <elsk@google.com>
(cherry picked from https://android-review.googlesource.com/q/commit:a8a61755f67730af45d50f6173a4eedbcefa1c87)
Merged-In: I8953e92696833cf8ec27aa80724ec468c08736f1
Change-Id: I8953e92696833cf8ec27aa80724ec468c08736f1
2023-06-22 00:48:30 +00:00
Paul Lawrence
a7068670a7 ANDROID: fuse-bpf: Move FUSE_RELEASE to correct place
The existing fuse-bpf freeing logic would free the fuse_file struct
immediately. However, this would break readahead. Move freeing logic
to the same place as done in classic fuse.

Bug: 286287652
Test: fuse_test passes, android boots, cts tests run
Change-Id: If13519f0e956a8da0dc98e7ac4aed2036070e969
Signed-off-by: Paul Lawrence <paullawrence@google.com>
2023-06-21 18:36:43 +00:00