linux

mirror of https://github.com/hardkernel/linux.git synced 2026-06-05 10:31:46 +09:00

Author	SHA1	Message	Date
Yu Kuai	404522c763	UPSTREAM: blk-ioc: protect ioc_destroy_icq() by 'queue_lock' Currently, icq is tracked by both request_queue(icq->q_node) and task(icq->ioc_node), and ioc_clear_queue() from elevator exit is not safe because it can access the list without protection: ioc_clear_queue ioc_release_fn lock queue_lock list_splice /* move queue list to a local list */ unlock queue_lock /* lock is released, the local list can be accessed through task exit. */ lock ioc->lock while (!hlist_empty) icq = hlist_entry lock queue_lock ioc_destroy_icq delete icq->ioc_node while (!list_empty) icq = list_entry() list_del icq->q_node /* This is not protected by any lock, list_entry concurrent with list_del is not safe. */ unlock queue_lock unlock ioc->lock Fix this problem by protecting list 'icq->q_node' by queue_lock from ioc_clear_queue(). Reported-and-tested-by: Pradeep Pragallapati <quic_pragalla@quicinc.com> Link: https://lore.kernel.org/lkml/20230517084434.18932-1-quic_pragalla@quicinc.com/ Signed-off-by: Yu Kuai <yukuai3@huawei.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20230531073435.2923422-1-yukuai1@huaweicloud.com Signed-off-by: Jens Axboe <axboe@kernel.dk> Bug: 285274586 (cherry picked from commit `5a0ac57c48`) Change-Id: I60f3acfaa32f18bed58c8190178cdca5ebd91100 Signed-off-by: Pradeep P V K <quic_pragalla@quicinc.com>	2023-09-05 15:27:33 +00:00
Wei Liu	bd0308e36b	ANDROID: GKI: Update symbols to symbol list Update symbols to symbol list externed by oppo network group. 5 Added function: [A] 'function int __rtnl_link_register(rtnl_link_ops)' [A] 'function int ip_local_deliver(struct sk_buff )' [A] 'function iov_iter_advance(struct iov_iter i, size_t size)' [A] 'function int nf_register_net_hook(struct net net, const struct nf_hook_ops reg)' [A] 'function void nf_unregister_net_hook(struct net , const struct nf_hook_ops *)' These functions have been merged in lower versions of the kernel and are still needed by oppo in higher versions. These functions are needed by other modules that provide functionality for oppo's network, such as the network tracking module, the network warm-up module, etc. Bug: 297979024 Change-Id: Ic1a4c869b3894a06f7cab7b5120574ed94d519b2 Signed-off-by: Wei Liu <liuwei.a@oppo.com>	2023-09-05 12:38:42 +00:00
John Stultz	87647c0c54	ANDROID: uid_sys_stats: Use llist for deferred work A use-after-free bug was found in the previous custom lock-free list implementation for the deferred work, so switch functionality to llist implementation. While the previous approach atomically handled the list head, it did not assure the new node's next pointer was assigned before the head was pointed to the node, allowing the consumer to traverse to an invalid next pointer. Additionally, in switching to llists, this patch pulls the entire list off the list head once and processes it separately, reducing the number of atomic operations compared with the custom list implementation, which pulled one node at a time atomically from the list head. BUG: KASAN: use-after-free in process_notifier+0x270/0x2dc Write of size 8 at addr d4ffff89545c3c58 by task Blocking Thread/3431 Pointer tag: [d4], memory tag: [fe] call trace: dump_backtrace+0xf8/0x118 show_stack+0x18/0x24 dump_stack_lvl+0x60/0x78 print_report+0x178/0x470 kasan_report+0x8c/0xbc kasan_tag_mismatch+0x28/0x3c __hwasan_tag_mismatch+0x30/0x60 process_notifier+0x270/0x2dc notifier_call_chain+0xb4/0x108 blocking_notifier_call_chain+0x54/0x80 profile_task_exit+0x20/0x2c do_exit+0xec/0x1114 __arm64_sys_exit_group+0x0/0x24 get_signal+0x93c/0xa78 do_notify_resume+0x158/0x3fc el0_svc+0x54/0x78 el0t_64_sync_handler+0x44/0xe4 el0t_64_sync+0x190/0x194 Bug: 294468796 Bug: 295787403 Fixes: `8e86825eec` ("ANDROID: uid_sys_stats: Use a single work for deferred updates") Change-Id: Id377348c239ec720a5237726bc3632544d737e3b Signed-off-by: John Stultz <jstultz@google.com> [nkapron: Squashed with other changes and rewrote the commit message] Signed-off-by: Neill Kapron <nkapron@google.com>	2023-09-05 12:07:09 +00:00
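The ordering requirement described in the llist commit above can be illustrated with a minimal userspace sketch. This is not the kernel's `llist` code: it assumes C11 atomics, and the names `lockless_list`, `list_push` and `list_take_all` are hypothetical. It shows the two points the message makes: the node's `next` pointer is written before the node is published as the head, and the consumer detaches the whole chain with one atomic operation.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for a lock-less singly linked list. */
struct node {
    struct node *next;
    int value;
};

struct lockless_list {
    _Atomic(struct node *) first;
};

/*
 * Producer side: n->next is assigned BEFORE the compare-and-swap makes n
 * reachable through the head. The release ordering on success ensures a
 * consumer that observes n as head also observes a valid n->next.
 */
void list_push(struct lockless_list *l, struct node *n)
{
    struct node *old = atomic_load_explicit(&l->first, memory_order_relaxed);

    do {
        n->next = old; /* re-done on every retry, since old is refreshed */
    } while (!atomic_compare_exchange_weak_explicit(&l->first, &old, n,
                                                    memory_order_release,
                                                    memory_order_relaxed));
}

/*
 * Consumer side: detach the entire chain with a single atomic exchange,
 * then walk it privately without further atomic operations.
 */
struct node *list_take_all(struct lockless_list *l)
{
    return atomic_exchange_explicit(&l->first, NULL, memory_order_acquire);
}
```

A consumer would call `list_take_all()` once per work invocation and iterate the returned chain; nodes pushed concurrently simply land on the next batch.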
Lin Ma	4b3ab91671	UPSTREAM: net: nfc: Fix use-after-free caused by nfc_llcp_find_local [ Upstream commit `6709d4b7bc` ] This commit fixes several use-after-free that caused by function nfc_llcp_find_local(). For example, one UAF can happen when below buggy time window occurs. // nfc_genl_llc_get_params \| // nfc_unregister_device \| dev = nfc_get_device(idx); \| device_lock(...) if (!dev) \| dev->shutting_down = true; return -ENODEV; \| device_unlock(...); \| device_lock(...); \| // nfc_llcp_unregister_device \| nfc_llcp_find_local() nfc_llcp_find_local(...); \| \| local_cleanup() if (!local) { \| rc = -ENODEV; \| // nfc_llcp_local_put goto exit; \| kref_put(.., local_release) } \| \| // local_release \| list_del(&local->list) // nfc_genl_send_params \| kfree() local->dev->idx !!!UAF!!! \| \| and the crash trace for the one of the discussed UAF like: BUG: KASAN: slab-use-after-free in nfc_genl_llc_get_params+0x72f/0x780 net/nfc/netlink.c:1045 Read of size 8 at addr ffff888105b0e410 by task 20114 Call Trace: <TASK> __dump_stack lib/dump_stack.c:88 [inline] dump_stack_lvl+0x72/0xa0 lib/dump_stack.c:106 print_address_description mm/kasan/report.c:319 [inline] print_report+0xcc/0x620 mm/kasan/report.c:430 kasan_report+0xb2/0xe0 mm/kasan/report.c:536 nfc_genl_send_params net/nfc/netlink.c:999 [inline] nfc_genl_llc_get_params+0x72f/0x780 net/nfc/netlink.c:1045 genl_family_rcv_msg_doit.isra.0+0x1ee/0x2e0 net/netlink/genetlink.c:968 genl_family_rcv_msg net/netlink/genetlink.c:1048 [inline] genl_rcv_msg+0x503/0x7d0 net/netlink/genetlink.c:1065 netlink_rcv_skb+0x161/0x430 net/netlink/af_netlink.c:2548 genl_rcv+0x28/0x40 net/netlink/genetlink.c:1076 netlink_unicast_kernel net/netlink/af_netlink.c:1339 [inline] netlink_unicast+0x644/0x900 net/netlink/af_netlink.c:1365 netlink_sendmsg+0x934/0xe70 net/netlink/af_netlink.c:1913 sock_sendmsg_nosec net/socket.c:724 [inline] sock_sendmsg+0x1b6/0x200 net/socket.c:747 ____sys_sendmsg+0x6e9/0x890 net/socket.c:2501 
___sys_sendmsg+0x110/0x1b0 net/socket.c:2555 __sys_sendmsg+0xf7/0x1d0 net/socket.c:2584 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x3f/0x90 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x72/0xdc RIP: 0033:0x7f34640a2389 RSP: 002b:00007f3463415168 EFLAGS: 00000246 ORIG_RAX: 000000000000002e RAX: ffffffffffffffda RBX: 00007f34641c1f80 RCX: 00007f34640a2389 RDX: 0000000000000000 RSI: 0000000020000240 RDI: 0000000000000006 RBP: 00007f34640ed493 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000 R13: 00007ffe38449ecf R14: 00007f3463415300 R15: 0000000000022000 </TASK> Allocated by task 20116: kasan_save_stack+0x22/0x50 mm/kasan/common.c:45 kasan_set_track+0x25/0x30 mm/kasan/common.c:52 ____kasan_kmalloc mm/kasan/common.c:374 [inline] __kasan_kmalloc+0x7f/0x90 mm/kasan/common.c:383 kmalloc include/linux/slab.h:580 [inline] kzalloc include/linux/slab.h:720 [inline] nfc_llcp_register_device+0x49/0xa40 net/nfc/llcp_core.c:1567 nfc_register_device+0x61/0x260 net/nfc/core.c:1124 nci_register_device+0x776/0xb20 net/nfc/nci/core.c:1257 virtual_ncidev_open+0x147/0x230 drivers/nfc/virtual_ncidev.c:148 misc_open+0x379/0x4a0 drivers/char/misc.c:165 chrdev_open+0x26c/0x780 fs/char_dev.c:414 do_dentry_open+0x6c4/0x12a0 fs/open.c:920 do_open fs/namei.c:3560 [inline] path_openat+0x24fe/0x37e0 fs/namei.c:3715 do_filp_open+0x1ba/0x410 fs/namei.c:3742 do_sys_openat2+0x171/0x4c0 fs/open.c:1356 do_sys_open fs/open.c:1372 [inline] __do_sys_openat fs/open.c:1388 [inline] __se_sys_openat fs/open.c:1383 [inline] __x64_sys_openat+0x143/0x200 fs/open.c:1383 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x3f/0x90 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x72/0xdc Freed by task 20115: kasan_save_stack+0x22/0x50 mm/kasan/common.c:45 kasan_set_track+0x25/0x30 mm/kasan/common.c:52 kasan_save_free_info+0x2e/0x50 mm/kasan/generic.c:521 ____kasan_slab_free 
mm/kasan/common.c:236 [inline] ____kasan_slab_free mm/kasan/common.c:200 [inline] __kasan_slab_free+0x10a/0x190 mm/kasan/common.c:244 kasan_slab_free include/linux/kasan.h:162 [inline] slab_free_hook mm/slub.c:1781 [inline] slab_free_freelist_hook mm/slub.c:1807 [inline] slab_free mm/slub.c:3787 [inline] __kmem_cache_free+0x7a/0x190 mm/slub.c:3800 local_release net/nfc/llcp_core.c:174 [inline] kref_put include/linux/kref.h:65 [inline] nfc_llcp_local_put net/nfc/llcp_core.c:182 [inline] nfc_llcp_local_put net/nfc/llcp_core.c:177 [inline] nfc_llcp_unregister_device+0x206/0x290 net/nfc/llcp_core.c:1620 nfc_unregister_device+0x160/0x1d0 net/nfc/core.c:1179 virtual_ncidev_close+0x52/0xa0 drivers/nfc/virtual_ncidev.c:163 __fput+0x252/0xa20 fs/file_table.c:321 task_work_run+0x174/0x270 kernel/task_work.c:179 resume_user_mode_work include/linux/resume_user_mode.h:49 [inline] exit_to_user_mode_loop kernel/entry/common.c:171 [inline] exit_to_user_mode_prepare+0x108/0x110 kernel/entry/common.c:204 __syscall_exit_to_user_mode_work kernel/entry/common.c:286 [inline] syscall_exit_to_user_mode+0x21/0x50 kernel/entry/common.c:297 do_syscall_64+0x4c/0x90 arch/x86/entry/common.c:86 entry_SYSCALL_64_after_hwframe+0x72/0xdc Last potentially related work creation: kasan_save_stack+0x22/0x50 mm/kasan/common.c:45 __kasan_record_aux_stack+0x95/0xb0 mm/kasan/generic.c:491 kvfree_call_rcu+0x29/0xa80 kernel/rcu/tree.c:3328 drop_sysctl_table+0x3be/0x4e0 fs/proc/proc_sysctl.c:1735 unregister_sysctl_table.part.0+0x9c/0x190 fs/proc/proc_sysctl.c:1773 unregister_sysctl_table+0x24/0x30 fs/proc/proc_sysctl.c:1753 neigh_sysctl_unregister+0x5f/0x80 net/core/neighbour.c:3895 addrconf_notify+0x140/0x17b0 net/ipv6/addrconf.c:3684 notifier_call_chain+0xbe/0x210 kernel/notifier.c:87 call_netdevice_notifiers_info+0xb5/0x150 net/core/dev.c:1937 call_netdevice_notifiers_extack net/core/dev.c:1975 [inline] call_netdevice_notifiers net/core/dev.c:1989 [inline] dev_change_name+0x3c3/0x870 net/core/dev.c:1211 
dev_ifsioc+0x800/0xf70 net/core/dev_ioctl.c:376 dev_ioctl+0x3d9/0xf80 net/core/dev_ioctl.c:542 sock_do_ioctl+0x160/0x260 net/socket.c:1213 sock_ioctl+0x3f9/0x670 net/socket.c:1316 vfs_ioctl fs/ioctl.c:51 [inline] __do_sys_ioctl fs/ioctl.c:870 [inline] __se_sys_ioctl fs/ioctl.c:856 [inline] __x64_sys_ioctl+0x19e/0x210 fs/ioctl.c:856 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x3f/0x90 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x72/0xdc The buggy address belongs to the object at ffff888105b0e400 which belongs to the cache kmalloc-1k of size 1024 The buggy address is located 16 bytes inside of freed 1024-byte region [ffff888105b0e400, ffff888105b0e800) The buggy address belongs to the physical page: head:ffffea000416c200 order:3 entire_mapcount:0 nr_pages_mapped:0 pincount:0 flags: 0x200000000010200(slab\|head\|node=0\|zone=2) raw: 0200000000010200 ffff8881000430c0 ffffea00044c7010 ffffea0004510e10 raw: 0000000000000000 00000000000a000a 00000001ffffffff 0000000000000000 page dumped because: kasan: bad access detected Memory state around the buggy address: ffff888105b0e300: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc ffff888105b0e380: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc >ffff888105b0e400: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ^ ffff888105b0e480: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ffff888105b0e500: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb In summary, this patch solves those use-after-free by 1. Re-implement the nfc_llcp_find_local(). The current version does not grab the reference when getting the local from the linked list. For example, the llcp_sock_bind() gets the reference like below: // llcp_sock_bind() local = nfc_llcp_find_local(dev); // A ..... \ \| raceable ..... / llcp_sock->local = nfc_llcp_local_get(local); // B There is an apparent race window that one can drop the reference and free the local object fetched in (A) before (B) gets the reference. 2. 
Some callers of the nfc_llcp_find_local() do not grab the reference at all. For example, the nfc_genl_llc_{{get/set}_params/sdreq} functions. We add the nfc_llcp_local_put() for them. Moreover, we add the necessary error handling function to put the reference. 3. Add the nfc_llcp_remove_local() helper. The local object is removed from the linked list in local_release() when all references are gone. This patch removes it when nfc_llcp_unregister_device() is called. Therefore, every caller of nfc_llcp_find_local() will get a reference even when the nfc_llcp_unregister_device() is called. This guarantees that no use-after-free of the local object is possible. Bug: 294167961 Fixes: `52feb444a9` ("NFC: Extend netlink interface for LTO, RW, and MIUX parameters support") Fixes: `c7aa12252f` ("NFC: Take a reference on the LLCP local pointer when creating a socket") Signed-off-by: Lin Ma <linma@zju.edu.cn> Reviewed-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org> (cherry picked from commit `425d9d3a92`) Signed-off-by: Lee Jones <joneslee@google.com> Change-Id: I8e7e7101ce0d5c81da9b8febd4ad78dd1affc4a5	2023-09-04 12:09:23 +01:00
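The pattern the NFC commit above describes (the lookup itself takes the reference, and unregistering unlinks the object immediately instead of waiting for the last put) can be sketched in userspace C. This is an illustration under stated assumptions, not the kernel code: the spinlock built on `atomic_flag`, the `refcount` field, the `freed_count` counter, and the functions `local_register`, `local_find`, `local_put` and `local_unregister` are all hypothetical stand-ins for the kernel's lock, `kref`, and `nfc_llcp_*` helpers.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for the LLCP local object. */
struct local {
    struct local *next;
    int dev_idx;
    atomic_int refcount;
};

static atomic_flag list_lock = ATOMIC_FLAG_INIT; /* tiny spinlock, sketch only */
static struct local *local_list;
int freed_count; /* lets a caller observe when an object is really freed */

static void list_lock_acquire(void)
{
    while (atomic_flag_test_and_set_explicit(&list_lock, memory_order_acquire))
        ;
}

static void list_lock_release(void)
{
    atomic_flag_clear_explicit(&list_lock, memory_order_release);
}

/* Drop one reference; the last put frees the object. */
void local_put(struct local *l)
{
    int prev = atomic_fetch_sub(&l->refcount, 1);

    assert(prev > 0);
    if (prev == 1) {
        free(l);
        freed_count++;
    }
}

/* Create an object; the list itself holds the initial reference. */
struct local *local_register(int dev_idx)
{
    struct local *l = calloc(1, sizeof(*l));

    if (!l)
        return NULL;
    l->dev_idx = dev_idx;
    atomic_init(&l->refcount, 1);
    list_lock_acquire();
    l->next = local_list;
    local_list = l;
    list_lock_release();
    return l;
}

/* Lookup takes the reference while the list lock is still held. */
struct local *local_find(int dev_idx)
{
    struct local *l;

    list_lock_acquire();
    for (l = local_list; l; l = l->next) {
        if (l->dev_idx == dev_idx) {
            atomic_fetch_add(&l->refcount, 1);
            break;
        }
    }
    list_lock_release();
    return l;
}

/* Unlink right away, then drop the list's reference. */
void local_unregister(int dev_idx)
{
    struct local *found = NULL;

    list_lock_acquire();
    for (struct local **pp = &local_list; *pp; pp = &(*pp)->next) {
        if ((*pp)->dev_idx == dev_idx) {
            found = *pp;
            *pp = found->next;
            break;
        }
    }
    list_lock_release();
    if (found)
        local_put(found);
}
```

With this shape, a caller that obtained a pointer from `local_find()` before `local_unregister()` ran still owns a valid reference, and later lookups fail cleanly because the object is no longer on the list.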
Pablo Neira Ayuso	c603880bd5	UPSTREAM: netfilter: nf_tables: disallow rule addition to bound chain via NFTA_RULE_CHAIN_ID [ Upstream commit `0ebc1064e4` ] Bail out with EOPNOTSUPP when adding rule to bound chain via NFTA_RULE_CHAIN_ID. The following warning splat is shown when adding a rule to a deleted bound chain: WARNING: CPU: 2 PID: 13692 at net/netfilter/nf_tables_api.c:2013 nf_tables_chain_destroy+0x1f7/0x210 [nf_tables] CPU: 2 PID: 13692 Comm: chain-bound-rul Not tainted 6.1.39 #1 RIP: 0010:nf_tables_chain_destroy+0x1f7/0x210 [nf_tables] Bug: 296128351 Fixes: `d0e2c7de92` ("netfilter: nf_tables: add NFT_CHAIN_BINDING") Reported-by: Kevin Rich <kevinrich1337@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Sasha Levin <sashal@kernel.org> (cherry picked from commit `268cb07ef3`) Signed-off-by: Lee Jones <joneslee@google.com> Change-Id: Icf97f57d18bb2b30ed28a3de6cdd18661d7f1c3d	2023-09-04 09:47:17 +00:00
Laszlo Ersek	d95b2b008e	UPSTREAM: net: tap_open(): set sk_uid from current_fsuid() commit `5c9241f3ce` upstream. Commit `66b2c338ad` initializes the "sk_uid" field in the protocol socket (struct sock) from the "/dev/tapX" device node's owner UID. Per original commit `86741ec254` ("net: core: Add a UID field to struct sock.", 2016-11-04), that's wrong: the idea is to cache the UID of the userspace process that creates the socket. Commit `86741ec254` mentions socket() and accept(); with "tap", the action that creates the socket is open("/dev/tapX"). Therefore the device node's owner UID is irrelevant. In most cases, "/dev/tapX" will be owned by root, so in practice, commit `66b2c338ad` has no observable effect: - before, "sk_uid" would be zero, due to undefined behavior (CVE-2023-1076), - after, "sk_uid" would be zero, due to "/dev/tapX" being owned by root. What matters is the (fs)UID of the process performing the open(), so cache that in "sk_uid". Bug: 295995961 Cc: Eric Dumazet <edumazet@google.com> Cc: Lorenzo Colitti <lorenzo@google.com> Cc: Paolo Abeni <pabeni@redhat.com> Cc: Pietro Borrello <borrello@diag.uniroma1.it> Cc: netdev@vger.kernel.org Cc: stable@vger.kernel.org Fixes: `66b2c338ad` ("tap: tap_open(): correctly initialize socket uid") Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2173435 Signed-off-by: Laszlo Ersek <lersek@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> (cherry picked from commit `767800fc40`) Signed-off-by: Lee Jones <joneslee@google.com> Change-Id: Ib5f80015e5c0280acf9f35124d3ff267ff0420f0	2023-09-04 09:44:46 +00:00
Heikki Krogerus	b15c3a3df0	UPSTREAM: usb: typec: ucsi: Fix command cancellation The Cancel command was passed to the write callback as the offset instead of as the actual command which caused NULL pointer dereference. Reported-by: Stephan Bolten <stephan.bolten@gmx.net> Closes: https://bugzilla.kernel.org/show_bug.cgi?id=217517 Fixes: `094902bc6a` ("usb: typec: ucsi: Always cancel the command if PPM reports BUSY condition") Cc: stable@vger.kernel.org Signed-off-by: Heikki Krogerus <heikki.krogerus@linux.intel.com> Message-ID: <20230606115802.79339-1-heikki.krogerus@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Bug: 298597334 Change-Id: I7f23e49c58b566f462ba34f76966db662308a5bc (cherry picked from commit `c4a8bfabef`) Signed-off-by: Udipto Goswami <quic_ugoswami@quicinc.com>	2023-09-01 09:53:15 +00:00
Will Shiu	0c34d588af	UPSTREAM: locks: fix KASAN: use-after-free in trace_event_raw_event_filelock_lock As the following backtrace shows, the struct file_lock request in posix_lock_inode() is freed before the ftrace function uses it. Moving the ftrace call ahead of the free path fixes the use-after-free issue. [name:report&]=============================================== BUG:KASAN: use-after-free in trace_event_raw_event_filelock_lock+0x80/0x12c [name:report&]Read at addr f6ffff8025622620 by task NativeThread/16753 [name:report_hw_tags&]Pointer tag: [f6], memory tag: [fe] [name:report&] BT: Hardware name: MT6897 (DT) Call trace: dump_backtrace+0xf8/0x148 show_stack+0x18/0x24 dump_stack_lvl+0x60/0x7c print_report+0x2c8/0xa08 kasan_report+0xb0/0x120 __do_kernel_fault+0xc8/0x248 do_bad_area+0x30/0xdc do_tag_check_fault+0x1c/0x30 do_mem_abort+0x58/0xbc el1_abort+0x3c/0x5c el1h_64_sync_handler+0x54/0x90 el1h_64_sync+0x68/0x6c trace_event_raw_event_filelock_lock+0x80/0x12c posix_lock_inode+0xd0c/0xd60 do_lock_file_wait+0xb8/0x190 fcntl_setlk+0x2d8/0x440 ... [name:report&] [name:report&]Allocated by task 16752: ... slab_post_alloc_hook+0x74/0x340 kmem_cache_alloc+0x1b0/0x2f0 posix_lock_inode+0xb0/0xd60 ... [name:report&] [name:report&]Freed by task 16752: ... kmem_cache_free+0x274/0x5b0 locks_dispose_list+0x3c/0x148 posix_lock_inode+0xc40/0xd60 do_lock_file_wait+0xb8/0x190 fcntl_setlk+0x2d8/0x440 do_fcntl+0x150/0xc18 ... Bug: 290585450 Link: https://git.kernel.org/pub/scm/linux/kernel/git/jlayton/linux.git/commit/?h=locks-6.6&id=74f6f5912693ce454384eaeec48705646a21c74f (cherry picked from commit `74f6f59126`) Change-Id: I7daa6e72d1815daff30dd39726e14b1d57b60f5f Signed-off-by: Will Shiu <Will.Shiu@mediatek.com>	2023-08-31 21:20:34 +00:00
Ramji Jiyani	20266a0652	ANDROID: kleaf: Remove ptp_kvm.ko from i386 modules commit `638804ea1c` ("ANDROID: kleaf: get_gki_modules_list add i386 option") introduced i386 as an option for get_gki_modules_list() with ptp_kvm.ko as i386 module. ptp_kvm.ko is not a module on android14-6.1, and the cherry-pick from android15-6.1 should have been reworked to remove it. Remove ptp_kvm.ko from i386 list and make it empty for android14-6.1. Fixes: `638804ea1c` ("ANDROID: kleaf: get_gki_modules_list add i386 option") Bug: 293529933 Test: TH Change-Id: Ied9d8c06c9f38dc271d541275afee053a87ecd79 Signed-off-by: Ramji Jiyani <ramjiyani@google.com>	2023-08-31 17:47:21 +00:00
zhengtangquan	ce18fe6f29	ANDROID: GKI: Add symbols to symbol list for oplus 1 function symbol(s) added 'int __traceiter_android_vh_tune_swappiness(void, int)' 1 variable symbol(s) added 'struct tracepoint __tracepoint_android_vh_tune_swappiness' Bug: 297985476 Change-Id: I63e0e77b71df1b81eaa7d7370c6f739337d6c7e3 Signed-off-by: Tangquan Zheng <zhengtangquan@oppo.com>	2023-08-31 17:38:17 +00:00
Tangquan Zheng	8e6550add2	ANDROID: vendor_hooks: Add tune swappiness hook in get_scan_count() Add hook in get_scan_count() for customized swappiness. Partial cherry-pick of aosp/2119426. Bug: 297985476 Change-Id: I9d4074cf1a4097ff2a96be04646a01624cbd8dc3 Signed-off-by: Tangquan Zheng <zhengtangquan@oppo.com>	2023-08-31 17:38:17 +00:00
ying zuxin	dd87a7122c	ANDROID: GKI: Update symbol list for VIVO INFO: 1 function symbol(s) added 'void blk_fill_rwbs(char*, blk_opf_t)' Bug: 298155651 Change-Id: If30ac266aff8ba370e3064a59f082a02035c9dff Signed-off-by: ying zuxin <yingzuxin@vivo.com>	2023-08-31 13:09:07 +00:00
Ramji Jiyani	638804ea1c	ANDROID: kleaf: get_gki_modules_list add i386 option Adds "i386" as an option to get the list of 32-bit x86 modules in get_gki_modules_list(). virtual_device_i686 Cuttlefish target is a consumer. Option is named i386 to match the `arch` attributes in kernel_build rule. Bug: 293529933 Test: TH Change-Id: Ic5278aa687999a2bb2d98b97b204b99d1fcd809a Signed-off-by: Ramji Jiyani <ramjiyani@google.com> (cherry picked from commit 2a9967e15f99010ec06ac089b42a2ac20f2a57cb)	2023-08-31 03:24:10 +00:00
Ramji Jiyani	264e2973a4	ANDROID: arm as an option for get_gki_modules_list If a driver config depends on ARM64, the driver is not available as a module for ARM targets. Introduce arm as an option for get_gki_modules_list() to separate ARM64 dependent modules. virtual_device_arm Cuttlefish target is the current consumer of this; and it fails when an ARM64-dependent module, such as an OEM hypervisor, is introduced. Bug: 293529933 Test: TH Change-Id: I462e8968faa48d58721d884688af62ff603c9a3d Signed-off-by: Ramji Jiyani <ramjiyani@google.com> (cherry picked from commit b0e30c021b79d9cb9a67b12a94d1fe2f61126f14)	2023-08-31 03:21:33 +00:00
David Gow	37edfbc5c4	UPSTREAM: um: Only disable SSE on clang to work around old GCC bugs As part of the Rust support for UML, we disable SSE (and similar flags) to match the normal x86 builds. This both makes sense (we ideally want a similar configuration to x86), and works around a crash bug with SSE generation under Rust with LLVM. However, this breaks compiling stdlib.h under gcc < 11, as the x86_64 ABI requires floating-point return values be stored in an SSE register. gcc 11 fixes this by only doing register allocation when a function is actually used, and since we never use atof(), it shouldn't be a problem: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99652 Nevertheless, only disable SSE on clang setups, as that's a simple way of working around everyone's bugs. Fixes: `8849818679` ("rust: arch/um: Disable FP/SIMD instruction to match x86") Reported-by: Roberto Sassu <roberto.sassu@huaweicloud.com> Link: https://lore.kernel.org/linux-um/6df2ecef9011d85654a82acd607fdcbc93ad593c.camel@huaweicloud.com/ Tested-by: Roberto Sassu <roberto.sassu@huaweicloud.com> Tested-by: SeongJae Park <sj@kernel.org> Signed-off-by: David Gow <davidgow@google.com> Reviewed-by: Vincenzo Palazzo <vincenzopalazzodev@gmail.com> Tested-by: Arthur Grillo <arthurgrillo@riseup.net> Signed-off-by: Richard Weinberger <richard@nod.at> Bug: 296671039 Change-Id: Ie71e5c59ca9fb6a480895af233fae9a15f5c5ddc (cherry picked from commit `a3046a618a`) Signed-off-by: Dongseok Yi <dseok.yi@samsung.com>	2023-08-30 12:58:15 +00:00
Pratyush Brahma	2a13641a14	ANDROID: GKI: Update abi_gki_aarch64_qcom for page_owner symbols Update abi_gki_aarch64_qcom to include __set_page_owner and page_owner_inited symbols. Bug: 296348400 Change-Id: I3dec65fb596764e51897dd0251aada539a34feca Signed-off-by: Pratyush Brahma <quic_pbrahma@quicinc.com>	2023-08-29 23:06:24 +00:00
Pratyush Brahma	f08623648a	ANDROID: mm: Export page_owner_inited and __set_page_owner Export page_owner_inited and __set_page_owner symbol for loadable vendor modules. Bug: 296348400 Change-Id: I220ec1b94326ca3c6cc809d54646c51194645197 Signed-off-by: Pratyush Brahma <quic_pbrahma@quicinc.com>	2023-08-29 23:06:13 +00:00
Ulises Mendez Martinez	e44e3955f7	ANDROID: Use alias for old rules. * This is in preparation for removal of these targets. Bug: 293529933 Change-Id: I7b7400bb95b0d2c571be18b97727d878996ab575 Signed-off-by: Ulises Mendez Martinez <umendez@google.com> (cherry picked from commit 83379c35cd0f39f65d89aacb7fbd4166b4cc9e9a)	2023-08-29 18:19:28 +00:00
Yi-De Wu	67018dd4e4	ANDROID: virt: geniezone: Enable as GKI module for arm64 Enables CONFIG_MTK_GZVM (gzvm.ko) as protected GKI module for arm64. Depends on ARM64 so no need to explicitly disable for other architecture's gki_defconfig files. Change-Id: I7bbef9192d92db295623f491e2a923147473a196 Signed-off-by: Yingshiuan Pan <yingshiuan.pan@mediatek.com> Signed-off-by: Yi-De Wu <yi-de.wu@mediatek.com> Bug: 280363874	2023-08-29 18:02:40 +00:00
Ulises Mendez Martinez	9a399ca713	ANDROID: Add arch specific gki module list targets * This is a no-op change preparing for the split of target and files based on the architecture used. Bug: 293529933 Change-Id: I7783b60e591aaad23b5446af5cb04af5765f4b3f Signed-off-by: Ulises Mendez Martinez <umendez@google.com>	2023-08-29 18:02:40 +00:00
Yi-De Wu	3e079b7691	FROMLIST: virt: geniezone: Add dtb config support Hypervisor might need to know the accurate address and size of dtb passed from userspace. And then hypervisor would parse the dtb and get vm information. Change-Id: I23194d45f5c60555ba7fde9dd8d393443fd41310 Signed-off-by: Jerry Wang <ze-yu.wang@mediatek.com> Signed-off-by: Liju-clr Chen <liju-clr.chen@mediatek.com> Signed-off-by: Yi-De Wu <yi-de.wu@mediatek.com> Bug: 280363874 Link: https://lore.kernel.org/lkml/20230727080005.14474-10-yi-de.wu@mediatek.com/	2023-08-29 18:02:40 +00:00
Yi-De Wu	39bd65ec1d	FROMLIST: virt: geniezone: Add memory region support Hypervisor might need to know the precise purpose of each memory region, so that it can provide specific memory protection. We add a new uapi to pass address and size of a memory region and its purpose. Change-Id: I53cc0953fd1e3f0aa3c0a91bb5877b2fb297c858 Signed-off-by: Jerry Wang <ze-yu.wang@mediatek.com> Signed-off-by: Liju-clr Chen <liju-clr.chen@mediatek.com> Signed-off-by: Yi-De Wu <yi-de.wu@mediatek.com> Bug: 280363874 Link: https://lore.kernel.org/lkml/20230727080005.14474-9-yi-de.wu@mediatek.com/	2023-08-29 18:02:40 +00:00
Yi-De Wu	c26057e351	FROMLIST: virt: geniezone: Add ioeventfd support Ioeventfd leverages eventfd to provide asynchronous notification mechanism for VMM. VMM can register a mmio address and bind with an eventfd. Once a mmio trap occurs on this registered region, its corresponding eventfd will be notified. Change-Id: Iff6bb7dd8ba42d08813e531ab40629492a1218bc Signed-off-by: Yingshiuan Pan <yingshiuan.pan@mediatek.com> Signed-off-by: Liju Chen <liju-clr.chen@mediatek.com> Signed-off-by: Yi-De Wu <yi-de.wu@mediatek.com> Bug: 280363874 Link: https://lore.kernel.org/lkml/20230727080005.14474-8-yi-de.wu@mediatek.com/	2023-08-29 18:02:40 +00:00
Yi-De Wu	e73a5222e6	FROMLIST: virt: geniezone: Add irqfd support irqfd enables threads other than vcpu threads to inject virtual interrupts asynchronously through irqfd rather than through the ioctl interface. This interface is necessary for a VMM that creates a separate thread for IO handling or uses vhost devices. Change-Id: I3a77cdcec0530193a518352f30c162d08b5b35ef Signed-off-by: Yingshiuan Pan <yingshiuan.pan@mediatek.com> Signed-off-by: Liju Chen <liju-clr.chen@mediatek.com> Signed-off-by: Yi-De Wu <yi-de.wu@mediatek.com> Bug: 280363874 Link: https://lore.kernel.org/lkml/20230727080005.14474-7-yi-de.wu@mediatek.com/	2023-08-29 18:02:40 +00:00
Yi-De Wu	7427b76faa	FROMLIST: virt: geniezone: Add irqchip support for virtual interrupt injection Enable GenieZone to handle virtual interrupt injection request. Change-Id: I2dc99a1d30309864eb7bbc91c97570cbb7c548a2 Signed-off-by: Yingshiuan Pan <yingshiuan.pan@mediatek.com> Signed-off-by: Liju Chen <liju-clr.chen@mediatek.com> Signed-off-by: Yi-De Wu <yi-de.wu@mediatek.com> Bug: 280363874 Link: https://lore.kernel.org/lkml/20230727080005.14474-6-yi-de.wu@mediatek.com/	2023-08-29 18:02:40 +00:00
Yi-De Wu	540cff0872	FROMLIST: virt: geniezone: Add vcpu support The VMM uses this interface to create a vcpu instance, which is an fd. This fd is used for all vcpu operations, such as setting vcpu registers, and accepts the most important ioctl, GZVM_VCPU_RUN, which requests the GenieZone hypervisor to context-switch into the VM's vcpu context. Change-Id: I76e6e5b3a33b30eb0b841288c3aa041e63564da2 Signed-off-by: Yingshiuan Pan <yingshiuan.pan@mediatek.com> Signed-off-by: Jerry Wang <ze-yu.wang@mediatek.com> Signed-off-by: Liju Chen <liju-clr.chen@mediatek.com> Signed-off-by: Yi-De Wu <yi-de.wu@mediatek.com> Bug: 280363874 Link: https://lore.kernel.org/lkml/20230727080005.14474-5-yi-de.wu@mediatek.com/	2023-08-29 18:02:40 +00:00
Yi-De Wu	6ce86d075e	FROMLIST: virt: geniezone: Add GenieZone hypervisor support GenieZone is MediaTek hypervisor solution, and it is running in EL2 stand alone as a type-I hypervisor. This patch exports a set of ioctl interfaces for userspace VMM (e.g., crosvm) to operate guest VMs lifecycle (creation and destroy) on GenieZone. Change-Id: I4fbc79bab120fe5ad90e2832f70562e97bbf40c0 Signed-off-by: Yingshiuan Pan <yingshiuan.pan@mediatek.com> Signed-off-by: Jerry Wang <ze-yu.wang@mediatek.com> Signed-off-by: Liju Chen <liju-clr.chen@mediatek.com> Signed-off-by: Yi-De Wu <yi-de.wu@mediatek.com> Bug: 280363874 Link: https://lore.kernel.org/lkml/20230727080005.14474-4-yi-de.wu@mediatek.com/	2023-08-29 18:02:40 +00:00
Yi-De Wu	40107a0081	FROMLIST: dt-bindings: hypervisor: Add MediaTek GenieZone hypervisor Add documentation for GenieZone(gzvm) node. This node informs gzvm driver to start probing if geniezone hypervisor is available and able to do virtual machine operations. Change-Id: Ie448a33b8981ee25fe36231a10af5c1372d23012 Signed-off-by: Yingshiuan Pan <yingshiuan.pan@mediatek.com> Signed-off-by: Liju Chen <liju-clr.chen@mediatek.com> Signed-off-by: Yi-De Wu <yi-de.wu@mediatek.com> Bug: 280363874 Link: https://lore.kernel.org/lkml/20230727080005.14474-3-yi-de.wu@mediatek.com/	2023-08-29 18:02:40 +00:00
Yi-De Wu	beaffb638b	FROMLIST: docs: geniezone: Introduce GenieZone hypervisor GenieZone is MediaTek's proprietary hypervisor solution, and it runs in EL2 standalone as a type-I hypervisor. It is a pure EL2 implementation, which implies it does not rely on any specific host VM, and this behavior improves GenieZone's security as it limits its interface. Change-Id: I8326093b5be79af5f87285fc74ee0cd7f5827808 Signed-off-by: Yingshiuan Pan <yingshiuan.pan@mediatek.com> Signed-off-by: Liju Chen <liju-clr.chen@mediatek.com> Signed-off-by: Yi-De Wu <yi-de.wu@mediatek.com> Bug: 280363874 Link: https://lore.kernel.org/lkml/20230727080005.14474-2-yi-de.wu@mediatek.com/	2023-08-29 18:02:40 +00:00
valis	e0c4636bd2	UPSTREAM: net/sched: cls_route: No longer copy tcf_result on update to avoid use-after-free [ Upstream commit `b80b829e9e` ] When route4_change() is called on an existing filter, the whole tcf_result struct is always copied into the new instance of the filter. This causes a problem when updating a filter bound to a class, as tcf_unbind_filter() is always called on the old instance in the success path, decreasing filter_cnt of the still referenced class and allowing it to be deleted, leading to a use-after-free. Fix this by no longer copying the tcf_result struct from the old filter. Bug: 296347075 Fixes: `1109c00547` ("net: sched: RCU cls_route") Reported-by: valis <sec@valis.email> Reported-by: Bing-Jhong Billy Jheng <billy@starlabs.sg> Signed-off-by: valis <sec@valis.email> Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com> Reviewed-by: Victor Nogueira <victor@mojatatu.com> Reviewed-by: Pedro Tammela <pctammela@mojatatu.com> Reviewed-by: M A Ramdhan <ramdhan@starlabs.sg> Link: https://lore.kernel.org/r/20230729123202.72406-4-jhs@mojatatu.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org> (cherry picked from commit `d4d3b53a4c`) Signed-off-by: Lee Jones <joneslee@google.com> Change-Id: Iefbd201b92847ec1349f92c107d7ef5aec3fb359	2023-08-29 16:54:04 +00:00
Laszlo Ersek	ec1f17ddac	UPSTREAM: net: tun_chr_open(): set sk_uid from current_fsuid() commit `9bc3047374` upstream. Commit `a096ccca6e` initializes the "sk_uid" field in the protocol socket (struct sock) from the "/dev/net/tun" device node's owner UID. Per original commit `86741ec254` ("net: core: Add a UID field to struct sock.", 2016-11-04), that's wrong: the idea is to cache the UID of the userspace process that creates the socket. Commit `86741ec254` mentions socket() and accept(); with "tun", the action that creates the socket is open("/dev/net/tun"). Therefore the device node's owner UID is irrelevant. In most cases, "/dev/net/tun" will be owned by root, so in practice, commit `a096ccca6e` has no observable effect: - before, "sk_uid" would be zero, due to undefined behavior (CVE-2023-1076), - after, "sk_uid" would be zero, due to "/dev/net/tun" being owned by root. What matters is the (fs)UID of the process performing the open(), so cache that in "sk_uid". Bug: 295995961 Cc: Eric Dumazet <edumazet@google.com> Cc: Lorenzo Colitti <lorenzo@google.com> Cc: Paolo Abeni <pabeni@redhat.com> Cc: Pietro Borrello <borrello@diag.uniroma1.it> Cc: netdev@vger.kernel.org Cc: stable@vger.kernel.org Fixes: `a096ccca6e` ("tun: tun_chr_open(): correctly initialize socket uid") Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2173435 Signed-off-by: Laszlo Ersek <lersek@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> (cherry picked from commit `b6846d7c40`) Signed-off-by: Lee Jones <joneslee@google.com> Change-Id: I2540ac5876ca7dad39e1b867a5e09a5c9c69bb86	2023-08-29 16:47:55 +00:00
Namjae Jeon	0adc759b0c	UPSTREAM: exfat: check if filename entries exceeds max filename length [ Upstream commit `d42334578e` ] exfat_extract_uni_name copies characters from a given file name entry into the 'uniname' variable. This variable is actually defined on the stack of the exfat_readdir() function. According to the definition of the 'exfat_uni_name' type, the file name should be limited to 255 characters (+ null terminator space), but the exfat_get_uniname_from_ext_entry() function can write more characters because there is no check if filename entries exceed max filename length. This patch adds the check not to copy filename characters when exceeding max filename length. Bug: 296393077 Cc: stable@vger.kernel.org Cc: Yuezhang Mo <Yuezhang.Mo@sony.com> Reported-by: Maxim Suhanov <dfirblog@gmail.com> Reviewed-by: Sungjong Seo <sj1557.seo@samsung.com> Signed-off-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org> (cherry picked from commit `c2fdf827f8`) Signed-off-by: Lee Jones <joneslee@google.com> Change-Id: I57a9ab007a5eac9c3415aa460df324c9044908c0	2023-08-29 16:46:10 +00:00
valis	f4ba064f76	UPSTREAM: net/sched: cls_fw: No longer copy tcf_result on update to avoid use-after-free [ Upstream commit `76e42ae831` ] When fw_change() is called on an existing filter, the whole tcf_result struct is always copied into the new instance of the filter. This causes a problem when updating a filter bound to a class, as tcf_unbind_filter() is always called on the old instance in the success path, decreasing filter_cnt of the still referenced class and allowing it to be deleted, leading to a use-after-free. Fix this by no longer copying the tcf_result struct from the old filter. Bug: 296347075 Fixes: `e35a8ee599` ("net: sched: fw use RCU") Reported-by: valis <sec@valis.email> Reported-by: Bing-Jhong Billy Jheng <billy@starlabs.sg> Signed-off-by: valis <sec@valis.email> Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com> Reviewed-by: Victor Nogueira <victor@mojatatu.com> Reviewed-by: Pedro Tammela <pctammela@mojatatu.com> Reviewed-by: M A Ramdhan <ramdhan@starlabs.sg> Link: https://lore.kernel.org/r/20230729123202.72406-3-jhs@mojatatu.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org> (cherry picked from commit `7f691439b2`) Signed-off-by: Lee Jones <joneslee@google.com> Change-Id: I33c91c83d1cd8e889a7261adfa3779ca6c141088	2023-08-29 16:43:40 +00:00
Stephen Dickey	5b0878fc61	ANDROID: abi_gki_aarch64_qcom: update abi symbols Add android_rvh_cgroup_force_migration and other symbols. Symbols added: __traceiter_android_rvh_cgroup_force_kthread_migration __tracepoint_android_rvh_cgroup_force_kthread_migration Bug: 184594949 Change-Id: I8ffed8f422a33f141edc95d1b65a07b8fe30b424 Signed-off-by: Stephen Dickey <quic_dickey@quicinc.com>	2023-08-28 23:28:11 +00:00
Pavankumar Kondeti	7551a1a2a1	ANDROID: cgroup: Add android_rvh_cgroup_force_kthread_migration In Android GKI, CONFIG_FAIR_GROUP_SCHED is enabled [1] to help prioritize important work. Given that CPU shares of root cgroup can't be changed, leaving the tasks inside root cgroup will give them higher share compared to the other tasks inside important cgroups. This is mitigated by moving all tasks inside root cgroup to a different cgroup after Android is booted. However, there are many kernel tasks stuck in the root cgroup after the boot. It is possible to relax kernel threads and kworkers migrations under certain scenarios. However the patch [2] posted at upstream is not accepted. Hence add a restricted vendor hook to notify modules when a kernel thread is requested for cgroup migration. The modules can relax the restrictions forced by the kernel and allow the cgroup migration. [1] `f08f049de1` [2] https://lore.kernel.org/lkml/1617714261-18111-1-git-send-email-pkondeti@codeaurora.org Bug: 184594949 Change-Id: I445a170ba797c8bece3b4b59b7a42cdd85438f1f Signed-off-by: Pavankumar Kondeti <quic_pkondeti@quicinc.com> [quic_dickey@quicinc.com: port to android-mainline kernel] Signed-off-by: Stephen Dickey <quic_dickey@quicinc.com>	2023-08-28 23:28:01 +00:00
Enlin Mu	cd018c99fa	FROMGIT: pstore/ram: Check start of empty przs during init After commit `30696378f6` ("pstore/ram: Do not treat empty buffers as valid"), initialization would assume a prz was valid after seeing that the buffer_size is zero (regardless of the buffer start position). This unchecked start value means it could be outside the bounds of the buffer, leading to future access panics when written to: sysdump_panic_event+0x3b4/0x5b8 atomic_notifier_call_chain+0x54/0x90 panic+0x1c8/0x42c die+0x29c/0x2a8 die_kernel_fault+0x68/0x78 __do_kernel_fault+0x1c4/0x1e0 do_bad_area+0x40/0x100 do_translation_fault+0x68/0x80 do_mem_abort+0x68/0xf8 el1_da+0x1c/0xc0 __raw_writeb+0x38/0x174 __memcpy_toio+0x40/0xac persistent_ram_update+0x44/0x12c persistent_ram_write+0x1a8/0x1b8 ramoops_pstore_write+0x198/0x1e8 pstore_console_write+0x94/0xe0 ... To avoid this, also check if the prz start is 0 during the initialization phase. If not, the next prz sanity check case will discover it (start > size) and zap the buffer back to a sane state. Bug: 293538531 Fixes: `30696378f6` ("pstore/ram: Do not treat empty buffers as valid") Cc: Yunlong Xing <yunlong.xing@unisoc.com> Cc: stable@vger.kernel.org Change-Id: I6ff3a11b8b21f6f5ab37d8432751e5d33a441d8c Signed-off-by: Enlin Mu <enlin.mu@unisoc.com> Link: https://lore.kernel.org/r/20230801060432.1307717-1-yunlong.xing@unisoc.com [kees: update commit log with backtrace and clarifications] (cherry picked from commit `fe8c3623ab` https://git.kernel.org/pub/scm/linux/kernel/git/kees/linux.git for-next/pstore) Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Chunhui Li <chunhui.li@mediatek.com>	2023-08-28 23:15:44 +00:00
sunshijie	ffaab71302	UPSTREAM: erofs: avoid infinite loop in z_erofs_do_read_page() when reading beyond EOF z_erofs_do_read_page() may loop infinitely due to the inappropriate truncation in the below statement. Since the offset is 64 bits and min_t() truncates the result to 32 bits. The solution is to replace unsigned int with a 64-bit type, such as erofs_off_t. cur = end - min_t(unsigned int, offset + end - map->m_la, end); - For example: - offset = 0x400160000 - end = 0x370 - map->m_la = 0x160370 - offset + end - map->m_la = 0x400000000 - offset + end - map->m_la = 0x00000000 (truncated as unsigned int) - Expected result: - cur = 0 - Actual result: - cur = 0x370 Signed-off-by: Chunhai Guo <guochunhai@vivo.com> Fixes: `3883a79abd` ("staging: erofs: introduce VLE decompression support") Reviewed-by: Gao Xiang <hsiangkao@linux.alibaba.com> Reviewed-by: Chao Yu <chao@kernel.org> Link: https://lore.kernel.org/r/20230710093410.44071-1-guochunhai@vivo.com Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com> (cherry picked from commit `8191213a58` https://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs.git dev) Bug: 296824280 Change-Id: I152508ba4c0eb83aeae5d753e22b0ca8d3ada56d Signed-off-by: sunshijie <sunshijie@xiaomi.corp-partner.google.com> Signed-off-by: sunshijie <sunshijie@xiaomi.com>	2023-08-28 17:42:48 +00:00
sunshijie	8497f46a87	UPSTREAM: erofs: avoid useless loops in z_erofs_pcluster_readmore() when reading beyond EOF z_erofs_pcluster_readmore() may take a long time to loop when the page offset is large enough, which is unnecessary and should be prevented. For example, when the following case is encountered, it will loop 4691368 times, taking about 27 seconds: - offset = 19217289215 - inode_size = 1442672 Signed-off-by: Chunhai Guo <guochunhai@vivo.com> Fixes: `386292919c` ("erofs: introduce readmore decompression strategy") Reviewed-by: Gao Xiang <hsiangkao@linux.alibaba.com> Reviewed-by: Yue Hu <huyue2@coolpad.com> Reviewed-by: Chao Yu <chao@kernel.org> Link: https://lore.kernel.org/r/20230710042531.28761-1-guochunhai@vivo.com Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com> (cherry picked from commit `936aa701d8` https://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs.git dev) Bug: 296824280 Change-Id: I279b0fadcfa8c0ff0d638a86c7bb2c6b4d07f194 Signed-off-by: sunshijie <sunshijie@xiaomi.corp-partner.google.com> Signed-off-by: sunshijie <sunshijie@xiaomi.com>	2023-08-28 17:42:48 +00:00
sunshijie	2f805fb912	UPSTREAM: erofs: Fix detection of atomic context Current check for atomic context is not sufficient as z_erofs_decompressqueue_endio can be called under rcu lock from blk_mq_flush_plug_list(). See the stacktrace [1] In such case we should hand off the decompression work for async processing rather than trying to do sync decompression in current context. Patch fixes the detection by checking for rcu_read_lock_any_held() and while at it use more appropriate !in_task() check than in_atomic(). Background: Historically erofs would always schedule a kworker for decompression which would incur the scheduling cost regardless of the context. But z_erofs_decompressqueue_endio() may not always be in atomic context and we could actually benefit from doing the decompression in z_erofs_decompressqueue_endio() if we are in thread context, for example when running with dm-verity. This optimization was later added in patch [2] which has shown improvement in performance benchmarks. ============================================== [1] Problem stacktrace [name:core&]BUG: sleeping function called from invalid context at kernel/locking/mutex.c:291 [name:core&]in_atomic(): 0, irqs_disabled(): 0, non_block: 0, pid: 1615, name: CpuMonitorServi [name:core&]preempt_count: 0, expected: 0 [name:core&]RCU nest depth: 1, expected: 0 CPU: 7 PID: 1615 Comm: CpuMonitorServi Tainted: G S W OE 6.1.25-android14-5-maybe-dirty-mainline #1 Hardware name: MT6897 (DT) Call trace: dump_backtrace+0x108/0x15c show_stack+0x20/0x30 dump_stack_lvl+0x6c/0x8c dump_stack+0x20/0x48 __might_resched+0x1fc/0x308 __might_sleep+0x50/0x88 mutex_lock+0x2c/0x110 z_erofs_decompress_queue+0x11c/0xc10 z_erofs_decompress_kickoff+0x110/0x1a4 z_erofs_decompressqueue_endio+0x154/0x180 bio_endio+0x1b0/0x1d8 __dm_io_complete+0x22c/0x280 clone_endio+0xe4/0x280 bio_endio+0x1b0/0x1d8 blk_update_request+0x138/0x3a4 blk_mq_plug_issue_direct+0xd4/0x19c blk_mq_flush_plug_list+0x2b0/0x354 __blk_flush_plug+0x110/0x160 
blk_finish_plug+0x30/0x4c read_pages+0x2fc/0x370 page_cache_ra_unbounded+0xa4/0x23c page_cache_ra_order+0x290/0x320 do_sync_mmap_readahead+0x108/0x2c0 filemap_fault+0x19c/0x52c __do_fault+0xc4/0x114 handle_mm_fault+0x5b4/0x1168 do_page_fault+0x338/0x4b4 do_translation_fault+0x40/0x60 do_mem_abort+0x60/0xc8 el0_da+0x4c/0xe0 el0t_64_sync_handler+0xd4/0xfc el0t_64_sync+0x1a0/0x1a4 [2] Link: https://lore.kernel.org/all/20210317035448.13921-1-huangjianan@oppo.com/ Reported-by: Will Shiu <Will.Shiu@mediatek.com> Suggested-by: Gao Xiang <xiang@kernel.org> Signed-off-by: Sandeep Dhavale <dhavale@google.com> Reviewed-by: Gao Xiang <hsiangkao@linux.alibaba.com> Reviewed-by: Alexandre Mergnat <amergnat@baylibre.com> Link: https://lore.kernel.org/r/20230621220848.3379029-1-dhavale@google.com Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com> (cherry picked from commit `12d0a24afd` https://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs.git dev) Bug: 296824280 Change-Id: I652b189e316b26ca56e1d7b6f1e4c52ae20bb3b7 Signed-off-by: sunshijie <sunshijie@xiaomi.corp-partner.google.com> Signed-off-by: sunshijie <sunshijie@xiaomi.com>	2023-08-28 17:42:48 +00:00
sunshijie	cc6111a287	UPSTREAM: erofs: fix compact 4B support for 16k block size In compact 4B, two adjacent lclusters are packed together as a unit to form on-disk indexes for effective random access, as below: (amortized = 4, vcnt = 2) _____________________________________________ \|___@_____ encoded bits __________\|_ blkaddr _\| 0 . amortized * vcnt = 8 . . . . amortized * vcnt - 4 = 4 . . .____________________________. \|_type (2 bits)_\|_clusterofs_\| Therefore, encoded bits for each pack are 32 bits (4 bytes). IOWs, since each lcluster can get 16 bits for its type and clusterofs, the maximum supported lclustersize for compact 4B format is 16k (14 bits). Fix this to enable compact 4B format for 16k lclusters (blocks), which is tested on an arm64 server with 16k page size. Fixes: `152a333a58` ("staging: erofs: add compacted compression indexes support") Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com> Link: https://lore.kernel.org/r/20230601112341.56960-1-hsiangkao@linux.alibaba.com Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com> (cherry picked from commit `001b8ccd06` https://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs.git dev) Bug: 296824280 Change-Id: I97918294a1d00a65223e741c3d153f375ab50507 Signed-off-by: sunshijie <sunshijie@xiaomi.corp-partner.google.com> Signed-off-by: sunshijie <sunshijie@xiaomi.com>	2023-08-28 17:42:48 +00:00
sunshijie	f11ccb03a0	UPSTREAM: erofs: kill hooked chains to avoid loops on deduplicated compressed images After heavily stressing EROFS with several images which include a hand-crafted image of repeated patterns for more than 46 days, I found two chains could be linked with each other almost simultaneously and form a loop so that the entire loop won't be submitted. As a consequence, the corresponding file pages will remain locked forever. It can be _only_ observed on data-deduplicated compressed images. For example, consider two chains with five pclusters in total: Chain 1: 2->3->4->5 -- The tail pcluster is 5; Chain 2: 5->1->2 -- The tail pcluster is 2. Chain 2 could link to Chain 1 with pcluster 5; and Chain 1 could link to Chain 2 at the same time with pcluster 2. Since hooked chains are all linked locklessly now, I have no idea how to simply avoid the race. Instead, let's avoid hooked chains completely until I could work out a proper way to fix this and end users finally tell us that it's needed to add it back. Actually, this optimization can be found with multi-threaded workloads (especially even more often on deduplicated compressed images), yet I'm not sure about the overall system impacts of not having this compared with implementation complexity. Fixes: `267f2492c8` ("erofs: introduce multi-reference pclusters (fully-referenced)") Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com> Reviewed-by: Yue Hu <huyue2@coolpad.com> Link: https://lore.kernel.org/r/20230526201459.128169-4-hsiangkao@linux.alibaba.com Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com> (cherry picked from commit `967c28b23f` https://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs.git dev) Bug: 296824280 Change-Id: I33607c174bfeb54119c6de271b44c9fe2a7399e6 Signed-off-by: sunshijie <sunshijie@xiaomi.corp-partner.google.com> Signed-off-by: sunshijie <sunshijie@xiaomi.com>	2023-08-28 17:42:48 +00:00
sunshijie	7521b904dc	UPSTREAM: erofs: fix potential overflow calculating xattr_isize Given on-disk i_xattr_icount is 16 bits and xattr_isize is calculated from i_xattr_icount multiplying 4, xattr_isize has a theoretical maximum of 256K (64K * 4). Thus declare xattr_isize as unsigned int to avoid the potential overflow. Fixes: `bfb8674dc0` ("staging: erofs: add erofs in-memory stuffs") Signed-off-by: Jingbo Xu <jefflexu@linux.alibaba.com> Reviewed-by: Gao Xiang <hsiangkao@linux.alibaba.com> Reviewed-by: Chao Yu <chao@kernel.org> Link: https://lore.kernel.org/r/20230414061810.6479-1-jefflexu@linux.alibaba.com Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com> (cherry picked from commit `1b3567a196` https://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs.git dev) Bug: 296824280 Change-Id: I43d88c7ebc3b320e226ab4d7bc6717432ef5ad82 Signed-off-by: sunshijie <sunshijie@xiaomi.corp-partner.google.com> Signed-off-by: sunshijie <sunshijie@xiaomi.com>	2023-08-28 17:42:48 +00:00
sunshijie	6ec6eee87e	UPSTREAM: erofs: stop parsing non-compact HEAD index if clusterofs is invalid Syzbot generated a crafted image [1] with a non-compact HEAD index of clusterofs 33024 while valid numbers should be 0 ~ lclustersize-1, which causes the following unexpected behavior as below: BUG: unable to handle page fault for address: fffff52101a3fff9 #PF: supervisor read access in kernel mode #PF: error_code(0x0000) - not-present page PGD 23ffed067 P4D 23ffed067 PUD 0 Oops: 0000 [#1] PREEMPT SMP KASAN CPU: 1 PID: 4398 Comm: kworker/u5:1 Not tainted 6.3.0-rc6-syzkaller-g09a9639e56c0 #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 03/30/2023 Workqueue: erofs_worker z_erofs_decompressqueue_work RIP: 0010:z_erofs_decompress_queue+0xb7e/0x2b40 ... Call Trace: <TASK> z_erofs_decompressqueue_work+0x99/0xe0 process_one_work+0x8f6/0x1170 worker_thread+0xa63/0x1210 kthread+0x270/0x300 ret_from_fork+0x1f/0x30 Note that normal images or images using compact indexes are not impacted. Let's fix this now. [1] https://lore.kernel.org/r/000000000000ec75b005ee97fbaa@google.com Reported-and-tested-by: syzbot+aafb3f37cfeb6534c4ac@syzkaller.appspotmail.com Fixes: `02827e1796` ("staging: erofs: add erofs_map_blocks_iter") Fixes: `152a333a58` ("staging: erofs: add compacted compression indexes support") Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com> Reviewed-by: Chao Yu <chao@kernel.org> Link: https://lore.kernel.org/r/20230410173714.104604-1-hsiangkao@linux.alibaba.com (cherry picked from commit `cc4efd3dd2` https://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs.git dev) Bug: 296824280 Change-Id: I8e4d7d3f30d70f8c4ab42b33f215af1292c57fcf Signed-off-by: sunshijie <sunshijie@xiaomi.corp-partner.google.com> Signed-off-by: sunshijie <sunshijie@xiaomi.com>	2023-08-28 17:42:48 +00:00
sunshijie	9089c10d9c	UPSTREAM: erofs: initialize packed inode after root inode is assigned As commit `8f7acdae2c` ("staging: erofs: kill all failure handling in fill_super()"), move the initialization of packed inode after root inode is assigned, so that the iput() in .put_super() is adequate as the failure handling. Otherwise, iput() is also needed in .kill_sb(), in case of the mounting fails halfway. Signed-off-by: Jingbo Xu <jefflexu@linux.alibaba.com> Reviewed-by: Yue Hu <huyue2@coolpad.com> Fixes: `b15b2e307c` ("erofs: support on-disk compressed fragments data") Reviewed-by: Gao Xiang <hsiangkao@linux.alibaba.com> Acked-by: Chao Yu <chao@kernel.org> Link: https://lore.kernel.org/r/20230407141710.113882-3-jefflexu@linux.alibaba.com Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com> (cherry picked from commit `cb9bce7951` https://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs.git dev) Bug: 296824280 Change-Id: I3cec91605b42c588e2c8f69629f0bdcc20078de2 Signed-off-by: sunshijie <sunshijie@xiaomi.corp-partner.google.com> Signed-off-by: sunshijie <sunshijie@xiaomi.com>	2023-08-28 17:42:48 +00:00
Kalesh Singh	797dac42cc	ANDROID: GKI: Update ABI for zsmalloc fixes zs_pool->lock was added upstream as a replacement for the size_class locks. The tooling over-cautiously reports this as a ABI breakage but both of these structs (zs_pool and size_class) are internal to zsmalloc.c. Update the ABI to allow these changes. Bug: 297093100 Change-Id: Ib9fc5a036f75d89fb6bee4c146034f6c81759e04 Signed-off-by: Kalesh Singh <kaleshsingh@google.com>	2023-08-28 16:43:44 +00:00
Andrew Yang	cb440cecb2	BACKPORT: zsmalloc: fix races between modifications of fullness and isolated We encountered many kernel exceptions of VM_BUG_ON(zspage->isolated == 0) in dec_zspage_isolation() and BUG_ON(!pages[1]) in zs_unmap_object() lately. This issue only occurs when migration and reclamation occur at the same time. With our memory stress test, we can reproduce this issue several times a day. We have no idea why no one else encountered this issue. BTW, we switched to the new kernel version with this defect a few months ago. Since fullness and isolated share the same unsigned int, modifications of them should be protected by the same lock. [andrew.yang@mediatek.com: move comment] Link: https://lkml.kernel.org/r/20230727062910.6337-1-andrew.yang@mediatek.com Link: https://lkml.kernel.org/r/20230721063705.11455-1-andrew.yang@mediatek.com Fixes: `c4549b8711` ("zsmalloc: remove zspage isolation for migration") Change-Id: I4aeda0715d65f828bb88ad6fbf36b9927c7a5c4b Signed-off-by: Andrew Yang <andrew.yang@mediatek.com> Reviewed-by: Sergey Senozhatsky <senozhatsky@chromium.org> Cc: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Cc: Matthias Brugger <matthias.bgg@gmail.com> Cc: Minchan Kim <minchan@kernel.org> Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> (cherry picked from commit `4b5d1e47b6`) Bug: 297093100 [ Kalesh Singh - Fix trivial conflicts in zs_page_putback()] Signed-off-by: Kalesh Singh <kaleshsingh@google.com>	2023-08-28 16:43:44 +00:00
yue.shen	c0e84be923	ANDROID: ABI: Update symbols to unisoc whitelist for A14-6.1 Update whitelist for the symbols used by the unisoc in abi_gki_aarch64_unisoc. 1 variable symbol(s) added 'int percpu_counter_batch' Bug: 296338673 Change-Id: Idd1d03e9482c5f9c3ea2184066371cd6705ddd0e Signed-off-by: Yue Shen <yue.shen@unisoc.com>	2023-08-28 16:09:20 +00:00
Nhat Pham	5ef132d564	UPSTREAM: zsmalloc: consolidate zs_pool's migrate_lock and size_class's locks Currently, zsmalloc has a hierarchy of locks, which includes a pool-level migrate_lock, and a lock for each size class. We have to obtain both locks in the hotpath in most cases anyway, except for zs_malloc. This exception will no longer exist when we introduce a LRU into the zs_pool for the new writeback functionality - we will need to obtain a pool-level lock to synchronize LRU handling even in zs_malloc. In preparation for zsmalloc writeback, consolidate these locks into a single pool-level lock, which drastically reduces the complexity of synchronization in zsmalloc. We have also benchmarked the lock consolidation to see the performance effect of this change on zram. First, we ran a synthetic FS workload on a server machine with 36 cores (same machine for all runs), using fs_mark -d ../zram1mnt -s 100000 -n 2500 -t 32 -k before and after for btrfs and ext4 on zram (FS usage is 80%). Here is the result (unit is file/second): With lock consolidation (btrfs): Average: 13520.2, Median: 13531.0, Stddev: 137.5961482019028 Without lock consolidation (btrfs): Average: 13487.2, Median: 13575.0, Stddev: 309.08283679298665 With lock consolidation (ext4): Average: 16824.4, Median: 16839.0, Stddev: 89.97388510006668 Without lock consolidation (ext4) Average: 16958.0, Median: 16986.0, Stddev: 194.7370021336469 As you can see, we observe a 0.3% regression for btrfs, and a 0.9% regression for ext4. This is a small, barely measurable difference in my opinion. For a more realistic scenario, we also tries building the kernel on zram. 
Here is the time it takes (in seconds): With lock consolidation (btrfs): real Average: 319.6, Median: 320.0, Stddev: 0.8944271909999159 user Average: 6894.2, Median: 6895.0, Stddev: 25.528415540334656 sys Average: 521.4, Median: 522.0, Stddev: 1.51657508881031 Without lock consolidation (btrfs): real Average: 319.8, Median: 320.0, Stddev: 0.8366600265340756 user Average: 6896.6, Median: 6899.0, Stddev: 16.04057355583023 sys Average: 520.6, Median: 521.0, Stddev: 1.140175425099138 With lock consolidation (ext4): real Average: 320.0, Median: 319.0, Stddev: 1.4142135623730951 user Average: 6896.8, Median: 6878.0, Stddev: 28.621670111997307 sys Average: 521.2, Median: 521.0, Stddev: 1.7888543819998317 Without lock consolidation (ext4) real Average: 319.6, Median: 319.0, Stddev: 0.8944271909999159 user Average: 6886.2, Median: 6887.0, Stddev: 16.93221781102523 sys Average: 520.4, Median: 520.0, Stddev: 1.140175425099138 The difference is entirely within the noise of a typical run on zram. This hardly justifies the complexity of maintaining both the pool lock and the class lock. In fact, for writeback, we would need to introduce yet another lock to prevent data races on the pool's LRU, further complicating the lock handling logic. IMHO, it is just better to collapse all of these into a single pool-level lock. 
Link: https://lkml.kernel.org/r/20221128191616.1261026-4-nphamcs@gmail.com Change-Id: Ib0eb09d7a69190fc4ffea8f819423c7f66d83379 Signed-off-by: Nhat Pham <nphamcs@gmail.com> Suggested-by: Johannes Weiner <hannes@cmpxchg.org> Acked-by: Minchan Kim <minchan@kernel.org> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Reviewed-by: Sergey Senozhatsky <senozhatsky@chromium.org> Cc: Dan Streetman <ddstreet@ieee.org> Cc: Nitin Gupta <ngupta@vflare.org> Cc: Seth Jennings <sjenning@redhat.com> Cc: Vitaly Wool <vitaly.wool@konsulko.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> (cherry picked from commit `c0547d0b6a`) Bug: 297093100 Signed-off-by: Kalesh Singh <kaleshsingh@google.com>	2023-08-26 05:43:02 +00:00
Maciej Żenczykowski	ec6b3d552a	UPSTREAM: netfilter: nfnetlink_log: always add a timestamp Compared to all the other work we're already doing to deliver an skb to userspace this is very cheap - at worst an extra call to ktime_get_real() - and very useful. (and indeed it may even be cheaper if we're running from other hooks) (background: Android occasionally logs packets which caused wake from sleep/suspend and we'd like to have timestamps reliably associated with these events) Cc: Pablo Neira Ayuso <pablo@netfilter.org> Cc: Martin KaFai Lau <kafai@fb.com> Cc: Florian Westphal <fw@strlen.de> Signed-off-by: Maciej Żenczykowski <maze@google.com> Signed-off-by: Florian Westphal <fw@strlen.de> (cherry picked from commit `1d85594fd3`) Change-Id: Id9b8bc046204c11bf3321e73a67b444777d387dd	2023-08-26 01:40:43 +00:00
Prakruthi Deepak Heragu	4db95aa21a	ANDROID: virt: gunyah: Do not allocate irq for GH_RM_RESOURCE_NO_VIRQ Resource manager can now return GH_RM_RESOURCE_NO_VIRQ (-1) instead of 0 as the value to mean "there's no vIRQ for this resource". Bug: 297100131 Change-Id: I93c4f41b881bfc9e094fa6115df7ba6fcdaa7e6e Signed-off-by: Elliot Berman <quic_eberman@quicinc.com> Signed-off-by: Prakruthi Deepak Heragu <quic_pheragu@quicinc.com>	2023-08-26 00:39:19 +00:00
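
The 32-bit truncation described in the erofs z_erofs_do_read_page() entry (ffaab71302) can be reproduced outside the kernel. The following is a minimal userspace sketch, not the kernel code: the function names cur_truncated() and cur_fixed() are hypothetical, uint64_t stands in for erofs_off_t, and the explicit casts imitate what min_t(unsigned int, ...) does to its operands. The input values are the ones quoted in that commit message.

```c
#include <stdint.h>

/*
 * Imitates "cur = end - min_t(unsigned int, offset + end - m_la, end)".
 * Both operands are narrowed to 32 bits before the comparison, so a
 * difference such as 0x400000000 collapses to 0.
 * (Hypothetical helper for illustration; not a kernel function.)
 */
uint64_t cur_truncated(uint64_t offset, uint64_t end, uint64_t m_la)
{
    uint32_t a = (uint32_t)(offset + end - m_la); /* high bits are lost here */
    uint32_t b = (uint32_t)end;

    return end - (a < b ? a : b);
}

/*
 * Same computation with the comparison done in 64 bits, which is the
 * direction the commit message describes (a 64-bit type such as
 * erofs_off_t; uint64_t is an assumption made for this sketch).
 */
uint64_t cur_fixed(uint64_t offset, uint64_t end, uint64_t m_la)
{
    uint64_t a = offset + end - m_la;

    return end - (a < end ? a : end);
}
```

With offset = 0x400160000, end = 0x370 and m_la = 0x160370, cur_truncated() returns 0x370 (the "actual result" in the commit message) while cur_fixed() returns 0 (the "expected result").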


1150063 Commits