linux

mirror of https://github.com/hardkernel/linux.git synced 2026-06-10 21:07:02 +09:00

Author	SHA1	Message	Date
Edward Cree	e37f3b1561	sfc: use a dynamic m-port for representor RX and set it promisc Representors do not want to be subject to the PF's Ethernet address filters, since traffic from VFs will typically have a destination either elsewhere on the link segment or on an overlay network. So, create a dynamic m-port with promiscuous and all-multicast filters, and set it as the egress port of representor default rules. Since the m-port is an alias of the calling PF's own m-port, traffic will still be delivered to the PF's RXQs, but it will be subject to the VNRX filter rules installed on the dynamic m-port (specified by the v-port ID field of the filter spec). Signed-off-by: Edward Cree <ecree.xilinx@gmail.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-07-29 21:22:07 -07:00
Edward Cree	77eb40749d	sfc: move table locking into filter_table_{probe,remove} methods We need to be able to drop the efx->filter_sem in ef100_filter_table_up() so that we can call functions that insert filters (and thus take that rwsem for read), which means the efx->type->filter_table_probe method needs to be responsible for taking the lock in the first place. Signed-off-by: Edward Cree <ecree.xilinx@gmail.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-07-29 21:22:06 -07:00
Edward Cree	67ab160ed0	sfc: insert default MAE rules to connect VFs to representors Default rules are low-priority switching rules which the hardware uses in the absence of higher-priority rules. Each representor requires a corresponding rule matching traffic from its representee VF and delivering to the PF (where a check on INGRESS_MPORT in __ef100_rx_packet() will direct it to the representor). No rule is required in the reverse direction, because representor TX uses a TX override descriptor to bypass the MAE and deliver directly to the VF. Since inserting any rule into the MAE disables the firmware's own default rules, also insert a pair of rules to connect the PF to the physical network port and vice-versa. Signed-off-by: Edward Cree <ecree.xilinx@gmail.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-07-29 21:22:06 -07:00
Edward Cree	f50e8fcda6	sfc: receive packets from EF100 VFs into representors If the source m-port of a packet in __ef100_rx_packet() is a VF, hand off the packet to the corresponding representor with efx_ef100_rep_rx_packet(). Signed-off-by: Edward Cree <ecree.xilinx@gmail.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-07-29 21:22:06 -07:00
Edward Cree	08d0b16ecb	sfc: check ef100 RX packets are from the wire If not, for now drop them and warn. A subsequent patch will look up the source m-port to try and find a representor to deliver them to. Signed-off-by: Edward Cree <ecree.xilinx@gmail.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-07-29 21:22:06 -07:00
Edward Cree	6f6838aabf	sfc: determine wire m-port at EF100 PF probe time Traffic delivered to the (MAE admin) PF could be from either the wire or a VF. The INGRESS_MPORT field of the RX prefix distinguishes these; base_mport is the value this field will have for traffic from the wire (which should be delivered to the PF's netdevice, not a representor). Signed-off-by: Edward Cree <ecree.xilinx@gmail.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-07-29 21:22:06 -07:00
Edward Cree	9fe00c800e	sfc: ef100 representor RX top half Representor RX uses a NAPI context driven by a 'fake interrupt': when the parent PF receives a packet destined for the representor, it adds it to an SKB list (efv->rx_list), and schedules NAPI if the 'fake interrupt' is primed. The NAPI poll then pulls packets off this list and feeds them to the stack with netif_receive_skb_list(). This scheme allows us to decouple representor RX from the parent PF's RX fast-path. This patch implements the 'top half', which builds an SKB, copies data into it from the RX buffer (which can then be released), adds it to the queue and fires the 'fake interrupt' if necessary. Signed-off-by: Edward Cree <ecree.xilinx@gmail.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-07-29 21:22:05 -07:00
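The 'fake interrupt' scheme described in the two representor RX commits above can be sketched in user space as follows. This is only an illustrative model, not the sfc driver's code: `struct rep_sketch`, `rep_rx_top()` and `rep_rx_poll()` are hypothetical names, a fixed ring of ints stands in for the SKB list, a counter stands in for scheduling NAPI, and the locking the real driver needs is omitted.

```c
#include <stdbool.h>
#include <stddef.h>

#define REP_RING 8 /* hypothetical queue depth (power of two) */

struct rep_sketch {
	int ring[REP_RING];     /* stands in for the SKB list (efv->rx_list) */
	size_t head, tail;      /* consumer and producer counters */
	bool primed;            /* is the 'fake interrupt' armed? */
	unsigned int scheduled; /* how many times a poll was requested */
};

/* Top half: queue the packet, and fire the fake interrupt only if primed.
 * Returns false if the queue is full and the packet is dropped. */
static bool rep_rx_top(struct rep_sketch *r, int pkt)
{
	if (r->tail - r->head == REP_RING)
		return false;
	r->ring[r->tail++ % REP_RING] = pkt;
	if (r->primed) {
		r->primed = false; /* one poll request per priming */
		r->scheduled++;    /* stand-in for scheduling NAPI */
	}
	return true;
}

/* Bottom half: drain up to 'budget' packets into 'out'. If the queue was
 * emptied within budget, re-prime the fake interrupt. */
static int rep_rx_poll(struct rep_sketch *r, int budget, int *out)
{
	int done = 0;

	while (done < budget && r->head != r->tail)
		out[done++] = r->ring[r->head++ % REP_RING];
	if (done < budget)
		r->primed = true;
	return done;
}
```

In this model, two packets queued back to back request only one poll, which is the decoupling from the parent PF's RX fast path that the commit message describes.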
Edward Cree	69bb5fa73d	sfc: ef100 representor RX NAPI poll This patch adds the 'bottom half' napi->poll routine for representor RX. See the next patch (with the top half) for an explanation of the 'fake interrupt' scheme used to drive this NAPI context. Signed-off-by: Edward Cree <ecree.xilinx@gmail.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-07-29 21:22:05 -07:00
Edward Cree	a95115c407	sfc: plumb ef100 representor stats Implement .ndo_get_stats64() method to read values out of struct efx_rep_sw_stats. Signed-off-by: Edward Cree <ecree.xilinx@gmail.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-07-29 21:22:05 -07:00
Linus Torvalds	620725263f	Merge tag 'mm-hotfixes-stable-2022-07-29' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull misc fixes from Andrew Morton: "Two hotfixes, both cc:stable" * tag 'mm-hotfixes-stable-2022-07-29' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: mm/hmm: fault non-owner device private entries page_alloc: fix invalid watermark check on a negative value	2022-07-29 21:02:35 -07:00
Dan Carpenter	71930846b3	net: marvell: prestera: uninitialized variable bug The "ret" variable needs to be initialized at the start. Fixes: `52323ef754` ("net: marvell: prestera: add phylink support") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Link: https://lore.kernel.org/r/YuKeBBuGtsmd7QdT@kili Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-07-29 20:27:39 -07:00
Yu Zhe	0f14a8351a	dn_route: replace "jiffies-now>0" with "jiffies!=now" Use "jiffies != now" to replace "jiffies - now > 0" to make code more readable. We want to put a limit on how long the loop can run for before rescheduling. Signed-off-by: Yu Zhe <yuzhe@nfschina.com> Link: https://lore.kernel.org/r/20220729061712.22666-1-yuzhe@nfschina.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-07-29 20:12:49 -07:00
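Why the two forms in the dn_route commit above are interchangeable can be shown with a small sketch. It assumes both operands are unsigned (as the kernel's jiffies counter is); `ticks_t`, `elapsed_old()` and `elapsed_new()` are hypothetical names used only for illustration.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the kernel's unsigned long tick counter. */
typedef unsigned long ticks_t;

/* Old form: unsigned subtraction wraps around instead of going negative,
 * so the difference is greater than zero whenever the operands differ. */
static bool elapsed_old(ticks_t jiffies, ticks_t now)
{
	return jiffies - now > 0;
}

/* New form: the same truth table, with the intent stated directly. */
static bool elapsed_new(ticks_t jiffies, ticks_t now)
{
	return jiffies != now;
}
```

Both functions return false only when the two values are equal, including when `jiffies` is numerically smaller than `now`.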
Jakub Kicinski	ff4970b130	Merge tag 'wireless-next-2022-07-29' of git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless-next Kalle Valo says: ==================== wireless-next patches for v5.20 Fourth set of patches for v5.20, last few patches before the merge window. Only driver changes this time, mostly just fixes and cleanup. Major changes: brcmfmac - support brcm,ccode-map-trivial DT property wcn36xx - add debugfs file to show firmware feature strings * tag 'wireless-next-2022-07-29' of git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless-next: (36 commits) wifi: rtw88: check the return value of alloc_workqueue() wifi: rtw89: 8852a: adjust IMR for SER L1 wifi: rtw89: 8852a: update RF radio A/B R56 wifi: wcn36xx: Add debugfs entry to read firmware feature strings wifi: wcn36xx: Move capability bitmap to string translation function to firmware.c wifi: wcn36xx: Move firmware feature bit storage to dedicated firmware.c file wifi: wcn36xx: Rename clunky firmware feature bit enum wifi: brcmfmac: prevent double-free on hardware-reset wifi: brcmfmac: support brcm,ccode-map-trivial DT property dt-bindings: bcm4329-fmac: add optional brcm,ccode-map-trivial wifi: brcmfmac: Replace default (not configured) MAC with a random MAC wifi: brcmfmac: Add brcmf_c_set_cur_etheraddr() helper wifi: brcmfmac: Remove #ifdef guards for PM related functions wifi: brcmfmac: use strreplace() in brcmf_of_probe() wifi: plfxlc: Use eth_zero_addr() to assign zero address wifi: wilc1000: use existing iftype variable to store the interface type wifi: wilc1000: add 'isinit' flag for SDIO bus similar to SPI wifi: wilc1000: cancel the connect operation during interface down wifi: wilc1000: get correct length of string WID from received config packet wifi: wilc1000: set station_info flag only when signal value is valid ... 
==================== Link: https://lore.kernel.org/r/20220729192832.A5011C433D6@smtp.kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-07-29 19:34:46 -07:00
Jakub Kicinski	5fc7c5887c	Merge https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next Andrii Nakryiko says: ==================== bpf-next 2022-07-29 We've added 22 non-merge commits during the last 4 day(s) which contain a total of 27 files changed, 763 insertions(+), 120 deletions(-). The main changes are: 1) Fixes to allow setting any source IP with bpf_skb_set_tunnel_key() helper, from Paul Chaignon. 2) Fix for bpf_xdp_pointer() helper when doing sanity checking, from Joanne Koong. 3) Fix for XDP frame length calculation, from Lorenzo Bianconi. 4) Libbpf BPF_KSYSCALL docs improvements and fixes to selftests to accommodate s390x quirks with socketcall(), from Ilya Leoshkevich. 5) Allow/denylist and CI configs additions to selftests/bpf to improve BPF CI, from Daniel Müller. 6) BPF trampoline + ftrace follow up fixes, from Song Liu and Xu Kuohai. 7) Fix allocation warnings in netdevsim, from Jakub Kicinski. 8) bpf_obj_get_opts() libbpf API allowing to provide file flags, from Joe Burton. 9) vsnprintf usage fix in bpf_snprintf_btf(), from Fedor Tokarev. 10) Various small fixes and clean ups, from Daniel Müller, Rongguang Wei, Jörn-Thorben Hinz, Yang Li. 
* https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (22 commits) bpf: Remove unneeded semicolon libbpf: Add bpf_obj_get_opts() netdevsim: Avoid allocation warnings triggered from user space bpf: Fix NULL pointer dereference when registering bpf trampoline bpf: Fix test_progs -j error with fentry/fexit tests selftests/bpf: Bump internal send_signal/send_signal_tracepoint timeout bpftool: Don't try to return value from void function in skeleton bpftool: Replace sizeof(arr)/sizeof(arr[0]) with ARRAY_SIZE macro bpf: btf: Fix vsnprintf return value check libbpf: Support PPC in arch_specific_syscall_pfx selftests/bpf: Adjust vmtest.sh to use local kernel configuration selftests/bpf: Copy over libbpf configs selftests/bpf: Sort configuration selftests/bpf: Attach to socketcall() in test_probe_user libbpf: Extend BPF_KSYSCALL documentation bpf, devmap: Compute proper xdp_frame len redirecting frames bpf: Fix bpf_xdp_pointer return pointer selftests/bpf: Don't assign outer source IP to host bpf: Set flow flag to allow any source IP in bpf_tunnel_key geneve: Use ip_tunnel_key flow flags in route lookups ... ==================== Link: https://lore.kernel.org/r/20220729230948.1313527-1-andrii@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-07-29 19:04:29 -07:00
Aaron Tomlin	b99695580b	scripts/gdb: ensure the absolute path is generated on initial source Post 'make scripts_gdb' a symbolic link to scripts/gdb/vmlinux-gdb.py is created. Currently 'os.path.dirname(__file__)' does not generate the absolute path to scripts/gdb resulting in the following: (gdb) source vmlinux-gdb.py Traceback (most recent call last): File "scripts/gdb/vmlinux-gdb.py", line 25, in <module> import linux.utils ModuleNotFoundError: No module named 'linux' This patch ensures that the absolute path to scripts/gdb in relation to the given file is generated so each module can be located accordingly. Link: https://lkml.kernel.org/r/20220712110248.1404125-1-atomlin@redhat.com Signed-off-by: Aaron Tomlin <atomlin@redhat.com> Reviewed-by: Douglas Anderson <dianders@chromium.org> Cc: Jan Kiszka <jan.kiszka@siemens.com> Cc: Kieran Bingham <kbingham@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2022-07-29 18:12:37 -07:00
Brendan Higgins	9f98911a9d	MAINTAINERS: kunit: add David Gow as a maintainer of KUnit David has been a de facto maintainer of KUnit for a long time now. Formalize this in the MAINTAINERS file. Link: https://lkml.kernel.org/r/20220725220737.790976-1-brendan.higgins@linux.dev Signed-off-by: Brendan Higgins <brendan.higgins@linux.dev> Reviewed-by: David Gow <davidgow@google.com> Cc: Shuah Khan <shuah@kernel.org> Cc: Daniel Latypov <dlatypov@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2022-07-29 18:12:36 -07:00
Brendan Higgins	9f3cebf0bb	mailmap: add linux.dev alias for Brendan Higgins Because of my new work remote setup at Google, I can no longer use command line tools with my google.com email address, for this reason I got a linux.dev account. So update the mailmap to show the new alias I will be using. Link: https://lkml.kernel.org/r/20220725215833.789133-1-brendan.higgins@linux.dev Signed-off-by: Brendan Higgins <brendan.higgins@linux.dev> Reviewed-by: David Gow <davidgow@google.com> Cc: Shuah Khan <shuah@kernel.org> Cc: Daniel Latypov <dlatypov@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2022-07-29 18:12:36 -07:00
Kirill Tkhai	50feece7f7	mailmap: update Kirill's email I disconnected from both Virtuozzo and OpenVZ, so this updates my email to point to my own. I haven't used @openvz address for patches, so let's rewrite the line instead of to add a new one. CC all previous addresses. Link: https://lkml.kernel.org/r/14ca895b-e745-6ba2-8be8-652feacbc907@ya.ru Signed-off-by: Kirill Tkhai <tkhai@ya.ru> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2022-07-29 18:12:36 -07:00
Ben Dooks	787dbea11a	profile: setup_profiling_timer() is moslty not implemented The setup_profiling_timer() is mostly un-implemented by many architectures. In many places it isn't guarded by CONFIG_PROFILE which is needed for it to be used. Make it a weak symbol in kernel/profile.c and remove the 'return -EINVAL' implementations from the kernel. There are a couple of architectures which do return 0 from the setup_profiling_timer() function but they don't seem to do anything else with it. To keep the /proc compatibility for now, leave these for a future update or removal. On ARM, this fixes the following sparse warning: arch/arm/kernel/smp.c:793:5: warning: symbol 'setup_profiling_timer' was not declared. Should it be static? Link: https://lkml.kernel.org/r/20220721195509.418205-1-ben-linux@fluff.org Signed-off-by: Ben Dooks <ben-linux@fluff.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2022-07-29 18:12:36 -07:00
Christophe JAILLET	45ee6d1e93	ocfs2: fix a typo in a comment s/heartbaet/heartbeat Link: https://lkml.kernel.org/r/4d4a6786e8ad522bfad6d2401b7f6634f8af0e5d.1658436259.git.christophe.jaillet@wanadoo.fr Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com> Cc: Mark Fasheh <mark@fasheh.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Junxiao Bi <junxiao.bi@oracle.com> Cc: Changwei Ge <gechangwei@live.cn> Cc: Gang He <ghe@suse.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2022-07-29 18:12:36 -07:00
Christophe JAILLET	702f3cf374	ocfs2: use the bitmap API to simplify code Use bitmap_zero() instead of hand-writing it. It is less verbose. While at it, add an explicit #include <linux/bitmap.h>. Link: https://lkml.kernel.org/r/86d2a027c319db12055c98f00c65f7d01e703722.1658436259.git.christophe.jaillet@wanadoo.fr Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com> Cc: Mark Fasheh <mark@fasheh.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Junxiao Bi <junxiao.bi@oracle.com> Cc: Changwei Ge <gechangwei@live.cn> Cc: Gang He <ghe@suse.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2022-07-29 18:12:36 -07:00
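As a rough user-space illustration of what the bitmap helper in the ocfs2 commit above replaces, the sketch below clears a bitmap stored as an array of unsigned long. The `_SKETCH`/`_sketch` names are hypothetical stand-ins, not the kernel's own definitions, and the hand-written code the commit removed is not shown in this log.

```c
#include <limits.h>
#include <string.h>

/* Hypothetical user-space equivalents of the kernel's bitmap sizing macros. */
#define BITS_PER_LONG_SKETCH (sizeof(unsigned long) * CHAR_BIT)
#define BITS_TO_LONGS_SKETCH(n) \
	(((n) + BITS_PER_LONG_SKETCH - 1) / BITS_PER_LONG_SKETCH)

/* Clear every word that backs an nbits-wide bitmap. */
static void bitmap_zero_sketch(unsigned long *dst, unsigned int nbits)
{
	memset(dst, 0, BITS_TO_LONGS_SKETCH(nbits) * sizeof(unsigned long));
}
```

A caller sizes the array with the same macro it passes to the helper, so the word count is computed in one place rather than repeated at each call site.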
Christophe JAILLET	97d3b2676f	ocfs2: remove some useless functions Patch series "ocfs2: A few clean_ups", v2. __ocfs2_node_map_set_bit() and __ocfs2_node_map_clear_bit() are just wrapper around set_bit() and clear_bit(). The leading __ also makes think that these functions are non-atomic just like __set_bit() and __clear_bit(). So, just remove these wrappers and call set_bit() and clear_bit() directly. Link: https://lkml.kernel.org/r/cover.1658436259.git.christophe.jaillet@wanadoo.fr Link: https://lkml.kernel.org/r/bd1429c84ec7d174c96dbb67a2b42b1b456d9394.1658436259.git.christophe.jaillet@wanadoo.fr Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com> Cc: Mark Fasheh <mark@fasheh.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Junxiao Bi <junxiao.bi@oracle.com> Cc: Changwei Ge <gechangwei@live.cn> Cc: Gang He <ghe@suse.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2022-07-29 18:12:35 -07:00
Slark Xiao	cf069c3b47	lib/mpi: fix typo 'the the' in comment Replace 'the the' with 'the' in the comment. Link: https://lkml.kernel.org/r/20220722101922.81126-1-slark_xiao@163.com Signed-off-by: Slark Xiao <slark_xiao@163.com> Cc: Hongbo Li <herberthbli@tencent.com> Cc: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2022-07-29 18:12:35 -07:00
Alexey Dobriyan	ed8fb78d7e	proc: add some (hopefully) insightful comments * /proc/${pid}/net status * removing PDE vs last close stuff (again!) * random small stuff Link: https://lkml.kernel.org/r/YtwrM6sDC0OQ53YB@localhost.localdomain Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2022-07-29 18:12:35 -07:00
Xiu Jianfeng	fa7d574ba4	bdi: remove enum wb_congested_state enum wb_congested_state and the member 'congested' in bdi_writeback are useless since commit `a88f2096d5` ("remove congestion tracking framework"), so remove it. Link: https://lkml.kernel.org/r/20220719083349.87547-1-xiujianfeng@huawei.com Signed-off-by: Xiu Jianfeng <xiujianfeng@huawei.com> Reviewed-by: Jan Kara <jack@suse.cz> Cc: NeilBrown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2022-07-29 18:12:35 -07:00
Ben Dooks	591c32bddb	kernel/hung_task: fix address space of proc_dohung_task_timeout_secs The proc_dohung_task_timeout_secs() function is incorrectly marked as having a __user buffer as argument 3. However this is not the case and it is causing multiple sparse warnings. Fix the following warnings by removing __user from the argument: kernel/hung_task.c:237:52: warning: incorrect type in argument 3 (different address spaces) kernel/hung_task.c:237:52: expected void * kernel/hung_task.c:237:52: got void [noderef] __user *buffer kernel/hung_task.c:287:35: warning: incorrect type in initializer (incompatible argument 3 (different address spaces)) kernel/hung_task.c:287:35: expected int ( [usertype] *proc_handler )( ... ) kernel/hung_task.c:287:35: got int ( * )( ... ) kernel/hung_task.c:295:35: warning: incorrect type in initializer (incompatible argument 3 (different address spaces)) kernel/hung_task.c:295:35: expected int ( [usertype] *proc_handler )( ... ) kernel/hung_task.c:295:35: got int ( * )( ... ) Link: https://lkml.kernel.org/r/20220714074744.189017-1-ben.dooks@sifive.com Signed-off-by: Ben Dooks <ben.dooks@sifive.com> Cc: <Conor.Dooley@microchip.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2022-07-29 18:12:35 -07:00
Jiangshan Yi	a10c9ede99	lib/lzo/lzo1x_compress.c: replace ternary operator with min() and min_t() Fix the following coccicheck warning: lib/lzo/lzo1x_compress.c:54: WARNING opportunity for min(). lib/lzo/lzo1x_compress.c:329: WARNING opportunity for min(). min() and min_t() macro is defined in include/linux/minmax.h. It avoids multiple evaluations of the arguments when non-constant and performs strict type-checking. Link: https://lkml.kernel.org/r/20220714015441.1313036-1-13667453960@163.com Signed-off-by: Jiangshan Yi <yijiangshan@kylinos.cn> Tested-by: Dave Rodgman <dave.rodgman@arm.com> Cc: Jonathan Corbet <corbet@lwn.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2022-07-29 18:12:34 -07:00
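The multiple-evaluation point in the lzo commit above can be illustrated with a small sketch. `NAIVE_MIN` and `MIN_ONCE` are hypothetical macros written for this example, not the kernel's min(); `MIN_ONCE` relies on the GNU C statement-expression and `__typeof__` extensions (GCC or Clang), and it omits the strict type-checking that the commit message attributes to the kernel macro.

```c
/* Ternary form: an argument with side effects may be evaluated twice. */
#define NAIVE_MIN(a, b) ((a) < (b) ? (a) : (b))

/* Single-evaluation sketch: each argument is copied into a local once. */
#define MIN_ONCE(a, b) ({        \
	__typeof__(a) _a = (a);  \
	__typeof__(b) _b = (b);  \
	_a < _b ? _a : _b;       \
})

static int calls; /* counts how often next_value() runs */

static int next_value(void)
{
	return ++calls;
}

/* Returns the number of times the argument was evaluated by NAIVE_MIN. */
static int naive_calls(void)
{
	calls = 0;
	(void)NAIVE_MIN(next_value(), 10);
	return calls;
}

/* Returns the number of times the argument was evaluated by MIN_ONCE. */
static int once_calls(void)
{
	calls = 0;
	(void)MIN_ONCE(next_value(), 10);
	return calls;
}
```

With the ternary form the function call runs once for the comparison and again to produce the result; the statement-expression form runs it exactly once.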
Phillip Lougher	b09a7a036d	squashfs: support reading fragments in readahead call Add a function which can be used to read fragments in the readahead call. This function is necessary because filesystems built with the -tailends (or -always-use-fragments) option may have fragments present which cannot be currently handled. Link: https://lkml.kernel.org/r/20220617083810.337573-5-hsinyi@chromium.org Signed-off-by: Phillip Lougher <phillip@squashfs.org.uk> Signed-off-by: Hsin-Yi Wang <hsinyi@chromium.org> Cc: Hou Tao <houtao1@huawei.com> Cc: kernel test robot <lkp@intel.com> Cc: Marek Szyprowski <m.szyprowski@samsung.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Miao Xie <miaoxie@huawei.com> Cc: Xiongwei Song <Xiongwei.Song@windriver.com> Cc: Zhang Yi <yi.zhang@huawei.com> Cc: Zheng Liang <zhengliang6@huawei.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2022-07-29 18:12:34 -07:00
Hsin-Yi Wang	8fc78b6fe2	squashfs: implement readahead Implement readahead callback for squashfs. It will read datablocks which cover pages in readahead request. For a few cases it will not mark page as uptodate, including: - file end is 0. - zero filled blocks. - current batch of pages isn't in the same datablock. - decompressor error. Otherwise pages will be marked as uptodate. The unhandled pages will be updated by readpage later. Link: https://lkml.kernel.org/r/20220617083810.337573-4-hsinyi@chromium.org Signed-off-by: Hsin-Yi Wang <hsinyi@chromium.org> Suggested-by: Matthew Wilcox <willy@infradead.org> Reported-by: Matthew Wilcox <willy@infradead.org> Reported-by: Phillip Lougher <phillip@squashfs.org.uk> Reported-by: Xiongwei Song <Xiongwei.Song@windriver.com> Reported-by: Andrew Morton <akpm@linux-foundation.org> Cc: Hou Tao <houtao1@huawei.com> Cc: kernel test robot <lkp@intel.com> Cc: Marek Szyprowski <m.szyprowski@samsung.com> Cc: Miao Xie <miaoxie@huawei.com> Cc: Zhang Yi <yi.zhang@huawei.com> Cc: Zheng Liang <zhengliang6@huawei.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2022-07-29 18:12:34 -07:00
Phillip Lougher	db98b43086	squashfs: always build "file direct" version of page actor Squashfs_readahead uses the "file direct" version of the page actor, and so build it unconditionally. Link: https://lkml.kernel.org/r/20220617083810.337573-3-hsinyi@chromium.org Signed-off-by: Phillip Lougher <phillip@squashfs.org.uk> Signed-off-by: Hsin-Yi Wang <hsinyi@chromium.org> Reported-by: kernel test robot <lkp@intel.com> Cc: Hou Tao <houtao1@huawei.com> Cc: Marek Szyprowski <m.szyprowski@samsung.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Miao Xie <miaoxie@huawei.com> Cc: Xiongwei Song <Xiongwei.Song@windriver.com> Cc: Zhang Yi <yi.zhang@huawei.com> Cc: Zheng Liang <zhengliang6@huawei.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2022-07-29 18:12:34 -07:00
Hsin-Yi Wang	0c12185728	Revert "squashfs: provide backing_dev_info in order to disable read-ahead" Patch series "Implement readahead for squashfs", v7. Commit 9eec1d897139("squashfs: provide backing_dev_info in order to disable read-ahead") mitigates the performance drop issue for squashfs by closing readahead for it. This series implements readahead callback for squashfs. This patch (of 4): This reverts `9eec1d8971` ("squashfs: provide backing_dev_info in order to disable read-ahead"). Revert closing the readahead to squashfs since the readahead callback for squashfs is implemented. Link: https://lkml.kernel.org/r/20220617083810.337573-1-hsinyi@chromium.org Link: https://lkml.kernel.org/r/20220617083810.337573-2-hsinyi@chromium.org Signed-off-by: Hsin-Yi Wang <hsinyi@chromium.org> Suggested-by: Xiongwei Song <Xiongwei.Song@windriver.com> Cc: Phillip Lougher <phillip@squashfs.org.uk> Cc: Matthew Wilcox <willy@infradead.org> Cc: Marek Szyprowski <m.szyprowski@samsung.com> Cc: Zheng Liang <zhengliang6@huawei.com> Cc: Zhang Yi <yi.zhang@huawei.com> Cc: Hou Tao <houtao1@huawei.com> Cc: Miao Xie <miaoxie@huawei.com> Cc: kernel test robot <lkp@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2022-07-29 18:12:34 -07:00
Sophia Gabriella	1a44131d4f	mm: Kconfig: fix typo Fixes a typo in the help section for ZSWAP. Link: https://lkml.kernel.org/r/Message-ID: Signed-off-by: Sophia Gabriella <sophia.gabriellla@outlook.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2022-07-29 18:07:20 -07:00
Kefeng Wang	96f96763de	mm: memory-failure: convert to pr_fmt() Use pr_fmt to prefix all pr_<level> output, but unpoison_memory() and soft_offline_page() are used by error injection, which have own prefixes like "Unpoison:" and "soft offline:", meanwhile, soft_offline_page() could be used by memory hotremove, so reset pr_fmt before unpoison_pr_info definition to keep the original output for them. [wangkefeng.wang@huawei.com: v3] Link: https://lkml.kernel.org/r/20220729031919.72331-1-wangkefeng.wang@huawei.com Link: https://lkml.kernel.org/r/20220726081046.10742-1-wangkefeng.wang@huawei.com Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com> Acked-by: Naoya Horiguchi <naoya.horiguchi@nec.com> Reviewed-by: Miaohe Lin <linmiaohe@huawei.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2022-07-29 18:07:20 -07:00
Kefeng Wang	07252dfea2	mm: use is_zone_movable_page() helper Use is_zone_movable_page() helper to simplify code. Link: https://lkml.kernel.org/r/20220726131135.146912-1-wangkefeng.wang@huawei.com Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com> Reviewed-by: David Hildenbrand <david@redhat.com> Reviewed-by: Pankaj Gupta <pankaj.gupta@amd.com> Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2022-07-29 18:07:20 -07:00
Miaohe Lin	1168076345	hugetlbfs: fix inaccurate comment in hugetlbfs_statfs() In some cases, e.g. when size option is not specified, f_blocks, f_bavail and f_bfree will be set to -1 instead of 0. Likewise, when nr_inodes isn't specified, f_files and f_ffree will be set to -1 too. Update the comment to make this clear. Link: https://lkml.kernel.org/r/20220726142918.51693-6-linmiaohe@huawei.com Signed-off-by: Miaohe Lin <linmiaohe@huawei.com> Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com> Reviewed-by: Muchun Song <songmuchun@bytedance.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2022-07-29 18:07:19 -07:00
Miaohe Lin	445c809829	hugetlbfs: cleanup some comments in inode.c The function generic_file_buffered_read has been renamed to filemap_read since commit `87fa0f3eb2` ("mm/filemap: rename generic_file_buffered_read to filemap_read"). Update the corresponding comment. And duplicated taken in hugetlbfs_fill_super is removed. Link: https://lkml.kernel.org/r/20220726142918.51693-5-linmiaohe@huawei.com Signed-off-by: Miaohe Lin <linmiaohe@huawei.com> Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com> Reviewed-by: Muchun Song <songmuchun@bytedance.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2022-07-29 18:07:19 -07:00
Miaohe Lin	990e52b17d	hugetlbfs: remove unneeded header file The header file signal.h is unneeded now. Remove it. Link: https://lkml.kernel.org/r/20220726142918.51693-4-linmiaohe@huawei.com Signed-off-by: Miaohe Lin <linmiaohe@huawei.com> Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com> Reviewed-by: Muchun Song <songmuchun@bytedance.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2022-07-29 18:07:19 -07:00
Miaohe Lin	7ec3c362cf	hugetlbfs: remove unneeded hugetlbfs_ops forward declaration The forward declaration for hugetlbfs_ops is unnecessary. Remove it. Link: https://lkml.kernel.org/r/20220726142918.51693-3-linmiaohe@huawei.com Signed-off-by: Miaohe Lin <linmiaohe@huawei.com> Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com> Reviewed-by: Muchun Song <songmuchun@bytedance.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2022-07-29 18:07:19 -07:00
Miaohe Lin	d00365175e	hugetlbfs: use helper macro SZ_1{K,M} Patch series "A few cleanup and fixup patches for hugetlbfs", v2. This series contains a few cleanup patches to remove unneeded forward declaration, use helper macro and so on. More details can be found in the respective changelogs. This patch (of 5): Use helper macro SZ_1K and SZ_1M to do the size conversion. Minor readability improvement. Link: https://lkml.kernel.org/r/20220726142918.51693-1-linmiaohe@huawei.com Link: https://lkml.kernel.org/r/20220726142918.51693-2-linmiaohe@huawei.com Signed-off-by: Miaohe Lin <linmiaohe@huawei.com> Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com> Reviewed-by: Muchun Song <songmuchun@bytedance.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2022-07-29 18:07:19 -07:00
Kefeng Wang	bb077c3ffd	mm: cleanup is_highmem() It is unnecessary to add CONFIG_HIGHMEM check in is_highmem(), which has been done in is_highmem_idx(), and move is_highmem() close to is_highmem_idx(). This has no functional impact. Link: https://lkml.kernel.org/r/20220726131816.149075-1-wangkefeng.wang@huawei.com Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2022-07-29 18:07:19 -07:00
Ralph Campbell	f6c3e1ae01	mm/hmm: add a test for cross device private faults Add a simple test case for when hmm_range_fault() is called with the HMM_PFN_REQ_FAULT flag and a device private PTE is found for a device other than the hmm_range::dev_private_owner. This should cause the page to be faulted back to system memory from the other device and the PFN returned in the output array. Also, remove a piece of code that unnecessarily unmaps part of the buffer. Link: https://lkml.kernel.org/r/20220727000837.4128709-3-rcampbell@nvidia.com Link: https://lkml.kernel.org/r/20220725183615.4118795-3-rcampbell@nvidia.com Signed-off-by: Ralph Campbell <rcampbell@nvidia.com> Reviewed-by: Alistair Popple <apopple@nvidia.com> Cc: Felix Kuehling <felix.kuehling@amd.com> Cc: Philip Yang <Philip.Yang@amd.com> Cc: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2022-07-29 18:07:18 -07:00
Peter Xu	68deb82a7b	selftests: add soft-dirty into run_vmtests.sh Link: https://lkml.kernel.org/r/20220725142048.30450-4-peterx@redhat.com Signed-off-by: Peter Xu <peterx@redhat.com> Reviewed-by: David Hildenbrand <david@redhat.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Nadav Amit <nadav.amit@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2022-07-29 18:07:18 -07:00
Peter Xu	c942f5bd17	selftests: soft-dirty: add test for mprotect Add two soft-dirty test cases for mprotect() on both anon or file. Link: https://lkml.kernel.org/r/20220725142048.30450-3-peterx@redhat.com Signed-off-by: Peter Xu <peterx@redhat.com> Reviewed-by: David Hildenbrand <david@redhat.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Nadav Amit <nadav.amit@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2022-07-29 18:07:18 -07:00
Peter Xu	76aefad628	mm/mprotect: fix soft-dirty check in can_change_pte_writable() Patch series "mm/mprotect: Fix soft-dirty checks", v4. This patch (of 3): The check wanted to make sure when soft-dirty tracking is enabled we won't grant write bit by accident, as a page fault is needed for dirty tracking. The intention is correct but we didn't check it right because VM_SOFTDIRTY set actually means soft-dirty tracking disabled. Fix it. There's another thing tricky about soft-dirty is that, we can't check the vma flag !(vma_flags & VM_SOFTDIRTY) directly but only check it after we checked CONFIG_MEM_SOFT_DIRTY because otherwise VM_SOFTDIRTY will be defined as zero, and !(vma_flags & VM_SOFTDIRTY) will constantly return true. To avoid misuse, introduce a helper for checking whether vma has soft-dirty tracking enabled. We can easily verify this with any exclusive anonymous page, like program below:
=======8<======
#include <stdio.h>
#include <unistd.h>
#include <stdlib.h>
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <sys/types.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <unistd.h>
#include <fcntl.h>
#include <stdbool.h>

#define BIT_ULL(nr) (1ULL << (nr))
#define PM_SOFT_DIRTY BIT_ULL(55)

unsigned int psize;
char *page;

uint64_t pagemap_read_vaddr(int fd, void *vaddr)
{
	uint64_t value;
	int ret;

	ret = pread(fd, &value, sizeof(uint64_t),
		    ((uint64_t)vaddr >> 12) * sizeof(uint64_t));
	assert(ret == sizeof(uint64_t));

	return value;
}

void clear_refs_write(void)
{
	int fd = open("/proc/self/clear_refs", O_RDWR);

	assert(fd >= 0);
	write(fd, "4", 2);
	close(fd);
}

#define check_soft_dirty(str, expect) do { \
	bool dirty = pagemap_read_vaddr(fd, page) & PM_SOFT_DIRTY; \
	if (dirty != expect) { \
		printf("ERROR: %s, soft-dirty=%d (expect: %d)\n", str, dirty, expect); \
		exit(-1); \
	} \
} while (0)

int main(void)
{
	int fd = open("/proc/self/pagemap", O_RDONLY);

	assert(fd >= 0);
	psize = getpagesize();
	page = mmap(NULL, psize, PROT_READ|PROT_WRITE,
		    MAP_ANONYMOUS|MAP_PRIVATE, -1, 0);
	assert(page != MAP_FAILED);

	*page = 1;
	check_soft_dirty("Just faulted in page", 1);
	clear_refs_write();
	check_soft_dirty("Clear_refs written", 0);
	mprotect(page, psize, PROT_READ);
	check_soft_dirty("Marked RO", 0);
	mprotect(page, psize, PROT_READ|PROT_WRITE);
	check_soft_dirty("Marked RW", 0);
	*page = 2;
	check_soft_dirty("Wrote page again", 1);

	munmap(page, psize);
	close(fd);
	printf("Test passed.\n");

	return 0;
}
=======8<======
Here we attach a Fixes to commit `64fe24a3e0` only for easy tracking, as this patch won't apply to a tree before that point. However the commit wasn't the source of problem, but instead `64e455079e`. It's just that after `64fe24a3e0` anonymous memory will also suffer from this problem with mprotect(). Link: https://lkml.kernel.org/r/20220725142048.30450-1-peterx@redhat.com Link: https://lkml.kernel.org/r/20220725142048.30450-2-peterx@redhat.com Fixes: `64e455079e` ("mm: softdirty: enable write notifications on VMAs after VM_SOFTDIRTY cleared") Fixes: `64fe24a3e0` ("mm/mprotect: try avoiding write faults for exclusive anonymous pages when changing protection") Signed-off-by: Peter Xu <peterx@redhat.com> Reviewed-by: David Hildenbrand <david@redhat.com> Cc: Nadav Amit <nadav.amit@gmail.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2022-07-29 18:07:18 -07:00
Tetsuo Handa	68aaee147e	mm: memcontrol: fix potential oom_lock recursion deadlock syzbot is reporting GFP_KERNEL allocation with oom_lock held when reporting memcg OOM [1]. If this allocation triggers the global OOM situation then the system can livelock because the GFP_KERNEL allocation with oom_lock held cannot trigger the global OOM killer because __alloc_pages_may_oom() fails to hold oom_lock. Fix this problem by removing the allocation from memory_stat_format() completely, and pass static buffer when calling from memcg OOM path. Note that the caller holding filesystem lock was the trigger for syzbot to report this locking dependency. Doing GFP_KERNEL allocation with filesystem lock held can deadlock the system even without involving OOM situation. Link: https://syzkaller.appspot.com/bug?extid=2d2aeadc6ce1e1f11d45 [1] Link: https://lkml.kernel.org/r/86afb39f-8c65-bec2-6cfc-c5e3cd600c0b@I-love.SAKURA.ne.jp Fixes: `c8713d0b23` ("mm: memcontrol: dump memory.stat during cgroup OOM") Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Reported-by: syzbot <syzbot+2d2aeadc6ce1e1f11d45@syzkaller.appspotmail.com> Suggested-by: Michal Hocko <mhocko@suse.com> Acked-by: Michal Hocko <mhocko@suse.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Roman Gushchin <roman.gushchin@linux.dev> Cc: Shakeel Butt <shakeelb@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2022-07-29 18:07:18 -07:00
Alistair Popple	65974cb910	mm/gup.c: fix formatting in check_and_migrate_movable_page() Commit `b05a79d437` ("mm/gup: migrate device coherent pages when pinning instead of failing") added a badly formatted if statement. Fix it. Link: https://lkml.kernel.org/r/20220721020552.1397598-2-apopple@nvidia.com Signed-off-by: Alistair Popple <apopple@nvidia.com> Reported-by: David Hildenbrand <david@redhat.com> Reviewed-by: David Hildenbrand <david@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2022-07-29 18:07:18 -07:00
Shiyang Ruan	35fcd75af3	xfs: fail dax mount if reflink is enabled on a partition Failure notification is not supported on partitions. So, when we mount a reflink enabled xfs on a partition with dax option, let it fail with -EINVAL code. Link: https://lkml.kernel.org/r/20220609143435.393724-1-ruansy.fnst@fujitsu.com Signed-off-by: Shiyang Ruan <ruansy.fnst@fujitsu.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Cc: Dave Chinner <david@fromorbit.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2022-07-29 18:07:17 -07:00
Jiebin Sun	873f64b791	mm/memcontrol.c: remove the redundant updating of stats_flush_threshold Remove the redundant updating of stats_flush_threshold. If the global var stats_flush_threshold has exceeded the trigger value for __mem_cgroup_flush_stats, further increment is unnecessary. Apply the patch and test the pts/hackbench-1.0.0 Count:4 (160 threads). Score gain: 1.95x Reduce CPU cycles in __mod_memcg_lruvec_state (44.88% -> 0.12%) CPU: ICX 8380 x 2 sockets Core number: 40 x 2 physical cores Benchmark: pts/hackbench-1.0.0 Count:4 (160 threads) Link: https://lkml.kernel.org/r/20220722164949.47760-1-jiebin.sun@intel.com Signed-off-by: Jiebin Sun <jiebin.sun@intel.com> Acked-by: Shakeel Butt <shakeelb@google.com> Reviewed-by: Roman Gushchin <roman.gushchin@linux.dev> Reviewed-by: Tim Chen <tim.c.chen@linux.intel.com> Acked-by: Muchun Song <songmuchun@bytedance.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Michal Hocko <mhocko@kernel.org> Cc: "Huang, Ying" <ying.huang@intel.com> Cc: Amadeusz Sławiński <amadeuszx.slawinski@linux.intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2022-07-29 18:07:17 -07:00
Axel Rasmussen	914eedcb9b	userfaultfd: don't fail on unrecognized features The basic interaction for setting up a userfaultfd is, userspace issues a UFFDIO_API ioctl, and passes in a set of zero or more feature flags, indicating the features they would prefer to use. Of course, different kernels may support different sets of features (depending on kernel version, kconfig options, architecture, etc). Userspace's expectations may also not match: perhaps it was built against newer kernel headers, which defined some features the kernel it's running on doesn't support. Currently, if userspace passes in a flag we don't recognize, the initialization fails and we return -EINVAL. This isn't great, though. Userspace doesn't have an obvious way to react to this; sure, one of the features I asked for was unavailable, but which one? The only option it has is to turn off things "at random" and hope something works. Instead, modify UFFDIO_API to just ignore any unrecognized feature flags. The interaction is now that the initialization will succeed, and as always we return the subset of feature flags that can actually be used back to userspace. Now userspace has an obvious way to react: it checks if any flags it asked for are missing. If so, it can conclude this kernel doesn't support those, and it can either resign itself to not using them, or fail with an error on its own, or whatever else. Link: https://lkml.kernel.org/r/20220722201513.1624158-1-axelrasmussen@google.com Signed-off-by: Axel Rasmussen <axelrasmussen@google.com> Cc: Peter Xu <peterx@redhat.com> Cc: Axel Rasmussen <axelrasmussen@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2022-07-29 18:07:17 -07:00
Miaohe Lin	2727cfe407	hugetlb_cgroup: fix wrong hugetlb cgroup numa stat We forget to set cft->private for numa stat file. As a result, numa stat of hstates[0] is always showed for all hstates. Encode the hstates index into cft->private to fix this issue. Link: https://lkml.kernel.org/r/20220723073804.53035-1-linmiaohe@huawei.com Fixes: `f477619990` ("hugetlb: add hugetlb.*.numa_stat file") Signed-off-by: Miaohe Lin <linmiaohe@huawei.com> Acked-by: Muchun Song <songmuchun@bytedance.com> Cc: Kees Cook <keescook@chromium.org> Cc: Mike Kravetz <mike.kravetz@oracle.com> Cc: Mina Almasry <almasrymina@google.com> Cc: Shakeel Butt <shakeelb@google.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2022-07-29 18:07:17 -07:00
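
The userfaultfd change in 914eedcb9b means that UFFDIO_API now succeeds even when unknown feature bits are requested, and userspace is expected to compare what it asked for with what came back. A minimal sketch of that comparison follows. The helper names and the EX_FEATURE_* bits are hypothetical and exist only for illustration; a real program would pass the UFFD_FEATURE_* constants from <linux/userfaultfd.h> and read the granted mask from the features field the kernel fills in after the ioctl.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical feature bits, for illustration only. They stand in for
 * the real UFFD_FEATURE_* flags and carry no kernel meaning.
 */
#define EX_FEATURE_A (UINT64_C(1) << 0)
#define EX_FEATURE_B (UINT64_C(1) << 1)
#define EX_FEATURE_C (UINT64_C(1) << 2)

/*
 * Return the feature bits that were requested but are absent from the
 * mask the kernel handed back. A result of zero means every requested
 * feature is usable.
 */
uint64_t uffd_missing_features(uint64_t requested, uint64_t granted)
{
    return requested & ~granted;
}

/*
 * One possible policy (an assumption, not something the commit
 * mandates): continue only if every feature the caller treats as
 * required was granted. Optional features may be missing.
 */
bool uffd_can_proceed(uint64_t required, uint64_t granted)
{
    return uffd_missing_features(required, granted) == 0;
}
```

With this split, a caller can request required and optional features together, fail with its own error when uffd_can_proceed() returns false, and otherwise disable whichever optional features uffd_missing_features() reports.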

1122266 Commits