When executing following commands like what document said, but the log
"#### all functions enabled ####" was not shown as expect:
1. Set a 'mod' filter:
$ echo 'write*:mod:ext3' > /sys/kernel/tracing/set_ftrace_filter
2. Invert above filter:
$ echo '!write*:mod:ext3' >> /sys/kernel/tracing/set_ftrace_filter
3. Read the file:
$ cat /sys/kernel/tracing/set_ftrace_filter
By some debugging, I found that flag FTRACE_HASH_FL_MOD was not unset
after inversion like above step 2 and then result of ftrace_hash_empty()
is incorrect.
Link: https://lkml.kernel.org/r/20220926152008.2239274-1-zhengyejian1@huawei.com
Cc: <mingo@redhat.com>
Cc: stable@vger.kernel.org
Fixes: 8c08f0d5c6 ("ftrace: Have cached module filters be an active filter")
Signed-off-by: Zheng Yejian <zhengyejian1@huawei.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
An unused macro reported by [-Wunused-macros].
This macro is used to access the sp in pt_regs because at that time
x86_32 can only get sp by kernel_stack_pointer(regs).
'3c88c692c287 ("x86/stackframe/32: Provide consistent pt_regs")'
This commit have unified the pt_regs and from them we can get sp from
pt_regs with regs->sp easily. Nowhere is using this macro anymore.
Refrencing pt_regs directly is more clear. Remove this macro for
code cleaning.
Link: https://lkml.kernel.org/r/20220924072629.104759-1-chenzhongjin@huawei.com
Signed-off-by: Chen Zhongjin <chenzhongjin@huawei.com>
Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
It was found that some tracing functions in kernel/trace/trace.c acquire
an arch_spinlock_t with preemption and irqs enabled. An example is the
tracing_saved_cmdlines_size_read() function which intermittently causes
a "BUG: using smp_processor_id() in preemptible" warning when the LTP
read_all_proc test is run.
That can be problematic in case preemption happens after acquiring the
lock. Add the necessary preemption or interrupt disabling code in the
appropriate places before acquiring an arch_spinlock_t.
The convention here is to disable preemption for trace_cmdline_lock and
interupt for max_lock.
Link: https://lkml.kernel.org/r/20220922145622.1744826-1-longman@redhat.com
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Will Deacon <will@kernel.org>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: stable@vger.kernel.org
Fixes: a35873a099 ("tracing: Add conditional snapshot")
Fixes: 939c7a4f04 ("tracing: Introduce saved_cmdlines_size file")
Suggested-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Waiman Long <longman@redhat.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Drop the requirement for system-wide kernel UAPI headers to provide full
struct btf_enum64 definition. This is an unexpected requirement that
slipped in libbpf 1.0 and put unnecessary pressure ([0]) on users to have
a bleeding-edge kernel UAPI header from unreleased Linux 6.0.
To achieve this, we forward declare struct btf_enum64. But that's not
enough as there is btf_enum64_value() helper that expects to know the
layout of struct btf_enum64. So we get a bit creative with
reinterpreting memory layout as array of __u32 and accesing lo32/hi32
fields as array elements. Alternative way would be to have a local
pointer variable for anonymous struct with exactly the same layout as
struct btf_enum64, but that gets us into C++ compiler errors complaining
about invalid type casts. So play it safe, if ugly.
[0] Closes: https://github.com/libbpf/libbpf/issues/562
Fixes: d90ec262b3 ("libbpf: Add enum64 support for btf_dump")
Reported-by: Toke Høiland-Jørgensen <toke@toke.dk>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Toke Høiland-Jørgensen <toke@toke.dk>
Link: https://lore.kernel.org/bpf/20220927042940.147185-1-andrii@kernel.org
In the v2022.10 reference design, the seg registers are going to be
changed, resulting in a required change to the memory map in Linux.
A small 4M reservation is made at the end of 32-bit DDR to provide some
memory for the HSS to use, so that it can cache its payload.bin between
reboots of a specific context.
Signed-off-by: Conor Dooley <conor.dooley@microchip.com>
For the v2022.09 release of the reference design, the fic3 clock rate
been reduced from 62.5 MHz to 50 MHz as it allows timing to be closed
significantly more quickly by customers who chose to build the
reference design themselves.
Signed-off-by: Conor Dooley <conor.dooley@microchip.com>
When users try to add onto the reference design, they find that the
current addresses that peripherals connected to Fabric InterConnect
(FIC) 3 use are restrictive. For the v2022.09 reference design, the
peripherals have been shifted down, leaving more contiguous address
space for their custom IP/peripherals.
Signed-off-by: Conor Dooley <conor.dooley@microchip.com>
For the v2022.09 reference design the PCI root port's data region has
been moved to FIC1 from FIC0. This is a shorter path, allowing for
higher clock rates and improved through-put. As a result, the address at
which the PCIe's data region appears to the core complex has changed.
The config region's address is unchanged.
As FIC0 is no longer used, its clock can be removed too.
Signed-off-by: Conor Dooley <conor.dooley@microchip.com>
In today's edition of moving things around:
The PCIe root port on PolarFire SoC is more part of the FPGA than of
the Core Complex. It is located on the other side of the chip and,
apart from its interrupts, most of its configuration is determined
by the FPGA bitstream rather. This includes:
- address translation in both directions
- the addresses at which the config and data regions appear to the
core complex
- the clocks used by the AXI bus
- the plic interrupt used
Moving the PCIe node to the -fabric.dtsi makes it clearer than a
singular configuration for root port is not correct & allows the
base SoC dtsi to be more easily included.
Signed-off-by: Conor Dooley <conor.dooley@microchip.com>
The recently removed, accidentally included, "matr0" property was used
in place of a dma-ranges property. The PCI controller is non-functional
with mainline Linux in the v2022.02 or later reference designs and has
not worked without configuration of address-translation since v2021.08.
Add the address translation that will be used by the v2022.09 reference
design & update the compatible used by the dts. Since this change is not
backwards compatible, update the compatible to denote this, jumping over
v2022.09 directly to v2022.10.
Signed-off-by: Conor Dooley <conor.dooley@microchip.com>
The icicle kit reference design's v2022.09 release made some changes
to the memory map - including adding the ability to read the fabric
clock controllers via the system controller bus & making the PCI
controller work with upstream Linux.
While the PCI was not working in the v2022.03 design, so nothing is
broken there in terms of backwards compatibility, the fabric clocks
used in the v2022.03 design were chosen by the individual run of the
synthesis tool. In the v2022.09 reference design, the clocks are fixed
to use the "north west" fabric Clock Conditioning Circuitry.
In the v2022.10 release, the memory map on the DDR side is also
changing, so to avoid making a breaking change here twice, jump over the
v2022.09 release and straight to the v2022.10 one.
Make use of a new compatible to denote that v2022.{09,10} reference
design releases are not backwards compatible.
Acked-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Signed-off-by: Conor Dooley <conor.dooley@microchip.com>
These mount option flags are obsolete since commit 12085b14a4 ("smack:
switch to private smack_mnt_opts"), remove them.
Signed-off-by: Xiu Jianfeng <xiujianfeng@huawei.com>
Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
Return the value smk_ptrace_rule_check() directly instead of storing it
in another redundant variable.
Reported-by: Zeal Robot <zealci@zte.com.cn>
Signed-off-by: Xu Panda <xu.panda@zte.com.cn>
Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
Using smk_of_current() during sk_alloc_security hook leads in
rare cases to a faulty initialization of the security context
of the created socket.
By adding the LSM hook sk_clone_security to SMACK this initialization
fault is corrected by copying the security context of the old socket
pointer to the newly cloned one.
Co-authored-by: Martin Ostertag: <martin.ostertag@elektrobit.com>
Signed-off-by: Lontke Michael <michael.lontke@elektrobit.com>
Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
nvmet is a consumer of the block layer and should not directly look at
the request_queue. Use the bdev_ helpers to retrieve the device limits
instead.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Keith Busch <kbusch@kernel.org>
nvmet is a consumer of the block layer and should not directly look at
the request_queue. Just use the NUMA node ID from the gendisk instead of
the request_queue.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Keith Busch <kbusch@kernel.org>
The hctx's run_work may be racing with the elevator switch when
reinitializing hardware queues. The queue is merely frozen in this
context, but that only prevents requests from allocating and doesn't
stop the hctx work from running. The work may get an elevator pointer
that's being torn down, and can result in use-after-free errors and
kernel panics (example below). Use the quiesced elevator switch instead,
and make the previous one static since it is now only used locally.
nvme nvme0: resetting controller
nvme nvme0: 32/0/0 default/read/poll queues
BUG: kernel NULL pointer dereference, address: 0000000000000008
#PF: supervisor read access in kernel mode
#PF: error_code(0x0000) - not-present page
PGD 80000020c8861067 P4D 80000020c8861067 PUD 250f8c8067 PMD 0
Oops: 0000 [#1] SMP PTI
Workqueue: kblockd blk_mq_run_work_fn
RIP: 0010:kyber_has_work+0x29/0x70
...
Call Trace:
__blk_mq_do_dispatch_sched+0x83/0x2b0
__blk_mq_sched_dispatch_requests+0x12e/0x170
blk_mq_sched_dispatch_requests+0x30/0x60
__blk_mq_run_hw_queue+0x2b/0x50
process_one_work+0x1ef/0x380
worker_thread+0x2d/0x3e0
Signed-off-by: Keith Busch <kbusch@kernel.org>
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20220927155652.3260724-1-kbusch@fb.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Add a make target, dt_compatible_check, to extract compatible strings
from kernel sources and check if they are documented by a schema.
At least version v2022.08 of dtschema with dt-check-compatible is
required.
This check can also be run manually on specific files or directories:
scripts/dtc/dt-extract-compatibles drivers/clk/ | \
xargs dt-check-compatible -v -s Documentation/devicetree/bindings/processed-schema.json
Currently, there are about 3800 undocumented compatible strings. Most of
these are cases where the binding is not yet converted (given there
are 1900 .txt binding files remaining).
Link: https://lore.kernel.org/all/20220916012510.2718170-1-robh@kernel.org/
Signed-off-by: Rob Herring <robh@kernel.org>
AF_XDP Tx descriptor cleaning in ice driver currently works in a "lazy"
way - descriptors are not cleaned immediately after send. We rather hold
on with cleaning until we see that free space in ring drops below
particular threshold. This was supposed to reduce the amount of
unnecessary work related to cleaning and instead of keeping the ring
empty, ring was rather saturated.
In AF_XDP realm cleaning Tx descriptors implies producing them to CQ.
This is a way of letting know user space that particular descriptor has
been sent, as John points out in [0].
We tried to implement serial descriptor cleaning which would be used in
conjunction with batched cleaning but it made code base more convoluted
and probably harder to maintain in future. Therefore we step away from
batched cleaning in a current form in favor of an approach where we set
RS bit on every last descriptor from a batch and clean always at the
beginning of ice_xmit_zc().
This means that we give up a bit of Tx performance, but this doesn't
hurt l2fwd scenario which is way more meaningful than txonly as this can
be treaten as AF_XDP based packet generator. l2fwd is not hurt due to
the fact that Tx side is much faster than Rx and Rx is the one that has
to catch Tx up.
FWIW Tx descriptors are still produced in a batched way.
[0]: https://lore.kernel.org/bpf/62b0a20232920_3573208ab@john.notmuch/
Fixes: 126cdfe100 ("ice: xsk: Improve AF_XDP ZC Tx and use batching API")
Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Tested-by: George Kuruvinakunnel <george.kuruvinakunnel@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>