KVM/riscv changes for 6.8 part #2
- Zbc extension support for Guest/VM
- Scalar crypto extensions support for Guest/VM
- Vector crypto extensions support for Guest/VM
- Zfh[min] extensions support for Guest/VM
- Zihintntl extension support for Guest/VM
- Zvfh[min] extensions support for Guest/VM
- Zfa extension support for Guest/VM
Add cfg80211, mac80211 and iwlwifi to the all_tests config
so that the unit tests (enabled by KUNIT_ALL_TESTS) will be
run by default.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
ksm_tests was previously mmapping a region of memory, aligning the
returned pointer to a PMD boundary, then setting MADV_HUGEPAGE, but was
setting it past the end of the mmapped area due to not taking the pointer
alignment into consideration. Fix this behaviour.
Up until commit efa7df3e3b ("mm: align larger anonymous mappings on THP
boundaries"), this buggy behavior was (usually) masked because the
alignment difference was always less than PMD-size. But since the
mentioned commit, `ksm_tests -H -s 100` started failing.
Link: https://lkml.kernel.org/r/20240122120554.3108022-1-ryan.roberts@arm.com
Fixes: 3252548996 ("selftests: vm: add KSM huge pages merging time test")
Signed-off-by: Ryan Roberts <ryan.roberts@arm.com>
Cc: Pedro Demarchi Gomes <pedrodemargomes@gmail.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
In order for the page table level 5 to be in use, the CPU must have the
setting enabled in addition to the CONFIG option. Check for the flag to be
set to avoid false test failures on systems that do not have this cpu flag
set.
The test does a series of mmap calls including three using the
MAP_FIXED flag and specifying an address that is 1<<47 or 1<<48. These
addresses are only available if you are using level 5 page tables,
which requires both the CPU to have the capabiltiy (la57 flag) and the
kernel to be configured. Currently the test only checks for the kernel
configuration option, so this test can still report a false positive.
Here are the three failing lines:
$ ./va_high_addr_switch | grep FAILED
mmap(ADDR_SWITCH_HINT, 2 * PAGE_SIZE, MAP_FIXED): 0xffffffffffffffff - FAILED
mmap(HIGH_ADDR, MAP_FIXED): 0xffffffffffffffff - FAILED
mmap(ADDR_SWITCH_HINT, 2 * PAGE_SIZE, MAP_FIXED): 0xffffffffffffffff - FAILED
I thought (for about a second) refactoring the test so that these three
mmap calls will only be run on systems with the level 5 page tables
available, but the whole point of the test is to check the level 5
feature...
Link: https://lkml.kernel.org/r/20240119205801.62769-1-audra@redhat.com
Fixes: 4f2930c671 ("selftests/vm: only run 128TBswitch with 5-level paging")
Signed-off-by: Audra Mitchell <audra@redhat.com>
Cc: Rafael Aquini <raquini@redhat.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Adam Sindelar <adam@wowsignal.io>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
The busywait timeout value is a millisecond, not a second. So the
current setting 2 is too small. On slow/busy host (or VMs) the
current timeout can expire even on "correct" execution, causing random
failures. Let's copy the WAIT_TIMEOUT from forwarding/lib.sh and set
BUSYWAIT_TIMEOUT here.
Fixes: 25ae948b44 ("selftests/net: add lib.sh")
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://lore.kernel.org/r/20240124061344.1864484-1-liuhangbin@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
The blamed commit below introduce a dependency in some net self-tests
towards a newly introduce helper script.
Such script is currently not included into the TEST_PROGS_EXTENDED list
and thus is not installed, causing failure for the relevant tests when
executed from the install dir.
Fix the issue updating the install targets.
Fixes: 3bdd9fd29c ("selftests/net: synchronize udpgro tests' tx and rx connection")
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Link: https://lore.kernel.org/r/076e8758e21ff2061cc9f81640e7858df775f0a9.1706131762.git.pabeni@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Still a bit unclear whether each directory should have its own
config file, but assuming they should lets add one for tcp_ao.
The following tests still fail with this config in place:
- rst_ipv4,
- rst_ipv6,
- bench-lookups_ipv6.
other 21 pass.
Fixes: d11301f659 ("selftests/net: Add TCP-AO ICMPs accept test")
Reviewed-by: Dmitry Safonov <0x7f454c46@gmail.com>
Link: https://lore.kernel.org/r/20240124192550.1865743-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Pull networking fixes from Paolo Abeni:
"Including fixes from bpf, netfilter and WiFi.
Jakub is doing a lot of work to include the self-tests in our CI, as a
result a significant amount of self-tests related fixes is flowing in
(and will likely continue in the next few weeks).
Current release - regressions:
- bpf: fix a kernel crash for the riscv 64 JIT
- bnxt_en: fix memory leak in bnxt_hwrm_get_rings()
- revert "net: macsec: use skb_ensure_writable_head_tail to expand
the skb"
Previous releases - regressions:
- core: fix removing a namespace with conflicting altnames
- tc/flower: fix chain template offload memory leak
- tcp:
- make sure init the accept_queue's spinlocks once
- fix autocork on CPUs with weak memory model
- udp: fix busy polling
- mlx5e:
- fix out-of-bound read in port timestamping
- fix peer flow lists corruption
- iwlwifi: fix a memory corruption
Previous releases - always broken:
- netfilter:
- nft_chain_filter: handle NETDEV_UNREGISTER for inet/ingress
basechain
- nft_limit: reject configurations that cause integer overflow
- bpf: fix bpf_xdp_adjust_tail() with XSK zero-copy mbuf, avoiding a
NULL pointer dereference upon shrinking
- llc: make llc_ui_sendmsg() more robust against bonding changes
- smc: fix illegal rmb_desc access in SMC-D connection dump
- dpll: fix pin dump crash for rebound module
- bnxt_en: fix possible crash after creating sw mqprio TCs
- hv_netvsc: calculate correct ring size when PAGE_SIZE is not 4kB
Misc:
- several self-tests fixes for better integration with the netdev CI
- added several missing modules descriptions"
* tag 'net-6.8-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (88 commits)
tsnep: Fix XDP_RING_NEED_WAKEUP for empty fill ring
tsnep: Remove FCS for XDP data path
net: fec: fix the unhandled context fault from smmu
selftests: bonding: do not test arp/ns target with mode balance-alb/tlb
fjes: fix memleaks in fjes_hw_setup
i40e: update xdp_rxq_info::frag_size for ZC enabled Rx queue
i40e: set xdp_rxq_info::frag_size
xdp: reflect tail increase for MEM_TYPE_XSK_BUFF_POOL
ice: update xdp_rxq_info::frag_size for ZC enabled Rx queue
intel: xsk: initialize skb_frag_t::bv_offset in ZC drivers
ice: remove redundant xdp_rxq_info registration
i40e: handle multi-buffer packets that are shrunk by xdp prog
ice: work on pre-XDP prog frag count
xsk: fix usage of multi-buffer BPF helpers for ZC XDP
xsk: make xsk_buff_pool responsible for clearing xdp_buff::flags
xsk: recycle buffer in case Rx queue was full
net: fill in MODULE_DESCRIPTION()s for rvu_mbox
net: fill in MODULE_DESCRIPTION()s for litex
net: fill in MODULE_DESCRIPTION()s for fsl_pq_mdio
net: fill in MODULE_DESCRIPTION()s for fec
...
One issue comes up while building selftest/landlock/fs_test on my side
(gcc 7.3/glibc-2.28/kernel-4.19).
gcc -Wall -O2 -isystem fs_test.c -lcap -o selftests/landlock/fs_test
fs_test.c:4575:9: error: initializer element is not constant
.mnt = mnt_tmp,
^~~~~~~
Signed-off-by: Hu Yadi <hu.yadi@h3c.com>
Suggested-by: Jiao <jiaoxupo@h3c.com>
Reviewed-by: Berlin <berlin@h3c.com>
Link: https://lore.kernel.org/r/20240124022908.42100-1-hu.yadi@h3c.com
Fixes: 04f9070e99 ("selftests/landlock: Add tests for pseudo filesystems")
[mic: Factor out mount's data string and make mnt_tmp static]
Signed-off-by: Mickaël Salaün <mic@digikod.net>
One issue comes up while building selftest/landlock/net_test on my side
(gcc 7.3/glibc-2.28/kernel-4.19).
net_test.c: In function ‘set_service’:
net_test.c:91:45: warning: implicit declaration of function ‘gettid’; [-Wimplicit-function-declaration]
"_selftests-landlock-net-tid%d-index%d", gettid(),
^~~~~~
getgid
net_test.c:(.text+0x4e0): undefined reference to `gettid'
Signed-off-by: Hu Yadi <hu.yadi@h3c.com>
Suggested-by: Jiao <jiaoxupo@h3c.com>
Reviewed-by: Berlin <berlin@h3c.com>
Fixes: a549d055a2 ("selftests/landlock: Add network tests")
Link: https://lore.kernel.org/r/20240123062621.25082-1-hu.yadi@h3c.com
[mic: Cosmetic fixes]
Signed-off-by: Mickaël Salaün <mic@digikod.net>
Add tests for LSM interactions (both bpf_token_capable and bpf_token_cmd
LSM hooks) with BPF token in bpf() subsystem. Now child process passes
back token FD for parent to be able to do tests with token originating
in "wrong" userns. But we also create token in initns and check that
token LSMs don't accidentally reject BPF operations when capable()
checks pass without BPF token.
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20240124022127.2379740-31-andrii@kernel.org
Add a selftest that attempts to conceptually replicate intended BPF
token use cases inside user namespaced container.
Child process is forked. It is then put into its own userns and mountns.
Child creates BPF FS context object. This ensures child userns is
captured as the owning userns for this instance of BPF FS. Given setting
delegation mount options is privileged operation, we ensure that child
cannot set them.
This context is passed back to privileged parent process through Unix
socket, where parent sets up delegation options, creates, and mounts it
as a detached mount. This mount FD is passed back to the child to be
used for BPF token creation, which allows otherwise privileged BPF
operations to succeed inside userns.
We validate that all of token-enabled privileged commands (BPF_BTF_LOAD,
BPF_MAP_CREATE, and BPF_PROG_LOAD) work as intended. They should only
succeed inside the userns if a) BPF token is provided with proper
allowed sets of commands and types; and b) namespaces CAP_BPF and other
privileges are set. Lacking a) or b) should lead to -EPERM failures.
Based on suggested workflow by Christian Brauner ([0]).
[0] https://lore.kernel.org/bpf/20230704-hochverdient-lehne-eeb9eeef785e@brauner/
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20240124022127.2379740-17-andrii@kernel.org
Add basic support of BPF token to BPF_PROG_LOAD. BPF_F_TOKEN_FD flag
should be set in prog_flags field when providing prog_token_fd.
Wire through a set of allowed BPF program types and attach types,
derived from BPF FS at BPF token creation time. Then make sure we
perform bpf_token_capable() checks everywhere where it's relevant.
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20240124022127.2379740-7-andrii@kernel.org
Allow providing token_fd for BPF_MAP_CREATE command to allow controlled
BPF map creation from unprivileged process through delegated BPF token.
New BPF_F_TOKEN_FD flag is added to specify together with BPF token FD
for BPF_MAP_CREATE command.
Wire through a set of allowed BPF map types to BPF token, derived from
BPF FS at BPF token creation time. This, in combination with allowed_cmds
allows to create a narrowly-focused BPF token (controlled by privileged
agent) with a restrictive set of BPF maps that application can attempt
to create.
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20240124022127.2379740-5-andrii@kernel.org
If there is more than 32 cpus the bitmask will start to contain
commas, leading to:
./rps_default_mask.sh: line 36: [: 00000000,00000000: integer expression expected
Remove the commas, bash doesn't interpret leading zeroes as oct
so that should be good enough. Switch to bash, Simon reports that
not all shells support this type of substitution.
Fixes: c12e0d5f26 ("self-tests: introduce self-tests for RPS default mask")
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://lore.kernel.org/r/20240122195815.638997-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
After the previous patch that speeded up the test (by avoiding neigh
discovery in IPv6), the BPF CI occasionally hits this error:
rcv tstamp unexpected pkt rcv tstamp: actual 0 == expected 0
The test complains about the cmsg returned from the recvmsg() does not
have the rcv timestamp. Setting skb->tstamp or not is
controlled by a kernel static key "netstamp_needed_key". The static
key is enabled whenever this is at least one sk with the SOCK_TIMESTAMP
set.
The test_redirect_dtime does use setsockopt() to turn on
the SOCK_TIMESTAMP for the reading sk. In the kernel
net_enable_timestamp() has a delay to enable the "netstamp_needed_key"
when CONFIG_JUMP_LABEL is set. This potential delay is the likely reason
for packet missing rcv timestamp occasionally.
This patch is to create udp sockets with SOCK_TIMESTAMP set.
It sends and receives some packets until the received packet
has a rcv timestamp. It currently retries at most 5 times with 1s
in between. This should be enough to wait for the "netstamp_needed_key".
It then holds on to the socket and only closes it at the end of the test.
This guarantees that the test has the "netstamp_needed_key" key turned
on from the beginning.
To simplify the udp sockets setup, they are sending/receiving packets
in the same netns (ns_dst is used) and communicate over the "lo" dev.
Hence, the patch enables the "lo" dev in the ns_dst.
Fixes: c803475fd8 ("bpf: selftests: test skb->tstamp in redirect_neigh")
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20240120060518.3604920-2-martin.lau@linux.dev
BPF CI has been reporting the tc_redirect_dtime test failing
from time to time:
test_inet_dtime:PASS:setns src 0 nsec
(network_helpers.c:253: errno: No route to host) Failed to connect to server
close_netns:PASS:setns 0 nsec
test_inet_dtime:FAIL:connect_to_fd unexpected connect_to_fd: actual -1 < expected 0
test_tcp_clear_dtime:PASS:tcp ip6 clear dtime ingress_fwdns_p100 0 nsec
The connect_to_fd failure (EHOSTUNREACH) is from the
test_tcp_clear_dtime() test and it is the very first IPv6 traffic
after setting up all the links, addresses, and routes.
The symptom is this first connect() is always slow. In my setup, it
could take ~3s.
After some tracing and tcpdump, the slowness is mostly spent in
the neighbor solicitation in the "ns_fwd" namespace while
the "ns_src" and "ns_dst" are fine.
I forced the kernel to drop the neighbor solicitation messages.
I can then reproduce EHOSTUNREACH. What actually happen could be:
- the neighbor advertisement came back a little slow.
- the "ns_fwd" namespace concluded a neighbor discovery failure
and triggered the ndisc_error_report() => ip6_link_failure() =>
icmpv6_send(skb, ICMPV6_DEST_UNREACH, ICMPV6_ADDR_UNREACH, 0)
- the client's connect() reports EHOSTUNREACH after receiving
the ICMPV6_DEST_UNREACH message.
The neigh table of both "ns_src" and "ns_dst" namespace has already
been manually populated but not the "ns_fwd" namespace. This patch
fixes it by manually populating the neigh table also in the "ns_fwd"
namespace.
Although the namespace configuration part had been existed before
the tc_redirect_dtime test, still Fixes-tagging the patch when
the tc_redirect_dtime test was added since it is the only test
hitting it so far.
Fixes: c803475fd8 ("bpf: selftests: test skb->tstamp in redirect_neigh")
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20240120060518.3604920-1-martin.lau@linux.dev
If CONFIG_BPF_JIT_ALWAYS_ON is not set and bpf_jit_enable is 0, there
exist 6 failed tests.
[root@linux bpf]# echo 0 > /proc/sys/net/core/bpf_jit_enable
[root@linux bpf]# echo 0 > /proc/sys/kernel/unprivileged_bpf_disabled
[root@linux bpf]# ./test_verifier | grep FAIL
#106/p inline simple bpf_loop call FAIL
#107/p don't inline bpf_loop call, flags non-zero FAIL
#108/p don't inline bpf_loop call, callback non-constant FAIL
#109/p bpf_loop_inline and a dead func FAIL
#110/p bpf_loop_inline stack locations for loop vars FAIL
#111/p inline bpf_loop call in a big program FAIL
Summary: 768 PASSED, 15 SKIPPED, 6 FAILED
The test log shows that callbacks are not allowed in non-JITed programs,
interpreter doesn't support them yet, thus these tests should be skipped
if jit is disabled.
Add an explicit flag F_NEEDS_JIT_ENABLED to those tests to mark that they
require JIT enabled in bpf_loop_inline.c, check the flag and jit_disabled
at the beginning of do_test_single() to handle this case.
With this patch:
[root@linux bpf]# echo 0 > /proc/sys/net/core/bpf_jit_enable
[root@linux bpf]# echo 0 > /proc/sys/kernel/unprivileged_bpf_disabled
[root@linux bpf]# ./test_verifier | grep FAIL
Summary: 768 PASSED, 21 SKIPPED, 0 FAILED
Suggested-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20240123090351.2207-3-yangtiezhu@loongson.cn
Some of the BPF selftests use the "p" constraint in inline assembly
snippets, for input operands for MOV (rN = rM) instructions.
This is mainly done via the __imm_ptr macro defined in
tools/testing/selftests/bpf/progs/bpf_misc.h:
#define __imm_ptr(name) [name]"p"(&name)
Example:
int consume_first_item_only(void *ctx)
{
struct bpf_iter_num iter;
asm volatile (
/* create iterator */
"r1 = %[iter];"
[...]
:
: __imm_ptr(iter)
: CLOBBERS);
[...]
}
The "p" constraint is a tricky one. It is documented in the GCC manual
section "Simple Constraints":
An operand that is a valid memory address is allowed. This is for
``load address'' and ``push address'' instructions.
p in the constraint must be accompanied by address_operand as the
predicate in the match_operand. This predicate interprets the mode
specified in the match_operand as the mode of the memory reference for
which the address would be valid.
There are two problems:
1. It is questionable whether that constraint was ever intended to be
used in inline assembly templates, because its behavior really
depends on compiler internals. A "memory address" is not the same
than a "memory operand" or a "memory reference" (constraint "m"), and
in fact its usage in the template above results in an error in both
x86_64-linux-gnu and bpf-unkonwn-none:
foo.c: In function ‘bar’:
foo.c:6:3: error: invalid 'asm': invalid expression as operand
6 | asm volatile ("r1 = %[jorl]" : : [jorl]"p"(&jorl));
| ^~~
I would assume the same happens with aarch64, riscv, and most/all
other targets in GCC, that do not accept operands of the form A + B
that are not wrapped either in a const or in a memory reference.
To avoid that error, the usage of the "p" constraint in internal GCC
instruction templates is supposed to be complemented by the 'a'
modifier, like in:
asm volatile ("r1 = %a[jorl]" : : [jorl]"p"(&jorl));
Internally documented (in GCC's final.cc) as:
%aN means expect operand N to be a memory address
(not a memory reference!) and print a reference
to that address.
That works because when the modifier 'a' is found, GCC prints an
"operand address", which is not the same than an "operand".
But...
2. Even if we used the internal 'a' modifier (we shouldn't) the 'rN =
rM' instruction really requires a register argument. In cases
involving automatics, like in the examples above, we easily end with:
bar:
#APP
r1 = r10-4
#NO_APP
In other cases we could conceibly also end with a 64-bit label that
may overflow the 32-bit immediate operand of `rN = imm32'
instructions:
r1 = foo
All of which is clearly wrong.
clang happens to do "the right thing" in the current usage of __imm_ptr
in the BPF tests, because even with -O2 it seems to "reload" the
fp-relative address of the automatic to a register like in:
bar:
r1 = r10
r1 += -4
#APP
r1 = r1
#NO_APP
Which is what GCC would generate with -O0. Whether this is by chance
or by design, the compiler shouln't be expected to do that reload
driven by the "p" constraint.
This patch changes the usage of the "p" constraint in the BPF
selftests macros to use the "r" constraint instead. If a register is
what is required, we should let the compiler know.
Previous discussion in bpf@vger:
https://lore.kernel.org/bpf/87h6p5ebpb.fsf@oracle.com/T/#ef0df83d6975c34dff20bf0dd52e078f5b8ca2767
Tested in bpf-next master.
No regressions.
Signed-off-by: Jose E. Marchesi <jose.marchesi@oracle.com>
Cc: Yonghong Song <yonghong.song@linux.dev>
Cc: Eduard Zingerman <eddyz87@gmail.com>
Acked-by: Yonghong Song <yonghong.song@linux.dev>
Link: https://lore.kernel.org/r/20240123181309.19853-1-jose.marchesi@oracle.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>