linux

Author	SHA1	Message	Date
Andrii Nakryiko	5532dfd42e	libbpf: Simplify BPF program auto-attach code Remove the need to explicitly pass bpf_sec_def for auto-attachable BPF programs, as it is already recorded at bpf_object__open() time for all recognized type of BPF programs. This further reduces number of explicit calls to find_sec_def(), simplifying further refactorings. No functional changes are done by this patch. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Martin KaFai Lau <kafai@fb.com> Link: https://lore.kernel.org/bpf/20210914014733.2768-4-andrii@kernel.org	2021-09-14 15:49:24 -07:00
Andrii Nakryiko	91b4d1d1d5	libbpf: Ensure BPF prog types are set before relocations Refactor bpf_object__open() sequencing to perform BPF program type detection based on SEC() definitions before we get to relocations collection. This allows to have more information about BPF program by the time we get to, say, struct_ops relocation gathering. This, subsequently, simplifies struct_ops logic and removes the need to perform extra find_sec_def() resolution. With this patch libbpf will require all struct_ops BPF programs to be marked with SEC("struct_ops") or SEC("struct_ops/xxx") annotations. Real-world applications are already doing that through something like selftests's BPF_STRUCT_OPS() macro. This change streamlines libbpf's internal handling of SEC() definitions and is in the sprit of upcoming libbpf-1.0 section strictness changes ([0]). [0] https://github.com/libbpf/libbpf/wiki/Libbpf:-the-road-to-v1.0#stricter-and-more-uniform-bpf-program-section-name-sec-handling Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Martin KaFai Lau <kafai@fb.com> Link: https://lore.kernel.org/bpf/20210914014733.2768-3-andrii@kernel.org	2021-09-14 15:49:24 -07:00
Andrii Nakryiko	53df63ccdc	selftests/bpf: Update selftests to always provide "struct_ops" SEC Update struct_ops selftests to always specify "struct_ops" section prefix. Libbpf will require a proper BPF program type set in the next patch, so this prevents tests breaking. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Martin KaFai Lau <kafai@fb.com> Link: https://lore.kernel.org/bpf/20210914014733.2768-2-andrii@kernel.org	2021-09-14 15:49:24 -07:00
Rafael David Tinoco	ca304b40c2	libbpf: Introduce legacy kprobe events support Allow kprobe tracepoint events creation through legacy interface, as the kprobe dynamic PMUs support, used by default, was only created in v4.17. Store legacy kprobe name in struct bpf_perf_link, instead of creating a new "subclass" off of bpf_perf_link. This is ok as it's just two new fields, which are also going to be reused for legacy uprobe support in follow up patches. Signed-off-by: Rafael David Tinoco <rafaeldtinoco@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20210912064844.3181742-1-rafaeldtinoco@gmail.com	2021-09-14 14:44:45 -07:00
Andrii Nakryiko	2f38304127	libbpf: Make libbpf_version.h non-auto-generated Turn previously auto-generated libbpf_version.h header into a normal header file. This prevents various tricky Makefile integration issues, simplifies the overall build process, but also allows to further extend it with some more versioning-related APIs in the future. To prevent accidental out-of-sync versions as defined by libbpf.map and libbpf_version.h, Makefile checks their consistency at build time. Simultaneously with this change bump libbpf.map to v0.6. Also undo adding libbpf's output directory into include path for kernel/bpf/preload, bpftool, and resolve_btfids, which is not necessary because libbpf_version.h is just a normal header like any other. Fixes: `0b46b75505` ("libbpf: Add LIBBPF_DEPRECATED_SINCE macro for scheduling API deprecations") Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20210913222309.3220849-1-andrii@kernel.org	2021-09-13 15:36:47 -07:00
Daniel Borkmann	dbd7eb14e0	bpf, selftests: Replicate tailcall limit test for indirect call case The tailcall_3 test program uses bpf_tail_call_static() where the JIT would patch a direct jump. Add a new tailcall_6 test program replicating exactly the same test just ensuring that bpf_tail_call() uses a map index where the verifier cannot make assumptions this time. In other words, this will now cover both on x86-64 JIT, meaning, JIT images with emit_bpf_tail_call_direct() emission as well as JIT images with emit_bpf_tail_call_indirect() emission. # echo 1 > /proc/sys/net/core/bpf_jit_enable # ./test_progs -t tailcalls #136/1 tailcalls/tailcall_1:OK #136/2 tailcalls/tailcall_2:OK #136/3 tailcalls/tailcall_3:OK #136/4 tailcalls/tailcall_4:OK #136/5 tailcalls/tailcall_5:OK #136/6 tailcalls/tailcall_6:OK #136/7 tailcalls/tailcall_bpf2bpf_1:OK #136/8 tailcalls/tailcall_bpf2bpf_2:OK #136/9 tailcalls/tailcall_bpf2bpf_3:OK #136/10 tailcalls/tailcall_bpf2bpf_4:OK #136/11 tailcalls/tailcall_bpf2bpf_5:OK #136 tailcalls:OK Summary: 1/11 PASSED, 0 SKIPPED, 0 FAILED # echo 0 > /proc/sys/net/core/bpf_jit_enable # ./test_progs -t tailcalls #136/1 tailcalls/tailcall_1:OK #136/2 tailcalls/tailcall_2:OK #136/3 tailcalls/tailcall_3:OK #136/4 tailcalls/tailcall_4:OK #136/5 tailcalls/tailcall_5:OK #136/6 tailcalls/tailcall_6:OK [...] For interpreter, the tailcall_1-6 tests are passing as well. The later tailcall_bpf2bpf_* are failing due lack of bpf2bpf + tailcall support in interpreter, so this is expected. Also, manual inspection shows that both loaded programs from tailcall_3 and tailcall_6 test case emit the expected opcodes: * tailcall_3 disasm, emit_bpf_tail_call_direct(): [...] b: push %rax c: push %rbx d: push %r13 f: mov %rdi,%rbx 12: movabs $0xffff8d3f5afb0200,%r13 1c: mov %rbx,%rdi 1f: mov %r13,%rsi 22: xor %edx,%edx _ 24: mov -0x4(%rbp),%eax \| limit check 2a: cmp $0x20,%eax \| 2d: ja 0x0000000000000046 \| 2f: add $0x1,%eax \| 32: mov %eax,-0x4(%rbp) \|_ 38: nopl 0x0(%rax,%rax,1) 3d: pop %r13 3f: pop %rbx 40: pop %rax 41: jmpq 0xffffffffffffe377 [...] * tailcall_6 disasm, emit_bpf_tail_call_indirect(): [...] 47: movabs $0xffff8d3f59143a00,%rsi 51: mov %edx,%edx 53: cmp %edx,0x24(%rsi) 56: jbe 0x0000000000000093 _ 58: mov -0x4(%rbp),%eax \| limit check 5e: cmp $0x20,%eax \| 61: ja 0x0000000000000093 \| 63: add $0x1,%eax \| 66: mov %eax,-0x4(%rbp) \|_ 6c: mov 0x110(%rsi,%rdx,8),%rcx 74: test %rcx,%rcx 77: je 0x0000000000000093 79: pop %rax 7a: mov 0x30(%rcx),%rcx 7e: add $0xb,%rcx 82: callq 0x000000000000008e 87: pause 89: lfence 8c: jmp 0x0000000000000087 8e: mov %rcx,(%rsp) 92: retq [...] Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Tested-by: Tiezhu Yang <yangtiezhu@loongson.cn> Acked-by: Yonghong Song <yhs@fb.com> Acked-by: Johan Almbladh <johan.almbladh@anyfinetworks.com> Acked-by: Paul Chaignon <paul@cilium.io> Link: https://lore.kernel.org/bpf/CAM1=_QRyRVCODcXo_Y6qOm1iT163HoiSj8U2pZ8Rj3hzMTT=HQ@mail.gmail.com Link: https://lore.kernel.org/bpf/20210910091900.16119-1-daniel@iogearbox.net	2021-09-13 14:52:22 -07:00
Song Liu	025bd7c753	selftests/bpf: Add test for bpf_get_branch_snapshot This test uses bpf_get_branch_snapshot from a fexit program. The test uses a target function (bpf_testmod_loop_test) and compares the record against kallsyms. If there isn't enough record matching kallsyms, the test fails. Signed-off-by: Song Liu <songliubraving@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: John Fastabend <john.fastabend@gmail.com> Link: https://lore.kernel.org/bpf/20210910183352.3151445-4-songliubraving@fb.com	2021-09-13 10:53:50 -07:00
Song Liu	856c02dbce	bpf: Introduce helper bpf_get_branch_snapshot Introduce bpf_get_branch_snapshot(), which allows tracing pogram to get branch trace from hardware (e.g. Intel LBR). To use the feature, the user need to create perf_event with proper branch_record filtering on each cpu, and then calls bpf_get_branch_snapshot in the bpf function. On Intel CPUs, VLBR event (raw event 0x1b00) can be use for this. Signed-off-by: Song Liu <songliubraving@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: John Fastabend <john.fastabend@gmail.com> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20210910183352.3151445-3-songliubraving@fb.com	2021-09-13 10:53:50 -07:00
Vadim Fedorenko	3384c7c764	selftests/bpf: Test new __sk_buff field hwtstamp Analogous to the gso_segs selftests introduced in commit `d9ff286a0f` ("bpf: allow BPF programs access skb_shared_info->gso_segs field"). Signed-off-by: Vadim Fedorenko <vfedorenko@novek.ru> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Martin KaFai Lau <kafai@fb.com> Link: https://lore.kernel.org/bpf/20210909220409.8804-3-vfedorenko@novek.ru	2021-09-10 23:20:13 +02:00
Vadim Fedorenko	f64c4acea5	bpf: Add hardware timestamp field to __sk_buff BPF programs may want to know hardware timestamps if NIC supports such timestamping. Expose this data as hwtstamp field of __sk_buff the same way as gso_segs/gso_size. This field could be accessed from the same programs as tstamp field, but it's read-only field. Explicit test to deny access to padding data is added to bpf_skb_is_valid_access. Also update BPF_PROG_TEST_RUN tests of the feature. Signed-off-by: Vadim Fedorenko <vfedorenko@novek.ru> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Martin KaFai Lau <kafai@fb.com> Link: https://lore.kernel.org/bpf/20210909220409.8804-2-vfedorenko@novek.ru	2021-09-10 23:19:58 +02:00
Magnus Karlsson	909f0e2820	selftests: xsk: Add tests for 2K frame size Add tests for 2K frame size. Both a standard send and receive test and one testing for invalid descriptors when the frame size is 2K. Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Link: https://lore.kernel.org/bpf/20210907071928.9750-21-magnus.karlsson@gmail.com	2021-09-10 21:15:32 +02:00
Magnus Karlsson	0d1b7f3a00	selftests: xsk: Add tests for invalid xsk descriptors Add tests for invalid xsk descriptors in the Tx ring. A number of handcrafted nasty invalid descriptors are created and submitted to the tx ring to check that they are validated correctly. Corner case valid ones are also sent. The tests are run for both aligned and unaligned mode. pkt_stream_set() is introduced to be able to create a hand-crafted packet stream where every single packet is specified in detail. Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Link: https://lore.kernel.org/bpf/20210907071928.9750-20-magnus.karlsson@gmail.com	2021-09-10 21:15:32 +02:00
Magnus Karlsson	6ce67b5165	selftests: xsk: Eliminate test specific if-statement in test runner Eliminate a test specific if-statement for the RX_FILL_EMTPY stats test that is present in the test runner. We can do this as we now have the use_addr_for_fill option. Just create and empty Rx packet stream and indicated that the test runner should use the addresses in that to populate the fill ring. As there are no packets in the stream, the fill ring will be empty and we will get the error stats that we want to test. Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Link: https://lore.kernel.org/bpf/20210907071928.9750-19-magnus.karlsson@gmail.com	2021-09-10 21:15:32 +02:00
Magnus Karlsson	a4ba98dd0c	selftests: xsk: Add test for unaligned mode Add a test for unaligned mode in which packet buffers can be placed anywhere within the umem. Some packets are made to straddle page boundaries in order to check for correctness. On the Tx side, buffers are now allocated according to the addresses found in the packet stream. Thus, the placement of buffers can be controlled with the boolean use_addr_for_fill in the packet stream. One new pkt_stream interface is introduced: pkt_stream_replace_half() that replaces every other packet in the default packet stream with the specified new packet. The constant DEFAULT_OFFSET is also introduced. It specifies at what offset from the start of a chunk a Tx packet is placed by the sending thread. This is just to be able to test that it is possible to send packets at an offset not equal to zero. Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Link: https://lore.kernel.org/bpf/20210907071928.9750-18-magnus.karlsson@gmail.com	2021-09-10 21:15:31 +02:00
Magnus Karlsson	605091c510	selftests: xsk: Introduce replacing the default packet stream Introduce the concept of a default packet stream that is the set of packets sent by most tests. Then add the ability to replace it for a test that would like to send or receive something else through the use of the function pkt_stream_replace() and then restored with pkt_stream_restore_default(). These are then used to convert the STAT_TEST_TX_INVALID to use these new APIs. Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Link: https://lore.kernel.org/bpf/20210907071928.9750-17-magnus.karlsson@gmail.com	2021-09-10 21:15:31 +02:00
Magnus Karlsson	8abf6f725a	selftests: xsk: Allow for invalid packets Allow for invalid packets to be sent. These are verified by the Rx thread not to be received. Or put in another way, if they are received, the test will fail. This feature will be used to eliminate an if statement for a stats test and will also be used by other tests in later patches. The previous code could only deal with valid packets. Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Link: https://lore.kernel.org/bpf/20210907071928.9750-16-magnus.karlsson@gmail.com	2021-09-10 21:15:31 +02:00
Magnus Karlsson	8ce7192b50	selftests: xsk: Eliminate MAX_SOCKS define Remove the MAX_SOCKS define as it always will be one for the forseable future and the code does not work for any other case anyway. Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Link: https://lore.kernel.org/bpf/20210907071928.9750-15-magnus.karlsson@gmail.com	2021-09-10 21:15:31 +02:00
Magnus Karlsson	e2d850d534	selftests: xsx: Make pthreads local scope Make the pthread_t variables local scope instead of global. No reason for them to be global. Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Link: https://lore.kernel.org/bpf/20210907071928.9750-14-magnus.karlsson@gmail.com	2021-09-10 21:15:31 +02:00
Magnus Karlsson	af6731d1e1	selftests: xsk: Make xdp_flags and bind_flags local Make xdp_flags and bind_flags local instead of global by moving them into the interface object. These flags decide if the socket should be created in SKB mode or in DRV mode and therefore they are sticky and will survive a test_spec_reset. Since every test is first run in SKB mode then in DRV mode, this change only happens once. With this change, the configured_mode global variable can also be erradicated. The first test_spec_init() also becomes superfluous and can be eliminated. Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Link: https://lore.kernel.org/bpf/20210907071928.9750-13-magnus.karlsson@gmail.com	2021-09-10 21:15:31 +02:00
Magnus Karlsson	85c6c95739	selftests: xsk: Specify number of sockets to create Add the ability in the test specification to specify numbers of sockets to create. The default is one socket. This is then used to remove test specific if-statements around the bpf_res tests. Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Link: https://lore.kernel.org/bpf/20210907071928.9750-12-magnus.karlsson@gmail.com	2021-09-10 21:15:31 +02:00
Magnus Karlsson	55be575dc1	selftests: xsk: Replace second_step global variable Replace the second_step global variable with a test specification variable called total_steps that a test can be set to indicate how many times the packet stream should be sent without reinitializing any sockets. This eliminates test specific code in the test runner around the bidirectional test. The total_steps variable is 1 by default as most tests only need a single round of packets. Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Link: https://lore.kernel.org/bpf/20210907071928.9750-11-magnus.karlsson@gmail.com	2021-09-10 21:15:31 +02:00
Magnus Karlsson	1856c24db0	selftests: xsk: Introduce rx_on and tx_on in ifobject Introduce rx_on and tx_on in the ifobject so that we can describe if the thread should create a socket with only tx, rx, or both. This eliminates some test specific if statements from the code. We can also eliminate the flow vector structure now as this is fully specified by the tx_on and rx_on variables. Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Link: https://lore.kernel.org/bpf/20210907071928.9750-10-magnus.karlsson@gmail.com	2021-09-10 21:15:31 +02:00
Magnus Karlsson	119d4b02fe	selftests: xsk: Add use_poll to ifobject Add a use_poll option to the ifobject so that we do not need to use a test specific if-statement in the test runner. Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Link: https://lore.kernel.org/bpf/20210907071928.9750-9-magnus.karlsson@gmail.com	2021-09-10 21:15:31 +02:00
Magnus Karlsson	53cb3cec2f	selftests: xsx: Introduce test name in test spec Introduce the test name in the test specification. This so we can set the name locally in the test function and simplify the logic for printing out test results. Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Link: https://lore.kernel.org/bpf/20210907071928.9750-8-magnus.karlsson@gmail.com	2021-09-10 21:15:31 +02:00
Magnus Karlsson	c160d7afba	selftests: xsk: Make frame_size configurable Make the frame size configurable instead of it being hard coded to a default. This is a property of the umem and will make it possible to implement tests for different umem frame sizes in a later patch. Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Link: https://lore.kernel.org/bpf/20210907071928.9750-7-magnus.karlsson@gmail.com	2021-09-10 21:15:31 +02:00
Magnus Karlsson	4bf8ee65ba	selftests: xsk: Move rxqsize into xsk_socket_info Move the global variable rxqsize to struct xsk_socket_info as it describes the size of a ring in that struct. By default, it is set to the size dictated by libbpf. Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Link: https://lore.kernel.org/bpf/20210907071928.9750-6-magnus.karlsson@gmail.com	2021-09-10 21:15:31 +02:00
Magnus Karlsson	83f4ae2f26	selftests: xsk: Move num_frames and frame_headroom to xsk_umem_info Move the global variables num_frames and frame_headroom to struct xsk_umem_info. They describe properties of the umem so no reason for them to be global. Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Link: https://lore.kernel.org/bpf/20210907071928.9750-5-magnus.karlsson@gmail.com	2021-09-10 21:15:30 +02:00
Magnus Karlsson	ce74acaf01	selftests: xsk: Introduce test specifications Introduce a test specification to be able to concisely describe a test. Currently, a test is implemented by sprinkling test specific if statements here and there, which is not scalable or easy to understand. The end goal with this patch set is to come to the point in which a test is completely specified by a test specification that can easily be constructed in a single function so that new tests can be added without too much trouble. This test specification will be run by a test runner that has no idea about tests. It just executes the what test specification states. This patch introduces the test specification and, as a start, puts the two interface objects in there, one containing the packet stream to be sent and the other one the packet stream that is supposed to be received for a test to pass. The global variables containing these can then be eliminated. The following patches will convert each existing test into a test specification and add the needed fields into it and the functionality in the test runner that act on the test specification. At the end, the test runner should contain no test specific code and each test should be described in a single simple function. Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Link: https://lore.kernel.org/bpf/20210907071928.9750-4-magnus.karlsson@gmail.com	2021-09-10 21:15:30 +02:00
Magnus Karlsson	744eb5c882	selftests: xsk: Introduce type for thread function Introduce a typedef of the thread function so this can be passed to init_iface() in order to simplify that function. Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Link: https://lore.kernel.org/bpf/20210907071928.9750-3-magnus.karlsson@gmail.com	2021-09-10 21:15:30 +02:00
Magnus Karlsson	ed7b74dc77	selftests: xsk: Simplify xsk and umem arrays Simplify the xsk_info and umem_info allocation by allocating them upfront in an array, instead of allocating an array of pointers to future creations of these. Allocating them upfront also has the advantage that configuration information can be stored in these structures instead of relying on global variables. With the previous structure, xsk_info and umem_info were created too late to be able to store most configuration information. This will be used to eliminate most global variables in later patches in this series. Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Link: https://lore.kernel.org/bpf/20210907071928.9750-2-magnus.karlsson@gmail.com	2021-09-10 21:15:30 +02:00
Quentin Monnet	0b46b75505	libbpf: Add LIBBPF_DEPRECATED_SINCE macro for scheduling API deprecations Introduce a macro LIBBPF_DEPRECATED_SINCE(major, minor, message) to prepare the deprecation of two API functions. This macro marks functions as deprecated when libbpf's version reaches the values passed as an argument. As part of this change libbpf_version.h header is added with recorded major (LIBBPF_MAJOR_VERSION) and minor (LIBBPF_MINOR_VERSION) libbpf version macros. They are now part of libbpf public API and can be relied upon by user code. libbpf_version.h is installed system-wide along other libbpf public headers. Due to this new build-time auto-generated header, in-kernel applications relying on libbpf (resolve_btfids, bpftool, bpf_preload) are updated to include libbpf's output directory as part of a list of include search paths. Better fix would be to use libbpf's make_install target to install public API headers, but that clean up is left out as a future improvement. The build changes were tested by building kernel (with KBUILD_OUTPUT and O= specified explicitly), bpftool, libbpf, selftests/bpf, and resolve_btfids builds. No problems were detected. Note that because of the constraints of the C preprocessor we have to write a few lines of macro magic for each version used to prepare deprecation (0.6 for now). Also, use LIBBPF_DEPRECATED_SINCE() to schedule deprecation of btf__get_from_id() and btf__load(), which are replaced by btf__load_from_kernel_by_id() and btf__load_into_kernel(), respectively, starting from future libbpf v0.6. This is part of libbpf 1.0 effort ([0]). [0] Closes: https://github.com/libbpf/libbpf/issues/278 Co-developed-by: Quentin Monnet <quentin@isovalent.com> Co-developed-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Quentin Monnet <quentin@isovalent.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20210908213226.1871016-1-andrii@kernel.org	2021-09-09 23:28:05 +02:00
Andrii Nakryiko	006a5099fc	libbpf: Fix build with latest gcc/binutils with LTO After updating to binutils 2.35, the build began to fail with an assembler error. A bug was opened on the Red Hat Bugzilla a few days later for the same issue. Work around the problem by using the new `symver` attribute (introduced in GCC 10) as needed instead of assembler directives. This addresses Red Hat ([0]) and OpenSUSE ([1]) bug reports, as well as libbpf issue ([2]). [0]: https://bugzilla.redhat.com/show_bug.cgi?id=1863059 [1]: https://bugzilla.opensuse.org/show_bug.cgi?id=1188749 [2]: Closes: https://github.com/libbpf/libbpf/issues/338 Co-developed-by: Patrick McCarty <patrick.mccarty@intel.com> Co-developed-by: Michal Suchanek <msuchanek@suse.de> Signed-off-by: Patrick McCarty <patrick.mccarty@intel.com> Signed-off-by: Michal Suchanek <msuchanek@suse.de> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20210907221023.2660953-1-andrii@kernel.org	2021-09-07 19:32:04 -07:00
Matt Smith	980a1a4c34	selftests/bpf: Add checks for X__elf_bytes() skeleton helper This patch adds two checks for the X__elf_bytes BPF skeleton helper method. The first asserts that the pointer returned from the helper method is valid, the second asserts that the provided size pointer is set. Signed-off-by: Matt Smith <alastorze@fb.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20210901194439.3853238-4-alastorze@fb.com	2021-09-07 17:44:39 -07:00
Matt Smith	a6cc6b34b9	bpftool: Provide a helper method for accessing skeleton's embedded ELF data This adds a skeleton method X__elf_bytes() which returns the binary data of the compiled and embedded BPF object file. It additionally sets the size of the return data to the provided size_t pointer argument. The assignment to s->data is cast to void * to ensure no warning is issued if compiled with a previous version of libbpf where the bpf_object_skeleton field is void * instead of const void * Signed-off-by: Matt Smith <alastorze@fb.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20210901194439.3853238-3-alastorze@fb.com	2021-09-07 17:41:23 -07:00
Matt Smith	08a6f22ef6	libbpf: Change bpf_object_skeleton data field to const pointer This change was necessary to enforce the implied contract that bpf_object_skeleton->data should not be mutated. The data will be cast to `void ` during assignment to handle the case where a user is compiling with older libbpf headers to avoid a compiler warning of `const void ` data being cast to `void *` Signed-off-by: Matt Smith <alastorze@fb.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20210901194439.3853238-2-alastorze@fb.com	2021-09-07 17:33:49 -07:00
Toke Høiland-Jørgensen	03e601f48b	libbpf: Don't crash on object files with no symbol tables If libbpf encounters an ELF file that has been stripped of its symbol table, it will crash in bpf_object__add_programs() when trying to dereference the obj->efile.symbols pointer. Fix this by erroring out of bpf_object__elf_collect() if it is not able able to find the symbol table. v2: - Move check into bpf_object__elf_collect() and add nice error message Fixes: `6245947c1b` ("libbpf: Allow gaps in BPF program sections to support overriden weak functions") Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20210901114812.204720-1-toke@redhat.com	2021-09-07 17:18:48 -07:00
Neil Spring	b238290b96	bpf: Permit ingress_ifindex in bpf_prog_test_run_xattr bpf_prog_test_run_xattr takes a struct __sk_buff, but did not permit that __skbuff to include an nonzero ingress_ifindex. This patch updates to allow ingress_ifindex, convert the __sk_buff field to sk_buff (skb_iif) and back, and tests that the value is present from on BPF program side. The test sets an unlikely distinct value for ingress_ifindex (11) from ifindex (1), which is in line with the rest of the synthetic field tests. Adding this support allows testing BPF that operates differently on incoming and outgoing skbs by discriminating on this field. Signed-off-by: Neil Spring <ntspring@fb.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: John Fastabend <john.fastabend@gmail.com> Link: https://lore.kernel.org/bpf/20210831033356.1459316-1-ntspring@fb.com	2021-09-07 16:55:32 -07:00
Linus Torvalds	27151f1778	Merge tag 'perf-tools-for-v5.15-2021-09-04' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux Pull perf tool updates from Arnaldo Carvalho de Melo: "New features: - Improvements for the flamegraph python script, including: - Display perf.data header - Display PIDs of user stacks - Added option to change color scheme - Default to blue/green color scheme to improve accessibility - Correctly identify kernel stacks when debuginfo is available - Improvements for 'perf bench futex': - Add --mlockall parameter - Add --broadcast and --pi to the 'requeue' sub benchmark - Add support for PMU aliases. - Introduce an ARM Coresight ETE decoder. - Add a 'perf bench' entry for evlist open/close operations, to help quantify improvements with multithreading 'perf record'. - Allow reporting the [un]throttle PERF_RECORD_ meta event in 'perf script's python scripting. - Add a 'perf test' entry for PMU aliases. - Add a 'perf test' entry for 'perf record/perf report/perf script' pipe mode. Fixes: - perf script dlfilter (API for filtering via dynamically loaded shared object introduced in v5.14) fixes and a 'perf test' entry for it. - Fix get_current_dir_name() compilation on Android. - Fix issues with asciidoc and double dashes uses. - Fix memory leaks in the BTF handling code. - Fix leftover problems in the Documentation from the infrastructure originally lifted from the git codebase. - Fix probe_vfs_getname.sh 'perf test' failures. - Handle fd gaps in 'perf test's test__dso_data_reopen(). - Make sure to show disasembly warnings for 'perf annotate --stdio'. - Fix output from pipe to file and vice-versa in 'perf record/report/script'. - Correct 'perf data -h' output. - Fix wrong comm in system-wide mode with 'perf record --delay'. - Do not allow --for-each-cgroup without cpu in 'perf stat' - Make 'perf test --skip' work on shell tests. - Fix libperf's verbose printing. Misc improvements: - Preparatory patches for multithreading various 'perf record' phases (synthesizing, opening, recording, etc). - Add sparse context/locking annotations in compiler-types.h, also to help with the multithreading effort. - Optimize the generation of the arch specific erno tables used in 'perf trace'. - Optimize libperf's perf_cpu_map__max(). - Improve ARM's CoreSight warnings. - Report collisions in AUX records. - Improve warnings for the LLVM 'perf test' entry. - Improve the PMU events 'perf test' codebase. - perf test: Do not compare overheads in the zstd comp test - Better support annotation on ARM. - Update 'perf trace's cmd string table to decode sys_bpf() first arg. Vendor events: - Add JSON events and metrics for Intel's Ice Lake, Tiger Lake and Elhart Lake. - Update JSON eventsand metrics for Intel's Cascade Lake and Sky Lake servers. Hardware tracing: - Improvements for the ARM hardware tracing auxtrace support" tag 'perf-tools-for-v5.15-2021-09-04' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux: (130 commits) perf tests: Add test for PMU aliases perf pmu: Add PMU alias support perf session: Report collisions in AUX records perf script python: Allow reporting the [un]throttle PERF_RECORD_ meta event perf build: Report failure for testing feature libopencsd perf cs-etm: Show a warning for an unknown magic number perf cs-etm: Print the decoder name perf cs-etm: Create ETE decoder perf cs-etm: Update OpenCSD decoder for ETE perf cs-etm: Fix typo perf cs-etm: Save TRCDEVARCH register perf cs-etm: Refactor out ETMv4 header saving perf cs-etm: Initialise architecture based on TRCIDR1 perf cs-etm: Refactor initialisation of decoder params. tools build: Fix feature detect clean for out of source builds perf evlist: Add evlist__for_each_entry_from() macro perf evsel: Handle precise_ip fallback in evsel__open_cpu() perf evsel: Move bpf_counter__install_pe() to success path in evsel__open_cpu() perf evsel: Move test_attr__open() to success path in evsel__open_cpu() perf evsel: Move ignore_missing_thread() to fallback code ...	2021-09-05 11:56:18 -07:00
Linus Torvalds	58ca241587	Merge tag 'trace-v5.15' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace Pull tracing updates from Steven Rostedt: - simplify the Kconfig use of FTRACE and TRACE_IRQFLAGS_SUPPORT - bootconfig can now start histograms - bootconfig supports group/all enabling - histograms now can put values in linear size buckets - execnames can be passed to synthetic events - introduce "event probes" that attach to other events and can retrieve data from pointers of fields, or record fields as different types (a pointer to a string as a string instead of just a hex number) - various fixes and clean ups * tag 'trace-v5.15' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: (35 commits) tracing/doc: Fix table format in histogram code selftests/ftrace: Add selftest for testing duplicate eprobes and kprobes selftests/ftrace: Add selftest for testing eprobe events on synthetic events selftests/ftrace: Add test case to test adding and removing of event probe selftests/ftrace: Fix requirement check of README file selftests/ftrace: Add clear_dynamic_events() to test cases tracing: Add a probe that attaches to trace events tracing/probes: Reject events which have the same name of existing one tracing/probes: Have process_fetch_insn() take a void * instead of pt_regs tracing/probe: Change traceprobe_set_print_fmt() to take a type tracing/probes: Use struct_size() instead of defining custom macros tracing/probes: Allow for dot delimiter as well as slash for system names tracing/probe: Have traceprobe_parse_probe_arg() take a const arg tracing: Have dynamic events have a ref counter tracing: Add DYNAMIC flag for dynamic events tracing: Replace deprecated CPU-hotplug functions. MAINTAINERS: Add an entry for os noise/latency tracepoint: Fix kerneldoc comments bootconfig/tracing/ktest: Update ktest example for boot-time tracing tools/bootconfig: Use per-group/all enable option in ftrace2bconf script ...	2021-09-05 11:50:41 -07:00
Linus Torvalds	f1583cb1be	Merge tag 'linux-kselftest-next-5.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest Pull Kselftest updates from Shuah Khan: "Fixes to build and test failures: - openat2 test failure for O_LARGEFILE flag on ARM64 - x86 test build failures related to glibc 2.34 adding support for variable sized MINSIGSTKSZ and SIGSTKSZ - removing obsolete configs in sync and cpufreq config files - minor spelling and duplicate header include cleanups" * tag 'linux-kselftest-next-5.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest: selftests/cpufreq: Rename DEBUG_PI_LIST to DEBUG_PLIST selftests/sync: Remove the deprecated config SYNC selftests: safesetid: Fix spelling mistake "cant" -> "can't" selftests/x86: Fix error: variably modified 'altstack_data' at file scope kselftest:sched: remove duplicate include in cs_prctl_test.c selftests: openat2: Fix testing failure for O_LARGEFILE flag	2021-09-03 15:55:41 -07:00
Linus Torvalds	7cca308cfd	Merge tag 'powerpc-5.15-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull powerpc updates from Michael Ellerman: - Convert pseries & powernv to use MSI IRQ domains. - Rework the pseries CPU numbering so that CPUs that are removed, and later re-added, are given a CPU number on the same node as previously, when possible. - Add support for a new more flexible device-tree format for specifying NUMA distances. - Convert powerpc to GENERIC_PTDUMP. - Retire sbc8548 and sbc8641d board support. - Various other small features and fixes. Thanks to Alexey Kardashevskiy, Aneesh Kumar K.V, Anton Blanchard, Cédric Le Goater, Christophe Leroy, Emmanuel Gil Peyrot, Fabiano Rosas, Fangrui Song, Finn Thain, Gautham R. Shenoy, Hari Bathini, Joel Stanley, Jordan Niethe, Kajol Jain, Laurent Dufour, Leonardo Bras, Lukas Bulwahn, Marc Zyngier, Masahiro Yamada, Michal Suchanek, Nathan Chancellor, Nicholas Piggin, Parth Shah, Paul Gortmaker, Pratik R. Sampat, Randy Dunlap, Sebastian Andrzej Siewior, Srikar Dronamraju, Wan Jiabing, Xiongwei Song, and Zheng Yongjun. * tag 'powerpc-5.15-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (154 commits) powerpc/bug: Cast to unsigned long before passing to inline asm powerpc/ptdump: Fix generic ptdump for 64-bit KVM: PPC: Fix clearing never mapped TCEs in realmode powerpc/pseries/iommu: Rename "direct window" to "dma window" powerpc/pseries/iommu: Make use of DDW for indirect mapping powerpc/pseries/iommu: Find existing DDW with given property name powerpc/pseries/iommu: Update remove_dma_window() to accept property name powerpc/pseries/iommu: Reorganize iommu_table_setparms*() with new helper powerpc/pseries/iommu: Add ddw_property_create() and refactor enable_ddw() powerpc/pseries/iommu: Allow DDW windows starting at 0x00 powerpc/pseries/iommu: Add ddw_list_new_entry() helper powerpc/pseries/iommu: Add iommu_pseries_alloc_table() helper powerpc/kernel/iommu: Add new iommu_table_in_use() helper powerpc/pseries/iommu: Replace hard-coded page shift powerpc/numa: Update cpu_cpu_map on CPU online/offline powerpc/numa: Print debug statements only when required powerpc/numa: convert printk to pr_xxx powerpc/numa: Drop dbg in favour of pr_debug powerpc/smp: Enable CACHE domain for shared processor powerpc/smp: Update cpu_core_map on all PowerPc systems ...	2021-09-03 11:22:50 -07:00
Linus Torvalds	14726903c8	Merge branch 'akpm' (patches from Andrew) Merge misc updates from Andrew Morton: "173 patches. Subsystems affected by this series: ia64, ocfs2, block, and mm (debug, pagecache, gup, swap, shmem, memcg, selftests, pagemap, mremap, bootmem, sparsemem, vmalloc, kasan, pagealloc, memory-failure, hugetlb, userfaultfd, vmscan, compaction, mempolicy, memblock, oom-kill, migration, ksm, percpu, vmstat, and madvise)" * emailed patches from Andrew Morton <akpm@linux-foundation.org>: (173 commits) mm/madvise: add MADV_WILLNEED to process_madvise() mm/vmstat: remove unneeded return value mm/vmstat: simplify the array size calculation mm/vmstat: correct some wrong comments mm/percpu,c: remove obsolete comments of pcpu_chunk_populated() selftests: vm: add COW time test for KSM pages selftests: vm: add KSM merging time test mm: KSM: fix data type selftests: vm: add KSM merging across nodes test selftests: vm: add KSM zero page merging test selftests: vm: add KSM unmerge test selftests: vm: add KSM merge test mm/migrate: correct kernel-doc notation mm: wire up syscall process_mrelease mm: introduce process_mrelease system call memblock: make memblock_find_in_range method private mm/mempolicy.c: use in_task() in mempolicy_slab_node() mm/mempolicy: unify the create() func for bind/interleave/prefer-many policies mm/mempolicy: advertise new MPOL_PREFERRED_MANY mm/hugetlb: add support for mempolicy MPOL_PREFERRED_MANY ...	2021-09-03 10:08:28 -07:00
Zhansaya Bagdauletkyzy	924a11bd16	selftests: vm: add COW time test for KSM pages Since merged pages are copied every time they need to be modified, the write access time is different between shared and non-shared pages. Add ksm_cow_time() function which evaluates latency of these COW breaks. First, 4000 pages are allocated and the time, required to modify 1 byte in every other page, is measured. After this, the pages are merged into 2000 pairs and in each pair, 1 page is modified (i.e. they are decoupled) to detect COW breaks. The time needed to break COW of merged pages is then compared with performance of non-shared pages. The test is run as follows: ./ksm_tests -C The output: Total size: 15 MiB Not merged pages: Total time: 0.002185489 s Average speed: 3202.945 MiB/s Merged pages: Total time: 0.004386872 s Average speed: 1595.670 MiB/s Link: https://lkml.kernel.org/r/1d03ee0d1b341959d4b61672c6401d498bff5652.1629386192.git.zhansayabagdaulet@gmail.com Signed-off-by: Zhansaya Bagdauletkyzy <zhansayabagdaulet@gmail.com> Reviewed-by: Tyler Hicks <tyhicks@linux.microsoft.com> Reviewed-by: Pavel Tatashin <pasha.tatashin@soleen.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2021-09-03 09:58:18 -07:00
Zhansaya Bagdauletkyzy	9e7cb94ca2	selftests: vm: add KSM merging time test Patch series "add KSM performance tests", v3. Extend KSM self tests with a performance benchmark. These tests are not part of regular regression testing, as they are mainly intended to be used by developers making changes to the memory management subsystem. This patch (of 2): Add ksm_merge_time() function to determine speed and time needed for merging. The total spent time is shown in seconds while speed is in MiB/s. User must specify the size of duplicated memory area (in MiB) before running the test. The test is run as follows: ./ksm_tests -P -s 100 The output: Total size: 100 MiB Total time: 0.201106786 s Average speed: 497.248 MiB/s Link: https://lkml.kernel.org/r/cover.1629386192.git.zhansayabagdaulet@gmail.com Link: https://lkml.kernel.org/r/318b946ac80cc9205c89d0962048378f7ce0705b.1629386192.git.zhansayabagdaulet@gmail.com Signed-off-by: Zhansaya Bagdauletkyzy <zhansayabagdaulet@gmail.com> Reviewed-by: Tyler Hicks <tyhicks@linux.microsoft.com> Reviewed-by: Pavel Tatashin <pasha.tatashin@soleen.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2021-09-03 09:58:18 -07:00
Zhansaya Bagdauletkyzy	82e717ad35	selftests: vm: add KSM merging across nodes test Add check_ksm_numa_merge() function to test that pages in different NUMA nodes are being handled properly. First, two duplicate pages are allocated in two separate NUMA nodes using the libnuma library. Since there is one unique page in each node, with merge_across_nodes = 0, there won't be any shared pages. If merge_across_nodes is set to 1, the pages will be treated as usual duplicate pages and will be merged. If NUMA config is not enabled or the number of NUMA nodes is less than two, then the test is skipped. The test is run as follows: ./ksm_tests -N Link: https://lkml.kernel.org/r/071c17b5b04ebb0dfeba137acc495e5dd9d2a719.1626252248.git.zhansayabagdaulet@gmail.com Signed-off-by: Zhansaya Bagdauletkyzy <zhansayabagdaulet@gmail.com> Reviewed-by: Pavel Tatashin <pasha.tatashin@soleen.com> Reviewed-by: Tyler Hicks <tyhicks@linux.microsoft.com> Cc: Hugh Dickins <hughd@google.com> Cc: Shuah Khan <shuah@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2021-09-03 09:58:18 -07:00
Zhansaya Bagdauletkyzy	39619982c5	selftests: vm: add KSM zero page merging test Add check_ksm_zero_page_merge() function to test that empty pages are being handled properly. For this, several zero pages are allocated and merged using madvise. If use_zero_pages is enabled, the pages must be shared with the special kernel zero pages; otherwise, they are merged as usual duplicate pages. The test is run as follows: ./ksm_tests -Z Link: https://lkml.kernel.org/r/6d0caab00d4bdccf5e3791cb95cf6dfd5eb85e45.1626252248.git.zhansayabagdaulet@gmail.com Signed-off-by: Zhansaya Bagdauletkyzy <zhansayabagdaulet@gmail.com> Reviewed-by: Pavel Tatashin <pasha.tatashin@soleen.com> Reviewed-by: Tyler Hicks <tyhicks@linux.microsoft.com> Cc: Hugh Dickins <hughd@google.com> Cc: Shuah Khan <shuah@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2021-09-03 09:58:18 -07:00
Zhansaya Bagdauletkyzy	a40c80e348	selftests: vm: add KSM unmerge test Add check_ksm_unmerge() function to verify that KSM is properly unmerging shared pages. For this, two duplicate pages are merged first and then their contents are modified. Since they are not identical anymore, the pages must be unmerged and the number of merged pages has to be 0. The test is run as follows: ./ksm_tests -U Link: https://lkml.kernel.org/r/c0f55420440d704d5b094275b4365aa1b2ad46b5.1626252248.git.zhansayabagdaulet@gmail.com Signed-off-by: Zhansaya Bagdauletkyzy <zhansayabagdaulet@gmail.com> Reviewed-by: Pavel Tatashin <pasha.tatashin@soleen.com> Reviewed-by: Tyler Hicks <tyhicks@linux.microsoft.com> Cc: Hugh Dickins <hughd@google.com> Cc: Shuah Khan <shuah@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2021-09-03 09:58:18 -07:00
Zhansaya Bagdauletkyzy	68d6289baa	selftests: vm: add KSM merge test Patch series "add KSM selftests". Introduce selftests to validate the functionality of KSM. The tests are run on private anonymous pages. Since some KSM tunables are modified, their starting values are saved and restored after testing. At the start, run is set to 2 to ensure that only test pages will be merged (we assume that no applications make madvise syscalls in the background). If KSM config not enabled, all tests will be skipped. This patch (of 4): Add check_ksm_merge() function to check the basic merging feature of KSM. First, some number of identical pages are allocated and the MADV_MERGEABLE advice is given to merge these pages. Then, pages_shared and pages_sharing values are compared with the expected numbers using assert_ksm_pages_count() function. The number of pages can be changed using -p option. Link: https://lkml.kernel.org/r/cover.1626252248.git.zhansayabagdaulet@gmail.com Link: https://lkml.kernel.org/r/90287685c13300972ea84de93d1f3f900373f9fe.1626252248.git.zhansayabagdaulet@gmail.com Signed-off-by: Zhansaya Bagdauletkyzy <zhansayabagdaulet@gmail.com> Reviewed-by: Pavel Tatashin <pasha.tatashin@soleen.com> Reviewed-by: Tyler Hicks <tyhicks@linux.microsoft.com> Cc: Shuah Khan <shuah@kernel.org> Cc: Hugh Dickins <hughd@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2021-09-03 09:58:18 -07:00
Nadav Amit	4410cbb5c9	selftests/vm/userfaultfd: wake after copy failure When userfaultfd copy-ioctl fails since the PTE already exists, an -EEXIST error is returned and the faulting thread is not woken. The current userfaultfd test does not wake the faulting thread in such case. The assumption is presumably that another thread set the PTE through copy/wp ioctl and would wake the faulting thread or that alternatively the fault handler would realize there is no need to "must_wait" and continue. This is not necessarily true. There is an assumption that the "must_wait" tests in handle_userfault() are sufficient to provide definitive answer whether the offending PTE is populated or not. However, userfaultfd_must_wait() test is lockless. Consequently, concurrent calls to ptep_modify_prot_start(), for instance, can clear the PTE and can cause userfaultfd_must_wait() to wrongly assume it is not populated and a wait is needed. There are therefore 3 options: (1) Change the tests to wake on copy failure. (2) Wake faulting thread unconditionally on zero/copy ioctls before returning -EEXIST. (3) Change the userfaultfd_must_wait() to hold locks. This patch took the first approach, but the others are valid solutions with different tradeoffs. Link: https://lkml.kernel.org/r/20210808020724.1022515-4-namit@vmware.com Signed-off-by: Nadav Amit <namit@vmware.com> Cc: Jens Axboe <axboe@kernel.dk> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Peter Xu <peterx@redhat.com> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Axel Rasmussen <axelrasmussen@google.com> Cc: Mike Rapoport <rppt@linux.vnet.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2021-09-03 09:58:16 -07:00
Christoph Hellwig	f358afc52c	mm: remove flush_kernel_dcache_page flush_kernel_dcache_page is a rather confusing interface that implements a subset of flush_dcache_page by not being able to properly handle page cache mapped pages. The only callers left are in the exec code as all other previous callers were incorrect as they could have dealt with page cache pages. Replace the calls to flush_kernel_dcache_page with calls to flush_dcache_page, which for all architectures does either exactly the same thing, can contains one or more of the following: 1) an optimization to defer the cache flush for page cache pages not mapped into userspace 2) additional flushing for mapped page cache pages if cache aliases are possible Link: https://lkml.kernel.org/r/20210712060928.4161649-7-hch@lst.de Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Cc: Alex Shi <alexs@kernel.org> Cc: Geoff Levand <geoff@infradead.org> Cc: Greentime Hu <green.hu@gmail.com> Cc: Guo Ren <guoren@kernel.org> Cc: Helge Deller <deller@gmx.de> Cc: "James E.J. Bottomley" <James.Bottomley@HansenPartnership.com> Cc: Nick Hu <nickhu@andestech.com> Cc: Paul Cercueil <paul@crapouillou.net> Cc: Rich Felker <dalias@libc.org> Cc: Russell King <linux@armlinux.org.uk> Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Cc: Ulf Hansson <ulf.hansson@linaro.org> Cc: Vincent Chen <deanbo422@gmail.com> Cc: Yoshinori Sato <ysato@users.osdn.me> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2021-09-03 09:58:13 -07:00

1 2 3 4 5 ...

26740 Commits