Go to file
Stefano Brivio a58db7ad80 netfilter: nft_set_pipapo_avx2: Skip LDMXCSR, we don't need a valid MXCSR state
We don't need a valid MXCSR state for the lookup routines, none of
the instructions we use rely on or affect any bit in the MXCSR
register.

Instead of calling kernel_fpu_begin(), we can pass 0 as mask to
kernel_fpu_begin_mask() and spare one LDMXCSR instruction.

Commit 49200d17d2 ("x86/fpu/64: Don't FNINIT in kernel_fpu_begin()")
already speeds up lookups considerably, and by dropping the MCXSR
initialisation we can now get a much smaller, but measurable, increase
in matching rates.

The table below reports matching rates and a wild approximation of
clock cycles needed for a match in a "port,net" test with 10 entries
from selftests/netfilter/nft_concat_range.sh, limited to the first
field, i.e. the port (with nft_set_rbtree initialisation skipped), run
on a single AMD Epyc 7351 thread (2.9GHz, 512 KiB L1D$, 8 MiB L2$).

The (very rough) estimation of clock cycles is obtained by simply
dividing frequency by matching rate. The "cycles spared" column refers
to the difference in cycles compared to the previous row, and the rate
increase also refers to the previous row. Results are averages of six
runs.

Merely for context, I'm also reporting packet rates obtained by
skipping kernel_fpu_begin() and kernel_fpu_end() altogether (which
shows a very limited impact now), as well as skipping the whole lookup
function, compared to simply counting and dropping all packets using
the netdev hook drop (see nft_concat_range.sh for details). This
workload also includes packet generation with pktgen and the receive
path of veth.

                                      |matching|  est.  | cycles |  rate  |
                                      |  rate  | cycles | spared |increase|
                                      | (Mpps) |        |        |        |
--------------------------------------|--------|--------|--------|--------|
FNINIT, LDMXCSR (before 49200d17d2) |  5.245 |    553 |      - |      - |
LDMXCSR only (with 49200d17d2)      |  6.347 |    457 |     96 |  21.0% |
Without LDMXCSR (this patch)          |  6.461 |    449 |      8 |   1.8% |
-------- for reference only: ---------|--------|--------|--------|--------|
Without kernel_fpu_begin()            |  6.513 |    445 |      4 |   0.8% |
Without actual matching (return true) |  7.649 |    379 |     66 |  17.4% |
Without lookup operation (netdev drop)| 10.320 |    281 |     98 |  34.9% |

The clock cycles spared by avoiding LDMXCSR appear to be in line with CPI
and latency indicated in the manuals of comparable architectures: Intel
Skylake (CPI: 1, latency: 7) and AMD 12h (latency: 12) -- I couldn't find
this information for AMD 17h.

Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2021-05-28 21:11:41 +02:00
Documentation linux-can-next-for-5.14-20210527 2021-05-27 14:39:11 -07:00
LICENSES LICENSES: Add the CC-BY-4.0 license 2020-12-08 10:33:27 -07:00
arch Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2021-05-27 09:55:10 -07:00
block block-5.13-2021-05-22 2021-05-22 07:40:34 -10:00
certs Kbuild updates for v5.13 (2nd) 2021-05-08 10:00:11 -07:00
crypto for-5.13/drivers-2021-04-27 2021-04-28 14:39:37 -07:00
drivers mlx5-updates-2021-05-26 2021-05-27 17:14:23 -07:00
fs proc: Check /proc/$pid/attr/ writes against file opener 2021-05-25 10:24:41 -10:00
include netfilter: nft_exthdr: Support SCTP chunks 2021-05-28 21:11:33 +02:00
init Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf 2021-05-11 16:05:56 -07:00
ipc ipc/mqueue, msg, sem: avoid relying on a stack reference past its expiry 2021-05-22 15:09:07 -10:00
kernel Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2021-05-27 09:55:10 -07:00
lib lib: kunit: suppress a compilation warning of frame size 2021-05-22 15:09:07 -10:00
mm userfaultfd: hugetlbfs: fix new flag usage in error path 2021-05-22 15:09:07 -10:00
net netfilter: nft_set_pipapo_avx2: Skip LDMXCSR, we don't need a valid MXCSR state 2021-05-28 21:11:41 +02:00
samples Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2021-05-27 09:55:10 -07:00
scripts kbuild: dummy-tools: adjust to stricter stackprotector check 2021-05-17 12:10:03 +09:00
security trusted-keys: match tpm_get_ops on all return paths 2021-05-12 22:36:37 +03:00
sound sound fixes for 5.13-rc3 2021-05-20 06:42:21 -10:00
tools selftests: devlink_lib: add check for devlink device existence 2021-05-27 14:49:07 -07:00
usr .gitignore: prefix local generated files with a slash 2021-05-02 00:43:35 +09:00
virt kvm: Cap halt polling at kvm->max_halt_poll_ns 2021-05-07 06:06:22 -04:00
.clang-format cxl for 5.12 2021-02-24 09:38:36 -08:00
.cocciconfig
.get_maintainer.ignore Opt out of scripts/get_maintainer.pl 2019-05-16 10:53:40 -07:00
.gitattributes .gitattributes: use 'dts' diff driver for dts files 2019-12-04 19:44:11 -08:00
.gitignore .gitignore: ignore only top-level modules.builtin 2021-05-02 00:43:35 +09:00
.mailmap Merge drm/drm-fixes into drm-misc-fixes 2021-05-11 13:35:52 +02:00
COPYING COPYING: state that all contributions really are covered by this file 2020-02-10 13:32:20 -08:00
CREDITS MAINTAINERS: move Murali Karicheri to credits 2021-04-29 15:47:30 -07:00
Kbuild kbuild: rename hostprogs-y/always to hostprogs/always-y 2020-02-04 01:53:07 +09:00
Kconfig kbuild: ensure full rebuild when the compiler is updated 2020-05-12 13:28:33 +09:00
MAINTAINERS Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2021-05-27 09:55:10 -07:00
Makefile Linux 5.13-rc3 2021-05-23 11:42:48 -10:00
README Drop all 00-INDEX files from Documentation/ 2018-09-09 15:08:58 -06:00

README

Linux kernel
============

There are several guides for kernel developers and users. These guides can
be rendered in a number of formats, like HTML and PDF. Please read
Documentation/admin-guide/README.rst first.

In order to build the documentation, use ``make htmldocs`` or
``make pdfdocs``.  The formatted documentation can also be read online at:

    https://www.kernel.org/doc/html/latest/

There are various text files in the Documentation/ subdirectory,
several of them using the Restructured Text markup notation.

Please read the Documentation/process/changes.rst file, as it contains the
requirements for building and running the kernel, and information about
the problems which may result by upgrading your kernel.