linux/net
Stefano Brivio a58db7ad80 netfilter: nft_set_pipapo_avx2: Skip LDMXCSR, we don't need a valid MXCSR state
We don't need a valid MXCSR state for the lookup routines, none of
the instructions we use rely on or affect any bit in the MXCSR
register.

Instead of calling kernel_fpu_begin(), we can pass 0 as mask to
kernel_fpu_begin_mask() and spare one LDMXCSR instruction.

Commit 49200d17d2 ("x86/fpu/64: Don't FNINIT in kernel_fpu_begin()")
already speeds up lookups considerably, and by dropping the MCXSR
initialisation we can now get a much smaller, but measurable, increase
in matching rates.

The table below reports matching rates and a wild approximation of
clock cycles needed for a match in a "port,net" test with 10 entries
from selftests/netfilter/nft_concat_range.sh, limited to the first
field, i.e. the port (with nft_set_rbtree initialisation skipped), run
on a single AMD Epyc 7351 thread (2.9GHz, 512 KiB L1D$, 8 MiB L2$).

The (very rough) estimation of clock cycles is obtained by simply
dividing frequency by matching rate. The "cycles spared" column refers
to the difference in cycles compared to the previous row, and the rate
increase also refers to the previous row. Results are averages of six
runs.

Merely for context, I'm also reporting packet rates obtained by
skipping kernel_fpu_begin() and kernel_fpu_end() altogether (which
shows a very limited impact now), as well as skipping the whole lookup
function, compared to simply counting and dropping all packets using
the netdev hook drop (see nft_concat_range.sh for details). This
workload also includes packet generation with pktgen and the receive
path of veth.

                                      |matching|  est.  | cycles |  rate  |
                                      |  rate  | cycles | spared |increase|
                                      | (Mpps) |        |        |        |
--------------------------------------|--------|--------|--------|--------|
FNINIT, LDMXCSR (before 49200d17d2) |  5.245 |    553 |      - |      - |
LDMXCSR only (with 49200d17d2)      |  6.347 |    457 |     96 |  21.0% |
Without LDMXCSR (this patch)          |  6.461 |    449 |      8 |   1.8% |
-------- for reference only: ---------|--------|--------|--------|--------|
Without kernel_fpu_begin()            |  6.513 |    445 |      4 |   0.8% |
Without actual matching (return true) |  7.649 |    379 |     66 |  17.4% |
Without lookup operation (netdev drop)| 10.320 |    281 |     98 |  34.9% |

The clock cycles spared by avoiding LDMXCSR appear to be in line with CPI
and latency indicated in the manuals of comparable architectures: Intel
Skylake (CPI: 1, latency: 7) and AMD 12h (latency: 12) -- I couldn't find
this information for AMD 17h.

Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2021-05-28 21:11:41 +02:00
..
6lowpan 6lowpan: Fix some typos in nhc_udp.c 2021-03-24 17:52:11 -07:00
9p net: 9p: Correct function names in the kerneldoc comments 2021-03-28 17:56:56 -07:00
802
8021q Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2021-04-26 12:00:00 -07:00
appletalk appletalk: Fix skb allocation size in loopback case 2021-02-12 16:40:28 -08:00
atm net: atm: use DEVICE_ATTR_RO macro 2021-05-20 15:50:54 -07:00
ax25 net/ax25: Delete obsolete TODO file 2021-03-30 16:54:50 -07:00
batman-adv Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2021-04-09 20:48:35 -07:00
bluetooth Networking changes for 5.13. 2021-04-29 11:57:23 -07:00
bpf bpf: Prepare bpf syscall to be used from kernel and user space. 2021-05-19 00:33:40 +02:00
bpfilter net: remove redundant 'depends on NET' 2021-01-27 17:04:12 -08:00
bridge net: bridge: remove redundant assignment 2021-05-25 15:20:27 -07:00
caif net: caif: Drop unnecessary NULL check after container_of 2021-05-13 15:55:50 -07:00
can linux-can-next-for-5.14-20210527 2021-05-27 14:39:11 -07:00
ceph Notable items here are a series to take advantage of David Howells' 2021-05-06 10:27:02 -07:00
core devlink: append split port number to the port name 2021-05-27 14:21:12 -07:00
dcb net: dcb: Remove unnecessary INIT_LIST_HEAD() 2021-05-18 13:44:24 -07:00
dccp net: Remove the member netns_ok 2021-05-17 15:29:35 -07:00
decnet net/decnet: Delete obsolete TODO file 2021-03-30 16:54:50 -07:00
dns_resolver net: remove redundant 'depends on NET' 2021-01-27 17:04:12 -08:00
dsa net: dsa: fix error code getting shifted with 4 in dsa_slave_get_sset_count 2021-05-10 14:36:59 -07:00
ethernet of: net: pass the dst buffer to of_get_mac_address() 2021-04-13 14:35:02 -07:00
ethtool ethtool: stats: Fix a copy-paste error 2021-05-19 11:57:33 -07:00
hsr net: hsr: fix mac_len checks 2021-05-24 14:10:28 -07:00
ieee802154 net: remove the new_ifindex argument from dev_change_net_namespace 2021-04-07 14:43:28 -07:00
ife net: remove redundant 'depends on NET' 2021-01-27 17:04:12 -08:00
ipv4 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2021-05-27 09:55:10 -07:00
ipv6 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2021-05-27 09:55:10 -07:00
iucv iucv: af_iucv.c: Couple of typo fixes 2021-03-28 17:31:13 -07:00
kcm kcm: kcmsock.c: Couple of typo fixes 2021-03-28 17:31:13 -07:00
key af_key: relax availability checks for skb size calculation 2021-01-04 10:05:50 +01:00
l2tp net: Remove the member netns_ok 2021-05-17 15:29:35 -07:00
l3mdev l3mdev: Correct function names in the kerneldoc comments 2021-03-28 17:56:55 -07:00
lapb net: lapb: Make "lapb_t1timer_running" able to detect an already running timer 2021-03-23 14:14:50 -07:00
llc llc2: Remove redundant assignment to rc 2021-04-27 14:16:14 -07:00
mac80211 mac80211: extend protection against mixed key and fragment cache attacks 2021-05-11 20:14:50 +02:00
mac802154 net: mac802154: Fix general protection fault 2021-04-06 22:42:16 +02:00
mpls mpls: Remove redundant assignment to err 2021-04-27 14:17:00 -07:00
mptcp mptcp: validate 'id' when stopping the ADD_ADDR retransmit timer 2021-05-25 15:56:20 -07:00
ncsi Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2021-04-09 20:48:35 -07:00
netfilter netfilter: nft_set_pipapo_avx2: Skip LDMXCSR, we don't need a valid MXCSR state 2021-05-28 21:11:41 +02:00
netlabel netlabel: remove unused parameter in netlbl_netlink_auditinfo() 2021-05-19 12:27:13 -07:00
netlink netlink: disable IRQs for netlink_lock_table() 2021-05-17 15:31:03 -07:00
netrom net: netrom: nr_in: Remove redundant assignment to ns 2021-04-28 13:59:08 -07:00
nfc NFC: nci: fix memory leak in nci_allocate_device 2021-05-17 13:56:29 -07:00
nsh
openvswitch Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2021-05-27 09:55:10 -07:00
packet Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2021-05-27 09:55:10 -07:00
phonet
psample psample: Add additional metadata attributes 2021-03-14 15:00:43 -07:00
qrtr net: qrtr: ns: Fix error return code in qrtr_ns_init() 2021-05-19 13:14:35 -07:00
rds RDS tcp loopback connection can hang 2021-05-21 14:46:59 -07:00
rfkill Another set of updates, all over the map: 2021-04-20 16:44:04 -07:00
rose net: rose: Fix fall-through warnings for Clang 2021-03-10 12:45:15 -08:00
rxrpc Networking changes for 5.13. 2021-04-29 11:57:23 -07:00
sched Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2021-05-27 09:55:10 -07:00
sctp Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2021-05-27 09:55:10 -07:00
smc Networking fixes for 5.13-rc4, including fixes from bpf, netfilter, 2021-05-26 17:44:49 -10:00
strparser
sunrpc NFS client updates for Linux 5.13 2021-05-07 11:23:41 -07:00
switchdev net: bridge: propagate extack through switchdev_port_attr_set 2021-02-14 17:38:11 -08:00
tipc tipc: simplify the finalize work queue 2021-05-18 13:22:09 -07:00
tls Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2021-05-27 09:55:10 -07:00
unix af_unix: handle idmapped mounts 2021-01-24 14:27:18 +01:00
vmw_vsock vsock/vmci: Remove redundant assignment to err 2021-04-30 15:00:59 -07:00
wireless cfg80211: mitigate A-MSDU aggregation attacks 2021-05-11 20:13:13 +02:00
x25 af_x25.c: Fix a spello 2021-03-28 17:31:13 -07:00
xdp xsk: Fix for xp_aligned_validate_desc() when len == chunk_size 2021-05-04 00:28:06 +02:00
xfrm xfrm: ipcomp: remove unnecessary get_cpu() 2021-04-19 12:49:29 +02:00
Kconfig bpf, kconfig: Add consolidated menu entry for bpf with core options 2021-05-11 13:56:16 -07:00
Makefile net: l3mdev: use obj-$(CONFIG_NET_L3_MASTER_DEV) form in net/Makefile 2021-01-27 17:03:52 -08:00
compat.c
devres.c
socket.c net: Fix a misspell in socket.c 2021-03-25 16:56:27 -07:00
sysctl_net.c net: Ensure net namespace isolation of sysctls 2021-04-12 13:27:11 -07:00