linux/arch
Taehee Yoo c970d42001 crypto: x86/aria - implement aria-avx512
The aria-avx512 implementation uses AVX512 and GFNI.
It supports 64-way parallel processing,
so the byteslicing code is changed to handle 64 blocks in parallel.
This change also exports some aria-avx2 functions, such as encrypt() and decrypt().
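
As a conceptual illustration of 64-way byteslicing (a sketch in Python, not the kernel's assembly), the 64 blocks of 16 bytes can be transposed so that row i holds byte i of every block. Each 64-byte row then corresponds to one 512-bit register. The names byteslice and unbyteslice are hypothetical and chosen for this example only.

```python
BLOCK_SIZE = 16   # ARIA block size in bytes (128 bits)
WAYS = 64         # blocks handled together; one 512-bit register holds 64 bytes


def byteslice(blocks):
    """Transpose WAYS blocks so that row i holds byte i of every block."""
    assert len(blocks) == WAYS
    assert all(len(b) == BLOCK_SIZE for b in blocks)
    return [bytes(b[i] for b in blocks) for i in range(BLOCK_SIZE)]


def unbyteslice(rows):
    """Inverse transpose: rebuild the WAYS original blocks from the rows."""
    assert len(rows) == BLOCK_SIZE
    assert all(len(r) == WAYS for r in rows)
    return [bytes(r[j] for r in rows) for j in range(WAYS)]


# Hypothetical test data: block j holds the bytes j, j+1, ..., j+15.
blocks = [bytes((j + i) % 256 for i in range(BLOCK_SIZE)) for j in range(WAYS)]
rows = byteslice(blocks)
```

In this layout a byte-wise operation such as an s-box lookup applies to one row at a time, which is what lets a single vector instruction work on all 64 blocks at once.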

AVX and AVX2 have 16 vector registers,
so those implementations run out of registers and must store state to memory and load it back.
AVX512 provides 32 registers,
so it needs no store/load in the s-box layer,
which removes that store/load overhead.
The code also becomes much simpler.
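
The register pressure can be checked with simple arithmetic. The sketch below assumes that the AVX version processes 16 blocks at a time and the AVX2 version 32 (the message itself only states 64-way for AVX512), and that the whole byte-sliced state is kept in vector registers. The helper register_budget is hypothetical.

```python
BLOCK_SIZE = 16  # ARIA block size in bytes


def register_budget(ways, reg_bytes, num_regs):
    """Return (registers needed for the state, registers left for temporaries)."""
    state_bytes = ways * BLOCK_SIZE
    state_regs = state_bytes // reg_bytes
    return state_regs, num_regs - state_regs


# (parallel blocks, register width in bytes, number of vector registers)
# The 16-way and 32-way figures are assumptions, not taken from the message.
configs = {
    "AVX    (16-way, 128-bit registers)": (16, 16, 16),
    "AVX2   (32-way, 256-bit registers)": (32, 32, 16),
    "AVX512 (64-way, 512-bit registers)": (64, 64, 32),
}

for name, cfg in configs.items():
    state_regs, spare = register_budget(*cfg)
    print(f"{name}: state uses {state_regs} registers, {spare} left over")
```

Under these assumptions the state fills all 16 registers on AVX and AVX2, leaving none for temporaries, while AVX512 keeps 16 registers free.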

Benchmark with modprobe tcrypt mode=610 num_mb=8192, i3-12100:

ARIA-AVX512 (128-bit and 256-bit)
    testing speed of multibuffer ecb(aria) (ecb-aria-avx512) encryption
tcrypt: 1 operation in 1504 cycles (1024 bytes)
tcrypt: 1 operation in 4595 cycles (4096 bytes)
tcrypt: 1 operation in 1763 cycles (1024 bytes)
tcrypt: 1 operation in 5540 cycles (4096 bytes)
    testing speed of multibuffer ecb(aria) (ecb-aria-avx512) decryption
tcrypt: 1 operation in 1502 cycles (1024 bytes)
tcrypt: 1 operation in 4615 cycles (4096 bytes)
tcrypt: 1 operation in 1759 cycles (1024 bytes)
tcrypt: 1 operation in 5554 cycles (4096 bytes)

ARIA-AVX2 with GFNI (128-bit and 256-bit)
    testing speed of multibuffer ecb(aria) (ecb-aria-avx2) encryption
tcrypt: 1 operation in 2003 cycles (1024 bytes)
tcrypt: 1 operation in 5867 cycles (4096 bytes)
tcrypt: 1 operation in 2358 cycles (1024 bytes)
tcrypt: 1 operation in 7295 cycles (4096 bytes)
    testing speed of multibuffer ecb(aria) (ecb-aria-avx2) decryption
tcrypt: 1 operation in 2004 cycles (1024 bytes)
tcrypt: 1 operation in 5956 cycles (4096 bytes)
tcrypt: 1 operation in 2409 cycles (1024 bytes)
tcrypt: 1 operation in 7564 cycles (4096 bytes)
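
To put the numbers above side by side, the following sketch computes the AVX2-to-AVX512 cycle ratio for each measurement. It pairs the lines in the order they are printed. Which lines belong to the 128-bit key and which to the 256-bit key is not stated in the log, so the sketch does not label them. The function name speedups is hypothetical.

```python
# (bytes, cycles) pairs copied from the tcrypt output above, in printed order.
avx512 = {
    "encryption": [(1024, 1504), (4096, 4595), (1024, 1763), (4096, 5540)],
    "decryption": [(1024, 1502), (4096, 4615), (1024, 1759), (4096, 5554)],
}
avx2 = {
    "encryption": [(1024, 2003), (4096, 5867), (1024, 2358), (4096, 7295)],
    "decryption": [(1024, 2004), (4096, 5956), (1024, 2409), (4096, 7564)],
}


def speedups(baseline, improved):
    """Cycle ratio baseline/improved for measurements of the same size."""
    ratios = []
    for (b_size, b_cycles), (i_size, i_cycles) in zip(baseline, improved):
        assert b_size == i_size  # only compare like with like
        ratios.append(b_cycles / i_cycles)
    return ratios


for op in ("encryption", "decryption"):
    for (size, _), ratio in zip(avx512[op], speedups(avx2[op], avx512[op])):
        print(f"{op}, {size} bytes: AVX512 is {ratio:.2f}x faster than AVX2")
```

On these figures the AVX512 version takes roughly 1.3 times fewer cycles than the AVX2 version in every case.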

Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2023-01-06 17:15:47 +08:00
..
alpha MM patches for 6.2-rc1. 2022-12-13 19:29:45 -08:00
arc MM patches for 6.2-rc1. 2022-12-13 19:29:45 -08:00
arm crypto: arm/sha1 - Fix clang function cast warnings 2022-12-30 22:56:27 +08:00
arm64 crypto: arm64/sm4 - fix possible crash with CFI enabled 2022-12-30 17:57:42 +08:00
csky arch/csky patches for 6.2-rc1 2022-12-19 07:51:30 -06:00
hexagon MM patches for 6.2-rc1. 2022-12-13 19:29:45 -08:00
ia64 - Add the call depth tracking mitigation for Retbleed which has 2022-12-14 15:03:00 -08:00
loongarch LoongArch changes for v6.2 2022-12-19 08:23:27 -06:00
m68k m68k: remove broken strcmp implementation 2022-12-21 08:56:43 -08:00
microblaze MM patches for 6.2-rc1. 2022-12-13 19:29:45 -08:00
mips Fixes due to DT changes 2022-12-23 10:49:45 -08:00
nios2 MM patches for 6.2-rc1. 2022-12-13 19:29:45 -08:00
openrisc MM patches for 6.2-rc1. 2022-12-13 19:29:45 -08:00
parisc parisc architecture fixes for kernel v6.2-rc1: 2022-12-20 08:43:53 -06:00
powerpc random: do not include <asm/archrandom.h> from random.h 2022-12-20 03:13:45 +01:00
riscv KVM/riscv changes for 6.2 2022-12-21 18:52:15 -08:00
s390 crypto: s390/aes - drop redundant xts key check 2023-01-06 17:15:47 +08:00
sh treewide: Convert del_timer*() to timer_shutdown*() 2022-12-25 13:38:09 -08:00
sparc MM patches for 6.2-rc1. 2022-12-13 19:29:45 -08:00
um New Feature: 2022-12-17 14:06:53 -06:00
x86 crypto: x86/aria - implement aria-avx512 2023-01-06 17:15:47 +08:00
xtensa MM patches for 6.2-rc1. 2022-12-13 19:29:45 -08:00
.gitignore
Kconfig arm64 fixes for -rc1 2022-12-16 13:46:41 -06:00