rijndael: add x86_64 VAES/AVX2 accelerated implementation

Description

* cipher/Makefile.am: Add 'rijndael-vaes.c' and
'rijndael-vaes-avx2-amd64.S'.
* cipher/rijndael-internal.h (USE_VAES): New.
* cipher/rijndael-vaes-avx2-amd64.S: New.
* cipher/rijndael-vaes.c: New.
* cipher/rijndael.c (_gcry_aes_vaes_cfb_dec, _gcry_aes_vaes_cbc_dec)
(_gcry_aes_vaes_ctr_enc, _gcry_aes_vaes_ocb_crypt)
(_gcry_aes_vaes_xts_crypt): New.
(do_setkey) [USE_VAES]: Add detection for VAES.
(selftest_ctr_128, selftest_cbc_128, selftest_cfb_128)
[USE_VAES]: Increase number of selftest blocks.
* configure.ac: Add 'rijndael-vaes.lo' and
'rijndael-vaes-avx2-amd64.lo'.

This patch adds a VAES/AVX2 accelerated implementation of CBC decryption,
CFB decryption, CTR encryption, OCB encryption/decryption and XTS
encryption/decryption.
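
As an illustration of the kind of check the "Add detection for VAES" entry refers to, the following minimal sketch tests the CPUID feature bits for AVX2 and VAES. It is not libgcrypt's detection code: the helper names `regs_have_vaes_avx2` and `cpu_has_vaes_avx2` are hypothetical, the sketch assumes GCC or Clang (for `<cpuid.h>`), and it leaves out the OS-support check (OSXSAVE/XGETBV) that a production check for usable AVX state would also need.

```c
#include <stdbool.h>
#include <stdint.h>

#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h>   /* GCC/Clang wrapper around the CPUID instruction */
#endif

/* CPUID leaf 7, subleaf 0 feature bits. */
#define CPUID7_EBX_AVX2  (UINT32_C(1) << 5)   /* EBX bit 5: AVX2 */
#define CPUID7_ECX_VAES  (UINT32_C(1) << 9)   /* ECX bit 9: VAES */

/* Hypothetical helper: decode both feature bits from raw register values. */
bool regs_have_vaes_avx2(uint32_t ebx, uint32_t ecx)
{
  return (ebx & CPUID7_EBX_AVX2) != 0 && (ecx & CPUID7_ECX_VAES) != 0;
}

/* Hypothetical helper: query the running CPU.  Returns false on non-x86
 * targets or when CPUID leaf 7 is not available. */
bool cpu_has_vaes_avx2(void)
{
#if defined(__x86_64__) || defined(__i386__)
  unsigned int eax, ebx, ecx, edx;

  if (!__get_cpuid_count(7, 0, &eax, &ebx, &ecx, &edx))
    return false;
  return regs_have_vaes_avx2(ebx, ecx);
#else
  return false;
#endif
}
```

A caller would typically evaluate `cpu_has_vaes_avx2()` once at key-setup time and select the accelerated bulk functions only when it returns true, falling back to the existing code paths otherwise.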

Benchmarks on AMD Ryzen 5800X:

Before:
AES     |  nanosecs/byte   mebibytes/sec   cycles/byte  auto Mhz
CBC dec |     0.067 ns/B     14314 MiB/s     0.323 c/B      4850
CFB dec |     0.067 ns/B     14322 MiB/s     0.323 c/B      4850
CTR enc |     0.066 ns/B     14429 MiB/s     0.321 c/B      4850
CTR dec |     0.066 ns/B     14433 MiB/s     0.320 c/B      4850
XTS enc |     0.087 ns/B     10910 MiB/s     0.424 c/B      4850
XTS dec |     0.088 ns/B     10856 MiB/s     0.426 c/B      4850
OCB enc |     0.070 ns/B     13633 MiB/s     0.339 c/B      4850
OCB dec |     0.069 ns/B     13911 MiB/s     0.332 c/B      4850

After (XTS ~1.7x faster, others ~1.9x faster):
AES      |  nanosecs/byte   mebibytes/sec   cycles/byte  auto Mhz
 CBC dec |     0.034 ns/B     28159 MiB/s     0.164 c/B      4850
 CFB dec |     0.034 ns/B     27955 MiB/s     0.165 c/B      4850
 CTR enc |     0.034 ns/B     28214 MiB/s     0.164 c/B      4850
 CTR dec |     0.034 ns/B     28146 MiB/s     0.164 c/B      4850
 XTS enc |     0.051 ns/B     18539 MiB/s     0.249 c/B      4850
 XTS dec |     0.051 ns/B     18655 MiB/s     0.248 c/B      4850
GCM auth |     0.088 ns/B     10817 MiB/s     0.428 c/B      4850
 OCB enc |     0.037 ns/B     25824 MiB/s     0.179 c/B      4850
 OCB dec |     0.038 ns/B     25359 MiB/s     0.182 c/B      4850

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
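
The speedup figures quoted in the "After" heading follow from dividing each "Before" ns/B value by the matching "After" value. The sketch below redoes that arithmetic for three rows copied from the tables. The struct and function names are made up for illustration, and `cycles_per_byte` assumes c/B = ns/B × MHz / 1000, which reproduces the table only up to the rounding of the printed ns/B values.

```c
#include <stdio.h>

/* One benchmark row: ns/B before and after the patch (values from the tables). */
struct bench_row {
  const char *mode;
  double before_ns;
  double after_ns;
};

static const struct bench_row rows[] = {
  { "CBC dec", 0.067, 0.034 },
  { "XTS enc", 0.087, 0.051 },
  { "OCB enc", 0.070, 0.037 },
};

/* Speedup factor: how many times less time per byte the new code takes. */
double speedup(double before_ns_per_byte, double after_ns_per_byte)
{
  return before_ns_per_byte / after_ns_per_byte;
}

/* Convert ns/B to cycles/B for a given clock in MHz (assumed relation). */
double cycles_per_byte(double ns_per_byte, double mhz)
{
  return ns_per_byte * mhz / 1000.0;
}

/* Print the speedup and the derived "after" c/B for each sample row. */
void print_speedups(void)
{
  size_t i;

  for (i = 0; i < sizeof rows / sizeof rows[0]; i++)
    printf("%-8s %.2fx  %.3f c/B\n", rows[i].mode,
           speedup(rows[i].before_ns, rows[i].after_ns),
           cycles_per_byte(rows[i].after_ns, 4850.0));
}
```

Calling `print_speedups()` from a `main` function prints one line per sample row; the XTS ratio comes out near 1.7 and the CBC and OCB ratios near 1.9 to 2.0.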

Details

Provenance
jukivili authored on Jan 19 2021, 6:38 PM
Parents
rCffe1d5319703: rijndael-aesni: add 8-block parallel code path for XTS