aria: add x86_64 GFNI/AVX512 accelerated implementation

Description

* cipher/Makefile.am: Add 'aria-gfni-avx512-amd64.S'.
* cipher/aria-gfni-avx512-amd64.S: New.
* cipher/aria.c (USE_GFNI_AVX512): New.
[USE_GFNI_AVX512] (MAX_PARALLEL_BLKS): New.
(ARIA_context): Add 'use_gfni_avx512'.
(_gcry_aria_gfni_avx512_ecb_crypt_blk64)
(_gcry_aria_gfni_avx512_ctr_crypt_blk64)
(aria_gfni_avx512_ecb_crypt_blk64)
(aria_gfni_avx512_ctr_crypt_blk64): New.
(aria_crypt_blocks) [USE_GFNI_AVX512]: Add 64 parallel block
AVX512/GFNI processing.
(_gcry_aria_ctr_enc) [USE_GFNI_AVX512]: Add 64 parallel block
AVX512/GFNI processing.
(aria_setkey): Enable GFNI/AVX512 based on HW features.
* configure.ac: Add 'aria-gfni-avx512-amd64.lo'.
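The aria_setkey and aria_crypt_blocks entries above describe a common pattern: decide once at key setup whether the wide path is usable, then peel off 64-block chunks in the bulk routine. A minimal sketch of that pattern follows. All names in it (the SKETCH_HWF_* bits, struct sketch_ctx, the sketch_* functions) are hypothetical stand-ins rather than the identifiers used in cipher/aria.c, and the "cipher" routines only copy data. The sketch also falls back straight to single blocks, whereas the real function presumably falls through to the narrower AVX2/AVX paths from the parent commit.

```c
#include <stddef.h>
#include <string.h>

#define BLOCK_SIZE    16   /* ARIA block size in bytes */
#define PARALLEL_BLKS 64   /* blocks consumed per wide call */

/* Hypothetical feature bits; not the values libgcrypt uses. */
#define SKETCH_HWF_GFNI   (1u << 0)
#define SKETCH_HWF_AVX512 (1u << 1)

struct sketch_ctx
{
  unsigned int use_gfni_avx512:1;
  size_t wide_calls;     /* call counters, for illustration only */
  size_t single_calls;
};

/* Decide once, at key setup, whether the wide path may be used. */
static void
sketch_setkey (struct sketch_ctx *ctx, unsigned int hwfeatures)
{
  memset (ctx, 0, sizeof (*ctx));
  ctx->use_gfni_avx512 = (hwfeatures & SKETCH_HWF_GFNI)
                         && (hwfeatures & SKETCH_HWF_AVX512);
}

/* Stand-in for the 64-block assembly routine: copies instead of encrypting. */
static void
sketch_crypt_blk64 (struct sketch_ctx *ctx, unsigned char *out,
                    const unsigned char *in)
{
  memcpy (out, in, PARALLEL_BLKS * BLOCK_SIZE);
  ctx->wide_calls++;
}

/* Stand-in for the generic one-block routine. */
static void
sketch_crypt_blk1 (struct sketch_ctx *ctx, unsigned char *out,
                   const unsigned char *in)
{
  memcpy (out, in, BLOCK_SIZE);
  ctx->single_calls++;
}

/* Bulk routine: take 64-block chunks while possible, then finish the tail. */
static void
sketch_crypt_blocks (struct sketch_ctx *ctx, unsigned char *out,
                     const unsigned char *in, size_t nblks)
{
  if (ctx->use_gfni_avx512)
    {
      while (nblks >= PARALLEL_BLKS)
        {
          sketch_crypt_blk64 (ctx, out, in);
          out += PARALLEL_BLKS * BLOCK_SIZE;
          in += PARALLEL_BLKS * BLOCK_SIZE;
          nblks -= PARALLEL_BLKS;
        }
    }

  while (nblks > 0)
    {
      sketch_crypt_blk1 (ctx, out, in);
      out += BLOCK_SIZE;
      in += BLOCK_SIZE;
      nblks--;
    }
}
```

With both feature bits set, a 130-block request becomes two wide calls plus two single-block calls; with either bit missing, every block takes the fallback path.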

This patch adds an AVX512/GFNI accelerated implementation of the ARIA
block cipher to libgcrypt. The implementation is based on work by
Taehee Yoo, with the following notable changes:

  • Integration into libgcrypt, use of 'asm-common-amd64.h'.
  • Use round loop instead of unrolling for smaller code size and increased performance.
  • Use stack for temporary storage instead of external buffers.
  • Add byte-addition fast path for CTR.

Benchmark on AMD Ryzen 9 7900X (zen4, turbo-freq off):

GFNI/AVX512:
ARIA128 | nanosecs/byte mebibytes/sec cycles/byte auto Mhz

ECB enc |     0.203 ns/B      4703 MiB/s     0.953 c/B      4700
ECB dec |     0.204 ns/B      4675 MiB/s     0.959 c/B      4700
CTR enc |     0.207 ns/B      4609 MiB/s     0.973 c/B      4700
CTR dec |     0.207 ns/B      4608 MiB/s     0.973 c/B      4700

Benchmark on Intel Core i3-1115G4 (tiger-lake, turbo-freq off):

GFNI/AVX512:
ARIA128 | nanosecs/byte mebibytes/sec cycles/byte auto Mhz

ECB enc |     0.362 ns/B      2635 MiB/s      1.08 c/B      2992
ECB dec |     0.361 ns/B      2639 MiB/s      1.08 c/B      2992
CTR enc |     0.362 ns/B      2633 MiB/s      1.08 c/B      2992
CTR dec |     0.362 ns/B      2633 MiB/s      1.08 c/B      2992
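The three columns in these tables are related by simple unit conversions, which helps when comparing the two machines. The helpers below are a sketch of that arithmetic; the function names are hypothetical and not part of libgcrypt's benchmark tool. Because the printed nanosecs/byte value is already rounded, recomputed figures agree with the tables only approximately.

```c
/* Throughput in MiB/s from nanoseconds per byte:
   bytes/s = 1e9 / (ns/B), and 1 MiB = 2^20 bytes. */
static double
mib_per_sec (double ns_per_byte)
{
  return 1e9 / ns_per_byte / (1024.0 * 1024.0);
}

/* Cycles per byte from nanoseconds per byte and clock frequency in MHz:
   cycles/ns = MHz / 1000. */
static double
cycles_per_byte (double ns_per_byte, double mhz)
{
  return ns_per_byte * mhz / 1000.0;
}
```

For example, cycles_per_byte(0.362, 2992) is roughly 1.08, matching the Tiger Lake rows, and mib_per_sec(0.203) lands within a fraction of a percent of the 4703 MiB/s reported for Zen 4 ECB encryption.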

[v2]:

  • Add byte-addition fast path for CTR.

Cc: Taehee Yoo <ap420073@gmail.com>
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>

Details

Provenance
jukivili authored on Feb 17 2023, 11:14 PM
Parents
rCf4268a8f51a8: aria: add x86_64 AESNI/GFNI/AVX/AVX2 accelerated implementations