Add SM4 x86-64/AES-NI/AVX2 implementation

Description

* cipher/Makefile.am: Add 'sm4-aesni-avx2-amd64.S'.
* cipher/sm4-aesni-avx2-amd64.S: New.
* cipher/sm4.c (USE_AESNI_AVX2): New.
(SM4_context) [USE_AESNI_AVX2]: Add 'use_aesni_avx2'.
[USE_AESNI_AVX2] (_gcry_sm4_aesni_avx2_ctr_enc)
(_gcry_sm4_aesni_avx2_cbc_dec, _gcry_sm4_aesni_avx2_cfb_dec)
(_gcry_sm4_aesni_avx2_ocb_enc, _gcry_sm4_aesni_avx2_ocb_dec)
(_gcry_sm4_aesni_avx2_ocb_auth): New.
(sm4_setkey): Enable AES-NI/AVX2 if supported by HW.
(_gcry_sm4_ctr_enc, _gcry_sm4_cbc_dec, _gcry_sm4_cfb_dec)
(_gcry_sm4_ocb_crypt, _gcry_sm4_ocb_auth) [USE_AESNI_AVX2]: Add
AES-NI/AVX2 bulk functions.
* configure.ac: Add 'sm4-aesni-avx2-amd64.lo'.

This patch adds x86-64/AES-NI/AVX2 bulk encryption/decryption. Bulk
functions process 16 blocks in parallel.
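The dispatch pattern described above can be sketched roughly as follows. This is a minimal illustration, not the code from cipher/sm4.c: the names 'demo_ctx', 'demo_xor' and 'demo_crypt_blocks' are hypothetical, and a plain XOR stands in for both the assembly bulk routine and the generic single-block path. Only the 'use_aesni_avx2' flag and the 16-block chunk size are taken from the commit message.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_BLOCK_SIZE  16  /* SM4 block size in bytes */
#define DEMO_BULK_BLOCKS 16  /* blocks per bulk call, as stated above */

/* Hypothetical stand-in for SM4_context; only the flag name mirrors
   the ChangeLog.  The counters exist just to make the flow visible. */
typedef struct
{
  unsigned int use_aesni_avx2; /* assumed to be set at setkey time */
  size_t bulk_calls;
  size_t single_calls;
} demo_ctx;

/* Placeholder transform (NOT SM4): XOR every byte with a constant. */
static void
demo_xor (unsigned char *out, const unsigned char *in, size_t len)
{
  for (size_t i = 0; i < len; i++)
    out[i] = in[i] ^ 0x5a;
}

/* Hand full 16-block chunks to the "bulk" path when the flag is set,
   then finish any remaining blocks one at a time. */
static void
demo_crypt_blocks (demo_ctx *ctx, unsigned char *out,
                   const unsigned char *in, size_t nblocks)
{
  assert (ctx != NULL);

  if (ctx->use_aesni_avx2)
    {
      while (nblocks >= DEMO_BULK_BLOCKS)
        {
          demo_xor (out, in, DEMO_BULK_BLOCKS * DEMO_BLOCK_SIZE);
          ctx->bulk_calls++;
          out += DEMO_BULK_BLOCKS * DEMO_BLOCK_SIZE;
          in += DEMO_BULK_BLOCKS * DEMO_BLOCK_SIZE;
          nblocks -= DEMO_BULK_BLOCKS;
        }
    }

  while (nblocks > 0)
    {
      demo_xor (out, in, DEMO_BLOCK_SIZE);
      ctx->single_calls++;
      out += DEMO_BLOCK_SIZE;
      in += DEMO_BLOCK_SIZE;
      nblocks--;
    }
}
```

With the flag set, a 35-block buffer would take two bulk calls plus three single-block calls in this sketch; with the flag clear, every block goes through the single-block loop.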

Benchmark on AMD Ryzen 7 3700X:

Before:
SM4      |  nanosecs/byte   mebibytes/sec   cycles/byte  auto Mhz
 CBC enc |      8.98 ns/B     106.2 MiB/s     38.62 c/B      4300
 CBC dec |      1.55 ns/B     613.7 MiB/s      6.64 c/B      4275
 CFB enc |      8.96 ns/B     106.4 MiB/s     38.52 c/B      4300
 CFB dec |      1.54 ns/B     617.4 MiB/s      6.60 c/B      4275
 CTR enc |      1.57 ns/B     607.8 MiB/s      6.75 c/B      4300
 CTR dec |      1.57 ns/B     608.9 MiB/s      6.74 c/B      4300
 OCB enc |      1.58 ns/B     603.8 MiB/s      6.75 c/B      4275
 OCB dec |      1.57 ns/B     605.7 MiB/s      6.73 c/B      4275
OCB auth |      1.53 ns/B     624.5 MiB/s      6.57 c/B      4300

After (~56% faster than AES-NI/AVX impl.):
SM4      |  nanosecs/byte   mebibytes/sec   cycles/byte  auto Mhz
 CBC enc |      8.93 ns/B     106.8 MiB/s     38.61 c/B      4326
 CBC dec |     0.984 ns/B     969.5 MiB/s      4.23 c/B      4300
 CFB enc |      8.93 ns/B     106.8 MiB/s     38.62 c/B      4325
 CFB dec |     0.983 ns/B     970.3 MiB/s      4.23 c/B      4300
 CTR enc |     0.998 ns/B     955.1 MiB/s      4.29 c/B      4300
 CTR dec |     0.996 ns/B     957.4 MiB/s      4.28 c/B      4300
 OCB enc |      1.00 ns/B     951.8 MiB/s      4.31 c/B      4300
 OCB dec |      1.00 ns/B     951.8 MiB/s      4.31 c/B      4300
OCB auth |     0.993 ns/B     960.2 MiB/s      4.28 c/B      4304±2

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
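
The "~56% faster" figure can be cross-checked against the throughput columns of the two tables. The helper below is a hypothetical sketch (the function names are not part of libgcrypt); it assumes "faster" means the relative gain in MiB/s and uses only numbers copied from the tables above.

```c
#include <assert.h>
#include <stdio.h>

/* Relative throughput gain in percent: (after / before - 1) * 100. */
static double
speedup_percent (double before_mib_s, double after_mib_s)
{
  assert (before_mib_s > 0.0);
  return (after_mib_s / before_mib_s - 1.0) * 100.0;
}

/* Print the gain for the modes that have a bulk implementation,
   using the MiB/s values from the "Before" and "After" tables. */
static void
print_speedups (void)
{
  static const struct
  {
    const char *mode;
    double before;
    double after;
  } rows[] = {
    { "CBC dec",  613.7, 969.5 },
    { "CFB dec",  617.4, 970.3 },
    { "CTR enc",  607.8, 955.1 },
    { "CTR dec",  608.9, 957.4 },
    { "OCB enc",  603.8, 951.8 },
    { "OCB dec",  605.7, 951.8 },
    { "OCB auth", 624.5, 960.2 },
  };

  for (size_t i = 0; i < sizeof rows / sizeof rows[0]; i++)
    printf ("%-8s %5.1f%%\n", rows[i].mode,
            speedup_percent (rows[i].before, rows[i].after));
}
```

Under that reading, the per-mode gains land between roughly 54% and 58%, which is in line with the rounded figure quoted above. CBC and CFB encryption are unchanged because those modes are inherently serial and do not use the bulk functions.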

Details

Provenance
jukivili authored on Jun 12 2020, 9:36 PM
Parents
rCc9a3f1bb91e6: Add SM4 x86-64/AES-NI/AVX implementation