Add SM4 x86-64/AES-NI/AVX implementation

Description

* cipher/Makefile.am: Add 'sm4-aesni-avx-amd64.S'.
* cipher/sm4-aesni-avx-amd64.S: New.
* cipher/sm4.c (USE_AESNI_AVX, ASM_FUNC_ABI): New.
(SM4_context) [USE_AESNI_AVX]: Add 'use_aesni_avx'.
[USE_AESNI_AVX] (_gcry_sm4_aesni_avx_expand_key)
(_gcry_sm4_aesni_avx_crypt_blk1_8, _gcry_sm4_aesni_avx_ctr_enc)
(_gcry_sm4_aesni_avx_cbc_dec, _gcry_sm4_aesni_avx_cfb_dec)
(_gcry_sm4_aesni_avx_ocb_enc, _gcry_sm4_aesni_avx_ocb_dec)
(_gcry_sm4_aesni_avx_ocb_auth, sm4_aesni_avx_crypt_blk1_8): New.
(sm4_expand_key) [USE_AESNI_AVX]: Use AES-NI/AVX key setup.
(sm4_setkey): Enable AES-NI/AVX if supported by HW.
(_gcry_sm4_ctr_enc, _gcry_sm4_cbc_dec, _gcry_sm4_cfb_dec)
(_gcry_sm4_ocb_crypt, _gcry_sm4_ocb_auth) [USE_AESNI_AVX]: Add
AES-NI/AVX bulk functions.
* configure.ac: Add 'sm4-aesni-avx-amd64.lo'.
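
The "Enable AES-NI/AVX if supported by HW" step in sm4_setkey can be
pictured roughly as follows. This is a minimal sketch, not the code from
cipher/sm4.c: libgcrypt has its own hardware-feature detection, whereas
this example assumes a GCC or Clang compiler on x86 and uses the
`__builtin_cpu_supports` builtin instead. The names `toy_sm4_context`,
`toy_hw_has_aesni_avx` and `toy_setkey` are hypothetical; only the flag
name `use_aesni_avx` is taken from the entries above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical context; only the flag name comes from the ChangeLog. */
struct toy_sm4_context
{
  bool use_aesni_avx;
};

/* Assumption: runtime detection through a compiler builtin, which is
   not how libgcrypt itself queries hardware features. */
static bool
toy_hw_has_aesni_avx (void)
{
#if (defined(__GNUC__) || defined(__clang__)) \
    && (defined(__x86_64__) || defined(__i386__))
  __builtin_cpu_init ();
  return __builtin_cpu_supports ("aes") && __builtin_cpu_supports ("avx");
#else
  return false; /* Other targets: always use the generic code. */
#endif
}

/* Record once, at key setup, whether the fast path may be used. */
static void
toy_setkey (struct toy_sm4_context *ctx)
{
  assert (ctx != NULL);
  ctx->use_aesni_avx = toy_hw_has_aesni_avx ();
}
```

Storing the result in the context means the bulk functions only need to
test one flag per call instead of repeating the detection.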

This patch adds x86-64/AES-NI/AVX bulk encryption/decryption and key
setup for the SM4 cipher. The bulk functions process eight blocks in
parallel.
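
The eight-block bulk path can be illustrated with a short C sketch. It
is an illustration only, not the code from cipher/sm4.c: the names
`toy_crypt_blk1`, `toy_crypt_blk8` and `toy_bulk_crypt` are
hypothetical, and an XOR with a constant stands in for the SM4 rounds.
It shows only the dispatch pattern: full groups of eight 16-byte blocks
go to the wide routine, and any remainder falls back to a one-block
routine.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define BLOCK_SIZE  16  /* SM4 block size in bytes. */
#define BULK_BLOCKS 8   /* Blocks handled per wide call. */

/* Hypothetical one-block transform (placeholder, not real SM4). */
static void
toy_crypt_blk1 (uint8_t *out, const uint8_t *in)
{
  for (size_t i = 0; i < BLOCK_SIZE; i++)
    out[i] = in[i] ^ 0x5a;
}

/* Hypothetical eight-block transform.  A real implementation would
   hold the eight blocks in vector registers and process them
   together; here the loop only models the interface. */
static void
toy_crypt_blk8 (uint8_t *out, const uint8_t *in)
{
  for (size_t b = 0; b < BULK_BLOCKS; b++)
    toy_crypt_blk1 (out + b * BLOCK_SIZE, in + b * BLOCK_SIZE);
}

/* Process NBLOCKS blocks and return how many of them went through
   the wide path. */
static size_t
toy_bulk_crypt (uint8_t *out, const uint8_t *in, size_t nblocks,
                int use_wide)
{
  size_t wide = 0;

  assert (out != NULL && in != NULL);

  if (use_wide)
    {
      while (nblocks >= BULK_BLOCKS)
        {
          toy_crypt_blk8 (out, in);
          out += BULK_BLOCKS * BLOCK_SIZE;
          in += BULK_BLOCKS * BLOCK_SIZE;
          nblocks -= BULK_BLOCKS;
          wide += BULK_BLOCKS;
        }
    }

  /* Leftover blocks (fewer than eight) use the one-block routine. */
  while (nblocks > 0)
    {
      toy_crypt_blk1 (out, in);
      out += BLOCK_SIZE;
      in += BLOCK_SIZE;
      nblocks--;
    }

  return wide;
}
```

With 19 blocks and the wide path enabled, 16 blocks go through two
eight-block calls and the remaining 3 go through the fallback.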

Benchmark on AMD Ryzen 7 3700X:

Before:
     SM4 |  nanosecs/byte   mebibytes/sec   cycles/byte  auto Mhz
 CBC enc |      8.94 ns/B     106.7 MiB/s     38.66 c/B      4325
 CBC dec |      4.78 ns/B     199.7 MiB/s     20.42 c/B      4275
 CFB enc |      8.95 ns/B     106.5 MiB/s     38.72 c/B      4325
 CFB dec |      4.81 ns/B     198.2 MiB/s     20.57 c/B      4275
 CTR enc |      4.81 ns/B     198.2 MiB/s     20.69 c/B      4300
 CTR dec |      4.80 ns/B     198.8 MiB/s     20.63 c/B      4300
GCM auth |     0.116 ns/B      8232 MiB/s     0.504 c/B      4351
 OCB enc |      4.88 ns/B     195.5 MiB/s     20.86 c/B      4275
 OCB dec |      4.85 ns/B     196.6 MiB/s     20.86 c/B      4301
OCB auth |      4.80 ns/B     198.9 MiB/s     20.62 c/B      4301

After (CBC dec, CFB dec, CTR and OCB ~3.0x faster; CBC enc and CFB enc unchanged):
     SM4 |  nanosecs/byte   mebibytes/sec   cycles/byte  auto Mhz
 CBC enc |      8.98 ns/B     106.2 MiB/s     38.62 c/B      4300
 CBC dec |      1.55 ns/B     613.7 MiB/s      6.64 c/B      4275
 CFB enc |      8.96 ns/B     106.4 MiB/s     38.52 c/B      4300
 CFB dec |      1.54 ns/B     617.4 MiB/s      6.60 c/B      4275
 CTR enc |      1.57 ns/B     607.8 MiB/s      6.75 c/B      4300
 CTR dec |      1.57 ns/B     608.9 MiB/s      6.74 c/B      4300
 OCB enc |      1.58 ns/B     603.8 MiB/s      6.75 c/B      4275
 OCB dec |      1.57 ns/B     605.7 MiB/s      6.73 c/B      4275
OCB auth |      1.53 ns/B     624.5 MiB/s      6.57 c/B      4300
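
The columns in the tables are related by simple arithmetic, sketched
below in C. The helper names are hypothetical, and the sketch assumes
that the "auto Mhz" column is the clock frequency in MHz from which
cycles/byte is derived. Because the reported figures are rounded,
recomputed values agree with the tables only approximately.

```c
#include <assert.h>

/* Speedup factor from two nanoseconds-per-byte figures. */
static double
toy_speedup (double before_ns_per_byte, double after_ns_per_byte)
{
  assert (after_ns_per_byte > 0.0);
  return before_ns_per_byte / after_ns_per_byte;
}

/* Throughput in MiB/s: bytes per second divided by 2^20. */
static double
toy_mib_per_sec (double ns_per_byte)
{
  assert (ns_per_byte > 0.0);
  return 1e9 / ns_per_byte / 1048576.0;
}

/* Cycles per byte: ns/B times cycles per nanosecond (MHz / 1000). */
static double
toy_cycles_per_byte (double ns_per_byte, double mhz)
{
  return ns_per_byte * mhz / 1000.0;
}
```

For the CBC dec rows, `toy_speedup (4.78, 1.55)` gives about 3.08,
which is consistent with the "~3.0x" figure.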

sm4 avx fix

  • Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>

Details

Provenance
jukivili authored on Jun 11 2020, 7:17 PM
Parents
rC81fee26bbbae: Optimizations for SM4 cipher