Add SM4 x86-64/AES-NI/AVX implementation

* cipher/Makefile.am: Add 'sm4-aesni-avx-amd64.S'.
* cipher/sm4-aesni-avx-amd64.S: New.
* cipher/sm4.c (USE_AESNI_AVX, ASM_FUNC_ABI): New.
(SM4_context) [USE_AESNI_AVX]: Add 'use_aesni_avx'.
[USE_AESNI_AVX] (_gcry_sm4_aesni_avx_expand_key)
(_gcry_sm4_aesni_avx_crypt_blk1_8, _gcry_sm4_aesni_avx_ctr_enc)
(_gcry_sm4_aesni_avx_cbc_dec, _gcry_sm4_aesni_avx_cfb_dec)
(_gcry_sm4_aesni_avx_ocb_enc, _gcry_sm4_aesni_avx_ocb_dec)
(_gcry_sm4_aesni_avx_ocb_auth, sm4_aesni_avx_crypt_blk1_8): New.
(sm4_expand_key) [USE_AESNI_AVX]: Use AES-NI/AVX key setup.
(sm4_setkey): Enable AES-NI/AVX if supported by HW.
(_gcry_sm4_ctr_enc, _gcry_sm4_cbc_dec, _gcry_sm4_cfb_dec)
(_gcry_sm4_ocb_crypt, _gcry_sm4_ocb_auth) [USE_AESNI_AVX]: Add
AES-NI/AVX bulk functions.
* configure.ac: Add 'sm4-aesni-avx-amd64.lo'.

This patch adds x86-64/AES-NI/AVX bulk encryption/decryption and key
setup for the SM4 cipher. Bulk functions process eight blocks in parallel.
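
The eight-block bulk processing described above can be pictured with a
minimal sketch. Everything here is an assumption made for illustration:
toy_crypt_blk8, toy_crypt_blk1 and bulk_crypt are hypothetical names, and
the "cipher" is a plain XOR standing in for SM4 so that only the control
flow is shown. It is not the code added by this patch.

```c
#include <stddef.h>
#include <stdint.h>

#define BLOCK_SIZE  16  /* SM4 block size in bytes */
#define BULK_BLOCKS 8   /* blocks handled by one wide call */

/* Hypothetical stand-in for an 8-block kernel: XOR instead of SM4. */
static void toy_crypt_blk8(uint8_t *out, const uint8_t *in)
{
  for (size_t i = 0; i < BULK_BLOCKS * BLOCK_SIZE; i++)
    out[i] = (uint8_t)(in[i] ^ 0xA5);
}

/* Hypothetical stand-in for the generic one-block path. */
static void toy_crypt_blk1(uint8_t *out, const uint8_t *in)
{
  for (size_t i = 0; i < BLOCK_SIZE; i++)
    out[i] = (uint8_t)(in[i] ^ 0xA5);
}

/* Process nblocks blocks. Full groups of eight go to the wide kernel when
   use_wide is set; the remainder falls back to the one-block path.
   Returns the number of calls made to the 8-block kernel. */
static size_t bulk_crypt(uint8_t *out, const uint8_t *in,
                         size_t nblocks, int use_wide)
{
  size_t wide_calls = 0;

  if (use_wide)
    {
      while (nblocks >= BULK_BLOCKS)
        {
          toy_crypt_blk8(out, in);
          out += BULK_BLOCKS * BLOCK_SIZE;
          in += BULK_BLOCKS * BLOCK_SIZE;
          nblocks -= BULK_BLOCKS;
          wide_calls++;
        }
    }

  while (nblocks > 0)
    {
      toy_crypt_blk1(out, in);
      out += BLOCK_SIZE;
      in += BLOCK_SIZE;
      nblocks--;
    }

  return wide_calls;
}
```

With 19 blocks and the wide path enabled, this sketch makes two 8-block
calls and handles the remaining three blocks one at a time.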
Benchmark on AMD Ryzen 7 3700X:
Before:
 SM4            |  nanosecs/byte   mebibytes/sec   cycles/byte  auto Mhz
        CBC enc |      8.94 ns/B     106.7 MiB/s     38.66 c/B      4325
        CBC dec |      4.78 ns/B     199.7 MiB/s     20.42 c/B      4275
        CFB enc |      8.95 ns/B     106.5 MiB/s     38.72 c/B      4325
        CFB dec |      4.81 ns/B     198.2 MiB/s     20.57 c/B      4275
        CTR enc |      4.81 ns/B     198.2 MiB/s     20.69 c/B      4300
        CTR dec |      4.80 ns/B     198.8 MiB/s     20.63 c/B      4300
       GCM auth |     0.116 ns/B      8232 MiB/s     0.504 c/B      4351
        OCB enc |      4.88 ns/B     195.5 MiB/s     20.86 c/B      4275
        OCB dec |      4.85 ns/B     196.6 MiB/s     20.86 c/B      4301
       OCB auth |      4.80 ns/B     198.9 MiB/s     20.62 c/B      4301
After (~3.0x faster for CBC/CFB decryption, CTR and OCB):
 SM4            |  nanosecs/byte   mebibytes/sec   cycles/byte  auto Mhz
        CBC enc |      8.98 ns/B     106.2 MiB/s     38.62 c/B      4300
        CBC dec |      1.55 ns/B     613.7 MiB/s      6.64 c/B      4275
        CFB enc |      8.96 ns/B     106.4 MiB/s     38.52 c/B      4300
        CFB dec |      1.54 ns/B     617.4 MiB/s      6.60 c/B      4275
        CTR enc |      1.57 ns/B     607.8 MiB/s      6.75 c/B      4300
        CTR dec |      1.57 ns/B     608.9 MiB/s      6.74 c/B      4300
        OCB enc |      1.58 ns/B     603.8 MiB/s      6.75 c/B      4275
        OCB dec |      1.57 ns/B     605.7 MiB/s      6.73 c/B      4275
       OCB auth |      1.53 ns/B     624.5 MiB/s      6.57 c/B      4300
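
The speedup figure can be rechecked from the cycles/byte columns of the
two tables. The sketch below is only an illustration: the struct and
function names are hypothetical, the numbers are copied from the tables,
and because those numbers are rounded the recomputed values are
approximate. It also shows how cycles/byte relates to the other columns
(ns/byte times MHz, divided by 1000).

```c
#include <stddef.h>

/* One accelerated mode: cycles/byte before and after the patch. */
struct bench_row
{
  const char *mode;
  double before_cpb;
  double after_cpb;
};

/* Values copied from the benchmark tables (parallelizable modes only). */
static const struct bench_row bench_rows[] = {
  { "CBC dec",  20.42, 6.64 },
  { "CFB dec",  20.57, 6.60 },
  { "CTR enc",  20.69, 6.75 },
  { "CTR dec",  20.63, 6.74 },
  { "OCB enc",  20.86, 6.75 },
  { "OCB dec",  20.86, 6.73 },
  { "OCB auth", 20.62, 6.57 },
};

static const size_t bench_row_count =
  sizeof bench_rows / sizeof bench_rows[0];

/* Speedup factor for one row: old cost divided by new cost. */
static double speedup(const struct bench_row *r)
{
  return r->before_cpb / r->after_cpb;
}

/* Convert ns/byte at a given clock (MHz) into cycles/byte. */
static double cycles_per_byte(double ns_per_byte, double mhz)
{
  return ns_per_byte * mhz / 1000.0;
}
```

For every row listed the ratio comes out slightly above 3. CBC and CFB
encryption are left out because each block depends on the previous
ciphertext block, so they cannot use the eight-block path, and their
figures are unchanged.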
sm4 avx fix
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>