Add PowerPC vector implementation of SM4
* cipher/Makefile.am: Add 'sm4-ppc.c'. * cipher/sm4-ppc.c: New. * cipher/sm4.c (USE_PPC_CRYPTO): New. (SM4_context): Add 'use_ppc8le' and 'use_ppc9le'. [USE_PPC_CRYPTO] (_gcry_sm4_ppc8le_crypt_blk1_16) (_gcry_sm4_ppc9le_crypt_blk1_16, sm4_ppc8le_crypt_blk1_16) (sm4_ppc9le_crypt_blk1_16): New. (sm4_setkey) [USE_PPC_CRYPTO]: Set use_ppc8le and use_ppc9le based on HW features. (sm4_get_crypt_blk1_16_fn) [USE_PPC_CRYPTO]: Add PowerPC implementation selection.
Benchmark on POWER9:
Before:
SM4 | nanosecs/byte mebibytes/sec cycles/byte
ECB enc | 14.47 ns/B 65.89 MiB/s 33.29 c/B ECB dec | 14.47 ns/B 65.89 MiB/s 33.29 c/B CBC enc | 35.09 ns/B 27.18 MiB/s 80.71 c/B CBC dec | 16.69 ns/B 57.13 MiB/s 38.39 c/B CFB enc | 35.09 ns/B 27.18 MiB/s 80.71 c/B CFB dec | 16.76 ns/B 56.90 MiB/s 38.55 c/B CTR enc | 16.88 ns/B 56.50 MiB/s 38.82 c/B CTR dec | 16.88 ns/B 56.50 MiB/s 38.82 c/B
After (ECB ~4.4x faster):
SM4 | nanosecs/byte mebibytes/sec cycles/byte
ECB enc | 3.26 ns/B 292.3 MiB/s 7.50 c/B ECB dec | 3.26 ns/B 292.3 MiB/s 7.50 c/B CBC enc | 35.10 ns/B 27.17 MiB/s 80.72 c/B CBC dec | 3.33 ns/B 286.3 MiB/s 7.66 c/B CFB enc | 35.10 ns/B 27.17 MiB/s 80.74 c/B CFB dec | 3.36 ns/B 283.8 MiB/s 7.73 c/B CTR enc | 3.47 ns/B 275.0 MiB/s 7.98 c/B CTR dec | 3.47 ns/B 275.0 MiB/s 7.98 c/B
- Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>