rijndael-ppc: add key setup and enable single block PowerPC AES

* cipher/Makefile.am: Add 'rijndael-ppc.c'.
* cipher/rijndael-internal.h (USE_PPC_CRYPTO): New.
(RIJNDAEL_context): Add 'use_ppc_crypto'.
* cipher/rijndael-ppc.c (backwards, swap_if_le): Remove.
(u128_t, ALWAYS_INLINE, NO_INLINE, NO_INSTRUMENT_FUNCTION)
(ASM_FUNC_ATTR, ASM_FUNC_ATTR_INLINE, ASM_FUNC_ATTR_NOINLINE)
(ALIGNED_LOAD, ALIGNED_STORE, VEC_LOAD_BE, VEC_STORE_BE)
(vec_bswap32_const, vec_aligned_ld, vec_load_be_const)
(vec_load_be, vec_aligned_st, vec_store_be, _gcry_aes_sbox4_ppc8)
(_gcry_aes_ppc8_setkey, _gcry_aes_ppc8_prepare_decryption)
(aes_ppc8_encrypt_altivec, aes_ppc8_decrypt_altivec): New.
(_gcry_aes_ppc8_encrypt, _gcry_aes_ppc8_decrypt): Rewrite.
(_gcry_aes_ppc8_ocb_crypt): Comment out.
* cipher/rijndael.c [USE_PPC_CRYPTO] (_gcry_aes_ppc8_setkey)
(_gcry_aes_ppc8_prepare_decryption, _gcry_aes_ppc8_encrypt)
(_gcry_aes_ppc8_decrypt): New prototypes.
(do_setkey) [USE_PPC_CRYPTO]: Add setup for PowerPC AES.
(prepare_decryption) [USE_PPC_CRYPTO]: Ditto.
* configure.ac: Add 'rijndael-ppc.lo'.
(gcry_cv_ppc_altivec, gcry_cv_cc_ppc_altivec_cflags)
(gcry_cv_gcc_inline_asm_ppc_altivec)
(gcry_cv_gcc_inline_asm_ppc_arch_3_00): New checks.

Benchmark on POWER8 ~3.8 GHz:
Before:
 AES            |  nanosecs/byte   mebibytes/sec   cycles/byte
        ECB enc |       7.27 ns/B      131.2 MiB/s      27.61 c/B
        ECB dec |       7.70 ns/B      123.8 MiB/s      29.28 c/B
        CBC enc |       6.38 ns/B      149.5 MiB/s      24.24 c/B
        CBC dec |       6.17 ns/B      154.5 MiB/s      23.45 c/B
        CFB enc |       6.45 ns/B      147.9 MiB/s      24.51 c/B
        CFB dec |       6.20 ns/B      153.8 MiB/s      23.57 c/B
        OFB enc |       7.36 ns/B      129.6 MiB/s      27.96 c/B
        OFB dec |       7.36 ns/B      129.6 MiB/s      27.96 c/B
        CTR enc |       6.22 ns/B      153.2 MiB/s      23.65 c/B
        CTR dec |       6.22 ns/B      153.3 MiB/s      23.65 c/B
        XTS enc |       6.67 ns/B      142.9 MiB/s      25.36 c/B
        XTS dec |       6.70 ns/B      142.3 MiB/s      25.46 c/B
        CCM enc |      12.61 ns/B      75.60 MiB/s      47.93 c/B
        CCM dec |      12.62 ns/B      75.56 MiB/s      47.96 c/B
       CCM auth |       6.41 ns/B      148.8 MiB/s      24.36 c/B
        EAX enc |      12.62 ns/B      75.55 MiB/s      47.96 c/B
        EAX dec |      12.62 ns/B      75.55 MiB/s      47.97 c/B
       EAX auth |       6.39 ns/B      149.2 MiB/s      24.30 c/B
        GCM enc |       9.81 ns/B      97.24 MiB/s      37.27 c/B
        GCM dec |       9.81 ns/B      97.20 MiB/s      37.28 c/B
       GCM auth |       3.59 ns/B      265.8 MiB/s      13.63 c/B
        OCB enc |       6.39 ns/B      149.3 MiB/s      24.27 c/B
        OCB dec |       6.38 ns/B      149.5 MiB/s      24.25 c/B
       OCB auth |       6.35 ns/B      150.2 MiB/s      24.13 c/B

After:
 AES            |  nanosecs/byte   mebibytes/sec   cycles/byte
        ECB enc |       1.29 ns/B      737.7 MiB/s       4.91 c/B
        ECB dec |       1.34 ns/B      711.1 MiB/s       5.10 c/B
        CBC enc |       2.13 ns/B      448.5 MiB/s       8.08 c/B
        CBC dec |       1.05 ns/B      908.0 MiB/s       3.99 c/B
        CFB enc |       2.17 ns/B      439.9 MiB/s       8.24 c/B
        CFB dec |       2.22 ns/B      429.8 MiB/s       8.43 c/B
        OFB enc |       1.49 ns/B      640.1 MiB/s       5.66 c/B
        OFB dec |       1.49 ns/B      640.1 MiB/s       5.66 c/B
        CTR enc |       2.21 ns/B      432.5 MiB/s       8.38 c/B
        CTR dec |       2.20 ns/B      432.5 MiB/s       8.38 c/B
        XTS enc |       2.32 ns/B      410.6 MiB/s       8.83 c/B
        XTS dec |       2.33 ns/B      409.7 MiB/s       8.85 c/B
        CCM enc |       4.36 ns/B      218.7 MiB/s      16.57 c/B
        CCM dec |       4.36 ns/B      218.8 MiB/s      16.56 c/B
       CCM auth |       2.17 ns/B      440.4 MiB/s       8.23 c/B
        EAX enc |       4.37 ns/B      218.3 MiB/s      16.60 c/B
        EAX dec |       4.36 ns/B      218.7 MiB/s      16.57 c/B
       EAX auth |       2.16 ns/B      440.7 MiB/s       8.22 c/B
        GCM enc |       5.78 ns/B      165.0 MiB/s      21.96 c/B
        GCM dec |       5.78 ns/B      165.0 MiB/s      21.96 c/B
       GCM auth |       3.59 ns/B      265.9 MiB/s      13.63 c/B
        OCB enc |       2.33 ns/B      410.1 MiB/s       8.84 c/B
        OCB dec |       2.34 ns/B      407.2 MiB/s       8.90 c/B
       OCB auth |       2.32 ns/B      411.1 MiB/s       8.82 c/B

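To read the two runs side by side, the per-mode speedup is the ratio of the "before" and "after" ns/B figures, and cycles/byte is approximately ns/B times the clock in GHz. The following is a minimal sketch of that arithmetic only; the helper names (speedup, cycles_per_byte, print_speedup) are hypothetical and not part of libgcrypt or its benchmark tool, and the 3.8 GHz clock is the approximate value quoted above, so computed cycles/byte will differ slightly from the reported column.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: how many times faster the "after" run is,
 * given the ns/B figures of both runs for the same mode. */
double speedup(double before_ns_per_byte, double after_ns_per_byte)
{
  assert(after_ns_per_byte > 0.0);
  return before_ns_per_byte / after_ns_per_byte;
}

/* Hypothetical helper: approximate cycles/byte from ns/B and the
 * clock frequency in GHz (1 ns at 1 GHz is one cycle). */
double cycles_per_byte(double ns_per_byte, double clock_ghz)
{
  return ns_per_byte * clock_ghz;
}

/* Print one comparison row for a mode label such as "ECB enc". */
void print_speedup(const char *mode, double before, double after)
{
  printf("%-8s %5.2f -> %5.2f ns/B  (%.2fx)\n",
         mode, before, after, speedup(before, after));
}
```

For example, print_speedup("ECB enc", 7.27, 1.29) gives a ratio of roughly 5.6x, while the GCM auth figures (3.59 ns/B in both runs) give 1.0x.
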
GnuPG-bug-id: T4529
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>