Add GFNI/AVX2 implementation of Camellia
* cipher/Makefile.am: Add "camellia-gfni-avx2-amd64.S".
* cipher/camellia-aesni-avx2-amd64.h [CAMELLIA_GFNI_BUILD]: Add GFNI support.
* cipher/camellia-gfni-avx2-amd64.S: New.
* cipher/camellia-glue.c (USE_GFNI_AVX2): New.
(CAMELLIA_context) [USE_AESNI_AVX2]: New member "use_gfni_avx2".
[USE_GFNI_AVX2] (_gcry_camellia_gfni_avx2_ctr_enc)
(_gcry_camellia_gfni_avx2_cbc_dec, _gcry_camellia_gfni_avx2_cfb_dec)
(_gcry_camellia_gfni_avx2_ocb_enc, _gcry_camellia_gfni_avx2_ocb_dec)
(_gcry_camellia_gfni_avx2_ocb_auth): New.
(camellia_setkey) [USE_GFNI_AVX2]: Enable GFNI if supported by HW.
(_gcry_camellia_ctr_enc) [USE_GFNI_AVX2]: Add GFNI support.
(_gcry_camellia_cbc_dec) [USE_GFNI_AVX2]: Add GFNI support.
(_gcry_camellia_cfb_dec) [USE_GFNI_AVX2]: Add GFNI support.
(_gcry_camellia_ocb_crypt) [USE_GFNI_AVX2]: Add GFNI support.
(_gcry_camellia_ocb_auth) [USE_GFNI_AVX2]: Add GFNI support.
* configure.ac: Add "camellia-gfni-avx2-amd64.lo".
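The "Enable GFNI if supported by HW" step in camellia_setkey can be pictured with the rough sketch below. It is not the code from cipher/camellia-glue.c: the DEMO_HWF_* bits, struct demo_ctx, and both functions are hypothetical names made up for illustration, and the preference order (GFNI first, then VAES, then generic) is an assumption based on the benchmark comparison in this message.

```c
/* Hypothetical feature bits for illustration only; these are NOT
   libgcrypt's HWF_* constants.  */
#define DEMO_HWF_AVX2 (1u << 0)
#define DEMO_HWF_VAES (1u << 1)
#define DEMO_HWF_GFNI (1u << 2)

enum demo_path
{
  DEMO_PATH_GENERIC,
  DEMO_PATH_VAES_AVX2,
  DEMO_PATH_GFNI_AVX2
};

/* Hypothetical stand-in for the per-key flags kept in the cipher
   context (the commit adds a "use_gfni_avx2" member there).  */
struct demo_ctx
{
  unsigned int use_vaes_avx2:1;
  unsigned int use_gfni_avx2:1;
};

/* Decide once, at key setup, which accelerated variants are usable.
   Each AVX2 variant needs AVX2 plus its own instruction extension.  */
static void
demo_setkey_flags (struct demo_ctx *ctx, unsigned int hwf)
{
  int avx2 = (hwf & DEMO_HWF_AVX2) != 0;

  ctx->use_vaes_avx2 = avx2 && (hwf & DEMO_HWF_VAES) != 0;
  ctx->use_gfni_avx2 = avx2 && (hwf & DEMO_HWF_GFNI) != 0;
}

/* Bulk functions (CTR, CBC, CFB, OCB) would consult the flags and
   pick a path.  The ordering here is an assumption.  */
static enum demo_path
demo_bulk_path (const struct demo_ctx *ctx)
{
  if (ctx->use_gfni_avx2)
    return DEMO_PATH_GFNI_AVX2;
  if (ctx->use_vaes_avx2)
    return DEMO_PATH_VAES_AVX2;
  return DEMO_PATH_GENERIC;
}
```

Resolving the flags once at key setup keeps the per-call cost in the bulk functions down to a flag test.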
Benchmark on Intel Core i3-1115G4 (tigerlake):
Before (VAES/AVX2 implementation):
 CAMELLIA128    |  nanosecs/byte   mebibytes/sec   cycles/byte  auto Mhz
        CBC dec |     0.579 ns/B      1646 MiB/s      2.37 c/B      4090
        CFB dec |     0.579 ns/B      1648 MiB/s      2.37 c/B      4089
        CTR enc |     0.586 ns/B      1628 MiB/s      2.40 c/B      4090
        CTR dec |     0.587 ns/B      1626 MiB/s      2.40 c/B      4090
        OCB enc |     0.607 ns/B      1570 MiB/s      2.48 c/B      4089
        OCB dec |     0.611 ns/B      1561 MiB/s      2.50 c/B      4089
       OCB auth |     0.602 ns/B      1585 MiB/s      2.46 c/B      4089
After (~80% faster):
 CAMELLIA128    |  nanosecs/byte   mebibytes/sec   cycles/byte  auto Mhz
        CBC dec |     0.299 ns/B      3186 MiB/s      1.22 c/B      4090
        CFB dec |     0.314 ns/B      3039 MiB/s      1.28 c/B      4089
        CTR enc |     0.322 ns/B      2962 MiB/s      1.32 c/B      4090
        CTR dec |     0.321 ns/B      2970 MiB/s      1.31 c/B      4090
        OCB enc |     0.339 ns/B      2817 MiB/s      1.38 c/B      4089
        OCB dec |     0.346 ns/B      2756 MiB/s      1.41 c/B      4089
       OCB auth |     0.337 ns/B      2831 MiB/s      1.38 c/B      4089
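The "~80% faster" figure can be checked against the nanosecs/byte columns with a small helper such as the one below. This is an illustrative sketch: percent_faster is a hypothetical name and is not part of libgcrypt or its benchmark tool. It treats throughput as the inverse of ns/B, so the gain is the before/after ratio minus one.

```c
/* Throughput gain in percent, given time per byte before and after.
   Throughput is 1 / (ns per byte), so the gain is
   (ns_before / ns_after - 1) * 100.  Hypothetical helper.  */
static double
percent_faster (double ns_before, double ns_after)
{
  return (ns_before / ns_after - 1.0) * 100.0;
}
```

For example, percent_faster (0.586, 0.322) covers the CTR enc row. Across the seven rows the result varies by mode, with CBC dec gaining the most and OCB dec the least.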
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>