Add AVX and AVX/BMI2 implementations for SHA-1
* cipher/Makefile.am: Add 'sha1-avx-amd64.S' and 'sha1-avx-bmi2-amd64.S'.
* cipher/sha1-avx-amd64.S: New.
* cipher/sha1-avx-bmi2-amd64.S: New.
* cipher/sha1.c (USE_AVX, USE_BMI2): New.
(SHA1_CONTEXT) [USE_AVX]: Add 'use_avx'.
(SHA1_CONTEXT) [USE_BMI2]: Add 'use_bmi2'.
(sha1_init): Initialize 'use_avx' and 'use_bmi2'.
[USE_AVX] (_gcry_sha1_transform_amd64_avx): New.
[USE_BMI2] (_gcry_sha1_transform_amd64_bmi2): New.
(transform) [USE_BMI2]: Use BMI2 assembly if enabled.
(transform) [USE_AVX]: Use AVX assembly if enabled.
* configure.ac: Add 'sha1-avx-amd64.lo' and 'sha1-avx-bmi2-amd64.lo'.
Patch adds AVX (for Sandybridge and Ivybridge) and AVX/BMI2 (for Haswell)
optimized implementations of SHA-1.
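
The selection order described in the ChangeLog (BMI2 first, then AVX) can be
sketched as below. This is only an illustration, not the code in
cipher/sha1.c: the struct, the enum and select_transform() are hypothetical
names, and the 'use_ssse3' flag is an assumption based on the existing SSSE3
implementation.

```c
/* Hypothetical stand-in for SHA1_CONTEXT; only the feature flags are
 * modelled here.  'use_ssse3' is an assumed name. */
typedef struct {
    int use_ssse3;
    int use_avx;
    int use_bmi2;
} sha1_ctx_sketch;

/* Hypothetical labels for the available transform implementations. */
typedef enum { IMPL_C, IMPL_SSSE3, IMPL_AVX, IMPL_AVX_BMI2 } sha1_impl;

/* Pick the most specialised implementation whose flag is set,
 * falling back to the generic C version. */
static sha1_impl select_transform(const sha1_ctx_sketch *ctx)
{
    if (ctx->use_bmi2)
        return IMPL_AVX_BMI2;
    if (ctx->use_avx)
        return IMPL_AVX;
    if (ctx->use_ssse3)
        return IMPL_SSSE3;
    return IMPL_C;
}
```

Checking the most specific flag first means a CPU that supports several
feature sets still ends up on its fastest code path.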
Note: The AVX implementation is currently limited to Intel CPUs because it
uses the SHLD instruction for faster rotations on Sandybridge.
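
As background for the note above, SHLD acts as a rotate when both operands
are the same register. A minimal C model of that equivalence follows; the
helper names rol32() and shld32() are hypothetical and this is not the
assembly used in the patch.

```c
#include <stdint.h>

/* Rotate a 32-bit word left by n bits, valid for 0 < n < 32. */
static inline uint32_t rol32(uint32_t x, unsigned n)
{
    return (x << n) | (x >> (32 - n));
}

/* Model of a 32-bit SHLD (double-precision shift left), valid for
 * 0 < n < 32: shift dst left by n and fill the vacated low bits from
 * the top bits of src. */
static inline uint32_t shld32(uint32_t dst, uint32_t src, unsigned n)
{
    return (dst << n) | (src >> (32 - n));
}
```

With dst == src, shld32(x, x, n) gives the same result as rol32(x, n), which
covers the left rotations by 5 and 30 bits used in the SHA-1 round function.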
Benchmarks:
cpu             C-version  SSSE3     AVX/(SHLD|BMI2)  New vs C  New vs SSSE3
Intel i5-4570   8.84 c/B   4.61 c/B  3.86 c/B         2.29x     1.19x
Intel i5-2450M  9.45 c/B   5.30 c/B  4.39 c/B         2.15x     1.20x
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>