Add aggregated bulk processing for GCM on x86-64
* cipher/cipher-gcm.c [__x86_64__] (gfmul_pclmul_aggr4): New.
(ghash) [GCM_USE_INTEL_PCLMUL]: Add aggregated bulk processing
for __x86_64__.
(setupM) [__x86_64__]: Add initialization for aggregated bulk
processing.
Intel Haswell (x86-64):
Old:
AES     GCM enc |     0.990 ns/B     963.3 MiB/s      3.17 c/B
        GCM dec |     0.982 ns/B     970.9 MiB/s      3.14 c/B
       GCM auth |     0.711 ns/B    1340.8 MiB/s      2.28 c/B
New:
AES     GCM enc |     0.535 ns/B    1783.8 MiB/s      1.71 c/B
        GCM dec |     0.531 ns/B    1796.2 MiB/s      1.70 c/B
       GCM auth |     0.255 ns/B    3736.4 MiB/s     0.817 c/B
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>