Add Whirlpool AMD64/SSE2 assembly implementation
* cipher/Makefile.am: Add 'whirlpool-sse2-amd64.S'.
* cipher/whirlpool-sse2-amd64.S: New.
* cipher/whirlpool.c (USE_AMD64_ASM): New.
(whirlpool_tables_s): New.
(rc, C0, C1, C2, C3, C4, C5, C6, C7): Combine these tables into a single
structure and replace the old tables with macros of the same name.
(tab): New structure containing the above tables.
[USE_AMD64_ASM] (_gcry_whirlpool_transform_amd64)
(whirlpool_transform): New.
* configure.ac [host=x86_64]: Add 'whirlpool-sse2-amd64.lo'.
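
For illustration, a minimal sketch of the table-combining pattern named in
the whirlpool.c entry: separate lookup tables become members of one
structure, and macros with the old names keep existing C code compiling
unchanged. All 'sketch_' names, the shortened table sizes and the table
values are hypothetical and are not taken from cipher/whirlpool.c. A single
structure presumably lets the assembly routine address every table from one
base pointer, but that is an assumption.

```c
#include <stdint.h>

/* Hypothetical, shortened sizes; the real tables are much larger. */
#define SKETCH_ROUNDS  2
#define SKETCH_ENTRIES 4

/* One structure holding the round constants and the lookup tables. */
struct sketch_tables_s
{
  uint64_t RC[SKETCH_ROUNDS];
  uint64_t C[2][SKETCH_ENTRIES];
};

/* Placeholder values, not real Whirlpool constants. */
static const struct sketch_tables_s sketch_tab =
  {
    { 0x11, 0x22 },
    { { 1, 2, 3, 4 },
      { 5, 6, 7, 8 } }
  };

/* Macros with the old table names, so older code needs no changes. */
#define rc (sketch_tab.RC)
#define C0 (sketch_tab.C[0])
#define C1 (sketch_tab.C[1])

/* Code written against the old names still works through the macros. */
static uint64_t
sketch_lookup (unsigned int round, unsigned int idx)
{
  return rc[round] ^ C0[idx] ^ C1[idx];
}
```

Here 'sketch_lookup (0, 0)' reads all three tables through the macros, so
callers do not need to know that the storage moved into 'sketch_tab'.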
Benchmark results:
On Intel Core i5-4570 (3.2 GHz):
After:
WHIRLPOOL | 4.82 ns/B 197.8 MiB/s 15.43 c/B
Before:
WHIRLPOOL | 9.10 ns/B 104.8 MiB/s 29.13 c/B
On Intel Core i5-2450M (2.5 GHz):
After:
WHIRLPOOL | 8.43 ns/B 113.1 MiB/s 21.09 c/B
Before:
WHIRLPOOL | 13.45 ns/B 70.92 MiB/s 33.62 c/B
On Intel Core2 T8100 (2.1 GHz):
After:
WHIRLPOOL | 10.22 ns/B 93.30 MiB/s 21.47 c/B
Before:
WHIRLPOOL | 19.87 ns/B 48.00 MiB/s 41.72 c/B
Summary, old vs. new ratio (ns/B before divided by ns/B after):
Intel Core i5-4570: 1.88x
Intel Core i5-2450M: 1.59x
Intel Core2 T8100: 1.94x
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>