New ChaCha implementations

* cipher/Makefile.am: Remove 'chacha20-sse2-amd64.S',
'chacha20-ssse3-amd64.S', 'chacha20-avx2-amd64.S'; Add
'chacha20-amd64-ssse3.S', 'chacha20-amd64-avx2.S'.
* cipher/chacha20-amd64-avx2.S: New.
* cipher/chacha20-amd64-ssse3.S: New.
* cipher/chacha20-armv7-neon.S: Rewrite.
* cipher/chacha20-avx2-amd64.S: Remove.
* cipher/chacha20-sse2-amd64.S: Remove.
* cipher/chacha20-ssse3-amd64.S: Remove.
* cipher/chacha20.c (CHACHA20_INPUT_LENGTH, USE_SSE2, USE_NEON)
(ASM_EXTRA_STACK, chacha20_blocks_t, _gcry_chacha20_amd64_sse2_blocks)
(_gcry_chacha20_amd64_ssse3_blocks, _gcry_chacha20_amd64_avx2_blocks)
(_gcry_chacha20_armv7_neon_blocks, QROUND, QOUT, chacha20_core)
(chacha20_do_encrypt_stream): Remove.
(_gcry_chacha20_amd64_ssse3_blocks4, _gcry_chacha20_amd64_avx2_blocks8)
(_gcry_chacha20_armv7_neon_blocks4, ROTATE, XOR, PLUS, PLUSONE)
(QUARTERROUND, BUF_XOR_LE32): New.
(CHACHA20_context_s, chacha20_blocks, chacha20_keysetup)
(chacha20_encrypt_stream): Rewrite.
(chacha20_do_setkey): Adjust for new CHACHA20_context_s.
* configure.ac: Remove 'chacha20-sse2-amd64.lo',
'chacha20-ssse3-amd64.lo', 'chacha20-avx2-amd64.lo'; Add
'chacha20-amd64-ssse3.lo', 'chacha20-amd64-avx2.lo'.
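
For readers unfamiliar with what a QUARTERROUND-style helper computes, here is
a minimal sketch of the ChaCha quarter round as specified in RFC 8439,
section 2.1. The names rotl32, quarter_round and quarter_round_selftest are
hypothetical and chosen for illustration; the macros added to
cipher/chacha20.c may be written differently.

```c
#include <stdint.h>

/* Rotate a 32-bit word left by n bits (valid for 0 < n < 32). */
static inline uint32_t rotl32(uint32_t v, unsigned int n)
{
  return (v << n) | (v >> (32 - n));
}

/* One ChaCha quarter round on four state words (RFC 8439, section 2.1).
   Illustrative only; not taken from cipher/chacha20.c. */
static inline void quarter_round(uint32_t *a, uint32_t *b,
                                 uint32_t *c, uint32_t *d)
{
  *a += *b; *d ^= *a; *d = rotl32(*d, 16);
  *c += *d; *b ^= *c; *b = rotl32(*b, 12);
  *a += *b; *d ^= *a; *d = rotl32(*d, 8);
  *c += *d; *b ^= *c; *b = rotl32(*b, 7);
}

/* Check the quarter round against the test vector in RFC 8439,
   section 2.1.1.  Returns 1 on success, 0 on mismatch. */
static int quarter_round_selftest(void)
{
  uint32_t a = 0x11111111u, b = 0x01020304u;
  uint32_t c = 0x9b8d6f43u, d = 0x01234567u;

  quarter_round(&a, &b, &c, &d);

  return a == 0xea2a92f4u && b == 0xcb1cf8ceu
      && c == 0x4581472eu && d == 0x5881c4bbu;
}
```

A full ChaCha20 block applies this operation to the columns and then the
diagonals of the 4x4 word state, for 20 rounds in total.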
Intel Core i7-4790K CPU @ 4.00GHz (x86_64/AVX2):
 CHACHA20       |  nanosecs/byte   mebibytes/sec   cycles/byte
     STREAM enc |     0.319 ns/B    2988.5 MiB/s      1.28 c/B
     STREAM dec |     0.318 ns/B    2995.4 MiB/s      1.27 c/B
Intel Core i7-4790K CPU @ 4.00GHz (x86_64/SSSE3):
 CHACHA20       |  nanosecs/byte   mebibytes/sec   cycles/byte
     STREAM enc |     0.633 ns/B    1507.4 MiB/s      2.53 c/B
     STREAM dec |     0.633 ns/B    1506.6 MiB/s      2.53 c/B
Intel Core i7-4790K CPU @ 4.00GHz (i386):
 CHACHA20       |  nanosecs/byte   mebibytes/sec   cycles/byte
     STREAM enc |      2.05 ns/B     465.2 MiB/s      8.20 c/B
     STREAM dec |      2.04 ns/B     467.5 MiB/s      8.16 c/B
Cortex-A53 @ 1152MHz (armv7/neon):
 CHACHA20       |  nanosecs/byte   mebibytes/sec   cycles/byte
     STREAM enc |      5.29 ns/B     180.3 MiB/s      6.09 c/B
     STREAM dec |      5.29 ns/B     180.1 MiB/s      6.10 c/B
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>