Add SSSE3 optimized non-parallel ChaCha20 function

Description

* cipher/chacha20-amd64-ssse3.S (ROTATE_SHUF, ROTATE, WORD_SHUF)
(QUARTERROUND4, _gcry_chacha20_amd64_ssse3_blocks1): New.
* cipher/chacha20.c (_gcry_chacha20_amd64_ssse3_blocks1): New
prototype.
(chacha20_blocks): Rename to ...
(do_chacha20_blocks): ... this.
(chacha20_blocks): New.
(chacha20_encrypt_stream): Adjust for new chacha20_blocks function.
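
The rename above (chacha20_blocks becoming do_chacha20_blocks, with a new chacha20_blocks in front of it) reads like a dispatch wrapper. The following is only a minimal sketch of that pattern, not the code from cipher/chacha20.c: the context struct, the `use_ssse3` flag, and all `sketch_*` names are hypothetical, and the two back ends are stand-ins that merely count blocks so the example stays self-contained.

```c
#include <stddef.h>

/* Hypothetical, simplified context: only the field needed for dispatch. */
typedef struct {
  int use_ssse3;  /* assumed to be set once from CPU feature detection */
} sketch_ctx;

/* Counters standing in for real work, so the sketch can be checked. */
static size_t generic_blocks_done;
static size_t ssse3_blocks_done;

/* Stand-in for the generic C path (do_chacha20_blocks in the changelog). */
static void sketch_do_blocks(size_t nblks)
{
  generic_blocks_done += nblks;
}

/* Stand-in for the assembly path (the *_ssse3_blocks1 entry point). */
static void sketch_ssse3_blocks1(size_t nblks)
{
  ssse3_blocks_done += nblks;
}

/* Wrapper keeping the old name: callers use one function, and the
   implementation is selected here. */
static void sketch_blocks(const sketch_ctx *ctx, size_t nblks)
{
  if (ctx->use_ssse3)
    sketch_ssse3_blocks1(nblks);
  else
    sketch_do_blocks(nblks);
}
```

With this shape, a caller such as the stream function only needs adjusting for the wrapper's signature; it does not need to know which back end runs.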

This patch provides an SSSE3-optimized version of the non-parallel
ChaCha20 core block function. On Intel Haswell, the generic C function
runs at 6.9 cycles/byte and the new function runs at 5.2 cycles/byte,
making it ~32% faster.
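
For reference, the operation at the heart of the core block function is the ChaCha quarter round. The portable C sketch below is not taken from this patch; the function names are illustrative. It also shows that a rotation by 8 bits is a pure byte permutation (the same holds for 16). That is the kind of operation a single SSSE3 byte shuffle (PSHUFB) can perform, which is presumably what the ROTATE_SHUF macro exploits, although the assembly itself is not shown here.

```c
#include <stdint.h>

/* Rotate a 32-bit word left by n bits (0 < n < 32). */
static uint32_t rotl32(uint32_t x, unsigned n)
{
  return (x << n) | (x >> (32 - n));
}

/* Same result as rotl32(x, 8), written as a byte permutation:
   each byte moves up one position and the top byte wraps around. */
static uint32_t rotl8_by_bytes(uint32_t x)
{
  uint32_t b0 = x & 0xff;
  uint32_t b1 = (x >> 8) & 0xff;
  uint32_t b2 = (x >> 16) & 0xff;
  uint32_t b3 = x >> 24;
  return b3 | (b0 << 8) | (b1 << 16) | (b2 << 24);
}

/* One ChaCha quarter round on four words of the 16-word state. */
static void quarterround(uint32_t *a, uint32_t *b, uint32_t *c, uint32_t *d)
{
  *a += *b; *d ^= *a; *d = rotl32(*d, 16);
  *c += *d; *b ^= *c; *b = rotl32(*b, 12);
  *a += *b; *d ^= *a; *d = rotl32(*d, 8);
  *c += *d; *b ^= *c; *b = rotl32(*b, 7);
}
```

The remaining rotations, by 12 and 7 bits, are not byte-aligned, so they still need a shift, shift, OR sequence in any SIMD version.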

  • Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>

Details

Provenance
jukivili authored on Jan 27 2019, 10:19 AM
Parents
rC88e482d16ee8: tests/basic: increase buffer size for check_one_cipher