chacha20: add SSE2/AMD64 optimized implementation
323b1eb80ff3 (Unpublished)

Not On Permanent Ref: This commit is not an ancestor of any permanent ref.

Description

chacha20: add SSE2/AMD64 optimized implementation

* cipher/Makefile.am: Add 'chacha20-sse2-amd64.S'.
* cipher/chacha20-sse2-amd64.S: New.
* cipher/chacha20.c (USE_SSE2): New.
[USE_SSE2] (_gcry_chacha20_amd64_sse2_blocks): New.
(chacha20_do_setkey) [USE_SSE2]: Use SSE2 implementation for blocks
function.
* configure.ac [host=x86-64]: Add 'chacha20-sse2-amd64.lo'.
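
The selection described in the ChangeLog entries above can be pictured as a function pointer that is fixed once at key setup. The following is only a minimal sketch under stated assumptions, not the libgcrypt code: the names chacha20_ctx, blocks_fn, chacha20_setkey_sketch and chacha20_blocks_sse2_standin are hypothetical, the state layout is assumed to follow RFC 7539 (32-bit counter in word 12, 96-bit nonce), and the "SSE2" backend is a C stand-in for what would be an external assembly routine.

```c
#include <stddef.h>
#include <stdint.h>

#define ROTL32(v, n) (((v) << (n)) | ((v) >> (32 - (n))))
#define QR(a, b, c, d)                     \
  do {                                     \
    a += b; d ^= a; d = ROTL32(d, 16);     \
    c += d; b ^= c; b = ROTL32(b, 12);     \
    a += b; d ^= a; d = ROTL32(d, 8);      \
    c += d; b ^= c; b = ROTL32(b, 7);      \
  } while (0)

/* Signature shared by every backend (hypothetical): XOR nblks 64-byte
   keystream blocks into src, write to dst, advance the block counter. */
typedef void (*blocks_fn)(uint32_t state[16], const uint8_t *src,
                          uint8_t *dst, size_t nblks);

typedef struct {
  uint32_t state[16];
  blocks_fn blocks;   /* chosen once in setkey, used for all later calls */
} chacha20_ctx;

/* Portable reference backend: 20 rounds as 10 column/diagonal pairs. */
static void chacha20_blocks_generic(uint32_t state[16], const uint8_t *src,
                                    uint8_t *dst, size_t nblks)
{
  while (nblks--) {
    uint32_t x[16];
    int i;
    for (i = 0; i < 16; i++) x[i] = state[i];
    for (i = 0; i < 10; i++) {
      QR(x[0], x[4], x[8],  x[12]); QR(x[1], x[5], x[9],  x[13]);
      QR(x[2], x[6], x[10], x[14]); QR(x[3], x[7], x[11], x[15]);
      QR(x[0], x[5], x[10], x[15]); QR(x[1], x[6], x[11], x[12]);
      QR(x[2], x[7], x[8],  x[13]); QR(x[3], x[4], x[9],  x[14]);
    }
    for (i = 0; i < 16; i++) {
      uint32_t w = x[i] + state[i];   /* feed-forward, little-endian out */
      dst[4 * i + 0] = src[4 * i + 0] ^ (uint8_t)(w);
      dst[4 * i + 1] = src[4 * i + 1] ^ (uint8_t)(w >> 8);
      dst[4 * i + 2] = src[4 * i + 2] ^ (uint8_t)(w >> 16);
      dst[4 * i + 3] = src[4 * i + 3] ^ (uint8_t)(w >> 24);
    }
    state[12]++;                      /* assumed 32-bit block counter */
    src += 64;
    dst += 64;
  }
}

#ifdef USE_SSE2
/* Stand-in only: a real build would declare an external assembly routine
   here instead of forwarding to the portable code. */
static void chacha20_blocks_sse2_standin(uint32_t state[16],
                                         const uint8_t *src, uint8_t *dst,
                                         size_t nblks)
{
  chacha20_blocks_generic(state, src, dst, nblks);
}
#endif

static uint32_t load_le32(const uint8_t *p)
{
  return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
         ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Fill the state and pick the backend at compile time via USE_SSE2. */
static void chacha20_setkey_sketch(chacha20_ctx *ctx, const uint8_t key[32],
                                   const uint8_t nonce[12], uint32_t counter)
{
  static const uint32_t sigma[4] = { 0x61707865, 0x3320646e,
                                     0x79622d32, 0x6b206574 };
  int i;
  for (i = 0; i < 4; i++) ctx->state[i] = sigma[i];
  for (i = 0; i < 8; i++) ctx->state[4 + i] = load_le32(key + 4 * i);
  ctx->state[12] = counter;
  for (i = 0; i < 3; i++) ctx->state[13 + i] = load_le32(nonce + 4 * i);
#ifdef USE_SSE2
  ctx->blocks = chacha20_blocks_sse2_standin;
#else
  ctx->blocks = chacha20_blocks_generic;
#endif
}
```

A caller would run chacha20_setkey_sketch once and then call ctx.blocks(ctx.state, in, out, nblks) for each batch of 64-byte blocks. Keeping every backend behind one signature is what lets an optimized routine be dropped in without touching the callers.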

Add Andrew Moon's public domain SSE2 implementation of ChaCha20. Original
source is available at: https://github.com/floodyberry/chacha-opt

Benchmark on Intel i5-4570 (haswell),
with "--disable-hwf intel-avx2 --disable-hwf intel-ssse3":

Old:
CHACHA20       |  nanosecs/byte   mebibytes/sec   cycles/byte
    STREAM enc |      1.97 ns/B     483.8 MiB/s      6.31 c/B
    STREAM dec |      1.97 ns/B     483.6 MiB/s      6.31 c/B

New:
CHACHA20       |  nanosecs/byte   mebibytes/sec   cycles/byte
    STREAM enc |     0.931 ns/B    1024.7 MiB/s      2.98 c/B
    STREAM dec |     0.930 ns/B    1025.0 MiB/s      2.98 c/B

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
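
The three benchmark columns in the message above are related by simple arithmetic, which can be checked with a short sketch. The helper names below are hypothetical and the input figures are copied from the reported STREAM enc rows. Small differences from the printed MiB/s values are expected because the ns/B figures are already rounded.

```c
#include <stdio.h>

/* Throughput in MiB/s implied by a nanoseconds-per-byte figure. */
static double mib_per_sec(double ns_per_byte)
{
  return 1e9 / ns_per_byte / (1024.0 * 1024.0);
}

/* Clock rate in GHz implied by cycles/byte and nanoseconds/byte. */
static double implied_ghz(double cycles_per_byte, double ns_per_byte)
{
  return cycles_per_byte / ns_per_byte;
}

/* Ratio of an old cost to a new cost (values above 1 mean faster). */
static double speedup(double old_cost, double new_cost)
{
  return old_cost / new_cost;
}

/* Print the derived figures for the STREAM enc rows of the commit. */
void print_chacha20_summary(void)
{
  const double old_ns = 1.97, old_cpb = 6.31;   /* "Old" row */
  const double new_ns = 0.931, new_cpb = 2.98;  /* "New" row */

  printf("old: %.1f MiB/s at about %.2f GHz\n",
         mib_per_sec(old_ns), implied_ghz(old_cpb, old_ns));
  printf("new: %.1f MiB/s at about %.2f GHz\n",
         mib_per_sec(new_ns), implied_ghz(new_cpb, new_ns));
  printf("speedup in cycles/byte: %.2fx\n", speedup(old_cpb, new_cpb));
}
```

By this arithmetic the SSE2 code is roughly 2.1 times faster than the previous implementation on this machine, and both rows imply a clock of about 3.2 GHz.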

Details

Provenance
jukivili authored on May 16 2014, 8:28 PM
Parents
rC98f021961ee6: poly1305: add AMD64/AVX2 optimized implementation
Branches
Unknown
Tags
Unknown

Event Timeline

Jussi Kivilinna <jussi.kivilinna@iki.fi> committed rC323b1eb80ff3: chacha20: add SSE2/AMD64 optimized implementation (authored by Jussi Kivilinna <jussi.kivilinna@iki.fi>). May 16 2014, 8:41 PM