poly1305: add AMD64/AVX2 optimized implementation
98f021961ee6 (Unpublished)

Not On Permanent Ref: This commit is not an ancestor of any permanent ref.

Description

poly1305: add AMD64/AVX2 optimized implementation

* cipher/Makefile.am: Add 'poly1305-avx2-amd64.S'.
* cipher/poly1305-avx2-amd64.S: New.
* cipher/poly1305-internal.h (POLY1305_USE_AVX2)
(POLY1305_AVX2_BLOCKSIZE, POLY1305_AVX2_STATESIZE)
(POLY1305_AVX2_ALIGNMENT): New.
(POLY1305_LARGEST_BLOCKSIZE, POLY1305_LARGEST_STATESIZE)
(POLY1305_STATE_ALIGNMENT): Use AVX2 versions when needed.
* cipher/poly1305.c [POLY1305_USE_AVX2]
(_gcry_poly1305_amd64_avx2_init_ext)
(_gcry_poly1305_amd64_avx2_finish_ext)
(_gcry_poly1305_amd64_avx2_blocks, poly1305_amd64_avx2_ops): New.
(_gcry_poly1305_init) [POLY1305_USE_AVX2]: Use AVX2 implementation if
AVX2 is supported by the CPU.
* configure.ac [host=x86_64]: Add 'poly1305-avx2-amd64.lo'.
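
The selection step described for _gcry_poly1305_init (use the AVX2 code path only when the CPU supports AVX2) can be illustrated with a minimal sketch. This is not libgcrypt's code: every identifier prefixed with sketch_ or SKETCH_ is hypothetical, the block functions are empty stubs, and the preference order (AVX2 over SSE2 over a portable fallback) is an assumption based on the parent commit having added an SSE2 variant.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical feature bits; a real library would fill these in from
   CPUID at startup. */
#define SKETCH_HWF_SSE2 1u
#define SKETCH_HWF_AVX2 2u

/* Hypothetical operations table: one instance per implementation. */
typedef struct
{
  const char *name;
  void (*blocks) (void *state, const unsigned char *msg, size_t len);
} sketch_ops_t;

/* Stubs standing in for the real block functions (assembly in practice). */
static void sketch_ref_blocks (void *s, const unsigned char *m, size_t n)
{ (void) s; (void) m; (void) n; }
static void sketch_sse2_blocks (void *s, const unsigned char *m, size_t n)
{ (void) s; (void) m; (void) n; }
static void sketch_avx2_blocks (void *s, const unsigned char *m, size_t n)
{ (void) s; (void) m; (void) n; }

static const sketch_ops_t sketch_ref_ops = { "ref", sketch_ref_blocks };
static const sketch_ops_t sketch_sse2_ops = { "sse2", sketch_sse2_blocks };
static const sketch_ops_t sketch_avx2_ops = { "avx2", sketch_avx2_blocks };

/* Pick the widest implementation the CPU supports.  Later checks
   override earlier ones, so AVX2 wins when both bits are set. */
static const sketch_ops_t *
sketch_select_ops (unsigned int hw_features)
{
  const sketch_ops_t *ops = &sketch_ref_ops;

  if (hw_features & SKETCH_HWF_SSE2)
    ops = &sketch_sse2_ops;
  if (hw_features & SKETCH_HWF_AVX2)
    ops = &sketch_avx2_ops;

  assert (ops->blocks != NULL);
  assert (strlen (ops->name) > 0);
  return ops;
}
```

Selecting the table once at init time keeps the per-block path free of feature checks; callers then go through ops->blocks without knowing which variant was chosen.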

Add Andrew Moon's public domain AVX2 implementation of Poly1305. Original
source is available at: https://github.com/floodyberry/poly1305-opt

Benchmarks on Intel i5-4570 (Haswell):

Old:
            |  nanosecs/byte   mebibytes/sec   cycles/byte
 POLY1305   |     0.448 ns/B    2129.5 MiB/s      1.43 c/B

New:
            |  nanosecs/byte   mebibytes/sec   cycles/byte
 POLY1305   |     0.205 ns/B    4643.5 MiB/s     0.657 c/B
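
As a reading aid for the figures above, the following sketch derives the speedup and the clock rate implied by each row. The struct and function names are hypothetical; only the numeric values come from the benchmark output. Dividing cycles/byte by nanosecs/byte gives cycles per nanosecond, that is, GHz. With these inputs the speedup is roughly 2.2x and both rows imply a clock of about 3.2 GHz, so the two measurements are mutually consistent.

```c
#include <assert.h>
#include <stdio.h>

/* One row of the benchmark output (hypothetical type for illustration). */
typedef struct
{
  double ns_per_byte;
  double mib_per_sec;
  double cycles_per_byte;
} bench_row_t;

/* Values copied from the "Old" and "New" rows of the commit message. */
static const bench_row_t bench_old = { 0.448, 2129.5, 1.43 };
static const bench_row_t bench_new = { 0.205, 4643.5, 0.657 };

/* Speedup measured in cycles per byte: old cost divided by new cost. */
static double
speedup_by_cycles (bench_row_t before, bench_row_t after)
{
  assert (after.cycles_per_byte > 0.0);
  return before.cycles_per_byte / after.cycles_per_byte;
}

/* Clock rate implied by a row: (cycles/byte) / (ns/byte) = cycles/ns = GHz. */
static double
implied_ghz (bench_row_t row)
{
  assert (row.ns_per_byte > 0.0);
  return row.cycles_per_byte / row.ns_per_byte;
}

/* Print a short summary of both derived quantities. */
static void
print_summary (void)
{
  printf ("speedup: %.2fx\n", speedup_by_cycles (bench_old, bench_new));
  printf ("implied clock (old row): %.2f GHz\n", implied_ghz (bench_old));
  printf ("implied clock (new row): %.2f GHz\n", implied_ghz (bench_new));
}
```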

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>

Details

Provenance: jukivili authored on May 11 2014, 7:52 PM
Parents: rC297532602ed2: poly1305: add AMD64/SSE2 optimized implementation
Branches: Unknown
Tags: Unknown

Event Timeline

Jussi Kivilinna <jussi.kivilinna@iki.fi> committed rC98f021961ee6: poly1305: add AMD64/AVX2 optimized implementation (authored by Jussi Kivilinna <jussi.kivilinna@iki.fi>). May 16 2014, 7:54 PM