libgcrypt: s390x/zSeries implementation of Poly1305 / ChaCha20-Poly1305 AEAD
Closed, Resolved (Public)

Description

Implement a faster ChaCha20-Poly1305 AEAD for s390x/zSeries.

Event Timeline

jukivili created this object in space S1 Public.

Implemented stitched ChaCha20-Poly1305 (vector ChaCha20 & ALU Poly1305). Unfortunately, performance is lower than OpenSSL's (vector ChaCha20 & vector Poly1305). Instruction latencies make the ALU Poly1305 on its own slower than OpenSSL's combined ChaCha20+Poly1305, so stitching cannot reach the same performance. A vector Poly1305 implementation is therefore needed.
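For readers unfamiliar with why a scalar Poly1305 tends to be latency-bound, the following is a minimal reference sketch of the MAC as specified in RFC 8439, written in Python with big integers. It is not the libgcrypt s390x code, and the function name `poly1305_mac` is hypothetical. What it shows is the serial dependency: each 16-byte block update needs the previous accumulator value, so the multiply-and-reduce chain cannot start the next block early.

```python
def poly1305_mac(key: bytes, msg: bytes) -> bytes:
    """Reference Poly1305 (RFC 8439); illustrative only, not constant-time."""
    if len(key) != 32:
        raise ValueError("Poly1305 needs a 32-byte one-time key")

    # First half of the key is r, clamped as required by the specification.
    r = int.from_bytes(key[:16], "little") & 0x0FFFFFFC0FFFFFFC0FFFFFFC0FFFFFFF
    # Second half is s, added once at the end.
    s = int.from_bytes(key[16:], "little")
    p = (1 << 130) - 5

    h = 0
    for i in range(0, len(msg), 16):
        block = msg[i:i + 16]
        # Append the 0x01 pad byte directly after the block bytes.
        n = int.from_bytes(block + b"\x01", "little")
        # Serial dependency: this step cannot begin until the previous h is known.
        h = ((h + n) * r) % p

    tag = (h + s) & ((1 << 128) - 1)
    return tag.to_bytes(16, "little")
```

In a stitched implementation the idea is to interleave these dependent scalar steps with independent vector ChaCha20 instructions, so the scalar chain's latency is partly hidden rather than removed.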

jukivili renamed this task from "libgcrypt: s390x/zSeries 128-bit vector implementation of Poly1305" to "libgcrypt: s390x/zSeries implementation of Poly1305 / ChaCha20-Poly1305 AEAD". Dec 30 2020, 12:24 PM
jukivili updated the task description.

With a little extra effort, the stitched implementation turned out OK after all.

libgcrypt implementation, stitched ChaCha20-Poly1305:

CHACHA20       |  nanosecs/byte   mebibytes/sec   cycles/byte     
    STREAM enc |     0.506 ns/B      1886 MiB/s      2.28 c/B
    STREAM dec |     0.506 ns/B      1884 MiB/s      2.28 c/B
  POLY1305 enc |     0.677 ns/B      1409 MiB/s      3.05 c/B
  POLY1305 dec |     0.655 ns/B      1456 MiB/s      2.95 c/B
 POLY1305 auth |     0.569 ns/B      1675 MiB/s      2.56 c/B

openssl 1.1.1f:

bench-slope-openssl: OpenSSL 1.1.1f  31 Mar 2020
Cipher:
 chacha20       |  nanosecs/byte   mebibytes/sec   cycles/byte
     STREAM enc |     0.592 ns/B    1609.9 MiB/s      2.67 c/B
     STREAM dec |     0.593 ns/B    1607.1 MiB/s      2.67 c/B
   POLY1305 enc |     0.790 ns/B    1207.1 MiB/s      3.56 c/B
   POLY1305 dec |     0.809 ns/B    1178.9 MiB/s      3.64 c/B
  POLY1305 auth |     0.200 ns/B    4772.4 MiB/s     0.900 c/B
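As a reading aid for the two tables above, here is a small Python sketch of how the columns relate and how the results compare. The helper names are hypothetical, and the clock frequency is only inferred from the ratio of the cycles/byte and nanosecs/byte columns; it is not stated in the task.

```python
MIB = 1024 * 1024  # bytes per mebibyte


def mib_per_sec(ns_per_byte: float) -> float:
    """Convert nanoseconds per byte into mebibytes per second."""
    return 1e9 / ns_per_byte / MIB


def implied_ghz(ns_per_byte: float, cycles_per_byte: float) -> float:
    """Clock frequency implied by the two columns (cycles per nanosecond)."""
    return cycles_per_byte / ns_per_byte


def speedup(baseline_ns: float, candidate_ns: float) -> float:
    """How many times faster the candidate is than the baseline."""
    return baseline_ns / candidate_ns


# Values copied from the tables above (nanosecs/byte and cycles/byte).
libgcrypt = {"STREAM enc": 0.506, "POLY1305 enc": 0.677,
             "POLY1305 dec": 0.655, "POLY1305 auth": 0.569}
openssl = {"STREAM enc": 0.592, "POLY1305 enc": 0.790,
           "POLY1305 dec": 0.809, "POLY1305 auth": 0.200}

if __name__ == "__main__":
    print(f"implied clock: {implied_ghz(0.506, 2.28):.2f} GHz")
    for mode in libgcrypt:
        ratio = speedup(openssl[mode], libgcrypt[mode])
        print(f"{mode:>14}: libgcrypt is {ratio:.2f}x OpenSSL speed")
```

Run on these figures, the ratios are above 1 for the stream and AEAD modes and below 1 for standalone POLY1305 auth, where OpenSSL's vector Poly1305 remains clearly faster.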

Merged to master.