Add ARM NEON assembly implementation of Serpent
2cb6e1f323d2 (Unpublished)

Not On Permanent Ref: This commit is not an ancestor of any permanent ref.

Description

Add ARM NEON assembly implementation of Serpent

* cipher/Makefile.am: Add 'serpent-armv7-neon.S'.
* cipher/serpent-armv7-neon.S: New.
* cipher/serpent.c (USE_NEON): New macro.
(serpent_context_t) [USE_NEON]: Add 'use_neon'.
[USE_NEON] (_gcry_serpent_neon_ctr_enc, _gcry_serpent_neon_cfb_dec)
(_gcry_serpent_neon_cbc_dec): New prototypes.
(serpent_setkey_internal) [USE_NEON]: Detect NEON support.
(_gcry_serpent_ctr_enc, _gcry_serpent_cfb_dec)
(_gcry_serpent_cbc_dec) [USE_NEON]: Use NEON implementations
to process eight blocks in parallel.
* configure.ac [neonsupport]: Add 'serpent-armv7-neon.lo'.

This patch adds an ARM NEON-optimized implementation of the Serpent
cipher to speed up the parallelizable bulk operations (CBC decryption,
CFB decryption, and CTR mode).
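
The dispatch pattern described in the ChangeLog (check a 'use_neon' flag, hand groups of eight blocks to the wide routine, then finish the remainder one block at a time) can be sketched roughly as below. This is an illustration under stated assumptions, not the code from cipher/serpent.c: `sketch_ctx_t`, `sketch_wide_dec`, `sketch_single_dec`, and `sketch_bulk_dec` are hypothetical names, and the two helpers only copy data and count calls instead of running Serpent.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BLOCK_SIZE      16 /* Serpent block size in bytes */
#define PARALLEL_BLOCKS 8  /* blocks per wide call, per the ChangeLog */

/* Hypothetical context; only the 'use_neon' flag mirrors the commit. */
typedef struct {
    int use_neon;
    unsigned long wide_calls;   /* illustration only: 8-block calls made */
    unsigned long single_calls; /* illustration only: 1-block calls made */
} sketch_ctx_t;

/* Stand-in for an 8-block assembly routine; it copies instead of decrypting. */
static void sketch_wide_dec(sketch_ctx_t *ctx, uint8_t *out, const uint8_t *in)
{
    memmove(out, in, PARALLEL_BLOCKS * BLOCK_SIZE);
    ctx->wide_calls++;
}

/* Stand-in for the generic single-block routine; it also only copies. */
static void sketch_single_dec(sketch_ctx_t *ctx, uint8_t *out, const uint8_t *in)
{
    memmove(out, in, BLOCK_SIZE);
    ctx->single_calls++;
}

/* Bulk entry point: wide path first (if enabled), then the leftover blocks. */
static void sketch_bulk_dec(sketch_ctx_t *ctx, uint8_t *out, const uint8_t *in,
                            size_t nblocks)
{
    if (ctx->use_neon) {
        while (nblocks >= PARALLEL_BLOCKS) {
            sketch_wide_dec(ctx, out, in);
            out += PARALLEL_BLOCKS * BLOCK_SIZE;
            in += PARALLEL_BLOCKS * BLOCK_SIZE;
            nblocks -= PARALLEL_BLOCKS;
        }
    }

    /* Fewer than eight blocks remain, or the wide path is disabled. */
    while (nblocks > 0) {
        sketch_single_dec(ctx, out, in);
        out += BLOCK_SIZE;
        in += BLOCK_SIZE;
        nblocks--;
    }
}
```

With the flag set, a 19-block buffer would go through two wide calls and three single-block calls; with the flag clear, all 19 blocks take the single-block path.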

Benchmarks on ARM Cortex-A8 (armhf, 1008 MHz):

Old:
 SERPENT128     |  nanosecs/byte   mebibytes/sec   cycles/byte
        CBC dec |     43.53 ns/B     21.91 MiB/s     43.88 c/B
        CFB dec |     44.77 ns/B     21.30 MiB/s     45.13 c/B
        CTR enc |     45.21 ns/B     21.10 MiB/s     45.57 c/B
        CTR dec |     45.21 ns/B     21.09 MiB/s     45.57 c/B

New:
 SERPENT128     |  nanosecs/byte   mebibytes/sec   cycles/byte
        CBC dec |     26.26 ns/B     36.32 MiB/s     26.47 c/B
        CFB dec |     26.21 ns/B     36.38 MiB/s     26.42 c/B
        CTR enc |     26.20 ns/B     36.40 MiB/s     26.41 c/B
        CTR dec |     26.20 ns/B     36.40 MiB/s     26.41 c/B

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
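
To read the benchmark figures above as speedups, the arithmetic can be sketched as follows. The ns/B values are copied from the Old and New tables; `bench_row_t`, `serpent_rows`, `speedup`, and `cycles_per_byte` are hypothetical names introduced for this illustration, and the cycles/byte helper assumes the column is simply ns/B multiplied by the 1008 MHz clock.

```c
#include <stddef.h>

/* One benchmark row: mode name plus old and new cost in ns per byte. */
typedef struct {
    const char *mode;
    double old_ns;
    double new_ns;
} bench_row_t;

/* Figures taken from the Old/New tables in the commit message. */
static const bench_row_t serpent_rows[] = {
    { "CBC dec", 43.53, 26.26 },
    { "CFB dec", 44.77, 26.21 },
    { "CTR enc", 45.21, 26.20 },
    { "CTR dec", 45.21, 26.20 },
};

static const size_t serpent_row_count =
    sizeof serpent_rows / sizeof serpent_rows[0];

/* Speedup factor: how many times faster the new code is than the old. */
static double speedup(double old_ns, double new_ns)
{
    return old_ns / new_ns;
}

/* Assumed relation: cycles/byte = ns/byte * clock in GHz. */
static double cycles_per_byte(double ns_per_byte, double clock_mhz)
{
    return ns_per_byte * clock_mhz / 1000.0;
}
```

Applied to the table rows, this gives roughly 1.66x for CBC decryption, 1.71x for CFB decryption, and 1.73x for CTR mode. The cycles/byte helper also reproduces the reported column to within rounding, for example 43.53 ns/B at 1008 MHz comes to about 43.88 c/B.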

Details

Provenance
jukivili authored on Oct 27 2013, 1:07 PM
Parents
rC3ff9d2571c18: Add ARM NEON assembly implementation of Salsa20
Branches
Unknown
Tags
Unknown

Event Timeline

Jussi Kivilinna <jussi.kivilinna@iki.fi> committed rC2cb6e1f323d2: Add ARM NEON assembly implementation of Serpent (authored by Jussi Kivilinna <jussi.kivilinna@iki.fi>). Oct 28 2013, 3:19 PM