Add ARMv8/AArch32 Crypto Extension implementation of AES
05a4cecae0c0 (Unpublished)

Not On Permanent Ref: This commit is not an ancestor of any permanent ref.

Description

Add ARMv8/AArch32 Crypto Extension implementation of AES

* cipher/Makefile.am: Add 'rijndael-armv8-ce.c' and
'rijndael-armv8-aarch32-ce.S'.
* cipher/rijndael-armv8-aarch32-ce.S: New.
* cipher/rijndael-armv8-ce.c: New.
* cipher/rijndael-internal.h (USE_ARM_CE): New.
(RIJNDAEL_context_s): Add 'use_arm_ce'.
* cipher/rijndael.c [USE_ARM_CE] (_gcry_aes_armv8_ce_setkey)
(_gcry_aes_armv8_ce_prepare_decryption)
(_gcry_aes_armv8_ce_encrypt, _gcry_aes_armv8_ce_decrypt)
(_gcry_aes_armv8_ce_cfb_enc, _gcry_aes_armv8_ce_cbc_enc)
(_gcry_aes_armv8_ce_ctr_enc, _gcry_aes_armv8_ce_cfb_dec)
(_gcry_aes_armv8_ce_cbc_dec, _gcry_aes_armv8_ce_ocb_crypt)
(_gcry_aes_armv8_ce_ocb_auth): New.
(do_setkey) [USE_ARM_CE]: Add ARM CE/AES HW feature check and key
setup for ARM CE.
(prepare_decryption, _gcry_aes_cfb_enc, _gcry_aes_cbc_enc)
(_gcry_aes_ctr_enc, _gcry_aes_cfb_dec, _gcry_aes_cbc_dec)
(_gcry_aes_ocb_crypt, _gcry_aes_ocb_auth) [USE_ARM_CE]: Add
ARM CE support.
* configure.ac: Add 'rijndael-armv8-ce.lo' and
'rijndael-armv8-aarch32-ce.lo'.
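
The ChangeLog entries above describe a runtime feature check in do_setkey followed by per-mode dispatch to the ARM CE routines. A minimal sketch of that pattern is shown here. It is not the code from this commit: every demo_* name and the DEMO_HWF_ARM_AES bit are hypothetical, and the two block routines are placeholders that only copy data. Only the 'use_arm_ce' field name is taken from the ChangeLog.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical feature bit for this sketch; not libgcrypt's real flag. */
#define DEMO_HWF_ARM_AES (1u << 0)

#define DEMO_BLOCK_SIZE 16

/* Single-block routine signature used by this sketch only. */
typedef void (*demo_block_fn)(const unsigned char *in, unsigned char *out);

typedef struct
{
  demo_block_fn encrypt_fn;  /* routine chosen at key setup */
  unsigned int use_arm_ce;   /* named after 'use_arm_ce' in the ChangeLog */
} demo_aes_ctx;

/* Placeholder bodies: they only copy the block and perform no encryption. */
void demo_encrypt_generic(const unsigned char *in, unsigned char *out)
{
  memcpy(out, in, DEMO_BLOCK_SIZE);
}

void demo_encrypt_arm_ce(const unsigned char *in, unsigned char *out)
{
  memcpy(out, in, DEMO_BLOCK_SIZE);
}

/* Pick the implementation once, when the key is set. */
void demo_setkey_dispatch(demo_aes_ctx *ctx, unsigned int hwfeatures)
{
  assert(ctx != NULL);

  if (hwfeatures & DEMO_HWF_ARM_AES)
    {
      ctx->encrypt_fn = demo_encrypt_arm_ce;
      ctx->use_arm_ce = 1;
    }
  else
    {
      ctx->encrypt_fn = demo_encrypt_generic;
      ctx->use_arm_ce = 0;
    }
}

/* Bulk-mode entry points can branch on the flag set above. */
const char *demo_bulk_path(const demo_aes_ctx *ctx)
{
  assert(ctx != NULL);
  return ctx->use_arm_ce ? "arm-ce" : "generic";
}
```

Deciding once at key setup keeps the feature test out of the per-block path; the bulk routines (CBC, CFB, CTR, OCB) then only need to test a single context flag.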

Improvement vs ARM assembly on Cortex-A53:

           AES-128  AES-192  AES-256
CBC enc:     14.8x    12.8x    11.4x
CBC dec:     21.4x    20.5x    19.4x
CFB enc:     16.2x    13.6x    11.6x
CFB dec:     21.6x    20.5x    19.4x
CTR:         19.1x    18.6x    17.8x
OCB enc:     16.0x    16.2x    16.1x
OCB dec:     15.6x    15.9x    15.8x
OCB auth:    18.3x    18.4x    18.0x
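
The improvement factors appear to be the ratio of the "Before" and "After" cycles/byte figures listed in the benchmark output. A small sketch of that arithmetic follows; demo_speedup is a hypothetical helper name, and the derivation is an assumption since the commit message does not say how the factors were computed.

```c
#include <assert.h>

/* Speed-up factor: how many times fewer cycles per byte the new code needs. */
double demo_speedup(double before_cpb, double after_cpb)
{
  assert(before_cpb > 0.0 && after_cpb > 0.0);
  return before_cpb / after_cpb;
}
```

For AES-128 CBC dec, demo_speedup(24.38, 1.14) gives about 21.4, in line with the table. Some rows differ in the last digit because the printed cycles/byte values are themselves rounded.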

Benchmark on Cortex-A53 (1152 MHz):

Before:
AES | nanosecs/byte mebibytes/sec cycles/byte

 ECB enc |     24.42 ns/B     39.06 MiB/s     28.13 c/B
 ECB dec |     25.07 ns/B     38.05 MiB/s     28.88 c/B
 CBC enc |     21.05 ns/B     45.30 MiB/s     24.25 c/B
 CBC dec |     21.16 ns/B     45.07 MiB/s     24.38 c/B
 CFB enc |     21.05 ns/B     45.31 MiB/s     24.25 c/B
 CFB dec |     21.38 ns/B     44.61 MiB/s     24.62 c/B
 OFB enc |     26.15 ns/B     36.47 MiB/s     30.13 c/B
 OFB dec |     26.15 ns/B     36.47 MiB/s     30.13 c/B
 CTR enc |     21.17 ns/B     45.06 MiB/s     24.38 c/B
 CTR dec |     21.16 ns/B     45.06 MiB/s     24.38 c/B
 CCM enc |     42.32 ns/B     22.53 MiB/s     48.75 c/B
 CCM dec |     42.32 ns/B     22.53 MiB/s     48.75 c/B
CCM auth |     21.17 ns/B     45.06 MiB/s     24.38 c/B
 GCM enc |     22.08 ns/B     43.19 MiB/s     25.44 c/B
 GCM dec |     22.08 ns/B     43.18 MiB/s     25.44 c/B
GCM auth |     0.923 ns/B    1032.8 MiB/s      1.06 c/B
 OCB enc |     26.20 ns/B     36.40 MiB/s     30.18 c/B
 OCB dec |     25.97 ns/B     36.73 MiB/s     29.91 c/B
OCB auth |     24.52 ns/B     38.90 MiB/s     28.24 c/B
         =

AES192 | nanosecs/byte mebibytes/sec cycles/byte

 ECB enc |     27.83 ns/B     34.26 MiB/s     32.06 c/B
 ECB dec |     28.54 ns/B     33.42 MiB/s     32.88 c/B
 CBC enc |     24.47 ns/B     38.97 MiB/s     28.19 c/B
 CBC dec |     25.27 ns/B     37.74 MiB/s     29.11 c/B
 CFB enc |     25.08 ns/B     38.02 MiB/s     28.89 c/B
 CFB dec |     25.31 ns/B     37.68 MiB/s     29.16 c/B
 OFB enc |     29.57 ns/B     32.25 MiB/s     34.06 c/B
 OFB dec |     29.57 ns/B     32.25 MiB/s     34.06 c/B
 CTR enc |     25.24 ns/B     37.78 MiB/s     29.08 c/B
 CTR dec |     25.24 ns/B     37.79 MiB/s     29.08 c/B
 CCM enc |     49.81 ns/B     19.15 MiB/s     57.38 c/B
 CCM dec |     49.80 ns/B     19.15 MiB/s     57.37 c/B
CCM auth |     24.58 ns/B     38.80 MiB/s     28.32 c/B
 GCM enc |     26.15 ns/B     36.47 MiB/s     30.13 c/B
 GCM dec |     26.11 ns/B     36.52 MiB/s     30.08 c/B
GCM auth |     0.923 ns/B    1033.0 MiB/s      1.06 c/B
 OCB enc |     29.59 ns/B     32.23 MiB/s     34.09 c/B
 OCB dec |     29.42 ns/B     32.42 MiB/s     33.89 c/B
OCB auth |     27.92 ns/B     34.16 MiB/s     32.16 c/B
         =

AES256 | nanosecs/byte mebibytes/sec cycles/byte

 ECB enc |     31.20 ns/B     30.57 MiB/s     35.94 c/B
 ECB dec |     31.80 ns/B     29.99 MiB/s     36.63 c/B
 CBC enc |     27.83 ns/B     34.27 MiB/s     32.06 c/B
 CBC dec |     27.87 ns/B     34.21 MiB/s     32.11 c/B
 CFB enc |     27.88 ns/B     34.20 MiB/s     32.12 c/B
 CFB dec |     28.16 ns/B     33.87 MiB/s     32.44 c/B
 OFB enc |     32.93 ns/B     28.96 MiB/s     37.94 c/B
 OFB dec |     32.93 ns/B     28.96 MiB/s     37.94 c/B
 CTR enc |     27.95 ns/B     34.13 MiB/s     32.19 c/B
 CTR dec |     27.95 ns/B     34.12 MiB/s     32.20 c/B
 CCM enc |     55.88 ns/B     17.07 MiB/s     64.38 c/B
 CCM dec |     55.88 ns/B     17.07 MiB/s     64.38 c/B
CCM auth |     27.95 ns/B     34.12 MiB/s     32.20 c/B
 GCM enc |     28.86 ns/B     33.05 MiB/s     33.25 c/B
 GCM dec |     28.87 ns/B     33.04 MiB/s     33.25 c/B
GCM auth |     0.923 ns/B    1033.0 MiB/s      1.06 c/B
 OCB enc |     32.96 ns/B     28.94 MiB/s     37.97 c/B
 OCB dec |     32.73 ns/B     29.14 MiB/s     37.70 c/B
OCB auth |     31.29 ns/B     30.48 MiB/s     36.04 c/B
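
The three columns in these tables are tied together by the 1152 MHz clock stated above. The sketch below shows the conversions, assuming the usual definitions (cycles/byte = ns/byte times clock in GHz, 1 MiB = 1024 * 1024 bytes); the demo_* function names are hypothetical.

```c
#include <assert.h>

/* Convert ns/byte to cycles/byte for a given clock frequency in MHz. */
double demo_cycles_per_byte(double ns_per_byte, double clock_mhz)
{
  assert(ns_per_byte > 0.0 && clock_mhz > 0.0);
  return ns_per_byte * clock_mhz / 1000.0;
}

/* Convert ns/byte to throughput in MiB/s (1 MiB = 1024 * 1024 bytes). */
double demo_mib_per_sec(double ns_per_byte)
{
  assert(ns_per_byte > 0.0);
  return 1e9 / ns_per_byte / (1024.0 * 1024.0);
}
```

For the AES ECB enc row, 24.42 ns/B at 1152 MHz works out to about 28.13 c/B and about 39.05 MiB/s; the small gap to the printed 39.06 MiB/s comes from the rounding of the ns/B figure.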

After:
AES | nanosecs/byte mebibytes/sec cycles/byte

 ECB enc |      5.10 ns/B     187.0 MiB/s      5.88 c/B
 ECB dec |      5.27 ns/B     181.0 MiB/s      6.07 c/B
 CBC enc |      1.41 ns/B     675.8 MiB/s      1.63 c/B
 CBC dec |     0.992 ns/B     961.7 MiB/s      1.14 c/B
 CFB enc |      1.30 ns/B     732.4 MiB/s      1.50 c/B
 CFB dec |     0.991 ns/B     962.7 MiB/s      1.14 c/B
 OFB enc |      7.05 ns/B     135.2 MiB/s      8.13 c/B
 OFB dec |      7.05 ns/B     135.2 MiB/s      8.13 c/B
 CTR enc |      1.11 ns/B     856.9 MiB/s      1.28 c/B
 CTR dec |      1.11 ns/B     857.0 MiB/s      1.28 c/B
 CCM enc |      2.58 ns/B     369.8 MiB/s      2.97 c/B
 CCM dec |      2.58 ns/B     369.5 MiB/s      2.97 c/B
CCM auth |      1.58 ns/B     605.2 MiB/s      1.82 c/B
 GCM enc |      2.04 ns/B     467.9 MiB/s      2.35 c/B
 GCM dec |      2.04 ns/B     466.6 MiB/s      2.35 c/B
GCM auth |     0.923 ns/B    1033.0 MiB/s      1.06 c/B
 OCB enc |      1.64 ns/B     579.8 MiB/s      1.89 c/B
 OCB dec |      1.66 ns/B     574.5 MiB/s      1.91 c/B
OCB auth |      1.33 ns/B     715.5 MiB/s      1.54 c/B
         =

AES192 | nanosecs/byte mebibytes/sec cycles/byte

 ECB enc |      5.64 ns/B     169.0 MiB/s      6.50 c/B
 ECB dec |      5.81 ns/B     164.3 MiB/s      6.69 c/B
 CBC enc |      1.90 ns/B     502.1 MiB/s      2.19 c/B
 CBC dec |      1.24 ns/B     771.7 MiB/s      1.42 c/B
 CFB enc |      1.84 ns/B     517.1 MiB/s      2.12 c/B
 CFB dec |      1.23 ns/B     772.5 MiB/s      1.42 c/B
 OFB enc |      7.60 ns/B     125.5 MiB/s      8.75 c/B
 OFB dec |      7.60 ns/B     125.6 MiB/s      8.75 c/B
 CTR enc |      1.36 ns/B     702.7 MiB/s      1.56 c/B
 CTR dec |      1.36 ns/B     702.5 MiB/s      1.56 c/B
 CCM enc |      3.31 ns/B     287.8 MiB/s      3.82 c/B
 CCM dec |      3.31 ns/B     288.0 MiB/s      3.81 c/B
CCM auth |      2.06 ns/B     462.1 MiB/s      2.38 c/B
 GCM enc |      2.28 ns/B     418.4 MiB/s      2.63 c/B
 GCM dec |      2.28 ns/B     418.0 MiB/s      2.63 c/B
GCM auth |     0.923 ns/B    1032.8 MiB/s      1.06 c/B
 OCB enc |      1.83 ns/B     520.1 MiB/s      2.11 c/B
 OCB dec |      1.84 ns/B     517.8 MiB/s      2.12 c/B
OCB auth |      1.52 ns/B     626.1 MiB/s      1.75 c/B
         =

AES256 | nanosecs/byte mebibytes/sec cycles/byte

 ECB enc |      5.86 ns/B     162.7 MiB/s      6.75 c/B
 ECB dec |      6.02 ns/B     158.3 MiB/s      6.94 c/B
 CBC enc |      2.44 ns/B     390.5 MiB/s      2.81 c/B
 CBC dec |      1.45 ns/B     656.4 MiB/s      1.67 c/B
 CFB enc |      2.39 ns/B     399.5 MiB/s      2.75 c/B
 CFB dec |      1.45 ns/B     656.8 MiB/s      1.67 c/B
 OFB enc |      7.81 ns/B     122.1 MiB/s      9.00 c/B
 OFB dec |      7.81 ns/B     122.1 MiB/s      9.00 c/B
 CTR enc |      1.57 ns/B     605.8 MiB/s      1.81 c/B
 CTR dec |      1.57 ns/B     605.9 MiB/s      1.81 c/B
 CCM enc |      4.07 ns/B     234.3 MiB/s      4.69 c/B
 CCM dec |      4.07 ns/B     234.1 MiB/s      4.69 c/B
CCM auth |      2.61 ns/B     365.7 MiB/s      3.00 c/B
 GCM enc |      2.50 ns/B     381.9 MiB/s      2.88 c/B
 GCM dec |      2.49 ns/B     382.3 MiB/s      2.87 c/B
GCM auth |     0.926 ns/B    1029.7 MiB/s      1.07 c/B
 OCB enc |      2.05 ns/B     465.6 MiB/s      2.36 c/B
 OCB dec |      2.06 ns/B     462.0 MiB/s      2.38 c/B
OCB auth |      1.74 ns/B     548.4 MiB/s      2.00 c/B

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>

Details

Provenance
jukivili authored on Jul 14 2016, 4:55 PM
Parents
rC962b15470663: Add ARMv8/AArch32 Crypto Extension implementation of GCM

Event Timeline

Jussi Kivilinna <jussi.kivilinna@iki.fi> committed rC05a4cecae0c0: Add ARMv8/AArch32 Crypto Extension implementation of AES (authored by Jussi Kivilinna <jussi.kivilinna@iki.fi>). Jul 14 2016, 4:55 PM