Add ARMv8/CE acceleration for AES-XTS

Authored by jukivili on Jan 20 2018, 9:05 PM.

Description

Add ARMv8/CE acceleration for AES-XTS

* cipher/rijndael-armv8-aarch32-ce.S (_gcry_aes_xts_enc_armv8_ce)
(_gcry_aes_xts_dec_armv8_ce): New.
* cipher/rijndael-armv8-aarch64-ce.S (_gcry_aes_xts_enc_armv8_ce)
(_gcry_aes_xts_dec_armv8_ce): New.
* cipher/rijndael-armv8-ce.c (_gcry_aes_xts_enc_armv8_ce)
(_gcry_aes_xts_dec_armv8_ce, xts_crypt_fn_t)
(_gcry_aes_armv8_ce_xts_crypt): New.
* cipher/rijndael.c (_gcry_aes_armv8_ce_xts_crypt): New.
(_gcry_aes_xts_crypt) [USE_ARM_CE]: New.
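
The entries above suggest a single C wrapper that picks the encrypt or decrypt assembly routine through the `xts_crypt_fn_t` function-pointer type. The following is only a minimal sketch of that dispatch pattern, not the committed code: the parameter list of `xts_crypt_fn_t`, the `aes_ctx_sketch` structure, the stub routines and `xts_crypt_sketch` are all assumptions made for illustration, and the stubs copy data unchanged instead of running AES.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define BLOCK_BYTES 16

/* Assumed shape of the function-pointer type; the real parameter list
   in rijndael-armv8-ce.c may differ. */
typedef void (*xts_crypt_fn_t) (const void *keysched, unsigned char *outbuf,
                                const unsigned char *inbuf,
                                unsigned char *tweak, size_t nblocks,
                                unsigned int nrounds);

/* Records which stub ran last: 1 = encrypt, 0 = decrypt, -1 = none. */
static int last_direction = -1;

/* Stub in place of the assembly encrypt routine: copies data unchanged. */
static void
xts_enc_stub (const void *keysched, unsigned char *outbuf,
              const unsigned char *inbuf, unsigned char *tweak,
              size_t nblocks, unsigned int nrounds)
{
  (void) keysched; (void) tweak; (void) nrounds;
  memmove (outbuf, inbuf, nblocks * BLOCK_BYTES);
  last_direction = 1;
}

/* Stub in place of the assembly decrypt routine: copies data unchanged. */
static void
xts_dec_stub (const void *keysched, unsigned char *outbuf,
              const unsigned char *inbuf, unsigned char *tweak,
              size_t nblocks, unsigned int nrounds)
{
  (void) keysched; (void) tweak; (void) nrounds;
  memmove (outbuf, inbuf, nblocks * BLOCK_BYTES);
  last_direction = 0;
}

/* Hypothetical context holding both key schedules and the round count. */
struct aes_ctx_sketch
{
  unsigned char keyschenc[240];
  unsigned char keyschdec[240];
  unsigned int rounds;
};

/* Select key schedule and routine by direction, then make one call
   covering all nblocks blocks. */
static void
xts_crypt_sketch (struct aes_ctx_sketch *ctx, unsigned char *tweak,
                  unsigned char *outbuf, const unsigned char *inbuf,
                  size_t nblocks, int encrypt)
{
  const void *keysched;
  xts_crypt_fn_t crypt_fn;

  assert (ctx != NULL);
  keysched = encrypt ? ctx->keyschenc : ctx->keyschdec;
  crypt_fn = encrypt ? xts_enc_stub : xts_dec_stub;
  crypt_fn (keysched, outbuf, inbuf, tweak, nblocks, ctx->rounds);
}
```

Giving both directions the same signature keeps the caller free of per-direction branches beyond the initial selection.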

Benchmark on Cortex-A53 (AArch64, 1152 MHz):

Before:
 AES            |  nanosecs/byte   mebibytes/sec   cycles/byte
        XTS enc |      4.88 ns/B     195.5 MiB/s      5.62 c/B
        XTS dec |      4.94 ns/B     192.9 MiB/s      5.70 c/B
                =
 AES192         |  nanosecs/byte   mebibytes/sec   cycles/byte
        XTS enc |      5.55 ns/B     171.8 MiB/s      6.39 c/B
        XTS dec |      5.61 ns/B     169.9 MiB/s      6.47 c/B
                =
 AES256         |  nanosecs/byte   mebibytes/sec   cycles/byte
        XTS enc |      6.22 ns/B     153.3 MiB/s      7.17 c/B
        XTS dec |      6.29 ns/B     151.7 MiB/s      7.24 c/B
                =

After (~2.6x faster):
 AES            |  nanosecs/byte   mebibytes/sec   cycles/byte
        XTS enc |      1.83 ns/B     520.9 MiB/s      2.11 c/B
        XTS dec |      1.82 ns/B     524.9 MiB/s      2.09 c/B
                =
 AES192         |  nanosecs/byte   mebibytes/sec   cycles/byte
        XTS enc |      1.97 ns/B     483.3 MiB/s      2.27 c/B
        XTS dec |      1.96 ns/B     486.9 MiB/s      2.26 c/B
                =
 AES256         |  nanosecs/byte   mebibytes/sec   cycles/byte
        XTS enc |      2.11 ns/B     450.9 MiB/s      2.44 c/B
        XTS dec |      2.10 ns/B     453.8 MiB/s      2.42 c/B
                =
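
The three columns in these tables are related by simple arithmetic: cycles/byte is nanoseconds/byte times the clock rate in GHz, and mebibytes/sec is the reciprocal of nanoseconds/byte scaled to 2^20-byte units. The helpers below are a small sketch for cross-checking the figures and the quoted speedup; the function names are hypothetical and are not part of the benchmark tool.

```c
#include <assert.h>

/* cycles/byte = (ns/byte) * (clock in GHz); clock given in MHz here. */
static double
cycles_per_byte (double ns_per_byte, double clock_mhz)
{
  assert (ns_per_byte > 0.0 && clock_mhz > 0.0);
  return ns_per_byte * clock_mhz / 1000.0;
}

/* MiB/s = (1e9 ns per second / ns per byte) / 2^20 bytes per MiB. */
static double
mebibytes_per_sec (double ns_per_byte)
{
  assert (ns_per_byte > 0.0);
  return 1e9 / ns_per_byte / (1024.0 * 1024.0);
}

/* Speedup factor between two ns/byte measurements. */
static double
speedup (double before_ns_per_byte, double after_ns_per_byte)
{
  assert (before_ns_per_byte > 0.0 && after_ns_per_byte > 0.0);
  return before_ns_per_byte / after_ns_per_byte;
}
```

For the AES XTS enc row, `cycles_per_byte (4.88, 1152)` gives about 5.62, matching the table, and `speedup (4.88, 1.83)` gives about 2.67. Recomputed MiB/s values can differ slightly from the printed ones because the table shows ns/B rounded to two decimals.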

Benchmark on Cortex-A53 (AArch32, 1152 MHz):

Before:
 AES            |  nanosecs/byte   mebibytes/sec   cycles/byte
        XTS enc |      6.52 ns/B     146.2 MiB/s      7.51 c/B
        XTS dec |      6.57 ns/B     145.2 MiB/s      7.57 c/B
                =
 AES192         |  nanosecs/byte   mebibytes/sec   cycles/byte
        XTS enc |      7.10 ns/B     134.3 MiB/s      8.18 c/B
        XTS dec |      7.11 ns/B     134.2 MiB/s      8.19 c/B
                =
 AES256         |  nanosecs/byte   mebibytes/sec   cycles/byte
        XTS enc |      7.30 ns/B     130.7 MiB/s      8.41 c/B
        XTS dec |      7.38 ns/B     129.3 MiB/s      8.50 c/B
                =

After (~2.7x faster):
 AES            |  nanosecs/byte   mebibytes/sec   cycles/byte
        XTS enc |      2.33 ns/B     409.6 MiB/s      2.68 c/B
        XTS dec |      2.35 ns/B     405.3 MiB/s      2.71 c/B
                =
 AES192         |  nanosecs/byte   mebibytes/sec   cycles/byte
        XTS enc |      2.53 ns/B     377.6 MiB/s      2.91 c/B
        XTS dec |      2.54 ns/B     375.5 MiB/s      2.93 c/B
                =
 AES256         |  nanosecs/byte   mebibytes/sec   cycles/byte
        XTS enc |      2.75 ns/B     346.8 MiB/s      3.17 c/B
        XTS dec |      2.76 ns/B     345.2 MiB/s      3.18 c/B
                =
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>

Details

Committed
jukivili on Jan 20 2018, 9:05 PM
Parents
rCc3d60acc3ab5: rijndael-ssse3: call assembly functions directly