Add Aarch64 assembly implementation of Camellia
de73a2e7237b (Unpublished)

Not On Permanent Ref: This commit is not an ancestor of any permanent ref.

Description

Add Aarch64 assembly implementation of Camellia

* cipher/Makefile.am: Add 'camellia-aarch64.S'.
* cipher/camellia-aarch64.S: New.
* cipher/camellia-glue.c [USE_ARM_ASM][__aarch64__]: Set stack burn
size to zero.
* cipher/camellia.h: Enable USE_ARM_ASM if __AARCH64EL__ and
HAVE_COMPATIBLE_GCC_AARCH64_PLATFORM_AS defined.
* configure.ac [host=aarch64]: Add 'camellia-aarch64.lo'.

Patch adds ARMv8/Aarch64 implementation of Camellia.

Benchmark on Cortex-A53 (1152 MHz):

Before:
CAMELLIA128 | nanosecs/byte mebibytes/sec cycles/byte

 ECB enc |     39.71 ns/B     24.01 MiB/s     45.75 c/B
 ECB dec |     39.72 ns/B     24.01 MiB/s     45.75 c/B
 CBC enc |     40.80 ns/B     23.38 MiB/s     47.00 c/B
 CBC dec |     39.66 ns/B     24.05 MiB/s     45.69 c/B
 CFB enc |     40.69 ns/B     23.44 MiB/s     46.88 c/B
 CFB dec |     39.66 ns/B     24.05 MiB/s     45.69 c/B
 OFB enc |     40.69 ns/B     23.44 MiB/s     46.88 c/B
 OFB dec |     40.69 ns/B     23.44 MiB/s     46.88 c/B
 CTR enc |     39.88 ns/B     23.91 MiB/s     45.94 c/B
 CTR dec |     39.88 ns/B     23.91 MiB/s     45.94 c/B
 CCM enc |     79.97 ns/B     11.92 MiB/s     92.13 c/B
 CCM dec |     79.97 ns/B     11.93 MiB/s     92.13 c/B
CCM auth |     40.20 ns/B     23.72 MiB/s     46.31 c/B
 GCM enc |     41.18 ns/B     23.16 MiB/s     47.44 c/B
 GCM dec |     41.18 ns/B     23.16 MiB/s     47.44 c/B
GCM auth |      1.30 ns/B     732.7 MiB/s      1.50 c/B
 OCB enc |     42.04 ns/B     22.69 MiB/s     48.43 c/B
 OCB dec |     42.03 ns/B     22.69 MiB/s     48.42 c/B
OCB auth |     41.38 ns/B     23.05 MiB/s     47.67 c/B
         =

CAMELLIA256 | nanosecs/byte mebibytes/sec cycles/byte

 ECB enc |     52.36 ns/B     18.22 MiB/s     60.31 c/B
 ECB dec |     52.36 ns/B     18.22 MiB/s     60.31 c/B
 CBC enc |     53.39 ns/B     17.86 MiB/s     61.50 c/B
 CBC dec |     52.14 ns/B     18.29 MiB/s     60.06 c/B
 CFB enc |     53.28 ns/B     17.90 MiB/s     61.38 c/B
 CFB dec |     52.14 ns/B     18.29 MiB/s     60.06 c/B
 OFB enc |     53.17 ns/B     17.94 MiB/s     61.25 c/B
 OFB dec |     53.17 ns/B     17.94 MiB/s     61.25 c/B
 CTR enc |     52.36 ns/B     18.21 MiB/s     60.32 c/B
 CTR dec |     52.36 ns/B     18.21 MiB/s     60.32 c/B
 CCM enc |     105.0 ns/B      9.08 MiB/s     120.9 c/B
 CCM dec |     105.0 ns/B      9.08 MiB/s     120.9 c/B
CCM auth |     52.74 ns/B     18.08 MiB/s     60.75 c/B
 GCM enc |     53.66 ns/B     17.77 MiB/s     61.81 c/B
 GCM dec |     53.66 ns/B     17.77 MiB/s     61.82 c/B
GCM auth |      1.30 ns/B     732.3 MiB/s      1.50 c/B
 OCB enc |     54.54 ns/B     17.49 MiB/s     62.83 c/B
 OCB dec |     54.48 ns/B     17.50 MiB/s     62.77 c/B
OCB auth |     53.89 ns/B     17.70 MiB/s     62.09 c/B
         =
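
The three columns in these tables appear to be related through the 1152 MHz clock rate quoted above. The following is a minimal sketch of that arithmetic, not the benchmark tool's own code; the function names are hypothetical and the formulas are an assumption inferred from the printed figures.

```python
# Sketch: how the three benchmark columns appear to relate to each other.
# Assumption: cycles/byte is derived from ns/byte and the 1152 MHz clock
# quoted in the commit message; the helper names below are illustrative only.

CLOCK_MHZ = 1152  # Cortex-A53 clock rate stated in the commit message


def mib_per_sec(ns_per_byte):
    """Convert nanoseconds per byte to mebibytes per second."""
    bytes_per_sec = 1e9 / ns_per_byte
    return bytes_per_sec / 2**20


def cycles_per_byte(ns_per_byte, clock_mhz=CLOCK_MHZ):
    """Convert nanoseconds per byte to CPU cycles per byte."""
    return ns_per_byte * clock_mhz / 1000.0


if __name__ == "__main__":
    # ns/B values taken from the "Before" ECB enc rows above.
    for label, ns in [("CAMELLIA128 ECB enc", 39.71),
                      ("CAMELLIA256 ECB enc", 52.36)]:
        print(f"{label}: {mib_per_sec(ns):.2f} MiB/s, "
              f"{cycles_per_byte(ns):.2f} c/B")
```

The computed values agree with the table to within rounding of the printed ns/B figures.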

After (~1.7x faster):
CAMELLIA128 | nanosecs/byte mebibytes/sec cycles/byte

 ECB enc |     22.25 ns/B     42.87 MiB/s     25.63 c/B
 ECB dec |     22.25 ns/B     42.87 MiB/s     25.63 c/B
 CBC enc |     23.27 ns/B     40.97 MiB/s     26.81 c/B
 CBC dec |     22.14 ns/B     43.08 MiB/s     25.50 c/B
 CFB enc |     23.17 ns/B     41.17 MiB/s     26.69 c/B
 CFB dec |     22.14 ns/B     43.08 MiB/s     25.50 c/B
 OFB enc |     23.11 ns/B     41.26 MiB/s     26.63 c/B
 OFB dec |     23.11 ns/B     41.26 MiB/s     26.63 c/B
 CTR enc |     22.36 ns/B     42.65 MiB/s     25.76 c/B
 CTR dec |     22.36 ns/B     42.65 MiB/s     25.76 c/B
 CCM enc |     44.87 ns/B     21.26 MiB/s     51.69 c/B
 CCM dec |     44.87 ns/B     21.25 MiB/s     51.69 c/B
CCM auth |     22.62 ns/B     42.15 MiB/s     26.06 c/B
 GCM enc |     23.66 ns/B     40.31 MiB/s     27.25 c/B
 GCM dec |     23.66 ns/B     40.31 MiB/s     27.25 c/B
GCM auth |      1.30 ns/B     732.0 MiB/s      1.50 c/B
 OCB enc |     24.32 ns/B     39.21 MiB/s     28.02 c/B
 OCB dec |     24.32 ns/B     39.21 MiB/s     28.02 c/B
OCB auth |     23.75 ns/B     40.15 MiB/s     27.36 c/B
         =

CAMELLIA256 | nanosecs/byte mebibytes/sec cycles/byte

 ECB enc |     29.08 ns/B     32.79 MiB/s     33.50 c/B
 ECB dec |     29.19 ns/B     32.67 MiB/s     33.63 c/B
 CBC enc |     30.11 ns/B     31.67 MiB/s     34.69 c/B
 CBC dec |     29.05 ns/B     32.83 MiB/s     33.47 c/B
 CFB enc |     30.00 ns/B     31.79 MiB/s     34.56 c/B
 CFB dec |     28.97 ns/B     32.91 MiB/s     33.38 c/B
 OFB enc |     29.95 ns/B     31.84 MiB/s     34.50 c/B
 OFB dec |     29.95 ns/B     31.84 MiB/s     34.50 c/B
 CTR enc |     29.19 ns/B     32.67 MiB/s     33.63 c/B
 CTR dec |     29.19 ns/B     32.67 MiB/s     33.63 c/B
 CCM enc |     58.54 ns/B     16.29 MiB/s     67.43 c/B
 CCM dec |     58.54 ns/B     16.29 MiB/s     67.44 c/B
CCM auth |     29.46 ns/B     32.37 MiB/s     33.94 c/B
 GCM enc |     30.49 ns/B     31.28 MiB/s     35.12 c/B
 GCM dec |     30.49 ns/B     31.27 MiB/s     35.13 c/B
GCM auth |      1.30 ns/B     731.6 MiB/s      1.50 c/B
 OCB enc |     31.16 ns/B     30.61 MiB/s     35.90 c/B
 OCB dec |     31.22 ns/B     30.55 MiB/s     35.96 c/B
OCB auth |     30.59 ns/B     31.18 MiB/s     35.24 c/B
         =
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
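
The "~1.7x faster" figure can be checked against the tables by dividing the "Before" cycles/byte by the "After" cycles/byte. This is a small sketch using a handful of rows copied from the tables above; the row selection and the names `BEFORE`, `AFTER` and `speedups` are illustrative choices, not part of the commit.

```python
# Sketch: speedup per mode, computed as before / after cycles-per-byte.
# The values are copied from the benchmark tables in this commit message;
# only a few representative rows are included.

BEFORE = {
    ("CAMELLIA128", "ECB enc"): 45.75,
    ("CAMELLIA128", "GCM enc"): 47.44,
    ("CAMELLIA128", "OCB enc"): 48.43,
    ("CAMELLIA256", "ECB enc"): 60.31,
    ("CAMELLIA256", "GCM enc"): 61.81,
    ("CAMELLIA256", "OCB enc"): 62.83,
}

AFTER = {
    ("CAMELLIA128", "ECB enc"): 25.63,
    ("CAMELLIA128", "GCM enc"): 27.25,
    ("CAMELLIA128", "OCB enc"): 28.02,
    ("CAMELLIA256", "ECB enc"): 33.50,
    ("CAMELLIA256", "GCM enc"): 35.12,
    ("CAMELLIA256", "OCB enc"): 35.90,
}


def speedups(before, after):
    """Return {row: before/after} for every row present in both tables."""
    return {row: before[row] / after[row] for row in before if row in after}


if __name__ == "__main__":
    for (cipher, mode), ratio in speedups(BEFORE, AFTER).items():
        print(f"{cipher} {mode}: {ratio:.2f}x")
```

For the rows sampled here the ratios fall between roughly 1.7 and 1.8, which is consistent with the headline figure.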

Details

Provenance: jukivili, authored on Apr 27 2016, 5:18 PM
Parents: rC4cd8d40d6985: Add ARMv8/AArch64 Crypto Extension implementation of AES
Branches: Unknown
Tags: Unknown

Event Timeline

Sep 5 2016, 7:08 PM: Jussi Kivilinna <jussi.kivilinna@iki.fi> committed rCde73a2e7237b: Add Aarch64 assembly implementation of Camellia (authored by Jussi Kivilinna <jussi.kivilinna@iki.fi>).