Optimizations for generic table-based GCM implementations

Description

* cipher/cipher-gcm.c [GCM_TABLES_USE_U64] (do_fillM): Precalculate
M[32..63] values.
[GCM_TABLES_USE_U64] (do_ghash): Split processing of two 64-bit halves
of the input to two separate loops; use precalculated M[] values.
[GCM_USE_TABLES && !GCM_TABLES_USE_U64] (do_fillM): Precalculate
M[64..127] values.
[GCM_USE_TABLES && !GCM_TABLES_USE_U64] (do_ghash): Use precalculated
M[] values.
[GCM_USE_TABLES] (bshift): Avoid conditional execution for mask
calculation.
* cipher/cipher-internal.h (gcry_cipher_handle): Double gcm_table size.
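
The bshift change above (mask calculation without conditional execution)
can be illustrated with a minimal sketch. This is not the code from
cipher/cipher-gcm.c: the function names are hypothetical, and the layout
(a 128-bit value held as two 64-bit words, with the GCM reduction byte
0xe1 folded into the top byte of the high word) is an assumption made
for illustration.

```c
#include <stdint.h>

/* Hypothetical names throughout; not the identifiers used in libgcrypt. */

/* GCM reduction constant (bits 11100001), applied to the top byte. */
#define GCM_REDUCE_TOP_BYTE 0xe1u

/* Mask selected with a conditional; a compiler may emit a branch here. */
static uint64_t mask_branching (uint64_t low)
{
  return (low & 1) ? ((uint64_t)GCM_REDUCE_TOP_BYTE << 56) : 0;
}

/* Same mask without a conditional: 0 - (low & 1) is all-ones when the
   lowest bit is set and zero otherwise, so the AND keeps or clears 0xe1. */
static uint64_t mask_branchless (uint64_t low)
{
  uint64_t all = (uint64_t)0 - (low & 1);
  return (all & GCM_REDUCE_TOP_BYTE) << 56;
}

/* Shift the 128-bit value hi:lo right by one bit and fold the bit that
   was shifted out back into the high word through the mask. */
static void shift_right_reduce (uint64_t *hi, uint64_t *lo)
{
  uint64_t h = *hi;
  uint64_t l = *lo;
  uint64_t mask = mask_branchless (l);

  *lo = (l >> 1) | (h << 63);
  *hi = (h >> 1) ^ mask;
}
```

Both mask functions return the same value for every input; the
branchless form only removes the data-dependent conditional from the
shift routine.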

Benchmark on Intel Haswell (amd64, --disable-hwf all):

Before:

                   |  nanosecs/byte   mebibytes/sec   cycles/byte  auto Mhz
GMAC_AES           |      2.79 ns/B     341.3 MiB/s     11.17 c/B      3998

After (~36% faster):

                   |  nanosecs/byte   mebibytes/sec   cycles/byte  auto Mhz
GMAC_AES           |      2.05 ns/B     464.7 MiB/s      8.20 c/B      3998

Benchmark on Intel Haswell (win32, --disable-hwf all):

Before:

                   |  nanosecs/byte   mebibytes/sec   cycles/byte  auto Mhz
GMAC_AES           |      4.90 ns/B     194.8 MiB/s     19.57 c/B      3997

After (~36% faster):

                   |  nanosecs/byte   mebibytes/sec   cycles/byte  auto Mhz
GMAC_AES           |      3.58 ns/B     266.4 MiB/s     14.31 c/B      3999

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>

Details

Provenance
jukivili authored on Apr 27 2019, 9:03 PM
Parents
rCaf5f3fb08674: Optimizations for GCM Intel/PCLMUL implementation