crc-intel-pclmul: add AVX2 and AVX512 code paths

Description

* cipher/crc-intel-pclmul.c (crc32_consts_s, crc32_consts)
(crc24rfc2440_consts): Add k_ymm and k_zmm.
(crc32_reflected_bulk, crc32_bulk): Add VPCLMUL+AVX2 and VAES_VPCLMUL+AVX512
code paths; Add 'hwfeatures' parameter.
(_gcry_crc32_intel_pclmul, _gcry_crc24rfc2440_intel_pclmul): Add 'hwfeatures'
parameter.
* cipher/crc.c (CRC_CONTEXT) [USE_INTEL_PCLMUL]: Add 'hwfeatures'.
(_gcry_crc32_intel_pclmul, _gcry_crc24rfc2440_intel_pclmul): Add 'hwfeatures'
parameter.
(crc32_init, crc32rfc1510_init, crc24rfc2440_init) [USE_INTEL_PCLMUL]: Store
HW features to context.
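
The sketch below illustrates the general pattern these entries describe: the detected hardware features are stored in the context at init time, and the bulk routine later uses them to choose a code path. It is not the libgcrypt implementation. Every `demo_`/`DEMO_` name is hypothetical, the flag values are made up for illustration, and the real code may apply further conditions (such as input length) that are not modelled here.

```c
#include <stddef.h>

/* Hypothetical feature bits for this sketch only; libgcrypt has its own
   HWF_* definitions with different values. */
#define DEMO_HWF_PCLMUL       (1u << 0)
#define DEMO_HWF_AVX2         (1u << 1)
#define DEMO_HWF_AVX512       (1u << 2)
#define DEMO_HWF_VAES_VPCLMUL (1u << 3)

/* Code paths named after the register width they would use. */
enum demo_crc_path
{
  DEMO_PATH_GENERIC,  /* no carry-less multiply available */
  DEMO_PATH_XMM,      /* 128-bit PCLMUL path */
  DEMO_PATH_YMM,      /* 256-bit VPCLMUL+AVX2 path */
  DEMO_PATH_ZMM       /* 512-bit VAES_VPCLMUL+AVX512 path */
};

/* Hypothetical stand-in for a CRC context carrying 'hwfeatures'. */
struct demo_crc_context
{
  unsigned int crc;
  unsigned int hwfeatures;
};

/* Store the detected features once, at init time. */
static void
demo_crc_init (struct demo_crc_context *ctx, unsigned int detected_hwf)
{
  ctx->crc = 0;
  ctx->hwfeatures = detected_hwf;
}

/* Pick the widest path that the stored features allow. */
static enum demo_crc_path
demo_crc_select_path (const struct demo_crc_context *ctx)
{
  unsigned int hwf = ctx->hwfeatures;

  if (!(hwf & DEMO_HWF_PCLMUL))
    return DEMO_PATH_GENERIC;
  if ((hwf & DEMO_HWF_VAES_VPCLMUL) && (hwf & DEMO_HWF_AVX512))
    return DEMO_PATH_ZMM;
  if ((hwf & DEMO_HWF_VAES_VPCLMUL) && (hwf & DEMO_HWF_AVX2))
    return DEMO_PATH_YMM;
  return DEMO_PATH_XMM;
}
```

Keeping the feature word in the context means the selection can be made per call without querying the CPU again, which appears to be the reason for the added 'hwfeatures' parameter.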

Benchmark on Zen4:

Before:

              |  nanosecs/byte   mebibytes/sec   cycles/byte  auto Mhz
 CRC32        |     0.046 ns/B     20861 MiB/s     0.248 c/B    5421±1
 CRC32RFC1510 |     0.046 ns/B     20809 MiB/s     0.250 c/B   5463±14
 CRC24RFC2440 |     0.046 ns/B     20934 MiB/s     0.251 c/B    5504±2

After AVX2:

              |  nanosecs/byte   mebibytes/sec   cycles/byte  auto Mhz
 CRC32        |     0.023 ns/B     42277 MiB/s     0.123 c/B    5440±6
 CRC32RFC1510 |     0.022 ns/B     42949 MiB/s     0.121 c/B   5454±16
 CRC24RFC2440 |     0.023 ns/B     41955 MiB/s     0.124 c/B   5439±13

After AVX512:

              |  nanosecs/byte   mebibytes/sec   cycles/byte  auto Mhz
 CRC32        |     0.011 ns/B     85877 MiB/s     0.061 c/B      5500
 CRC32RFC1510 |     0.011 ns/B     83898 MiB/s     0.063 c/B      5500
 CRC24RFC2440 |     0.012 ns/B     80590 MiB/s     0.065 c/B      5500
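
As a rough reading of the figures above, the relative speedup of each new path is the ratio of its throughput to the "Before" throughput. The snippet below is only an illustrative calculation over the MiB/s values quoted in this message. The `demo_` names are hypothetical and not part of libgcrypt or its benchmark tool.

```c
#include <stddef.h>

/* Throughput figures (MiB/s) copied from the benchmark tables above. */
struct demo_bench_row
{
  const char *name;
  double before_mibs;
  double avx2_mibs;
  double avx512_mibs;
};

static const struct demo_bench_row demo_rows[] = {
  { "CRC32",        20861.0, 42277.0, 85877.0 },
  { "CRC32RFC1510", 20809.0, 42949.0, 83898.0 },
  { "CRC24RFC2440", 20934.0, 41955.0, 80590.0 },
};

static const size_t demo_row_count = sizeof demo_rows / sizeof demo_rows[0];

/* Speedup expressed as new throughput divided by old throughput. */
static double
demo_speedup (double before_mibs, double after_mibs)
{
  return after_mibs / before_mibs;
}
```

For CRC32 this gives roughly 2.0x for the AVX2 path and roughly 4.1x for the AVX512 path, and the other two algorithms land close to the same ratios.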

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>

Details

Provenance
jukivili authored on Sun, Aug 3, 12:49 PM
Parents
rC0c2d120e1124: poly1305-p10le: use '.rodata' section for read-only data