Add PowerPC vpmsum implementation of CRC
* cipher/Makefile.am: Add 'crc-ppc.c'.
* cipher/crc-armv8-ce.c: Remove 'USE_INTEL_PCLMUL' comment.
* cipher/crc-ppc.c: New.
* cipher/crc.c (USE_PPC_VPMSUM): New.
(CRC_CONTEXT): Add 'use_vpmsum'.
(_gcry_crc32_ppc8_vpmsum, _gcry_crc24rfc2440_ppc8_vpmsum): New.
(crc32_init, crc24rfc2440_init): Add HWF check for 'use_vpmsum'.
(crc32_write, crc24rfc2440_write): Add 'use_vpmsum' code-path.
* configure.ac: Add 'vpmsumd' instruction to PowerPC VSX inline
assembly check; Add 'crc-ppc.lo'.
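The init/write changes listed above follow a common runtime-dispatch pattern: the init routine records once whether the accelerated path is usable, and the write routine branches on that flag. The following is a minimal, self-contained sketch of that pattern only, not the code from this commit. All names ending in `_sketch` or `_stub`, and the `HWF_SKETCH_VPMSUM` flag, are hypothetical; the "accelerated" routine is a stand-in that simply calls the portable bitwise CRC-32 so the example runs on any machine.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical feature bit; the real hardware-feature flags are not shown here. */
#define HWF_SKETCH_VPMSUM (1u << 0)

typedef struct
{
  uint32_t crc;
  unsigned int use_vpmsum : 1; /* mirrors the 'use_vpmsum' field named above */
} crc_ctx_sketch;

/* Portable bitwise CRC-32 (reflected polynomial 0xEDB88320). */
static uint32_t
crc32_generic (uint32_t crc, const unsigned char *buf, size_t len)
{
  while (len--)
    {
      crc ^= *buf++;
      for (int k = 0; k < 8; k++)
        crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
    }
  return crc;
}

/* Stand-in for an accelerated routine (assumption: same contract as the
 * generic one). It just calls the generic code so the sketch is portable. */
static uint32_t
crc32_accel_stub (uint32_t crc, const unsigned char *buf, size_t len)
{
  return crc32_generic (crc, buf, len);
}

/* Record the feature check once, at init time. */
static void
crc32_init_sketch (crc_ctx_sketch *ctx, unsigned int hwf)
{
  assert (ctx != NULL);
  ctx->crc = 0xFFFFFFFFu;
  ctx->use_vpmsum = (hwf & HWF_SKETCH_VPMSUM) ? 1 : 0;
}

/* Select the code path per call, based on the flag set at init. */
static void
crc32_write_sketch (crc_ctx_sketch *ctx, const void *data, size_t len)
{
  assert (ctx != NULL);
  if (ctx->use_vpmsum)
    ctx->crc = crc32_accel_stub (ctx->crc, data, len);
  else
    ctx->crc = crc32_generic (ctx->crc, data, len);
}

/* Apply the final XOR to obtain the CRC-32 value. */
static uint32_t
crc32_final_sketch (const crc_ctx_sketch *ctx)
{
  return ctx->crc ^ 0xFFFFFFFFu;
}
```

Both paths must produce identical results for the same input, however the input is split across write calls. The standard CRC-32 check value for the ASCII string "123456789" is 0xCBF43926, which makes a convenient self-test for either path.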
Benchmark on POWER8 (ppc64le, ~3.8GHz):
Before:
              |  nanosecs/byte   mebibytes/sec   cycles/byte
 CRC32        |     0.978 ns/B     975.0 MiB/s      3.72 c/B
 CRC24RFC2440 |     0.974 ns/B     978.8 MiB/s      3.70 c/B

After (~22x faster):
              |  nanosecs/byte   mebibytes/sec   cycles/byte
 CRC32        |     0.044 ns/B     21878 MiB/s     0.166 c/B
 CRC24RFC2440 |     0.043 ns/B     22077 MiB/s     0.164 c/B
Benchmark on POWER9 (ppc64le, ~3.8GHz):
Before:
              |  nanosecs/byte   mebibytes/sec   cycles/byte
 CRC32        |      1.01 ns/B     943.7 MiB/s      3.84 c/B
 CRC24RFC2440 |     0.993 ns/B     960.6 MiB/s      3.77 c/B

After (~20x faster):
              |  nanosecs/byte   mebibytes/sec   cycles/byte
 CRC32        |     0.046 ns/B     20675 MiB/s     0.175 c/B
 CRC24RFC2440 |     0.048 ns/B     19691 MiB/s     0.184 c/B
- GnuPG-bug-id: T4460
- Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>