poly1305: add fast addition macro for ppc64
* cipher/poly1305.c [USE_MPI_64BIT && __powerpc__] (ADD_1305_64): New.
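For readers unfamiliar with the operation, the following is a minimal portable sketch of a three-limb (3 x 64-bit) addition with carry propagation, which is the kind of step a macro named ADD_1305_64 would be expected to perform on the Poly1305 accumulator. The function name `add_1305_64_sketch` and its argument order are assumptions made for illustration; this is not the code added by this commit, which presumably relies on ppc64 carry-propagating add instructions rather than explicit comparisons.

```c
#include <stdint.h>

/* Hypothetical helper, not the libgcrypt macro: add the three-limb value
   (b2:b1:b0) into (a2:a1:a0), propagating carries between 64-bit limbs.
   Any carry out of the high limb is discarded. */
static inline void
add_1305_64_sketch (uint64_t *a2, uint64_t *a1, uint64_t *a0,
                    uint64_t b2, uint64_t b1, uint64_t b0)
{
  uint64_t carry0;
  uint64_t carry1;
  uint64_t t;

  /* Low limb: the sum wrapped around if it is smaller than the addend. */
  *a0 += b0;
  carry0 = (*a0 < b0);

  /* Middle limb: add b1, then the incoming carry.  At most one of the
     two additions can wrap, so OR-ing the two flags is sufficient. */
  t = *a1 + b1;
  carry1 = (t < b1);
  t += carry0;
  carry1 |= (t < carry0);
  *a1 = t;

  /* High limb: add b2 plus the carry out of the middle limb. */
  *a2 += b2 + carry1;
}
```

A compiler may or may not turn the comparisons above into a single hardware carry chain, which is one plausible reason for providing an architecture-specific macro.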
Benchmark on POWER8 (ppc64le, ~3.8GHz):

Before:
          |  nanosecs/byte   mebibytes/sec   cycles/byte
 POLY1305 |     0.547 ns/B      1742 MiB/s      2.08 c/B

After (~8% faster):
          |  nanosecs/byte   mebibytes/sec   cycles/byte
 POLY1305 |     0.502 ns/B      1901 MiB/s      1.91 c/B
Benchmark on POWER9 (ppc64le, ~3.8GHz):

Before:
          |  nanosecs/byte   mebibytes/sec   cycles/byte
 POLY1305 |     0.493 ns/B      1934 MiB/s      1.87 c/B

After (~7% faster):
          |  nanosecs/byte   mebibytes/sec   cycles/byte
 POLY1305 |     0.459 ns/B      2077 MiB/s      1.74 c/B
GnuPG-bug-id: T4460
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>