mpi/amd64: optimize add_n and sub_n

Description

* mpi/amd64/mpih-add1.S (_gcry_mpih_add_n): New implementation
with 4x unrolled fast-path loop.
* mpi/amd64/mpih-sub1.S (_gcry_mpih_sub_n): Likewise.
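As a rough illustration of what a 4x unrolled fast-path loop looks like for limb-wise addition, here is a minimal portable C sketch. It is not the assembly from this commit: the names `limb_t`, `add_limb` and `sketch_add_n` are hypothetical, and handling the `n % 4` leftover limbs before the unrolled loop is an assumption made for the sketch. The real AMD64 code would typically keep the carry in the CPU flags via `adc` instead of computing it by comparison.

```c
#include <stddef.h>
#include <stdint.h>

typedef uint64_t limb_t;  /* hypothetical stand-in for a 64-bit MPI limb */

/* Add two limbs plus the incoming carry (0 or 1); store the outgoing carry. */
static inline limb_t add_limb(limb_t a, limb_t b, limb_t *carry)
{
    limb_t s = a + b;
    limb_t c1 = s < a;        /* wrapped on a + b */
    limb_t r = s + *carry;
    limb_t c2 = r < s;        /* wrapped on adding the carry */
    *carry = c1 | c2;         /* at most one of the two can be set */
    return r;
}

/* res[0..n-1] = s1[0..n-1] + s2[0..n-1]; returns the final carry (0 or 1). */
limb_t sketch_add_n(limb_t *res, const limb_t *s1, const limb_t *s2, size_t n)
{
    limb_t carry = 0;
    size_t i = 0;

    /* Slow path: consume the n % 4 leftover limbs one at a time. */
    for (size_t rem = n & 3; i < rem; i++)
        res[i] = add_limb(s1[i], s2[i], &carry);

    /* Fast path: the remaining count is a multiple of 4, so each
       iteration handles four limbs with a single loop test and branch. */
    for (; i < n; i += 4) {
        res[i + 0] = add_limb(s1[i + 0], s2[i + 0], &carry);
        res[i + 1] = add_limb(s1[i + 1], s2[i + 1], &carry);
        res[i + 2] = add_limb(s1[i + 2], s2[i + 2], &carry);
        res[i + 3] = add_limb(s1[i + 3], s2[i + 3], &carry);
    }

    return carry;
}
```

The subtraction variant would have the same shape, with a borrow propagated in place of the carry.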

Benchmark on AMD Ryzen 9 7900X:

Before:

     |  nanosecs/byte   mebibytes/sec   cycles/byte  auto Mhz
 add |     0.035 ns/B     27559 MiB/s     0.163 c/B      4700
 sub |     0.034 ns/B     28332 MiB/s     0.158 c/B      4700

After (~26% faster):

     |  nanosecs/byte   mebibytes/sec   cycles/byte  auto Mhz
 add |     0.027 ns/B     35271 MiB/s     0.127 c/B      4700
 sub |     0.027 ns/B     35206 MiB/s     0.127 c/B      4700

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>

Details

Provenance
jukivili authored on Apr 16 2023, 8:45 PM
Parents
rC3e17e819a6a4: mpi/amd64: fix use of 'movd' for 64-bit register move in lshift&rshift