mpi/ec: add fast reduction functions for NIST curves

* configure.ac (ASM_DISABLED): New.
* mpi/Makefile.am: Add 'ec-nist.c' and 'ec-inline.h'.
* mpi/ec-nist.c: New.
* mpi/ec-inline.h: New.
* mpi/ec-internal.h (_gcry_mpi_ec_nist192_mod)
(_gcry_mpi_ec_nist224_mod, _gcry_mpi_ec_nist256_mod)
(_gcry_mpi_ec_nist384_mod, _gcry_mpi_ec_nist521_mod): New.
* mpi/ec.c (ec_addm, ec_subm, ec_mulm, ec_mul2): Use 'ctx->mod'.
(field_table): Add 'mod' function; Add NIST reduction functions.
(ec_p_init): Setup ctx->mod; Setup function pointers from field_table
only if pointer is not NULL; Resize ctx->a and ctx->b only if set.
* mpi/mpi-internal.h (RESIZE_AND_CLEAR_IF_NEEDED): New.
* mpi/mpiutil.c (_gcry_mpi_resize): Clear all unused limbs also in
realloc case.
* src/ec-context.h (mpi_ec_ctx_s): Add 'mod' function.

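
For readers unfamiliar with why NIST primes allow a "fast reduction", the following is a minimal Python sketch of the standard word-folding reduction for P-192, where p = 2^192 - 2^64 - 1. It is an illustration only and not the code added in mpi/ec-nist.c, which is written in C and operates on limbs; the function name `nist192_mod_sketch` is hypothetical. The idea is that 2^192 is congruent to 2^64 + 1 modulo p, so the upper 64-bit words of a double-width product can be folded into the lower three words with additions instead of a general division.

```python
# Illustrative sketch only; names here are hypothetical and do not
# correspond to functions in libgcrypt.
P192 = 2**192 - 2**64 - 1
MASK64 = 2**64 - 1


def nist192_mod_sketch(c):
    """Reduce 0 <= c < 2**384 modulo P192 using shifts, masks and adds."""
    # Split the input into six 64-bit words c0 (least significant) .. c5.
    w = [(c >> (64 * i)) & MASK64 for i in range(6)]

    def join(hi, mid, lo):
        # Assemble a 192-bit value from three 64-bit words.
        return (hi << 128) | (mid << 64) | lo

    # Fold the high words using 2**192 == 2**64 + 1 (mod P192).
    s1 = join(w[2], w[1], w[0])
    s2 = join(0, w[3], w[3])
    s3 = join(w[4], w[4], 0)
    s4 = join(w[5], w[5], w[5])

    # The sum is below 4 * 2**192, so only a few subtractions are needed.
    r = s1 + s2 + s3 + s4
    while r >= P192:
        r -= P192
    return r


if __name__ == "__main__":
    a = P192 - 1
    b = P192 - 2
    # Compare against Python's generic big-integer modulo.
    print(nist192_mod_sketch(a * b) == (a * b) % P192)
```

The final loop is written for clarity; it is not constant-time and says nothing about how the C implementation performs its final correction.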
Benchmark on AMD Ryzen 7 5800X (x86_64):
Before:
NIST-P192 | nanosecs/iter cycles/iter auto Mhz
mult      |        283346     1369473     4833
keygen    |       1688442     8185744     4848
sign      |        549683     2662984     4845
verify    |        615284     2984325     4850

NIST-P224 | nanosecs/iter cycles/iter auto Mhz
mult      |        516443     2501173     4843
keygen    |       2859746    13866802     4849
sign      |        918472     4455043     4850
verify    |       1057940     5131372     4850

NIST-P256 | nanosecs/iter cycles/iter auto Mhz
mult      |        423536     2054040     4850
keygen    |       2383097    11557572     4850
sign      |        774346     3754243     4848
verify    |        864934     4196315     4852

NIST-P384 | nanosecs/iter cycles/iter auto Mhz
mult      |        929985     4511881     4852
keygen    |       5230788    25367299     4850
sign      |       1671432     8109726     4852
verify    |       1902729     9228568     4850

NIST-P521 | nanosecs/iter cycles/iter auto Mhz
mult      |       2123546    10300952     4851
keygen    |      12019340    58297774     4850
sign      |       3886988    18853054     4850
verify    |       4507885    21864015     4850

After:
NIST-P192 | nanosecs/iter cycles/iter auto Mhz speed-up
mult      |        186679      905603     4851     +51%
keygen    |       1161423     5623822     4842     +46%
sign      |        389531     1887557     4846     +41%
verify    |        412936     2000461     4844     +49%

NIST-P224 | nanosecs/iter cycles/iter auto Mhz speed-up
mult      |        260621     1256327     4821     +99%
keygen    |       1557845     7531677     4835     +84%
sign      |        521678     2527083     4844     +76%
verify    |        554084     2677949     4833     +92%

NIST-P256 | nanosecs/iter cycles/iter auto Mhz speed-up
mult      |        319045     1542061     4833     +33%
keygen    |       1834822     8898950     4850     +30%
sign      |        612866     2972630     4850     +26%
verify    |        664821     3222597     4847     +30%

NIST-P384 | nanosecs/iter cycles/iter auto Mhz speed-up
mult      |        593894     2875260     4841     +57%
keygen    |       3526600    17089717     4846     +48%
sign      |       1178098     5710151     4847     +42%
verify    |       1260185     6107449     4846     +51%

NIST-P521 | nanosecs/iter cycles/iter auto Mhz speed-up
mult      |       1160220     5621946     4846     +83%
keygen    |       6862975    33247351     4844     +75%
sign      |       2287366    11096711     4851     +70%
verify    |       2455858    11888045     4841     +84%

Benchmark on AMD Ryzen 7 5800X (i386):
Before:
NIST-P192 | nanosecs/iter cycles/iter auto Mhz
mult      |        648039     3143236     4850
keygen    |       3554452    17244822     4852
sign      |       1163173     5641932     4850
verify    |       1300076     6305673     4850

NIST-P224 | nanosecs/iter cycles/iter auto Mhz
mult      |        798607     3874405     4851
keygen    |       4657604    22589864     4850
sign      |       1515803     7352049     4850
verify    |       1635470     7935373     4852

NIST-P256 | nanosecs/iter cycles/iter auto Mhz
mult      |        927033     4496283     4850
keygen    |       5313601    25771983     4850
sign      |       1735795     8418514     4850
verify    |       1945804     9438212     4851

NIST-P384 | nanosecs/iter cycles/iter auto Mhz
mult      |       2301781    11164473     4850
keygen    |      12856001    62353242     4850
sign      |       4161041    20180651     4850
verify    |       4705961    22827478     4851

NIST-P521 | nanosecs/iter cycles/iter auto Mhz
mult      |       6066635    29422721     4850
keygen    |      32995868   160046407     4850
sign      |      10503306    50945387     4850
verify    |      12225252    59294323     4850

After:
NIST-P192 | nanosecs/iter cycles/iter auto Mhz speed-up
mult      |        413605     2007498     4854     +57%
keygen    |       2479429    12010926     4844     +44%
sign      |        825111     3997147     4844     +41%
verify    |        890206     4318723     4851     +46%

NIST-P224 | nanosecs/iter cycles/iter auto Mhz speed-up
mult      |        551703     2676454     4851     +45%
keygen    |       3257022    15781844     4845     +43%
sign      |       1085678     5258894     4844     +40%
verify    |       1172195     5678499     4844     +40%

NIST-P256 | nanosecs/iter cycles/iter auto Mhz speed-up
mult      |        720395     3497486     4855     +29%
keygen    |       4217758    20461257     4851     +26%
sign      |       1404350     6814131     4852     +24%
verify    |       1515136     7353955     4854     +28%

NIST-P384 | nanosecs/iter cycles/iter auto Mhz speed-up
mult      |       1525742     7400771     4851     +51%
keygen    |       9046660    43877889     4850     +42%
sign      |       2974641    14408703     4844     +40%
verify    |       3265285    15834951     4849     +44%

NIST-P521 | nanosecs/iter cycles/iter auto Mhz speed-up
mult      |       3289348    15968678     4855     +84%
keygen    |      19354174    93873531     4850     +70%
sign      |       6351493    30830140     4854     +65%
verify    |       6979292    33854215     4851     +75%

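
The speed-up column is not defined in the text. Checking it against the figures suggests it is the ratio of "before" to "after" cycles/iter, minus one, rounded to a whole percent. That reading is an inference from the numbers, not something the benchmark output states, and the helper name `speedup_percent` below is hypothetical. A minimal Python sketch under that assumption:

```python
def speedup_percent(before_cycles, after_cycles):
    """Relative speed-up in percent, assuming it is derived from cycles/iter."""
    return round((before_cycles / after_cycles - 1.0) * 100.0)


if __name__ == "__main__":
    # NIST-P192 'mult' on x86_64: 1369473 cycles before, 905603 after.
    print(f"+{speedup_percent(1369473, 905603)}%")  # prints "+51%"
```

Using nanosecs/iter instead gives slightly different values for some rows because the measured clock frequency differs between runs.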
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>