keygen.c occasionally fail with "error generating RSA key: Number is not prime"
Closed, ResolvedPublic
Actions

Assigned To

Authored By

	neverpanic
	Apr 5 2022, 11:54 AM

Description

In the CI runs of the libgcrypt mirror on GitLab (https://gitlab.com/redhat-crypto/libgcrypt/libgcrypt-mirror/-/pipelines?page=1&scope=all&ref=master), tests/basic.c and tests/keygen.c have both failed once in forced FIPS mode with the error message

error generating RSA key: Number is not prime

The specific runs where this happened are:

https://gitlab.com/redhat-crypto/libgcrypt/libgcrypt-mirror/-/jobs/2262073184 line 2138 for tests/basic.c. Note that the error message is inside of the folded output that starts in line 2082.
https://gitlab.com/redhat-crypto/libgcrypt/libgcrypt-mirror/-/jobs/2179912723 line 2146 for tests/keygen.c. Fold starts in line 2086.

It would be great if we could improve the stability of the tests so that this does not happen.

Details

External Link: https://gitlab.com/redhat-crypto/libgcrypt/libgcrypt-mirror/-/pipelines?page=1&scope=all&ref=master
Version: master

Revisions and Commits

rC libgcrypt
	rCcd30ed3c0d71 cipher: Change the bounds for RSA key generation round.
	rC51754fa2ed06 cipher: Fix rsa key generation.

Related Objects

Mentioned In: T7165: Release Libgcrypt 1.11.0
T5905: Release Libgcrypt 1.10.2

Event Timeline

neverpanic created this task.Apr 5 2022, 11:54 AM

I don't know the exact procedure by FIPS, but just setting the least significant bit in the generation (after _gcry_mpi_randomize) can reduce the probability by half.

I think that it is OK to loop forever until we find a prime.

diff --git a/cipher/rsa.c b/cipher/rsa.c
index 3f1cd722..3ff7fec4 100644
--- a/cipher/rsa.c
+++ b/cipher/rsa.c
@@ -386,7 +386,6 @@ generate_fips (RSA_secret_key *sk, unsigned int nbits, unsigned long use_e,
   gcry_mpi_t diff, mindiff;
   gcry_random_level_t random_level;
   unsigned int pbits = nbits/2;
-  unsigned int i;
   int pqswitch;
   gpg_err_code_t ec;
 
@@ -476,18 +475,18 @@ generate_fips (RSA_secret_key *sk, unsigned int nbits, unsigned long use_e,
 
  retry:
   /* generate p and q */
-  for (i = 0; i < 5 * pbits; i++)
+  for (;;)
     {
-    ploop:
       if (!testparms)
         {
           _gcry_mpi_randomize (p, pbits, random_level);
+          mpi_set_bit (p, 0);
         }
       if (mpi_cmp (p, minp) < 0)
         {
           if (testparms)
             goto err;
-          goto ploop;
+          continue;
         }
 
       mpi_sub_ui (p1, p, 1);
@@ -505,21 +504,19 @@ generate_fips (RSA_secret_key *sk, unsigned int nbits, unsigned long use_e,
       else if (testparms)
         goto err;
     }
-  if (i >= 5 * pbits)
-    goto err;
 
-  for (i = 0; i < 5 * pbits; i++)
+  for (;;)
     {
-    qloop:
       if (!testparms)
         {
           _gcry_mpi_randomize (q, pbits, random_level);
+          mpi_set_bit (q, 0);
         }
       if (mpi_cmp (q, minp) < 0)
         {
           if (testparms)
             goto err;
-          goto qloop;
+          continue;
         }
       if (mpi_cmp (p, q) > 0)
         {
@@ -535,7 +532,7 @@ generate_fips (RSA_secret_key *sk, unsigned int nbits, unsigned long use_e,
         {
           if (testparms)
             goto err;
-          goto qloop;
+          continue;
         }
 
       mpi_sub_ui (q1, q, 1);
@@ -553,8 +550,6 @@ generate_fips (RSA_secret_key *sk, unsigned int nbits, unsigned long use_e,
       else if (testparms)
         goto err;
     }
-  if (i >= 5 * pbits)
-    goto err;
 
   if (testparms)
     {

The set_bit is obvious but we should cross check with the specs. In the non-fips mode we also try w/o a limit.

I checked FIPS 186-4 (and FIPS 186-5-draft). It is Appendix A 1.3.

In the generation, we should call mpi_set_bit.

The constants for loops are different:
(1) In FIPS 186-4, it's 5 for p and it's also 5 for q.
(2) In FIPS 186-5-draft, it's 10 for p and it's 20 for q.

So, probably, this kind of failure is known and addressed in FIPS 186-5-draft.

• gniibe added a commit: rC51754fa2ed06: cipher: Fix rsa key generation..Apr 18 2022, 4:09 AM

That sounds reasonable. The FIPS 186-5 draft (https://nvlpubs.nist.gov/nistpubs/FIPS/NIST.FIPS.186-5-draft.pdf) covers this in section A.1.3, although I'm not quite sure why a lower bound for p was chosen compared to q. The comment that seems to have triggered this change is published on page 68 of https://csrc.nist.gov/CSRC/media/Publications/fips/186/4/final/documents/comments-received-fips186-4-december-2015.pdf by Allen Roginsky. It only contains a suggestion of 20, presumably for both numbers.

For future reference, the same algorithm in FIPS 186-4 (https://nvlpubs.nist.gov/nistpubs/FIPS/NIST.FIPS.186-4.pdf) can be found in section B.3.3.

Seems like following the 186-5 draft would be a reasonable course of action. Comments on 186-5 (see comment no. 38 on page 80 https://csrc.nist.gov/CSRC/media/Publications/fips/186/5/draft/documents/fips-186-5-draft-comments-received.pdf) suggest this has originally been copied from ANSI X9.80-2020, so it's not out of the question that this will change again in the final version of 186-5.

• gniibe claimed this task.Apr 19 2022, 11:01 AM

• gniibe moved this task from Backlog to Next on the FIPS board.Apr 19 2022, 11:07 AM

• gniibe added a commit: rCcd30ed3c0d71: cipher: Change the bounds for RSA key generation round..Apr 20 2022, 8:12 AM

Feedback from the lab is that they'd recommend returning a specific error code that indicates that the prime search failed and then relying on the caller to decide whether to loop or bubble up the error. I'm not sure who we would consider to be the "caller" of the relevant generation function in this case, though.

I'd say let's consider this solved by switching to 10/20. According to the comments I linked above, the probability for generation of a p or q to fail for a specific k = nlen/2 is (1 - (2 / k * ln 2))^(a * k) with a = {5, 10, 20}.

For generation of an RSA key to fail, either the generation of p or the generation of q must fail, hence we know that the probability that the generation of a key fails is p(failure) = 1 - p(success) = 1 - (p(p success) * p(q success)). If I run the numbers, key size does not contribute a lot, and the 5/5 approach gives us a probability of failure between 1.0417e-6 and 1.0796e-6, i.e. just around 1.08 ppm. The 10/20 approach on the other hand is mostly dominated by the success probability of the generation of p and comes in between 2.7131e-13 and 2.9140e-13, i.e., around 0.0000003 ppm.

In other words, with the 5/5 approach, on average every 500,000th key generation fails, while the 10/20 approach fails on average every 1.7e12 keys generated. If we assume generating one key takes a second, that's ~54000 years until a failure.

• gniibe added a project: Restricted Project.May 3 2022, 10:48 AM

• werner added a project: backport.May 3 2022, 11:21 AM

• gniibe moved this task from Next to Ready for release on the FIPS board.May 6 2022, 2:31 AM

It will be in 1.10.2.

• werner changed the task status from Open to Testing.Sep 22 2022, 11:02 AM

• werner removed a project: Restricted Project.