Page MenuHome GnuPG

Flaky test (<keyboxd>tests/openpgp/use-exact-key.scm) failure with gnupg 2.3.5, 2.3.6
Closed, ResolvedPublic

Description

Originally reported downstream in Gentoo.

This seems to be flaky to me:

  • 2.3.5 passed, then failed
  • 2.3.6 failed, then passed

Snippet from log:

ASS: <keyboxd>tests/openpgp/issue2419.scm 
Checking robustness wrt empty databases in gnupghome (issue2417)... 
PASS: <keyboxd>tests/openpgp/issue2417.scm 
PASS: <keyboxd>tests/openpgp/delete-keys.scm 
Importing ssh keys... 
    > dsa rsa ecdsa ed25519 <  
Checking for issue2316... 
PASS: <keyboxd>tests/openpgp/ssh-import.scm 
Checking key export 
    > D74C5F22 C40FDECF ECABF51D <  
PASS: <keyboxd>tests/openpgp/export.scm 
Importing public key. 
Checking that the most recent, valid signing subkey is used by default 
    > 8BC90111 3E880CFF F5F77B83 45117079 1EA97479 <  
Checking that the most recent, valid encryption subkey is used by default 
    > 8BC90111 3E880CFF F5F77B83 45117079 1EA97479 <  
PASS: <keyboxd>tests/openpgp/default-key.scm 
Importing public key. 
Checking that the most recent, valid signing subkey is used by default 
    > 8BC90111 3E880CFF F5F77B83 45117079 1EA97479 <  
Checking that we can select a specific signing key 
    > 8BC90111 F5F77B83 (((/var/tmp/portage/app-crypt/gnupg-2.3.5-r1/work/gnupg-2.3.5/bin/gpg --no-permission-warning --always-trust --output - --verify --status-fd=1) 7706 1))
FAIL: <keyboxd>tests/openpgp/use-exact-key.scm 
Checking gpgtar without encryption 
Checking gpgtar without encryption with nicer actions 
Checking gpgtar with asymmetric encryption 
Checking gpgtar with asymmetric encryption and signature 
Checking gpgtar with signature 
Checking gpgtar with symmetric encryption 
Checking gpgtar with symmetric encryption and chosen cipher 
Checking gpgtar with both symmetric and asymmetric encryption 
PASS: <keyboxd>tests/openpgp/gpgtar.scm 
Checking WoT with trust signatures (PGP trust model)...
PASS: <keyboxd>tests/openpgp/trust-pgp-2.scm 
Checking basic WoT (classic trust model)...
PASS: <keyboxd>tests/openpgp/trust-pgp-1.scm 
Checking conventional encryption with MDC

Full tarball of logs:

.

I can try get more information if needed but the flaky nature makes it harder.

Details

External Link
https://bugs.gentoo.org/840511
Version
2.3.6

Event Timeline

After re-running myself a few times, I managed to hit it again. In tests/openpgp/report.xml, I see:

[...]
<testsuite name="&lt;keyboxd&gt;tests/openpgp/use-exact-key.scm" time="0" package="&lt;keyboxd&gt;tests/openpgp" id="0" timestamp="2022-04-25T16:18:27" hostname="unknown" tests="1" failures="0" errors="0" >
<properties/>
<testcase name="use-exact-key.scm" classname="&lt;keyboxd&gt;tests.openpgp" time="0" >
<failure message="Unknown error." />
</testcase>
<system-out>
Importing public key.
Checking that the most recent, valid signing subkey is used by default
    &gt; 8BC90111 3E880CFF F5F77B83 45117079 1EA97479 &lt;
Checking that we can select a specific signing key
    &gt; 8BC90111 F5F77B83 1EA97479 &lt;
</system-out>
<system-err>
</system-err>
[...]

Full build.log too:

Thank you. I can replicate the issue.

I pushed the changes to allow single invocation of a test with variant.

Then, running the test loop:

while GPGSCM_TEST_VARIANT=keyboxd make -C tests/openpgp TESTS=use-exact-key.scm check ; do true; done

It eventually fails (in five minutes, in my case).

werner triaged this task as Normal priority.Apr 26 2022, 8:38 AM

Another test, it took 30 minutes to replicate.

/tmp/gpgscm-20220426T073802-use-exact-key-QTjvwO/a

[GNUPG:] PLAINTEXT 62 1650958682 
[GNUPG:] PLAINTEXT_LENGTH 0
[GNUPG:] NEWSIG
[GNUPG:] KEY_CONSIDERED E657FB607BB4F21C90BB6651BC067AF28BC90111 0
[GNUPG:] KEY_CONSIDERED E657FB607BB4F21C90BB6651BC067AF28BC90111 0
[GNUPG:] BADSIG 1141BF7FF5F77B83 Barrett Brown <barrett@example.org>
[GNUPG:] FAILURE gpg-exit 33554433

I located the problem. The test program use-exact-key invokes two gpg-es connecting by pipe (one gpg to generate a signature, another gpg to verify the signature). Those multiple gpg-es race accessing keyboxd.

In kbx/kbxserver.c, we call set_assuan_context_func twice, before the loop and after the loop. But it is a single object within a process. When a thread is working and another thread calls set_assuan_context_func (NULL), the status won't be reported to a user of keyboxd.

When status is not reported, subkey is not detected correctly, because of no PUBKEY_INFO in a status line. Then, main key is used to verify a signature, which naturally fails.

gniibe removed a project: Testing.

Fixed in 2.3.7.