gnupg self test hang: clean migration
Testing, Normal, Public

Description

After updating gnupg to 2.5.17 in pkgsrc I ran the self tests and noticed them hanging. The symptom is:

Making check in migrations
GPG_AGENT_INFO= LC_ALL=C  EXEEXT=  PATH="../gpgscm:/tmp/security/gnupg2/work/.cwrapper/bin:/tmp/security/gnupg2/work/.buildlink/bin:/tmp/security/gnupg2/work/.gcc/bin:/tmp/security/gnupg2/work/.tools/bin:/usr/pkg/bin:/home/wiz/bin:/usr/local/bin:/usr/X11R7/bin:/bin:/usr/bin:/usr/pkg/bin:/usr/local/bin:/sbin:/usr/sbin:/usr/pkg/sbin:/usr/local/sbin:/usr/games:/usr/sbin:/usr/local/bin:/usr/pkg_bulk/bin:/root/.cargo/bin:/usr/pkg/bin:/usr/pkg/bin"  abs_top_srcdir="/tmp/security/gnupg2/work/gnupg-2.5.17"  objdir="/tmp/security/gnupg2/work/gnupg-2.5.17"  GNUPG_BUILD_ROOT="/tmp/security/gnupg2/work/gnupg-2.5.17/tests"  GNUPG_IN_TEST_SUITE=fact  GPGSCM_PATH="/tmp/security/gnupg2/work/gnupg-2.5.17/tests/gpgscm" /tmp/security/gnupg2/work/gnupg-2.5.17/tests/gpgscm/gpgscm  /tmp/security/gnupg2/work/gnupg-2.5.17/tests/migrations/run-tests.scm
Testing a clean migration ...

and then nothing happens.
There are three gnupg related processes I found:

test    9517  0.0  0.0    18184   3196 pts/2  I+    2:45PM  0:00.00 gpg --no-permission-warning --no-greeting --no-secmem-warning --batch --agent-program=/tmp/security/gnupg2/work/gnupg-2.5.17/agent/gpg-agent|--debug-quick-random --list-secret-keys
test   16450  0.0  0.0    15644   3088 pts/2  I+    2:45PM  0:00.07 /tmp/security/gnupg2/work/gnupg-2.5.17/tests/gpgscm/gpgscm /tmp/security/gnupg2/work/gnupg-2.5.17/tests/migrations/run-tests.scm
test   27899  0.0  0.0    15652   3104 pts/2  I+    2:45PM  0:00.02 gpgscm /tmp/security/gnupg2/work/gnupg-2.5.17/tests/migrations/from-classic.scm

In case it matters, this is on NetBSD 11.99.5/x86_64.

I also see hangs when running gpgme self tests or the notmuch configure script.

Details

Version
2.5.17

Event Timeline

wiz updated the task description.

When I kill the gpg process, I see:

("/tmp/security/gnupg2/work/gnupg-2.5.17/g10/gpg" --no-permission-warning --no-greeting --no-secmem-warning --batch "--agent-program=/tmp/security/gnupg2/work/gnupg-2.5.17/agent/gpg-agent|--debug-quick-random" --list-secret-keys) failed: gpg: starting migration from earlier GnuPG versions

gpg: signal Terminated caught ... exiting

0: tests.scm:114: (throw (string-append (stringify what) " failed") (:stderr result))
1: from-classic.scm:26: (call-check `(,@gpg --list-secret-keys))
2: from-classic.scm:42: (trigger-migration)
3: common.scm:59: (test (getcwd))
FAIL: tests/migrations/from-classic.scm
Testing the extended private key format ...

and the next test seems to hang too.

When I kill that gpg too, I get:

Error: Key not found: "C40FDECF"
FAIL: tests/migrations/extended-pkf.scm

and the rest of the test suite in that directory runs fine, with

PASS: tests/migrations/issue2276.scm
===================
3 tests run, 1 succeeded, 2 failed, 0 failed expectedly, 0 succeeded unexpectedly, 0 skipped.
Failed tests: tests/migrations/extended-pkf.scm tests/migrations/from-classic.scm
===================
*** Error code 2
werner added a subscriber: werner.

Do you remember whether you had the same problem with 2.5.14 or 2.5.16, or can you test with these versions? Which version of libgpg-error are you using?

The previous pkgsrc version was 2.4.9. However, I've just tested 2.5.14 and saw the same behaviour (so I guess there is no point in testing 2.5.16).

libgpg-error is at version 1.58. It passes all its self tests in the same environment.

In the same environment, 2.4.9 passes its self tests.
I've reverted the update in pkgsrc until this can be resolved.

I bisected it and found the commit that introduced this test failure:

a035938216c39230e1476925119d3cff76932e7e is the first bad commit
commit a035938216c39230e1476925119d3cff76932e7e
Author: NIIBE Yutaka <gniibe@fsij.org>
Date:   Thu May 11 19:18:21 2023 +0900

    common,agent,gpg,dirmngr,g13,scd,tests,tools: New spawn function.
...
    --

    GnuPG-bug-id: 6275
    Signed-off-by: NIIBE Yutaka <gniibe@fsij.org>

Thank you for your report.

Could you please show me the result of

$ make -C tests/migrations TESTS=from-classic.scm verbose=3 check

in your build directory of gnupg?

gniibe triaged this task as Normal priority.

Thank you for looking at this.
I'm testing with gnupg git head as of today; please let me know if you prefer 2.5.17 instead.

.../gnupg/build> gmake -C tests/migrations TESTS=from-classic.scm verbose=3 check
gmake: Entering directory '.../gnupg/build/tests/migrations'
GPG_AGENT_INFO= LC_ALL=C EXEEXT= PATH="../gpgscm:/home/wiz/bin:/usr/local/bin:/usr/X11R7/bin:/bin:/usr/bin:/usr/pkg/bin:/usr/local/bin:/sbin:/usr/sbin:/usr/pkg/sbin:/usr/local/sbin:/usr/games:/archive/foreign/localsrc/security/advisories" abs_top_srcdir=".../gnupg/build/.." objdir=".../gnupg/build" GNUPG_BUILD_ROOT=".../gnupg/build/tests" GNUPG_IN_TEST_SUITE=fact GPGSCM_PATH=".../gnupg/build/../tests/gpgscm" .../gnupg/build/tests/gpgscm/gpgscm \
  .../gnupg/build/../tests/migrations/run-tests.scm  from-classic.scm
Executing: '.../gnupg/build/tools/gpgtar' '--help'
Child #proc returned: ((command (".../gnupg/build/tools/gpgtar" --help)) (status 0) (stdout gpgtar (GnuPG) 2.5.18-beta3
Copyright (C) 2025 g10 Code GmbH
License GNU GPL-3.0-or-later <https://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.

Syntax: gpgtar [options] [files] [directories]
Encrypt or sign files into an archive

Commands:

     --create                create an archive
     --extract               extract an archive
 -e, --encrypt               create an encrypted archive
 -d, --decrypt               extract an encrypted archive
 -s, --sign                  create a signed archive
 -t, --list-archive          list an archive

Options:

 -c, --symmetric             use symmetric encryption
 -r, --recipient USER-ID     encrypt for USER-ID
 -u, --local-user USER-ID    use USER-ID to sign or decrypt
 -o, --output FILE           write output to FILE
 -v, --verbose               verbose
 -q, --quiet                 be somewhat more quiet
     --skip-crypto           skip the crypto processing
     --dry-run               do not make any changes

Tar options:

 -C, --directory DIRECTORY   change to DIRECTORY first
 -T, --files-from FILE       get names to create from FILE
     --null                  -T reads null-terminated names

Please report bugs to <https://bugs.gnupg.org>.
) (stderr ))
Executing: '.../gnupg/build/tools/gpgtar' '--help'
Child #proc returned: ((command (".../gnupg/build/tools/gpgtar" --help)) (status 0) (stdout gpgtar (GnuPG) 2.5.18-beta3
Copyright (C) 2025 g10 Code GmbH
License GNU GPL-3.0-or-later <https://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.

Syntax: gpgtar [options] [files] [directories]
Encrypt or sign files into an archive

Commands:

     --create                create an archive
     --extract               extract an archive
 -e, --encrypt               create an encrypted archive
 -d, --decrypt               extract an encrypted archive
 -s, --sign                  create a signed archive
 -t, --list-archive          list an archive

Options:

 -c, --symmetric             use symmetric encryption
 -r, --recipient USER-ID     encrypt for USER-ID
 -u, --local-user USER-ID    use USER-ID to sign or decrypt
 -o, --output FILE           write output to FILE
 -v, --verbose               verbose
 -q, --quiet                 be somewhat more quiet
     --skip-crypto           skip the crypto processing
     --dry-run               do not make any changes

Tar options:

 -C, --directory DIRECTORY   change to DIRECTORY first
 -T, --files-from FILE       get names to create from FILE
     --null                  -T reads null-terminated names

Please report bugs to <https://bugs.gnupg.org>.
) (stderr ))
Testing a clean migration ...
Executing: '.../gnupg/build/g10/gpg' '--no-permission-warning' '--no-greeting' '--no-secmem-warning' '--batch' '--agent-program=.../gnupg/build/agent/gpg-agent|--debug-quick-random' '--dearmor' (4 6 2)
Executing: '.../gnupg/build/tools/gpgtar' '--extract' '--directory=.' '-' (5 6 2)
Executing: '.../gnupg/build/tools/gpgconf' '--create-socketdir'
Child #proc returned: ((command (".../gnupg/build/tools/gpgconf" --create-socketdir)) (status 1) (stdout ) (stderr gpgconf: socketdir is '/tmp/gpgscm-20260130T084246-from-classic-2kLWne'
gpgconf:        no /run/user dir
gpgconf:        using homedir as fallback
gpgconf: error creating socket directory
gpgconf: fatal error (exit status 1)
))
Warning: Creating socket directory failed: gpgconf: socketdir is '/tmp/gpgscm-20260130T084246-from-classic-2kLWne'
gpgconf:        no /run/user dir
gpgconf:        using homedir as fallback
gpgconf: error creating socket directory
gpgconf: fatal error (exit status 1)

Executing: '.../gnupg/build/g10/gpg' '--no-permission-warning' '--no-greeting' '--no-secmem-warning' '--batch' '--agent-program=.../gnupg/build/agent/gpg-agent|--debug-quick-random' '--list-secret-keys'

This is where it hangs.

Thank you for the log.

I confirmed that it hangs in the call chain: start_new_service -> _gpgrt_process_spawn -> _gpgrt_process_wait.part.0 -> _leave_npth -> _ksem_wait.

Possibly the semantics of POSIX semaphores between processes are different on NetBSD.

I will test the following patch of libgpg-error on NetBSD:

diff --git a/src/spawn-posix.c b/src/spawn-posix.c
index bd236ad..339826a 100644
--- a/src/spawn-posix.c
+++ b/src/spawn-posix.c
@@ -458,9 +458,7 @@ spawn_detached (const char *pgmname, const char *argv[],
       return ec;
     }
 
-  _gpgrt_pre_syscall ();
   pid = fork ();
-  _gpgrt_post_syscall ();
   if (pid == (pid_t)(-1))
     {
       ec = _gpg_err_code_from_syserror ();
@@ -750,9 +748,7 @@ _gpgrt_process_spawn (const char *pgmname, const char *argv1[],
       fd_err[1] = -1;
     }
 
-  _gpgrt_pre_syscall ();
   pid = fork ();
-  _gpgrt_post_syscall ();
   if (pid == (pid_t)(-1))
     {
       ec = _gpg_err_code_from_syserror ();

gniibe mentioned this in Unknown Object (Maniphest Task). Mon, Feb 2, 8:25 AM

Thank you for the patch. I've tried it in my environment, and gnupg 987c6a398a9505399b2c25a775d4b625753bc962 passes all its self-tests for me now!

@wiz Thank you for your quick feedback.

I pushed a modified version of the patch in rE20c673e15bd7: spawn:posix: Take care of POSIX semaphore "shared" semantics.

Tested with NetBSD 10.1/amd64.

In tests/migrations (unlike tests/openpgp and tests/cms), the tests do not start gpg-agent beforehand; gpg itself invokes gpg-agent when needed.
Because of that, on NetBSD (where POSIX semaphores have different semantics), gpg --list-secret-keys hangs when gpg tries to spawn the gpg-agent process.
The old code in 2.4 simply ignored the npth_protect and npth_unprotect sequence when calling fork to spawn a process.
The new code in libgpg-error takes care of the npth_protect and npth_unprotect sequence, but that was not sufficient: we also need to account for NetBSD's semantics. The child process should not call npth_protect. With shared semantics, a call to npth_protect in the child process affects the parent process and makes it hang.
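
For readers following along, here is a minimal sketch of the effect described above. It is not the libgpg-error code. To reproduce the "shared" behaviour on any POSIX system, it deliberately puts a semaphore with pshared=1 into MAP_SHARED memory. The names make_shared_lock and spawn_demo are hypothetical, and the semaphore is only a stand-in for the lock that npth_protect and npth_unprotect manage.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE 1       /* For MAP_ANONYMOUS on glibc.  */
#endif
#include <errno.h>
#include <semaphore.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical stand-in for the lock behind npth_protect/npth_unprotect.
   It is placed in shared memory so that parent and child operate on
   the very same semaphore.  Initial value 0: the caller holds it.  */
static sem_t *
make_shared_lock (void)
{
  sem_t *lock = mmap (NULL, sizeof *lock, PROT_READ | PROT_WRITE,
                      MAP_SHARED | MAP_ANONYMOUS, -1, 0);
  if (lock == MAP_FAILED)
    return NULL;
  if (sem_init (lock, 1, 0))
    {
      munmap (lock, sizeof *lock);
      return NULL;
    }
  return lock;
}

/* Release the lock, fork, and report whether the parent can take the
   lock back: 1 = yes, 0 = it would block, -1 = error.  A non-zero
   CHILD_RETAKES makes the child re-acquire the lock as well.  */
static int
spawn_demo (int child_retakes)
{
  sem_t *lock = make_shared_lock ();
  pid_t pid;
  int ok;

  if (!lock)
    return -1;

  sem_post (lock);              /* Release, like npth_unprotect.  */
  pid = fork ();
  if (pid == (pid_t)(-1))
    ok = -1;
  else if (!pid)
    {
      if (child_retakes)
        sem_wait (lock);        /* Child consumes the shared token.  */
      _exit (0);
    }
  else
    {
      /* Let the child finish first so the result is deterministic.  */
      while (waitpid (pid, NULL, 0) == -1 && errno == EINTR)
        ;
      /* Re-acquire like npth_protect, but without blocking.  */
      ok = !sem_trywait (lock);
    }

  sem_destroy (lock);
  munmap (lock, sizeof *lock);
  return ok;
}
```

spawn_demo (1) returns 0, which corresponds to the parent blocking forever in a real sem_wait; spawn_demo (0) returns 1, because only the parent takes the lock back.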

gniibe changed the task status from Open to Testing. Tue, Feb 3, 6:48 AM

I've tried the new patch in my environment, and it fixes the gnupg HEAD self tests as well. Thank you!

I've also asked on a NetBSD mailing list about the pshared difference to Linux.
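
In case it helps with comparing platforms, a small probe along the following lines could show whether a semaphore created with pshared=0 is still shared with a forked child. This is only a sketch: the function name child_post_reaches_parent is made up here, and POSIX leaves the use of such a semaphore from another process undefined, so the result is an observation about one system and not a guarantee.

```c
#include <errno.h>
#include <semaphore.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical probe.  Create a semaphore with pshared = 0 in ordinary
   (private) memory, let a forked child post it, and check whether the
   parent sees that post.
   Returns 1 if the post is visible in the parent, 0 if not, and -1 on
   error.  POSIX leaves the child's use of the semaphore undefined.  */
static int
child_post_reaches_parent (void)
{
  sem_t sem;
  pid_t pid;
  int visible;

  if (sem_init (&sem, 0, 0))    /* pshared = 0, initial value 0.  */
    return -1;

  pid = fork ();
  if (pid == (pid_t)(-1))
    {
      sem_destroy (&sem);
      return -1;
    }
  if (!pid)
    {
      sem_post (&sem);          /* Child: post and exit right away.  */
      _exit (0);
    }

  /* Parent: wait until the child is gone, then test without blocking.  */
  while (waitpid (pid, NULL, 0) == -1 && errno == EINTR)
    ;
  visible = !sem_trywait (&sem);

  sem_destroy (&sem);
  return visible;
}
```

A result of 0 would mean parent and child each work on their own copy of the semaphore; a result of 1 would match the shared behaviour discussed in this task.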