
SM, W32: GPGSM hangs up the GnuPG System
Closed, Resolved, Public


Sometimes on Windows the GnuPG System "hangs". By GnuPG System I mean both OpenPGP and GPGSM, so it is likely related to gpg-agent or to shared files like the pubring.
All further operations spawn a new gpgsm or gpg process, but these processes do not finish. This can happen as soon as GpgOL / Kleopatra starts, leaving the whole system blocked until the processes are killed.

I've seen it multiple times in the past; T4248 is related and probably the same issue. The best way to reproduce it seems to be a bulk import of certificates. They do not have to be secret as T4248 describes. This needs to be fixed as it happens multiple times per day / week in larger deployments. [...] again" is a restart of [...].

Highest priority, as this seems to be a deployment blocker. I will work on it accordingly.



Event Timeline

To reproduce this issue I started Kleopatra with an empty GNUPGHOME and, with logging enabled, imported 10 S/MIME certs at once (which spawns one gpgsm process per cert).
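The same kind of load can probably be generated without Kleopatra by starting one gpgsm import per certificate file at the same time. The following is only a rough sketch under assumptions: the helper names `build_import_commands` and `run_concurrently` are made up for illustration, the timeout is arbitrary, and this is not how Kleopatra drives gpgsm internally. Only the `--homedir` and `--import` options of gpgsm are used.

```python
import subprocess
import time


def build_import_commands(cert_paths, homedir, gpgsm="gpgsm"):
    """Return one gpgsm command line per certificate file (hypothetical helper)."""
    return [[gpgsm, "--homedir", homedir, "--import", path]
            for path in cert_paths]


def run_concurrently(commands, timeout=60.0):
    """Start all commands at once and return the PIDs that did not exit in time."""
    procs = [subprocess.Popen(cmd,
                              stdout=subprocess.DEVNULL,
                              stderr=subprocess.DEVNULL)
             for cmd in commands]
    # One shared deadline for the whole batch, not a timeout per process.
    deadline = time.monotonic() + timeout
    hung = []
    for proc in procs:
        remaining = max(0.0, deadline - time.monotonic())
        try:
            proc.wait(timeout=remaining)
        except subprocess.TimeoutExpired:
            hung.append(proc.pid)
            proc.kill()  # same as killing the stuck process by hand
            proc.wait()
    return hung
```

Calling `run_concurrently(build_import_commands(paths, homedir))` with an empty home directory returns the PIDs that were still running at the deadline. A process that is merely waiting for a confirmation dialog (for example the root CA trust question) would also show up in that list, so the result is only a hint, not proof of a lock problem.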

Not all actions are blocked in that state, only those that need to write to the pubring. So this might not be related to keylisting blocking in Kleopatra.

The processes hang while waiting for pubring.kbx.lock, as they report in the debug output.

One process is looping endlessly in gnupg_rename_file trying to rename the pubring. This fails with a sharing violation. I assume that this process is the one that currently holds the lock.
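To illustrate why an endless retry is a problem here, the following sketch renames a file with a bounded number of attempts and then gives up with an error instead of spinning forever. It is not the gnupg_rename_file implementation. The function name `rename_with_retry`, the attempt count and the delays are assumptions chosen for illustration. On Windows, Python reports a sharing violation from a rename as `PermissionError`.

```python
import os
import time


def rename_with_retry(src, dst, attempts=10, delay=0.05, rename=os.replace):
    """Rename src to dst, retrying a bounded number of times on PermissionError.

    Returns the number of attempts used; re-raises after the last failure.
    The `rename` parameter exists only so the function can be tested.
    """
    for attempt in range(1, attempts + 1):
        try:
            rename(src, dst)
            return attempt
        except PermissionError:
            if attempt == attempts:
                raise  # give up instead of looping endlessly
            time.sleep(delay * attempt)  # linear backoff between attempts
```

A bounded retry does not remove the underlying conflict (another process still has the target open), but it turns a silent hang into a visible error.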

But according to procexp, the file "pubring.kbx" is open in a different gpgsm process, PID 12400, and that process is also writing "waiting for lock". So it looks like there is at least one place where the pubring is open without the lock being held.

The last lines that the process currently holding the file open (PID 12400) wrote to the log:

2019-05-14 10:37:59 gpgsm[12400] DBG: chan_0x00000114 <- # Home: C:\Users\aheinecke\AppData\Roaming\gnupg
2019-05-14 10:37:59 gpgsm[12400] DBG: chan_0x00000114 <- # Config: [none]
2019-05-14 10:37:59 gpgsm[12400] DBG: chan_0x00000114 <- OK Dirmngr 2.2.15 at your service
2019-05-14 10:37:59 gpgsm[12400] DBG: connection to the dirmngr established
2019-05-14 10:37:59 gpgsm[12400] DBG: chan_0x00000114 -> GETINFO version
2019-05-14 10:37:59 gpgsm[12400] DBG: chan_0x00000114 <- D 2.2.15
2019-05-14 10:37:59 gpgsm[12400] DBG: chan_0x00000114 <- OK
2019-05-14 10:37:59 gpgsm[12400] DBG: chan_0x00000114 -> OPTION audit-events=1
2019-05-14 10:37:59 gpgsm[12400] DBG: chan_0x00000114 <- OK
2019-05-14 10:37:59 gpgsm[12400] DBG: chan_0x00000114 -> LOOKUP --cache-only /CN=GlobalSign,O=GlobalSign,OU=GlobalSign%20Root%20CA%20-%20R3
2019-05-14 10:37:59 gpgsm[12400] DBG: chan_00000114 <- [ 44 20 30 82 03 5f 30 82 02 47 a0 03 02 01 02 02 ...(895 byte(s) skipped) ]
2019-05-14 10:37:59 gpgsm[12400] DBG: chan_0x00000114 <- END

After killing PID 12400, the process that was looping in gnupg_rename continues and finishes. Then PID 2412 asks me to mark a root CA as trusted. After confirmation, that process waits on the lock while PID 2656 hangs in gnupg_rename. But according to procexp, 2412 again already holds pubring.kbx open.

2019-05-14 11:23:31 gpgsm[2412] DBG: chan_0x000000f4 <- OK
2019-05-14 11:23:31 gpgsm[2412] Das Wurzelzertifikat wurde nun als vertrauenswürdig markiert
2019-05-14 11:23:31 gpgsm[2656] Warte bis auf die Datei 'C:\Users\aheinecke\AppData\Roaming\gnupg\pubring.kbx' zugegriffen werden kann ...
2019-05-14 11:23:32 gpgsm[2412] waiting for lock C:\Users\aheinecke\AppData\Roaming\gnupg\pubring.kbx.lock...

(The German messages read: "The root certificate has now been marked as trusted" and "Waiting until the file '...\pubring.kbx' can be accessed ...".)

I imported 39 certificate files with about 700 certificates at once with Kleopatra and it worked. It took a long time though, so it would be nice if Kleopatra showed a progress indicator or some other indication that the import is running. But this is a different issue.

aheinecke reassigned this task from aheinecke to werner.
aheinecke lowered the priority of this task from Unbreak Now! to High.

When doing a "gpgsm --with-validation -k foo" (assuming you have a cert foo), gpgsm now goes into a loop and prints the certificates that match "foo" over and over again. I have not tested whether this was caused by this change, but I think it is likely.

Reopening this as I have seen such hangs multiple times during testing. When importing multiple keys with Kleopatra at once this can be reproduced sometimes.

I noticed this now while importing the keys for edward tester pub, the two berta boss priv keys and the test ca keys into Kleopatra. I also had GpgOL open and tried to encrypt there, but Kleopatra would also do keylists.

The setup had CRL checks enabled and was in VS-NfD compliant mode.

werner added a project: Restricted Project. Jan 12 2021, 12:18 PM

@rjh reported a problem with keyboxd from the current 2.3 beta on the ML. This is also a locking problem and _might_ be related to this bug.

Well, this is a pure Windows bug. It easily shows up when running dozens of gpgsm processes each importing a different certificate (e.g. using Kleopatra's current importer, which spawns one process per cert). The only possible fix is to close all files before starting a long running operation *and* before locking the files.
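The ordering described above (no open handle on the shared file while waiting for the lock or doing slow work) could look roughly like the following sketch. It is not GnuPG's code. A lock file created with `O_EXCL` stands in for the real locking, and the names `acquire_lock` and `update_file`, the `.tmp` suffix and the timeout are assumptions made for illustration. Stale lock files are not handled.

```python
import os
import time


def acquire_lock(lock_path, timeout=5.0, poll=0.05):
    """Create lock_path exclusively, polling until the timeout expires."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
            os.close(fd)
            return
        except FileExistsError:
            if time.monotonic() >= deadline:
                raise TimeoutError("waiting for lock " + lock_path)
            time.sleep(poll)


def update_file(path, transform, timeout=5.0):
    """Rewrite path as transform(old_bytes) without keeping it open."""
    lock_path = path + ".lock"
    tmp_path = path + ".tmp"
    # No handle on `path` is open while we wait for the lock.
    acquire_lock(lock_path, timeout)
    try:
        try:
            with open(path, "rb") as fh:
                data = fh.read()
        except FileNotFoundError:
            data = b""
        # The handle is closed again before any slow work starts.
        new_data = transform(data)
        with open(tmp_path, "wb") as fh:
            fh.write(new_data)
        # Nothing of ours is open now, so we do not block our own rename.
        os.replace(tmp_path, path)
    finally:
        os.remove(lock_path)
```

The point of the sketch is only the order of operations: a process that keeps the file open while it waits for the lock can block the rename of the process that holds the lock, which matches the two-process deadlock seen in the logs.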

werner changed the task status from Open to Testing. Mar 2 2021, 7:33 PM