Page MenuHome GnuPG

gpg-agent race-condition with parallel clients
Open, NormalPublic

Description

Overview:

On macOS Ventura, after a key has been unlocked in gpg-agent once, attempting to access that same key from multiple clients in parallel will result in prompts to unlock the key again for some subset of the clients.

Environment:

macOS Ventura 13.2
gpg-agent (GnuPG) 2.4.0
libgcrypt 1.10.1

Reproduction Instructions

The following commands will demonstrate the problem on macOS. Even though the gpg key has already been unlocked previously, when running in parallel, we're prompted for the password.

sh
# encrypt a file to yourself to work with
echo "blah" | gpg -e -r $USER > data.gpg

# decrypt the file once to ensure gpg-agent has the key
gpg -d data.gpg

# Now decrypt it repeatedly in serial; you should not see passphrase prompt
for i in {0..5}; do gpg -d data.gpg; done

# Now decrypt it in parallel; you will be prompted for passphrase
for i in {0..5}; do (gpg -d data.gpg) & done

Background:

Why does this absurd case matter? Well, I'm using ansible with passwordstore to store secrets. Passwordstore uses gpg under the hood for encryption. Ansible runs against multiple hosts in parallel, so gpg is being run to decrypt various files with the same key in parallel repeatedly.

This problem does not reproduce on my old linux system, just on this macOS system.

Details

Version
2.4.0

Event Timeline

Although gpg-agent launching is protected by a file system lock, there is indeed a small race related to the pinentry. The invocation of the pinentries is serialized but if a second pinentry is requested while the first pinentry has not yet returned and put the passphrase into the cache, the second pinentry will be called anyway. Fixing this not easy and should rarely be a problem. The mitigation is to do a dummy decryption to seed the cache or use a custom pinentry.

Just in case: check whether the launch lock does not work. You should see a second gpg-agent process in this case with one of them killing itself after some time. You should also notice a file ~/.gnupg/gnupg_spawn_agent_sentinel which is used as file lock while gpg-agent is launched.

I may be reading your comment wrong, but the problem here is not multiple pinentry prompts, or multiple gpg-agents present.

A single instance of gpg-agent is already running, and the key has already been unlocked in a non-parallel invocation.

The problem is that of the five parallel gpg processes, gpg-agent should never run pinentry as the key is already unlocked. Instead, during execution for one or two of the requests (varies) gpg-agent will run pinentry asking for the passphrase for the already unlocked key.

Truncated process tree:

| \-+- 02008 dlary /Users/dlary/Library/Application Support/iTerm2/iTermServer
|   |-+= 28805 root login -fp dlary
|   | \-+= 28806 dlary -zsh
|   |   \--= 88768 dlary gpg -d data.gpg
\-+- 51000 dlary gpg-agent --debug-level basic --no-detach --daemon -v
  |--- 51073 dlary scdaemon --multi-server
  \--- 88772 dlary /opt/homebrew/opt/pinentry/bin/pinentry

No additional gpg-agent processes are created, and I've confirmed from the gpg-agent debug output that the pinenetry is from the sole gpg-agent process.

Okay, I see. The commands above are a real reproducer and not standalone examples. Then yes, you should get a pinentry only for the first gpg -d (as long as the keys are still in the cache). I am lacking macOS/homebrew stuff to replicate this. What you can do is to put

debug ipc,cache
debug-pinentry

log-file socket://

into gpg-agent.conf and run in another tty

watchgnupg  tee mylog

so see all output from all gpg-agent processes. Kill all gpg-agent processes first.

created ~/.gnupg/gpg-agent.conf containing:

debug ipc,cache
debug-pinentry
log-file socket://

ran in one tty:

killall gpg-agent
watchgnupg | tee gpg-agent.log

In a second tty ran:

gpg -d data.gpg
# unlocked key
for i in {0..5}; do (gpg -d data.gpg) & done

Attached is the log

werner triaged this task as Normal priority.
werner added a project: gnupg24.

Thanks. please give a few days.

FYI: Quite some more days than a few passed by. I still did not found the time for this, sorry.

FWIW: I have not done any tests but the comment below is about the case I suspected to be the cuase for your problem:

/* If the pinentry is currently in use, we wait up to 60 seconds
    for it to close and check the cache again.  This solves a common
    situation where several requests for unprotecting a key have
    been made but the user is still entering the passphrase for
    the first request.  Because all requests to agent_askpin are
    serialized they would then pop up one after the other to
    request the passphrase - despite that the user has already
    entered it and is then available in the cache.  This
    implementation is not race free but in the worst case the
    user has to enter the passphrase only once more. */

I have a look at the log file of gpg-agent.log. I can see that six PKDECRYPT requests are handled simultaneously. I think that it's out of secure memory to decrypt the private key which results pinentry request.

That is:

  • passphrase cache works well.
  • it tries to decrypt the private key by passphrase, but not enough secure memory for the sixth request (because other five request also use secure memory).
  • decryption failure fallbacks to pinentry request to user.

It is reproducible bug even with master branch.

With the patch below:

diff --git a/agent/findkey.c b/agent/findkey.c
index c0cbce7a2..45f4f40aa 100644
--- a/agent/findkey.c
+++ b/agent/findkey.c
@@ -877,6 +877,8 @@ unprotect (ctrl_t ctrl, const char *cache_nonce, const char *desc_text,
               *keybuf = result;
               return 0;
             }
+          else
+            log_error ("unprotect failure: %s\n", gpg_strerror (rc));
           xfree (pw);
         }
       else if (cache_mode == CACHE_MODE_NORMAL)

I can reproduce with debug log of:

2024-09-27 11:06:10 gpg-agent[8288] unprotect failure: Cannot allocate memory
2024-09-27 11:06:10 gpg-agent[8288] starting a new PIN Entry

on GNU/Linux, for multiple decryption requests.

For this test, I use RSA4096 key generated by:

$ gpg --quick-gen-key "Chuji k <chuji@gniibe.org>" rsa4096 cert,sign,encr

For the cause, this report is a kind of duplicate of: T4255: gpg-agent: "<gcrypt> Cannot allocate memory" with 10 threads decrypting OpenPGP