Page MenuHome GnuPG

keyboxd with data pipe
Testing, LowPublic

Description

When data pipe is used for keyboxd (modifying the call kbx_client_data_new with no D-lines), it may cause a deadlock on the lock for estream.

In gpg-error, estream.c:do_deinit is registered by atexit. It is called when exit is called, and it calls _gpgrt_fflush for all streams.
But in kbx/kbx-client-util.c:datastream_thread, it calls es_read on the stream, so, that stream is locked when it blocks on read system call.
Thus, it cannot finish the process.

gpg_keyboxd_deinit_session_data should be called before calling exit, so that the call of assuan_release can close the assuan connection (and thus, pipe connection, too). Then, read system call will return an error with EPIPE, to finish the datastream thread.

Event Timeline

It is reproducible by testing tests/openpgp/multisig.scm with keyboxd enabled (it hangs), with the modification of following.

diff --git a/g10/call-keyboxd.c b/g10/call-keyboxd.c
index 7f4d5f493..e4bf22159 100644
--- a/g10/call-keyboxd.c
+++ b/g10/call-keyboxd.c
@@ -223,7 +223,7 @@ open_context (ctrl_t ctrl, keyboxd_local_t *r_kbl)
           return err;
         }
 
-      err = kbx_client_data_new (&kbl->kcd, kbl->ctx, 1);
+      err = kbx_client_data_new (&kbl->kcd, kbl->ctx, 0);
       if (err)
         {
           assuan_release (kbl->ctx);

The problem of hang of tests/openpgp/multisig.scm is solved by rGef4f22b9d98b: gpg: Graceful exit for signature checking with --batch.
But the problem itself is not yet solved.

werner edited projects, added gnupg26; removed gnupg.
werner added a subscriber: werner.

See also commit rG6fcc263c18 from 2020 where I switched to D-lines.

A quick test with about 3000 OpenPGP keys showed that D-lines are only 10%
slower than the fd-passing based implementation. Given that the
thread adds extra complexity we go for now with the D-line approach.

Thus I set this to low priority. We can fix this in 2.6 (as you did).

It looks like having the datastream_thread may be not worth.
One possibility is to implement synchronous read from pipe in kbx_client_data_wait, instead of datastream_thread.

Test with Wine (i686) emulation, I encountered another hang at: Checking armored_key_8192

A change of pipe buffer size fixed the hang.

diff --git a/common/exechelp-w32.c b/common/exechelp-w32.c
index f63341e7c..2307ec59a 100644
--- a/common/exechelp-w32.c
+++ b/common/exechelp-w32.c
@@ -271,7 +271,7 @@ create_inheritable_pipe (HANDLE filedes[2], int flags)
   sec_attr.nLength = sizeof sec_attr;
   sec_attr.bInheritHandle = TRUE;
 
-  if (!CreatePipe (&r, &w, &sec_attr, 0))
+  if (!CreatePipe (&r, &w, &sec_attr, 4096*4))
     return -1;
 
   if ((flags & INHERIT_READ) == 0)

tests/openpgp/import.scm hangs with 4096*4.

I now use 4096*8. All tests go well.

Calling assuan_release before kbx_client_data_release is the best (and we join the thread).

I considered (and tested, actually) use of select before the call of es_read to avoid blocking holding estream lock.
This works for POSIX. For Windows, it seems no easy way for a pipe to block with something like select.

gniibe changed the task status from Open to Testing.Jun 8 2023, 7:43 AM
gniibe claimed this task.
gniibe changed the status of subtask T6523: gpgscm: call-with-io deadlock when larger stderr output from Open to Testing.

With the fix of T6523, make check goes all well (on Wine emulation and on Windows, for i686 and for x86_64).