Page MenuHome GnuPG

gpgrt: spawn functions
Open, NormalPublic

Description

We have implementations for spawn functions in libgpg-error, but it's not exposed/enabled for users' use yet.

Expose them and use it gnupg and libassuan.

Event Timeline

gniibe triaged this task as Normal priority.Oct 19 2022, 3:12 AM
gniibe created this task.
  • assuan_pipe_connect and internal _assuan_spawn
    • The use of no_close_list in general is only valid for POSIX. There is a use case in gnupg/g10/call-gpg.c:start_gpg, which does
      • open pipes
      • start_gpg
      • use INPUT FD and OUTPUT FD
      • feed input through pipe, extract output through pipe
      • This can be achieved by sendfd + INPUT FD/OUTPUT FD, even on Windows, when sending fd will be possible.
    • Other use cases are: don't close STDERR
    • So, current support of no_close_list general case, should be revised.

without this list we don't have an option to keep file descriptors open; its not just stderr but for example log files and descriptors which pare passed by other meands than libassuan functions.

I see. I understand the use cases for POSIX to keep some file descriptors.

I'm considering a bit larger picture for both cases of POSIX and of Windows.

  • Using the descriptors, receiving side needs to know which descriptors are passed
    • So, we would need some interaction after spawning
    • or we need to define some way (say, environment variable, arguments, etc.) to share such information
  • Possibly, those file descriptors may be possible to be sent later (after spawning), if it's socket on POSIX
  • I'll see more...

By the commit rE43c1e85fe29a: spawn: Expose spawn functions., spawn functions are exposed now. The API is compatible to the one of internal functions in GnuPG master (2.3).
Semantics is not well-defined portably for:

  • gpgrt_spawn_process: EXCEPT only makes sense in POSIX. User could expect that the API does closing all fds except fds specified by EXCEPT in POSIX.
  • gpgrt_spawn_process_fd: AFTER_FORK_CB only makes sense in POSIX. User could specify the callback so that it can control sigmask, envvar, open/close/dup-ing file descriptors, making sure releasing some resources beforehand, etc.

To have clear semantics, I propose a change to gpgrt_spawn_process_fd (calling SPAWN_CB, instead of AFTER_FORK_CB, and give it return value), and exporting gpgrt_close_all_fds to users.

By this change, I intend:

  • Allow users to keep file (descriptors/handles) to spawning process.
  • Allow users to do something before its spawning a process (although it's difficult to use having same semantics among POSIX and Windows).
  • Allow users to call gpgrt_close_all_fds from the callback.

I found that the feature like gpgrt_close_all_fds might be useful, as I found this: https://docs.rs/close_fds/latest/close_fds/

I general I agree.

What do you think about this

spawn (...)
{
      if (spawn_cb)
         spawn_cb (spawn_arg, -1);
     ....
     pid = fork();
      if( !pid)
        { 
              if (spawn_cb)
                   spawn_cb (spawn_arg, 0);
              ...
            _exit(127);
        }

      if (spawn_cb)
          spawn_cb (spawn_arg, pid);
}

This gives us more hooks into the spawn function which might be useful. Sure we need to do some mapping here for Windows but that is not different than without this.

Another point is that under Windows we have a process handle and a pid. Shouldn't we use an opaque object instead of pid_t here and require an explict free-ing of that object as we need to do under Windows anyway?

@werner - having another argument might be useful. Indeed, pthread_atfork has three callback functions as its arguments (prepare, parent, and child).

For pid_t, we would be able to keep the current API, but, yes, it's better to change it. For POSIX programmer, pid_t sounds like just a reference to the process.

Let me consider more.

Another thing when we define a type which represents process.
For pid_t, MinGW-w64 has a bug: https://bugzilla.redhat.com/show_bug.cgi?id=1397787 (or https://sourceforge.net/p/mingw-w64/mailman/mingw-w64-public/thread/1456671365-21759-1-git-send-email-sw%40weilnetz.de/).
(1) GetCurrentProcessId always returns 32-bit (DWORD), so, it can be represented in 32-bit (although DWORD is unsigned).
(2) POSIX requires pid_t should be signed integer https://pubs.opengroup.org/onlinepubs/9699919799/basedefs/sys_types.h.html
(3) Original MinGW defines pid_t as int (in include/sys/type.h by _pid_t). (checked in mingwrt-5.4.2)

I decide use of pid_t, as there are different semantics between POSIX and Windows, *and* there is a problem of MinGW-w64. I introduce gpgrt_process_t, instead.

For the spawn_cb, I reconsider. Having three calls complicates use, and it is actually not needed. In the case of pthread_atfork, it is needed, because fork may be used deeply in some functions. In our use cases of spawn function, prepare part of the callback can be called before calling spawn, and parent part of the callback can be called after calling spawn.

gniibe changed the task status from Open to Testing.Nov 7 2022, 6:11 AM

Here is the change of GnuPG to use new spawn functions from libgpg-error:

Before applying this change, we need to consider about pid_t use cases in GnuPG. It would be better to introduce gnupg_process_t for the migration.

Examining again, I realized that the current spawn API (not published yet, only available in libgpg-error master) is not that useful in general (or difficult to use), while it works somehow.

Indeed, it may be useful for current internal of libassuan, which does:
(1) Before call of the spawn function, allocate pipe or socketpair, then,
(2) spawn a process with pipe/socketpair

However, this is rather peculiar to current libassuan implementation, which might be improved/fixed.
I think that it is better to combine allocation of pipe/socketpair with the new spawn function.
Let me try.

Evaluating again, I'd like to change spawn functions like this one in libgpg-error:

That is:
(1) Use opaque gpgrt_process_t to have OS specific fields (process ID/handle, exit code, file handle/descriptor).
(2) Offer common API for both of POSIX and Windows.
(3) If needed, offer specific uses of gpgrt_process_ctl function.
(4) Combine allocation of pipe/socketpair inside the spawn function.
(4) No support of supplying pipe/socketpair by the caller.

I don't understand the last two points: This is only about the three standard descriptors but how shall we supply more descriptors? At least in GPGME we definitely need more.

Last two points are for future changes of assuan internal; For the case of controlling fds in detail, it is possible to use spawn callback controlling fds by the routine and let no-touching (inherit) by the spawn function.

Experiments in two weeks or so, I concluded that exposing the API from gpgrt is a bit early. There are two different things (at least):

(1) Improving the use cases in GnuPG for process spawning for portability, sorting out things, factoring, etc.
(2) Fixing libassuan internal for portability, etc.

I am going to finish (1) at first, in GnuPG.
Then, if needed, reflect the experience to (unpublished routines of) libgpg-error.
And then, it will be time to consider (2).

I realized that (2) is hard, as assuan internal is actually exposed by the assuan_set_system_hooks.
Other hooks in assuan, assuan_set_malloc_hooks and assuan_set_log_cb also need update/improvement.

We should consider to break the Assuan API maybe we can do that without too many problems for the current use cases.

gniibe changed the task status from Testing to Open.Nov 26 2022, 3:26 AM