_gpg_close_all_fds hangs on newer Linux systems in a simple chroot w/o /proc/self/fd
Open, Normal, Public

Description

This was originally reported on https://bugs.debian.org/1079696

When gpgconf (or any gpg process that spawns a child process) is run in a bare chroot or other constrained environment where /proc/self/fd is not available, it tries to close every file descriptor up to the maximum limit. This can take over an hour, depending on the configuration of the system.

I can confirm that this happens with libgpg-error (libgpgrt) version 1.51-3 on Debian.

The bound for this loop appears to come from get_max_fds in src/spawn-posix.c, which looks for /proc/self/fd and, failing that, walks through the following sources (in preference order, taking the first available answer):

  • RLIMIT_NOFILE
  • RLIMIT_OFILE
  • _SC_OPEN_MAX (from sysconf)
  • _POSIX_OPEN_MAX
  • OPEN_MAX
  • defaults to 256

As the strace output in https://bugs.debian.org/1079696 shows, on at least some modern Debian systems the rlim_max for RLIMIT_NOFILE is 1073741816 (0x3ffffff8 in hex), even though rlim_cur may be just 1024.
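To see the gap between the two limits on a given system, something like the following sketch could be used. The helper name nofile_limits is hypothetical and is not part of libgpg-error; it only wraps the standard getrlimit call.

```c
#include <sys/resource.h>

/* Hypothetical helper (not from libgpg-error): fetch the soft and hard
   RLIMIT_NOFILE values.  Returns 0 on success, -1 if getrlimit fails.  */
int
nofile_limits (unsigned long long *soft, unsigned long long *hard)
{
  struct rlimit rl;

  if (getrlimit (RLIMIT_NOFILE, &rl) != 0)
    return -1;

  /* rlim_cur is the soft limit that the kernel actually enforces for
     this process; rlim_max is only the ceiling up to which the soft
     limit may be raised.  */
  *soft = (unsigned long long) rl.rlim_cur;
  *hard = (unsigned long long) rl.rlim_max;
  return 0;
}
```

On an affected system this would be expected to report a soft limit near 1024 and a hard limit of 1073741816, so a loop bounded by the hard limit scans about a million times more descriptors than the process could ever have open under its current soft limit.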

Trying to do more than a billion close() calls will make any process appear to hang indefinitely. Even if each close() call takes only 0.000004s (as I've measured on a modern machine), the loop still takes more than an hour.
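The estimate is just the number of calls multiplied by the cost per call. A minimal sketch of that arithmetic, using a hypothetical helper name and the figures quoted in this report:

```c
/* Hypothetical helper: estimated wall-clock seconds for a loop that
   issues `count` close() calls, assuming each call costs
   `secs_per_call` seconds.  */
double
close_loop_seconds (unsigned long long count, double secs_per_call)
{
  return (double) count * secs_per_call;
}
```

With count = 1073741816 and secs_per_call = 0.000004, this gives roughly 4295 seconds, or a little under 72 minutes.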

Event Timeline

Hm, this might also be relevant in GnuPG's codebase in common/exechelp-posix.c, which contains a copy of the same code (licensed differently).

I note that get_max_fds ends with this:

  /* AIX returns INT32_MAX instead of a proper value.  We assume that
     this is always an error and use an arbitrary limit.  */
#ifdef INT32_MAX
  if (max_fds == INT32_MAX)
    max_fds = 256;
#endif

This appears to have been introduced for T1778, as reported by @aixtools, but it was a narrowly targeted workaround rather than the result of considering what a plausible runtime upper limit on a close() loop would look like.
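One way to picture a more general bound is to clamp whatever value the system reports, instead of special-casing INT32_MAX. This is only a sketch: the function name and the cap of 65536 are assumptions chosen for illustration, not values taken from libgpg-error or GnuPG.

```c
/* Assumed values for illustration only.  256 mirrors the existing
   default in get_max_fds; 65536 is an arbitrary, hypothetical cap.  */
#define SKETCH_FD_DEFAULT 256L
#define SKETCH_FD_CAP     65536L

/* Hypothetical helper: turn a reported descriptor limit into a bound
   that a fallback close() loop can finish in reasonable time.  */
long
bounded_max_fds (long reported)
{
  if (reported <= 0)
    return SKETCH_FD_DEFAULT;   /* No usable answer from the system.  */
  if (reported > SKETCH_FD_CAP)
    return SKETCH_FD_CAP;       /* Implausibly large, e.g. 1073741816.  */
  return reported;
}
```

The trade-off is that any descriptor numbered at or above the cap would stay open in the child, so a clamp like this only makes sense as a last-resort fallback when neither /proc/self/fd nor closefrom is available.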

Thank you for your report.

For libgpg-error, I pushed a change that uses closefrom: commit rEe3e793302b67 (spawn: Use closefrom when available).

werner renamed this task from "`_gpg_close_all_fds` hangs on modern Linux when `/proc/self/fd` is unavailable; spawning a process without `GPGRT_SPAWN_INHERIT_FILE` takes > 1 hour" to "_gpg_close_all_fds hangs on newer Linux systems in a simple chroot w/o /proc/self/fd". Wed, Jan 8, 8:50 AM
werner triaged this task as Normal priority.
werner added a project: Linux.

@gniibe: Please see gpgme/src/posix-io.c where we have this:

          /* First close all fds which will not be inherited.  If we
           * have closefrom(2) we first figure out the highest fd we
           * do not want to close, then call closefrom, and on success
           * use the regular code to close all fds up to the start
           * point of closefrom.  Note that Solaris' and FreeBSD's closefrom do
           * not return errors.  */
#ifdef HAVE_CLOSEFROM
          {
            fd = -1;
            for (i = 0; fd_list[i].fd != -1; i++)
              if (fd_list[i].fd > fd)
                fd = fd_list[i].fd;
            fd++;
#if defined(__sun) || defined(__FreeBSD__) || defined(__GLIBC__)
            closefrom (fd);
            max_fds = fd;
#else /*!__sun */
            while ((i = closefrom (fd)) && errno == EINTR)
              ;
            if (!i || errno == EBADF)
              max_fds = fd;
#endif /*!__sun*/
          }
#endif /*HAVE_CLOSEFROM*/