fd
Testing, Normal, Public
Assigned To

None

Authored By

	dkg
	Jan 7 2025, 11:01 PM

Description

This was originally reported on https://bugs.debian.org/1079696

When gpgconf (or any gpg process that spawns a child process) is run in a bare chroot or other constrained environment where /proc/self/fd is not available, it tries to close every file descriptor up to the maximum limit. This can take over an hour, depending on the configuration of the system.

I can confirm that this happens with libgpg-error (libgpgrt) version 1.51-3 on Debian.

The source of this loop appears to be get_max_fds in src/spawn-posix.c, which looks for /proc/self/fd and, failing that, walks through the following (in preference order, taking the first available answer):

RLIMIT_NOFILE
RLIMIT_OFILE
_SC_OPEN_MAX (from sysconf)
_POSIX_OPEN_MAX
OPEN_MAX
defaults to 256
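
The preference order listed above can be sketched roughly as follows. This is a minimal illustration, not the libgpg-error source: the names sketch_rlimit_max and sketch_max_fds are hypothetical, and the choice to skip a hard limit that does not fit into a long is an assumption of this sketch.

```c
#include <assert.h>
#include <limits.h>
#include <sys/resource.h>
#include <unistd.h>

/* Hypothetical helper: return the hard limit (rlim_max) for RESOURCE,
   or -1 if it is unavailable or does not fit into a long (this also
   covers RLIM_INFINITY).  */
static long
sketch_rlimit_max (int resource)
{
  struct rlimit rl;

  if (getrlimit (resource, &rl) || rl.rlim_max > (rlim_t) LONG_MAX)
    return -1;
  return (long) rl.rlim_max;
}

/* Hypothetical sketch of the fallback chain: the first source that
   yields an answer wins.  */
static long
sketch_max_fds (void)
{
  long max_fds = -1;

#ifdef RLIMIT_NOFILE
  max_fds = sketch_rlimit_max (RLIMIT_NOFILE);
#endif
#ifdef RLIMIT_OFILE
  if (max_fds == -1)
    max_fds = sketch_rlimit_max (RLIMIT_OFILE);
#endif
#ifdef _SC_OPEN_MAX
  if (max_fds == -1)
    {
      long scres = sysconf (_SC_OPEN_MAX);
      if (scres >= 0)
        max_fds = scres;
    }
#endif
#ifdef _POSIX_OPEN_MAX
  if (max_fds == -1)
    max_fds = _POSIX_OPEN_MAX;
#endif
#ifdef OPEN_MAX
  if (max_fds == -1)
    max_fds = OPEN_MAX;
#endif
  if (max_fds == -1)
    max_fds = 256;  /* Last-resort default.  */

  assert (max_fds >= 0);
  return max_fds;
}
```

Because the hard limit is consulted first, a very large rlim_max decides the result before any of the smaller fallbacks are ever considered.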

As can be seen from the strace output in https://bugs.debian.org/1079696 , on at least some modern Debian systems, the rlim_max for RLIMIT_NOFILE is 1073741816 (0x3ffffff8 in hex), even though rlim_cur could be just 1024.
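
To compare the two values on a given system, the soft and hard limits can be read with getrlimit. A minimal sketch, where sketch_nofile_limits is a hypothetical helper name:

```c
#include <assert.h>
#include <sys/resource.h>

/* Hypothetical helper: store the soft limit (rlim_cur) and the hard
   limit (rlim_max) for open files.  Returns 0 on success, -1 on error.  */
static int
sketch_nofile_limits (rlim_t *soft, rlim_t *hard)
{
  struct rlimit rl;

  assert (soft && hard);
  if (getrlimit (RLIMIT_NOFILE, &rl))
    return -1;
  *soft = rl.rlim_cur;
  *hard = rl.rlim_max;
  return 0;
}
```

On a system like the one in the Debian report, the hard value would be 1073741816 while the soft value could be as low as 1024.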

Trying to do more than a billion close() calls will make any process appear to hang indefinitely. Even if each close() call takes only 0.000004s (as I've measured on a modern machine), that is still more than an hour spent in this loop.
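
The estimate is plain multiplication and can be checked with a few lines. The helper name sketch_close_loop_seconds is hypothetical; the inputs are the figures quoted in this report.

```c
#include <assert.h>

/* Hypothetical helper: estimated wall time, in seconds, of calling
   close() once for each of N_FDS descriptors.  */
static double
sketch_close_loop_seconds (unsigned long n_fds, double secs_per_close)
{
  assert (secs_per_close >= 0.0);
  return (double) n_fds * secs_per_close;
}
```

With n_fds = 1073741816 and secs_per_close = 0.000004 this gives about 4295 seconds, roughly 71.6 minutes.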

Details

External Link: https://bugs.debian.org/1079696

Revisions and Commits

rE libgpg-error
	rE0f4fe2edf5e5 spawn: Care about closefrom/close call is interrupted.
	rEe3e793302b67 spawn: Use closefrom when available.

Related Objects

Mentioned In: T7385: Release GpgRT 1.52
Mentioned Here: T1778: t-exechelp-posix get_max_fds returns MAX_INT32 rather than something sensible

Event Timeline

dkg created this task. Jan 7 2025, 11:01 PM

Hm, this might also be relevant in GnuPG's codebase in common/exechelp-posix.c, which contains a copy of the same code (licensed differently).

I note that get_max_fds ends with this:

  /* AIX returns INT32_MAX instead of a proper value.  We assume that
     this is always an error and use an arbitrary limit.  */
#ifdef INT32_MAX
  if (max_fds == INT32_MAX)
    max_fds = 256;
#endif

This appears to have been introduced due to T1778, as reported by @aixtools, but it was added as a narrowly targeted workaround, without considering what a plausible runtime upper limit on a close() loop would look like.
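
One way to generalize the INT32_MAX special case would be to clamp any implausibly large value rather than testing for a single magic number. This is only a sketch: the name sketch_clamp_max_fds and the cap of 65536 are arbitrary choices made for illustration and are not taken from libgpg-error or GnuPG.

```c
#include <assert.h>

/* Hypothetical cap on the number of close() calls in a fallback loop.
   The value is an assumption chosen for illustration only.  */
#define SKETCH_FD_LOOP_CAP 65536L

/* Hypothetical helper: bound MAX_FDS so that a close() loop over it
   finishes in a predictable amount of time.  */
static long
sketch_clamp_max_fds (long max_fds)
{
  assert (max_fds >= 0);
  if (max_fds > SKETCH_FD_LOOP_CAP)
    return SKETCH_FD_LOOP_CAP;
  return max_fds;
}
```

The trade-off is that a descriptor numbered above the cap would stay open in the child, so a clamp like this bounds the runtime at the cost of completeness.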

dkg added a project: gnupg. Jan 7 2025, 11:44 PM

Thank you for your report.

For libgpg-error, I pushed a change that uses closefrom, in commit rEe3e793302b67: spawn: Use closefrom when available.
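
The general shape of such a change might look like the sketch below. It is not the code from that commit: sketch_close_from is a hypothetical name, and HAVE_CLOSEFROM is assumed to be defined by the build system (for example by a configure check) on platforms that provide closefrom.

```c
#include <assert.h>
#include <unistd.h>

/* Hypothetical helper: close every descriptor >= LOWFD.  With
   closefrom the cost no longer depends on the descriptor limit;
   without it we fall back to a loop bounded by MAX_FDS.  */
static void
sketch_close_from (int lowfd, long max_fds)
{
  assert (lowfd >= 0);
#ifdef HAVE_CLOSEFROM
  (void) max_fds;  /* Not needed on this path.  */
  closefrom (lowfd);
#else
  for (long fd = lowfd; fd < max_fds; fd++)
    close ((int) fd);  /* Errors such as EBADF are ignored here.  */
#endif
}
```

A child process would typically call something like this after fork and before exec, with lowfd set to one more than the highest descriptor it wants to keep.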

• werner renamed this task from `_gpg_close_all_fds` hangs on modern Linux when `/proc/self/fd` is unavailable; spawning a process without `GPGRT_SPAWN_INHERIT_FILE` takes > 1 hour to _gpg_close_all_fds hangs on newer Linux systems in a simple chroot w/o /proc/self/fd. Jan 8 2025, 8:50 AM

• werner triaged this task as Normal priority.

• werner added a project: Linux.

@gniibe: Please see gpgme/src/posix-io.c where we have this:

          /* First close all fds which will not be inherited.  If we
           * have closefrom(2) we first figure out the highest fd we
           * do not want to close, then call closefrom, and on success
           * use the regular code to close all fds up to the start
           * point of closefrom.  Note that Solaris' and FreeBSD's closefrom do
           * not return errors.  */
#ifdef HAVE_CLOSEFROM
          {
            fd = -1;
            for (i = 0; fd_list[i].fd != -1; i++)
              if (fd_list[i].fd > fd)
                fd = fd_list[i].fd;
            fd++;
#if defined(__sun) || defined(__FreeBSD__) || defined(__GLIBC__)
            closefrom (fd);
            max_fds = fd;
#else /*!__sun */
            while ((i = closefrom (fd)) && errno == EINTR)
              ;
            if (!i || errno == EBADF)
              max_fds = fd;
#endif /*!__sun*/
          }
#endif /*HAVE_CLOSEFROM*/

• werner added a commit: rEe3e793302b67: spawn: Use closefrom when available. Jan 8 2025, 9:10 AM

@werner I read the code in gpgme/src/posix-io.c. I understand two points:

For correctness' sake, a possibly interrupted closefrom should be handled.
We can share code between the closefrom case and the non-closefrom case.

For the first point, I think we also need to handle a possibly interrupted close.

I'm going to improve the code for those points.
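
Handling an interrupted close could be sketched as below. This is an illustration only, not the code from the follow-up commit; sketch_close_retry is a hypothetical name. It assumes a single-threaded child after fork, where no other thread can reuse the descriptor number between attempts, so a retry after the descriptor was already released simply reports EBADF.

```c
#include <assert.h>
#include <errno.h>
#include <unistd.h>

/* Hypothetical helper: close FD, retrying when a signal interrupts
   the call.  Returns 0 once FD is known to be closed, -1 on any
   other error.  */
static int
sketch_close_retry (int fd)
{
  assert (fd >= 0);
  while (close (fd) == -1)
    {
      if (errno == EINTR)
        continue;   /* Interrupted: try again.  */
      if (errno == EBADF)
        return 0;   /* Not open (any more): nothing left to do.  */
      return -1;    /* E.g. EIO: report it instead of looping.  */
    }
  return 0;
}
```

In a multi-threaded process a blind retry would be unsafe, because another thread could have opened a new file under the same descriptor number in the meantime.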

Fixed in: rE0f4fe2edf5e5: spawn: Care about closefrom/close call is interrupted.

• gniibe added a commit: rE0f4fe2edf5e5: spawn: Care about closefrom/close call is interrupted. Jan 14 2025, 7:33 AM

• werner moved this task from Backlog to QA on the gpgrt board. Apr 8 2025, 8:44 AM

• werner mentioned this in T7385: Release GpgRT 1.52. Apr 8 2025, 9:18 AM