GETAUDITLOG fails to close fd
Closed, ResolvedPublic

Description

I think this is what's causing T1493 -- gpgme hanging when doing a
GETAUDITLOG command. This is with gnupg 2.0.21 and gpgme 1.4.3.

Running gpgme's gpgsm/t-verify test under strace -f, so you can see both
processes (output below), it looks like gpgsm receives the data FD from
t-verify, dups it to get a new FD, writes the result to the new FD --
and then closes the original, leaving the new FD open. t-verify hangs
because it's still reading from the data FD and it never gets an EOF.

Looking at the code in sm/server.c cmd_getauditlog, it effectively does
the following:

  out_stream = es_fdopen_nc ( dup (out_fd), "w");
  // (write the output to out_stream)
  es_fclose (out_stream);
  assuan_close_output_fd (ctx);

es_fdopen_nc doesn't close the FD when the stream is closed, so the new
FD that dup returns is never closed.

Removing the dup call appears to fix the problem -- at least, gnupg and
gpgme's test suites both pass.

The salient bit of strace output when t-verify hangs is as follows, where pid
9309 is t-verify, and 9738 is gpgsm:

9309 10:16:06 sendmsg(3, {msg_name(0)=NULL, msg_iov(1)=[{"GETAUDITLOG", 11}],
msg_controllen=0, msg_flags=0}, 0 <unfinished ...>
9738 10:16:06 <... recvmsg resumed> {msg_name(0)=NULL,
msg_iov(1)=[{"GETAUDITLOG", 1002}], msg_controllen=0, msg_flags=0}, 0) = 11
9309 10:16:06 <... sendmsg resumed> ) = 11
9738 10:16:06 recvmsg(4, <unfinished ...>
9309 10:16:06 sendmsg(3, {msg_name(0)=NULL, msg_iov(1)=[{"\n", 1}],
msg_controllen=0, msg_flags=0}, 0 <unfinished ...>
9738 10:16:06 <... recvmsg resumed> {msg_name(0)=NULL, msg_iov(1)=[{"\n",
991}], msg_controllen=0, msg_flags=0}, 0) = 1
9309 10:16:06 <... sendmsg resumed> ) = 1
9738 10:16:06 dup(6 <unfinished ...>
9309 10:16:06 select(6, [4 5], [], NULL, {1, 0} <unfinished ...>
9738 10:16:06 <... dup resumed> ) = 7
9738 10:16:06 write(7, "* Data verification succeeded: Yes\n* Data
available: Yes\n* Signature available: Yes\n* Parsing data
succeeded: Yes\n* (data hash algorithm: SHA1)\n* Signature 0:

Good\n*     (#00/CN=test cert 1,OU=Aegypten Project,O=g10 Code

GmbH,L=D\303\274sseldorf,C=DE)\n* (data hash algorithm: SHA1)\n* (attr
hash algorithm: SHA1)\n* Certificate chain available: Yes\n*
(#00/CN=test cert 1,OU=Aegypten Project,O=g10 Code
GmbH,L=D\303\274sseldorf,C=DE)\n* Certificate chain valid: Yes\n*

Root certificate trustworthy:         Yes\n*     CRL/OCSP check of

certificates: Not enabled\n* Included certificates: 0\n*
Gpg-Agent usable: Yes\n", 715) = 715
9309 10:16:06 <... select resumed> ) = 1 (in [4], left {1, 0})
9738 10:16:06 write(7, NULL, 0 <unfinished ...>
9309 10:16:06 select(5, [4], [], NULL, {0, 0} <unfinished ...>
9738 10:16:06 <... write resumed> ) = 0
9309 10:16:06 <... select resumed> ) = 1 (in [4], left {0, 0})
9738 10:16:06 close(6 <unfinished ...>
9309 10:16:06 read(4, <unfinished ...>
9738 10:16:06 <... close resumed> ) = 0
9309 10:16:06 <... read resumed> "* Data verification succeeded: Yes\n*

Data available:         Yes\n*   Signature available:         Yes\n*   Parsing

data succeeded: Yes\n* (data hash algorithm: SHA1)\n* Signature 0:

Good\n*     (#00/CN=test cert 1,OU=Aegypten Project,O=g10 Code

GmbH,L=D\303\274sseldorf,C=DE)\n* (data hash algorithm: SHA1)\n* (attr
hash algorithm: SHA1)\n* Certificate chain available: Yes\n*
(#00/CN=test cert 1,OU=Aegypten Project,O=g10 Code
GmbH,L=D\303\274sseldorf,C=DE)\n* Certificate chain valid: Yes\n*

Root certificate trustworthy:         Yes\n*     CRL/OCSP check of

certificates: Not enabled\n* Included certificates: 0\n*
Gpg-Agent usable: Yes\n", 4096) = 715
9738 10:16:06 sendmsg(4, {msg_name(0)=NULL, msg_iov(1)=[{"OK", 2}],
msg_controllen=0, msg_flags=0}, 0 <unfinished ...>
9309 10:16:06 select(6, [4 5], [], NULL, {1, 0} <unfinished ...>
9738 10:16:06 <... sendmsg resumed> ) = 2
9309 10:16:06 <... select resumed> ) = 1 (in [5], left {0, 999997})
9738 10:16:06 sendmsg(4, {msg_name(0)=NULL, msg_iov(1)=[{"\n", 1}],
msg_controllen=0, msg_flags=0}, 0 <unfinished ...>
9309 10:16:06 select(6, [5], [], NULL, {0, 0} <unfinished ...>
9738 10:16:06 <... sendmsg resumed> ) = 1
9309 10:16:06 <... select resumed> ) = 1 (in [5], left {0, 0})
9738 10:16:06 recvmsg(4, <unfinished ...>
9309 10:16:06 recvmsg(3, {msg_name(0)=NULL, msg_iov(1)=[{"OK\n", 1002}],
msg_controllen=0, msg_flags=0}, 0) = 3
9309 10:16:06 close(5) = 0
9309 10:16:06 select(5, [4], [], NULL, {1, 0}) = 0 (Timeout)
9309 10:16:07 select(5, [4], [], NULL, {1, 0}) = 0 (Timeout)

With the dup removed, it does this, which looks more sensible -- pid
23939 is t-verify and 24181 is gpgsm:

23939 10:34:36 sendmsg(3, {msg_name(0)=NULL, msg_iov(1)=[{"GETAUDITLOG", 11}],
msg_controllen=0, msg_flags=0}, 0 <unfinished ...>
24181 10:34:36 <... recvmsg resumed> {msg_name(0)=NULL,
msg_iov(1)=[{"GETAUDITLOG", 1002}], msg_controllen=0, msg_flags=0}, 0) = 11
23939 10:34:36 <... sendmsg resumed> ) = 11
24181 10:34:36 recvmsg(4, <unfinished ...>
23939 10:34:36 sendmsg(3, {msg_name(0)=NULL, msg_iov(1)=[{"\n", 1}],
msg_controllen=0, msg_flags=0}, 0 <unfinished ...>
24181 10:34:36 <... recvmsg resumed> {msg_name(0)=NULL, msg_iov(1)=[{"\n",
991}], msg_controllen=0, msg_flags=0}, 0) = 1
23939 10:34:36 <... sendmsg resumed> ) = 1
23939 10:34:36 select(6, [4 5], [], NULL, {1, 0} <unfinished ...>
24181 10:34:36 write(6, "* Data verification succeeded: Yes\n* Data
available: Yes\n* Signature available: Yes\n* Parsing data
succeeded: Yes\n* (data hash algorithm: SHA1)\n* Signature 0:

Bad\n*     (#00/CN=test cert 1,OU=Aegypten Project,O=g10 Code

GmbH,L=D\303\274sseldorf,C=DE)\n* (data hash algorithm: SHA1)\n*
Certificate chain available: No\n* Included certificates:
0\n", 414) = 414
23939 10:34:36 <... select resumed> ) = 1 (in [4], left {0, 999880})
24181 10:34:36 write(6, NULL, 0 <unfinished ...>
23939 10:34:36 select(5, [4], [], NULL, {0, 0} <unfinished ...>
24181 10:34:36 <... write resumed> ) = 0
23939 10:34:36 <... select resumed> ) = 1 (in [4], left {0, 0})
24181 10:34:36 close(6 <unfinished ...>
23939 10:34:36 read(4, <unfinished ...>
24181 10:34:36 <... close resumed> ) = 0
23939 10:34:36 <... read resumed> "* Data verification succeeded: Yes\n*

Data available:         Yes\n*   Signature available:         Yes\n*   Parsing

data succeeded: Yes\n* (data hash algorithm: SHA1)\n* Signature 0:

Bad\n*     (#00/CN=test cert 1,OU=Aegypten Project,O=g10 Code

GmbH,L=D\303\274sseldorf,C=DE)\n* (data hash algorithm: SHA1)\n*
Certificate chain available: No\n* Included certificates:
0\n", 4096) = 414
24181 10:34:36 sendmsg(4, {msg_name(0)=NULL, msg_iov(1)=[{"OK", 2}],
msg_controllen=0, msg_flags=0}, 0 <unfinished ...>
23939 10:34:36 select(6, [4 5], [], NULL, {1, 0} <unfinished ...>
24181 10:34:36 <... sendmsg resumed> ) = 2
23939 10:34:36 <... select resumed> ) = 2 (in [4 5], left {1, 0})
24181 10:34:36 sendmsg(4, {msg_name(0)=NULL, msg_iov(1)=[{"\n", 1}],
msg_controllen=0, msg_flags=0}, 0 <unfinished ...>
23939 10:34:36 select(6, [5], [], NULL, {0, 0} <unfinished ...>
24181 10:34:36 <... sendmsg resumed> ) = 1
23939 10:34:36 <... select resumed> ) = 1 (in [5], left {0, 0})
24181 10:34:36 recvmsg(4, <unfinished ...>
23939 10:34:36 recvmsg(3, {msg_name(0)=NULL, msg_iov(1)=[{"OK\n", 1002}],
msg_controllen=0, msg_flags=0}, 0) = 3
23939 10:34:36 close(5) = 0
23939 10:34:36 select(5, [4], [], NULL, {0, 0}) = 1 (in [4], left {0, 0})
23939 10:34:36 read(4, "", 4096) = 0
23939 10:34:36 close(4) = 0

Details

Version
2.0.21

Related Objects

atsampson set Version to 2.0.21.Sep 5 2013, 12:02 PM
atsampson added projects: gnupg, Bug Report.
atsampson added a subscriber: atsampson.
werner added a subscriber: werner.Sep 18 2013, 8:36 AM

That is a very good analysis. Looking at the history of the code shows:

    2009-11-10  Marcus Brinkmann  <marcus@g10code.de>
    
    	* server.c (cmd_getauditlog): Don't dup FD for es_fdopen_nc as
    	this leaks the FD here.

that is in master. I cherry picked it and pushed it to 2.0. Many thanks.

outer added a subscriber: outer.Sep 24 2013, 2:57 AM

Removing the dup() call also seems to fix the unresolved Mac issue with gpg2
2.0.21 and gpgme 1.4.3 compiled with --enable-fd-passing

werner closed this task as Resolved.
werner claimed this task.

So, we can try to use it again by default. Thanks for testing.