
SIGBUS running `gpg-agent --daemon`
Closed, Resolved (Public)

Description

Getting SIGBUS when running gpg-agent --daemon. The system has malloc debugging
enabled; turning it off "fixes" the problem (so I'm not sure whether it's really
a bug). FreeBSD 9.0-CURRENT r219101.

(gdb) run --daemon
Starting program: /usr/local/bin/gpg-agent --daemon
GPG_AGENT_INFO=/tmp/gpg-P9Qb7d/S.gpg-agent:45051:1; export GPG_AGENT_INFO;

Program received signal SIGBUS, Bus error.
0x0000000800f25f58 in pth_ring_append (r=0x801c13a90, rn=0x652da0) at pth_ring.c:166
166         rn->rn_prev = r->r_hook->rn_prev;
(gdb) bt full
#0  0x0000000800f25f58 in pth_ring_append (r=0x801c13a90, rn=0x652da0) at pth_ring.c:166
No locals.
#1  0x0000000800f2ec1f in pth_mutex_acquire (mutex=0x652da0, tryonly=0, ev_extra=0x0) at pth_sync.c:63
        ev = 0x800659bbd
        ev_key = -1
#2  0x000000000042c1c7 in es_list_iterate (iterator=0x42f970 <do_fflush>) at estream.c:391
        list_obj = 0x1
        ret = 0
#3  0x000000000042fa0d in es_fflush (stream=0x0) at estream.c:2682
        err = 0
#4  0x000000000042c27e in es_deinit () at estream.c:444
No locals.
#5  0x0000000801716364 in __cxa_finalize (dso=0x0) at /data/src/freebsd/base/head/lib/libc/stdlib/atexit.c:195
        phdr_info = {dlpi_addr = 34359738368, dlpi_name = 0x0, dlpi_phdr = 0x7fffffffd5d0, dlpi_phnum = 54840, dlpi_adds = 140737488344608, dlpi_subs = 34366371213, dlpi_tls_modid = 34390049792, dlpi_tls_data = 0x0}
        p = (struct atexit *) 0x80197a9a0
        fn = {fn_type = 1, fn_ptr = {std_func = 0x42c270 <es_deinit>, cxa_func = 0x42c270 <es_deinit>}, fn_arg = 0x0, fn_dso = 0x0}
        n = 64
        has_phdr = 0
#6  0x00000008016c0b67 in exit (status=0) at /data/src/freebsd/base/head/lib/libc/stdlib/exit.c:67
No locals.
#7  0x00000000004087d1 in main (argc=0, argv=0x7fffffffd630) at gpg-agent.c:1200
        infostr = 0x801ce8400 'Z' <repeats 200 times>...
        infostr_ssh_sock = 0x0
        infostr_ssh_pid = 0x0
        fd = 8
        fd_ssh = -1
        pid = 45051
        pargs = {argc = 0x7fffffffd44c, argv = 0x7fffffffd440, flags = 32769, err = 0, r_opt = 0, r_type = 0, r = {ret_int = 0, ret_long = 0, ret_ulong = 0, ret_str = 0x0}, internal = {idx = 2, inarg = 0, stopped = 0, last = 0x7fffffffd8d1 "--daemon", aliases = 0x0, cur_alias = 0x0}}
        orig_argc = 2
        may_coredump = 0
        orig_argv = (char **) 0x7fffffffd620
        configfp = (FILE *) 0x0
        configname = 0x0
        shell = 0x7fffffffd910 "/usr/local/bin/bash"
        configlineno = 0
        parse_debug = 0
        default_config = 0
        greeting = 0
        nogreeting = 0
        pipe_server = 0
        is_daemon = 1
        nodetach = 0
        csh_style = 0
        logfile = 0x0
        debug_wait = 0
        gpgconf_list = 0
        err = 0
        env_file_name = 0x0
        malloc_hooks = {malloc = 0x405f2c <gcry_malloc@plt>, realloc = 0x40594c <gcry_realloc@plt>, free = 0x405cdc <gcry_free@plt>}
        names = {0x43ea2e "DISPLAY", 0x43ea36 "TERM", 0x43ea3b "XAUTHORITY", 0x43ea46 "PINENTRY_USER_DATA", 0x0}

Details

Version
2.0.17

Event Timeline

crsd added projects: gnupg, Bug Report.
crsd added a subscriber: crsd.

What do you mean by malloc debugging? The libgcrypt configure option
--enable-m-guard or some FreeBSD feature? The libgcrypt option does not always
work. If it is a FreeBSD feature, libpth might be the culprit. What system?
If it is a kernel feature will I be able to test it using 8.0 on ia32?

I mean the J flag from malloc(3), which is in effect by default on -CURRENT:

J       Each byte of new memory allocated by malloc(), realloc(), or
        reallocf() will be initialized to 0xa5.  All memory returned by
        free(), realloc(), or reallocf() will be initialized to 0x5a.
        This is intended for debugging and will impact performance
        negatively.

You can try ln -sf J /etc/malloc.conf to get similar behavior on 8.0.
gpg-agent --daemon from 2.0.16 works correctly with this option enabled.
Please let me know if you need any additional information.
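
To illustrate why the J option can turn a latent use-after-free into a visible crash, here is a minimal C sketch. It does not use pth or the real allocator: `struct demo_ring`, `demo_junk_fill` and `demo_stale_hook` are hypothetical names, and the junk fill is only simulated with memset() on a still-live allocation so that the program stays well-defined.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a ring header; NOT the real pth structure.
   The integer field stands in for a pointer such as a list hook. */
struct demo_ring {
    uintptr_t hook;
};

/* Simulates what free() does to a block under the J option: every byte
   is overwritten with 0x5a.  The block is deliberately not released
   here, so reading it afterwards is still valid C. */
static void demo_junk_fill(void *p, size_t n)
{
    memset(p, 0x5a, n);
}

/* Returns the value a stale reader would see in the hook field after
   the block has been "freed" with junk filling in effect. */
static uintptr_t demo_stale_hook(void)
{
    struct demo_ring *r = malloc(sizeof *r);
    uintptr_t seen;

    if (r == NULL)
        return 0;
    r->hook = 0;                    /* a sane value while the ring is live */
    demo_junk_fill(r, sizeof *r);   /* what J-style free() would leave behind */
    seen = r->hook;                 /* every byte is now 0x5a */
    free(r);
    return seen;
}
```

With junk filling, the stale field no longer holds a leftover valid-looking pointer, so code that follows it faults right away. That would be consistent with the crash at `r->r_hook->rn_prev` in the backtrace, though the sketch does not prove it.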

cf. Fabian Keil's mail from today to gnupg-devel.

Quoting my gnupg-devel mail:

I think the problem is caused by this chunk from the
gpgtar backport 4d364ade61952b7:

diff --git a/common/estream.c b/common/estream.c
index 4015905..3ab68b5 100644
--- a/common/estream.c
+++ b/common/estream.c
[...]
@@ -368,15 +453,18 @@ static int
 es_init_do (void)
 {
-#ifdef HAVE_PTH
   static int initialized;
 
   if (!initialized)
     {
+#ifdef HAVE_PTH
       if (!pth_init () && errno != EPERM )
         return -1;
       if (pth_mutex_init (&estream_list_lock))
         initialized = 1;
-    }
+#else
+      initialized = 1;
 #endif
+      atexit (es_deinit);
+    }
   return 0;
 }

In the "gpg-agent --daemon" case, main() calls pth_kill()
after the client has been forked, so when es_deinit() is
called on exit, acquiring the estream_list_lock seems to
cause pth to dereference a pointer located in a memory
region that has previously been free()'d.

The attached patch (against 2.0.17) prevents the crashes by
not locking the list when flushing the content through es_deinit().
I created it based on the assumption that locking isn't necessary in
that situation, is that correct?
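
The attached patch itself is not reproduced in this thread. As a rough sketch of the idea it describes (flush at exit without taking the list lock), something like the following could be imagined. All names (`demo_stream`, `demo_flush_all`, `demo_deinit`, the lock counter) are hypothetical and are not taken from estream.c.

```c
#include <stddef.h>

/* Hypothetical miniature of a stream list; names are illustrative only. */
struct demo_stream {
    int dirty;                    /* non-zero if buffered data is pending */
    struct demo_stream *next;
};

static struct demo_stream *demo_list;   /* head of the stream list */
static int demo_lock_calls;             /* counts lock acquisitions */

static void demo_lock(void)   { demo_lock_calls++; }
static void demo_unlock(void) { }

/* Walk the list and flush every dirty stream.  The lock is taken only
   when the caller asks for it.  Returns the number of streams flushed. */
static int demo_flush_all(int with_lock)
{
    int flushed = 0;
    struct demo_stream *s;

    if (with_lock)
        demo_lock();
    for (s = demo_list; s != NULL; s = s->next) {
        if (s->dirty) {
            s->dirty = 0;
            flushed++;
        }
    }
    if (with_lock)
        demo_unlock();
    return flushed;
}

/* Exit-time path: the threading layer may already be torn down, so the
   list is flushed without touching the lock. */
static int demo_deinit(void)
{
    return demo_flush_all(0);
}
```

Skipping the lock is only safe under the assumption stated in the mail, namely that no other thread can touch the list while the exit handler runs.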

The fix applied adds a pth_kill wrapper and protects all pth calls in estream.c.
Please try ce98524.
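
The general shape of such a fix can be sketched as follows. This is not the code from ce98524: the names (`demo_threads_init`, `demo_threads_kill`, `demo_list_lock`, `demo_flush_at_exit`) and the flag-based guard are assumptions made for illustration, and the real thread-library calls are replaced by a counter.

```c
#include <stdbool.h>

static bool demo_threads_up;   /* true between init and kill */
static int  demo_mutex_ops;    /* counts "real" mutex operations */

/* Called once at startup, where the thread library would be initialized. */
static void demo_threads_init(void)
{
    demo_threads_up = true;
}

/* Wrapper the application calls instead of shutting the thread library
   down directly, so the stream layer learns that its mutex is gone. */
static void demo_threads_kill(void)
{
    demo_threads_up = false;
    /* the real teardown call would go here */
}

/* Guarded lock and unlock: no-ops once the thread library is down. */
static void demo_list_lock(void)
{
    if (demo_threads_up)
        demo_mutex_ops++;
}

static void demo_list_unlock(void)
{
    if (demo_threads_up)
        demo_mutex_ops++;
}

/* Exit-time flush; returns the total number of mutex operations so far. */
static int demo_flush_at_exit(void)
{
    demo_list_lock();
    /* ... flush all streams here ... */
    demo_list_unlock();
    return demo_mutex_ops;
}
```

Compared with dropping the lock only in the exit path, a guard of this kind keeps normal locking intact while the thread library is alive and covers every call site, not just the one that crashed.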

For 2.1 we plan to drop pth and thus we may not need it.

Works for me, thanks.

Looks like gpg-agent.c should #include "estream.h" now, though:
gpg-agent.c:1066: warning: implicit declaration of function 'es_pth_kill'

Thanks.

I will include estream.h.

werner claimed this task.
werner removed a project: Testing.