Asan finding in libgcrypt
Closed, WontfixPublic

Description

Hi Everyone,

I'm performing a Asan run on GnuPG 2.2.15 and all of its dependencies. Everything was built with -fsanitize=address -fno-omit-frame-pointer. For Autotools projects the two flags get placed in CFLAGS, CXXFLAGS and LDFLAGS.

It looks like Asan is producing a finding for libgcrypt.

=================================================================
==13505==ERROR: LeakSanitizer: detected memory leaks

Direct leak of 24 byte(s) in 1 object(s) allocated from:
    #0 0x7fc86c8b3c08 in __interceptor_malloc (/lib64/libasan.so.5+0xefc08)
    #1 0x7fc86c5a63cb in do_malloc /build/libgcrypt-1.8.4/src/global.c:918
    #2 0x40410f in _GLOBAL__sub_I_00099_1_mpitests.c (/build/libgcrypt-1.8.4/tests/mpitests+0x40410f)

Indirect leak of 8 byte(s) in 1 object(s) allocated from:
    #0 0x7fc86c8b3c08 in __interceptor_malloc (/lib64/libasan.so.5+0xefc08)
    #1 0x7fc86c5a63cb in do_malloc /build/libgcrypt-1.8.4/src/global.c:918
    #2 0x7fff1c156287  ([stack]+0x1e287)

SUMMARY: AddressSanitizer: 32 byte(s) leaked in 2 allocation(s).
FAIL: mpitests

This finding kind of sucks. It probably does not matter much for a desktop running GnuPG because the memory gets cleaned up when GnuPG exits.

But it really sucks for programs in managed languages like C# and Java. C# and Java will load and unload the libgcrypt shared object multiple times during the life of the program. The memory leaks will grow unbounded until all memory is exhausted. This issue causes the OpenJDK folks a lot of problems. (Just ask them about the OpenSSL memory leaks).

I did not look at the code in detail. For triage... For compilers and runtime environments that support it, use __attribute__((destructor)) to clear the leak.

JW created this task.May 10 2019, 11:43 PM
JW created this object in space S1 Public.
JW updated the task description. (Show Details)
JW added a comment.EditedMay 11 2019, 7:24 AM

Here's a couple of awful hacks that get me through make check. Feel free to restate how awful they are; I know it is a bad thing to do.

I think gcry_mpi_set_flag(x, GCRYMPI_FLAG_CONST) is dangerous/toxic for some programs. User programs, especially those in managed languages, are going to have to understand what they are getting into due to the cumulative effects of loading, unloading and the memory leaks.

--- mpi/mpiutil.c
+++ mpi/mpiutil.c
@@ -32,6 +32,21 @@
 static gcry_mpi_t constants[MPI_NUMBER_OF_CONSTANTS];
 
 
+/* Best effort attempt to cleanup constants array. */
+#if defined(__GNUC__)
+__attribute__((destructor))
+static void
+mpi_array_cleanup()
+{
+  int idx;
+  for (idx=0; idx < MPI_NUMBER_OF_CONSTANTS; idx++)
+    {
+      constants[idx]->flags &= ~32U;
+      _gcry_mpi_free(constants[idx]);
+    }
+}
+#endif
+
 
 const char *
 _gcry_mpi_get_hw_config (void)
--- tests/mpitests.c
+++ tests/mpitests.c
@@ -35,6 +35,17 @@
 #define PGM "mpitests"
 #include "t-common.h"
 
+/* From mpi.h. The shadow struct is valid for libgcrypt 1.8.4.
+   The shadow struct may not hold if gcry_mpi_t changes in
+   future versions of the library. */
+struct hack_gcry_mpi
+{
+  int alloced;
+  int nlimbs;
+  int sign;
+  unsigned int flags;
+  void *d;
+};
 
 /* Set up some test patterns */
 
@@ -149,6 +158,13 @@
   /* Due to the the constant flag the release below should be a NOP
      and will leak memory.  */
   gcry_mpi_release (one);
+
+  /* Hack... free the memory to avoid warnings and errors from tooling.
+     This compliments the release performed on 'one' above. If we avoid
+     the double-free then GCRYMPI_FLAG_CONST is working as expected. */
+  ((struct hack_gcry_mpi*)one)->flags &= ~32U;
+  gcry_mpi_release (one);
+
   return 1;
}

Maybe cleaner option for mpi/mpiutil.c would be to statically allocate the constants

diff --git a/mpi/mpiutil.c b/mpi/mpiutil.c
index 9dde37fb4..fabb55a57 100644
--- a/mpi/mpiutil.c
+++ b/mpi/mpiutil.c
@@ -43,8 +43,21 @@
 #endif
 
 
-/* Constants allocated right away at startup.  */
-static gcry_mpi_t constants[MPI_NUMBER_OF_CONSTANTS];
+/* Fixed constants allocated statically.  */
+static mpi_limb_t constant_limbs[MPI_NUMBER_OF_CONSTANTS] =
+{
+  0, 1, 2, 3, 4, 8
+};
+
+static struct gcry_mpi constants[MPI_NUMBER_OF_CONSTANTS] =
+{
+  /* [MPI_C_ZERO]  = */ { 1, 0, 0, (16 | 32), &constant_limbs[0] },
+  /* [MPI_C_ONE]   = */ { 1, 1, 0, (16 | 32), &constant_limbs[1] },
+  /* [MPI_C_TWO]   = */ { 1, 1, 0, (16 | 32), &constant_limbs[2] },
+  /* [MPI_C_THREE] = */ { 1, 1, 0, (16 | 32), &constant_limbs[3] },
+  /* [MPI_C_FOUR]  = */ { 1, 1, 0, (16 | 32), &constant_limbs[4] },
+  /* [MPI_C_EIGHT] = */ { 1, 1, 0, (16 | 32), &constant_limbs[5] },
+};
 
 
 
@@ -60,25 +73,6 @@ _gcry_mpi_get_hw_config (void)
 gcry_err_code_t
 _gcry_mpi_init (void)
 {
-  int idx;
-  unsigned long value;
-
-  for (idx=0; idx < MPI_NUMBER_OF_CONSTANTS; idx++)
-    {
-      switch (idx)
-        {
-        case MPI_C_ZERO:  value = 0; break;
-        case MPI_C_ONE:   value = 1; break;
-        case MPI_C_TWO:   value = 2; break;
-        case MPI_C_THREE: value = 3; break;
-        case MPI_C_FOUR:  value = 4; break;
-        case MPI_C_EIGHT: value = 8; break;
-        default: log_bug ("invalid mpi_const selector %d\n", idx);
-        }
-      constants[idx] = mpi_alloc_set_ui (value);
-      constants[idx]->flags = (16|32);
-    }
-
   return 0;
 }
 
@@ -756,7 +750,5 @@ _gcry_mpi_const (enum gcry_mpi_constants no)
 {
   if ((int)no < 0 || no > MPI_NUMBER_OF_CONSTANTS)
     log_bug("invalid mpi_const selector %d\n", no);
-  if (!constants[no])
-    log_bug("MPI subsystem not initialized\n");
-  return constants[no];
+  return &constants[no];
 }
JW added a comment.EditedMay 11 2019, 9:14 PM

Maybe cleaner option for mpi/mpiutil.c would be to statically allocate the constants

+1. Static resources don't require cleanup.

The mpi changes tested good, and I did not appear to need the awful hack in tests/mpitests.c after switching to it.

I'm still seeing a few odd outputs from make check, but I have not investigated them yet.

JW added a comment.EditedMay 11 2019, 10:50 PM

I'm still seeing a few odd outputs from make check, but I have not investigated them yet.

Here are the next two failures I am seeing while testing libgrcypt. It appears to be related to GCRYCTL_INIT_SECMEM.

Asan is doing some strange things when observing the rng...

JW added a comment.EditedMay 12 2019, 12:44 AM

Here are the next two failures I am seeing while testing libgrcypt. It appears to be related to GCRYCTL_INIT_SECMEM.

OK, so it looks like t-secmem.c : 176 does this:

xgcry_control (GCRYCTL_INIT_SECMEM, pool_size, 0);

xgcry_control is defined in t-common.h:

#define xgcry_control(cmd...)                                   \
  do {                                                          \
    gcry_error_t err__ = gcry_control (cmd);                    \
    if (err__)                                                  \
      die ("line %d: gcry_control (%s) failed: %s",             \
           __LINE__, #cmd, gcry_strerror (err__));              \
  } while (0)

The second and third arguments passed to xgcry_control seem to be lost when calling gcry_control.

How do you recommend clearing this issue?

JW added a comment.EditedMay 12 2019, 7:16 AM

The second and third arguments passed to xgcry_control seem to be lost when calling gcry_control.

The problem is definitely the xgcry_control macro. If I am parsing the Stack Overflow questions and answers correctly, a define cannot forward the varargs like it is trying to do. In fact a vararg function cannot forward to another vararg function.

How do you recommend clearing this issue?

This cleared it for me. After gutting xgcry_control and defining it to gcry_control, all the self tests pass.

--- tests/t-common.h
+++ tests/t-common.h
@@ -151,13 +151,7 @@


 /* Convenience macro for initializing gcrypt with error checking.  */
-#define xgcry_control(cmd...)                                   \
-  do {                                                          \
-    gcry_error_t err__ = gcry_control (cmd);                    \
-    if (err__)                                                  \
-      die ("line %d: gcry_control (%s) failed: %s",             \
-           __LINE__, #cmd, gcry_strerror (err__));              \
-  } while (0)
+#define xgcry_control gcry_control


 /* Split a string into colon delimited fields A pointer to each field

I'm still seeing the unusual output from the rng, though. But it is not causing a self test failure or memory error.

jukivili added a comment.EditedMay 12 2019, 6:45 PM

That type of variadic macro is GCC extension, see https://gcc.gnu.org/onlinedocs/cpp/Variadic-Macros.html

It needs to be changed to use more standard C way for passing arguments (doubly nested parentheses maybe, __VA_ARGS__ requires newer C standard than what libgcrypt expects).

werner triaged this task as Low priority.May 13 2019, 4:01 PM
werner added a subscriber: werner.

I have not yet looked at the details but I do not consider one-time allocation a problem. If you want to silence ASAN it is possible to use gpgrt_annotate_leaked_object( foo). Dynamic loading of Libgcrypt is anyway not supported; those who do that are on their own.

JW added a comment.EditedMay 13 2019, 10:17 PM
Dynamic loading of Libgcrypt is anyway not supported; those who do that are on their own.

I was not aware libgcrypt only provided an archive. But I don't think it materially affects the issue. Providing only an archive merely moves the problem around. The underlying issue still exists.

Also, wrapper shared objects are common on platforms like Android. One would need to provide a wrapper so an Android app could use libgcrypt. There's nothing special about libgcrypt here. The same would apply to services provided by Gnulib and other libraries that only provide archives.

JW added a comment.EditedMay 14 2019, 1:45 AM

I was talking to Thomas Dickey, who maintains Ncurses. Ncurses had a leak and he offered a config option to remove it. Ncurses config responds to --disable-leaks.

Perhaps GnuPG can offer a similar option for users who wish to run without resource leaks.

I've prepared patch for statically defining mpiutil contants, but I can leave it out and not push to master.

I would prefer not to fix that. I did some experiments on replacing all the runtime parsed ECC constants by static data. Adding the other constants will then be simple.

@JW: Libgcrypt can be used by design a shared libary or a DLL. But at least on Unix shared libraries are a hack and they are not proper objects of the OS (DLL are much better in this regard). dlopen will likely work but shutting down that shared library may leave the calling process in an undefined state (e.g. callbacks, threads and mutexes). It is just a bad idea to do that (even though there are function calls in Libgcrypt to help with that). Better don't do that.

werner closed this task as Wontfix.Thu, Jan 7, 11:16 AM
werner claimed this task.

For security and auditing reasons a Libgcrypt SO may not be "unloaded".