SIGILL in prepare_macpads() at md.c:681
Closed, ResolvedPublic

Description

Compiled libgcrypt 1.7.8 with bleading edge clang-5 ( 5.0.0 (trunk 305735)) and UBSan on Ubuntu 16.04 x64. Used the following flags:

-O2 -fno-omit-frame-pointer -g -fsanitize=address -fsanitize=undefined -fsanitize=integer -fsanitize-coverage=trace-pc-guard -fno-sanitize-recover=undefined -fsanitize=float-divide-by-zero -fsanitize=float-cast-overflow -fsanitize-undefined-trap-on-error -fno-sanitize-recover=all

Ran the following test:

libgcrypt-1.7.8/tests/dsa-rfc6979
Illegal instruction

Program received signal SIGILL, Illegal instruction.
0x00000000005415fb in prepare_macpads (a=<optimized out>, key=<optimized out>, keylen=<optimized out>) at md.c:681
681       if (!a->ctx->list)
(gdb) bt
#0  0x00000000005415fb in prepare_macpads (a=<optimized out>, key=<optimized out>, keylen=<optimized out>) at md.c:681
#1  _gcry_md_setkey (hd=0x616000000680, key=0x603000000490, keylen=20) at md.c:797
#2  0x000000000068f636 in _gcry_dsa_gen_rfc6979_k (r_k=0x7fffffffd6c0, dsa_q=0x603000000250, dsa_x=0x6030000000a0, h1=0xffffffffa94 "", hlen=20, halgo=2, extraloops=2147448832) at dsa-common.c:234
#3  0x000000000051c97f in sign (r=0x6030000003a0, s=0x6030000003d0, input=0x6030000001c0, skey=0x7fffffffd8a0, flags=-10072, hashalgo=2) at dsa.c:627
#4  0x000000000051adbc in dsa_sign (r_sig=<optimized out>, s_data=<optimized out>, keyparms=<optimized out>) at dsa.c:1061
#5  0x0000000000558b83 in _gcry_pk_sign (r_sig=0x7fffffffdb40, s_hash=0x60b000000040, s_skey=<optimized out>) at pubkey.c:430
#6  0x00000000004e849e in gcry_pk_sign (result=0x7fffffffdb40, data=0x60b000000040, skey=0x619000000080) at visibility.c:996
#7  0x00000000004e4a04 in check_dsa_rfc6979 () at dsa-rfc6979.c:984
#8  main (argc=<optimized out>, argv=<optimized out>) at dsa-rfc6979.c:1027

libgcrypt-1.7.8/tests/hmac also crashes in a similar fashion:

Program received signal SIGILL, Illegal instruction.
0x00000000005413eb in prepare_macpads (a=<optimized out>, key=<optimized out>, keylen=<optimized out>) at md.c:681
681       if (!a->ctx->list)
(gdb) bt
#0  0x00000000005413eb in prepare_macpads (a=<optimized out>, key=<optimized out>, keylen=<optimized out>) at md.c:681
#1  _gcry_md_setkey (hd=0x619000000080, key=0x7fffffffdb80, keylen=64) at md.c:797
#2  0x00000000004e8dfe in gcry_md_setkey (hd=0x619000000080, key=0x7fffffffdb80, keylen=64) at visibility.c:1275
#3  0x00000000004e5484 in check_one_mac (algo=2, key=<optimized out>, keylen=<optimized out>, data=<optimized out>, datalen=9, expect=<optimized out>) at hmac.c:84
#4  0x00000000004e4875 in check_hmac () at hmac.c:121
#5  main (argc=<optimized out>, argv=<optimized out>) at hmac.c:225

Details

geeknik created this task.Jul 4 2017, 3:11 AM
gniibe triaged this task as Low priority.Jul 4 2017, 6:27 AM
gniibe claimed this task.
gniibe added a subscriber: gniibe.

I think that the problem is in your usage with your tool. Please have a look at md_open function in cipher/md.c.
This bug is not the one in libgcrypt, but in the compiler.

To help the bug tracking of the compiler, I keep this bug open for a while.

You can check sizeof (PROPERLY_ALIGNED_TYPE) with the compiler.
You can also check the code generated by "disas" command in GDB.

gniibe added a comment.EditedJul 5 2017, 4:07 AM

I can replicate the issue on my system.
It is not the line 681, actually.

When you do step execution, it is the line 745 of memcpy which causes the SIGILL.
After that execution, it confused as if it's the line 681.

The reason why it causes SIGILL is that, the compiler checks the size of r->context.c object and it intentionally emits the
instruction of "ud2" to cause SIGILL when the memcpy argument of size exceeds the size of the object.

The size of the object is 1 from the compiler's view, and it causes SIGILL.

In the code in question, we allocate enough space for memcpy, and checking the size of object of r->context.c is completely irrelevant.
From the developer's view of libgcrypt, it is up to the compiler user to disable this irrelevant check.

marcus added a subscriber: marcus.Jul 5 2017, 2:52 PM

Maybe casting to a void* helps to disable the check in the compiler.

marcus added a comment.Jul 5 2017, 2:56 PM

At a meta level, I really think that writing more conservative code that enables compilers to do a better job checking for safety is a good idea. The tricks we do with structs are premature optimization from a time when compilers were dumb as a doornail.

werner added a subscriber: werner.Jul 5 2017, 2:59 PM

Sorry, this is a standard C feature and the only way to have dynamic sized arrays. CLANG simply does not get this pattern right. Grep for pgut001's very comments on such ill behaving compilers (including gcc).

marcus added a comment.Jul 5 2017, 3:05 PM

This is a standard dynamic sized array:

my_type_t *my_array = malloc(my_dynamic_array_size * sizeof(my_type_t));

I stand by my opinion. Slamming compiler authors or referring to external authorities is not going to convince me otherwise, because I understand the value that the additional checks provide.

werner added a comment.Jul 5 2017, 3:23 PM

With an integer overflow.

jukivili added a subscriber: jukivili.EditedJul 6 2017, 1:22 PM

I did some experimenting and clang SIGILL does not trigger with commonly used, but non-conforming, variable-length object with "struct hack", as below:

struct test_s {
  int a;
  char b[1];
};

Even if the flexible length array is inside structure-in-structure, clang does not generate the invalid instruction:

struct test_s {
  int a;
  struct {
   char b[1];
  } ctx;
};

However, when using union with larger aligned object, such as used for gcry_cipher_handle.context, I get the SIGILL issue reproduced:

struct test_s {
  int a;
  union {
    char b[1];
    char bar[16] __attribute__ ((aligned (16)));
  };
};

So maybe SIGILL is not generated because accessing memory beyond structure but for accessing some padding area within structure/union.

werner closed this task as Resolved.Sep 21 2017, 3:39 PM

Closing due to compiler error.