gcry_pk_genkey() segfaults for ecdsa 384
Closed, Resolved, Public

Description

The torture_algorithms testcase of libssh.org segfaults with libgcrypt-1.8.1-3.fc27 and ecdsa keys.

The 256-bit ECDH test works, the 384-bit test segfaults, and the 521-bit test hangs.

Reproducer:
Get libssh.org
Build with gcrypt and client tests
Run: ctest -V -R torture_algorithms

Here is the backtrace:

Program received signal SIGSEGV, Segmentation fault.
0x00007ffff6633d5c in jent_memaccess (ec=ec@entry=0x7ffff7fed058, loop_cnt=loop_cnt@entry=0) at ./jitterentropy-base.c:274
274                     *tmpval = (*tmpval + 1) & 0xff;
(gdb) bt
#0  0x00007ffff6633d5c in jent_memaccess (ec=ec@entry=0x7ffff7fed058, loop_cnt=loop_cnt@entry=0) at ./jitterentropy-base.c:274
#1  0x00007ffff6633e8d in jent_measure_jitter (ec=ec@entry=0x7ffff7fed058) at /jitterentropy-base.c:341
#2  0x00007ffff663404e in jent_gen_entropy (ec=ec@entry=0x7ffff7fed058) at ./jitterentropy-base.c:453
#3  0x00007ffff66341bc in jent_read_entropy (ec=0x7ffff7fed058, data=data@entry=0x7fffffffacc0 "", len=len@entry=18) at ./jitterentropy-base.c:541
#4  0x00007ffff66347d1 in _gcry_rndjent_poll (add=add@entry=0x7ffff662fca0 <add_randomness>, origin=origin@entry=RANDOM_ORIGIN_EXTRAPOLL, length=length@entry=18) at ./rndjent.c:294
#5  0x00007ffff66351da in _gcry_rndlinux_gather_random (add=0x7ffff662fca0 <add_randomness>, origin=RANDOM_ORIGIN_EXTRAPOLL, length=36, level=2) at rndlinux.c:180
#6  0x00007ffff662f8b0 in read_random_source (origin=origin@entry=RANDOM_ORIGIN_EXTRAPOLL, length=length@entry=48, level=level@entry=2) at random-csprng.c:1299
#7  0x00007ffff663091a in read_pool (level=2, length=<optimized out>, buffer=0x7ffff7fed050 "") at random-csprng.c:996
#8  _gcry_rngcsprng_randomize (buffer=<optimized out>, length=<optimized out>, level=GCRY_VERY_STRONG_RANDOM) at random-csprng.c:542
#9  0x00007ffff662f520 in _gcry_random_bytes_secure (nbytes=nbytes@entry=48, level=level@entry=GCRY_VERY_STRONG_RANDOM) at random.c:405
#10 0x00007ffff65934e3 in _gcry_dsa_gen_k (q=0x6a1ff0, security_level=security_level@entry=2) at dsa-common.c:57
#11 0x00007ffff66061bc in nist_generate_key (sk=sk@entry=0x7fffffffb3c0, E=E@entry=0x7fffffffb370, ctx=ctx@entry=0x6946c0, flags=0, nbits=384, r_x=0x7fffffffb348, r_y=0x7fffffffb350) at ecc.c:177
#12 0x00007ffff6606a5a in ecc_generate (genparms=<optimized out>, r_skey=0x7fffffffb4f8) at ecc.c:602
#13 0x00007ffff6588f1f in _gcry_pk_genkey (r_key=r_key@entry=0x7fffffffb4f8, s_parms=s_parms@entry=0x6a5440) at pubkey.c:578
#14 0x00007ffff6574c50 in gcry_pk_genkey (r_key=0x7fffffffb4f8, s_parms=0x6a5440) at visibility.c:1029
#15 0x000000000043ee8c in ssh_client_ecdh_init (session=0x695410) at /home/asn/workspace/projects/libssh/src/ecdh_gcrypt.c:83
#16 0x00000000004161aa in dh_handshake (session=0x695410) at /home/asn/workspace/projects/libssh/src/client.c:265
#17 0x000000000041679b in ssh_client_connection_callback (session=0x695410) at /home/asn/workspace/projects/libssh/src/client.c:474
#18 0x000000000041abf1 in ssh_packet_kexinit (session=0x695410, type=20 '\024', packet=0x69af90, user=0x695410) at /home/asn/workspace/projects/libssh/src/kex.c:523
#19 0x00000000004263e3 in ssh_packet_process (session=0x695410, type=20 '\024') at /home/asn/workspace/projects/libssh/src/packet.c:451
#20 0x0000000000425f06 in ssh_packet_socket_callback (data=0x6a6350, receivedlen=1192, user=0x695410) at /home/asn/workspace/projects/libssh/src/packet.c:332
#21 0x000000000042f187 in ssh_socket_pollcallback (p=0x6a0200, fd=4, revents=1, v_s=0x69f1b0) at /home/asn/workspace/projects/libssh/src/socket.c:298
#22 0x00000000004586aa in ssh_poll_ctx_dopoll (ctx=0x69c110, timeout=9998) at /home/asn/workspace/projects/libssh/src/poll.c:632
#23 0x000000000042e4ee in ssh_handle_packets (session=0x695410, timeout=9998) at /home/asn/workspace/projects/libssh/src/session.c:641
#24 0x000000000042e5c1 in ssh_handle_packets_termination (session=0x695410, timeout=10000, fct=0x4168cf <ssh_connect_termination>, user=0x695410) at /home/asn/workspace/projects/libssh/src/session.c:703
#25 0x0000000000416cf6 in ssh_connect (session=0x695410) at /home/asn/workspace/projects/libssh/src/client.c:611
#26 0x000000000040e0cf in test_algorithm (session=0x695410, kex=0x45bab1 "ecdh-sha2-nistp384", cipher=0x0, hmac=0x0) at /home/asn/workspace/projects/libssh/tests/client/torture_algorithms.c:110
#27 0x000000000040e975 in torture_algorithms_ecdh_sha2_nistp384 (state=0x692150) at /home/asn/workspace/projects/libssh/tests/client/torture_algorithms.c:355
#28 0x00007ffff71a4ae9 in cmocka_run_one_test_or_fixture () from /lib64/libcmocka.so.0
#29 0x00007ffff71a53d1 in _cmocka_run_group_tests () from /lib64/libcmocka.so.0
#30 0x000000000040ea52 in torture_run_tests () at /home/asn/workspace/projects/libssh/tests/client/torture_algorithms.c:471
#31 0x0000000000410558 in main (argc=1, argv=0x7fffffffd028) at /home/asn/workspace/projects/libssh/tests/torture.c:812
asn created this task. Jan 11 2018, 11:42 AM
werner added a subscriber: werner. Jan 11 2018, 12:26 PM

Thanks for the report. I have a few questions, though:
Which version of libgpg-error are you using?
What are the changes Fedora made to libgcrypt (and libgpg-error)?
Which CPU, what compile options and which compiler version?
Can you repeat this with a stock libgcrypt and libgpg-error?

asn added a comment. Jan 11 2018, 12:33 PM

libgpg-error is version 1.27: https://src.fedoraproject.org/rpms/libgpg-error/tree/f27
You can find the patches applied to libgcrypt here: https://src.fedoraproject.org/rpms/libgcrypt/tree/f27

It has been compiled using gcc 7.2.1 with:

CFLAGS='-O2 -g -pipe -Wall -Werror=format-security -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector-strong --param=ssp-buffer-size=4 -grecord-gcc-switches -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -m64 -mtune=generic'

./configure --build=x86_64-redhat-linux-gnu --host=x86_64-redhat-linux-gnu --program-prefix= --disable-dependency-tracking --prefix=/usr --exec-prefix=/usr --bindir=/usr/bin --sbindir=/usr/sbin --sysconfdir=/etc --datadir=/usr/share --includedir=/usr/include --libdir=/usr/lib64 --libexecdir=/usr/libexec --localstatedir=/var --sharedstatedir=/var/lib --mandir=/usr/share/man --infodir=/usr/share/info --disable-static --enable-noexecstack --enable-hmac-binary-check '--enable-pubkey-ciphers=dsa elgamal rsa ecc' --disable-O-flag-munging

https://kojipkgs.fedoraproject.org//packages/libgcrypt/1.8.1/3.fc27/data/logs/x86_64/build.log

CPU: Intel Core i7-5600U

asn added a comment. Jan 11 2018, 12:37 PM

The issue also occurs on openSUSE Tumbleweed:

libgcrypt 1.8.2
libgpg-error 1.27

CPU: Intel Core i7-4960X

werner triaged this task as High priority. Jan 11 2018, 1:55 PM

Okay, so on SUSE we have the same problem without the somewhat intrusive changes of Fedora. The interesting thing is that the part of the code that segfaults is the same as the one used in Linux.

[Urgs. My company mail address is used as a file name in the Fedora repo; maybe that is the reason why I get more and more spam on that address. Why not use wk@gnupg.org? Puzzled.]

asn added a comment. Jan 11 2018, 2:56 PM

The segfault from an openSUSE machine looks the same:

(gdb) bt
#0  0x00007ffff68c0b90 in jent_memaccess (ec=0x7ffff7feb058, loop_cnt=0) at ./jitterentropy-base.c:274
#1  0x00007ffff68c0c96 in jent_measure_jitter (ec=0x7ffff7feb058) at ./jitterentropy-base.c:341
#2  0x00007ffff68c0e1a in jent_gen_entropy (ec=0x7ffff7feb058) at ./jitterentropy-base.c:453
#3  0x00007ffff68c0f53 in jent_read_entropy (ec=0x7ffff7feb058, data=0x7fffffffb040 "", len=18) at ./jitterentropy-base.c:541
#4  0x00007ffff68c14f9 in _gcry_rndjent_poll (add=0x7ffff68bcb40 <add_randomness>, origin=RANDOM_ORIGIN_EXTRAPOLL, length=18) at ./rndjent.c:294
#5  0x00007ffff68c1f6b in _gcry_rndlinux_gather_random (add=0x7ffff68bcb40 <add_randomness>, origin=RANDOM_ORIGIN_EXTRAPOLL, length=36, level=2) at rndlinux.c:188
#6  0x00007ffff68bc7c0 in read_random_source (origin=origin@entry=RANDOM_ORIGIN_EXTRAPOLL, length=length@entry=48, level=level@entry=2) at random-csprng.c:1279
#7  0x00007ffff68bd7ba in read_pool (level=2, length=<optimized out>, buffer=0x7ffff7feb050 "") at random-csprng.c:992
#8  _gcry_rngcsprng_randomize (buffer=<optimized out>, length=<optimized out>, level=GCRY_VERY_STRONG_RANDOM) at random-csprng.c:538
#9  0x00007ffff68bc430 in _gcry_random_bytes_secure (nbytes=nbytes@entry=48, level=level@entry=GCRY_VERY_STRONG_RANDOM) at random.c:405
#10 0x00007ffff68204a3 in _gcry_dsa_gen_k (q=0x4b4ce0, security_level=security_level@entry=2) at dsa-common.c:57
#11 0x00007ffff68930ec in nist_generate_key (sk=sk@entry=0x7fffffffb7d0, E=E@entry=0x7fffffffb780, ctx=ctx@entry=0x4b9530, flags=0, nbits=384, r_x=0x7fffffffb758, 
    r_y=0x7fffffffb760) at ecc.c:177
#12 0x00007ffff689398a in ecc_generate (genparms=<optimized out>, r_skey=0x7fffffffb908) at ecc.c:602
#13 0x00007ffff6815eef in _gcry_pk_genkey (r_key=r_key@entry=0x7fffffffb908, s_parms=s_parms@entry=0x4baa90) at pubkey.c:578
#14 0x00007ffff6801d50 in gcry_pk_genkey (r_key=0x7fffffffb908, s_parms=0x4baa90) at visibility.c:1029
#15 0x000000000043f1b4 in ssh_client_ecdh_init ()
gniibe added a subscriber: gniibe. Jan 15 2018, 10:36 AM

It is reproducible on my Debian (stretch) system. I'm going to minimize the test case.

I already talked with the upstream author, and we identified a possible problem caused by unlocked use of the core function. The cause of this is

unsigned char *tmpval = ec->mem + ec->memlocation;
*tmpval = (*tmpval + 1) & 0xff;
ec->memlocation = ec->memlocation + ec->memblocksize - 1;
ec->memlocation = ec->memlocation % wrap;

which is non-atomic and can thus lead to an out-of-bounds dereference. The EC object may only be used by one thread at a time.
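To illustrate the "one thread at a time" constraint, here is a minimal sketch that serializes the quoted update with a mutex. The struct is a hypothetical, cut-down stand-in for the entropy collector, the `lock` field and the function name are additions made for this example, and the `wrap` computation (block count times block size) is an assumption about how the buffer bound is derived. It is not the upstream fix.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical, cut-down stand-in for the entropy collector state. */
struct collector {
    unsigned char *mem;        /* scratch buffer of memblocks * memblocksize bytes */
    unsigned int memlocation;  /* current offset into mem */
    unsigned int memblocks;    /* number of blocks in mem */
    unsigned int memblocksize; /* size of one block in bytes */
    pthread_mutex_t lock;      /* added for this sketch only */
};

/* Run the quoted read-modify-write sequence `loops` times while holding
 * the lock, so that no other thread can observe or change memlocation
 * between the pointer computation and the modulo reduction. */
static void collector_memaccess_locked(struct collector *ec, unsigned int loops)
{
    /* Assumption: the valid range is the whole buffer. */
    unsigned int wrap = ec->memblocks * ec->memblocksize;
    unsigned int i;

    pthread_mutex_lock(&ec->lock);
    for (i = 0; i < loops; i++) {
        unsigned char *tmpval = ec->mem + ec->memlocation;
        *tmpval = (unsigned char)((*tmpval + 1) & 0xff);
        ec->memlocation = ec->memlocation + ec->memblocksize - 1;
        ec->memlocation = ec->memlocation % wrap;
    }
    /* Invariant: the offset always stays inside the buffer. */
    assert(ec->memlocation < wrap);
    pthread_mutex_unlock(&ec->lock);
}
```

A caller would initialize `lock` with `PTHREAD_MUTEX_INITIALIZER` (or `pthread_mutex_init`) and route every access to the collector through this function.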

aa added a subscriber: aa. Jan 16 2018, 1:14 AM
This comment was removed by gniibe.
gniibe added a comment. Edited Apr 10 2018, 3:08 AM

I checked this report again.
The test is single-threaded, IIUC.

The problem that causes the SEGV is this: the test suite (which has multiple tests) calls gcry_control(GCRYCTL_TERM_SECMEM) to finalize after each test.
The random generator uses secure memory, which is freed by TERM_SECMEM.
That memory is then accessed by another test.
We need finalization routines for the random generator.
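The failure mode and the proposed remedy can be modelled in a few lines of plain C. This is only a sketch: `pool_init`, `pool_term`, `rng_close` and `rng_poll` are hypothetical names invented for the example and are not libgcrypt functions. The point is that the generator caches a pointer into the pool, so the pool teardown has to reset that pointer (the finalizer) before the memory is released; otherwise the next test would dereference freed memory.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-ins; none of these names are libgcrypt APIs. */
static unsigned char *pool;       /* models the secure memory pool */
static unsigned char *rng_state;  /* models generator state living in the pool */

/* Allocate a zeroed pool; returns 0 on success, -1 on failure. */
static int pool_init(size_t n)
{
    pool = calloc(1, n);
    return pool ? 0 : -1;
}

/* Finalizer for the generator: forget the cached pointer. */
static void rng_close(void)
{
    rng_state = NULL;
}

/* Tear the pool down. Calling the finalizer first means a later
 * rng_poll() cannot touch the freed memory. */
static void pool_term(void)
{
    rng_close();
    free(pool);
    pool = NULL;
}

/* Use the generator: lazily take state from the pool, then bump a counter.
 * Returns the counter, or -1 if no pool is available. */
static int rng_poll(void)
{
    if (!pool)
        return -1;
    if (!rng_state)
        rng_state = pool;
    rng_state[0]++;
    return rng_state[0];
}
```

With `rng_close()` removed from `pool_term()`, a second init/poll cycle would write through the stale `rng_state` pointer, which is the pattern described in the comment above.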

gniibe claimed this task. Apr 16 2018, 10:24 AM
gniibe added a comment. May 7 2018, 1:52 AM

It assumes a change of libssh like:

gniibe added a comment. May 7 2018, 1:53 AM

The patch D461 makes gcry_control(GCRYCTL_CLOSE_RANDOM_DEVICE) free the allocated secure memory.

gniibe added a comment. May 7 2018, 2:32 AM

It would be better not to require gcry_control(GCRYCTL_CLOSE_RANDOM_DEVICE). Automatic handling through gcry_control(GCRYCTL_TERM_SECMEM) would be better.

werner added a comment. May 7 2018, 8:27 AM

Am I right to assume that the test suite is terminating and restarting libgcrypt? Although we have features for this, I am still not convinced that this is a proper use of libgcrypt. There are just too many ways this can fail. Unix is not designed to use shared libraries in so-called "plugins". I need to look closer at the libssh code.

gniibe added a comment. Edited May 8 2018, 2:01 AM

The problem has been fixed by libssh upstream: commit-72f6b34

Before the fix, ssh_init and ssh_finalize were called for each session. Now they are called only once in the test program.
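The "initialize once per program" pattern can be sketched as follows. `lib_init` and `lib_finalize` are hypothetical stand-ins for a global init/finalize pair such as ssh_init and ssh_finalize, and the counters exist only to make the behaviour observable. This is not the code from the libssh commit, and it assumes a single-threaded test program.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-ins for a library's global init/finalize pair. */
static int init_calls;
static int finalize_calls;
static int initialized;

static int lib_init(void)
{
    init_calls++;
    initialized = 1;
    return 0;
}

static void lib_finalize(void)
{
    finalize_calls++;
    initialized = 0;
}

/* Initialize on first use and schedule exactly one finalize at process
 * exit. Not thread-safe; meant for a single-threaded test program. */
static int ensure_initialized(void)
{
    if (initialized)
        return 0;
    if (lib_init() != 0)
        return -1;
    return atexit(lib_finalize) == 0 ? 0 : -1;
}

/* A "session" depends on the global state but never tears it down. */
static int run_session(void)
{
    if (ensure_initialized() != 0)
        return -1;
    return initialized ? 0 : -1;
}
```

Each test then calls `run_session()` freely; the global state is set up on the first call and released once when the process exits, so no test can free memory that a later test still uses.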

gniibe lowered the priority of this task from High to Normal. May 8 2018, 2:07 AM

I changed the priority to 'Normal'. The problem now is not the libssh usage, but what we can assume about the use of secure memory by the random generator(s).

werner closed this task as Resolved. Dec 17 2018, 10:01 AM

With GCRYCTL_AUTO_EXPAND_SECMEM we will no longer run out of secure memory. This has even been silently backported to 1.8.x (using the numerical value of that constant) and has long been an option of gpg-agent. Thus closing.

For the correctness of the rndjent implementation, I'm applying D461: the jent random generator requires a finalizer to deallocate its secure memory.

For now, I don't touch gcry_control(GCRYCTL_TERM_SECMEM)'s calling of gcry_control(GCRYCTL_CLOSE_RANDOM_DEVICE).