GPGME: Random crash when verifying CMS Signature on Windows
Closed, ResolvedPublic

Description

When repeatedly verifying opaque signatures on Windows you can get GPGME to crash.

This can be tested with the new --repeat argument of run-verify.

It might be at least one cause of T4111. As such a fix for this is very important for Gpg4win.

Here is a signature that triggers this for me:

I copy the run-verify.exe into my Gpg4win folder and run c:\gpg4win\bin\run-verify.exe --repeat 5000 --cms "the sig file"

Please note that due to the crash fixed in 8f27511862cabac8fa1dd8f883cb78faebc05ef6 you also need to update libgpgme-11.dll in gpg4win to not have it crash immediately.

@werner you can already help if you like. Otherwise I'll try to whittle this down further.

I'm also seeing hangs. Sometimes with gpgsm running. Sometimes without it running.

For crashes I saw twice that the log ended with:

GPGME 2018-10-29 17:43:42 <0x34ac>  _gpgme_io_spawn: check: path=00183908, argv[ 0] = C:\Gpg4win\..\GnuPG\bin\gpgconf.exe
GPGME 2018-10-29 17:43:42 <0x34ac>  _gpgme_io_spawn: check: path=00183908, argv[ 1] = --list-dirs
GPGME 2018-10-29 17:43:42 <0x34ac>  _gpgme_io_spawn: check: path=00183908, tmp_name = C:\Users\AHEINE~1\AppData\Local\Temp\gpgme-IhWxl3

But maybe another thread caused the crash in the background.
I've added more trace info between this line and the next debug output line for future.

For a hang without gpgsm running the log ends with:

GPGME 2018-10-29 17:44:20 <0x34ac>  gpgme_op_verify: leave
GPGME 2018-10-29 17:44:20 <0x34ac>  gpgme_op_verify_result: enter: ctx=0018E8A8
GPGME 2018-10-29 17:44:20 <0x34ac>  gpgme_op_verify_result: check: ctx=0018E8A8, sig[0] = fpr 13EDC882410E770C854B1CCEBB11CE8CADF09832, summary 0x3, status Erfolg
GPGME 2018-10-29 17:44:20 <0x34ac>  gpgme_op_verify_result: check: ctx=0018E8A8, sig[0] = timestamps 0x5bd71018/0x5e24c2bd flags:pka RFU
GPGME 2018-10-29 17:44:20 <0x34ac>  gpgme_op_verify_result: check: ctx=0018E8A8, sig[0] = validity 0x4 (Erfolg), algos RSA/SHA256
GPGME 2018-10-29 17:44:20 <0x34ac>  gpgme_op_verify_result: leave: result=0018B0F8
GPGME 2018-10-29 17:44:20 <0x34ac>  gpgme_data_release: call: dh=00000000
GPGME 2018-10-29 17:44:20 <0x34ac>  gpgme_data_release: call: dh=00189B90
GPGME 2018-10-29 17:44:20 <0x34ac>  gpgme_release: call: ctx=0018E8A8
GPGME 2018-10-29 17:44:20 <0x34ac>  assuan_release (ctx=0x0018d908): call

GPGME 2018-10-29 17:44:20 <0x34ac>  chan_0x00000006 -> BYE

GPGME 2018-10-29 17:44:20 <0x34ac>  _gpgme_io_write: enter: fd=00000009, buffer=627F0760, count=3
GPGME 2018-10-29 17:44:20 <0x34ac>  _gpgme_io_write: check: 425945                            BYE
GPGME 2018-10-29 17:44:20 <0x34ac>  _gpgme_io_write: leave: result=3
GPGME 2018-10-29 17:44:20 <0x34ac>  _gpgme_io_write: enter: fd=00000009, buffer=627F07F7, count=1
GPGME 2018-10-29 17:44:20 <0x34ac>  _gpgme_io_write: check: 0a                                .
GPGME 2018-10-29 17:44:20 <0x34ac>  _gpgme_io_write: check: fd=00000009, waiting for empty buffer in thread 00000568
GPGME 2018-10-29 17:44:20 <0x2180>  gpgme:writer: check: ctx->file_hd=00000390, got data to send
GPGME 2018-10-29 17:44:20 <0x2180>  gpgme:writer: check: ctx->file_hd=00000390, writing 3 bytes
GPGME 2018-10-29 17:44:20 <0x2180>  gpgme:writer: check: ctx->file_hd=00000390, wrote 3 bytes
GPGME 2018-10-29 17:44:20 <0x2180>  gpgme:writer: check: ctx->file_hd=00000390, idle
GPGME 2018-10-29 17:44:20 <0x34ac>  _gpgme_io_write: check: fd=00000009, thread 00000568 buffer is empty
GPGME 2018-10-29 17:44:20 <0x34ac>  _gpgme_io_write: leave: result=1
GPGME 2018-10-29 17:44:20 <0x34ac>  _assuan_close (ctx=0x0018d908): call: fd=0x6

GPGME 2018-10-29 17:44:20 <0x34ac>  _gpgme_io_close: enter: fd=00000006
GPGME 2018-10-29 17:44:20 <0x2180>  gpgme:writer: check: ctx->file_hd=00000390, got data to send
GPGME 2018-10-29 17:44:20 <0x2180>  gpgme:writer: check: ctx->file_hd=00000390, writing 1 bytes
GPGME 2018-10-29 17:44:20 <0x34ac>  _gpgme_io_close: check: fd=00000006, fd=6 -> handle=00000380 socket=-1 dupfrom=-1
GPGME 2018-10-29 17:44:20 <0x2180>  gpgme:writer: check: ctx->file_hd=00000390, wrote 1 bytes
GPGME 2018-10-29 17:44:20 <0x338c>  gpgme:writer: check: ctx->file_hd=00000380, got data to send
GPGME 2018-10-29 17:44:20 <0x2180>  gpgme:writer: check: ctx->file_hd=00000390, idle
GPGME 2018-10-29 17:44:20 <0x0408>  gpgme:reader: check: ctx->file_hd=00000380, waiting for close
GPGME 2018-10-29 17:44:20 <0x338c>  gpgme:writer: check: ctx->file_hd=00000380, waiting for close
GPGME 2018-10-29 17:44:20 <0x338c>  gpgme:writer: leave
GPGME 2018-10-29 17:44:20 <0x0408>  gpgme:reader: leave
GPGME 2018-10-29 17:44:20 <0x34ac>  _gpgme_io_close: leave: result=0
GPGME 2018-10-29 17:44:20 <0x34ac>  _assuan_close (ctx=0x0018d908): call: fd=0x9

GPGME 2018-10-29 17:44:20 <0x34ac>  _gpgme_io_close: enter: fd=00000009
GPGME 2018-10-29 17:44:20 <0x34ac>  _gpgme_io_close: check: fd=00000009, fd=9 -> handle=00000390 socket=-1 dupfrom=-1
GPGME 2018-10-29 17:44:20 <0x34ac>  _gpgme_io_close: leave: result=0

I've seen now four crashes in _gpgme_io_spawn. I've added tracing that shows that the CreateProcess itself is crashing. I do not see how this is possible. I'm printing the command line and the program name in debug output and both look fine.
The command line is also mutable.

I guess this must be a stack / heap corruption if createprocess is crashing with valid parameter input. Probably somehow related to libassuan as assuan is used for SM and not for Gpg.

Btw I've checked that the errors are the same in T4111 and this:

Name der fehlerhaften Anwendung: run-verify.exe, Version: 0.0.0.0, Zeitstempel: 0x5bd725e3
Name des fehlerhaften Moduls: ntdll.dll, Version: 10.0.17134.254, Zeitstempel: 0xc993940a
Ausnahmecode: 0xc0000005
Fehleroffset: 0x0003d7b5
ID des fehlerhaften Prozesses: 0x2674
Startzeit der fehlerhaften Anwendung: 0x01d46f9b8d171728
Pfad der fehlerhaften Anwendung: c:\Gpg4win\bin\run-verify.exe
Pfad des fehlerhaften Moduls: C:\WINDOWS\SYSTEM32\ntdll.dll
Berichtskennung: 6c908139-26f1-4c83-ac43-2a60a64496aa
Name der fehlerhaften Anwendung: OUTLOOK.EXE, Version: 16.0.10827.20138, Zeitstempel: 0x5bac37b2
Name des fehlerhaften Moduls: ntdll.dll, Version: 10.0.17134.254, Zeitstempel: 0xc993940a
Ausnahmecode: 0xc0000005
Fehleroffset: 0x0003d7b5
ID des fehlerhaften Prozesses: 0x440
Startzeit der fehlerhaften Anwendung: 0x01d45f0e247d7255
Pfad der fehlerhaften Anwendung: C:\Program Files\Microsoft Office\root\Office16\OUTLOOK.EXE
Pfad des fehlerhaften Moduls: C:\WINDOWS\SYSTEM32\ntdll.dll
Berichtskennung: 934e414f-99d2-4a53-b7a7-cefff7e38573

I'm currently looking at the CloseHandle in _gpgme_io_close:

  if (fd_table[fd].dup_from == -1)
    {
      if (fd_table[fd].handle != INVALID_HANDLE_VALUE)
	{
	  if (!CloseHandle (fd_table[fd].handle))
	    {
	      TRACE_LOG1 ("CloseHandle failed: ec=%d", (int) GetLastError ());
	      /* FIXME: Should translate the error code.  */
	      gpg_err_set_errno (EIO);
	      return TRACE_SYSRES (-1);
	    }
	}

In one of my logs I have a single:

GPGME 2018-10-30 18:29:43 <0x0730>    _gpgme_io_close: enter: fd=00000002
GPGME 2018-10-30 18:29:43 <0x0730>    _gpgme_io_close: check: fd=00000002, fd=2 -> handle=00000020 socket=-1 dupfrom=-1
GPGME 2018-10-30 18:29:43 <0x0730>    _gpgme_io_close: check: fd=00000002, CloseHandle failed: ec=6
GPGME 2018-10-30 18:29:43 <0x0730>    _gpgme_io_close: error: Input/output error

ec=6 is error invalid handle. As I understand it this can only mean that the handle was already closed.
If we can call CloseHandle on a closed handle, that handle could already be reused by the system for other things and we accidentally close a handle we know nothing about. Which might lead to anything.

werner changed the task status from Open to Testing.
aheinecke closed this task as Resolved.Nov 2 2018, 12:58 PM

Yes! Thank you very much. My test runs and my Outlook has verified 2500 S/MIME Mails without a crash.