gpg --gen-key from gnupg-w32 2.1.7 fails on Windows > 8.1 (AESNI))
Closed, Resolved (Public)

Description

Testing on Windows 10 32 bit.

After the pinentry window shows up you get:

gpg: agent_genkey failed: End of file
Key generation failed: End of file

Last lines from the gpg-agent log (debug 1024)

2015-08-31 18:41:42 gpg-agent[976] DBG: connection to PIN entry established
2015-08-31 18:41:42 gpg-agent[976] DBG: chan_000001B8 -> INQUIRE
PINENTRY_LAUNCHED 1540
2015-08-31 18:41:42 gpg-agent[976] DBG: chan_000001B8 <- END
2015-08-31 18:41:44 gpg-agent[976] S2K calibration: 4485120 -> 93m

I guess I'll assign it to myself for now to qualify this problem further. I'm also
interested in what changed in Windows between 7 and 10 / 8.1 to cause this.

Details

Version
1.6
aheinecke set Version to 2.1.7.
aheinecke added subscribers: werner, aheinecke.

This was already reported in T1819 and T2083.

Let's fix it here :-)

Surprise. This issue is weird.

The agent calls hash_passphrase from do_encryption in agent/protect.c.
I've added a load of debug output there, and this is where it crashes.
I've moved the get_standard_s2k_count call out of the hash_passphrase call to
verify that it is not the crashing part.

My code looks like this:

  log_debug ("%s:%s: Line: %d", __FILE__, __func__, __LINE__);
  unsigned long s2kcnt = get_standard_s2k_count();
  log_debug ("%s:%s: Line: %d", __FILE__, __func__, __LINE__);
  rc = hash_passphrase (passphrase, GCRY_MD_SHA1,
                        3, iv+2*blklen,
                        s2kcnt,
                        key, keylen);
  log_debug ("%s:%s: Line: %d", __FILE__, __func__, __LINE__);

The debug output after the hash_passphrase is not reached. The line before is.

But now this is where it gets weird.

With (debug enhanced):

static int
hash_passphrase (const char *passphrase, int hashalgo,
                 int s2kmode,
                 const unsigned char *s2ksalt,
                 unsigned long s2kcount,
                 unsigned char *key, size_t keylen)
{
  /* The key derive function does not support a zero length string for
     the passphrase in the S2K modes.  Return a better suited error
     code than GPG_ERR_INV_DATA.  */
  int ret;
  log_debug ("%s:%s: Line: %d", __FILE__, __func__, __LINE__);
  if (!passphrase || !*passphrase)
    return gpg_error (GPG_ERR_NO_PASSPHRASE);
  log_debug ("%s:%s: Line: %d", __FILE__, __func__, __LINE__);
  ret = gcry_kdf_derive (passphrase, strlen (passphrase),
                         s2kmode == 3? GCRY_KDF_ITERSALTED_S2K :
                         s2kmode == 1? GCRY_KDF_SALTED_S2K :
                         s2kmode == 0? GCRY_KDF_SIMPLE_S2K : GCRY_KDF_NONE,
                         hashalgo, s2ksalt, 8, s2kcount,
                         keylen, key);
  log_debug ("%s:%s: Line: %d", __FILE__, __func__, __LINE__);
  log_debug ("ret: %i ", ret);

  return ret;
}

I can see that the debug line above the return statement is executed and that
it returns 0! But I don't see the call returning to do_encryption.

The only idea I have so far that explains this behavior is some kind of stack
corruption where hash_passphrase tries to return to an invalid pointer. But I
don't see the problem at the moment.

...
Or printf debugging was the wrong approach here.

Attaching gdb to the agent led to the following backtrace:

#0 0x655ea3e9 in aesni_do_setkey () from C:\Program Files\GnuPG\bin\libgcrypt-20.dll
#1 0x655ead8a in do_setkey () from C:\Program Files\GnuPG\bin\libgcrypt-20.dll
#2 0x655eb2b1 in rijndael_setkey () from C:\Program Files\GnuPG\bin\libgcrypt-20.dll
#3 0x655edadd in selftest_basic_128 () from C:\Program Files\GnuPG\bin\libgcrypt-20.dll
#4 0x655ede09 in selftest () from C:\Program Files\GnuPG\bin\libgcrypt-20.dll
#5 0x655eabfc in do_setkey () from C:\Program Files\GnuPG\bin\libgcrypt-20.dll
#6 0x655eb2b1 in rijndael_setkey () from C:\Program Files\GnuPG\bin\libgcrypt-20.dll
#7 0x655cd4ae in cipher_setkey () from C:\Program Files\GnuPG\bin\libgcrypt-20.dll
#8 0x655ce076 in _gcry_cipher_setkey () from C:\Program Files\GnuPG\bin\libgcrypt-20.dll
#9 0x655c2308 in gcry_cipher_setkey () from C:\Program Files\GnuPG\bin\libgcrypt-20.dll
#10 0x0041aea8 in agent_protect ()
#11 0x004189a9 in store_key ()
#12 0x0041950b in agent_genkey ()
#13 0x00407a5e in cmd_genkey ()

So I've built libgcrypt again with --disable-aesni-support (which is also what
gpg4win uses), and the crash goes away.

Backtrace with debug symbols:

(gdb) bt full
#0 0x655ea3e9 in aesni_do_setkey (ctx=0xc6f868,
    key=0x6565dc10 <key_128.65421> "\350\351\352\353\355\356\357\360\362\363\364\365\367\370\371\372\001K\257\"x\246\235\063\035Q\200\020\066C\351\232gC\303\321Q\232\264\362͚x\253\t\245\021\275]\036\362\r\316ּ\274\022\023\032\307\305G\210\252\b\016\225\027\353\026wq\232\317r\200\206\004", <incomplete sequence \343>)
    at /home/aheinecke/arbeit/gpg4win/src/gnupg-w32-2.1.7/PLAY/src/libgcrypt/cipher/rijndael.c:248
No locals.
#1 0x655ead8a in do_setkey (ctx=0xc6f868,
    key=0x6565dc10 <key_128.65421> "\350\351\352\353\355\356\357\360\362\363\364\365\367\370\371\372\001K\257\"x\246\235\063\035Q\200\020\066C\351\232gC\303\321Q\232\264\362͚x\253\t\245\021\275]\036\362\r\316ּ\274\022\023\032\307\305G\210\252\b\016\225\027\353\026wq\232\317r\200\206\004", <incomplete sequence \343>, keylen=16)
    at /home/aheinecke/arbeit/gpg4win/src/gnupg-w32-2.1.7/PLAY/src/libgcrypt/cipher/rijndael.c:569
        initialized = 1
        selftest_failed = 0x0
        rounds = 10
        i = 1
        j = 1
        r = 1
        t = 13813018
        rconpointer = 0
        KC = 4
        hwfeatures = 1472

#2 0x655eb2b1 in rijndael_setkey (context=0xc6f868,
    key=0x6565dc10 <key_128.65421> "\350\351\352\353\355\356\357\360\362\363\364\365\367\370\371\372\001K\257\"x\246\235\063\035Q\200\020\066C\351\232gC\303\321Q\232\264\362͚x\253\t\245\021\275]\036\362\r\316ּ\274\022\023\032\307\305G\210\252\b\016\225\027\353\026wq\232\317r\200\206\004", <incomplete sequence \343>, keylen=16)
    at /home/aheinecke/arbeit/gpg4win/src/gnupg-w32-2.1.7/PLAY/src/libgcrypt/cipher/rijndael.c:668
        ctx = 0xc6f868

...

info registers
eax 0x6565dc10 1701174288
ecx 0xd25110 13783312
edx 0xc6f868 13039720
ebx 0x0 0
esp 0xc6f760 0xc6f760
ebp 0xc6f760 0xc6f760
esi 0x0 0
edi 0xd24478 13780088
eip 0x655ea3e9 0x655ea3e9 <aesni_do_setkey+31>
eflags 0x10297 [ CF PF AF SF IF RF ]
cs 0x1b 27
ss 0x23 35
ds 0x23 35
es 0x23 35
fs 0x3b 59
gs 0x0 0

disas 0x655ea3e2,0x655ea3ff

Dump of assembler code from 0x655ea3e2 to 0x655ea3ff:
   0x655ea3e2 <aesni_do_setkey+24>:     mov    0xc(%ebp),%eax
   0x655ea3e5 <aesni_do_setkey+27>:     movdqu (%eax),%xmm1
>  0x655ea3e9 <aesni_do_setkey+31>:     movdqa %xmm1,(%edx)
   0x655ea3ed <aesni_do_setkey+35>:     aeskeygenassist $0x1,%xmm1,%xmm2
   0x655ea3f3 <aesni_do_setkey+41>:     pshufd $0xff,%xmm2,%xmm2
   0x655ea3f8 <aesni_do_setkey+46>:     movdqa %xmm1,%xmm3
   0x655ea3fc <aesni_do_setkey+50>:     pslldq $0x4,%xmm3

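It appears that this crash is due to the fact that Windows uses a 4-byte stack
alignment while the movdqa instruction expects a 16-byte aligned memory operand.
In the register dump above, the store target in %edx (ctx = 0xc6f868) is only
8-byte aligned.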

I've found some info on this here:
http://www.peterstock.co.uk/games/mingw_sse/
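
As a sanity check on the alignment theory, the address from the register dump can be tested directly. The following is only a minimal sketch; the helper names misalignment16 and movdqa_operand_ok are made up for illustration and are not part of Libgcrypt.

```c
#include <stdint.h>

/* Hypothetical helper: how many bytes ADDR lies past the previous
   16-byte boundary.  0 means ADDR is 16-byte aligned.  */
static unsigned int
misalignment16 (uintptr_t addr)
{
  return (unsigned int) (addr & 15u);
}

/* Hypothetical helper: movdqa with a memory operand faults unless the
   address is a multiple of 16; movdqu has no such requirement.  */
static int
movdqa_operand_ok (uintptr_t addr)
{
  return misalignment16 (addr) == 0;
}
```

For ctx = 0xc6f868 the low four bits are 0x8, so misalignment16 gives 8 and the movdqa store to (%edx) is not allowed, while the preceding movdqu load from the key is unaffected.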

I also confirmed that with "-mstackrealign" the crash no longer happens.
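
Besides realigning the stack globally with "-mstackrealign" (GCC also offers the per-function attribute force_align_arg_pointer on x86), one common way to avoid depending on the caller's stack alignment is to over-allocate a buffer and round the pointer up to the next 16-byte boundary. The sketch below only illustrates that general technique; align_up16, struct demo_ctx and demo_ctx_key are hypothetical names, and this is not necessarily how the fix in Libgcrypt works.

```c
#include <stddef.h>
#include <stdint.h>

#define DEMO_ALIGNMENT 16

/* Hypothetical helper: round P up to the next 16-byte boundary.  The
   caller must have reserved DEMO_ALIGNMENT - 1 spare bytes.  */
static void *
align_up16 (void *p)
{
  uintptr_t a = (uintptr_t) p;

  a = (a + (DEMO_ALIGNMENT - 1)) & ~(uintptr_t) (DEMO_ALIGNMENT - 1);
  return (void *) a;
}

/* Hypothetical context with room for one 16-byte key slot plus the
   slack needed to align it by hand, wherever the struct is placed.  */
struct demo_ctx
{
  unsigned char raw[16 + DEMO_ALIGNMENT - 1];
};

/* Return a 16-byte aligned pointer to the key slot inside CTX.  */
static unsigned char *
demo_ctx_key (struct demo_ctx *ctx)
{
  return (unsigned char *) align_up16 (ctx->raw);
}
```

The cost is up to 15 wasted bytes per context, but the aligned slot is then safe for movdqa even if the context lives on a stack that is only 4-byte aligned.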

Werner: should we add this globally to the configure options of gcrypt, or do
you have a better fix for this?

aheinecke reassigned this task from aheinecke to werner. Sep 1 2015, 6:53 PM

IIRC, we fixed the alignment in Libgcrypt but I am not sure whether this has
been backported to Libgcrypt 1.6. Which libgcrypt version is used?

Ooops - I should know it is my installer :-(
1.6.3.

werner renamed this task from gpg --gen-key from gnupg-w32 2.1.7 fails on Windows > 8.1 to gpg --gen-key from gnupg-w32 2.1.7 fails on Windows > 8.1 (AESNI)). Sep 2 2015, 12:32 PM
werner changed Version from 2.1.7 to 1.6.
werner removed a project: gnupg.

This problem still occurs with libgcrypt from current master:
libgcrypt 1.7.0-beta259

#0 0x655f24a7 in _gcry_aes_aesni_do_setkey (ctx=0x97f868,
    key=0x656621b4 <key_128> "\350\351\352\353\355\356\357\360\362\363\364\365\367\370\371\372\001K\257\"x\246\235\063\035Q\200\020\066C\351\232gC\303\321Q\232\264\362͚x\253\t\245\021\275]\036\362\r\316ּ\274\022\023\032\307\305G\210\252\b\016\225\027\353\026wq\232\317r\200\206\004", <incomplete sequence \343>)
    at rijndael-aesni.c:117

werner added a comment. Sep 4 2015, 1:12 PM

The Git master and the 1.6 branch now both have a fix for that problem. A 1.6.4
release will be done soon.

werner removed a project: In Progress.

1.6.4 has been released

werner closed this task as Resolved. Sep 21 2015, 9:00 AM
werner removed a project: Testing.