gpg --generate-key --batch from existing key (with Key-Grip:) fails on 64-bit big-endian architectures
Closed, Resolved, Public

Description

The build logs for monkeysphere 0.43-3 on sparc64, s390x, and ppc64 show it failing with:

gpg: key generation failed: Invalid S-expression

The failure comes from the part of the test suite that invokes gpg --generate-key with this input:

Key-Type: $keyType
Key-Grip: $keyGrip
Key-Usage: auth
Name-Real: $serviceName
%no-protection
%commit

the same test suite appears to work fine on the other builders, and i've seen the same test suite work fine on 32-bit big-endian powerpc architecture running 2.2.13 as well.

Details

Version
2.2.13

Event Timeline

This might be related to T4490, since it's the same sort of key generation process.

dkg set Version to 2.2.13.
dkg updated the task description.

I ran the example script from T4490 on an s390x machine, and got the following output:

+ HOSTNAME=test.example
+ EXPIRY=1m
++ mktemp -d
+ export GNUPGHOME=/tmp/tmp.DwUgS4Tsfq
+ GNUPGHOME=/tmp/tmp.DwUgS4Tsfq
+ trap cleanup EXIT
+ cat
+ cat
+ gpgconf --launch gpg-agent
+ ssh-keygen -q -t rsa -N '' -f /tmp/tmp.DwUgS4Tsfq/example_ssh_rsa_key
++ gpgconf --list-dirs agent-ssh-socket
+ export SSH_AUTH_SOCK=/tmp/tmp.DwUgS4Tsfq/S.gpg-agent.ssh
+ SSH_AUTH_SOCK=/tmp/tmp.DwUgS4Tsfq/S.gpg-agent.ssh
+ ssh-add /tmp/tmp.DwUgS4Tsfq/example_ssh_rsa_key
Identity added: /tmp/tmp.DwUgS4Tsfq/example_ssh_rsa_key (dkg@zelenka)
++ awk '/^[0-9A-F]/{print $1 }'
+ KEYGRIP=3DFC2730B28F3E7DA0540AE33351763D76C374C1
+ test -n 3DFC2730B28F3E7DA0540AE33351763D76C374C1
+ gpg --full-generate-key
gpg: keybox '/tmp/tmp.DwUgS4Tsfq/pubring.kbx' created
gpg: key generation failed: Invalid S-expression
+ cleanup
+ tail -v /tmp/tmp.DwUgS4Tsfq/sshcontrol /tmp/tmp.DwUgS4Tsfq/gpg-agent.log /tmp/tmp.DwUgS4Tsfq/status
==> /tmp/tmp.DwUgS4Tsfq/sshcontrol <==
# RSA key added on: 2019-05-11 02:12:08
# Fingerprints:  MD5:ff:0c:a1:73:d7:6c:c3:b6:75:97:16:86:d5:5d:2a:03
#                SHA256:g7ImD7vIefJIgdEqIDig6BnLYYHiXR4Fogphr3hHPco
3DFC2730B28F3E7DA0540AE33351763D76C374C1 0

==> /tmp/tmp.DwUgS4Tsfq/gpg-agent.log <==
2019-05-11 02:12:08 gpg-agent[60894] DBG: chan_10 <- OPTION allow-pinentry-notify
2019-05-11 02:12:08 gpg-agent[60894] DBG: chan_10 -> OK
2019-05-11 02:12:08 gpg-agent[60894] DBG: chan_10 <- OPTION agent-awareness=2.1.0
2019-05-11 02:12:08 gpg-agent[60894] DBG: chan_10 -> OK
2019-05-11 02:12:08 gpg-agent[60894] DBG: chan_10 <- RESET
2019-05-11 02:12:08 gpg-agent[60894] DBG: chan_10 -> OK
2019-05-11 02:12:08 gpg-agent[60894] DBG: chan_10 <- READKEY 3DFC2730B28F3E7DA0540AE33351763D76C374C1
2019-05-11 02:12:08 gpg-agent[60894] DBG: chan_10 -> [ 44 20 28 31 30 3a 70 75 62 6c 69 63 2d 6b 65 79 ...(301 byte(s) skipped) ]
2019-05-11 02:12:08 gpg-agent[60894] DBG: chan_10 -> OK
2019-05-11 02:12:08 gpg-agent[60894] DBG: chan_10 <- [eof]

==> /tmp/tmp.DwUgS4Tsfq/status <==
[GNUPG:] ERROR key_generate 33554515
[GNUPG:] KEY_NOT_CREATED
+ rm -rf /tmp/tmp.DwUgS4Tsfq

I also ran base64 < "$GNUPGHOME/private-keys-v1.d/"*.key at the end of a different run of that script. It produced the output below, in case you'd like to inspect the actual S-expression stored:

KDExOnByaXZhdGUta2V5KDM6cnNhKDE6bjI1NzoAqpp8K/G+MlKmDPN+d1MUBPBgJ/UNM6SmO4aV
PSzPWCYZ8zcsLQqltPasskDy0AVjBUTugESKXmom4HRbsxbqcOmw/E8l4aluDXXhF76ncXqE94eb
aYimfz2zXGxZp0hRk5+vSM01KWu9iZOD9MCA9ky4U9nwuisngGQfX5vrOL/Qtq3ui2i6ZnHn+SRV
hOpTU8ouHPZvL/l0zn5fx4Lep1Khas5jwHbu5xCi0Va1+YuBfbZmjJXPi1ffuDxSdmEQhrQxNlGh
s1PtCpyDu5jdLExVEzlT6Xv7/hCU1EiE6vAkZNaMu5MU5+22owyMRh8DfHfN/lc1f1eGf2h8Cl+2
ySkoMTplMzoBAAEpKDE6ZDI1NjpaZmfbxUo6Ui8o97GQuxYFk/Xv1lr7fYiUTDkyZFcuZ2oixZ6D
83thC8Dw55eCgQg6V49lqmwDoY4eK6oXmNH7qOkemTYCgIwPV+JBecYyTHC+1T2Vy1ImZGRxWfyb
tqd7aQcxtYMxAqU5jFBE1ejAGmEmFXsIuytATk2NlQosVCIx3emMy+IFkm5da6mL8I6M/BglL9Vo
r/XVmWs4+e2ihBWQgXL5toylUvGJkFx0vng36Abz9EQZRanzkaCPNBjdnUltqMvyuiO39L12prpG
Q0t39yPeAoOvEuKYBNo6YesCBWEEeQcyikeKxvNvlq3P6fJ+ghHH89d2xB2oWPiNKSgxOnAxMjk6
AMh4u7YUZEgnkEyjMWkh8Jc0+CMMclnS9GYMhw1Lb9kxitAE3I4/o8VV1oLBHo06stUBgaYZ2No/
ayU7EY1w6D09vTYbTigTElI0wE3XMApNylscz/2oQbFSblnaFMztzI95eFDCoCadH+3OuM43EPjU
6gdwe8XHh9E5+K75CdbXKSgxOnExMjk6ANnb07Dgt9y0ooT+Mepghv8rj8FSjXS6cnGYLlMMpCrT
Cwaj6WtTgbGyFClc4RsIf5nLgKcW4lFpExMplN0IU8+jW0JLlakWzSJ7oEwDFuTsOPBRfiYSWuY7
n2Mnf5R2Y5htx43/2T4aMrnKYQKTWRsF2v/aBpLu0KYLROEfeEtfKSgxOnUxMjg6HpzBHvPFirM+
gWafiYuf3+HrorIqqr191+TnBBYUKLRhU99c+DWrTOfTN8sHnznKe+eLoZq1lsGcZMbn0MQJwzNe
tL5nB4PiFKvQsWWMVohiwR3O0jVlkCEpPSMgOKsR5Dml6jxrH93GuNpu7heUEtcvKh6PluV6wcOo
+GxS918pKSg3OmNvbW1lbnQxMTpka2dAemVsZW5rYSkpAA==


here is a copy of another example generated key (not b64-encoded), if you want to just download it.

fwiw, i've just tried loading the same keyfile that the s390x (64-bit big-endian) implementation choked on into a running gpg-agent on an amd64 machine (64-bit little-endian) and gpg --full-generate-key succeeded with that same key on amd64.

Likewise, i've taken a *.key S-expression file from an amd64 machine where it was used successfully to generate a key and put it on an s390x machine; it failed on the s390x machine.

So there doesn't seem to be anything wrong with the key file itself or how it was created, just with how the 64-bit big-endian machines deal with it during key generation.

And, i just discovered that when i manually edit the key to remove the (comment) list from the *.key S-expression file, everything works fine on s390x. so the failure appears to be due to the (comment), just like in T4490.

It looks to me like gcry_sexp_canon_len is returning 0 on these platforms from within a backtrace like this:

#0  agent_readkey (ctrl=<optimized out>, fromcard=fromcard@entry=0,
    hexkeygrip=<optimized out>, r_pubkey=r_pubkey@entry=0x3ffffffe9b0)
    at /usr/include/s390x-linux-gnu/gpg-error.h:924
#1  0x00000001000785e0 in do_create_from_keygrip (
    ctrl=<error reading variable: value has been optimized out>,
    algo=algo@entry=1, hexkeygrip=<optimized out>, pub_root=0x10011b2c0,
    timestamp=timestamp@entry=1557800141, expireval=0, is_subkey=0)
    at ../../g10/keygen.c:1294
#2  0x000000010007cc4c in do_generate_keypair (card=0, outctrl=0x3ffffffec58,
    para=0x10011b210, ctrl=<optimized out>) at ../../g10/keygen.c:4704
#3  proc_parameter_file (ctrl=<optimized out>, ctrl@entry=0x100118ec0,
    para=para@entry=0x10011b210, fname=<optimized out>,
    fname@entry=<error reading variable: value has been optimized out>,
    outctrl=0x3ffffffec58,
    outctrl@entry=<error reading variable: value has been optimized out>,
    card=card@entry=0) at ../../g10/keygen.c:3668
#4  0x000000010007f6a0 in read_parameter_file (fname=<optimized out>,
    ctrl=0x100118ec0) at ../../g10/keygen.c:3777
#5  generate_keypair (ctrl=0x100118ec0, full=<optimized out>,
    fname=<optimized out>, card_serialno=<optimized out>,
    card_backup_key=<optimized out>) at ../../g10/keygen.c:4146
#6  0x0000000100016248 in main (argc=<optimized out>, argv=<optimized out>)
    at ../../g10/gpg.c:4468

Ok, the difference appears to be that on these 64-bit big-endian platforms, gpg-agent returns a zero-byte string for the associated comment. When this happens, gcry_sexp_canon_len returns 0 because of GPG_ERR_SEXP_ZERO_PREFIX. gcry_sexp_canon_len rejects such an S-expression on x86_64 platforms too; the difference is only that gpg-agent on x86_64 doesn't produce one.
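To make the zero-prefix rejection concrete, here is a minimal sketch of a canonical-length scanner. It is not libgcrypt's code: canon_len_sketch is a hypothetical helper that handles only parentheses and <decimal-length>:<bytes> atoms, and it assumes (as described above) that a length prefix starting with '0' is treated as an error that makes the scan return 0.

```c
#include <ctype.h>
#include <stddef.h>

/* Simplified, hypothetical scanner for canonical S-expressions built only
   from parentheses and <decimal-length>:<bytes> atoms.  Returns the total
   length of the first complete expression, or 0 on any error, including a
   length prefix that starts with '0' (the case discussed above). */
size_t canon_len_sketch(const unsigned char *buf, size_t buflen)
{
    size_t pos = 0;
    int depth = 0;

    if (buflen == 0 || buf[0] != '(')
        return 0;

    while (pos < buflen) {
        unsigned char c = buf[pos];

        if (c == '(') {
            depth++;
            pos++;
        } else if (c == ')') {
            depth--;
            pos++;
            if (depth == 0)
                return pos;          /* outermost list closed */
        } else if (isdigit(c)) {
            size_t datalen = 0;

            if (c == '0')
                return 0;            /* zero prefix, e.g. "0:" */
            while (pos < buflen && isdigit(buf[pos])) {
                datalen = datalen * 10 + (size_t)(buf[pos] - '0');
                if (datalen > buflen)
                    return 0;        /* cannot possibly fit */
                pos++;
            }
            if (pos >= buflen || buf[pos] != ':')
                return 0;            /* missing ':' after the length */
            pos++;
            if (datalen > buflen - pos)
                return 0;            /* atom runs past the buffer */
            pos += datalen;
        } else {
            return 0;                /* anything else is out of scope here */
        }
    }
    return 0;                        /* input ended before the list closed */
}
```

With this sketch, "(7:comment11:dkg@zelenka)" scans to its full length of 25 octets, while "(7:comment0:)" is rejected as soon as the scanner reaches the "0:" prefix.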

Fwiw, i don't understand S-expressions well enough to know why that is actually a problem, but i'll just assume that it actually is a problem.

So anyway, the problem is now narrowed down to why gpg-agent on these 64-bit big-endian platforms does this to the comment string during a READKEY, when it doesn't do it on other platforms.

OK, i think the reason this is happening is that agent_public_key_from_file (in agent/findkey.c) is screwing up a %b format string in gcry_sexp_build_array.

For a %b, that function expects the argument array to hold a pointer to an int (the length) followed by a pointer to a char * (the data). But comment_length (and uri_length) are not ints; they are size_t.

On many 64-bit platforms, int is 4 octets long and size_t is 8 octets. On little-endian platforms, a pointer to a size_t points at the least-significant octets, so it works (accidentally) to pass a pointer to the size_t itself where a pointer to an int is expected. But on big-endian platforms, the pointer to the size_t points at the most-significant octets, which are all zero for any short string, so the length is read as 0.
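The effect can be reproduced without libgcrypt. The sketch below uses two hypothetical helpers (host_is_little_endian and read_int_at, neither from GnuPG) to read an int from the address of a size_t, the way a callee expecting an int * would. memcpy is used so the demonstration itself doesn't trip over aliasing rules.

```c
#include <stddef.h>
#include <string.h>

/* Return 1 on a little-endian host, 0 on a big-endian one. */
int host_is_little_endian(void)
{
    const unsigned int probe = 1;
    unsigned char first;

    memcpy(&first, &probe, 1);   /* look at the lowest-addressed octet */
    return first == 1;
}

/* Read sizeof(int) octets starting at the address of a size_t, which is
   what happens when a size_t's address is handed to code that expects an
   `int *`. */
int read_int_at(const size_t *len)
{
    int seen;

    memcpy(&seen, len, sizeof seen);
    return seen;
}
```

Calling read_int_at on a size_t holding 11 (the length of the comment "dkg@zelenka") yields 11 on a little-endian host. On a big-endian host where size_t is wider than int it yields 0, because the octets at the start of the object are the high-order zeros.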

There are 36 lines in the GnuPG codebase that mention %b. i haven't audited all of them for this kind of thing, but it seems pretty clear that the uri and comment in agent_public_key_from_file are wrong.

I've just pushed e4a158faacd67e15e87183fb48e8bd0cc70f90a8 to branch dkg/fix-T4501 as a proposed fix for this specific problem (it doesn't introduce anything in the test suite, or try to deal with any of the other %b problems).
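For readers who haven't looked at the branch, the general shape of a fix for this class of bug is to narrow the size_t into a real int and pass the address of that int instead. The sketch below is only an illustration of that pattern, not the contents of the commit: mock_percent_b_length is a hypothetical stand-in for a %b consumer, and comment_intlen and both wrapper functions are names made up for this example.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a %b consumer: slot 0 is read as an int
   (the length); slot 1 would be read as a `char *` (the data). */
int mock_percent_b_length(void **args)
{
    int len;

    memcpy(&len, args[0], sizeof len);
    return len;
}

/* Problematic pattern: the address of a size_t lands in the slot that
   is later read as an int. */
int length_seen_with_size_t(const char *comment, size_t comment_length)
{
    void *args[2];

    args[0] = &comment_length;   /* wrong width for the consumer */
    args[1] = &comment;
    return mock_percent_b_length(args);
}

/* Safer pattern: copy into an int first, then pass that int's address.
   Assumes the length fits in an int. */
int length_seen_with_int_copy(const char *comment, size_t comment_length)
{
    int comment_intlen = (int)comment_length;
    void *args[2];

    args[0] = &comment_intlen;   /* matches what the consumer reads */
    args[1] = &comment;
    return mock_percent_b_length(args);
}
```

length_seen_with_int_copy("dkg@zelenka", 11) returns 11 regardless of byte order, whereas length_seen_with_size_t gives a platform-dependent answer.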

I can confirm that this fix repairs the problem on debian's s390x.

I think this patch should be backported to STABLE-BRANCH-2-2.

Good catch. Thanks for that work. I'll apply it to master and 2.2.

werner claimed this task.