Page MenuHome GnuPG

scdaemon randomly hangs when trying to access a token
Closed, ResolvedPublic

Description

With GnuPG 2.5.x, I observe that scdaemon sometimes gets stuck when it tries to access a token (e.g. when asking GnuPG to sign something with a on-token private key, or when running gpg --card-status). It happens irregularly (most of the time everything works fine), but at least once or twice a day. Unfortunately I have not been able to find a way to reliably trigger the bug. It seems slightly more likely to happen if the daemon has been running for several hours, but it can also happen with a freshly spawn daemon.

When the bug happens, scdaemon does not seem to respond to anything. For example, if I try to send a SCD... command through gpg-connect-agent, then gpg-connect-agent itself will appear to be stuck as it waits for a reply. The only thing left to do is to forcibly kill (kill -9) scdaemon.

I have enabled full debugging output on scdaemon, but didn’t find anything noteworthy (to me at least) in the log. Here is the contents of the log from the last occurrence of the bug (triggered by a gpg --card-status command):

2024-09-13 22:19:33 scdaemon[9962] DBG: chan_9 <- GETINFO version
2024-09-13 22:19:33 scdaemon[9962] DBG: chan_9 -> D 2.5.1
2024-09-13 22:19:33 scdaemon[9962] DBG: chan_9 -> OK
2024-09-13 22:19:33 scdaemon[9962] DBG: chan_9 <- SERIALNO
2024-09-13 22:19:33 scdaemon[9962] DBG: apdu_open_reader: BAI=13302
2024-09-13 22:19:33 scdaemon[9962] DBG: apdu_open_reader: new device=13302
2024-09-13 22:19:33 scdaemon[9962] reader slot 0: using ccid driver
2024-09-13 22:19:33 scdaemon[9962] slot 0: ATR=3bfd1300008131fe158073c021c057597562694b657940
2024-09-13 22:19:33 scdaemon[9962] DBG: enter: apdu_connect: slot=0
2024-09-13 22:19:33 scdaemon[9962] DBG: leave: apdu_connect => sw=0x0
2024-09-13 22:19:33 scdaemon[9962] DBG: send apdu: c=00 i=A4 p1=00 p2=0C lc=2 le=-1 em=0
2024-09-13 22:19:33 scdaemon[9962] DBG:  raw apdu: 00a4000c023f00
2024-09-13 22:19:33 scdaemon[9962] DBG:  response: sw=6D00  datalen=0
2024-09-13 22:19:33 scdaemon[9962] DBG: send apdu: c=00 i=A4 p1=04 p2=00 lc=8 le=-1 em=0
2024-09-13 22:19:33 scdaemon[9962] DBG:  raw apdu: 00a4040008a000000527471117
2024-09-13 22:19:33 scdaemon[9962] DBG:  response: sw=9000  datalen=30
2024-09-13 22:19:33 scdaemon[9962] DBG:      dump: 5669727475616c206d6772202d2046572076657273696f6e20352e322e34
2024-09-13 22:19:33 scdaemon[9962] DBG:  raw apdu: 001d000000
2024-09-13 22:19:33 scdaemon[9962] DBG:  response: sw=9000  datalen=39
2024-09-13 22:19:33 scdaemon[9962] DBG:       dump:  260102023f0302023f020400b9828104010305030502040602000007010f0801 \
2024-09-13 22:19:33 scdaemon[9962] DBG:              000a01000f01009000
2024-09-13 22:19:33 scdaemon[9962] Yubico: config=260102023f0302023f020400b9828104010305030502040602000007010f0801000a01000f0100
2024-09-13 22:19:33 scdaemon[9962] DBG: send apdu: c=00 i=A4 p1=04 p2=00 lc=6 le=-1 em=0
2024-09-13 22:19:33 scdaemon[9962] DBG:  raw apdu: 00a4040006d27600012401
2024-09-13 22:19:33 scdaemon[9962] DBG:  response: sw=9000  datalen=0
2024-09-13 22:19:33 scdaemon[9962] DBG: send apdu: c=00 i=CA p1=00 p2=4F lc=-1 le=256 em=0
2024-09-13 22:19:33 scdaemon[9962] DBG:  raw apdu: 00ca004f00
2024-09-13 22:19:33 scdaemon[9962] DBG:  response: sw=9000  datalen=16
2024-09-13 22:19:33 scdaemon[9962] DBG:      dump: d2760001240103040006121575690000
2024-09-13 22:19:33 scdaemon[9962] AID: d2760001240103040006121575690000
2024-09-13 22:19:33 scdaemon[9962] DBG: send apdu: c=00 i=CA p1=5F p2=52 lc=-1 le=256 em=0
2024-09-13 22:19:33 scdaemon[9962] DBG:  raw apdu: 00ca5f5200
2024-09-13 22:19:33 scdaemon[9962] DBG:  response: sw=9000  datalen=8
2024-09-13 22:19:33 scdaemon[9962] DBG:      dump: 00730000e0059000
2024-09-13 22:19:33 scdaemon[9962] Historical Bytes: 00730000e0059000
2024-09-13 22:19:33 scdaemon[9962] DBG: send apdu: c=00 i=CA p1=00 p2=C4 lc=-1 le=256 em=0
2024-09-13 22:19:33 scdaemon[9962] DBG:  raw apdu: 00ca00c400
2024-09-13 22:19:33 scdaemon[9962] DBG:  response: sw=9000  datalen=7
2024-09-13 22:19:33 scdaemon[9962] DBG:      dump: ff7f7f7f030003
2024-09-13 22:19:33 scdaemon[9962] DBG: send apdu: c=00 i=CA p1=00 p2=6E lc=-1 le=256 em=0
2024-09-13 22:19:33 scdaemon[9962] DBG:  raw apdu: 00ca006e00
2024-09-13 22:19:33 scdaemon[9962] DBG:  response: sw=614B  datalen=256
2024-09-13 22:19:33 scdaemon[9962] DBG: apdu_send_simple(0): 75 more bytes available
2024-09-13 22:19:33 scdaemon[9962] DBG:  raw apdu: 00c000004b
2024-09-13 22:19:33 scdaemon[9962] DBG:      more: sw=9000  datalen=75
2024-09-13 22:19:33 scdaemon[9962] DBG:      dump: 6e8201474f10d27600012401030400061215756900005f520800730000e00590 \
2024-09-13 22:19:33 scdaemon[9962] DBG:            007f740381012073820120c00a7d000bfe080000ff0000c10b162b06010401da \
2024-09-13 22:19:33 scdaemon[9962] DBG:            470f012bc20c122b060104019755010501b5c30b162b06010401da470f01bcda \
2024-09-13 22:19:33 scdaemon[9962] DBG:            06010800001100c407ff7f7f7f030003c55080cc1b8d04c262ddfee1980c6f7f \
2024-09-13 22:19:33 scdaemon[9962] DBG:            0f91d138fc7b4bbb9bb829a79d92861ea79dbf74efbebf46197e12056232ae34 \
2024-09-13 22:19:33 scdaemon[9962] DBG:            95d37cf0a61ffa5810ba9eebad8f000000000000000000000000000000000000 \
2024-09-13 22:19:33 scdaemon[9962] DBG:            0000c65000000000000000000000000000000000000000000000000000000000 \
2024-09-13 22:19:33 scdaemon[9962] DBG:            0000000000000000000000000000000000000000000000000000000000000000 \
2024-09-13 22:19:33 scdaemon[9962] DBG:            0000000000000000000000000000000000000000cd105b16f37d5b16f3ab5b0b \
2024-09-13 22:19:33 scdaemon[9962] DBG:            1a7b00000000de0801020202030281027f660802020bfe02020bfed6020020d7 \
2024-09-13 22:19:33 scdaemon[9962] DBG:            020020d8020020d9020020
2024-09-13 22:19:33 scdaemon[9962] DBG: send apdu: c=00 i=CA p1=00 p2=5E lc=-1 le=65534 em=255
2024-09-13 22:19:33 scdaemon[9962] DBG:  raw apdu: 00ca005e00fffe
2024-09-13 22:19:33 scdaemon[9962] DBG:  response: sw=9000  datalen=0
2024-09-13 22:19:33 scdaemon[9962] DBG:      dump: [all zero]
2024-09-13 22:19:33 scdaemon[9962] Version-2+ .....: yes
2024-09-13 22:19:33 scdaemon[9962] Version-3+ .....: yes
2024-09-13 22:19:33 scdaemon[9962] Button .........: yes
2024-09-13 22:19:33 scdaemon[9962] SM-Support .....: no
2024-09-13 22:19:33 scdaemon[9962] Get-Challenge ..: yes (3070 bytes max)
2024-09-13 22:19:33 scdaemon[9962] Key-Import .....: yes
2024-09-13 22:19:33 scdaemon[9962] Change-Force-PW1: yes
2024-09-13 22:19:33 scdaemon[9962] Private-DOs ....: yes
2024-09-13 22:19:33 scdaemon[9962] Algo-Attr-Change: yes
2024-09-13 22:19:33 scdaemon[9962] Symmetric Crypto: no
2024-09-13 22:19:33 scdaemon[9962] KDF-Support ....: yes
2024-09-13 22:19:33 scdaemon[9962] Max-Cert-Len ...: 2048
2024-09-13 22:19:33 scdaemon[9962] PIN-Block-2 ....: no
2024-09-13 22:19:33 scdaemon[9962] MSE-Support ....: no
2024-09-13 22:19:33 scdaemon[9962] Max-Special-DOs : 255
2024-09-13 22:19:33 scdaemon[9962] Cmd-Chaining ...: yes
2024-09-13 22:19:33 scdaemon[9962] Ext-Lc-Le ......: yes
2024-09-13 22:19:33 scdaemon[9962] Status-Indicator: 05
2024-09-13 22:19:33 scdaemon[9962] GnuPG-No-Sync ..: no
2024-09-13 22:19:33 scdaemon[9962] GnuPG-Def-PW2 ..: no
2024-09-13 22:19:33 scdaemon[9962] Key-Attr-sign ..: ECC, curve=Ed25519 (eddsa)
2024-09-13 22:19:33 scdaemon[9962] Key-Algo-sign ..: ed25519
2024-09-13 22:19:33 scdaemon[9962] Key-Attr-encr ..: ECC, curve=Curve25519 (djb-tweak)
2024-09-13 22:19:33 scdaemon[9962] Key-Algo-encr ..: cv25519
2024-09-13 22:19:33 scdaemon[9962] Key-Attr-auth ..: ECC, curve=Ed25519 (eddsa)
2024-09-13 22:19:33 scdaemon[9962] Key-Algo-auth ..: ed25519
2024-09-13 22:19:33 scdaemon[9962] DBG: chan_9 -> S SERIALNO D2760001240100000006121575690000
2024-09-13 22:19:33 scdaemon[9962] DBG: chan_9 -> OK
2024-09-13 22:19:33 scdaemon[9962] DBG: enter: apdu_get_status: slot=0 hang=0
2024-09-13 22:19:33 scdaemon[9962] DBG: leave: apdu_get_status => sw=0x0 status=7
2024-09-13 22:19:33 scdaemon[9962] DBG: chan_9 <- LEARN --force

System details: GNU/Linux 6.6.37, glibc-2.33, GnuPG 2.5.0 and 2.5.1; the tokens involved are a Yubikey 5 and a FST-01/GnuK (both tokens are working fine on another machine with GnuPG 2.4.5).

Details

Version
2.5.1

Event Timeline

werner triaged this task as Unbreak Now! priority.Sep 15 2024, 10:22 AM
werner added a subscriber: werner.

Actually we have similar problems with the 2.4 branch - in particular on Windows. We can quite easy reproduce this when using Kleopatra. Thus the working hypothesis is a locking problem because Kleopatra uses several threads and the order of actions recently changed.

It is good that you can sometimes replicate this with standard tools on Windows. Thanks for reporting.

gniibe lowered the priority of this task from Unbreak Now! to High.Sep 18 2024, 6:55 AM

For POSIX, I found an issue of T7151.
If you are using multiple clients which invoke SCD DEVINFO --watch, the problem should be fixed.
If it's not the case, let us locate a bug (or two).

Do you have scd-event script in your GNUPGHOME?

I don’t think I have any other client that would invoke SCD DEVINFO --watch.

But I do have a scd-event script (sorry for not mentioning that earlier, I had almost forgotten about that script), and I realized that it was in fact not called! So I tried removing it, and since then the bug has not occurred yet.

The change of spawn functions (basically, it was factored out to libgpg-error) results more possibilities to have blocked scdaemon, since the implementation in libgpg-error introduced context switch points when it spawns a child process.

With no scd-event script, it might improve the situation (less context switches among threads), but I think there is a real bug.

I don’t think I have any other client that would invoke SCD DEVINFO --watch.

That's my badness, I didn't consider well about this particular use case (which is common). I fixed another bug in T7151: graceful shutdown: DEVINFO should be a gpg-agent command: also watching input close.
rG0a94582af5b1: scd: Fix DEVINFO, allowing no clients which watch the change.

With no scd-event script, it might improve the situation

That clearly seems to be the case indeed, as I have not observed the bug at all ever since I removed the scd-event script.