dirmngr fails repeatedly with "invalid argument", without kicking the host from its list
Closed, Duplicate
Public

Description

I have dirmngr configured to use hkps://hkps.pool.sks-keyservers.net.

When i try to retrieve a key, gpg sometimes fails with:

0 dkg@alice:~/src/sks$ gpg --refresh 41259773973A612A
gpg: refreshing 1 key from hkps://hkps.pool.sks-keyservers.net
gpg: keyserver refresh failed: Invalid argument
2 dkg@alice:~/src/sks$

dirmngr's logs show:

2016-08-08 02:13:26 dirmngr[2805.1] DBG: chan_1 <- GETINFO version
2016-08-08 02:13:26 dirmngr[2805.1] DBG: chan_1 -> D 2.1.14
2016-08-08 02:13:26 dirmngr[2805.1] DBG: chan_1 -> OK
2016-08-08 02:13:26 dirmngr[2805.1] DBG: chan_1 <- KEYSERVER
2016-08-08 02:13:26 dirmngr[2805.1] DBG: chan_1 -> S KEYSERVER hkps://hkps.pool.sks-keyservers.net
2016-08-08 02:13:26 dirmngr[2805.1] DBG: chan_1 -> OK
2016-08-08 02:13:26 dirmngr[2805.1] DBG: chan_1 <- KS_GET -- 0xC90EF1430B3AC0DFD00E6EA541259773973A612A
2016-08-08 02:13:26 dirmngr[2805.1] DBG: gnutls:L3: ASSERT: mpi.c[_gnutls_x509_read_uint]:246
2016-08-08 02:13:26 dirmngr[2805.1] DBG: gnutls:L5: REC[0x7f1a48016e20]: Allocating epoch #0
2016-08-08 02:13:26 dirmngr[2805.1] can't connect to '2001:ba8:1f1:f2d4::2': Invalid argument
2016-08-08 02:13:26 dirmngr[2805.1] error connecting to 'https://[2001:ba8:1f1:f2d4::2]:443': Invalid argument
2016-08-08 02:13:26 dirmngr[2805.1] DBG: gnutls:L5: REC[0x7f1a48016e20]: Start of epoch cleanup
2016-08-08 02:13:26 dirmngr[2805.1] DBG: gnutls:L5: REC[0x7f1a48016e20]: End of epoch cleanup
2016-08-08 02:13:26 dirmngr[2805.1] DBG: gnutls:L5: REC[0x7f1a48016e20]: Epoch #0 freed
2016-08-08 02:13:26 dirmngr[2805.1] command 'KS_GET' failed: Invalid argument
2016-08-08 02:13:26 dirmngr[2805.1] DBG: chan_1 -> ERR 167804976 Invalid argument <Dirmngr>

When i simply retry the query, i end up with the same exact failure on the same
host.

Ideally, dirmngr should happily connect. gnutls-cli is capable of connecting to
that host using TLS (though i haven't tried verifying the certificate through
gnutls-cli).

At the very least, i'd expect dirmngr to reject that particular member of the
pool and try a different pool member on subsequent attempts.
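The behaviour asked for here can be sketched roughly as follows. This is a minimal illustration in Python, not dirmngr's actual C code: the `Pool` class, its `fetch` method, and the caller-supplied `connect` callable are all hypothetical names. The idea is only that a member which fails to connect is remembered as dead and skipped on later requests.

```python
import errno
import random


class Pool:
    """Hypothetical pool of keyserver addresses with dead-host tracking."""

    def __init__(self, addresses):
        self.addresses = list(addresses)
        self.dead = set()

    def alive(self):
        """Return the members that have not been marked dead."""
        return [a for a in self.addresses if a not in self.dead]

    def fetch(self, connect, rng=random):
        """Try live members in random order until one connects.

        `connect` is a caller-supplied callable (an assumption of this
        sketch) that raises OSError when the connection fails.
        """
        candidates = self.alive()
        rng.shuffle(candidates)
        last_error = None
        for addr in candidates:
            try:
                return connect(addr)
            except OSError as exc:
                # Remember the failure so later requests skip this member.
                self.dead.add(addr)
                last_error = exc
        if last_error is None:
            raise OSError(errno.EHOSTUNREACH, "no live pool members")
        raise last_error
```

With this shape, a retry after "Invalid argument" would land on a different pool member instead of hitting the same host again. A real implementation would presumably also want to bring dead hosts back after some time.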

Details

Version
2.1.14
dkg set Version to 2.1.14. Aug 8 2016, 8:23 AM
dkg added projects: dirmngr, Bug Report.
dkg added a subscriber: dkg.
dkg added a comment. Aug 8 2016, 8:25 AM

I note that if i restart dirmngr it will just choose a new member of the pool
and that member will work.

dkg added a comment. Nov 8 2016, 6:00 PM

I'm also seeing this behavior when there is something wrong with the reverse DNS
lookups. For example:

Nov 08 10:54:36 alice dirmngr[1714]: handler for fd 5 started
Nov 08 10:54:36 alice dirmngr[1714]: DBG: chan_5 -> # Home: /home/dkg/.gnupg
Nov 08 10:54:36 alice dirmngr[1714]: DBG: chan_5 -> # Config: /home/dkg/.gnupg/dirmngr.conf
Nov 08 10:54:36 alice dirmngr[1714]: DBG: chan_5 -> OK Dirmngr 2.1.15 at your service
Nov 08 10:54:36 alice dirmngr[1714]: connection from process 7623 (1000:1000)
Nov 08 10:54:36 alice dirmngr[1714]: DBG: chan_5 <- GETINFO version
Nov 08 10:54:36 alice dirmngr[1714]: DBG: chan_5 -> D 2.1.15
Nov 08 10:54:36 alice dirmngr[1714]: DBG: chan_5 -> OK
Nov 08 10:54:36 alice dirmngr[1714]: DBG: chan_5 <- KEYSERVER
Nov 08 10:54:36 alice dirmngr[1714]: DBG: chan_5 -> S KEYSERVER hkps://hkps.pool.sks-keyservers.net
Nov 08 10:54:36 alice dirmngr[1714]: DBG: chan_5 -> OK
Nov 08 10:54:36 alice dirmngr[1714]: DBG: chan_5 <- KS_GET -- 0x2E8DD26C53F1197DDF403E6118E667F1EB8AF314
Nov 08 10:54:36 alice dirmngr[1714]: DBG: gnutls:L3: ASSERT: mpi.c[_gnutls_x509_read_uint]:246
Nov 08 10:54:36 alice dirmngr[1714]: DBG: gnutls:L5: REC[0x7f7458003000]: Allocating epoch #0
Nov 08 10:54:36 alice dirmngr[1714]: can't connect to 'oteiza.siccegge.de': Invalid argument
Nov 08 10:54:36 alice dirmngr[1714]: error connecting to 'https://oteiza.siccegge.de:443': Invalid argument
Nov 08 10:54:36 alice dirmngr[1714]: DBG: gnutls:L5: REC[0x7f7458003000]: Start of epoch cleanup
Nov 08 10:54:36 alice dirmngr[1714]: DBG: gnutls:L5: REC[0x7f7458003000]: End of epoch cleanup
Nov 08 10:54:36 alice dirmngr[1714]: DBG: gnutls:L5: REC[0x7f7458003000]: Epoch #0 freed
Nov 08 10:54:36 alice dirmngr[1714]: command 'KS_GET' failed: Invalid argument
Nov 08 10:54:36 alice dirmngr[1714]: DBG: chan_5 -> ERR 167804976 Invalid argument <Dirmngr>
Nov 08 10:54:36 alice dirmngr[1714]: DBG: chan_5 <- BYE
Nov 08 10:54:36 alice dirmngr[1714]: DBG: chan_5 -> OK closing connection
Nov 08 10:54:36 alice dirmngr[1714]: handler for fd 5 terminated

This appears to be because the pool included 92.43.111.21, which has a PTR of
oteiza.siccegge.de, despite the fact that oteiza.siccegge.de has no A record.
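The mismatch described above (a PTR name that does not resolve back to an address) can be checked with a small script. This is only a rough sketch: `ptr_resolves_back` is a hypothetical helper name, the lookup functions are injectable so it can be exercised without a network, and the address comparison is a plain string match, so differently formatted IPv6 addresses might not compare equal.

```python
import socket


def ptr_resolves_back(ip, reverse=socket.gethostbyaddr,
                      forward=socket.getaddrinfo):
    """Return (ptr_name, ok); ok is True if ptr_name resolves back to ip."""
    try:
        # gethostbyaddr returns (hostname, aliaslist, ipaddrlist).
        ptr_name = reverse(ip)[0]
    except OSError:
        return None, False  # no usable PTR record
    try:
        infos = forward(ptr_name, None)
    except OSError:
        return ptr_name, False  # PTR exists, but the name has no address
    # Each getaddrinfo entry ends with a sockaddr whose first item is the IP.
    addresses = {info[4][0] for info in infos}
    return ptr_name, ip in addresses
```

Calling `ptr_resolves_back("92.43.111.21")` at the time of the report would presumably have returned the PTR name together with `False`.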

There is no reason for dirmngr to be talking to the member of the pool by its
hostname, anyway -- it should make the connection by IP address, with the TLS
SNI set to the pool name.
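As a rough illustration of that suggestion (Python, not dirmngr code): open the TCP connection to the pool member's IP address and pass the pool name as the TLS server name, so that SNI and certificate hostname checking both use the pool name. `connect_pool_member` and its parameters are hypothetical, and the sketch assumes the TLS context trusts whatever CA signs the pool's certificate.

```python
import socket
import ssl

POOL_NAME = "hkps.pool.sks-keyservers.net"


def connect_pool_member(ip, pool_name=POOL_NAME, port=443, timeout=10.0,
                        context=None, connector=socket.create_connection):
    """Connect to `ip` directly, using `pool_name` for SNI and verification.

    `context` and `connector` are injectable for testing; by default the
    standard library's default TLS context and TCP connect are used.
    """
    ctx = context if context is not None else ssl.create_default_context()
    # The TCP connection goes to the literal IP; no PTR or per-host lookup.
    raw = connector((ip, port), timeout=timeout)
    try:
        # server_hostname sets the SNI extension and the name that the
        # certificate is checked against.
        return ctx.wrap_socket(raw, server_hostname=pool_name)
    except Exception:
        raw.close()
        raise
```

Done this way, the per-host name (and therefore its PTR record) never enters the picture.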

gniibe added a subscriber: gniibe. Jun 20 2018, 3:46 AM

The problem in the last comment was fixed in T2928: stop fetching PTR records entirely.
For the original issue, it looks like EINVAL is returned by the connect(2) system call.
That's quite strange, but it was possible with IPv6.

One problem in the handling of IPv6 name resolution has been fixed: rG892b33bb2c57: dirmngr: Fix alignment of ADDR.

This might be the cause of this issue.