dirmngr: retry without SRV due to buggy routers
Open, High, Public

Description

It took me a while to realize that the error below was related to dirmngr:

$ gpg --search-keys E4053F8D0E7C4B9A0A20AB27DC553250F8FE7407
gpg: error searching keyserver: Server indicated a failure
gpg: keyserver search failed: Server indicated a failure

It then took me a bit longer to realize that the cause was my router, an R6300v2, which replies to SRV queries for _pgpkey-https with a "format error". One can use dirmngr to reproduce the issue:

$ dirmngr
KEYSERVER --resolve
S # hkps://hkps.pool.sks-keyservers.net:443: resolve failed: Server indicated a failure

KS_SEARCH -- E4053F8D0E7C4B9A0A20AB27DC553250F8FE7407
dirmngr[18749.0]: command 'KS_SEARCH' failed: Server indicated a failure <Unspecified source>
ERR 219 Server indicated a failure <Unspecified source>

Using my ISP's DNS server (Comcast, 75.75.75.75) works, as does using Google DNS (8.8.8.8). However, these routers are configured by default to tell clients that they are the DNS server, and they handle your requests themselves before forwarding them to your ISP's DNS server.

The router has the latest firmware, V1.0.4.12_10.0.81, so this likely won't be fixed any time soon...

I've captured with tcpdump what the router replies with, and I'll attach it to the bug report. I've also come up with what I thought would be a patch to fix the issue; I'll send an RFC to the devel mailing list and refer to this bug report.

Details

Version
gnupg-2.2.3-71-g918792befd83

Event Timeline

The attachment is a tcpdump capture that you can view with Wireshark to see the response from the buggy AP. This raises the question of how many other buggy APs are out there. Note that the issue happens even if I do not use https or hkps. I tried all sorts of combinations with this AP, and the only thing that worked for hkp was to not use the AP for DNS; regular DNS requests do work, though.

This is a cheesy attempt on my part to resolve the issue, but it did not fix it.

Old SRV bug, which probably induced code changes that caused a regression. It's not yet clear whether this is a regression or whether the router issue is a regression / "feature".

Unconditionally retrying without an SRV lookup is not a good idea. SRV records are there for a reason. What we could do is add an option to skip SRV record lookups.
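
To make the proposal concrete, here is a minimal sketch of what an opt-in "skip SRV" switch could look like. This is not dirmngr's code: the function name `resolve_keyserver`, the `skip_srv` flag and the caller-supplied `srv_lookup` callable are all hypothetical and exist only for illustration.

```python
def resolve_keyserver(host, port, srv_lookup, skip_srv=False,
                      service="_pgpkey-https._tcp"):
    """Return a list of (host, port) targets for a keyserver.

    srv_lookup is a hypothetical callable that takes an SRV owner name
    and returns a list of (target, port) tuples, or raises on failure.
    """
    if skip_srv:
        # Opt-in escape hatch: never send the SRV query, so a resolver
        # that answers SRV questions with FormErr cannot break the lookup.
        return [(host, port)]

    # Default path: honor SRV records. A failure here is propagated on
    # purpose instead of being retried silently without SRV.
    records = srv_lookup(f"{service}.{host}")
    return records if records else [(host, port)]
```

With `skip_srv=True` the lookup callable is never invoked; without it, an error from the resolver still surfaces to the caller, which keeps the default behavior strict.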

For reference here is @mcgrof's dump in a directly readable format:

00:29:33.472844 IP 192.168.4.7.10218 > 192.168.4.1.domain: 53039+ SRV? _pgpkey-https._tcp.hkps.pool.sks-keyservers.net. (65)
00:29:33.879268 IP 192.168.4.1.domain > 192.168.4.7.10218: 53039 FormErr 0/0/0 (65)
00:29:33.880719 IP 192.168.4.7.10218 > 192.168.4.1.domain: 51133+ Type0 (Class 8448)? _pgpkey-https._tcp.hkps.pool.sks-keyservers.net. (66)
00:29:33.902115 IP 192.168.4.1.domain > 192.168.4.7.10218: 51133 FormErr 0/0/0 (65)
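
For anyone who wants to check the capture by hand, the following sketch builds the same kind of SRV question and decodes the header of a reply. It only uses the standard DNS header layout (six 16-bit fields, RCODE in the low four bits of the flags); the helper names `build_query` and `parse_header` are made up for this example and nothing here talks to the network.

```python
import struct

TYPE_SRV = 33   # SRV record type
CLASS_IN = 1    # Internet class
RCODES = {0: "NoError", 1: "FormErr", 2: "ServFail",
          3: "NXDomain", 4: "NotImp", 5: "Refused"}

def build_query(qid, name, qtype=TYPE_SRV):
    """Build a minimal DNS query with the recursion-desired bit set."""
    header = struct.pack("!HHHHHH", qid, 0x0100, 1, 0, 0, 0)
    # Encode the name as length-prefixed labels ending in a zero byte.
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    return header + qname + struct.pack("!HH", qtype, CLASS_IN)

def parse_header(packet):
    """Decode the 12-byte DNS header of a query or response."""
    qid, flags, _qd, an, ns, ar = struct.unpack("!HHHHHH", packet[:12])
    rcode = flags & 0x000F
    return {
        "id": qid,
        "is_response": bool(flags & 0x8000),
        "rcode": RCODES.get(rcode, str(rcode)),
        "counts": (an, ns, ar),  # answer/authority/additional
    }
```

A query built this way for `_pgpkey-https._tcp.hkps.pool.sks-keyservers.net.` is 65 bytes long, which matches the `(65)` in the first line of the dump; a reply whose flags carry RCODE 1 decodes as `FormErr` with counts `0/0/0`, as tcpdump shows.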

werner edited projects, added Feature Request; removed Bug Report.

An option to ignore SRV records would also be good for debugging. Thus I raised the priority and turned this into a feature request.