stop fetching PTR records entirely
Closed, Resolved, Public

Authored By

	dkg
	Jan 23 2017, 10:25 PM

Description

I'm breaking out the discussion about reliance on PTR records into a new issue here.

Over on T2902, Werner wrote:

To answer your question:
[dkg wrote:]

Can you explain why dirmngr does the DNS roundtrip lookup, mapping
from the pool's A and AAAA addresses back to names? It seems like
it'd be a lot simpler (and faster, and less error-prone) to avoid
the PTR lookups if we have the IP addresses already.

If it is a plain server and not a pool, looking up the PTR is
necessary to get the hostname for SNI and possibly also for the Host:
header.

This is *not* the standard way to set SNI for TLS, or the standard way to set
the Host: header for HTTP.

The expectation in the TLS and HTTP worlds is that the SNI and the Host: headers
will contain the name that the client is configured with, *not* the name
reported by PTR records.

Using PTR records to "scrub" SNI actually leaves a security vulnerability if the
client is willing to accept certificates that match the PTR-derived hostname,
instead of requiring certificates to match the client-configured hostname.

As for sending the Host: header -- as with a browser lookup, i'd expect any HTTP
client to send the requested (client-configured) hostname, not anything that has
passed through a DNS reverse lookup.
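To make that expectation concrete, here is a minimal Python sketch (not dirmngr's code, which is C). The helper names `build_get` and `open_tls` are made up for illustration. The address is used only for the TCP connection; the configured name is what goes into SNI, certificate verification, and the Host: header.

```python
import socket
import ssl


def build_get(configured_name: str, path: str) -> bytes:
    """Build a minimal HTTP/1.1 GET whose Host: is the configured name."""
    return (
        f"GET {path} HTTP/1.1\r\n"
        f"Host: {configured_name}\r\n"
        "Connection: close\r\n\r\n"
    ).encode("ascii")


def open_tls(configured_name: str, address: str, port: int = 443,
             timeout: float = 10.0) -> ssl.SSLSocket:
    """Connect to `address`, but do TLS as `configured_name`."""
    # The default context verifies the chain and checks the hostname.
    ctx = ssl.create_default_context()
    raw = socket.create_connection((address, port), timeout=timeout)
    try:
        # server_hostname sets SNI and is the name the certificate must match.
        return ctx.wrap_socket(raw, server_hostname=configured_name)
    except Exception:
        raw.close()
        raise
```

No reverse lookup appears anywhere in this path, so a PTR record has no way to influence which name the certificate is checked against. (The sketch ignores non-default ports in the Host: header.)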

For a pool we would not need the name because the already known name
of the pool is used for SNI. However, to find duplicate hosts in the
hosttable it is useful to have the hostname.

I'm assuming that "duplicate hosts" means "hosts reachable via multiple
addresses". Is that what you mean? For example, if a keyserver foo.example.com
is reachable on both an IPv4 address, and is also reachable on an IPv6 address,
then those two addresses are in some sense "duplicates". is that right?

If so, i'm not sure what dirmngr gains from that information. If i can't
reach host X via address A, does that mean i *should* or *should not* try
reaching it on address B?

We also return the
actual used hostname to gpg for information purposes and to eventually
store this with the key as meta info.

The "actual used hostname" should be the configured hostname, i think. Any
additional metadata about the path by which the key was retrieved would be
interesting to have, but none of it is cryptographically authenticated. And if
we're not doing anything with it today, why should dirmngr block for several
more seconds (on what can already be a fairly high-latency operation) for info
that we're likely to just throw away?

Yes, we could do the PTR lookup of pools faster or in the background -
but for now a simple approach is better for debugging.

sure, we can leave the discussion about parallelizing DNS queries and responses
to T2907 :)

For this ticket, let's focus on whether we can just do away with PTR entirely.

What would make sense: a patch to make it optional, with another config option
for dirmngr? or just a patch to remove reverse DNS requests entirely?

Details

External Link: https://bugs.debian.org/854359
Version: 2.1.18

Related Objects

Mentioned In: T2438: dirmngr fails repeatedly with "invalid argument", without kicking the host from its list
T2902: dimrngr over tor fails obscurely on IPv6 records when NoIPv6Traffic flag is set

Event Timeline

dkg added projects: dirmngr, gnupg, Bug Report, Debian. Jan 23 2017, 10:25 PM

dkg set Version to 2.1.17.

dkg added a subscriber: dkg.

dkg mentioned this in T2902: dimrngr over tor fails obscurely on IPv6 records when NoIPv6Traffic flag is set. Mar 30 2017, 7:19 PM

Here's a concrete example of how using PTR records gets things mixed up.

keyserver.stack.nl offers keyserver service on port 443.

It has an A record at 131.155.141.70.

But the PTR record points to mud.stack.nl:

70.141.155.131.in-addr.arpa. 69674 IN PTR mud.stack.nl.

and the server returns an entirely different website depending on the TLS SNI
and HTTP Host: values, i.e. on whether you access it as:

  https://mud.stack.nl/

  https://keyserver.stack.nl/

If you access it as https://hkps.pool.sks-keyservers.net/, you get the
"keyserver" view. But if you access it by the name in the PTR record
("mud.stack.nl") then you get the mud view (and a 404 on any /pks URLs).

Even more troubling is that dirmngr successfully connects to mud.stack.nl and
does the query, even though it is configured to only talk to
hkps.pool.sks-keyservers.net.

This suggests that anyone able to spoof a PTR record to me can get my dirmngr to
send my potentially-sensitive keyserver queries to an entirely different webserver.
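The mismatch above can be checked mechanically. The following Python sketch is illustrative only: `normalize`, `reverse_name`, and `ptr_matches` are hypothetical helper names, and the reverse lookup is injectable so the comparison can be exercised without touching the network. Current DNS data for the hosts named above may of course differ from what was observed in 2017.

```python
import socket
from typing import Callable


def normalize(name: str) -> str:
    """Lower-case a DNS name and drop the trailing root dot."""
    return name.rstrip(".").lower()


def reverse_name(address: str) -> str:
    """PTR-derived name for an address, via the system resolver."""
    return socket.gethostbyaddr(address)[0]


def ptr_matches(configured_name: str, address: str,
                reverse: Callable[[str], str] = reverse_name) -> bool:
    """True if the PTR name for `address` equals the configured name."""
    return normalize(reverse(address)) == normalize(configured_name)


if __name__ == "__main__":
    # Replays the record quoted above instead of querying DNS.
    observed = {"131.155.141.70": "mud.stack.nl."}
    print(ptr_matches("keyserver.stack.nl", "131.155.141.70",
                      reverse=observed.__getitem__))
```

A `False` result here is exactly the situation in which substituting the PTR name for the configured one changes which virtual host answers the request.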

dkg changed Version from 2.1.17 to 2.1.18. Jan 24 2017, 5:39 AM

We have several cases:

1) A pool accessed via round-robin A/AAAA records: We do not use the canonical hostname (i.e. from the PTR) but the name of the pool for the certificate. This is the classical way keyserver pools work.

2) A pool accessed via SRV records: The SRV record has the canonical name and thus we do not need a PTR lookup. But we need an address lookup.

3) A keyserver specified by its name: We already have the name, thus no need for a PTR lookup.

4) A keyserver specified by literal IP address: We need a host name for the certificate. Either we take it from the PTR record or we reject TLS access. I don't think that this is a real-world use case, but for debugging it is/was really helpful. Should we reject hkps via literal IP addresses?

It is quite possible that some of these cases do not work right. I
have done only manual testing and the matrix is pretty complex: We
have all combinations of direct/Tor, v4 only, v6 only, v4, v6,
interface up, network down.

Right, by "duplicate host", I mean hosts reachable by several addresses
and in particular by v4 and v6. My test back when I originally
implemented the code showed that when hosts are down their other
addresses are also down. Without marking the host dead, the code
would have tried the same request on another address and would run
into the next timeout.

I also think that most delays are due to connection problems and not due to DNS
problems. And most connection problems are due to lost network access. There
we might need to tweak the code a bit similar to what I did for ADNS.

for cases (1), (2), and (3) it sounds like you don't need the PTR at all. right?

For your case (4), i think we should reject hkps via literal IP addresses. It's
not a real-world use case, and if you want to test/experiment with hkps as a
developer, you should have at least the capacity to edit /etc/hosts (or whatever
your system's equivalent is). Anyway, trying to support this case for the
purposes of debugging doesn't make sense if support for this case is the cause
of the bugs in the first place ;)
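Rejecting case (4) could look roughly like the Python sketch below. It is an illustration of the policy only, not dirmngr's option parsing; `is_literal_ip` and `check_keyserver` are hypothetical names, and the example addresses come from the documentation ranges.

```python
import ipaddress
from urllib.parse import urlsplit


def is_literal_ip(host: str) -> bool:
    """True if `host` is an IPv4 or IPv6 literal rather than a name."""
    try:
        ipaddress.ip_address(host)
        return True
    except ValueError:
        return False


def check_keyserver(url: str) -> str:
    """Return the configured host, refusing hkps with a literal IP."""
    parts = urlsplit(url)
    host = parts.hostname  # brackets around IPv6 literals are stripped
    if not host:
        raise ValueError(f"no host in keyserver URL: {url!r}")
    if parts.scheme == "hkps" and is_literal_ip(host):
        # Without a configured name there is nothing to verify the
        # certificate against, so refuse instead of asking DNS for a PTR.
        raise ValueError(f"hkps requires a hostname, got address {host}")
    return host
```

Plain hkp to a literal address still passes, since no certificate name is involved there.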

re: duplicate hosts: I live in a part of the world where dual-stack
connectivity is sketchy at best. And, when connecting to things over Tor, it's
possible that connections to IPv4 hosts will have a different failure rate than
IPv6 connections.
So unless you already know that the host itself is down, why would you avoid
trying the other routes you have to it?

Look at it another way: when trying to reach host X, you discover that X has two
IP addresses, A and B. You try to reach A and it's not available. Isn't it
better to try B instead, rather than to avoid trying B at all just because A was
unreachable?
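One way to express that in a host table is to record failures per address rather than per host. The Python sketch below uses a hypothetical `AddressTable` class purely to illustrate the idea; it says nothing about how dirmngr's real hosttable is laid out.

```python
from typing import Iterable, List, Set


class AddressTable:
    """Remembers failures per address, never per host."""

    def __init__(self) -> None:
        self._dead: Set[str] = set()

    def mark_dead(self, address: str) -> None:
        """Record that a connection attempt to `address` failed."""
        self._dead.add(address)

    def candidates(self, addresses: Iterable[str]) -> List[str]:
        """Addresses still worth trying, in their original order."""
        return [a for a in addresses if a not in self._dead]
```

With this shape, a failure on A leaves B available, and no name (PTR-derived or otherwise) is needed to key the table.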

In a pool scenario, you might want to try to cluster addresses together by
perceived identity so that you can try an entirely different host first, rather
than a different address for the same host that happens to be in the pool twice.
But that strikes me as a very narrow optimization, certainly something that'd
only be worth implementing after we've squeezed the last bit of performance out
of other parts of the code (parallel connections, "happy eyeballs", etc).
Definitely not something to bother with at the outset. So i'd say drop that
optimization for simplicity's sake.

So the simplest approach is:

a) know the configured name of the keyserver
b) resolve it to a set of addresses
c) try to connect to those addresses, using the configured name of the server
for SNI and HTTP Host:

This is all that's needed for cases (1) and (3), and it could also be used in
case (2) if you see (b) as a two-stage resolution process (name→SRV→A/AAAA),
discarding the intermediate names from the SRV. Given that some people may
access the pool via case (1), and servers in the pool won't be able to
distinguish between how they were selected (SRV vs. A/AAAA), they'll still
accept the connections.
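Steps (a) through (c) can be sketched in Python as follows. This is a sketch under assumptions, not a proposal for the actual C code: `resolve` and `connect_by_name` are hypothetical names, and the `connect` callable stands in for whatever opens the TLS/HTTP session. The only name that ever reaches `connect` is the configured one.

```python
import socket
from typing import Callable, List, Tuple, TypeVar

T = TypeVar("T")


def resolve(name: str, port: int) -> List[str]:
    """Step (b): forward-resolve the configured name to addresses."""
    infos = socket.getaddrinfo(name, port, type=socket.SOCK_STREAM)
    addresses: List[str] = []
    for _family, _type, _proto, _canon, sockaddr in infos:
        if sockaddr[0] not in addresses:
            addresses.append(sockaddr[0])
    return addresses


def connect_by_name(configured_name: str, port: int,
                    connect: Callable[[str, int, str], T],
                    resolver: Callable[[str, int], List[str]] = resolve) -> T:
    """Step (c): try each address, always presenting the configured name."""
    errors: List[Tuple[str, OSError]] = []
    for address in resolver(configured_name, port):
        try:
            # The third argument is the name for SNI and Host:.
            return connect(address, port, configured_name)
        except OSError as exc:  # also covers ssl.SSLError
            errors.append((address, exc))
    raise ConnectionError(
        f"no address for {configured_name} was reachable: {errors}")
```

Note that a failure on one address simply moves on to the next, and nothing in the loop needs a reverse lookup.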

If you decide the additional complexity is worthwhile for tracking the
intermediate names in the SRV records, you can always propagate the intermediate
names wherever you like locally without changing the "simplest" algorithm.

If you really want to use the names from the SRV when connecting, then the
algorithm should change to:

a) know the configured name of the keyserver
b) resolve it to a set of intermediate names
c) resolve the intermediate names to a set of addresses
d) try to connect to those addresses, using the intermediate name of the server
for SNI and HTTP Host:

But still, no PTR records are needed.
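Both SRV variants fit in one small Python sketch. The standard library has no SRV lookup, so `lookup_srv` and `lookup_addrs` are injected callables and entirely hypothetical, as are the `Srv` and `Candidate` types. Ordering is by priority only; the weighted selection that RFC 2782 describes is left out for brevity.

```python
from typing import Callable, Iterable, List, NamedTuple


class Srv(NamedTuple):
    priority: int
    weight: int
    port: int
    target: str


class Candidate(NamedTuple):
    address: str
    port: int
    tls_name: str  # name to present for SNI and Host:


def candidates(configured_name: str,
               lookup_srv: Callable[[str], Iterable[Srv]],
               lookup_addrs: Callable[[str], Iterable[str]],
               use_srv_names: bool = False) -> List[Candidate]:
    """Two-stage forward resolution: name -> SRV targets -> addresses."""
    out: List[Candidate] = []
    for rec in sorted(lookup_srv(configured_name), key=lambda r: r.priority):
        target = rec.target.rstrip(".")
        # Either discard the intermediate name or use it for TLS and HTTP.
        tls_name = target if use_srv_names else configured_name
        for address in lookup_addrs(target):
            out.append(Candidate(address, rec.port, tls_name))
    return out
```

With `use_srv_names=False` this is the "simplest" algorithm applied to case (2); with `True` it is the four-step variant. Either way, every name involved comes from a forward lookup.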

georg added a subscriber: georg. Feb 1 2017, 4:17 PM

The unnecessary PTR lookup is causing problems for other people too, over on
https://bugs.debian.org/854359