Why updating DNS is slow (wizardzines.com)
121 points by 0xedb on Feb 7, 2021 | 40 comments


As an aside, I recently goofed up our company website's DNS (updated a record with a long TTL to an incorrect value), but quickly fixed it and found a partial workaround for propagation: flush the DNS caches of Google[1] and Cloudflare[2]. That refreshed the cached record within minutes from most global locations, if not all.

[1] https://dns.google/cache [2] https://1.1.1.1/purge-cache/


OpenDNS was the first to expose cache clearing to end users: http://cachecheck.opendns.com/


I find it interesting that you can do this for any website at all…


It only costs them a single lookup


It's not an attack surface. You're posting a relatively large amount of data to ask them to do one small lookup.


It could be one aspect of a cache poisoning attack. That's one of the things DNSSEC was meant to protect against.


I was curious as to how a certificate could be issued to an IP address and not a domain (I didn't know that such a thing was possible), and learned this:

    X509v3 Subject Alternative Name: 
        DNS:cloudflare-dns.com, DNS:*.cloudflare-dns.com, DNS:one.one.one.one, IP Address:1.1.1.1, IP Address:1.0.0.1, IP Address:162.159.36.1, IP Address:162.159.46.1, IP Address:2606:4700:4700:0:0:0:0:1111, IP Address:2606:4700:4700:0:0:0:0:1001, IP Address:2606:4700:4700:0:0:0:0:64, IP Address:2606:4700:4700:0:0:0:0:6400

You can inspect the certificate yourself with:

    openssl s_client -showcerts -connect 1.1.1.1:443 < /dev/null | openssl x509 -text -noout

Too bad Let's Encrypt will not issue certificates for IPs at this time.


I just did it for fb.com. What impact does this have on such huge websites?


None. For the recursive resolver it is effectively the same as a TTL expiration. The next request triggers a recursion, which adds a few milliseconds to the response time, and the result is cached again.
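
A toy model of that claim, as a sketch only: a cache where an explicit purge and a TTL expiry both end in the same state, a miss that triggers one fresh lookup. `TTLCache`, `resolve_upstream`, and the fake clock are made-up names for illustration, not how Google or Cloudflare actually implement their resolvers.

```python
class TTLCache:
    """Toy resolver cache: an entry is served until its TTL runs out or it is purged."""

    def __init__(self, resolve_upstream, clock):
        self._resolve_upstream = resolve_upstream  # hypothetical: stands in for the real recursion
        self._clock = clock                        # injected so the example needs no sleeping
        self._entries = {}                         # name -> (answer, expires_at)
        self.upstream_lookups = 0

    def lookup(self, name):
        entry = self._entries.get(name)
        if entry is not None and self._clock() < entry[1]:
            return entry[0]                        # cache hit
        # Miss: the entry expired, was purged, or was never seen.
        answer, ttl = self._resolve_upstream(name)
        self.upstream_lookups += 1
        self._entries[name] = (answer, self._clock() + ttl)
        return answer

    def purge(self, name):
        self._entries.pop(name, None)


fake_time = [0]  # seconds; mutated below to simulate time passing
cache = TTLCache(lambda name: ("192.0.2.10", 300), clock=lambda: fake_time[0])

cache.lookup("example.com")   # first lookup: miss
cache.lookup("example.com")   # hit, no upstream traffic
cache.purge("example.com")
cache.lookup("example.com")   # miss again, same path as an expiry
fake_time[0] = 1000
cache.lookup("example.com")   # TTL of 300 s has passed: another miss
print(cache.upstream_lookups)  # 3
```

Either way the cost to the resolver is one extra upstream lookup per purged name.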


I would imagine it does some kind of atomic swap, where it's just force updating an existing cache and not actually flushing it.


The cache probably gets refilled pretty quickly. If you were to spam this endpoint it'd be more of a problem, but I would assume/hope there's rate limiting on it.


Nice find! Filing that away for future reference.


Would be nice if they didn't stomp on real addresses.

> The blocks 192.0.2.0/24 (TEST-NET-1), 198.51.100.0/24 (TEST-NET-2), and 203.0.113.0/24 (TEST-NET-3) are provided for use in documentation.

-- RFC5737 https://tools.ietf.org/html/rfc5737


Just wondering, how can a network block have a "2" in "192.0.2.0/24"? Shouldn't it be "192.0.0.0/24"?


No, 192.0.0.0/24 and 192.0.2.0/24 are completely separate networks.

See "Classless Inter-Domain Routing" [0] (a.k.a. "CIDR")

--

[0]: https://en.wikipedia.org/wiki/Classless_Inter-Domain_Routing


/24 is a 256-address subnet, so hosts can use 1-254 in the last "octet" (0 and 255 are special: the network and broadcast addresses; each dot-separated section of an IPv4 address is called an octet).

You can play with the numbers here [0] to see valid combos, like this: 192.1.254.19/24 (or use a 255.255.255.0 subnet mask)

0 - https://www.calculator.net/ip-subnet-calculator.html?cclass=...


/24 means that the first 24 bits form the network prefix (the subnet mask has those 24 bits set), i.e. xxxxxxxx.xxxxxxxx.xxxxxxxx.NNN, where the x's represent the network and NNN represents the hosts on said network.

192.168.1.x/24 and 192.168.2.x/24 are two separate networks.
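
If you'd rather check this than take it on faith, Python's standard `ipaddress` module will do the math. A quick sketch using the networks mentioned in this thread:

```python
import ipaddress

net_a = ipaddress.ip_network("192.0.0.0/24")
net_b = ipaddress.ip_network("192.0.2.0/24")   # TEST-NET-1 from RFC 5737

print(net_a.overlaps(net_b))     # False: two separate networks
print(net_b.num_addresses)       # 256
print(net_b.netmask)             # 255.255.255.0

hosts = list(net_b.hosts())      # usable host addresses, without network/broadcast
print(hosts[0], hosts[-1], len(hosts))                  # 192.0.2.1 192.0.2.254 254
print(net_b.network_address, net_b.broadcast_address)   # 192.0.2.0 192.0.2.255

# A host address written with /24 belongs to the network named by its first 24 bits.
iface = ipaddress.ip_interface("192.1.254.19/24")
print(iface.network)             # 192.1.254.0/24
```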


On the other hand, there's something to be said for not using real identifiers in examples.


Hmm.. now I realise that 'live' in 'time to live' is 'live' as in 'life', not 'live' in the 'going live' sense ... It's the record's cache time! Engineering/techy acronyms are a bad place for homographs...


I've always thought of it as 'live' as in 'life', but I learned about TTL first in the context of IP packets where the meaning is probably less ambiguous.


I've had this discussion with multiple colleagues. I say it live as in life. However, most people I know say live as in "until the new record is live"


Is homograph a synonym of homophone or homonym?


technically neither;

homographs are two words written the same way but with different meaning.

homophones are two words that sound the same but are written differently.

homonyms are somewhat a combination: written and sound the same, but have different meaning.

since synonym means "having the same meaning" you might get away with considering a homograph a synonym of homonym, but it's best to be precise in these cases.


There are two types of useful documentation or training materials.

1. The detailed and accurate type which covers everything you need to know and takes time to work through and gather the parts you need to learn.

2. High-level gists which share only the basics you need to build intuition for the space. This type makes learning from the first type easier.

These comics are the second type. I ran through the questions for TLS and found them helpful. I had so many questions a few months ago, and these types of resources are so good for getting you to ask the right questions.


Once you do a (planned) migration of a few sites, you quickly learn to set the DNS TTL to something small beforehand. Alternatively (if possible), keep both IP blocks active for the TTL duration.


Many (most?) last hop DNS resolvers (ISP) set their own cache time, sadly.


Not to mention some IoT devices or cellular routers/devices might force a longer minimum cache TTL, to "minimize data usage". I worked on a device that did so (minimum pdnsd cache time of 25 hours). It ended up causing failures when our cloud server did a cut-over, even though the TTLs had been dropped 24 hours in advance. :(
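
Rough arithmetic for why that bites, as a sketch: model a client's effective TTL as the larger of the record's TTL and whatever minimum the client forces. The function names and the 24-hour "old TTL" are assumptions for illustration; only the 25-hour minimum and the 24-hour lead time come from the story above.

```python
def effective_ttl(record_ttl, client_min_cache=0):
    """How long a client actually keeps a record, in seconds."""
    return max(record_ttl, client_min_cache)


def worst_case_stale_seconds(old_ttl, new_ttl, lead_time, client_min_cache=0):
    """Longest a client may keep using the old address after the cut-over.

    lead_time is how long before the cut-over the TTL was lowered
    from old_ttl to new_ttl (all values in seconds).
    """
    # A record cached just before the TTL was lowered still carries the old TTL.
    left_from_old = max(0, effective_ttl(old_ttl, client_min_cache) - lead_time)
    # A record cached just before the cut-over carries the new, short TTL.
    left_from_new = effective_ttl(new_ttl, client_min_cache)
    return max(left_from_old, left_from_new)


DAY = 24 * 3600

# Well-behaved client: stale for at most the new 5-minute TTL.
print(worst_case_stale_seconds(DAY, 300, lead_time=DAY))                              # 300

# Client that forces a 25-hour minimum: lowering the TTL in advance doesn't help.
print(worst_case_stale_seconds(DAY, 300, lead_time=DAY, client_min_cache=25 * 3600))  # 90000
```

Under this model the only thing that covers such clients is keeping the old address alive for their forced minimum after the cut-over.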


I wonder how relevant "long" TTLs (5min, 30min, etc.) still are in an age of massive multi-gigabit fiber links? DNS was invented long ago, a very simple protocol with a handful of bytes per packet - a trickle amid the torrent. Waiting an hour or more for a 4-byte change (an IPv4 address) to be committed to a distributed database seems incredibly antiquated.


Argh, I hate this kind of question. It overtly implies that because we have more of something (computing resources, memory, networking bandwidth) we should fill the void with something "better", where "better" means different things to different people.

DNS doesn't really work this way anyway, you have a tree-like structure and if you're a good little resolver you don't just talk to your upstream DNS server, you walk the path from the root to the stem.

I.e., you don't just ask your ISP's resolver for server-a.www.google.com; you ask the DNS root (".") who owns "com", then you ask "com" who owns "google", and you keep following the NS records down until you're on the stem, then you ask for an A or a CNAME record. Then you cache that result _because it's expensive to do that lookup_, and often (ESPECIALLY on the internet) a webpage will ask you to download a dozen more things from the same domain, so a cache makes absolute sense.
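
To make the walk concrete, here is a toy sketch with no real DNS involved. The `SERVERS` table, the server names, and the 192.0.2.80 answer are all invented; the cache here also ignores TTLs to keep it short.

```python
# Each toy "server" either refers you to the servers for a child zone or holds the answer.
SERVERS = {
    "root":      {"refer": {"com.": "com-ns"}},
    "com-ns":    {"refer": {"google.com.": "google-ns"}},
    "google-ns": {"answer": {"www.google.com.": "192.0.2.80"}},
}


def resolve(name, cache, log):
    """Walk from the root down the referrals, recording every server asked in log."""
    if name in cache:
        return cache[name]           # cached: no walk at all
    server = "root"
    while True:
        log.append(server)
        data = SERVERS[server]
        if name in data.get("answer", {}):
            cache[name] = data["answer"][name]
            return cache[name]
        # Follow the referral for the longest zone suffix this server knows about.
        zones = [z for z in data.get("refer", {}) if name == z or name.endswith("." + z)]
        if not zones:
            raise LookupError(f"{server} knows nothing about {name}")
        server = data["refer"][max(zones, key=len)]


cache, log = {}, []
print(resolve("www.google.com.", cache, log))  # 192.0.2.80
print(log)                                     # ['root', 'com-ns', 'google-ns']
resolve("www.google.com.", cache, log)         # second lookup is served from the cache
print(len(log))                                # 3
```

Three round trips for the first lookup, zero for the second, which is the whole argument for caching.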

But, regardless of how expensive DNS is; you miss a _huge_ point about the tech that came before: it's foundational and fundamental to how things work, it's so "light" that we bake it into our products, and increasing the "cost" causes an exponential cost to end users because it's so widely adopted and essential.

This industry seems to want to eat itself, taking things that work decently well and noodling on them until something much more costly comes along. Why on earth would we take on DNS before tackling the much more necessary project of fixing email?


I don't know how suggesting tighter TTLs is indicative of "the industry eating itself" or that we can't fix email at the same time. Relax.

"com" and "." can obviously be cached a long time, they'll rarely change servers. No need to walk the whole tree every time - god forbid between multiple requests made by a single webpage like you're implying. Subdomains are typically only 1 or 2 names on top of the TLD.

I'm all for being efficient with computer resources, and like others said there's a latency component here. But seriously, DNS packets are tiny, typically 50 bytes or so.

Assuming, say, 80 bytes after UDP, IP, Ethernet framing etc. (just a ballpark), that's 640/10,000,000,000 bits in 1 second of time on a 10 Gbps fiber, or about 1 of 15.6 million DNS packets that fiber (of many fibers) could send every second.

At that scale, I don't see why TTLs of say 1 minute wouldn't be fine. Multiply everything above by 60 seconds and that's still an effective cache of a 50 byte response in computer time. Authoritative NS records can change less often, so intermediate servers can just check direct with the authoritative servers when records go stale. Say 100 bytes roundtrip once a minute.
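
Working the ballpark through, as a sketch: the 80-byte packet and the 10 Gbps link are the assumptions from this comment, and the variable names are just for illustration.

```python
LINK_BITS_PER_SECOND = 10_000_000_000   # 10 Gbps fiber
PACKET_BYTES = 80                        # ballpark on-the-wire size of one DNS packet
TTL_SECONDS = 60                         # the 1-minute TTL being argued for

packet_bits = PACKET_BYTES * 8
packets_per_second = LINK_BITS_PER_SECOND // packet_bits

# Share of the link used by refreshing one record once per TTL.
link_fraction = packet_bits / (LINK_BITS_PER_SECOND * TTL_SECONDS)

print(packet_bits)          # 640
print(packets_per_second)   # 15625000
print(f"{link_fraction:.1e}")
```

So one refresh per minute for a record is on the order of a billionth of that single link.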

I'm not talking about throwing away or redesigning DNS - just that tighter TTLs than typically recommended are probably fine and would make life a little easier with propagation delays.


Cloudflare's automatic TTL is 5 minutes. Web browsers will look up your records and try to connect to one of the returned IPs; if that one doesn't work, they'll try another one. The fact that the client does this for us means we can keep the TTL quite high (5 minutes) without 5 minutes of downtime if we lose connectivity to one IP.
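
The client-side behaviour can be sketched like this. `connect_first_working` and the fake connector are hypothetical helpers for illustration, not what any browser actually ships; Python's own `socket.create_connection` already tries each resolved address in turn when you hand it a hostname.

```python
import socket


def connect_first_working(addresses, port, connect=socket.create_connection, timeout=3.0):
    """Try each resolved address in order; return (address, connection) for the first success."""
    last_error = None
    for address in addresses:
        try:
            return address, connect((address, port), timeout)
        except OSError as exc:       # refused, unreachable, timed out, ...
            last_error = exc
    raise ConnectionError(f"none of {addresses} accepted a connection") from last_error


# Demo with a fake connector so the example runs without a network.
def fake_connect(sockaddr, timeout):
    if sockaddr[0] == "192.0.2.1":   # pretend the first address is down
        raise OSError("unreachable")
    return f"connected to {sockaddr[0]}"


used, conn = connect_first_working(["192.0.2.1", "192.0.2.2"], 443, connect=fake_connect)
print(used)   # 192.0.2.2
```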

This is also used to load balance: a single load balancer can't handle a trillion connections, so we round robin DNS for load balancing before the load balancers. There's also anycasting, which is something Cloudflare uses a lot. It means you're announcing the same IP block from many locations, and with how BGP works (which is the de facto protocol for exchanging route information on the internet) the "closer" you are, the more likely the route is to be chosen (AS path distance). Meaning sites behind Cloudflare are protected by thousands of servers listening on the same two addresses you see if you look up a Cloudflare-protected site, and behind these thousands of servers are even more thousands of servers.

This means if I were to launch a 10 Gbps DoS from my server, it would hit the Cloudflare datacenter closest to me, and there's nothing I can do about it, meaning the rest of the world is unaffected.

No, I'm not affiliated with CF. It's just an approachable example; I believe (but don't know) that Fastly has the same feature set (that I've mentioned above).


> But, regardless of how expensive DNS is; you miss a _huge_ point about the tech that came before: it's foundational and fundamental to how things work, it's so "light" that we bake it into our products, and increasing the "cost" causes an exponential cost to end users because it's so widely adopted and essential.

Not at all! It's a linear cost and if you set a 5 minute TTL for everything the impact to any particular end user would be almost nothing.


Which resolvers actually follow that path?


All resolvers either follow that path or ask someone else to follow that path. It’s the only way to get the answer.


It's arguably even MORE important in the age of fast links - because now a great percentage of the clock time is spent on handshakes and other "overhead" - so saving a round trip or three on a DNS lookup is comparatively larger than it used to be.


Caching this lookup helps to reduce the __latency__ at the client end. A local, often in RAM, DNS cache should honor the TTLs (for validity expiration) if the entries aren't purged out by other activity.


Assume you'll use Amazon Route53 as your DNS provider. You'll pay per lookup after some free tier, so doubling your TTL from 5min to 10min halves your cost [1]. Sure, it's ~$0.40 / million requests, but if you have a TTL of a few minutes and an often-accessed site, those requests add up every day.
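
A back-of-the-envelope sketch of that, assuming every resolver that knows about your site re-queries once per TTL. The 10,000 resolvers and the flat $0.40 per million (no free tier, no volume tiers) are simplifying assumptions, not Route53's actual billing model.

```python
PRICE_PER_MILLION_QUERIES = 0.40     # USD, the ballpark figure quoted above
SECONDS_PER_MONTH = 30 * 24 * 3600


def monthly_cost(ttl_seconds, resolvers=10_000):
    """Cost if each of `resolvers` caches re-queries once per TTL (hypothetical load)."""
    queries = resolvers * SECONDS_PER_MONTH / ttl_seconds
    return queries / 1_000_000 * PRICE_PER_MILLION_QUERIES


print(f"{monthly_cost(300):.2f}")   # 34.56  (5-minute TTL)
print(f"{monthly_cost(600):.2f}")   # 17.28  (10-minute TTL)
```

The query count scales with 1/TTL, so the halving holds regardless of the resolver count you plug in.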

[1] https://aws.amazon.com/route53/pricing/?nc1=h_ls


Not too bad! Covers the major things most forget when trying to understand this question.


In my experience, updating DNS is surprisingly fast, even though I typically set my TTLs to 3 hours. Typically I don't have to wait more than 2-5 minutes. Faster than e.g. updating an avatar on GitHub.


This problem is pretty much fixed in Firefox. The Cloudflare DoH server updates records very quickly. It also has the side effect of unblocking The Pirate Bay for me.



