The fun part is that in some cases just listing the iptables rules with an iptables -L will cause it to load the conntrack module, and the default max for this is very low for anything that is a DNS server or performs a lot of DNS lookups. That's why it's a good idea to always set the nf_conntrack_max sysctl quite high even if you aren't using conntrack. The actual sysctl key differs depending on the kernel version; nowadays it's net.netfilter.nf_conntrack_max.
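For reference, a rough sketch of checking and raising it at runtime; the key only exists once the module is loaded, and the value below is an arbitrary example, not a recommendation:

    sysctl net.netfilter.nf_conntrack_max      # current ceiling
    sysctl net.netfilter.nf_conntrack_count    # entries currently tracked

    # raise the ceiling (example value; size it to your RAM and traffic)
    sysctl -w net.netfilter.nf_conntrack_max=1048576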
There is also a trap: setting this in /etc/sysctl.conf or /etc/sysctl.d doesn't always work, because the module isn't necessarily loaded yet when those settings are applied.
One fix is to load nf_conntrack at boot by adding it to the module load list:
https://bugs.launchpad.net/bugs/1922778 https://github.com/canonical/microk8s/issues/4462
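A minimal sketch of that fix, assuming a systemd distro where systemd-modules-load runs before systemd-sysctl (the file names here are arbitrary):

    # /etc/modules-load.d/conntrack.conf -- load the module at boot
    nf_conntrack

    # /etc/sysctl.d/90-conntrack.conf -- the key now exists when this is applied
    net.netfilter.nf_conntrack_max = 1048576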
On a related note, the sosreport tool, which collects the output of a zillion different commands for diagnostic purposes, goes to great lengths (and has CI tests) to ensure that no kernel modules are loaded by any of its plugins, for basically this same reason.
e.g. if the modules aren't already loaded, it will skip running iptables -L, among various other tricks: https://github.com/sosreport/sos/issues/1435 https://github.com/sosreport/sos/issues/2978
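A rough sketch of that kind of guard (not sosreport's actual implementation, just the same idea): only list rules when the relevant module is already present, so running the listing can't trigger a module autoload.

    # hypothetical guard: skip iptables -L unless the filter table
    # module is already loaded
    if grep -q '^iptable_filter ' /proc/modules; then
        iptables -L -n
    fi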
What a terrible website. It moves the viewport around and interferes with scrolling.
This sounds like bad advice; I don't know why ISC is pushing this... they would be better off trying to make DNS a TCP-only service to stop amplification attacks.
> This sounds like bad advice
Please elaborate.
As they say, a typical DNS request comes in as one packet and is answered with one packet; there is no ongoing connection, so there's no point in keeping tracking information.
The implication of not tracking the connection is that those packets have to match a more specific rule, rather than the "allow established,related" rule at the top of the firewall chain.
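A minimal sketch of what that looks like for an authoritative server; the rules are illustrative, not the article's exact ruleset:

    # raw table: skip conntrack for DNS in both directions
    iptables -t raw -A PREROUTING -p udp --dport 53 -j NOTRACK
    iptables -t raw -A OUTPUT     -p udp --sport 53 -j NOTRACK

    # filter table: untracked packets never match ESTABLISHED,RELATED,
    # so both the queries and our replies need explicit accepts
    iptables -A INPUT  -p udp --dport 53 -j ACCEPT
    iptables -A OUTPUT -p udp --sport 53 -j ACCEPT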
> they would be better off trying to make DNS a TCP-only service to stop amplification attacks.
Sure, let's get literally everyone on the internet to agree to a new version of DNS that uses TCP...
Even if you do that, the problem just moves from conntrack filling up to ephemeral ports filling up with sockets stuck in TIME_WAIT, because some genius thought a service that doesn't maintain a connection should use TCP.
> lets get literally everyone on the intenet to agree to a new version of DNS that uses TCP
That's already done. DNS servers already all speak both TCP and UDP. Try "dig google.com @8.8.8.8 +tcp".
TCP is less efficient for a request-response protocol. The root of the problem (DDoS with amplification) IMHO is not DNS but the ISPs which allow source addresses to be spoofed. Most don't. RFC2827 (BCP38) was published more than 20 years ago, and the problem wasn't new even back then. How do bad guys find ISPs (or hosting providers) that permit src IP spoofing? Is there a way to encourage such ISPs to follow BCP38?
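For what it's worth, on a Linux box doing the routing, strict reverse-path filtering is one partial, IPv4-only BCP38-style measure; a sketch, assuming routing is symmetric:

    # /etc/sysctl.d/50-rpfilter.conf
    # 1 = strict mode: drop packets whose source address isn't routable
    # back out the interface they arrived on
    net.ipv4.conf.all.rp_filter = 1
    net.ipv4.conf.default.rp_filter = 1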
You could de-peer/internet-death-penalty them, but, as is often the case, there is no alignment between the business objectives and the network operators' objectives.
If you want to stop UDP DNS from being able to amplify, require bigger query datagrams.
I would rather prefer responses to become smaller. If you check the TXT records for almost any big company you'll find a lot of verification records which are either unnecessary (because a better way to confirm domain ownership exists, e.g. adding a DNS record with a unique name instead of using the main domain's TXT record) or outdated (e.g. they verified multiple times but kept the records from all the attempts). And more generally, big companies tend to treat a domain's TXT record as an append-only structure and never clean out the junk it accumulates.
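Easy to check for yourself; example.com here is just a stand-in for any big company's apex domain:

    dig +short TXT example.com
    # typically a wall of site-verification tokens, SPF includes, and
    # long-forgotten one-off proofs, all returned with every TXT answer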
> never clean out the junk it accumulates
and that's not the worst example, unfortunately.
That's true of everything inside a corporate codebase. There's no reward for refactoring, only for adding new features or fixing a SEV1. But why should it be everyone else's problem that they can't clean it up?
I'm concerned that this is output generated by an LLM (specifically ChatGPT), as the writing style is eerily similar.
iptables conntrack is indeed a huge menace, but you should bypass conntrack entirely for local network connections as you don’t need it.
The only thing conntrack would give you for local requests is better logging, but YAGNI.
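A minimal sketch of that bypass, assuming 10.0.0.0/8 is the local network; note that untracked traffic then needs its own explicit accept rules instead of the established,related shortcut:

    # raw table: don't create conntrack entries for intra-LAN traffic
    iptables -t raw -A PREROUTING -s 10.0.0.0/8 -d 10.0.0.0/8 -j NOTRACK
    iptables -t raw -A OUTPUT     -d 10.0.0.0/8 -j NOTRACK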
Doesn't seem like LLM output to me. Rather, it seems like unnecessary text padding with pseudo-stories to please (possibly outdated) SEO strategies.
The original page is near-identical to the 2020 version https://web.archive.org/web/20200812150324/https://kb.isc.or... (the changes amount to some units going from e.g. M to MB and the console snippets going from being labeled "Plain text" to "Shell"), and it certainly wasn't written with a GPT-2-quality system.
I'm concerned that when enough time passes it'll be impossible to do that kind of back-checking, so "it seems like an LLM"-isms will become a self-confirming prophecy which can't reasonably be disproven. If one feels an article is bad, it's sufficient to talk about how the article itself is bad. Hypothesizing about how the bad article came to be written doesn't offer insight into what's wrong with the article; instead it introduces prejudices based on the perceived style of writing rather than the content.
> iptables conntrack is indeed a huge menace, but you should bypass conntrack entirely for local network connections as you don’t need it.
> The only thing conntrack would give you for local requests is better logging, but YAGNI.
Some people and places don't like the idea that an internal device should unilaterally be able to communicate with anything internal, and vice versa. Particularly for devices hosting external services. Many even go as far as hosting external DNS as a fully isolated service in a DMZ, with no internal access allowed by both its own and the connected firewalls' filters. Not everyone goes whole hog with it, but those that do will want not only logging of attempted connections but also multiple layers of security beyond "internal vs external conversation source/destination".
How old is that article? (not that it's bad but it feels a bit old)
It references very old Linux kernels, Slackware 10 (released in 2004), and old hardware with little RAM; it talks about iptables (which still exists and whose syntax is fine, but is now mostly an abstraction/compatibility layer on top of nftables); and there's no mention of IPv6 (if I'm not mistaken, on most stacks DNS queries are now made for both IPv4 and IPv6).
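If you're curious which variant a given box is running, the iptables version string usually says so; the exact output varies a bit by distro:

    $ iptables -V
    iptables v1.8.9 (nf_tables)    # nftables-backed compatibility layer
    # older/legacy builds print "(legacy)" instead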