I’ve always been more curious about how load balancers are supposed to be highly...

toast0 · on April 18, 2023

A classical load balancer runs in an HA hot-warm pair with IP takeover: when the secondary senses the primary has failed, it takes over the IP and begins serving. Depending on the type of load balancing and the software involved, this could be nearly seamless, or it could end all sessions in progress.
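
To make the takeover decision concrete, here's a minimal sketch of the secondary's side, assuming it judges the primary by heartbeat timeouts. The `StandbyMonitor` class, its method names, and the timeout value are all made up for illustration; real HA software adds things like fencing and gratuitous ARP on top of this.

```python
class StandbyMonitor:
    """Hypothetical decision logic for the warm node in a hot-warm pair."""

    def __init__(self, timeout_s: float, started_at: float) -> None:
        self.timeout_s = timeout_s
        self.last_heartbeat = started_at
        self.active = False  # True once this node has taken over the IP

    def record_heartbeat(self, now: float) -> None:
        # Called whenever a heartbeat from the primary arrives.
        self.last_heartbeat = now

    def tick(self, now: float) -> bool:
        # Take over once the primary has been silent for longer than the timeout.
        if not self.active and now - self.last_heartbeat > self.timeout_s:
            self.active = True  # a real system would claim the service IP here
        return self.active


monitor = StandbyMonitor(timeout_s=3.0, started_at=0.0)
monitor.record_heartbeat(now=1.0)
still_standby = monitor.tick(now=3.5)  # 2.5 s of silence: stay on standby
took_over = monitor.tick(now=4.5)      # 3.5 s of silence: take over
```

Whether in-progress sessions survive the takeover depends on whether the pair shares connection state, which this sketch leaves out.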

If you want to run hot-hot load balancing on a single IP, it's generally done with routing protocols. Equal cost multi-path (ECMP) will split traffic by hashing on some portion of the (source IP, dest IP, protocol, source port, dest port) 5-tuple; you'd configure your router to enable ECMP, and then your load balancers would advertise the IP via BGP or RIP or whatever is cool these days. Communication between load balancers to handle sessions that move during failover and bring-up is optional (if you don't do it, sessions will end abruptly); this setup is similar to anycast, although with anycast you may also see sessions move when external routing changes, and you really should manage that. You should also have a method to handle ICMP packets, most specifically needs-frag packets: they are sent from a different IP than the connection peer, so they will likely hash differently, and with anycast they may be routed differently too.
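
As a rough illustration of the ECMP idea, here's a sketch that hashes the 5-tuple and picks one of several equal-cost next hops. Real routers use their own vendor-specific hash functions and field selections, so the SHA-256 hash, the `pick_next_hop` name, and all the addresses below are assumptions for illustration only.

```python
import hashlib
from typing import List, NamedTuple


class FiveTuple(NamedTuple):
    src_ip: str
    dst_ip: str
    protocol: int
    src_port: int
    dst_port: int


def pick_next_hop(flow: FiveTuple, next_hops: List[str]) -> str:
    """Map a flow to one of the equal-cost next hops (illustrative hash only)."""
    key = "|".join(str(field) for field in flow).encode()
    digest = hashlib.sha256(key).digest()
    index = int.from_bytes(digest[:8], "big") % len(next_hops)
    return next_hops[index]


# Hypothetical load balancers all advertising the same service IP.
load_balancers = ["lb-a", "lb-b", "lb-c"]

# A TCP flow (protocol 6) from a client to the service IP.
tcp_flow = FiveTuple("198.51.100.7", "203.0.113.10", 6, 51512, 443)

# A needs-frag ICMP packet (protocol 1) about that flow comes from some
# intermediate router, not the client, and has no ports, so its hash input
# differs and it may land on a different load balancer.
icmp_needs_frag = FiveTuple("192.0.2.1", "203.0.113.10", 1, 0, 0)

tcp_target = pick_next_hop(tcp_flow, load_balancers)
icmp_target = pick_next_hop(icmp_needs_frag, load_balancers)
```

Every packet of the TCP flow hashes to the same load balancer as long as the set of next hops doesn't change. With a plain modulo like this, adding or removing a load balancer remaps many existing flows, which is where the session hand-off mentioned above comes in.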

You can use DNS to direct traffic to multiple load balancers, but DNS is not a precision instrument. It's useful for geographic balancing (in addition to anycast), but resolvers have a tendency to cache results for longer than published TTLs, and it takes significant effort to understand how much request traffic a given resolver will generate from one lookup. For balancing between two load balancers where you want roughly equal traffic, you also need to consider pathological behavior like RFC 3484 and RFC 6724. These two RFCs suggest preferentially using IPs with a larger common prefix when multiple options are available. This only makes sense when the common prefix is meaningful. If your ISP was assigned 10.1.2.0/24 and I have service IPs of 10.1.7.3 and 10.2.4.5 and return both of those as A records, your resolver shouldn't really prefer one or the other, because beyond your ISP prefix, there's no actual network closeness implied by a similar IP. 'Smart' resolvers that follow these RFCs can cause large scale traffic imbalances, so fun times there.
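
To see why the longest-common-prefix preference skews things, here's a small sketch that applies just that one tie-breaker to the example addresses. The client address 10.1.2.9 is a made-up host inside the 10.1.2.0/24 range, and the full address selection rules in those RFCs have several earlier steps that this skips.

```python
import ipaddress
from typing import List


def common_prefix_len(a: str, b: str) -> int:
    """Number of leading bits two IPv4 addresses share."""
    diff = int(ipaddress.IPv4Address(a)) ^ int(ipaddress.IPv4Address(b))
    return 32 - diff.bit_length()


def order_by_prefix(client_ip: str, candidates: List[str]) -> List[str]:
    """Sort candidates so the longest shared prefix with the client comes first."""
    return sorted(candidates, key=lambda ip: common_prefix_len(client_ip, ip), reverse=True)


client = "10.1.2.9"  # hypothetical host inside 10.1.2.0/24
service_ips = ["10.2.4.5", "10.1.7.3"]

shared_bits = {ip: common_prefix_len(client, ip) for ip in service_ips}
preferred = order_by_prefix(client, service_ips)[0]
```

Here 10.1.7.3 shares 21 leading bits with the client and 10.2.4.5 shares only 14, so every client in that /24 applying this rule picks 10.1.7.3, even though neither address is any closer in network terms.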

null0pointer · on April 19, 2023

Awesome response. You've answered a lot of questions I've been wondering about. Thanks!

samwho · on April 18, 2023

You've got it: DNS is what's often used to solve this problem. Do a few resolutions of reddit.com and you can see them returning 4 different IP addresses in a randomised order. :)

cedws · on April 18, 2023

CDNs such as Cloudflare also do anycast routing, which means you can hit the same IP in different places in the world and get a response from the nearest point-of-presence.

zokier · on April 18, 2023

Floating/Virtual IPs are one networking solution for HA. With Corosync/Pacemaker, a cluster of hosts can decide how the IPs are then assigned to physical hosts. CARP can probably do the same. I just googled and apparently there is also some IPVS/LVS project for Linux for HA load balancing?

philsnow · on April 19, 2023

This brought to mind the OpenBSD 3.5 release song https://www.openbsd.org/lyrics.html#35

  VRRP, philosophically,
  must ipso facto standard be
  But standard it
  needs to be free
  vis-à-vis
  the IETF
  you see?
  
  But can VRRP
  be said to be
  or not to be
  a standard, see,
  when VRRP can not be free,
  due to some Cisco patentry..