9. Dual-stack software examples

9.1 Nginx

Nginx treats a hostname that resolves to several addresses as a set of distinct servers rather than as multiple paths to the same host. From the upstream module documentation:

A domain name that resolves to several IP addresses defines multiple servers at once.

At startup, Nginx resolves all hostnames with its static resolver, which calls getaddrinfo() with the AF_UNSPEC address family and the AI_ADDRCONFIG flag.

ngx_int_t
ngx_inet_resolve_host(ngx_pool_t *pool, ngx_url_t *u)
{
    ...
    hints.ai_family = AF_UNSPEC;
    hints.ai_socktype = SOCK_STREAM;
#ifdef AI_ADDRCONFIG
    hints.ai_flags = AI_ADDRCONFIG;
#endif

    if (getaddrinfo((char *) host, NULL, &hints, &res) != 0) {
        u->err = "host not found";
        ngx_free(host);
        return NGX_ERROR;
    }
    ...
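
To get a feel for what such a call returns, here is a minimal Python sketch of the same lookup. It is not Nginx code: the function name resolve_servers and the injectable resolver parameter are my own assumptions, added so the logic can be tried without a real DNS server.

```python
import socket

# AI_ADDRCONFIG is not defined on every platform, so fall back to 0,
# mirroring the #ifdef in the C excerpt above.
AI_ADDRCONFIG = getattr(socket, "AI_ADDRCONFIG", 0)


def resolve_servers(host, port, resolver=socket.getaddrinfo):
    """Return one (family, address, port) entry per resolved address.

    `resolver` is a hypothetical hook that defaults to the real
    socket.getaddrinfo and can be swapped for a stub in experiments.
    """
    infos = resolver(
        host,
        port,
        family=socket.AF_UNSPEC,   # ask for both A and AAAA results
        type=socket.SOCK_STREAM,
        flags=AI_ADDRCONFIG,
    )
    servers = []
    for family, _type, _proto, _canonname, sockaddr in infos:
        name = "IPv6" if family == socket.AF_INET6 else "IPv4"
        servers.append((name, sockaddr[0], sockaddr[1]))
    return servers
```

Calling resolve_servers("localhost", 8080) uses the system resolver. Every returned entry corresponds to what Nginx would treat as a separate upstream server.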

If you have the paid Plus version of Nginx, or use the workaround of putting a variable in proxy_pass, Nginx can periodically refresh its in-memory resolution cache. To enable this, you first need to specify the global resolver:

resolver address ... [valid=time] [ipv4=on|off] [ipv6=on|off] [status_zone=zone];

Name resolution happens in ngx_resolver.c, which supports A, AAAA, CNAME, and SRV records (the last only in the paid version):

static ngx_int_t
ngx_resolve_name_locked(ngx_resolver_t *r, ngx_resolver_ctx_t *ctx,
    ngx_str_t *name)
{
    
   addrs = ngx_resolver_export(r, rn, 1);
   

The resolver rotates the addresses by a random offset, artificially implementing round-robin DNS:

static ngx_resolver_addr_t *
ngx_resolver_export(ngx_resolver_t *r, ngx_resolver_node_t *rn,
    ngx_uint_t rotate)
{
    
    d = rotate ? ngx_random() % n : 0;
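
The idea behind the random offset can be sketched in a few lines of Python. This is a simplified illustration, not a port of ngx_resolver_export(): the function name export_addresses and the rng parameter are assumptions made for the example.

```python
import random


def export_addresses(addrs, rotate=True, rng=random):
    """Return a copy of addrs rotated by a random offset.

    With rotate=False the order is left unchanged, as with d = 0 in the
    C excerpt. `rng` is a hypothetical hook for a seeded generator.
    """
    n = len(addrs)
    if n == 0:
        return []
    d = rng.randrange(n) if rotate else 0
    # The relative order is preserved; only the starting point moves.
    return addrs[d:] + addrs[:d]
```

Because only the starting point changes, consecutive resolutions hand out the same set of addresses with a different first entry, which is what classic round-robin DNS does.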
    

Because Nginx handles each address of a hostname as multiple servers, the returned IPv6 addresses are treated as completely independent servers rather than alternative ways to reach the same host. As a result, Nginx does not support the Happy Eyeballs algorithm or sorting by destination address selection rules.

The addresses in upstreams are selected in a round-robin fashion.

By default, requests are distributed between the servers using a weighted round-robin balancing method.
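
To my understanding, the weighted round-robin method quoted above is the "smooth" variant, which spreads picks of a heavy server evenly instead of emitting them in a burst. The sketch below is my own simplified Python model of that idea under this assumption, not Nginx code. Addresses that come from a resolved hostname all have the same weight, so the result degenerates into a plain cycle.

```python
def smooth_weighted_round_robin(weights, count):
    """Pick `count` peers from a {peer: weight} mapping.

    On every pick each peer's running score grows by its weight, the
    peer with the highest score wins, and the winner's score is reduced
    by the total weight.
    """
    current = {peer: 0 for peer in weights}
    total = sum(weights.values())
    picks = []
    for _ in range(count):
        for peer, weight in weights.items():
            current[peer] += weight
        # max() returns the first peer with the highest score.
        best = max(current, key=current.get)
        current[best] -= total
        picks.append(best)
    return picks
```

With weights {"a": 5, "b": 1, "c": 1} the heavy peer is interleaved with the light ones rather than chosen five times in a row.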

The round-robin peer is created from the resolver answer in the ngx_http_upstream_create_round_robin_peer() function:

ngx_int_t
ngx_http_upstream_create_round_robin_peer(ngx_http_request_t *r,
    ngx_http_upstream_resolved_t *ur)
{
   

Below is a playground for experimenting with a local Nginx and its dynamic resolver via a proxy_pass variable.

To control the resolver, we need to run our own DNS server. I suggest CoreDNS, as it’s simple and powerful enough for our needs:

$ wget https://github.com/coredns/coredns/releases/download/v1.11.1/coredns_1.11.1_linux_arm64.tgz
$ tar -zxf coredns_1.11.1_linux_arm64.tgz

Its config file:

$ cat Corefile

test.example {
        bind 127.0.0.153
        loadbalance round_robin
        file test.example.db
}

Zone file for test.example domain:

$ cat test.example.db

$ORIGIN test.example.
@       3600 IN SOA sns.dns.icann.org. noc.dns.icann.org. 2017042745 7200 3600 1209600 3600

site    IN      A       192.168.5.15
        IN      AAAA    ::1
        IN      AAAA    fec0::5055:55ff:fe8e:3d07

where:

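  • 192.168.5.15 – local container IPv4 address
  • fec0::5055:55ff:fe8e:3d07 – local container IPv6 site-local address (the fec0::/10 prefix, deprecated by RFC 3879; a unique local address (ULA) from fc00::/7 would serve the same purpose).

Nginx doesn’t support link-local IPv6 addresses for backends, so please don’t use them.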
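
Address scopes are easy to mix up, so it can help to check an address before putting it into the zone file. The following Python sketch uses the standard ipaddress module; the classify helper and its labels are assumptions of mine for illustration. Note that fec0::/10 is the deprecated site-local prefix, while unique local addresses live in fc00::/7.

```python
import ipaddress

# Unique local addresses (RFC 4193).
ULA_PREFIX = ipaddress.ip_network("fc00::/7")


def classify(address):
    """Return a rough scope label for an IPv4 or IPv6 address string."""
    ip = ipaddress.ip_address(address)
    if ip.is_loopback:
        return "loopback"
    if ip.is_link_local:
        return "link-local"
    if ip.version == 6 and ip.is_site_local:
        return "site-local (deprecated)"
    if ip.version == 6 and ip in ULA_PREFIX:
        return "unique local"
    if ip.is_private:
        return "private"
    return "global"
```

Anything that comes back as "link-local" should stay out of the backend records used in this playground.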

Run it:

$ sudo ./coredns

Nginx test config:

worker_processes auto;

events {
    worker_connections 1024;
}

pid /tmp/nginx.pid;

http {
    # Use a DNS resolver to resolve the backend host
    resolver 127.0.0.153 valid=5s;

    access_log /tmp/access.log;
    error_log /tmp/error.log debug;

    server {
        listen 80;
        server_name site.test.example;

        location / {
            set $backend_service "site.test.example:8080";
            # Proxy to the dynamically resolved backend service
            proxy_pass http://$backend_service;

            # Pass host headers
            proxy_set_header Host $host;
            proxy_set_header X-Real-IP $remote_addr;
            proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
            proxy_set_header X-Forwarded-Proto $scheme;

            # Handle timeouts and retries
            proxy_connect_timeout 10s;
            proxy_send_timeout 10s;
            proxy_read_timeout 10s;
            proxy_next_upstream error timeout invalid_header http_500 http_502 http_503 http_504;

            # Optional: Additional settings to improve proxy performance and security
            proxy_buffering on;
            proxy_buffer_size 16k;
            proxy_buffers 32 16k;
            proxy_busy_buffers_size 64k;
            proxy_max_temp_file_size 64k;

        }
    }

 
    default_type application/octet-stream;

    # Gzip settings
    gzip on;
    gzip_vary on;
    gzip_proxied any;
    gzip_comp_level 6;
    gzip_buffers 16 8k;
    gzip_http_version 1.1;
    gzip_types text/plain text/css application/json application/javascript text/xml application/xml application/xml+rss text/javascript;
}

Run a dummy backend with Python:

$ python3 -m http.server 8080 --bind ::

And make some requests in another terminal window:

$ while true; do curl 127.0.0.1; done

In the console running the Python server you should see output similar to the following, which shows the round-robin algorithm at work:

fec0::5055:55ff:fe8e:3d07 - - [01/Aug/2024 09:49:11] "GET / HTTP/1.0" 200 -
::ffff:192.168.5.15 - - [01/Aug/2024 09:49:11] "GET / HTTP/1.0" 200 -
::1 - - [01/Aug/2024 09:49:11] "GET / HTTP/1.0" 200 -
fec0::5055:55ff:fe8e:3d07 - - [01/Aug/2024 09:49:11] "GET / HTTP/1.0" 200 -
::1 - - [01/Aug/2024 09:49:11] "GET / HTTP/1.0" 200 -
fec0::5055:55ff:fe8e:3d07 - - [01/Aug/2024 09:49:11] "GET / HTTP/1.0" 200 -
::1 - - [01/Aug/2024 09:49:11] "GET / HTTP/1.0" 200 -
::1 - - [01/Aug/2024 09:49:11] "GET / HTTP/1.0" 200 -
::ffff:192.168.5.15 - - [01/Aug/2024 09:49:11] "GET / HTTP/1.0" 200 -
::1 - - [01/Aug/2024 09:49:11] "GET / HTTP/1.0" 200 -
::1 - - [01/Aug/2024 09:49:11] "GET / HTTP/1.0" 200 -
::ffff:192.168.5.15 - - [01/Aug/2024 09:49:11] "GET / HTTP/1.0" 200 -
::1 - - [01/Aug/2024 09:49:11] "GET / HTTP/1.0" 200 -

In the tcpdump output you can see the periodic refresh of addresses:

$ sudo tcpdump -s0 -i any -n -A host 127.0.0.153
09:53:16.002276 lo    In  IP 127.0.0.1.48587 > 127.0.0.153.53: 21998+ A? site.test.example. (35)

09:53:16.002288 lo    In  IP 127.0.0.1.48587 > 127.0.0.153.53: 19029+ AAAA? site.test.example. (35)

09:53:16.002394 lo    In  IP 127.0.0.153.53 > 127.0.0.1.48587: 19029*- 2/0/0 AAAA ::1, AAAA fec0::5055:55ff:fe8e:3d07 (125)

09:53:16.002422 lo    In  IP 127.0.0.153.53 > 127.0.0.1.48587: 21998*- 1/0/0 A 192.168.5.15 (68)

9.2 Envoy proxy

The default resolver in Envoy for GNU/Linux is c-ares. Envoy is built statically, so the simplest way to determine the version of c-ares is to check the Bazel repository_locations.bzl file.

We will review Envoy with its cluster discovery type (config.cluster.v3.Cluster.DiscoveryType) set to Logical DNS, because it is the general and most widely used case.

The best place to start learning about and configuring the stub resolver is the config.cluster.v3.Cluster.DnsLookupFamily option:

Enum config.cluster.v3.Cluster.DnsLookupFamily

When V4_ONLY is selected, the DNS resolver will only perform a lookup for addresses in the IPv4 family. If V6_ONLY is selected, the DNS resolver will only perform a lookup for addresses in the IPv6 family. If AUTO is specified, the DNS resolver will first perform a lookup for addresses in the IPv6 family and fallback to a lookup for addresses in the IPv4 family. This is semantically equivalent to a non-existent V6_PREFERRED option. AUTO is a legacy name that is more opaque than necessary and will be deprecated in favor of V6_PREFERRED in a future major version of the API. If V4_PREFERRED is specified, the DNS resolver will first perform a lookup for addresses in the IPv4 family and fallback to a lookup for addresses in the IPv6 family. i.e., the callback target will only get v6 addresses if there were NO v4 addresses to return. If ALL is specified, the DNS resolver will perform a lookup for both IPv4 and IPv6 families, and return all resolved addresses. When this is used, Happy Eyeballs will be enabled for upstream connections.

The translation into the C socket address family happens in cares/dns_impl.cc:

  switch (dns_lookup_family_) {
  case DnsLookupFamily::V4Only:
  case DnsLookupFamily::V4Preferred:
    family_ = AF_INET;
    break;
  case DnsLookupFamily::V6Only:
  case DnsLookupFamily::Auto:
    family_ = AF_INET6;
    break;
  case DnsLookupFamily::All:
    family_ = AF_UNSPEC;
    break;
  }

An important clarification regarding the DnsLookupFamily configuration is that the second, fallback resolver call occurs only if there is a resolver-related error or a ‘not found’ response. This means that if you have the default DnsLookupFamily setting (which prefers IPv6), and the resolver returns an AAAA record, but your IPv6 network routing is broken, there will be no retry with an A record lookup. The same behavior applies, with the families swapped, when DnsLookupFamily is set to V4_PREFERRED.

The retry code is in dns_impl.cc:

    // Perform a second lookup for DnsLookupFamily::Auto and DnsLookupFamily::V4Preferred, given
    // that the first lookup failed to return any addresses. Note that DnsLookupFamily::All issues
    // both lookups concurrently so there is no need to fire a second lookup here.
    if (dns_lookup_family_ == DnsLookupFamily::Auto) {
      family_ = AF_INET;
      startResolutionImpl(AF_INET);
    } else if (dns_lookup_family_ == DnsLookupFamily::V4Preferred) {
      family_ = AF_INET6;
      startResolutionImpl(AF_INET6);
    }
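
The consequence of this fallback rule is easier to see in a toy model. The Python sketch below is my own simplification of the behavior described above, not Envoy code: the function envoy_lookup and the records dictionary are hypothetical, and the order of the combined result for ALL is an arbitrary choice.

```python
def envoy_lookup(policy, records):
    """Model which addresses a DnsLookupFamily policy would return.

    `records` maps "A" and "AAAA" to lists of addresses. The second
    family is consulted only when the first lookup returned nothing.
    """
    a = list(records.get("A", []))
    aaaa = list(records.get("AAAA", []))
    if policy == "V4_ONLY":
        return a
    if policy == "V6_ONLY":
        return aaaa
    if policy == "AUTO":
        return aaaa or a      # IPv4 only if there was no AAAA answer
    if policy == "V4_PREFERRED":
        return a or aaaa      # IPv6 only if there was no A answer
    if policy == "ALL":
        return aaaa + a       # both families, resolved concurrently
    raise ValueError(f"unknown policy: {policy}")
```

For a dual-stack name, AUTO yields only the IPv6 addresses. Whether those addresses are actually reachable never enters the decision, which is exactly the failure mode described above.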

To help deal with such errors, you might consider the filter_unroutable_families option. But there is a catch.

filter_unroutable_families

(bool) The resolver will query available network interfaces and determine if there are no available interfaces for a given IP family. It will then filter these addresses from the results it presents. e.g., if there are no available IPv4 network interfaces, the resolver will not provide IPv4 addresses.

The catch is that it suffers from the same issue we identified earlier with the AI_ADDRCONFIG flag in getaddrinfo(): it considers any IPv6 address – except for the loopback address – as a cue to greenlight AAAA stub resolver queries. This means all link-local and unique local addresses are included and treated as votes for the IPv6 stack. Therefore, this option is really useful only for IPv6-only hosts with a completely disabled IPv4 stack, or for IPv4-only hosts with a completely disabled IPv6 stack.

The filtering happens in DnsResolverImpl::AddrInfoPendingResolution::onAresGetAddrInfoCallback:

  if (addrinfo != nullptr && addrinfo->nodes != nullptr) {
      bool can_process_v4 =
          (!parent_.filter_unroutable_families_ || available_interfaces_.v4_available_);
      bool can_process_v6 =
          (!parent_.filter_unroutable_families_ || available_interfaces_.v6_available_);

The v4_available_ and v6_available_ flags are set in dns_impl.cc:

  for (const auto& interface_address : interface_addresses) {
    if (!interface_address.interface_addr_->ip()) {
      continue;
    }


    if (Network::Utility::isLoopbackAddress(*interface_address.interface_addr_)) {
      continue;
    }


    switch (interface_address.interface_addr_->ip()->version()) {
    case Network::Address::IpVersion::v4:
      available_interfaces.v4_available_ = true;
      if (available_interfaces.v6_available_) {
        return available_interfaces;
      }
      break;
    case Network::Address::IpVersion::v6:
      available_interfaces.v6_available_ = true;
      if (available_interfaces.v4_available_) {
        return available_interfaces;
      }
      break;
    }
  }
  return available_interfaces;
}

where the only addresses skipped are the loopback ones, as defined in common/network/utility.cc:

bool Utility::isLoopbackAddress(const Address::Instance& address) {
  if (address.type() != Address::Type::Ip) {
    return false;
  }


  if (address.ip()->version() == Address::IpVersion::v4) {
    // Compare to the canonical v4 loopback address: 127.0.0.1.
    return address.ip()->ipv4()->address() == htonl(INADDR_LOOPBACK);
  } else if (address.ip()->version() == Address::IpVersion::v6) {
    static_assert(sizeof(absl::uint128) == sizeof(in6addr_loopback),
                  "sizeof(absl::uint128) != sizeof(in6addr_loopback)");
    absl::uint128 addr = address.ip()->ipv6()->address();
    return 0 == memcmp(&addr, &in6addr_loopback, sizeof(in6addr_loopback));
  }
  IS_ENVOY_BUG("unexpected address type");
  return false;
}
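
The effect of this check can be reproduced with a short Python sketch. It mirrors the loop and the loopback comparison shown above in simplified form. The function name available_families is an assumption, and interface addresses are passed in as plain strings instead of being read from the system.

```python
import ipaddress

# Like the C++ code above, compare only against the canonical loopbacks.
V4_LOOPBACK = ipaddress.ip_address("127.0.0.1")
V6_LOOPBACK = ipaddress.ip_address("::1")


def available_families(interface_addresses):
    """Return (v4_available, v6_available) for a list of address strings."""
    v4 = v6 = False
    for text in interface_addresses:
        ip = ipaddress.ip_address(text)
        if ip in (V4_LOOPBACK, V6_LOOPBACK):
            continue  # loopback addresses do not count
        if ip.version == 4:
            v4 = True
        else:
            v6 = True  # link-local and unique local addresses count too
        if v4 and v6:
            break
    return v4, v6
```

A host whose only IPv6 address is an automatically assigned link-local one, such as fe80::1, is still reported as IPv6-capable, so AAAA results are not filtered out.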

So the real solution for dual-stack upstreams, or for seamlessly migrating to IPv6-only backends, is to enable the Happy Eyeballs algorithm, which is done by setting DnsLookupFamily to ALL.

The code is in source/common/network/happy_eyeballs_connection_impl.cc.

One interesting observation is that Envoy violates Section 4 of the Happy Eyeballs RFC (RFC 8305) by not sorting addresses before initiating connections.

First, the client MUST sort the addresses received up to this point using Destination Address Selection ([RFC6724], Section 6).

But in source/extensions/network/dns_resolver/cares/dns_impl.cc we can see that c-ares is called with sorting disabled:

  /**
   * ARES_AI_NOSORT result addresses will not be sorted and no connections to resolved addresses
   * will be attempted
   */
  hints.ai_flags = ARES_AI_NOSORT;

The justification is better performance due to fewer system calls.

However, Envoy has its own settings to control the address ordering used for Happy Eyeballs:

first_address_family_version

(config.cluster.v3.UpstreamConnectionOptions.FirstAddressFamilyVersion) Specify the IP address family to attempt connection first in happy eyeballs algorithm according to RFC8305#section-4.

first_address_family_count

(UInt32Value) Specify the number of addresses of the first_address_family_version being attempted for connection before the other address family.
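
To illustrate how these two settings could shape the connection order, here is a minimal Python sketch of family interleaving in the spirit of RFC 8305, Section 4. It is an approximation written for this chapter, not Envoy's implementation: the function interleave and its parameters first_family and first_count are hypothetical stand-ins for the options above.

```python
import ipaddress


def interleave(addresses, first_family=6, first_count=1):
    """Order addresses so that `first_count` addresses of `first_family`
    are followed by one address of the other family, repeatedly."""
    first = [a for a in addresses
             if ipaddress.ip_address(a).version == first_family]
    other = [a for a in addresses
             if ipaddress.ip_address(a).version != first_family]
    ordered = []
    while first and other:
        ordered.extend(first[:first_count])
        del first[:first_count]
        ordered.append(other.pop(0))
    # Whatever is left belongs to a single family; keep its order.
    return ordered + first + other
```

With the defaults, IPv6 and IPv4 addresses simply alternate, starting with IPv6. Raising first_count gives the preferred family several attempts before the other family gets its turn.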

The full config to play with Envoy:

static_resources:

  listeners:
  - name: listener_0
    address:
      socket_address:
        address: "::" 
        port_value: 10000
        ipv4_compat: true
    filter_chains:
    - filters:
      - name: envoy.filters.network.http_connection_manager
        typed_config:
          "@type": type.googleapis.com/envoy.extensions.filters.network.http_connection_manager.v3.HttpConnectionManager
          stat_prefix: ingress_http
          access_log:
          - name: envoy.access_loggers.stdout
            typed_config:
              "@type": type.googleapis.com/envoy.extensions.access_loggers.stream.v3.StdoutAccessLog
              log_format:
                text_format: "%START_TIME% %REQ(X-ENVOY-ORIGINAL-PATH?:PATH)% %RESP(X-ENVOY-UPSTREAM-SERVICE-TIME)% %REQ(:METHOD)% %REQ(X-FORWARDED-FOR?:REMOTE_ADDRESS)% %RESPONSE_CODE% %RESPONSE_FLAGS% %BYTES_RECEIVED% %BYTES_SENT% %DURATION% %UPSTREAM_HOST%\n"
          http_filters:
          - name: envoy.filters.http.router
            typed_config:
              "@type": type.googleapis.com/envoy.extensions.filters.http.router.v3.Router
          route_config:
            name: local_route
            virtual_hosts:
            - name: local_service
              domains: ["*"]
              routes:
              - match:
                  prefix: "/"
                route:
                  host_rewrite_literal: www.envoyproxy.io
                  cluster: service_envoyproxy_io

  clusters:
  - name: service_envoyproxy_io
    type: LOGICAL_DNS
    # Change to play with other resolver rules 
    dns_lookup_family: AUTO
    typed_dns_resolver_config:
      name: envoy.network.dns_resolver.cares
      typed_config:
        "@type": type.googleapis.com/envoy.extensions.network.dns_resolver.cares.v3.CaresDnsResolverConfig
        resolvers:
          - socket_address:
              address: "8.8.8.8"
              port_value: 53
        filter_unroutable_families: true
        dns_resolver_options:
          no_default_search_domain: true
    load_assignment:
      cluster_name: service_envoyproxy_io
      endpoints:
      - lb_endpoints:
        - endpoint:
            address:
              socket_address:
                address: www.envoyproxy.io
                port_value: 443
    transport_socket:
      name: envoy.transport_sockets.tls
      typed_config:
        "@type": type.googleapis.com/envoy.extensions.transport_sockets.tls.v3.UpstreamTlsContext
        sni: www.envoyproxy.io

And run it with debug:

$ envoy -l debug -c ./envoy.yaml

9.3 HAProxy

HAProxy uses getaddrinfo() only for the initial name resolution when binding to listening addresses, similar to the approach we discussed in the dual-stack server section.

To connect to backend (upstream) servers, HAProxy employs its own resolver. It is possible to configure HAProxy with multiple name servers. If you set more than one, HAProxy will send queries to all of them in parallel:

When multiple name servers are configured in a resolvers section, then HAProxy uses the first valid response. In case of invalid responses, only the last one is treated. Purpose is to give the chance to a slow server to deliver a valid answer after a fast faulty or outdated server.

The code is under src/resolvers.c:

static int resolv_send_query(struct resolv_resolution *resolution)
{
	 

	list_for_each_entry(ns, &resolvers->nameservers, list) {
		if (dns_send_nameserver(ns, trash.area, len) >= 0)
			resolution->nb_queries++;
	}
      
}
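
The rule from the documentation quoted above (the first valid response wins, otherwise the last invalid one is processed) can be modeled as a small Python function. This is only an illustration of the selection rule, not HAProxy code. The name pick_response and the (nameserver, valid, answer) tuple layout are assumptions.

```python
def pick_response(responses):
    """Choose a response from (nameserver, valid, answer) tuples that are
    listed in arrival order.

    Returns (nameserver, answer) of the first valid response. If none is
    valid, returns the last invalid one, or None when nothing arrived.
    """
    last_invalid = None
    for nameserver, valid, answer in responses:
        if valid:
            return nameserver, answer
        last_invalid = (nameserver, answer)
    return last_invalid
```

A fast but faulty name server therefore cannot mask a slower one that answers correctly.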

You can also control the preference of address families:

resolve-prefer <family>

When DNS resolution is enabled for a server and multiple IP addresses from different families are returned, HAProxy will prefer using an IP address from the family mentioned in the “resolve-prefer” parameter. Available families: “ipv4” and “ipv6”

Default value: ipv6

According to the documentation, the parameter mentioned above is only a preference. In the event of a resolve timeout, error, or “not found” address, a retry will be issued with another resource record family type.
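
As a rough mental model, the preference can be pictured as the following Python sketch. It is a hypothetical illustration of "prefer one family, otherwise take what is available" and does not reflect how HAProxy implements the selection. The function pick_address is an assumption.

```python
import ipaddress


def pick_address(addresses, prefer="ipv6"):
    """Return the first address of the preferred family, or the first
    address of any family if the preferred one is absent."""
    wanted = 6 if prefer == "ipv6" else 4
    for address in addresses:
        if ipaddress.ip_address(address).version == wanted:
            return address
    return addresses[0] if addresses else None
```

Like the Envoy lookup policies, this preference looks only at the address family and knows nothing about whether the chosen address is reachable.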

HAProxy also does not follow the Default Address Selection for Internet Protocol Version 6 (IPv6) rules from RFC 6724. The logic is similar to what we discussed with the Java JDK, but without a system-wide value for the IPv6 preference property, which can make migrating from IPv4 to IPv6 challenging.

Another resolver-related setting controls how the initial resolution of backends occurs at startup. Interestingly, you can use getaddrinfo() for this purpose.

Additionally, there is no support for the Happy Eyeballs algorithm, which could make migrating to, or running, a dual-stack application more difficult.
