4. `getaddrinfo()` from `glibc`

POSIX specifies only the behavior and interface of the getaddrinfo() function; the actual implementation varies between C libraries. In this chapter, we will examine the internals of the getaddrinfo() implementation in glibc version 2.39. In the GNU/Linux world, glibc remains the default C library for the overwhelming majority of systems.

4.1 Internals and design

Even though the main purpose of a stub resolver is to send DNS queries to a recursive server, the reality is more complex than that. The chart below illustrates the main steps that the glibc getaddrinfo() function performs each time you call it.

Figure 2. – glibc getaddrinfo() internals with Name Service Switch (NSS).

① – The first step in the process might come as a surprise: each time it is called, getaddrinfo() attempts to connect to a Unix socket at a hardcoded path. This socket belongs to the Name Service Switch system and its cache daemon. We will return to it later in this chapter.

② – The next step involves reading the Name Service Switch configuration file, /etc/nsswitch.conf. This file sets the order of sources for various services, not just domain resolution. The line of interest for domain name resolution starts with the keyword “hosts”. Subsequent words identify NSS modules, which are queried from left to right in the specified order.

One important note about /etc/nsswitch.conf is that it includes a special notation for error handling: a bracketed action such as [NOTFOUND=return] after a module name tells NSS whether to stop and return the result or to continue to the next module for a given lookup status.

In the example shown in the chart above, the getaddrinfo() call should first check the /etc/hosts file, then query the DNS name server, and finally run a custom module, which we are going to write.
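To make the left-to-right ordering concrete, here is a minimal sketch of how the hosts line could be split into modules and their bracketed actions. The `Source` struct and `parse_database` function are hypothetical names invented for this illustration; glibc's real parser is written in C and handles more cases.

```rust
/// One entry on a database line: a module name plus any bracketed
/// actions that follow it, such as `NOTFOUND=return`.
/// (Hypothetical type for illustration, not part of glibc.)
#[derive(Debug, PartialEq)]
struct Source {
    module: String,
    actions: Vec<String>,
}

/// Extract the sources of one database (for example "hosts") from the
/// text of an nsswitch.conf file, preserving their order.
fn parse_database(conf: &str, database: &str) -> Vec<Source> {
    let mut sources: Vec<Source> = Vec::new();
    for line in conf.lines() {
        // Drop comments and surrounding whitespace.
        let line = line.split('#').next().unwrap_or("").trim();
        let Some((name, rest)) = line.split_once(':') else {
            continue;
        };
        if name.trim() != database {
            continue;
        }
        // Pad brackets with spaces so "[NOTFOUND=return]" tokenizes cleanly.
        let spaced = rest.replace('[', " [ ").replace(']', " ] ");
        let mut in_brackets = false;
        for token in spaced.split_whitespace() {
            match token {
                "[" => in_brackets = true,
                "]" => in_brackets = false,
                _ if in_brackets => {
                    // Actions apply to the module that precedes them.
                    if let Some(last) = sources.last_mut() {
                        last.actions.push(token.to_string());
                    }
                }
                _ => sources.push(Source {
                    module: token.to_string(),
                    actions: Vec::new(),
                }),
            }
        }
    }
    sources
}

fn main() {
    let conf = "hosts: files mdns4_minimal [NOTFOUND=return] dns gildor\n";
    for source in parse_database(conf, "hosts") {
        println!("{} {:?}", source.module, source.actions);
    }
}
```

The returned vector keeps the order of the file, which is the order in which the modules are consulted.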

③ – NSS modules are shared libraries that adhere to the NSS Modules Interface and are named according to the template defined in man 5 nsswitch.conf.

④ – The final step is sorting destination addresses according to ten rules from RFC 6724, which we will discuss in more detail in the chapter on dual-stack implementation.

4.2 Name Service Switch (NSS)

Name Service Switch (NSS) is a framework that manages the sources from which various name service information is obtained and the order in which they are queried. The system configuration file for NSS is located at /etc/nsswitch.conf (man 5 nsswitch.conf).

The Name Service Switch (NSS) configuration file, /etc/nsswitch.conf, is used by the GNU C Library and certain other applications to determine the sources from which to obtain name-service information in a range of categories, and in what order. Each category of information is identified by a database name.

It’s possible to write your own NSS module. The module is a shared library that adheres to a defined API and should be named and placed according to the following format from the man 5 nsswitch.conf:

/lib/libnss_custom.so.2

for a module named custom.
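The naming rule is mechanical, so it can be sketched in a few lines. The helper names below are hypothetical. The symbol pattern `_nss_SERVICE_function` follows the convention described in the glibc manual for the NSS Modules Interface, and `gethostbyname2_r` is used as one example of a hosts lookup function.

```rust
/// File name glibc looks for, given a service name from nsswitch.conf.
/// (Hypothetical helper for illustration.)
fn nss_library_name(service: &str) -> String {
    format!("libnss_{service}.so.2")
}

/// Name of the symbol a module exports for one lookup function,
/// following the `_nss_SERVICE_function` convention.
fn nss_symbol_name(service: &str, function: &str) -> String {
    format!("_nss_{service}_{function}")
}

fn main() {
    println!("{}", nss_library_name("custom"));
    println!("{}", nss_symbol_name("custom", "gethostbyname2_r"));
}
```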

Now we are ready to write our own NSS module for the hosts database. We will use a great Rust library, libnss-rs.

The step-by-step instructions:

Install Rust and gcc:

curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
sudo apt-get install gcc

We are naming our module gildor 🧝‍♂:

Dependencies, declared in Cargo.toml:

[lib]
name = "nss_gildor"
crate-type = [ "cdylib" ]

[dependencies]
libc = "0.2.0"
libnss = "0.8.0"

Code:

use libnss::host::{AddressFamily, Addresses, Host, HostHooks};
use libnss::libnss_host_hooks;
use libnss::interop::Response;
use std::net::{IpAddr, Ipv4Addr};

struct HardcodedHost;
libnss_host_hooks!(gildor, HardcodedHost);

impl HostHooks for HardcodedHost {
    fn get_all_entries() -> Response<Vec<Host>> {
        Response::Success(vec![Host {
            name: "host1.example".to_string(),
            addresses: Addresses::V4(vec![Ipv4Addr::new(192, 168, 1, 1)]),
            aliases: vec!["super-host1.example".to_string()],
        }])
    }

    fn get_host_by_addr(addr: IpAddr) -> Response<Host> {
        match addr {
            IpAddr::V4(addr) => {
                if addr.octets() == [192, 168, 1, 1] {
                    Response::Success(Host {
                        name: "host1.example".to_string(),
                        addresses: Addresses::V4(vec![Ipv4Addr::new(192, 168, 1, 1)]),
                        aliases: vec!["super-host1.example".to_string()],
                    })
                } else {
                    Response::NotFound
                }
            }
            _ => Response::NotFound,
        }
    }

    fn get_host_by_name(name: &str, family: AddressFamily) -> Response<Host> {
        if name.ends_with(".example") && family == AddressFamily::IPv4 {
            Response::Success(Host {
                name: name.to_string(),
                addresses: Addresses::V4(vec![Ipv4Addr::new(192, 168, 1, 1)]),
                aliases: vec!["host1.example".to_string(), "super-host1.example".to_string()],
            })
        } else {
            Response::NotFound
        }
    }
}

We also need an install script to put our library in the correct place. It assumes the library has already been built with cargo build --release:

$ cat install.sh

#!/bin/sh

cd target/release
cp libnss_gildor.so libnss_gildor.so.2
sudo install -m 0644 libnss_gildor.so.2 /lib
sudo /sbin/ldconfig -n /lib /usr/lib

It’s time to enable our module in /etc/nsswitch.conf:

$ cat /etc/nsswitch.conf | grep hosts:
hosts:          gildor dns files

And query NSS with the getent tool (man 1 getent):

$ getent ahosts host1.example
192.168.1.1   STREAM host1.example
192.168.1.1   DGRAM
192.168.1.1   RAW
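The same lookup can be made from a program. The sketch below relies on the Rust standard library, whose `ToSocketAddrs` implementation, to my knowledge, calls getaddrinfo() on Linux with glibc, so the query should pass through the same NSS chain as getent. The `resolve` helper is a name invented for this example.

```rust
use std::io;
use std::net::{IpAddr, ToSocketAddrs};

/// Resolve a host name to its IP addresses through the system resolver.
fn resolve(host: &str) -> io::Result<Vec<IpAddr>> {
    // Port 0 is a placeholder; only the addresses matter here.
    let addrs = (host, 0u16).to_socket_addrs()?;
    Ok(addrs.map(|socket_addr| socket_addr.ip()).collect())
}

fn main() {
    match resolve("host1.example") {
        Ok(ips) => println!("{ips:?}"),
        Err(e) => eprintln!("lookup failed: {e}"),
    }
}
```

If the gildor module is installed and enabled, the call should return the hardcoded address; otherwise the lookup falls through to the remaining modules on the hosts line.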

4.3 Name service cache daemon `NSCD`

If we revisit our example from Chapter 3 and examine the output of strace more carefully, we can find the following lines:

$ sudo strace -T -tt ./getaddrinfo microsoft.com 2>&1 | grep nscd

21:03:57.515650 connect(3, {sa_family=AF_UNIX, sun_path="/var/run/nscd/socket"}, 110) = -1 ENOENT (No such file or directory) <0.000102>
21:03:57.516376 connect(3, {sa_family=AF_UNIX, sun_path="/var/run/nscd/socket"}, 110) = -1 ENOENT (No such file or directory) <0.000053>
21:03:57.521218 connect(3, {sa_family=AF_UNIX, sun_path="/var/run/nscd/socket"}, 110) = -1 ENOENT (No such file or directory) <0.000053>
21:03:57.521794 connect(3, {sa_family=AF_UNIX, sun_path="/var/run/nscd/socket"}, 110) = -1 ENOENT (No such file or directory) <0.000050>

There are multiple attempts to connect over a UNIX socket to the /var/run/nscd/socket path, which usually doesn’t exist by default. The output shows that each failed attempt took between 0.000050 and 0.000102 seconds, that is, roughly 50 to 100 microseconds.

The UNIX socket at /var/run/nscd/socket belongs to the Name Service Cache Daemon (nscd), which is part of glibc and is known for its instability and frequent bugs. As a result, it is often avoided. However, there are alternatives, and if you cannot afford to spend roughly 0.000050 seconds per attempt on every getaddrinfo() call, you might consider installing one of them or replacing the stub resolver. Fortunately, everything is open source.
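To see what such an attempt costs on your own machine without strace, a minimal sketch is to time the connect yourself. The `probe_socket` function is a hypothetical helper; the measured time depends on the system and will not match the strace numbers exactly.

```rust
use std::io;
use std::os::unix::net::UnixStream;
use std::time::{Duration, Instant};

/// Try to connect to a Unix socket and report the outcome together
/// with the time the attempt took.
fn probe_socket(path: &str) -> (io::Result<()>, Duration) {
    let start = Instant::now();
    // The stream is dropped (closed) immediately; only the result matters.
    let result = UnixStream::connect(path).map(|_stream| ());
    (result, start.elapsed())
}

fn main() {
    let (result, elapsed) = probe_socket("/var/run/nscd/socket");
    match result {
        Ok(()) => println!("nscd socket is present ({elapsed:?})"),
        Err(e) => println!("connect failed: {e} ({elapsed:?})"),
    }
}
```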

4.4 Thread safety issues with `getaddrinfo()`

It’s worth noting that you should be cautious when calling getaddrinfo() in a multithreaded application, because getaddrinfo() invokes getenv(). The reason is described in man 5 resolv.conf:

The search keyword of a system’s resolv.conf file can be overridden on a per-process basis by setting the environment variable LOCALDOMAIN to a space-separated list of search domains.
The options keyword of a system’s resolv.conf file can be amended on a per-process basis by setting the environment variable RES_OPTIONS to a space-separated list of resolver options as explained above under options.

So there are two environment variables: LOCALDOMAIN and RES_OPTIONS. To honor them, getaddrinfo() calls getenv() internally, and reading the environment is not safe while another thread modifies it: setenv() is marked as not thread safe in its own man page.

setenv(), unsetenv()    │ Thread safety │ MT-Unsafe const:env

So if you have two threads in an application where one is resolving a domain name and the other is setting environment variables using setenv(), you might encounter a segmentation fault.

This is particularly relevant for Go applications, which are typically multithreaded by default due to the design of the language and its goroutines. An example of this issue can be seen in an open issue with the cgo stub resolver: https://github.com/golang/go/issues/63567.
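One possible way to avoid the race inside your own code is to funnel both environment changes and name resolution through a single process-wide lock. The sketch below is only an illustration of that idea, with hypothetical names (`RESOLVER_ENV_LOCK`, `with_env_lock`, `resolve_locked`). It protects nothing that bypasses the lock, such as a third-party library calling setenv() directly.

```rust
use std::io;
use std::net::{SocketAddr, ToSocketAddrs};
use std::sync::{Mutex, MutexGuard};

// Hypothetical process-wide lock shared by environment writers and
// by every call that ends up in getaddrinfo().
static RESOLVER_ENV_LOCK: Mutex<()> = Mutex::new(());

fn lock_env() -> MutexGuard<'static, ()> {
    // A poisoned lock still provides mutual exclusion, so recover the guard.
    RESOLVER_ENV_LOCK
        .lock()
        .unwrap_or_else(|poisoned| poisoned.into_inner())
}

/// Run code that modifies the environment while holding the lock.
fn with_env_lock<T>(f: impl FnOnce() -> T) -> T {
    let _guard = lock_env();
    f()
}

/// Resolve a name while holding the same lock, so the lookup never
/// overlaps with an environment change made through `with_env_lock`.
fn resolve_locked(host: &str) -> io::Result<Vec<SocketAddr>> {
    let _guard = lock_env();
    Ok((host, 0u16).to_socket_addrs()?.collect())
}

fn main() {
    let writer = std::thread::spawn(|| {
        with_env_lock(|| {
            // An environment change (for example setting RES_OPTIONS)
            // would go here.
        })
    });
    let reader = std::thread::spawn(|| resolve_locked("127.0.0.1"));
    writer.join().unwrap();
    println!("{:?}", reader.join().unwrap());
}
```

The trade-off is that lookups are serialized, which may be unacceptable for a busy service; setting the environment once at startup, before any threads exist, sidesteps the problem without a lock.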

4.5 `/etc/resolv.conf`

Technically, /etc/resolv.conf is not part of glibc; it is a system configuration file used by all known resolvers. However, I’ve chosen to discuss it in the glibc section because it is often challenging to unwind the dependencies and relationships due to the interconnections and interoperability between components (for example, glibc resolver library man 3 resolver).

The man page for /etc/resolv.conf (man 5 resolv.conf) includes several important configuration parameters that could save you hours of debugging and troubleshooting:

nameserver – specifies a name server to query. Up to three can be listed, and glibc queries them in sequence. In contrast, the musl libc implementation queries all listed name servers in parallel (see the musl libc chapter below for details). From the man page:
The algorithm used is to try a name server, and if the query times out, try the next, until out of name servers, then repeat trying all the name servers until a maximum number of retries are made.
search – a list of domains to try when looking up a hostname. From the man page:
Resolver queries having fewer than ndots dots (default is 1) in them will be attempted using each component of the search path in turn until a match is found.
In other words, if the name passed to getaddrinfo() contains fewer than ndots dots (explained further below), the resolver appends each search domain from the list in turn and tries those names first.
domain – the obsolete predecessor of search.
options ndots:n – the minimum number of dots a name must contain for the resolver to try it as-is before going through the search list. The default is 1, so only names with no dots at all have a search domain appended first. For example:
```
$ cat /etc/resolv.conf | grep search
search mydomain
```

If we try now to resolve a hostname without a domain name:

$ getent ahosts resolve-me

In tcpdump you should see a number of queries:

$ tcpdump -s0 -i any -n -A dst port 53

08:15:42.289500 lo    In  IP 127.0.0.1.43396 > 127.0.0.53.53: 4492+ [1au] A? resolve-me.mydomain. (48)

08:15:42.289517 lo    In  IP 127.0.0.1.43396 > 127.0.0.53.53: 51586+ [1au] AAAA? resolve-me.mydomain. (48)

08:15:42.421518 lo    In  IP 127.0.0.1.57952 > 127.0.0.53.53: 48456+ [1au] A? resolve-me. (39)

08:15:42.421526 lo    In  IP 127.0.0.1.57952 > 127.0.0.53.53: 40777+ [1au] AAAA? resolve-me. (39)

The output shows that the first two requests for A and AAAA were sent with the search domain appended.
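The order of these queries can be sketched as a small function. This is a simplified model of the behavior described in man 5 resolv.conf, not glibc's actual code, and `search_candidates` is a hypothetical name. The real resolver stops at the first candidate that resolves and sends both an A and an AAAA query for each one.

```rust
/// Return the names the resolver would try, in order, for `name`
/// given a search list and an ndots value. (Simplified model.)
fn search_candidates(name: &str, search: &[&str], ndots: usize) -> Vec<String> {
    // A trailing dot marks the name as absolute: no search list is applied.
    if name.ends_with('.') {
        return vec![name.to_string()];
    }
    let as_is = format!("{name}.");
    let expanded = search.iter().map(|domain| format!("{name}.{domain}."));
    let dots = name.matches('.').count();
    if dots >= ndots {
        // Enough dots: try the name itself first, then the search list.
        std::iter::once(as_is).chain(expanded).collect()
    } else {
        // Too few dots: walk the search list first, then the name itself.
        expanded.chain(std::iter::once(as_is)).collect()
    }
}

fn main() {
    for candidate in search_candidates("resolve-me", &["mydomain"], 1) {
        println!("{candidate}");
    }
}
```

With the resolv.conf from the example, the function yields resolve-me.mydomain. followed by resolve-me., matching the order of the queries in the tcpdump output.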

options edns0 – enables the DNS extensions described in RFC 2671 (since obsoleted by RFC 6891), which, among other things, allow UDP responses larger than 512 bytes.
options timeout:n – how long to wait for a response from a name server before retrying the query via the next one. The default is 5 seconds.
options attempts:n – how many times the resolver sends a query to its name servers before giving up and returning an error. The default is 2.
options rotate – don’t start from the first nameserver every time, instead apply a round-robin algorithm.
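To summarize the parameters and their defaults, here is a minimal sketch of a resolv.conf parser. The `ResolvConf` struct and `parse_resolv_conf` function are hypothetical names for this illustration; the sketch covers only the keywords discussed above and ignores everything else.

```rust
/// The subset of resolv.conf settings discussed in this chapter.
/// (Hypothetical type for illustration.)
#[derive(Debug, PartialEq)]
struct ResolvConf {
    nameservers: Vec<String>,
    search: Vec<String>,
    ndots: u32,
    timeout_secs: u32,
    attempts: u32,
    rotate: bool,
    edns0: bool,
}

fn parse_resolv_conf(text: &str) -> ResolvConf {
    // Start from the documented defaults: ndots 1, timeout 5 s, attempts 2.
    let mut conf = ResolvConf {
        nameservers: Vec::new(),
        search: Vec::new(),
        ndots: 1,
        timeout_secs: 5,
        attempts: 2,
        rotate: false,
        edns0: false,
    };
    for line in text.lines() {
        let line = line.trim();
        // Comment lines start with '#' or ';'.
        if line.starts_with('#') || line.starts_with(';') {
            continue;
        }
        let mut words = line.split_whitespace();
        match words.next() {
            Some("nameserver") => {
                // Only the first three name servers are used.
                if let Some(addr) = words.next() {
                    if conf.nameservers.len() < 3 {
                        conf.nameservers.push(addr.to_string());
                    }
                }
            }
            // A later search line replaces an earlier one.
            Some("search") => conf.search = words.map(String::from).collect(),
            Some("options") => {
                for opt in words {
                    match opt.split_once(':') {
                        Some(("ndots", n)) => conf.ndots = n.parse().unwrap_or(conf.ndots),
                        Some(("timeout", n)) => {
                            conf.timeout_secs = n.parse().unwrap_or(conf.timeout_secs)
                        }
                        Some(("attempts", n)) => {
                            conf.attempts = n.parse().unwrap_or(conf.attempts)
                        }
                        None if opt == "rotate" => conf.rotate = true,
                        None if opt == "edns0" => conf.edns0 = true,
                        _ => {} // Options not covered by this sketch.
                    }
                }
            }
            _ => {} // Keywords not covered by this sketch.
        }
    }
    conf
}

fn main() {
    let text = "nameserver 127.0.0.53\nsearch mydomain\noptions edns0 timeout:2\n";
    println!("{:#?}", parse_resolv_conf(text));
}
```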

As we wrap up this chapter, I’d like to highlight a potential resolver issue related to the ndots setting in Kubernetes environments, where /etc/resolv.conf might include multiple search domains, and ndots may be set higher than the default one dot.

For example:

$ cat /etc/resolv.conf

search namespace.svc.cluster.local svc.cluster.local cluster.local 
options ndots:5

First, it’s important to understand what a Fully Qualified Domain Name (FQDN) is. By convention, an FQDN ends with a trailing dot, meaning “example.com.” is an FQDN, whereas “example.com” is not. However, many applications, such as web browsers, implicitly assume the trailing dot, which makes the distinction subtle.

With the above resolv.conf settings, attempting to resolve “example.com” will generate six additional, unnecessary DNS queries (three for A records and three for AAAA records) for three FQDNs:

example.com.namespace.svc.cluster.local.
example.com.svc.cluster.local.
example.com.cluster.local.

These queries occur before resolving what we actually want: “example.com.”. This can add significant latency and increase DNS traffic. To mitigate this situation, there are at least two approaches:

Modify the ndots option if you are confident that your application in a pod will not use internal DNS names:

apiVersion: v1
kind: Pod
spec:
  dnsConfig:
    options:
      - name: ndots
        value: "1"

Ensure that all domain names in your configs and databases end with a trailing dot, such as “example.com.”, so that the resolver treats them as fully qualified and skips the search list.
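If the names come from user input or a database you do not control, the trailing dot can also be added in code just before the lookup. A minimal sketch, with `absolute_name` as a hypothetical helper name:

```rust
/// Return `name` with a trailing dot, so the resolver treats it as
/// absolute and does not apply the search list.
fn absolute_name(name: &str) -> String {
    if name.ends_with('.') {
        name.to_string()
    } else {
        format!("{name}.")
    }
}

fn main() {
    println!("{}", absolute_name("example.com"));
}
```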

4.6 `/etc/hosts`

The well-known /etc/hosts (man 5 hosts) is a plain text file used by operating systems to map hostnames to IP addresses.

The one detail worth noting is that the first hostname on a line is the canonical_hostname (which is included in the getaddrinfo() output), and all the others are aliases:

IP_address canonical_hostname [aliases...]
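The line format above maps directly onto a small parser. The sketch below uses hypothetical names (`HostsEntry`, `parse_hosts_line`) and shows how the first name ends up as the canonical hostname while the rest become aliases.

```rust
use std::net::IpAddr;

/// One parsed line of /etc/hosts. (Hypothetical type for illustration.)
#[derive(Debug, PartialEq)]
struct HostsEntry {
    address: IpAddr,
    canonical: String,
    aliases: Vec<String>,
}

/// Parse a single line; returns None for blank lines, comments, and
/// lines that lack a valid address or a hostname.
fn parse_hosts_line(line: &str) -> Option<HostsEntry> {
    // Everything after '#' is a comment.
    let line = line.split('#').next()?;
    let mut fields = line.split_whitespace();
    let address = fields.next()?.parse().ok()?;
    // The first name is the canonical hostname, the rest are aliases.
    let canonical = fields.next()?.to_string();
    let aliases = fields.map(String::from).collect();
    Some(HostsEntry {
        address,
        canonical,
        aliases,
    })
}

fn main() {
    let entry = parse_hosts_line("192.168.1.1 host1.example super-host1.example");
    println!("{entry:?}");
}
```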
