4. getaddrinfo() from glibc
The POSIX standard describes only the interface and behavior of the getaddrinfo() function. The actual implementation varies between C libraries. In this chapter, we will examine the internals of the getaddrinfo() implementation in glibc version 2.39. In the GNU/Linux world, glibc remains the default C library for the overwhelming majority of systems.
4.1 Internals and design
Even though the main purpose of a stub resolver is to send DNS queries to a recursive server, the reality is more complex than that. The chart below illustrates the main steps that the glibc getaddrinfo() function performs each time you call it.
glibc getaddrinfo() internals with Name Service Switch (NSS).
① – The first step in the process might come as a surprise: getaddrinfo() attempts to connect to a Unix socket at a hardcoded path each time it is called. This socket belongs to the Name Service Cache Daemon (nscd), which works alongside the Name Service Switch. We will come back to it later in this chapter.
② – The next step involves reading the Name Service Switch configuration file, /etc/nsswitch.conf. This file sets the order of sources for various databases, not just domain name resolution. The line of interest for domain name resolution starts with the keyword “hosts”. The words that follow identify NSS modules, which are queried from left to right.
One important note about /etc/nsswitch.conf is that it includes special notation for error handling: for instance, whether to fail immediately, retry, or ignore the error and move to the next module.
In the example shown in the chart above, the getaddrinfo() call first checks the /etc/hosts file, then queries the DNS name server, and finally runs a custom module, which we are going to write.
③ – NSS modules are shared libraries that adhere to the NSS module interface and are named according to a template defined in man 5 nsswitch.conf.
④ – The final step is sorting destination addresses according to ten rules from RFC 6724, which we will discuss in more detail in the chapter on dual-stack implementation.
4.2 Name Service Switch (NSS)
Name Service Switch (NSS) is a framework that manages the sources from which various name service information is obtained and the order in which they are queried. The system configuration file for NSS is located at /etc/nsswitch.conf (man 5 nsswitch.conf):
“The Name Service Switch (NSS) configuration file, /etc/nsswitch.conf, is used by the GNU C Library and certain other applications to determine the sources from which to obtain name-service information in a range of categories, and in what order. Each category of information is identified by a database name.”
It’s possible to write your own NSS module. The module is a shared library that adheres to a defined API and must be named and placed according to the following format from man 5 nsswitch.conf:
/lib/libnss_custom.so.2 for a module named custom.
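To make the lookup order and the naming rule more concrete, here is a minimal sketch (not glibc's actual parser) that splits a hosts line into its modules, attaches any bracketed action items to the preceding module, and derives the expected library file name. The sample line and the names Source and parse_hosts_database_line are made up for illustration.

```rust
/// One source from the `hosts:` line: an NSS module plus optional action items.
#[derive(Debug, PartialEq)]
struct Source {
    module: String,
    library: String,
    actions: Vec<String>,
}

/// Parse a `hosts:` line from nsswitch.conf into an ordered list of sources.
/// Returns None when the line belongs to another database.
fn parse_hosts_database_line(line: &str) -> Option<Vec<Source>> {
    // Drop a trailing comment, then require the "hosts:" database keyword.
    let line = line.split('#').next().unwrap_or("").trim();
    let rest = line.strip_prefix("hosts:")?;

    // Give brackets their own tokens so "[NOTFOUND=return]" and
    // "[ NOTFOUND=return ]" are handled the same way.
    let spaced = rest.replace('[', " [ ").replace(']', " ] ");

    let mut sources: Vec<Source> = Vec::new();
    let mut in_brackets = false;
    for token in spaced.split_whitespace() {
        match token {
            "[" => in_brackets = true,
            "]" => in_brackets = false,
            action if in_brackets => {
                // Action items apply to the module that precedes them.
                if let Some(last) = sources.last_mut() {
                    last.actions.push(action.to_string());
                }
            }
            module => sources.push(Source {
                module: module.to_string(),
                // Naming template from man 5 nsswitch.conf.
                library: format!("libnss_{}.so.2", module),
                actions: Vec::new(),
            }),
        }
    }
    Some(sources)
}

fn main() {
    // Hypothetical configuration line, used only for illustration.
    let line = "hosts: files dns [NOTFOUND=return] custom";
    for source in parse_hosts_database_line(line).unwrap_or_default() {
        println!("{} -> {} {:?}", source.module, source.library, source.actions);
    }
}
```

The sketch only records the action items; acting on them (return, continue, and so on) is left to the real NSS machinery.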
Now we are ready to write our own NSS module for the hosts database. We will use a great Rust library, libnss-rs.
The step-by-step instructions:
Install Rust and gcc:
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
sudo apt-get install gcc
We are naming our module gildor 🧝♂:
Dependencies (Cargo.toml):
[lib]
name = "nss_gildor"
crate-type = [ "cdylib" ]
[dependencies]
libc = "0.2.0"
libnss = "0.8.0"
Code:
use libnss::host::{AddressFamily, Addresses, Host, HostHooks};
use libnss::interop::Response;
use libnss::libnss_host_hooks;
use std::net::{IpAddr, Ipv4Addr};

struct HardcodedHost;

// Generate the C entry points that glibc looks up for the "gildor" module.
libnss_host_hooks!(gildor, HardcodedHost);

impl HostHooks for HardcodedHost {
    // Enumerate every host this module knows about.
    fn get_all_entries() -> Response<Vec<Host>> {
        Response::Success(vec![Host {
            name: "host1.example".to_string(),
            addresses: Addresses::V4(vec![Ipv4Addr::new(192, 168, 1, 1)]),
            aliases: vec!["super-host1.example".to_string()],
        }])
    }

    // Reverse lookup: only 192.168.1.1 is known.
    fn get_host_by_addr(addr: IpAddr) -> Response<Host> {
        match addr {
            IpAddr::V4(addr) => {
                if addr.octets() == [192, 168, 1, 1] {
                    Response::Success(Host {
                        name: "host1.example".to_string(),
                        addresses: Addresses::V4(vec![Ipv4Addr::new(192, 168, 1, 1)]),
                        aliases: vec!["super-host1.example".to_string()],
                    })
                } else {
                    Response::NotFound
                }
            }
            _ => Response::NotFound,
        }
    }

    // Forward lookup: every IPv4 query for a *.example name gets 192.168.1.1.
    fn get_host_by_name(name: &str, family: AddressFamily) -> Response<Host> {
        if name.ends_with(".example") && family == AddressFamily::IPv4 {
            Response::Success(Host {
                name: name.to_string(),
                addresses: Addresses::V4(vec![Ipv4Addr::new(192, 168, 1, 1)]),
                aliases: vec!["host1.example".to_string(), "super-host1.example".to_string()],
            })
        } else {
            Response::NotFound
        }
    }
}
After building the library with cargo build --release, we also need an install script to put it in the correct place:
$ cat install.sh
#!/bin/sh
cd target/release
cp libnss_gildor.so libnss_gildor.so.2
sudo install -m 0644 libnss_gildor.so.2 /lib
sudo /sbin/ldconfig -n /lib /usr/lib
It’s time to enable our module in /etc/nsswitch.conf:
$ cat /etc/nsswitch.conf | grep hosts:
hosts: gildor dns files
And query NSS with the getent (man 1 getent) tool:
$ getent ahosts host1.example
192.168.1.1 STREAM host1.example
192.168.1.1 DGRAM
192.168.1.1 RAW
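The same lookup can be made from application code. As far as I know, on Unix-like targets the Rust standard library resolves host names through the C library's getaddrinfo(), so a glibc-linked program like the minimal sketch below should go through the same NSS chain as getent. The helper name resolve is my own.

```rust
use std::io;
use std::net::{IpAddr, ToSocketAddrs};

/// Resolve `host` with the system resolver and return the addresses it reports.
fn resolve(host: &str) -> io::Result<Vec<IpAddr>> {
    // Port 0 is a placeholder: only the addresses are of interest here.
    let addrs = (host, 0u16).to_socket_addrs()?;
    Ok(addrs.map(|sock| sock.ip()).collect())
}

fn main() {
    // With the gildor module enabled, this name should be answered by it.
    match resolve("host1.example") {
        Ok(ips) => println!("host1.example -> {:?}", ips),
        Err(err) => eprintln!("lookup failed: {}", err),
    }
}
```

Note that IP address literals are parsed directly by the standard library and never reach the resolver.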
4.3 Name Service Cache Daemon (nscd)
If we revisit our example from Chapter 3 and examine the output of strace more carefully, we can find the following lines:
$ sudo strace -T -tt ./getaddrinfo microsoft.com 2>&1 | grep nscd
21:03:57.515650 connect(3, {sa_family=AF_UNIX, sun_path="/var/run/nscd/socket"}, 110) = -1 ENOENT (No such file or directory) <0.000102>
21:03:57.516376 connect(3, {sa_family=AF_UNIX, sun_path="/var/run/nscd/socket"}, 110) = -1 ENOENT (No such file or directory) <0.000053>
21:03:57.521218 connect(3, {sa_family=AF_UNIX, sun_path="/var/run/nscd/socket"}, 110) = -1 ENOENT (No such file or directory) <0.000053>
21:03:57.521794 connect(3, {sa_family=AF_UNIX, sun_path="/var/run/nscd/socket"}, 110) = -1 ENOENT (No such file or directory) <0.000050>
There are multiple attempts to connect over the UNIX socket to the /var/run/nscd/socket path, which usually doesn’t exist by default. The output shows that each failed connect() took between 0.000050 and 0.000102 seconds, or about 0.00026 seconds for the four calls combined.
The UNIX socket at /var/run/nscd/socket belongs to the Name Service Cache Daemon (nscd), which ships with glibc and has a reputation for instability and frequent bugs. As a result, it is often avoided. However, there are alternatives, and if you cannot afford these extra system calls on every getaddrinfo() call, you might consider installing them or replacing the stub resolver. Fortunately, everything is open source.
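If you want to check the cost on your own machine without strace, the following minimal sketch times a single connection attempt to the same path. The helper timed_connect is hypothetical, and the measurement covers creating the socket as well as the connect() call, so expect numbers slightly higher than in the trace above.

```rust
use std::io;
use std::os::unix::net::UnixStream;
use std::time::{Duration, Instant};

/// Try to connect to a Unix socket and report the outcome and the time spent.
fn timed_connect(path: &str) -> (io::Result<()>, Duration) {
    let start = Instant::now();
    // The stream is dropped (closed) right away; only the outcome matters.
    let outcome = UnixStream::connect(path).map(|_stream| ());
    (outcome, start.elapsed())
}

fn main() {
    let (outcome, elapsed) = timed_connect("/var/run/nscd/socket");
    match outcome {
        Ok(()) => println!("nscd socket is present, connect took {:?}", elapsed),
        Err(err) => println!("connect failed ({}), took {:?}", err, elapsed),
    }
}
```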
4.4 Thread safety issues with getaddrinfo()
It’s worth noting that you should be cautious when calling getaddrinfo() in a multithreaded application, as getaddrinfo() invokes getenv(). The reason is documented in man 5 resolv.conf:
“The search keyword of a system’s resolv.conf file can be overridden on a per-process basis by setting the environment variable LOCALDOMAIN to a space-separated list of search domains.
The options keyword of a system’s resolv.conf file can be amended on a per-process basis by setting the environment variable RES_OPTIONS to a space-separated list of resolver options as explained above under options.”
So there are two environment variables: LOCALDOMAIN and RES_OPTIONS. This means that getaddrinfo() calls getenv() internally, and reading the environment is not safe while another thread is modifying it: setenv() is not thread-safe according to its own man page.
setenv(), unsetenv() │ Thread safety │ MT-Unsafe const:env
So if you have two threads in an application where one is resolving a domain name and the other is setting environment variables using setenv(), you might encounter a segmentation fault.
This is particularly relevant for Go applications, which are typically multithreaded by default due to the design of the language and its goroutines. An example of this issue can be seen in an open issue with the cgo stub resolver: https://github.com/golang/go/issues/63567.
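One defensive pattern, sketched below under my own assumptions, is to finish every environment change while the process is still single-threaded and only then start the threads that resolve names. The helper resolve_concurrently and the chosen RES_OPTIONS value are illustrative. The unsafe block is required in the Rust 2024 edition (older editions only warn about it), and Rust's internal environment lock does not protect getenv() calls made from C code such as getaddrinfo().

```rust
use std::net::ToSocketAddrs;
use std::thread;

/// Resolve each name on its own thread and report whether the lookup succeeded.
fn resolve_concurrently(names: &[&str]) -> Vec<(String, bool)> {
    let handles: Vec<_> = names
        .iter()
        .map(|name| {
            let name = name.to_string();
            thread::spawn(move || {
                // Host names go through the system resolver; port 0 is a placeholder.
                let ok = (name.as_str(), 0u16).to_socket_addrs().is_ok();
                (name, ok)
            })
        })
        .collect();
    // Join in spawn order so the results line up with the input.
    handles
        .into_iter()
        .map(|handle| handle.join().expect("resolver thread panicked"))
        .collect()
}

fn main() {
    // All environment changes happen here, before any other thread exists.
    unsafe {
        std::env::set_var("RES_OPTIONS", "timeout:1 attempts:1");
    }
    // From this point on, no thread touches the environment again.
    for (name, ok) in resolve_concurrently(&["localhost", "example.com"]) {
        println!("{}: {}", name, if ok { "resolved" } else { "failed" });
    }
}
```

This does not make setenv() safe. It only removes the window in which a lookup and an environment change can overlap.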
4.5 /etc/resolv.conf
Technically, /etc/resolv.conf is not part of glibc; it is a system configuration file read by most stub resolvers. However, I’ve chosen to discuss it in the glibc section because the components are so interconnected that it is hard to untangle their dependencies (see, for example, the glibc resolver library, man 3 resolver).
The man page for /etc/resolv.conf (man 5 resolv.conf) includes several important configuration parameters that could save you hours of debugging and troubleshooting:
nameserver – allows you to specify up to three name servers, which are queried in sequence. In contrast, the musl libc implementation queries all specified name servers in parallel (see the musl libc chapter below for details).
“The algorithm used is to try a name server, and if the query times out, try the next, until out of name servers, then repeat trying all the name servers until a maximum number of retries are made.”
search – a search list of domains to use for hostname lookup.
“Resolver queries having fewer than ndots dots (default is 1) in them will be attempted using each component of the search path in turn until a match is found.”
This means that if the name passed to getaddrinfo() contains fewer dots than the ndots value (explained further below), the resolver appends each domain from the search list in turn and tries to resolve those names first.
domain – the obsolete version of search.
options ndots:n – specifies the minimum number of dots a name must contain for the resolver to try it as an absolute name first, before any search domain is appended. The default is 1, meaning that any name containing at least one dot is tried as-is first, while a hostname with no dots goes through the search list first. For example:
$ cat /etc/resolv.conf | grep search
search mydomain
If we try now to resolve a hostname without a domain name:
$ getent ahosts resolve-me
In tcpdump you should see a number of queries:
$ tcpdump -s0 -i any -n -A dst port 53
08:15:42.289500 lo In IP 127.0.0.1.43396 > 127.0.0.53.53: 4492+ [1au] A? resolve-me.mydomain. (48)
08:15:42.289517 lo In IP 127.0.0.1.43396 > 127.0.0.53.53: 51586+ [1au] AAAA? resolve-me.mydomain. (48)
08:15:42.421518 lo In IP 127.0.0.1.57952 > 127.0.0.53.53: 48456+ [1au] A? resolve-me. (39)
08:15:42.421526 lo In IP 127.0.0.1.57952 > 127.0.0.53.53: 40777+ [1au] AAAA? resolve-me. (39)
The output shows that the first two requests for A and AAAA were sent with the search domain appended.
options edns0 – enables the DNS extensions described in RFC 2671 (since obsoleted by RFC 6891), which, among other things, let the resolver advertise support for UDP responses larger than 512 bytes.
options timeout:n – how long to wait for an answer from a nameserver before sending the query to the next one in order. The default is 5 seconds.
options attempts:n – how many times the resolver goes through its list of nameservers before giving up and returning an error. The default is 2.
options rotate – don’t start from the first nameserver every time; instead, apply a round-robin algorithm.
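The parameters above can be summarised in a small parser. This is a minimal sketch with my own type and function names, not the glibc code: it applies the defaults mentioned in the text, keeps at most three name servers, and ignores everything else (including the upper limits that glibc places on some values and the search list it derives from the host name when none is configured).

```rust
/// The resolv.conf settings discussed in this section, with their defaults.
#[derive(Debug, PartialEq)]
struct ResolvConf {
    nameservers: Vec<String>,
    search: Vec<String>,
    ndots: u32,
    timeout_secs: u32,
    attempts: u32,
    rotate: bool,
    edns0: bool,
}

const MAX_NAMESERVERS: usize = 3;

fn parse_resolv_conf(text: &str) -> ResolvConf {
    let mut conf = ResolvConf {
        nameservers: Vec::new(),
        search: Vec::new(),
        ndots: 1,
        timeout_secs: 5,
        attempts: 2,
        rotate: false,
        edns0: false,
    };
    for line in text.lines() {
        let line = line.trim();
        // Comment lines start with '#' or ';'.
        if line.starts_with('#') || line.starts_with(';') {
            continue;
        }
        let mut words = line.split_whitespace();
        match words.next() {
            Some("nameserver") => {
                if let Some(addr) = words.next() {
                    // Name servers beyond the third are ignored.
                    if conf.nameservers.len() < MAX_NAMESERVERS {
                        conf.nameservers.push(addr.to_string());
                    }
                }
            }
            // `domain` is the obsolete spelling; the last instance wins.
            Some("search") | Some("domain") => {
                conf.search = words.map(String::from).collect();
            }
            Some("options") => {
                for opt in words {
                    // Options are either flags ("rotate") or "name:value" pairs.
                    let mut parts = opt.splitn(2, ':');
                    let name = parts.next().unwrap_or("");
                    let value = parts.next().and_then(|v| v.parse::<u32>().ok());
                    match (name, value) {
                        ("ndots", Some(n)) => conf.ndots = n,
                        ("timeout", Some(n)) => conf.timeout_secs = n,
                        ("attempts", Some(n)) => conf.attempts = n,
                        ("rotate", None) => conf.rotate = true,
                        ("edns0", None) => conf.edns0 = true,
                        _ => {} // Unknown or malformed options are skipped.
                    }
                }
            }
            _ => {}
        }
    }
    conf
}

fn main() {
    // Hypothetical file contents, used only for illustration.
    let sample = "nameserver 127.0.0.53\nsearch mydomain\noptions edns0 timeout:2\n";
    println!("{:#?}", parse_resolv_conf(sample));
}
```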
As we wrap up this chapter, I’d like to highlight a potential resolver issue related to the ndots setting in Kubernetes environments, where /etc/resolv.conf might include multiple search domains and ndots may be set higher than the default of 1.
For example:
$ cat /etc/resolv.conf
search namespace.svc.cluster.local svc.cluster.local cluster.local
options ndots:5
First, it’s important to understand what a Fully Qualified Domain Name (FQDN) is. By convention, an FQDN ends with a trailing dot, meaning “example.com.” is an FQDN, whereas “example.com” is not. However, many applications, such as web browsers, implicitly assume the trailing dot, which makes the distinction subtle.
With the above resolv.conf settings, “example.com” contains only one dot, which is fewer than ndots, so attempting to resolve it generates six additional, unnecessary DNS queries (three for A records and three for AAAA records) for three FQDNs:
example.com.namespace.svc.cluster.local.
example.com.svc.cluster.local.
example.com.cluster.local.
These queries occur before resolving what we actually want: “example.com.”. This can add significant latency and increase DNS traffic. To mitigate this situation, there are at least two approaches:
- Lower the ndots option if you are confident that your application in a pod will not use internal DNS names:
apiVersion: v1
kind: Pod
spec:
  dnsConfig:
    options:
      - name: ndots
        value: "1"
- Make sure all domain names in your configs and databases end with a trailing dot, such as “example.com.cluster.local.”.
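The query order described in this section can be reproduced with a few lines of code. The function below is a simplified sketch of the search-list logic under my own naming, not a copy of the glibc implementation: it only lists candidate names in the order they would be tried, while a real resolver stops at the first successful answer and sends both A and AAAA queries for each candidate.

```rust
/// List the names a glibc-style stub resolver would try for `name`, in order.
fn candidate_names(name: &str, search: &[&str], ndots: usize) -> Vec<String> {
    // A trailing dot marks the name as fully qualified: the search list is skipped.
    if name.ends_with('.') {
        return vec![name.to_string()];
    }
    let dots = name.matches('.').count();
    let absolute = format!("{}.", name);
    let expanded = search.iter().map(|domain| format!("{}.{}.", name, domain));

    let mut candidates = Vec::new();
    if dots >= ndots {
        // Enough dots: try the name as-is first, then fall back to the search list.
        candidates.push(absolute);
        candidates.extend(expanded);
    } else {
        // Too few dots: walk the search list first, the name as-is comes last.
        candidates.extend(expanded);
        candidates.push(absolute);
    }
    candidates
}

fn main() {
    let search = ["namespace.svc.cluster.local", "svc.cluster.local", "cluster.local"];
    for ndots in [5, 1] {
        println!("ndots:{}", ndots);
        for candidate in candidate_names("example.com", &search, ndots) {
            println!("  {}", candidate);
        }
    }
}
```

With ndots:5 the name we actually want comes last, and with ndots:1 or a trailing dot it comes first, which is why both mitigations above work.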
4.6 /etc/hosts
The well-known /etc/hosts (man 5 hosts) is a plain text file used by operating systems to map hostnames to IP addresses.
The only interesting thing I can probably share is that the first hostname on a line is the canonical_hostname (which is included in getaddrinfo() output) and all others are aliases:
IP_address canonical_hostname [aliases...]
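As an illustration of that format, here is a minimal sketch that splits one line of /etc/hosts into its three parts. The names HostsEntry and parse_hosts_file_line are my own, and the sketch accepts only addresses that Rust's IpAddr parser understands, so entries such as IPv6 addresses with a zone suffix are skipped.

```rust
use std::net::IpAddr;

/// One parsed line of /etc/hosts.
#[derive(Debug, PartialEq)]
struct HostsEntry {
    address: IpAddr,
    canonical_hostname: String,
    aliases: Vec<String>,
}

/// Parse a single /etc/hosts line; returns None for blank, comment-only,
/// or malformed lines.
fn parse_hosts_file_line(line: &str) -> Option<HostsEntry> {
    // Everything after '#' is a comment.
    let line = line.split('#').next().unwrap_or("");
    let mut fields = line.split_whitespace();
    let address = fields.next()?.parse::<IpAddr>().ok()?;
    // The first name is canonical; whatever follows is an alias.
    let canonical_hostname = fields.next()?.to_string();
    let aliases = fields.map(String::from).collect();
    Some(HostsEntry { address, canonical_hostname, aliases })
}

fn main() {
    let line = "192.168.1.1 host1.example super-host1.example # demo entry";
    println!("{:?}", parse_hosts_file_line(line));
}
```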