2. History: gethostbyname() and old good friends

Please do not use any of the code snippets from this chapter in your projects. They are provided solely for historical and educational purposes. Instead, you should use getaddrinfo().

The gethostbyname (man 3 gethostbyname) function first appeared in the 1980s and has been a part of the networking landscape ever since. Although it is obsolete, some programs still use it. POSIX.1-2001 marked it obsolescent over two decades ago, and POSIX.1-2008 removed it, because of its internal design limitations and limited functionality. However, for a long time, it was the preferred and standardized helper function for resolving a domain name into a list of IP addresses.

Several drawbacks made it obsolete:

  1. No IPv6 support. The standard gethostbyname() function is limited to IPv4. There is a nonstandard extension, gethostbyname2(), available in glibc and some other C libraries, which takes an address family and can therefore return IPv6 addresses:
/* Requires <arpa/inet.h>, <netdb.h>, <stdio.h>, <stdlib.h>.
 * hostname and address_family (AF_INET or AF_INET6) come from the caller. */
struct hostent *host_info = gethostbyname2(hostname, address_family);

if (host_info == NULL) {
    /* Resolver failures are reported through h_errno, not errno. */
    fprintf(stderr, "Error: Could not resolve hostname %s: %s\n",
            hostname, hstrerror(h_errno));
    exit(EXIT_FAILURE);
}

/* Each h_addr_list entry points to a raw in_addr or in6_addr,
 * matching h_addrtype. */
for (char **addr_list = host_info->h_addr_list; *addr_list != NULL; addr_list++) {
    char ip[INET6_ADDRSTRLEN];

    /* inet_ntop() takes a const void *, so no per-family cast is needed. */
    const char *result = inet_ntop(host_info->h_addrtype, *addr_list, ip, sizeof(ip));
    if (result == NULL) {
        perror("inet_ntop");
        exit(EXIT_FAILURE);
    }
    printf("  %s\n", ip);
}

In theory, you could use gethostbyname2(), but it cannot combine IPv4 and IPv6 results in a single call, does not sort them according to RFC 6724 (which we will discuss later), and inherits the same legacy internal design.
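For contrast, here is a minimal sketch of the getaddrinfo() call recommended at the top of this chapter. With AF_UNSPEC, it returns IPv4 and IPv6 results in one list and leaves the ordering to the resolver library. The helper name print_addresses() is hypothetical and exists only for this illustration.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L /* expose getaddrinfo() under strict -std modes */
#endif
#include <arpa/inet.h>
#include <netdb.h>
#include <netinet/in.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>

/* Hypothetical helper: prints every IPv4/IPv6 address found for `hostname`.
 * Returns the number of addresses printed, or -1 if resolution fails. */
int print_addresses(const char *hostname)
{
    struct addrinfo hints;
    memset(&hints, 0, sizeof(hints));
    hints.ai_family = AF_UNSPEC;     /* ask for IPv4 and IPv6 together */
    hints.ai_socktype = SOCK_STREAM; /* avoid one entry per socket type */

    struct addrinfo *results = NULL;
    int rc = getaddrinfo(hostname, NULL, &hints, &results);
    if (rc != 0) {
        fprintf(stderr, "getaddrinfo(%s): %s\n", hostname, gai_strerror(rc));
        return -1;
    }

    int count = 0;
    for (struct addrinfo *ai = results; ai != NULL; ai = ai->ai_next) {
        char ip[INET6_ADDRSTRLEN];
        const void *addr;

        if (ai->ai_family == AF_INET)
            addr = &((struct sockaddr_in *)ai->ai_addr)->sin_addr;
        else if (ai->ai_family == AF_INET6)
            addr = &((struct sockaddr_in6 *)ai->ai_addr)->sin6_addr;
        else
            continue; /* ignore families this sketch does not print */

        if (inet_ntop(ai->ai_family, addr, ip, sizeof(ip)) != NULL) {
            printf("  %s\n", ip);
            count++;
        }
    }

    freeaddrinfo(results); /* the caller owns the list; nothing is static */
    return count;
}
```

Calling print_addresses("example.org") from main() would print whatever addresses the local resolver returns for that name. Numeric strings such as "127.0.0.1" or "::1" are parsed without a DNS lookup.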

  2. Non-reentrant: gethostbyname() is not thread-safe. It returns a pointer to internal static data, so each call may overwrite the result of the previous one.
  3. Limited information: the hostent structure returned by gethostbyname() carries little beyond the IP addresses. It says nothing about the service or protocol, so callers end up concatenating strings and building extra data structures, which means more boilerplate before the subsequent socket API calls.
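The non-reentrancy problem is why legacy code usually copies what it needs out of the hostent before doing anything else. The sketch below illustrates that pattern; resolve_first_ipv4() is a hypothetical helper, not part of any library. Copying protects only against later calls in the same thread. It does not make gethostbyname() safe to call from several threads at once.

```c
#include <arpa/inet.h>
#include <netdb.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>

/* Hypothetical helper: resolves `name` and copies its first IPv4 address
 * into caller-owned storage. Returns 0 on success, -1 on failure. */
int resolve_first_ipv4(const char *name, struct in_addr *out)
{
    struct hostent *h = gethostbyname(name);

    if (h == NULL || h->h_addrtype != AF_INET ||
        h->h_length != (int)sizeof(*out) || h->h_addr_list[0] == NULL)
        return -1;

    /* Copy immediately: h points to static storage that the next
     * gethostbyname() call is allowed to overwrite. */
    memcpy(out, h->h_addr_list[0], sizeof(*out));
    return 0;
}
```

With this helper, resolving two names in a row and comparing them afterwards is safe because each result lives in its own struct in_addr. Keeping two raw struct hostent pointers instead would typically leave both pointing at the data from the second lookup.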

It’s worth mentioning one distinctive behavior of gethostbyname(): the order of the returned IP addresses can differ from call to call, which in effect gives you round-robin DNS. This makes it easy to implement a simple client-side load balancer across the returned A records.

Another UNIX libc function, getipnodebyname (man 3 getipnodebyname), has been removed from GNU/Linux but may still exist on other platforms. CPython 3.12 still calls it when built for such platforms. The following excerpt is from the file Modules/getaddrinfo.c:

#ifdef ENABLE_IPV6
    if (af == AF_UNSPEC) {
        hp = getipnodebyname(hostname, AF_INET6,
                        AI_ADDRCONFIG|AI_ALL|AI_V4MAPPED, &h_error);
    } else
        hp = getipnodebyname(hostname, af, AI_ADDRCONFIG, &h_error);
#else
    hp = gethostbyname(hostname);
    h_error = h_errno;
#endif