I'm learning socket programming and am confused by what I feel is inconsistent use of htons()
and family of functions in my learning material. I'm currently readi
Why is ntohs() used on adr_inet.sin_port in the first instance, but htons() in the second?
The first is a mistake, but in practice works anyway.
Nowadays practically all machines use 8-bit bytes and either consistent big-endian or consistent little-endian formats. On the former both hton[sl] and ntoh[sl] are no-ops; on the latter both reverse the byte order, and thus actually do the same thing even though their intended semantics are different. Thus using the wrong one still works on all systems you're likely to run a program on.
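If you want to convince yourself of this on your own machine, here is a minimal sketch (count_mismatches is a name I made up for illustration, not part of any API) that compares htons() and ntohs() over every 16-bit value:

```c
#include <arpa/inet.h>
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: counts the 16-bit values for which htons() and
   ntohs() return different results on this host. */
static unsigned count_mismatches(void)
{
    unsigned mismatches = 0;

    for (uint32_t v = 0; v <= 0xFFFF; v++) {
        uint16_t x = (uint16_t)v;

        if (htons(x) != ntohs(x))
            mismatches++;

        /* Converting to network order and back must restore the value. */
        assert(ntohs(htons(x)) == x);
    }
    return mismatches;
}
```

On an ordinary big-endian or little-endian host I would expect this to return 0. It is still worth using the semantically correct function, since that documents which direction the conversion goes.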
Back when the socket API was designed, this wasn't always the case; for example, the then-popular PDP-11 somewhat infamously used 'middle-endian' (!), aka 'NUXI', order for 32-bit values.
Why is neither ntohs() nor htons() used on adr_inet.sin_family?
Again, in ancient times the Internet Protocol stack was only one of several (up to a dozen or so) competing network technologies. The family field distinguishes different types of sockaddr_* structures for these different protocols, which did not all follow the Internet 'rule' for big-endian, at least not consistently. As there was no universal network representation for family, they just left it in host order -- which is usually more convenient for host software.
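Putting the two rules together, a minimal sketch of filling the structure might look like the following. The helper name fill_inet_addr and its parameters are my own invention for illustration; only the struct fields and library calls are standard:

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>

/* Hypothetical helper: fills an IPv4 socket address from a dotted-quad
   string and a port given in host byte order.
   Returns 0 on success, -1 if the address text is not valid IPv4. */
static int fill_inet_addr(struct sockaddr_in *adr, const char *ip, uint16_t port)
{
    assert(adr != NULL && ip != NULL);

    memset(adr, 0, sizeof *adr);
    adr->sin_family = AF_INET;    /* host order: no conversion */
    adr->sin_port = htons(port);  /* host order -> network order */

    /* inet_pton() stores the address already in network byte order. */
    return inet_pton(AF_INET, ip, &adr->sin_addr) == 1 ? 0 : -1;
}
```

When reading a port back out of a structure filled in by the system (for example after accept() or getsockname()), ntohs() is the matching call.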
Nowadays in practice nobody uses any families but INET, INET6, and sometimes UNIX -- and the latter can be replaced by using named pipes in the filesystem which is usually at least as good.
Why is neither ntohs() nor htons() used on adr_inet.sin_family?
adr_inet.sin_family is initialized to the value of AF_INET. This is defined in bits/socket.h (which is included via netinet/in.h in your example) as:
#define PF_INET 2 /* IP protocol family. */
and then,
#define AF_INET PF_INET
So AF_INET is just a constant that lets the program identify the associated socket address as an IPv4 (TCP/IP) one. It doesn't hold an IPv4 address or a port itself, so there's no need to perform an endian conversion on it.
Also, note that in newer versions of the C library (glibc), netinet/in.h has a comment that states the following:
/* Functions to convert between host and network byte order.
Please note that these functions normally take `unsigned long int' or
`unsigned short int' values as arguments and also return them. But
this was a short-sighted decision since on different systems the types
may have different representations but the values are always the same. */
extern uint32_t ntohl (uint32_t __netlong) __THROW __attribute__ ((__const__));
extern uint16_t ntohs (uint16_t __netshort)
__THROW __attribute__ ((__const__));
extern uint32_t htonl (uint32_t __hostlong)
__THROW __attribute__ ((__const__));
extern uint16_t htons (uint16_t __hostshort)
__THROW __attribute__ ((__const__));
Whereas the website you're referencing cites the older use of unsigned long and unsigned short datatypes for the conversion functions. So there's a chance you may encounter issues running code from that site if you're using a newer version of the C library, particularly on systems where unsigned long is wider than 32 bits.
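As a rough sketch of how to stay clear of that, you can keep converted values in the fixed-width types the current prototypes use. The helpers put_u32 and get_u32 below are hypothetical names of my own, not library functions:

```c
#include <arpa/inet.h>
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: writes a 32-bit host-order value into buf in
   network byte order. */
static void put_u32(unsigned char buf[4], uint32_t host_value)
{
    assert(buf != NULL);

    uint32_t net = htonl(host_value);  /* uint32_t, not unsigned long */
    memcpy(buf, &net, sizeof net);     /* sizeof net is 4 on every platform */
}

/* Hypothetical helper: reads a network-order 32-bit value from buf and
   returns it in host order. */
static uint32_t get_u32(const unsigned char buf[4])
{
    assert(buf != NULL);

    uint32_t net;
    memcpy(&net, buf, sizeof net);
    return ntohl(net);
}
```

Had net been declared as unsigned long, sizeof net would be 8 on typical 64-bit Linux systems and the memcpy would run past the 4-byte buffer, which is the kind of issue the old prototypes invite.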