What is the difference between a port and a socket?

后端 未结 30 1703
旧巷少年郎
旧巷少年郎 2020-11-22 14:31

This was a question raised by one of the software engineers in my organisation. I\'m interested in the broadest definition.

相关标签:
30条回答
  • 2020-11-22 15:19

    A socket is a communication endpoint. A socket is not directly related to the TCP/IP protocol family, it can be used with any protocol your system supports. The C socket API expects you to first get a blank socket object from the system that you can then either bind to a local socket address (to directly retrieve incoming traffic for connection-less protocols or to accept incoming connection requests for connection-oriented protocols) or that you can connect to a remote socket address (for either kind of protocol). You can even do both if you want to control both, the local socket address a socket is bound to and the remote socket address a socket is connected to. For connection-less protocols connecting a socket is even optional but if you don't do that, you'll have to also pass the destination address with every packet you want to send over the socket as how else would the socket know where to send this data to? Advantage is that you can use a single socket to send packets to different socket addresses. Once you have your socket configured and maybe even connected, consider it to be a bi-directional communication pipe. You can use it to pass data to some destination and some destination can use it to pass data back to you. What you write to a socket is send out and what has been received is available for reading.

    Ports on the other hand are something that only certain protocols of the TCP/IP protocol stack have. TCP and UDP packets have ports. A port is just a simple number. The combination of source port and destination port identify a communication channel between two hosts. E.g. you may have a server that shall be both, a simple HTTP server and a simple FTP server. If now a packet arrives for the address of that server, how would it know if that is a packet for the HTTP or the FTP server? Well, it will know so as the HTTP server will run on port 80 and the FTP server on port 21, so if the packet arrives with a destination port 80, it is for the HTTP server and not for the FTP server. Also the packet has a source port since without such a source port, a server could only have one connection to one IP address at a time. The source port makes it possible for a server to distinguish otherwise identical connections: they all have the same destination port, e.g. port 80, the same destination IP (the IP of the server), and the same source IP, as they all come from the same client, but as they have a different source ports, the server can distinguish them from each other. And when the server sends back replies, it will do so to the port the request came from, that way the client can also distinguish different replies it receives from the same server.

    0 讨论(0)
  • 2020-11-22 15:20

    Summary

    A TCP socket is an endpoint instance defined by an IP address and a port in the context of either a particular TCP connection or the listening state.

    A port is a virtualisation identifier defining a service endpoint (as distinct from a service instance endpoint aka session identifier).

    A TCP socket is not a connection, it is the endpoint of a specific connection.

    There can be concurrent connections to a service endpoint, because a connection is identified by both its local and remote endpoints, allowing traffic to be routed to a specific service instance.

    There can only be one listener socket for a given address/port combination.

    Exposition

    This was an interesting question that forced me to re-examine a number of things I thought I knew inside out. You'd think a name like "socket" would be self-explanatory: it was obviously chosen to evoke imagery of the endpoint into which you plug a network cable, there being strong functional parallels. Nevertheless, in network parlance the word "socket" carries so much baggage that a careful re-examination is necessary.

    In the broadest possible sense, a port is a point of ingress or egress. Although not used in a networking context, the French word porte literally means door or gateway, further emphasising the fact that ports are transportation endpoints whether you ship data or big steel containers.

    For the purpose of this discussion I will limit consideration to the context of TCP-IP networks. The OSI model is all very well but has never been completely implemented, much less widely deployed in high-traffic high-stress conditions.

    The combination of an IP address and a port is strictly known as an endpoint and is sometimes called a socket. This usage originates with RFC793, the original TCP specification.

    A TCP connection is defined by two endpoints aka sockets.

    An endpoint (socket) is defined by the combination of a network address and a port identifier. Note that address/port does not completely identify a socket (more on this later).

    The purpose of ports is to differentiate multiple endpoints on a given network address. You could say that a port is a virtualised endpoint. This virtualisation makes multiple concurrent connections on a single network interface possible.

    It is the socket pair (the 4-tuple consisting of the client IP address, client port number, server IP address, and server port number) that specifies the two endpoints that uniquely identifies each TCP connection in an internet. (TCP-IP Illustrated Volume 1, W. Richard Stevens)

    In most C-derived languages, TCP connections are established and manipulated using methods on an instance of a Socket class. Although it is common to operate on a higher level of abstraction, typically an instance of a NetworkStream class, this generally exposes a reference to a socket object. To the coder this socket object appears to represent the connection because the connection is created and manipulated using methods of the socket object.

    In C#, to establish a TCP connection (to an existing listener) first you create a TcpClient. If you don't specify an endpoint to the TcpClient constructor it uses defaults - one way or another the local endpoint is defined. Then you invoke the Connect method on the instance you've created. This method requires a parameter describing the other endpoint.

    All this is a bit confusing and leads you to believe that a socket is a connection, which is bollocks. I was labouring under this misapprehension until Richard Dorman asked the question.

    Having done a lot of reading and thinking, I'm now convinced that it would make a lot more sense to have a class TcpConnection with a constructor that takes two arguments, LocalEndpoint and RemoteEndpoint. You could probably support a single argument RemoteEndpoint when defaults are acceptable for the local endpoint. This is ambiguous on multihomed computers, but the ambiguity can be resolved using the routing table by selecting the interface with the shortest route to the remote endpoint.

    Clarity would be enhanced in other respects, too. A socket is not identified by the combination of IP address and port:

    [...]TCP demultiplexes incoming segments using all four values that comprise the local and foreign addresses: destination IP address, destination port number, source IP address, and source port number. TCP cannot determine which process gets an incoming segment by looking at the destination port only. Also, the only one of the [various] endpoints at [a given port number] that will receive incoming connection requests is the one in the listen state. (p255, TCP-IP Illustrated Volume 1, W. Richard Stevens)

    As you can see, it is not just possible but quite likely for a network service to have numerous sockets with the same address/port, but only one listener socket on a particular address/port combination. Typical library implementations present a socket class, an instance of which is used to create and manage a connection. This is extremely unfortunate, since it causes confusion and has lead to widespread conflation of the two concepts.

    Hagrawal doesn't believe me (see comments) so here's a real sample. I connected a web browser to http://dilbert.com and then ran netstat -an -p tcp. The last six lines of the output contain two examples of the fact that address and port are not enough to uniquely identify a socket. There are two distinct connections between 192.168.1.3 (my workstation) and 54.252.94.236:80 (the remote HTTP server)

      TCP    192.168.1.3:63240      54.252.94.236:80       SYN_SENT
      TCP    192.168.1.3:63241      54.252.94.236:80       SYN_SENT
      TCP    192.168.1.3:63242      207.38.110.62:80       SYN_SENT
      TCP    192.168.1.3:63243      207.38.110.62:80       SYN_SENT
      TCP    192.168.1.3:64161      65.54.225.168:443      ESTABLISHED
    

    Since a socket is the endpoint of a connection, there are two sockets with the address/port combination 207.38.110.62:80 and two more with the address/port combination 54.252.94.236:80.

    I think Hagrawal's misunderstanding arises from my very careful use of the word "identifies". I mean "completely, unambiguously and uniquely identifies". In the above sample there are two endpoints with the address/port combination 54.252.94.236:80. If all you have is address and port, you don't have enough information to tell these sockets apart. It's not enough information to identify a socket.

    Addendum

    Paragraph two of section 2.7 of RFC793 says

    A connection is fully specified by the pair of sockets at the ends. A local socket may participate in many connections to different foreign sockets.

    This definition of socket is not helpful from a programming perspective because it is not the same as a socket object, which is the endpoint of a particular connection. To a programmer, and most of this question's audience are programmers, this is a vital functional difference.

    References

    1. TCP-IP Illustrated Volume 1 The Protocols, W. Richard Stevens, 1994 Addison Wesley

    2. RFC793, Information Sciences Institute, University of Southern California for DARPA

    3. RFC147, The Definition of a Socket, Joel M. Winett, Lincoln Laboratory

    0 讨论(0)
  • 2020-11-22 15:20

    The port was the easiest part, it is just a unique identifier for a socket. A socket is something processes can use to establish connections and to communicate with each other. Tall Jeff had a great telephone analogy which was not perfect, so I decided to fix it:

    • ip and port ~ phone number
    • socket ~ phone device
    • connection ~ phone call
    • establishing connection ~ calling a number
    • processes, remote applications ~ people
    • messages ~ speech
    0 讨论(0)
  • 2020-11-22 15:21

    These are basic networking concepts so I will explain them in an easy yet a comprehensive way to understand in details.

    • A socket is like a telephone (i.e. end to end device for communication)
    • IP is like your telephone number (i.e. address for your socket)
    • Port is like the person you want to talk to (i.e. the service you want to order from that address)
    • A socket can be a client or a server socket (i.e. in a company the telephone of the customer support is a server but a telephone in your home is mostly a client)

    So a socket in networking is a virtual communication device bound to a pair (ip , port) = (address , service).

    Note:

    • A machine, a computer, a host, a mobile, or a PC can have multiple addresses , multiple open ports, and thus multiple sockets. Like in an office you can have multiple telephones with multiple telephone numbers and multiple people to talk to.
    • Existence of an open/active port necessitate that you must have a socket bound to it, because it is the socket that makes the port accessible. However, you may have unused ports for the time being.
    • Also note, in a server socket you can bind it to (a port, a specific address of a machine) or to (a port, all addresses of a machine) as in the telephone you may connect many telephone lines (telephone numbers) to a telephone or one specific telephone line to a telephone and still you can reach a person through all these telephone lines or through a specific telephone line.
    • You can not associate (bind) a socket with two ports as in the telephone usually you can not always have two people using the same telephone at the same time .
    • Advanced: on the same machine you cannot have two sockets with same type (client, or server) and same port and ip. However, if you are a client you can open two connections, with two sockets, to a server because the local port in each of these client's sockets is different)

    Hope it clears you doubts

    0 讨论(0)
  • 2020-11-22 15:23

    Socket is an abstraction provided by kernel to user applications for data I/O. A socket type is defined by the protocol its handling, an IPC communication etc. So if somebody creates a TCP socket he can do manipulations like reading data to socket and writing data to it by simple methods and the lower level protocol handling like TCP conversions and forwarding packets to lower level network protocols is done by the particular socket implementation in the kernel. The advantage is that user need not worry about handling protocol specific nitigrities and should just read and write data to socket like a normal buffer. Same is true in case of IPC, user just reads and writes data to socket and kernel handles all lower level details based on the type of socket created.

    Port together with IP is like providing an address to the socket, though its not necessary, but it helps in network communications.

    0 讨论(0)
  • 2020-11-22 15:23

    A port denotes a communication endpoint in the TCP and UDP transports for the IP network protocol. A socket is a software abstraction for a communication endpoint commonly used in implementations of these protocols (socket API). An alternative implementation is the XTI/TLI API.

    See also:

    Stevens, W. R. 1998, UNIX Network Programming: Networking APIs: Sockets and XTI; Volume 1, Prentice Hall.
    Stevens, W. R., 1994, TCP/IP Illustrated, Volume 1: The Protocols, Addison-Wesley.

    0 讨论(0)
提交回复
热议问题