Question
I can reliably get a Winsock socket to connect() to itself if I connect to localhost with a port in the range of automatically assigned ephemeral ports (5000–65534). Specifically, Windows appears to have a system-wide rolling port number, which is the next port it will try to assign as the local port for a client socket. If I create sockets until the assigned number is just below my target port number, and then repeatedly create a socket and attempt to connect to that port number, I can usually get the socket to connect to itself.
I first got it to happen in an application that repeatedly tries to connect to a certain port on localhost. When the service is not listening, it very rarely successfully establishes a connection and receives the message that it initially sent (which happens to be a Redis PING command).
An example, in Python (run with nothing listening to the target port):
import socket

TARGET_PORT = 49400

def mksocket():
    return socket.socket(socket.AF_INET, socket.SOCK_STREAM, socket.IPPROTO_TCP)

# Bind throwaway sockets until the OS hands out a port just below the target.
while True:
    sock = mksocket()
    sock.bind(('127.0.0.1', 0))
    host, port = sock.getsockname()
    if port > TARGET_PORT - 10 and port < TARGET_PORT:
        break
    print(port)

# Keep connecting to the target port until the local port reaches it.
while port < TARGET_PORT:
    sock = mksocket()
    err = None
    try:
        sock.connect(('127.0.0.1', TARGET_PORT))
    except socket.error as e:
        err = e
    host, port = sock.getsockname()
    if err:
        print('Unable to connect to port %d, used local port %d: %s' % (TARGET_PORT, port, err))
    else:
        print('Connected to port %d, used local port %d' % (TARGET_PORT, port))
On my Mac, this eventually terminates with "Unable to connect to port 49400, used local port 49400". On my Windows 7 machine, a connection is successfully established and it prints "Connected to port 49400, used local port 49400". The resulting socket receives any data that is sent to it.
Is this a bug in Winsock? Is this a bug in my code?
Edit: Here is a screenshot of TcpView with the offending connection shown:
Answer 1:
This appears to be a "simultaneous initiation" as described in section 3.4 of RFC 793. See Figure 8. Note that neither side is in state LISTEN at any stage. In your case, both ends are the same socket, so the handshake completes exactly as described in the RFC.
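A minimal sketch of the same effect without waiting for the port counter: bind a socket explicitly, then connect it to its own address. The helper name `try_self_connect` and the 2-second timeout are illustrative assumptions, not part of the question. Whether the connect succeeds depends on the platform's TCP stack (the asker reports that macOS refuses it), so the function reports the error instead of assuming success.

```python
import socket

def try_self_connect(host="127.0.0.1", timeout=2.0):
    """Bind a TCP socket to an OS-chosen port, then connect it to that port.

    Returns (port, echoed_bytes, error); one of the last two is None.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.settimeout(timeout)
    port = 0
    try:
        sock.bind((host, 0))
        port = sock.getsockname()[1]
        # Source and destination are now the same (address, port) pair and
        # nothing is listening, so only a simultaneous open can succeed.
        sock.connect((host, port))
        sock.sendall(b"PING\r\n")
        # On a self-connected socket the bytes come back on the same socket.
        return port, sock.recv(16), None
    except OSError as exc:
        return port, None, exc
    finally:
        sock.close()

if __name__ == "__main__":
    port, echoed, err = try_self_connect()
    if err is None:
        print("Self-connected on port %d, received %r" % (port, echoed))
    else:
        print("Self-connect on port %d failed: %s" % (port, err))
```

If the connect succeeds, the socket has no separate peer: whatever it sends is delivered to its own receive buffer, which matches the echoed PING described in the question.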
Answer 2:
It is a logic bug in your code.
First off, only newer versions of Windows (Vista and later) use a high ephemeral port range, 49152–65535 by default. Older versions used 1025–5000 instead.
You are creating multiple sockets that are explicitly bound to random ephemeral ports until you have bound a socket that is within 10 ports below your target port. However, if any of those sockets happens to bind to the actual target port, you ignore that and keep looping. So you may or may not end up with a socket that is bound to the target port, and you may or may not end up with a final port value that is actually less than the target port.
After that, if port happens to be less than your target port (which is not guaranteed), you are then creating more sockets that are implicitly bound to different random available ephemeral ports when calling connect() (it does an implicit bind() internally if bind() has not been called yet), none of which will be the same ephemeral ports that you explicitly bound to, since those ports are already in use and cannot be used again.
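The implicit bind performed by connect() can be observed with a short sketch like the one below. It assumes a throwaway listener on the loopback interface so that connect() has something to reach; the helper name `show_implicit_bind` is hypothetical.

```python
import socket

def show_implicit_bind(host="127.0.0.1"):
    """Connect a client that never calls bind() and report the ports involved.

    Returns (listener_port, client_local_port, peer_port_seen_by_listener).
    """
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.settimeout(2.0)
    client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        listener.bind((host, 0))
        listener.listen(1)
        target = listener.getsockname()
        # No bind() on the client: connect() picks a local ephemeral port.
        client.connect(target)
        local_port = client.getsockname()[1]
        accepted, peer = listener.accept()
        accepted.close()
        return target[1], local_port, peer[1]
    finally:
        client.close()
        listener.close()

if __name__ == "__main__":
    listener_port, local_port, peer_port = show_implicit_bind()
    print("Listener on %d, client used local port %d (listener saw %d)"
          % (listener_port, local_port, peer_port))
```

The local port reported by getsockname() after connect() is the one the listener sees as the peer port, even though the client never bound it explicitly.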
At no point do you have any given socket connecting from an ephemeral port to the same ephemeral port. And unless another app happens to have bound itself to your target port and is actively listening on it, there is no way that connect() can succeed on any of the sockets you create, since none of them are in the listening state. Also, getsockname() is not valid on an unbound socket, and a connecting socket is not guaranteed to be bound if connect() fails. So the symptoms you think are happening are not possible given the code you have shown. Your logging is simply making the wrong assumptions and thus logging the wrong things, giving you a false picture of what is happening.
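To check what getsockname() reports on an unbound socket on a given machine, a small probe like the following can help. This is only a sketch: the helper name `local_address_or_error` is hypothetical, and the result for the unbound case is platform-dependent (some stacks raise an error, others return a wildcard address with port 0), so the code captures either outcome rather than assuming one.

```python
import socket

def local_address_or_error(sock):
    """Return sock.getsockname(), or the OSError it raised."""
    try:
        return sock.getsockname()
    except OSError as exc:
        return exc

def probe_getsockname(host="127.0.0.1"):
    """Query the local address before and after an explicit bind()."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        before = local_address_or_error(sock)  # socket is still unbound here
        sock.bind((host, 0))
        after = local_address_or_error(sock)   # now bound to a real port
        return before, after
    finally:
        sock.close()

if __name__ == "__main__":
    before, after = probe_getsockname()
    print("Before bind(): %r" % (before,))
    print("After bind():  %r" % (after,))
```

Only the value obtained after bind() (or after a successful connect()) is a dependable local port for logging.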
Try something more like this instead, and you will see what the real ports are:
import socket

TARGET_PORT = 49400

def mksocket():
    return socket.socket(socket.AF_INET, socket.SOCK_STREAM, socket.IPPROTO_TCP)

while True:
    sock = mksocket()
    sock.bind(('127.0.0.1', 0))
    host, port = sock.getsockname()
    print('Bound to local port %d' % (port))
    if port > TARGET_PORT - 10 and port < TARGET_PORT:
        break

if port >= TARGET_PORT:
    print('Bound port %d exceeded target port %d' % (port, TARGET_PORT))
else:
    while port < TARGET_PORT:
        sock = mksocket()
        # connect() would do this internally anyway, so this is just to ensure
        # a port is available for logging even if connect() fails
        sock.bind(('127.0.0.1', 0))
        err = None
        try:
            sock.connect(('127.0.0.1', TARGET_PORT))
        except socket.error as e:
            err = e
        host, port = sock.getsockname()
        if err:
            print('Unable to connect to port %d using local port %d' % (TARGET_PORT, port))
        else:
            print('Connected to port %d using local port %d' % (TARGET_PORT, port))
Source: https://stackoverflow.com/questions/17584383/why-can-a-socket-connect-to-its-own-ephemeral-port