When we start a server application, we always need to speicify the port number it listens to. But how is this \"listening mechanism\" implemented under the hood?
My cur
While the other answers seem to explain things correctly, let me give a more direct answer: your imagination is wrong.
There is no buffer that the application monitors. Instead, the application calls listen() at some point, and the OS remembers from then on that this application is interested in new connections to that port number. Only one application can indicate interest in a certain port at any time.
The listen operation does not block. Instead, it returns right away. What may block is accept(). The system has a backlog of incoming connections (buffering the data that have been received), and returns one of the connections every time accept is called. accept doesn't transmit any data; the application must then do recv() calls on the accepted socket.
As to your questions:
as others have said: hardware interrupts. The NIC takes the datagram completely off the wire, interrupts, and is assigned an address in memory to copy it to.
for TCP, there will be no data loss, as there will always be sufficient memory during the communication. TCP has flow control, and the sender will stop sending before the receiver has no more memory. For UDP and new TCP connections, there can be data loss; the sender will typically get an error indication (as the system reserves memory to accept just one more datagram).
see above: listen itself is not blocking; accept is.