Your question deserves a much longer discussion but here's a short stab at an answer:
- with blocking sockets, a thread can service only one socket at a time, because each call blocks until there is activity on that socket
- blocking sockets are generally easier to use than non-blocking sockets (asynchronous programming tends to be more complicated)
- you could create one thread per socket, as you suggested, but every thread carries overhead (stack memory, context switches), so this scales far worse than the non-blocking solutions
- with non-blocking sockets you can handle a much larger volume of clients: a single process can scale to hundreds of thousands of connections, but the code becomes more complicated
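To make the blocking vs. non-blocking difference concrete, here is a minimal sketch in Python (chosen only for brevity; the same idea applies to Winsock in C). The helper name try_recv is hypothetical, not part of any library. It shows that a non-blocking recv() returns straight away when there is no data, instead of parking the thread:

```python
import socket

def try_recv(sock, nbytes=4096):
    """Return pending bytes, or None if the call would have blocked."""
    try:
        return sock.recv(nbytes)
    except BlockingIOError:
        # No data yet; the thread is free to go and service other sockets.
        return None

if __name__ == "__main__":
    a, b = socket.socketpair()  # two connected sockets, for demonstration only
    a.setblocking(False)        # recv() on `a` now returns immediately
    print(try_recv(a))          # nothing has been sent yet, so this prints None
    a.close()
    b.close()
```

With a blocking socket the same recv() call would simply not return until the peer sent something, which is why one thread can only look after one blocking socket.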
With non-blocking sockets (on Windows) you have a few options:
- polling (e.g. select())
- event-based notification (e.g. WSAEventSelect())
- overlapped I/O (typically combined with I/O completion ports)
Overlapped I/O will give you the best performance (many thousands of sockets per process) at the expense of being the most complicated model to understand and implement correctly.
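As a rough illustration of the simplest of these models (polling), here is a sketch of a single-threaded echo server that multiplexes many non-blocking sockets. It uses Python's selectors module, which falls back to select() on Windows; the function names make_listener and poll_once are made up for this example. It is not an overlapped I/O implementation, which would look quite different.

```python
import selectors
import socket

def make_listener(host="127.0.0.1", port=0):
    """Create a non-blocking listening socket (port 0 lets the OS pick one)."""
    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    srv.bind((host, port))
    srv.listen()
    srv.setblocking(False)
    return srv

def poll_once(sel, listener, timeout=1.0):
    """One polling pass: accept new clients and echo back any pending data."""
    for key, _events in sel.select(timeout):
        sock = key.fileobj
        if sock is listener:
            conn, _addr = listener.accept()
            conn.setblocking(False)
            sel.register(conn, selectors.EVENT_READ)
            continue
        try:
            data = sock.recv(4096)
        except BlockingIOError:
            continue          # spurious wakeup; try again on the next pass
        except ConnectionError:
            data = b""        # treat a reset like a normal disconnect
        if data:
            # Simplification: assumes the send buffer has room for the reply.
            sock.sendall(data)
        else:
            sel.unregister(sock)
            sock.close()
```

To use it, create a selectors.DefaultSelector(), register the listener for selectors.EVENT_READ, and call poll_once in a loop. One thread then serves every connected client, at the cost of having to track per-connection state yourself.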
Basically it comes down to performance vs. programming complexity.
NOTE
Here's a better explanation of why a thread-per-socket model is a bad idea:
On Windows, creating a large number of threads is highly inefficient because the scheduler cannot properly determine which threads should be receiving processor time and which shouldn't. That, coupled with the memory overhead of each thread, means that you will run out of memory (because of stack space) and processor cycles (because of the overhead of managing threads) at the OS level long before you run out of capacity to handle socket connections.
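To put a rough number on the stack-space point, here is a back-of-the-envelope calculation. It assumes the usual default stack reservation of 1 MB per thread (the actual value depends on how the executable was built and on the arguments passed when creating the thread) and a 32-bit process with 2 GB of user address space. Both figures are assumptions for illustration, and the function names are made up:

```python
MIB = 1024 * 1024

def max_threads_by_stack(address_space_bytes, stack_reserve_bytes=1 * MIB):
    """Upper bound on threads if stacks were the only use of address space."""
    return address_space_bytes // stack_reserve_bytes

def stack_reserve_total(n_threads, stack_reserve_bytes=1 * MIB):
    """Address space reserved for stacks by n_threads threads, in bytes."""
    return n_threads * stack_reserve_bytes

if __name__ == "__main__":
    # 2 GB of user address space with 1 MB stacks: about two thousand threads.
    print(max_threads_by_stack(2 * 1024 * MIB))   # prints 2048
    # 10,000 threads reserve 10,000 MB of address space for stacks alone.
    print(stack_reserve_total(10_000) // MIB)     # prints 10000
```

In practice the limit is lower still, since code, heap and DLLs share the same address space. Note that this is reserved address space rather than committed memory, but it is still the first wall a thread-per-socket server tends to hit.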