The python threading documentation states that \"...threading is still an appropriate model if you want to run multiple I/O-bound tasks simultaneously\", apparently because
The GIL in CPython1 is only concerned with Python code being executed. A thread-safe C extension that uses a lot of CPU might release the GIL as long as it doesn't need to interact with the Python runtime.
As soon as the C code needs to 'talk' to Python (read: call back into the Python runtime) then it needs to acquire the GIL again - that is, the GIL is to establish protection/atomic behavior for the "interpreter" (and I use the term loosely) and is not to prevent native/non-Python code from running concurrently.
Releasing the GIL around I/O (blocking or not, using CPU or not) is the same thing - until the data is moved into Python there is no reason to acquire the GIL.
1 The GIL is controversial because it prevents multithreaded CPython programs from taking full advantage of multiprocessor systems in certain situations. Note that potentially blocking or long-running operations, such as I/O, image processing, and NumPy number crunching, happen outside the GIL. Therefore it is only in multithreaded programs that spend a lot of time inside the GIL, interpreting CPython bytecode, that the GIL becomes a bottleneck.
All of Python's blocking I/O primitives release the GIL while waiting for the I/O block to resolve -- it's as simple as that! They will of course need to acquire the GIL again before going on to execute further Python code, but for the long-in-terms-of-machine-cycles intervals in which they're just waiting for some I/O syscall, they don't need the GIL, so they don't hold on to it!