My incomplete understanding is that Twisted, Stackless, Greenlet, Eventlet, Coroutines all make use of async network IO and userland threads that are very lightweight and quick
You are almost right when comparing Stackless to Greenlet. The missing thing is:
Stackless per se does not add something. Instead, Greenlet, invented 5 years after Stackless, removes certain things. It is written simple enough to be built as an extension module instead of a replacement interpreter.
This is really funny—Stackless has many more features, is about 10 times more efficient on switching, and provides pickling of execution state.
Greenlet still wins, probably only due to ease of use as an extension module. So I'm thinking about reverting the process by extending Greenlet with pickling. Maybe that would change the picture, again :-)