NIO is a reasonable choice for performance. It should use no more than #cores threads, multiplied by the IO factor of your application, where the IO factor grows with the share of time your app spends waiting for disk IO to complete.
The reason is simple. With #cores workers, each worker is likely to be bound to a single CPU core and can utilize it to its maximum. The more workers you add, the more context switches you get, and avoiding those is exactly why you use NIO in the first place.
While a worker waits for IO, its core sits idle and could be handling other requests, so use somewhat more workers than cores to get full CPU utilization.
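
The sizing rule above can be sketched in Java. This is only an illustration: the answer does not define the IO factor exactly, so the sketch assumes it is 1 / (1 - ioWaitFraction), where ioWaitFraction is the measured share of time a worker spends blocked on IO. The class name WorkerPoolSizing, the method workerCount, and the 0.5 wait fraction are hypothetical.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

final class WorkerPoolSizing {

    // Assumed heuristic: workers = cores / (1 - ioWaitFraction), rounded up.
    // ioWaitFraction = 0.0 gives exactly one worker per core;
    // ioWaitFraction = 0.5 gives two workers per core.
    static int workerCount(int cores, double ioWaitFraction) {
        if (cores < 1) {
            throw new IllegalArgumentException("cores must be >= 1");
        }
        // Written this way so that NaN is rejected as well.
        if (!(ioWaitFraction >= 0.0 && ioWaitFraction < 1.0)) {
            throw new IllegalArgumentException("ioWaitFraction must be in [0, 1)");
        }
        return (int) Math.ceil(cores / (1.0 - ioWaitFraction));
    }

    public static void main(String[] args) {
        int cores = Runtime.getRuntime().availableProcessors();
        // 0.5 is a placeholder; measure the real value for your application.
        int workers = workerCount(cores, 0.5);

        // Fixed-size pool that would run the NIO worker loops.
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        System.out.println(cores + " cores, " + workers + " workers");
        pool.shutdown();
    }
}
```

Treat the result as a starting point and verify it under load, since too many workers bring back the context switches you wanted to avoid.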
If you use threads, you get the following advantages:
- you can store session information in ThreadLocals.
- you don't have to manage the session information in other ways.
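
As a rough sketch of the ThreadLocal point, assuming each request is handled from start to finish on a single thread: the names SessionDemo, CURRENT_USER, handleRequest, and greet are hypothetical and only illustrate the pattern.

```java
final class SessionDemo {

    // One value per thread; a request running on this thread sees only its own.
    private static final ThreadLocal<String> CURRENT_USER = new ThreadLocal<>();

    // Returns the user bound to the calling thread, or null if none is set.
    static String currentUser() {
        return CURRENT_USER.get();
    }

    // Code deeper in the call stack reads the session without extra parameters.
    static String greet() {
        return "hello " + CURRENT_USER.get();
    }

    // Entry point for one request: bind the session, do the work, then clean up.
    static String handleRequest(String user) {
        CURRENT_USER.set(user);
        try {
            return greet();
        } finally {
            // Required when threads are pooled, otherwise the next request
            // on this thread would see the previous user's data.
            CURRENT_USER.remove();
        }
    }

    public static void main(String[] args) throws InterruptedException {
        Thread a = new Thread(() -> System.out.println(handleRequest("alice")));
        Thread b = new Thread(() -> System.out.println(handleRequest("bob")));
        a.start();
        b.start();
        a.join();
        b.join();
    }
}
```

This convenience does not carry over to NIO workers, where one thread serves many connections, so the session has to be attached to the connection itself in that model.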