I am aware of how async await works. I know that when execution reaches to await, it release the thread and after IO completes, it fetches thread from threadpool and run the rem
Should we use async methods for the very fast IO method, like cache read/write method?
Such an IO would not block in the classical sense. "Blocking" is a loosely defined term. Normally it means that the CPU must wait for the hardware.
This type of IO is purely CPU work and there are no context switches. This would typically happen if the app reads a file or socket slower than data can be provided. Here, async IO does not help performance at all. I'm not even sure it would be suitable to unblock the UI thread since all tasks might complete synchronously.
Or it also saves cpu as well?
It generally increases CPU usage in real-world loads. This is because the async machinery adds processing, allocations and synchronization. Also, we need to transition to kernel mode two times instead of once (first to initiate the IO, then to dequeue the IO completion notification).
Typical workloads run with <<100% CPU. A production server with >60% CPU would worry me since there is no margin for error. In such cases the thread pool work queues are almost always empty. Therefore, there are no context switching savings caused by processing multiple IO completions on one context switch.
That's why CPU usage generally increases (slightly), except if the machine is very high on CPU load and the work queues are often capable of delivering a new item immediately.
On the server async IO is mainly useful for saving threads. If you have ample threads available you will realize zero or negative gains. In particular any single IO will not become one bit faster.
That means it does not consume cpu.
It would be a waste to leave the CPU unavailable while an IO is in progress. To the kernel an IO is just a data structure. While it's in progress there is no CPU work to be done.
An anonymous person said:
For IO-bound tasks there may not be a major performance advantage to using separate threads just to wait for a result.
Pushing the same work to a different thread certainly does not help with throughput. This is added work, not reduced work. It's a shell game. (And async IO does not use a thread while it's running so all of this is based on a false assumption.)
A simple way to convince yourself that async IO generally costs more CPU than sync IO is to run a simple TCP ping/pong benchmark sync and async. Sync is faster. This is kind of an artificial load so it's just a hint at what's going on and not a comprehensive measurement.