Taking Peter Norvig's advice, I am pondering the question:
How much time does it take to fetch one word from memory, with and without a cache miss?
Fair summary here, though with some imprecision. When it was written (2+ years ago), for a mid-range PC of that time, it estimated: main memory access, 60 ns; L1 cache, 10 ns; L2 cache, 20-30 ns (no estimates for L3 cache access times). It all varies a lot, of course, depending on contention and access patterns: since cache layers are typically filled line by line from slower memory, if you access address X and then address X+1, the second access may be a bit faster because the cache-line fill was already started by the first access (see the sketch below).
And, of course, a high-end, well-tuned server will be much faster; the relative differences between such machines in memory-access latency are typically much larger than their differences in "raw" CPU speed.
There is a nice visualization of the data in that table published on GitHub. They also have a "human scale" reinterpretation of the time values there.
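To get a feel for the cache-line effect mentioned above, here is a minimal sketch in C (my own, not from Norvig's essay; it assumes a POSIX system with clock_gettime, a 64-byte cache line, and enough RAM for a 256 MB array). It compares touching every element of a large array sequentially with touching only one element per assumed cache line:

    #define _POSIX_C_SOURCE 199309L
    #include <stdio.h>
    #include <stdlib.h>
    #include <time.h>

    #define N ((size_t)64 * 1024 * 1024)   /* 64 Mi ints = 256 MB, far bigger than any L3 */
    #define STRIDE (64 / sizeof(int))      /* one element per assumed 64-byte cache line  */

    static double now_sec(void)
    {
        struct timespec ts;
        clock_gettime(CLOCK_MONOTONIC, &ts);
        return ts.tv_sec + ts.tv_nsec * 1e-9;
    }

    int main(void)
    {
        int *a = malloc(N * sizeof *a);
        if (!a) return 1;
        for (size_t i = 0; i < N; i++) a[i] = (int)i;

        volatile long sink = 0;            /* keeps the compiler from removing the loops */

        double t0 = now_sec();
        for (size_t i = 0; i < N; i++) sink += a[i];          /* sequential: every element   */
        double t1 = now_sec();
        for (size_t i = 0; i < N; i += STRIDE) sink += a[i];  /* strided: one per cache line */
        double t2 = now_sec();

        printf("sequential: %.2f ns/element\n", (t1 - t0) * 1e9 / N);
        printf("strided:    %.2f ns/element\n", (t2 - t1) * 1e9 / (N / STRIDE));

        free(a);
        return (int)(sink & 1);
    }

The sequential loop amortizes each line fill over 16 ints, so its per-element cost is tiny; the strided loop pays for a new line on almost every access, so its per-element time creeps toward the raw fill latency (hardware prefetching will soften the difference somewhat).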
Seems like Norvig answers this himself:
execute typical instruction          1/1,000,000,000 sec = 1 nanosec
fetch from L1 cache memory           0.5 nanosec
branch misprediction                 5 nanosec
fetch from L2 cache memory           7 nanosec
Mutex lock/unlock                    25 nanosec
fetch from main memory               100 nanosec
send 2K bytes over 1Gbps network     20,000 nanosec
read 1MB sequentially from memory    250,000 nanosec
fetch from new disk location (seek)  8,000,000 nanosec
read 1MB sequentially from disk      20,000,000 nanosec
send packet US to Europe and back    150 milliseconds = 150,000,000 nanosec
The part where it says "execute typical instruction" = 1 ns implies a 1 GHz CPU sustaining roughly one instruction per cycle (assuming efficient pipelining, of course).
I don't know where he got this information, but I trust Peter Norvig to be reliable :-)
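If you want to reproduce the "fetch from main memory" figure on your own hardware, the usual trick is a pointer chase through a randomly permuted array much larger than the caches, so every load depends on the previous one and the prefetcher can't help. Here is a rough sketch of that idea in C (my own, not Norvig's; it assumes a POSIX system with clock_gettime, and the 256 MB working set is a guess for a typical desktop):

    #define _POSIX_C_SOURCE 199309L
    #include <stdio.h>
    #include <stdlib.h>
    #include <time.h>

    #define NODES ((size_t)32 * 1024 * 1024)  /* 32 Mi pointers = 256 MB on 64-bit, well past L3 */
    #define HOPS  (20 * 1000 * 1000)

    int main(void)
    {
        size_t *next = malloc(NODES * sizeof *next);
        if (!next) return 1;

        /* Build a random cyclic permutation (Sattolo's algorithm): every hop lands on an
         * unpredictable cache line, so the hardware prefetcher cannot hide the latency. */
        for (size_t i = 0; i < NODES; i++) next[i] = i;
        srand(1);
        for (size_t i = NODES - 1; i > 0; i--) {
            size_t j = (size_t)rand() % i;     /* j < i keeps the permutation one big cycle */
            size_t tmp = next[i];
            next[i] = next[j];
            next[j] = tmp;
        }

        struct timespec t0, t1;
        size_t p = 0;
        clock_gettime(CLOCK_MONOTONIC, &t0);
        for (long i = 0; i < HOPS; i++) p = next[p];   /* each load must wait for the previous one */
        clock_gettime(CLOCK_MONOTONIC, &t1);

        double ns = (t1.tv_sec - t0.tv_sec) * 1e9 + (double)(t1.tv_nsec - t0.tv_nsec);
        printf("~%.0f ns per dependent load (final index %zu)\n", ns / HOPS, p);

        free(next);
        return 0;
    }

On commodity desktops a loop like this tends to report something in the neighborhood of the table's ~100 ns, and shrinking NODES until the array fits in L2 or L1 makes the reported latency drop toward the cache figures above.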