Write a program that compares (and measures, if you can) the time it takes to access data from main memory versus the cache.
If that is possible, how do you measure the speed of each?
Take a look at Cachegrind, one of the Valgrind tools:
Cachegrind simulates how your program interacts with a machine's cache hierarchy and (optionally) branch predictor. It simulates a machine with independent first-level instruction and data caches (I1 and D1), backed by a unified second-level cache (L2). This exactly matches the configuration of many modern machines.
See these nice questions; they are somewhat related: