How do I design and implement an instrumentation tool that can find the top 20 most frequently called routines, as well as the top 20 loads and top 20 stores that generate the most cache misses?
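
To make the question concrete, here is a minimal, framework-independent sketch of the bookkeeping such a tool would need. It is not tied to any particular instrumentation framework, since none is named above. Everything in it is an assumption for illustration: the `Profiler` and `DirectMappedCache` classes, the callback names `on_routine_entry`, `on_load`, and `on_store`, and the cache parameters (direct-mapped, 64-byte lines, write-allocate on stores). In a real tool, an instrumentation framework would call these hooks at each routine entry and before each memory access, passing the instruction address (`pc`) and the effective address (`addr`).

```python
from collections import Counter

LINE_BITS = 6    # assumed 64-byte cache lines
NUM_SETS = 256   # assumed direct-mapped cache with 256 lines (16 KiB)


class DirectMappedCache:
    """Hypothetical, very simple cache model: one line per set, no associativity."""

    def __init__(self, num_sets=NUM_SETS, line_bits=LINE_BITS):
        self.num_sets = num_sets
        self.line_bits = line_bits
        self.tags = [None] * num_sets  # None marks an empty (invalid) line

    def access(self, addr):
        """Return True on a hit; on a miss, fill the line and return False."""
        line = addr >> self.line_bits
        index = line % self.num_sets
        tag = line // self.num_sets
        if self.tags[index] == tag:
            return True
        self.tags[index] = tag  # allocate on miss (assumed for loads and stores)
        return False


class Profiler:
    """Collects the three counters the question asks about."""

    def __init__(self, cache=None):
        self.cache = cache if cache is not None else DirectMappedCache()
        self.calls = Counter()         # routine name -> number of calls
        self.load_misses = Counter()   # load instruction address -> misses
        self.store_misses = Counter()  # store instruction address -> misses

    def on_routine_entry(self, name):
        self.calls[name] += 1

    def on_load(self, pc, addr):
        if not self.cache.access(addr):
            self.load_misses[pc] += 1

    def on_store(self, pc, addr):
        if not self.cache.access(addr):
            self.store_misses[pc] += 1

    def report(self, k=20):
        """Return the top-k entries of each counter, highest count first."""
        return {
            "routines": self.calls.most_common(k),
            "loads": self.load_misses.most_common(k),
            "stores": self.store_misses.most_common(k),
        }
```

The design choice worth noting is that misses are keyed by the address of the load or store instruction, not by the data address, so the report points at the code responsible. Calling `report()` once when the instrumented program exits gives the three top-20 lists. A more realistic cache model (set-associative, multi-level) could replace `DirectMappedCache` without changing the counters.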