I have created a shared object and access it from two different program and measuring the time.
DATA array is the shared object between two processes.<
You didn't describe exactly how you run the different versions (different processes?), but assuming they're sequential - It is possible that you're seeing the affect of sleep()
It depends of course on the exact implementation and HW, but it's very likely to send your CPU into some power-saving/sleep state (that's what it's designed for). If that's the case, then the core caches will have to be flushed as part of the process, and you'll wake-up with cold caches. The whie loop on the other hand is intended to do a busy wait loop while grinding your CPU and keeping it alive (along with the caches), unless you happen to get a context switch along the way.
The exact details would again depend on implementation, on x86 you can use inline assembly to invoke monitor+mwait instructions that allow you to specify the exact C-state depth you want to achieve. The deeper it is, the more caches will get closed (mostly relevant for the L3).