I have created a shared object, access it from two different programs, and measure the access time.
The DATA array is the object shared between the two processes.
In my opinion, the cache behavior can't be predicted in your case.
It also depends on your hardware, which you haven't described (for example, how many physical or logical CPU cores are present).
We can't say that program_2 will be scheduled on the same core as program_1, or that it will run immediately after it, because that is entirely up to the OS scheduler. So program_2 may or may not find the cache still filled by program_1.
It's also possible that the cached data is evicted by some other program that was scheduled right after program_1.
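If you want to take at least the core placement out of the picture, one option is to pin both programs to the same core before measuring. The following is only a minimal sketch, assuming Linux with glibc (sched_setaffinity is not portable); pin_to_cpu and first_allowed_cpu are hypothetical helper names, not something from the question.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE   /* needed for sched_setaffinity and the CPU_* macros (Linux/glibc assumption) */
#endif
#include <sched.h>

/* Restrict the calling process to a single CPU.
 * Returns 0 on success, -1 on error (errno is set by the kernel). */
int pin_to_cpu(int cpu) {
    cpu_set_t set;
    CPU_ZERO(&set);
    CPU_SET(cpu, &set);
    return sched_setaffinity(0, sizeof(set), &set);  /* pid 0 = calling process */
}

/* Return the lowest-numbered CPU the calling process is allowed to run on,
 * or -1 if the affinity mask cannot be read. */
int first_allowed_cpu(void) {
    cpu_set_t set;
    if (sched_getaffinity(0, sizeof(set), &set) != 0)
        return -1;
    for (int cpu = 0; cpu < CPU_SETSIZE; cpu++) {
        if (CPU_ISSET(cpu, &set))
            return cpu;
    }
    return -1;
}
```

Calling pin_to_cpu with the same core number at the start of both program_1 and program_2 keeps them on one core. It does not stop other processes from running on that core in between and evicting the data, so the measurements can still vary.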
You didn't describe exactly how you run the different versions (as different processes?), but assuming they run sequentially, it is possible that you're seeing the effect of sleep().
It depends, of course, on the exact implementation and hardware, but sleep() is very likely to send your CPU into some power-saving/sleep state (that's what it's designed for). If that's the case, the core caches will have to be flushed as part of that transition, and you'll wake up with cold caches. The while loop, on the other hand, is a busy-wait loop: it keeps grinding your CPU and keeps it alive (along with its caches), unless you happen to get a context switch along the way.
The exact details again depend on the implementation. On x86 you can use inline assembly to invoke the monitor and mwait instructions, which let you pass a hint for the C-state depth you want to reach (note that these are normally privileged instructions, so this generally has to be done from kernel code). The deeper the C-state, the more of the caches get flushed and powered down (mostly relevant for the L3).