Consider two threads, T1 and T2, where T1 stores to an atomic integer a_i and T2 loads from it, and suppose the load happens after the store:
T1