I wrote the following short C++ program to reproduce the false sharing effect as described by Herb Sutter:
Say, we want to perform a total amount of WORKLOAD integer op
You should be able to request the required alignment from the compiler:
alignas(64) int arr[PARALELL * PADDING]; // align the start of the array to a 64-byte cache-line boundary
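
Note that `alignas` on the array only fixes where the array starts; the elements inside are still spaced by the padding you choose. As a rough sketch of an alternative (not the code from the question), you could put `alignas` on a per-thread struct so that every element is both aligned and padded to a full line. The names `kCacheLine`, `kThreads`, `PaddedCounter`, `is_line_aligned` and `all_counters_line_aligned` are made up for illustration, and the 64-byte line size is an assumption that holds on many x86-64 parts but not everywhere:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Assumed cache-line size; 64 bytes is common on x86-64 but not universal.
constexpr std::size_t kCacheLine = 64;
// Hypothetical number of worker threads, one counter per thread.
constexpr int kThreads = 4;

// alignas on the type raises its alignment to kCacheLine; because sizeof
// is always a multiple of the alignment, the struct is also padded to
// kCacheLine bytes, so neighbouring array elements never share a line.
struct alignas(kCacheLine) PaddedCounter {
    int value = 0;
};

static_assert(alignof(PaddedCounter) == kCacheLine, "unexpected alignment");
static_assert(sizeof(PaddedCounter) == kCacheLine, "unexpected padding");

// Returns true if p lies on a kCacheLine boundary.
inline bool is_line_aligned(const void* p) {
    return reinterpret_cast<std::uintptr_t>(p) % kCacheLine == 0;
}

// Checks that every element of a static PaddedCounter array starts
// on its own cache-line boundary.
inline bool all_counters_line_aligned() {
    static PaddedCounter counters[kThreads];
    for (const PaddedCounter& c : counters) {
        if (!is_line_aligned(&c)) {
            return false;
        }
    }
    return true;
}
```

With this layout, thread `i` would only ever touch `counters[i].value`, and the `static_assert`s fail at compile time if the compiler cannot honour the requested alignment or padding.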