I have a piece of code that runs 2x faster on Windows than on Linux. Here are the times I measured:
g++ -Ofast -march=native -m64
29.1123
g++ -Ofast -mar
You don't say whether the Windows and Linux operating systems are 32-bit or 64-bit.
On a 64-bit Linux machine, if you change the size_t to an int, you'll find that the execution time on Linux drops to a value similar to the one you measured on Windows.
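size_t is an unsigned 32-bit integer on win32 and an unsigned 64-bit integer on win64. It is also 64 bits wide on 64-bit Linux, whereas int is 32 bits on all of these platforms.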
EDIT: just seen your Windows disassembly.
Your Windows OS is the 32-bit variety (or at least you've compiled for 32-bit), so size_t is 32 bits wide there.