Approach number 1 will most likely be compiled as an array of 4-byte integers, and one bit of each will be used to store your data. Theoretically a smart compiler could optimize this, but I wouldn't count on it.
Is there a reason you don't want to use std::bitset?