Lots of posts talk about the gyro drift problem. Some guys say that the gyro reading has drift, however others say the integration has drift.
Motivated by Ali reply (thanks Ali!), I did some reading and some numerical experiments and decided to post my own reply about the nature of gyro drift.
I've written a simple octave online script plotting white noise and integrated white noise:
The angle plot with reduced offset that is shown in the question seems to resemble a typical random walk. Mathematical random walks has zero mean value, so that cannot be accounted as drift. However, I believe numerical integration of white noise leads to non-zero mean (as can be seen in the histogram plot for random walk below). This, together with linearly increasing variance could be associated to the so-called gyro drift.
There is a great introduction to errors arising from gyroscopes and accelerometers here. In any case, I still have much to learn, so I could be wrong.
Regarding the complimentary filter, there's some discussion here, showing how the gyro drift is reduced by it. The article is very informal, but I found it interesting.