I have the following pandas DataFrame:
time Group blocks
0 1 A 4
1 2 A 7
2 3 A
Consider these two variants. The first is Andrews' curves and the second is a multiline plot; both are grouped by one column, Month. The DataFrame data includes three columns: Temp, Day, and Month:
import matplotlib.pyplot as plt
import pandas as pd
import statsmodels.api as sm
from pandas.plotting import andrews_curves  # formerly pandas.tools.plotting

# Load R's airquality dataset (fetched over the network)
data = sm.datasets.get_rdataset('airquality').data
data = data[['Temp', 'Month', 'Day']]  # use only Temp, Month, Day

fig, (ax1, ax2) = plt.subplots(nrows=2, ncols=1)

# Andrews' curves
andrews_curves(data, 'Month', ax=ax1)

# multiline plot with group by: one line per month
for key, grp in data.groupby('Month'):
    ax2.plot(grp['Day'], grp['Temp'], label="Temp in {0:02d}".format(key))
ax2.legend(loc='best')
plt.show()
When you plot Andrews' curves, each observation (row) of your data is mapped to a single function. Curves that lie close together suggest that the corresponding data points are also close together.
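To make the mapping from a row to a function concrete, here is a minimal sketch of the usual textbook definition of an Andrews curve, f(t) = x1/sqrt(2) + x2*sin(t) + x3*cos(t) + x4*sin(2t) + ... for t between -pi and pi. The helper name andrews_function is made up for illustration, and I am assuming pandas follows this same form rather than quoting its implementation.

```python
import numpy as np


def andrews_function(row, t):
    """Evaluate the Andrews curve of one observation `row` at angles `t`.

    Hypothetical helper for illustration; it follows the textbook formula
    x1/sqrt(2) + x2*sin(t) + x3*cos(t) + x4*sin(2t) + x5*cos(2t) + ...
    """
    row = np.asarray(row, dtype=float)
    t = np.asarray(t, dtype=float)
    # Constant term from the first feature
    result = np.full_like(t, row[0] / np.sqrt(2.0))
    # Remaining features alternate between sin and cos of rising harmonics
    for i, coeff in enumerate(row[1:], start=1):
        harmonic = (i + 1) // 2  # 1, 1, 2, 2, 3, 3, ...
        if i % 2 == 1:
            result += coeff * np.sin(harmonic * t)
        else:
            result += coeff * np.cos(harmonic * t)
    return result


t = np.linspace(-np.pi, np.pi, 200)
curve_a = andrews_function([1.0, 2.0, 3.0], t)
curve_b = andrews_function([1.0, 2.0, 3.1], t)  # a nearby observation
max_gap = np.max(np.abs(curve_a - curve_b))
print(max_gap)
```

The two rows differ only by 0.1 in the last feature, so their curves differ by at most 0.1 everywhere, which is the sense in which nearby points give nearby curves.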
You can re-structure the data as a pivot table:
df.pivot_table(index='time',columns='Group',values='blocks',aggfunc='sum').plot()
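As a self-contained sketch of what that one-liner does: the table in the question is cut off, so the rows below are made-up sample values (including a second group B that is purely an assumption), used only to show the reshaping.

```python
import pandas as pd

# Hypothetical sample data; the original table is truncated, so these values are invented.
df = pd.DataFrame({
    "time":   [1, 2, 3, 1, 2, 3],
    "Group":  ["A", "A", "A", "B", "B", "B"],
    "blocks": [4, 7, 2, 1, 3, 5],
})

# One row per time value, one column per group, summed blocks in the cells
wide = df.pivot_table(index="time", columns="Group", values="blocks", aggfunc="sum")
print(wide)
```

Calling wide.plot() on the result then draws one line per column, i.e. one line per group, with time on the x-axis.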