I\'m sure this is simple, but as a complete newbie to python, I\'m having trouble figuring out how to iterate over variables in a pandas
dataframe and run a regress
This answer is to iterate over selected columns as well as all columns in a DF.
df.columns
gives a list containing all the columns' names in the DF. Now that isn't very helpful if you want to iterate over all the columns. But it comes in handy when you want to iterate over columns of your choosing only.
We can use Python's list slicing easily to slice df.columns according to our needs. For eg, to iterate over all columns but the first one, we can do:
for column in df.columns[1:]:
print(df[column])
Similarly to iterate over all the columns in reversed order, we can do:
for column in df.columns[::-1]:
print(df[column])
We can iterate over all the columns in a lot of cool ways using this technique. Also remember that you can get the indices of all columns easily using:
for ind, column in enumerate(df.columns):
print(ind, column)
A workaround is to transpose the DataFrame
and iterate over the rows.
for column_name, column in df.transpose().iterrows():
print column_name