For sure, the fastest way to iterate over a DataFrame in plain Python is to access the underlying NumPy ndarray, either via df.values (as you do) or by accessing each column separately with df.column_name.values. Since you want access to the index too, you can use df.index.values for that.
index = df.index.values
column_of_interest1 = df.column_name1.values
...
column_of_interestk = df.column_namek.values
for i in range(df.shape[0]):
    index_value = index[i]
    ...
    column_value_k = column_of_interestk[i]
Not pythonic? Sure. But fast.
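To make the pattern concrete, here is a minimal runnable sketch. The data and the column names price and qty are made up for illustration, and the iterrows version is included only to show that both approaches produce the same result:

```python
import pandas as pd

# Hypothetical example data; the column names "price" and "qty" are invented.
df = pd.DataFrame(
    {"price": [1.5, 2.0, 3.25], "qty": [2, 4, 1]},
    index=["a", "b", "c"],
)

# Pull out the underlying arrays once, outside the loop.
index = df.index.values
price = df["price"].values
qty = df["qty"].values

totals = {}
for i in range(df.shape[0]):
    # Plain array indexing: no Series object is built per row.
    totals[index[i]] = price[i] * qty[i]

# Same computation with iterrows, for comparison only.
totals_iterrows = {idx: row["price"] * row["qty"] for idx, row in df.iterrows()}
```

On a frame this small the difference is not measurable. The gain typically shows up on large frames, because iterrows constructs a new Series for every row while the loop above only indexes into arrays.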
If you want to squeeze more juice out of the loop, look into Cython. Compiling the loop with Cython can give you huge speedups (think 10x-100x). For maximum performance, check out Cython's typed memoryviews.