I need help reshaping the data in a CSV file that has over 10000 rows, taking 10 rows at a time. For example, I have this CSV file:
Ale Brick
1 ww
2 ee
3 qq
3 xx
5 dd
You can group the rows into consecutive blocks of 10 and then reshape the values in each block:
df.groupby(np.repeat(np.arange(len(df) / 10), 10))['Brick'].apply(lambda x: x.values.reshape(2,5))
0.0 [[ww, ee, qq, xx, dd], [gg, hh, tt, yy, uu]]
1.0 [[ii, oo, pp, mm, ww], [zz, cc, rr, tt, ll]]
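Note that the one-liner above assumes the row count is an exact multiple of 10. As a minimal sketch of one way to handle leftover rows, the following uses a plain NumPy reshape instead of `groupby`. The 23-row frame and the names `n_full`, `blocks`, and `leftover` are made up for illustration and are not from the original data.

```python
import pandas as pd

# Hypothetical 23-row frame standing in for the CSV data.
df = pd.DataFrame({'Brick': ['b%02d' % i for i in range(23)]})

values = df['Brick'].to_numpy()

# Number of complete 10-row blocks; the remaining rows are kept separately.
n_full = len(values) // 10

# One 2x5 array per complete block, stacked along the first axis.
blocks = values[:n_full * 10].reshape(n_full, 2, 5)
leftover = values[n_full * 10:]

print(blocks.shape)
print(leftover)
```

Here `blocks[0]` holds the first 10 values as a 2x5 array, `blocks[1]` the next 10, and `leftover` keeps the rows that do not fill a complete block.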
import pandas as pd
df = pd.read_csv("test.csv")
data = df['Brick']
k = len(data) // 10  # number of complete 10-row chunks
for x in range(k):
    temp = data[10*x:10*(x+1)]  # rows 10x .. 10x+9
    print(temp.values.reshape(2, 5))
You're on the right track. You need the modulo operator to "batch" reshape your column: loop over the row positions, and each time you reach a 10th row, slice from the last start position and reshape that slice.
import pandas as pd
df = pd.DataFrame({'brick': ['xx','yy','xa','bd','ev','bb','oo','pp','qq','bn','nv','bn','rr','qw','bn','cd','fd','bv','nm','ty']})
start = 0 # set start to 0 for slicing
for i in range(len(df.index)):
    if (i + 1) % 10 == 0:  # the modulo operation
        # .values is needed because a pandas Series has no reshape method
        result = df['brick'].iloc[start:i+1].values.reshape(2, 5)
        print(result)
        start = i + 1  # set start to next index
Output:
[['xx' 'yy' 'xa' 'bd' 'ev']
['bb' 'oo' 'pp' 'qq' 'bn']]
[['nv' 'bn' 'rr' 'qw' 'bn']
['cd' 'fd' 'bv' 'nm' 'ty']]
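The same batching can also be written without the modulo check by stepping through the rows in strides of 10. This is only a sketch: `iter_blocks` is a hypothetical helper, not part of pandas, and it assumes that a trailing partial chunk should simply be skipped.

```python
import pandas as pd

def iter_blocks(series, size=10, shape=(2, 5)):
    """Yield each complete `size`-row chunk of `series` reshaped to `shape`."""
    # Stop early enough that every slice has exactly `size` rows.
    for start in range(0, len(series) - size + 1, size):
        yield series.iloc[start:start + size].to_numpy().reshape(shape)

df = pd.DataFrame({'brick': ['xx', 'yy', 'xa', 'bd', 'ev', 'bb', 'oo', 'pp', 'qq', 'bn',
                             'nv', 'bn', 'rr', 'qw', 'bn', 'cd', 'fd', 'bv', 'nm', 'ty']})

blocks = list(iter_blocks(df['brick']))
for block in blocks:
    print(block)
```

With the 20-row example frame this yields two 2x5 arrays, matching the output shown above.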