In R, when adding new data of unequal length to a data frame, the values repeat to fill the data frame:
df <- data.frame(first=c(1,2,3,4,5,6))
df$second <- c(1,2,3)
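import pandas as pd
import numpy as np

def put(df, column, values):
    # Create the column first so np.put has something to write into.
    df[column] = 0
    # np.put cycles through `values` until every index is filled.
    np.put(df[column], np.arange(len(df)), values)

df = pd.DataFrame({'first':range(1, 8)})
put(df, 'second', [1,2,3])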
yields
   first  second
0      1       1
1      2       2
2      3       3
3      4       1
4      5       2
5      6       3
6      7       1
It is not particularly beautiful, but one "feature" is that you do not have to worry about whether the length of the DataFrame is a multiple of the length of the repeated values: np.put repeats the values as necessary.
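The put function above passes a DataFrame column straight to np.put, which may not work on every pandas version. As a rough sketch of the same idea, the variant below fills a plain NumPy array and then assigns it to the column. The helper name put_cycled and the choice to take the dtype from the values are assumptions made for illustration, not part of the original answer.

```python
import numpy as np
import pandas as pd


def put_cycled(df, column, values):
    """Hypothetical helper: fill `column` by cycling `values` down the rows."""
    # Work on a plain ndarray, since np.put modifies its first argument in place.
    out = np.zeros(len(df), dtype=np.asarray(values).dtype)
    # When `values` is shorter than the index array, np.put repeats it as needed.
    np.put(out, np.arange(len(df)), values)
    df[column] = out
    return df


# 7 rows is not a multiple of 3, so the last cycle is cut short.
df = pd.DataFrame({'first': range(1, 8)})
put_cycled(df, 'second', [1, 2, 3])
print(df['second'].tolist())  # [1, 2, 3, 1, 2, 3, 1]
```

Because the array is built separately and assigned in one step, this version does not depend on whether df[column] returns a view or a copy.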
My first answer was:
import itertools as IT
df['second'] = list(IT.islice(IT.cycle([1,2,3]), len(df)))
but it turns out this is significantly slower:
In [312]: df = pd.DataFrame({'first':range(10**6)})
In [313]: %timeit df['second'] = list(IT.islice(IT.cycle([1,2,3]), len(df)))
10 loops, best of 3: 143 ms per loop
In [316]: %timeit df['second'] = 0; np.put(df['second'], np.arange(N), [1,2,3])
10 loops, best of 3: 27.9 ms per loop
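To repeat the comparison outside IPython, something like the following could be used. It is only a sketch: the function names fill_with_cycle and fill_with_put, the array size, and the repeat count are all assumptions, and the timings will differ by machine and library version.

```python
import itertools as IT
import timeit

import numpy as np


def fill_with_cycle(n, values):
    """Build n cycled values with itertools (pure-Python iteration)."""
    return list(IT.islice(IT.cycle(values), n))


def fill_with_put(n, values):
    """Build n cycled values with np.put on a preallocated array."""
    out = np.zeros(n, dtype=int)
    np.put(out, np.arange(n), values)
    return out


n = 10**5  # assumed size, smaller than the 10**6 used above
values = [1, 2, 3]

# Both approaches should produce the same sequence.
same = fill_with_cycle(n, values) == fill_with_put(n, values).tolist()
print(same)

# Time five calls of each approach; no particular numbers are expected.
t_cycle = timeit.timeit(lambda: fill_with_cycle(n, values), number=5)
t_put = timeit.timeit(lambda: fill_with_put(n, values), number=5)
print(f"cycle: {t_cycle:.4f}s  put: {t_put:.4f}s")
```

This times only the construction of the repeated values, not the assignment into the DataFrame, so the absolute numbers are not directly comparable with the %timeit results above.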