How to fill in rows with repeating data in pandas?

感情败类 2021-01-02 02:59

In R, when adding new data of unequal length to a data frame, the values repeat to fill the data frame:

df <- data.frame(first=c(1,2,3,4,5,6))
df$second &         


        
7 Answers
  •  小蘑菇 (OP)
    2021-01-02 03:19

    import pandas as pd
    import numpy as np
    
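    def put(df, column, values):
        # np.put needs a real ndarray (recent pandas versions raise a
        # TypeError when it is given a Series), so fill an array first.
        arr = np.zeros(len(df), dtype=np.asarray(values).dtype)
        np.put(arr, np.arange(len(df)), values)  # repeats values as needed
        df[column] = arr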
    
    df = pd.DataFrame({'first':range(1, 8)})    
    put(df, 'second', [1,2,3])
    

    yields

       first  second
    0      1       1
    1      2       2
    2      3       3
    3      4       1
    4      5       2
    5      6       3
    6      7       1
    

    This is not particularly elegant, but it has one useful property: the length of the DataFrame does not have to be a multiple of the length of the repeated values. np.put repeats the values as necessary.
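
    The repetition is easy to check on a plain NumPy array. The sketch below is an illustration, not part of the original answer: the helper name cycle_to_length is hypothetical, and np.resize is swapped in as an alternative that, according to the NumPy documentation, also fills a larger array with repeated copies of its input.

```python
import numpy as np
import pandas as pd


def cycle_to_length(values, n):
    """Hypothetical helper: repeat `values` cyclically until there are n items."""
    out = np.empty(n, dtype=np.asarray(values).dtype)
    # np.put repeats `values` when it is shorter than the index array.
    np.put(out, np.arange(n), values)
    return out


df = pd.DataFrame({'first': range(1, 8)})   # 7 rows, not a multiple of 3
df['second'] = cycle_to_length([1, 2, 3], len(df))

# np.resize gives the same cyclic fill in a single call.
df['third'] = np.resize([1, 2, 3], len(df))

print(df)
```

    Both columns are built as ordinary arrays and only then assigned, so nothing here depends on modifying a pandas Series in place.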


    My first answer was:

    import itertools as IT
    df['second'] = list(IT.islice(IT.cycle([1,2,3]), len(df)))
    

    but it turns out this is significantly slower:

    In [312]: df = pd.DataFrame({'first':range(10**6)})
    
    In [313]: %timeit df['second'] = list(IT.islice(IT.cycle([1,2,3]), len(df)))
    10 loops, best of 3: 143 ms per loop
    
    In [316]: %timeit df['second'] = 0; np.put(df['second'], np.arange(N), [1,2,3])
    10 loops, best of 3: 27.9 ms per loop
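
    To repeat the comparison outside IPython, something like the following script could be used. It is only a sketch: the function names with_cycle and with_put are made up for illustration, N is an assumed size smaller than the 10**6 used above, and the absolute numbers will differ by machine and library version.

```python
import itertools as IT
import timeit

import numpy as np
import pandas as pd

N = 10**5  # assumed size, chosen so the script finishes quickly
df = pd.DataFrame({'first': range(N)})
values = [1, 2, 3]


def with_cycle():
    # Pure-Python iteration over an endless cycle, cut off at len(df).
    return list(IT.islice(IT.cycle(values), len(df)))


def with_put():
    # Vectorised fill: np.put repeats `values` as needed.
    out = np.zeros(len(df), dtype=int)
    np.put(out, np.arange(len(df)), values)
    return out


# Both approaches must agree before their timings are worth comparing.
assert with_cycle() == with_put().tolist()

for fn in (with_cycle, with_put):
    seconds = timeit.timeit(fn, number=10) / 10
    print(f"{fn.__name__}: {seconds * 1e3:.2f} ms per call")
```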
    
