pandas Series: repeat rows n times and change a column value

半阙折子戏 2021-01-13 03:14

I have input data like this.

NAME | PLACE | DATE
  A  |   X   | 2020-04-30
  B  |   Y   | 2019-04-30

I want to duplicate 5 times and change d

4 Answers
  • 2021-01-13 03:41

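    Here is one way to do it:

    # DATE must be datetime64 for the offset arithmetic below
    df['DATE'] = pd.to_datetime(df['DATE'])
    # Repeat every row 6 times (the original plus 5 copies)
    df_out = df.reindex(df.index.repeat(6))

    # cumcount() numbers the copies within each NAME group (0..5)
    df_out['DATE'] += pd.Series([pd.DateOffset(years=i)
                                  for i in df_out.groupby('NAME').cumcount()],
                                index=df_out.index)
    df_out = df_out.reset_index(drop=True)


    Output:

          NAME    PLACE       DATE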
    0     A       X    2020-04-30
    1     A       X    2021-04-30
    2     A       X    2022-04-30
    3     A       X    2023-04-30
    4     A       X    2024-04-30
    5     A       X    2025-04-30
    6     B       Y    2019-04-30
    7     B       Y    2020-04-30
    8     B       Y    2021-04-30
    9     B       Y    2022-04-30
    10    B       Y    2023-04-30
    11    B       Y    2024-04-30
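
To see why this works, it can help to look at the intermediate pieces on their own. The sketch below rebuilds the question's two-row frame and inspects what `index.repeat` and `groupby(...).cumcount()` produce. The variable names (`n_copies`, `repeated`, `year_offset`, `out`) are assumptions made for illustration, and the final step adds the offsets with a plain list comprehension rather than the Series addition used above.

```python
import pandas as pd

df = pd.DataFrame({
    "NAME": ["A", "B"],
    "PLACE": ["X", "Y"],
    "DATE": pd.to_datetime(["2020-04-30", "2019-04-30"]),
})

n_copies = 6  # the original row plus 5 later years (illustrative name)

# Each index label is repeated n_copies times: 0,0,0,0,0,0,1,1,1,1,1,1
repeated = df.reindex(df.index.repeat(n_copies))

# cumcount numbers the rows inside each NAME group: 0..5, then 0..5 again
year_offset = repeated.groupby("NAME").cumcount()

# Shift each copy by its own number of years, then renumber the rows
out = repeated.copy()
out["DATE"] = [d + pd.DateOffset(years=int(k))
               for d, k in zip(repeated["DATE"], year_offset)]
out = out.reset_index(drop=True)
```

Note that grouping by `NAME` only gives the intended 0..5 counter if every input row has a distinct `NAME`.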
    
  • 2021-01-13 03:44

    I don't think it is possible to repeat the rows in exactly the way you want, but maybe this approach helps:

    first_clm = (["A"] * 6)
    first_clm.extend(["B"] * 6)
    scnd_clm = (["X"] * 6)
    scnd_clm.extend(["Y"] * 6)
    third_clm = ["20%s-04-30" % i for i in range(20,26)]
    third_clm.extend(["20%s-04-30" % i for i in range(19,25)])
    pd.DataFrame({"NAME": first_clm, "PLACE": scnd_clm, "DATE":third_clm})
    
    
  • 2021-01-13 03:45

    Let's try this: convert each single date into a list of dates covering the given range, then use DataFrame.explode to turn each element of that list into its own row.

    import pandas as pd
    
    df = pd.DataFrame({
        "Name": ["A", "B"],
        "Place": ["X", "Y"],
        "Date": ["2020-04-30", "2020-04-30"]
    })
    
    expand = 5
    print(
        df.assign(
            Date=pd.to_datetime(df.Date)
                .apply(lambda x: [x.replace(x.year + i) for i in range(0, expand + 1)])
        ).explode("Date").reset_index(drop=True)
    )
    

       Name Place       Date
    0     A     X 2020-04-30
    1     A     X 2021-04-30
    2     A     X 2022-04-30
    3     A     X 2023-04-30
    4     A     X 2024-04-30
    5     A     X 2025-04-30
    6     B     Y 2020-04-30
    7     B     Y 2021-04-30
    8     B     Y 2022-04-30
    9     B     Y 2023-04-30
    10    B     Y 2024-04-30
    11    B     Y 2025-04-30
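
The `explode` step may be easier to follow in isolation. This is a minimal sketch on a toy frame (the `toy` frame and its `Years` column are made up for illustration, not taken from the question): a column holding lists becomes one row per list element, and the other columns are repeated alongside.

```python
import pandas as pd

# Hypothetical toy data: one list-valued column
toy = pd.DataFrame({"Name": ["A", "B"], "Years": [[2020, 2021], [2019]]})

# One row per list element; the original index label is repeated
exploded = toy.explode("Years")

# reset_index(drop=True) renumbers the rows 0..n-1
flat = exploded.reset_index(drop=True)
```

This is why the answer above ends with `reset_index(drop=True)`: without it, all rows derived from the same input row share one index label.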
    
  • 2021-01-13 03:56

    Use:

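    import numpy as np
    import pandas as pd

    df['Date'] = pd.to_datetime(df['Date'])

    # One DateOffset per output row: years 0..5, cycled once per input row
    y = np.array([pd.offsets.DateOffset(years=i) for i in np.tile(range(6), len(df.index))])
    df = df.reindex(df.index.repeat(6)).assign(Date=lambda x: x['Date'] + y)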
    

    Details:

    Create a np.array of DateOffset objects that need to be added to the Date column to get the desired year offsets.

    print(y)
    array([<DateOffset: years=0>, <DateOffset: years=1>,
           <DateOffset: years=2>, <DateOffset: years=3>,
           <DateOffset: years=4>, <DateOffset: years=5>,
           <DateOffset: years=0>, <DateOffset: years=1>,
           <DateOffset: years=2>, <DateOffset: years=3>,
           <DateOffset: years=4>, <DateOffset: years=5>], dtype=object)
    

    Use reindex with index.repeat to repeat each row 6 times, then use assign to add the year offsets to the Date column.

    print(df)
      Name Place       Date
    0    A     X 2020-04-30
    0    A     X 2021-04-30
    0    A     X 2022-04-30
    0    A     X 2023-04-30
    0    A     X 2024-04-30
    0    A     X 2025-04-30
    1    B     Y 2019-04-30
    1    B     Y 2020-04-30
    1    B     Y 2021-04-30
    1    B     Y 2022-04-30
    1    B     Y 2023-04-30
    1    B     Y 2024-04-30
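
The approach relies on the offsets lining up with the repeated rows. As a rough sketch (the names `n_rows`, `n_copies`, `offsets`, `row_labels` and `mismatched` are assumptions for illustration), `np.tile` cycles through 0..5 once per input row, which matches the order produced by `index.repeat`, where all copies of row 0 come before the copies of row 1. `np.repeat` would give a different order and would not line up.

```python
import numpy as np
import pandas as pd

n_rows, n_copies = 2, 6  # illustrative sizes matching the example

# tile: the whole 0..5 range, once per input row -> 0,1,2,3,4,5,0,1,2,3,4,5
offsets = np.tile(np.arange(n_copies), n_rows)

# index.repeat: each label repeated in place -> 0,0,0,0,0,0,1,1,1,1,1,1
row_labels = pd.Index(range(n_rows)).repeat(n_copies)

# np.repeat would repeat each offset instead -> 0,0,1,1,2,2,...
mismatched = np.repeat(np.arange(n_copies), n_rows)
```

The printed frame above keeps the duplicated index labels (0 and 1); appending `.reset_index(drop=True)` would renumber the rows 0 to 11 if that is preferred.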
    