Wide to long returns empty output - Python dataframe

盖世英雄少女心 2020-12-21 23:40

I have a dataframe that can be generated with the code below:

df = pd.DataFrame({'person_id' :[1,2,3],'date1': 
['12/31/2007','11/25/2009',

2 Answers
  • 2020-12-21 23:53

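    The issue is with your column names. pd.wide_to_long expects each column name to be a stub followed by the varying part, so the numbers need to be at the end of your column names, or you need to pass a suffix pattern that matches whatever follows the stub. I think the easiest solution is to write a function that takes the dataframe and a regex and moves the number to the end of each matching column name.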

    import pandas as pd
    import re
    
    def change_names(df, regex):
        # Select the group of columns whose names match the regex
        old_cols = df.filter(regex = regex).columns
        # Create list of new column names
        new_cols = []
        for col in old_cols:
            # Get the stubname of the original column
            stub = ''.join(re.split(r'\d', col))
            # Get the time point
            num = re.findall(r'\d+', col) # returns a list like ['1']
            # Make new column name
            new_col = stub + num[0]
            new_cols.append(new_col)
    
        # Create dictionary mapping old column names to new column names
        dd = {oc: nc for oc, nc in zip(old_cols, new_cols)}
        # Rename columns
        df.rename(columns = dd, inplace = True)
    
        return df
    
    
    tdf = pd.DataFrame({'person_id' :[1,2,3],'h1date': ['12/31/2007','11/25/2009','10/06/2005'],'t1val': [2,4,6],'h2date': ['12/31/2017','11/25/2019','10/06/2015'],'t2val':[1,3,5],'h3date': ['12/31/2027','11/25/2029','10/06/2025'],'t3val':[7,9,11]})
    
    # Move the number to the end of the date and val column names
    tdf = change_names(tdf, 'date$')
    tdf = change_names(tdf, 'val$')
    
    print(tdf)
        person_id      hdate1  tval1      hdate2  tval2      hdate3  tval3
    0          1  12/31/2007      2  12/31/2017      1  12/31/2027      7
    1          2  11/25/2009      4  11/25/2019      3  11/25/2029      9
    2          3  10/06/2005      6  10/06/2015      5  10/06/2025     11
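
Once the numbers sit at the end of the column names, the reshape itself might look like the sketch below. It rebuilds the renamed frame from the output above so that it runs on its own; the name time for the new column is an assumption chosen for illustration and does not come from the question.

```python
import pandas as pd

# The frame as it looks after change_names has moved the numbers to the end
tdf = pd.DataFrame({
    'person_id': [1, 2, 3],
    'hdate1': ['12/31/2007', '11/25/2009', '10/06/2005'],
    'tval1': [2, 4, 6],
    'hdate2': ['12/31/2017', '11/25/2019', '10/06/2015'],
    'tval2': [1, 3, 5],
    'hdate3': ['12/31/2027', '11/25/2029', '10/06/2025'],
    'tval3': [7, 9, 11],
})

# stubnames are the column prefixes; the trailing digits become the values
# of the new column named by j ('time' is a hypothetical name)
long_df = pd.wide_to_long(
    tdf,
    stubnames=['hdate', 'tval'],
    i='person_id',
    j='time',
).reset_index()

print(long_df)
```

The result has one row per person_id and time pair, with hdate and tval as ordinary columns.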
    
  • 2020-12-21 23:58

    Try passing the suffix argument to the function, which lets the suffix be a string instead of only digits.

    pd.wide_to_long(......................., suffix=r'\w+')
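
As a minimal sketch of how the suffix argument behaves, the example below uses a made-up frame whose column names end in words rather than digits. The column names, the sep value, and the name visit are all assumptions for illustration, since the question's full data is not shown.

```python
import pandas as pd

# Hypothetical frame: each column is a stub, an underscore, and a word suffix
df = pd.DataFrame({
    'person_id': [1, 2, 3],
    'date_first': ['12/31/2007', '11/25/2009', '10/06/2005'],
    'val_first': [2, 4, 6],
    'date_second': ['12/31/2017', '11/25/2019', '10/06/2015'],
    'val_second': [1, 3, 5],
})

# The default suffix pattern only matches digits, so a word suffix
# needs an explicit regex such as r'\w+'
long_df = pd.wide_to_long(
    df,
    stubnames=['date', 'val'],
    i='person_id',
    j='visit',
    sep='_',
    suffix=r'\w+',
).reset_index()

print(long_df)
```

Note that suffix only describes what comes after the stub (and separator), so it does not help when the number sits in the middle of the column name.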
    