Concatenate cells into a string with separator pandas python

暖寄归人 2021-02-14 16:53

Given the following:

df = pd.DataFrame({'col1' : ["a","b"],
            'col2'  : ["ab",np.nan], 'col3' : ["w","e"]})

I would l

5 answers
  • 2021-02-14 17:15
    In [1556]: df.apply(lambda x: '*'.join(x.dropna().astype(str).values), axis=1)
    Out[1556]: 
    0    a*ab*w
    1       b*e
    2     3*4*6
    3     ñ*ü*á
    dtype: object
    
  • 2021-02-14 17:30
    df.apply(lambda row: '*'.join(row.dropna()), axis=1)
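
One caveat, shown as a hedged sketch: `str.join` accepts only strings, so the one-liner above works for the all-string frame in the question but raises `TypeError` once a row holds a number. The frame below is an illustrative assumption, not the asker's data, and casting with `astype(str)` is one way around the error.

```python
import numpy as np
import pandas as pd

# Hypothetical frame: the second row mixes integers with a missing value.
df = pd.DataFrame({'col1': ["a", 3], 'col2': ["ab", np.nan], 'col3': ["w", 6]})

try:
    # Plain join: fails on the row that contains integers.
    df.apply(lambda row: '*'.join(row.dropna()), axis=1)
    failed = False
except TypeError:
    failed = True

# Casting the surviving cells to str before joining avoids the error.
fixed = df.apply(lambda row: '*'.join(row.dropna().astype(str)), axis=1)
print(failed, fixed.tolist())
```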
    
  • 2021-02-14 17:30
    In [68]:
    
    df['new_col'] = df.apply(lambda x: '*'.join(x.dropna().values.tolist()), axis=1)
    df
    Out[68]:
      col1 col2 col3 new_col
    0    a   ab    w  a*ab*w
    1    b  NaN    e     b*e
    

    UPDATE

    If the frame contains ints or floats, convert the values to str before joining:

    In [74]:
    
    df = pd.DataFrame({'col1' : ["a","b",3],
                'col2'  : ["ab",np.nan, 4], 'col3' : ["w","e", 6]})
    df
    Out[74]:
      col1 col2 col3
    0    a   ab    w
    1    b  NaN    e
    2    3    4    6
    In [76]:
    
    df['new_col'] = df.apply(lambda x: '*'.join(x.dropna().astype(str).values), axis=1)
    df
    Out[76]:
      col1 col2 col3 new_col
    0    a   ab    w  a*ab*w
    1    b  NaN    e     b*e
    2    3    4    6   3*4*6
    

    Another update

    In [81]:
    
    df = pd.DataFrame({'col1' : ["a","b",3,'ñ'],
                'col2'  : ["ab",np.nan, 4,'ü'], 'col3' : ["w","e", 6,'á']})
    df
    Out[81]:
      col1 col2 col3
    0    a   ab    w
    1    b  NaN    e
    2    3    4    6
    3    ñ    ü    á
    
    In [82]:
    
    df['new_col'] = df.apply(lambda x: '*'.join(x.dropna().astype(str).values), axis=1)
    df
    Out[82]:
      col1 col2 col3 new_col
    0    a   ab    w  a*ab*w
    1    b  NaN    e     b*e
    2    3    4    6   3*4*6
    3    ñ    ü    á   ñ*ü*á
    

    The same code still works with accented Spanish characters.

  • 2021-02-14 17:36

    You can use dropna()

    df['col4'] = df.apply(lambda row: '*'.join(row.dropna()), axis=1)
    

    UPDATE:

    Since you also need to convert numbers and special characters, you can use astype(unicode). Note that unicode is a Python 2 built-in; on Python 3 the equivalent is astype(str).

    In [37]: df = pd.DataFrame({'col1': ["a", "b"], 'col2': ["ab", np.nan], "col3": [3, u'\xf3']})
    
    In [38]: df.apply(lambda row: '*'.join(row.dropna().astype(unicode)), axis=1)
    Out[38]: 
    0    a*ab*3
    1       b*ó
    dtype: object
    
    In [39]: df['col4'] = df.apply(lambda row: '*'.join(row.dropna().astype(unicode)), axis=1)
    
    In [40]: df
    Out[40]: 
      col1 col2 col3    col4
    0    a   ab    3  a*ab*3
    1    b  NaN    ó     b*ó
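
The session above is Python 2, where `unicode` is a built-in type. A minimal sketch of the same steps on Python 3, assuming `str` as the target type (Python 3 strings are already Unicode) and a reasonably recent pandas:

```python
import numpy as np
import pandas as pd

# Same frame as above; '\xf3' is the character ó.
df = pd.DataFrame({'col1': ["a", "b"], 'col2': ["ab", np.nan], 'col3': [3, '\xf3']})

# Drop missing cells per row, cast the rest to str, join with '*'.
df['col4'] = df.apply(lambda row: '*'.join(row.dropna().astype(str)), axis=1)
print(df['col4'].tolist())
```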
    
  • 2021-02-14 17:38
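    # Python 3 and current pandas: range replaces xrange, .iloc replaces the removed .ix
    for row in range(len(df)):
        # astype(str) lets numeric cells pass through str.join
        s = '*'.join(df.iloc[row].dropna().astype(str).tolist())
        print(s)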
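
An explicit loop can also be written without positional indexing. The sketch below is an alternative, not this answer's method: it iterates with `itertuples` and filters missing values with `pd.notna`. The helper name `join_rows` and its `sep` parameter are hypothetical, chosen only for illustration.

```python
import numpy as np
import pandas as pd

def join_rows(frame, sep='*'):
    """Return one joined string per row, skipping missing cells."""
    return [
        sep.join(str(value) for value in row if pd.notna(value))
        for row in frame.itertuples(index=False)
    ]

df = pd.DataFrame({'col1': ["a", "b", 3],
                   'col2': ["ab", np.nan, 4],
                   'col3': ["w", "e", 6]})

joined = join_rows(df)
for s in joined:
    print(s)
```

Because it works on plain Python values rather than building a Series per row, this form needs no `astype` call; `str(value)` handles the conversion.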
    