Converting a column within pandas dataframe from int to string

南笙 2020-11-29 17:17

I have a pandas DataFrame with a mix of int and str columns. I want to first concatenate the columns within the DataFrame. To do that I have to convert an int

5 answers
  • 2020-11-29 17:39

    Change data type of DataFrame column:

    To int:

    df.column_name = df.column_name.astype(np.int64)

    To str:

    df.column_name = df.column_name.astype(str)
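
For the concatenation the question asks about, a minimal sketch might look like the following. The frame and the column names `id`, `name`, and `combined` are made up for illustration; only `astype(str)` comes from the answer above.

```python
import pandas as pd

# Hypothetical frame: "id" is an int column, "name" is a str column.
df = pd.DataFrame({"id": [1, 2, 3], "name": ["a", "b", "c"]})

# Cast the int column to str first, then concatenate element-wise with "+".
df["combined"] = df["id"].astype(str) + "_" + df["name"]

print(df["combined"].tolist())  # ['1_a', '2_b', '3_c']
```

Adding the int column to the str column without the cast would raise a TypeError, which is why the conversion has to come first.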

  • 2020-11-29 17:45
    In [16]: df = DataFrame(np.arange(10).reshape(5,2),columns=list('AB'))
    
    In [17]: df
    Out[17]: 
       A  B
    0  0  1
    1  2  3
    2  4  5
    3  6  7
    4  8  9
    
    In [18]: df.dtypes
    Out[18]: 
    A    int64
    B    int64
    dtype: object
    

    Convert a series

    In [19]: df['A'].apply(str)
    Out[19]: 
    0    0
    1    2
    2    4
    3    6
    4    8
    Name: A, dtype: object
    
    In [20]: df['A'].apply(str)[0]
    Out[20]: '0'
    

    Don't forget to assign the result back:

    df['A'] = df['A'].apply(str)
    

    Convert the whole frame

    In [21]: df.applymap(str)
    Out[21]: 
       A  B
    0  0  1
    1  2  3
    2  4  5
    3  6  7
    4  8  9
    
    In [22]: df.applymap(str).iloc[0,0]
    Out[22]: '0'
    

    df = df.applymap(str)
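
As an alternative sketch for the whole-frame case, `DataFrame.astype(str)` casts every column in one call. This may be worth considering because recent pandas releases deprecate `applymap` in favor of `DataFrame.map`; which of those two names is available depends on the installed version, so the example below avoids both.

```python
import numpy as np
import pandas as pd

# Same frame as in the session above.
df = pd.DataFrame(np.arange(10).reshape(5, 2), columns=list("AB"))

# Cast every column to str in one call and assign the result back.
df = df.astype(str)

first = df.iloc[0, 0]
print(repr(first))  # '0'
```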
    
  • 2020-11-29 17:45

    Use the following code:

    df.column_name = df.column_name.astype('str')
    
  • 2020-11-29 17:48

    Warning: Neither of the solutions given (astype() and apply()) preserves NULL values, whether in the nan or the None form.

    import pandas as pd
    import numpy as np
    
    df = pd.DataFrame([None,'string',np.nan,42], index=[0,1,2,3], columns=['A'])
    
    df1 = df['A'].astype(str)
    df2 =  df['A'].apply(str)
    
    print(df.isnull())
    print(df1.isnull())
    print(df2.isnull())
    

    I believe this is fixed by the implementation of to_string()
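
One way to keep the missing entries, sketched here under the assumption that only non-null values should become strings, is to convert element by element and skip anything `pd.isna` reports as missing. The helper name `to_str_keep_null` is hypothetical and not part of pandas.

```python
import numpy as np
import pandas as pd

# Same data as in the warning above.
df = pd.DataFrame([None, "string", np.nan, 42], index=[0, 1, 2, 3], columns=["A"])

def to_str_keep_null(value):
    # Hypothetical helper: leave None/NaN untouched, stringify everything else.
    return value if pd.isna(value) else str(value)

converted = df["A"].map(to_str_keep_null)

print(df["A"].isnull().tolist())    # [True, False, True, False]
print(converted.isnull().tolist())  # [True, False, True, False]
```

The null mask is the same before and after the conversion, while 42 becomes the string '42'.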

  • 2020-11-29 18:04

    Just for an additional reference.

    All of the above answers work on a DataFrame or a Series. But if you create or modify a column with a lambda or a row-wise function passed to apply, they won't work, because inside the function the value is a plain int rather than a pandas Series. There you have to use str(target_attribute) to turn it into a string. See the example below.

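    def add_zero_in_prefix(row):
        # With axis=1, apply passes one row at a time, so row['Hour'] is a scalar.
        if row['Hour'] < 10:
            return '0' + str(row['Hour'])
        # Without this fallback, hours of 10 and above would come back as None.
        return str(row['Hour'])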
    
    data['str_hr'] = data.apply(add_zero_in_prefix, axis=1)
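
If the goal is only zero-padding, a vectorized sketch avoids the row-wise function entirely by casting the column to str and padding with `Series.str.zfill`. The sample `data` frame below is made up; only the `Hour` and `str_hr` column names come from the answer.

```python
import pandas as pd

# Hypothetical data with the 'Hour' column used in the answer above.
data = pd.DataFrame({"Hour": [5, 12, 0, 23]})

# Cast to str, then left-pad each value with zeros to a width of 2.
data["str_hr"] = data["Hour"].astype(str).str.zfill(2)

print(data["str_hr"].tolist())  # ['05', '12', '00', '23']
```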
    