Change column type in pandas

萌比男神i 2020-11-21 04:15

I want to convert a table, represented as a list of lists, into a Pandas DataFrame. As an extremely simplified example:

a = [['a', '1.2', '


        
9 Answers
  • 2020-11-21 05:10

    Here is a function that takes as its arguments a DataFrame and a list of columns and coerces all data in the columns to numbers.

    # df is the DataFrame; column_list is a list of column names as strings (e.g. ["col1", "col2", "col3"])
    # Requires pandas imported as pd. Modifies df in place; values that cannot be parsed become NaN.
    
    def coerce_df_columns_to_numeric(df, column_list):
        df[column_list] = df[column_list].apply(pd.to_numeric, errors='coerce')
    

    So, for your example:

    import pandas as pd
    
    def coerce_df_columns_to_numeric(df, column_list):
        df[column_list] = df[column_list].apply(pd.to_numeric, errors='coerce')
    
    a = [['a', '1.2', '4.2'], ['b', '70', '0.03'], ['x', '5', '0']]
    df = pd.DataFrame(a, columns=['col1','col2','col3'])
    
    coerce_df_columns_to_numeric(df, ['col2','col3'])
    
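To make the effect of `errors='coerce'` visible, here is a small sketch of the same conversion. The `'n/a'` entry is a hypothetical value added only for illustration (it is not in the question's data), and the names `cols` and `n_missing` are likewise just for this example.

```python
import pandas as pd

# Hypothetical sample data: 'n/a' cannot be parsed as a number.
df = pd.DataFrame(
    [['a', '1.2', '4.2'], ['b', 'n/a', '0.03'], ['x', '5', '0']],
    columns=['col1', 'col2', 'col3'],
)

cols = ['col2', 'col3']
# errors='coerce' replaces unparseable values with NaN instead of raising.
df[cols] = df[cols].apply(pd.to_numeric, errors='coerce')

# Count how many values in col2 could not be converted.
n_missing = int(df['col2'].isna().sum())
print(df.dtypes)
print(n_missing)
```

Checking the NaN count after the conversion is a cheap way to notice values that were silently dropped.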
  • 2020-11-21 05:10

    I thought I had the same problem, but my case differs slightly in a way that makes it easier to solve. For others looking at this question, it's worth checking the format of your input list. In my case the numbers start out as actual numbers, not strings as in the question:

    a = [['a', 1.2, 4.2], ['b', 70, 0.03], ['x', 5, 0]]
    

    But by processing the list too much before creating the DataFrame, I lost the types and everything became a string.

    Creating the DataFrame via a NumPy array

    import numpy as np
    import pandas as pd

    df = pd.DataFrame(np.array(a))
    
    df
    Out[5]: 
       0    1     2
    0  a  1.2   4.2
    1  b   70  0.03
    2  x    5     0
    
    df[1].dtype
    Out[7]: dtype('O')
    

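    gives the same data frame as in the question, where the entries in columns 1 and 2 are treated as strings. This happens because a NumPy array holds a single dtype, so a list that mixes strings and numbers is converted entirely to strings. However, doing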

    df = pd.DataFrame(a)
    
    df
    Out[10]: 
       0     1     2
    0  a   1.2  4.20
    1  b  70.0  0.03
    2  x   5.0  0.00
    
    df[1].dtype
    Out[11]: dtype('float64')
    

    does actually give a data frame with the columns in the correct format.
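The difference between the two constructions can be sketched in a few lines. The variable names (`arr`, `df_direct`, `df_from_array`) are assumptions chosen for this example, and the last step, converting back with `pd.to_numeric`, is one possible fix if the strings have already been created, not part of the answer above.

```python
import numpy as np
import pandas as pd

a = [['a', 1.2, 4.2], ['b', 70, 0.03], ['x', 5, 0]]

# A NumPy array holds a single dtype, so mixing str and numbers
# turns every element into a string.
arr = np.array(a)
arr_is_string = arr.dtype.kind == 'U'

# Building from the list directly lets pandas infer a dtype per column.
df_direct = pd.DataFrame(a)

# If the strings are already there, convert the numeric columns back.
df_from_array = pd.DataFrame(arr)
df_from_array[[1, 2]] = df_from_array[[1, 2]].apply(pd.to_numeric)

print(arr_is_string)
print(df_direct.dtypes)
print(df_from_array.dtypes)
```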

  • 2020-11-21 05:15

    The code below changes the data type of the selected columns.

    df[['col.name1', 'col.name2', ...]] = df[['col.name1', 'col.name2', ...]].astype('data_type')
    

    Replace 'data_type' with the type you want, such as str, float, or int.
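As a concrete sketch of the `astype` pattern, here it is applied to the question's example data as given in the first answer. The column names `col1` to `col3` come from that answer; `df2` and `failed` are names introduced only for this illustration. Unlike `pd.to_numeric(..., errors='coerce')`, `astype(float)` raises a `ValueError` when a string cannot be parsed.

```python
import pandas as pd

a = [['a', '1.2', '4.2'], ['b', '70', '0.03'], ['x', '5', '0']]
df = pd.DataFrame(a, columns=['col1', 'col2', 'col3'])

# Cast both string columns to float in one assignment.
df[['col2', 'col3']] = df[['col2', 'col3']].astype(float)

# astype also accepts a {column: dtype} mapping and returns a new DataFrame.
df2 = pd.DataFrame(a, columns=['col1', 'col2', 'col3']).astype(
    {'col2': float, 'col3': float}
)

# Non-numeric strings make astype(float) fail rather than produce NaN.
try:
    pd.Series(['a', '1.2']).astype(float)
    failed = False
except ValueError:
    failed = True

print(df.dtypes)
print(failed)
```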
