Exception Handling in Pandas .apply() function

忘了有多久 2021-02-18 16:15

If I have a DataFrame:

myDF = DataFrame(data=[[11,11],[22,'2A'],[33,33]], columns = ['A','B'])

Gives the following dataframe (Starting ou

3 Answers
  • 2021-02-18 16:32

    I had the same question, but for a more general case where it was hard to tell if the function would generate an exception (i.e. you couldn't explicitly check this condition with something as straightforward as isdigit).

    After thinking about it for a while, I came up with the solution of embedding the try/except syntax in a separate function. I'm posting a toy example in case it helps anyone.

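    import pandas as pd
    import numpy as np
    
    # np.array stores every element as a string, so column 0 holds 'a' and '1'
    df = pd.DataFrame(np.array([['a','a'], [1,2]]))
    
    def augment(value):
        try:
            return int(value) + 1
        except (ValueError, TypeError):
            # Conversion failed: return a marker instead of raising
            return 'error:' + str(value)
    
    df[0].apply(augment)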
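
The same idea can be made reusable by wrapping any function in a try/except helper. This is only a minimal sketch: `safe`, `default`, and `exceptions` are hypothetical names chosen for illustration and are not part of pandas.

```python
import numpy as np
import pandas as pd


def safe(func, default=np.nan, exceptions=(ValueError, TypeError)):
    """Return a wrapper around func that yields default when func raises."""
    def wrapper(value):
        try:
            return func(value)
        except exceptions:
            # Swallow only the listed exception types; anything else propagates
            return default
    return wrapper


s = pd.Series(["11", "2A", "33"])
converted = s.apply(safe(int))  # "2A" cannot be parsed, so it becomes NaN
print(converted)
```

Listing the exception types explicitly, rather than catching everything, keeps unrelated bugs inside `func` visible.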
    
  • 2021-02-18 16:37

    A way to achieve that with lambda:

    myDF['B'].apply(lambda x: int(x) if str(x).isdigit() else None)
    

    For your input:

    >>> myDF
        A   B
    0  11  11
    1  22  2A
    2  33  33
    
    [3 rows x 2 columns]
    

    >>> myDF['B'].apply(lambda x: int(x) if str(x).isdigit() else None)
    0    11
    1   NaN
    2    33
    Name: B, dtype: float64
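
One caveat with the `isdigit` check is that it does not always agree with what `int()` accepts. The sketch below compares the two on a few strings; `to_int_or_none` is a hypothetical helper name used only for this illustration.

```python
def to_int_or_none(value):
    """Attempt the conversion itself instead of predicting whether it will work."""
    try:
        return int(value)
    except (ValueError, TypeError):
        return None


# Negative numbers and padded strings fail isdigit() even though int() parses them,
# while a superscript digit passes isdigit() but makes int() raise.
samples = ["33", "-5", " 7 ", "²", "2A"]
for text in samples:
    print(repr(text), text.isdigit(), to_int_or_none(text))
```

If the column may contain signed or padded values, passing a helper like this to `.apply()` may be the safer choice.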
    
  • 2021-02-18 16:52

    It is much better and faster to do this:

    In [1]: myDF = DataFrame(data=[[11,11],[22,'2A'],[33,33]], columns = ['A','B'])
    
    In [2]: myDF.convert_objects(convert_numeric=True)
    Out[2]: 
        A   B
    0  11  11
    1  22 NaN
    2  33  33
    
    [3 rows x 2 columns]
    
    In [3]: myDF.convert_objects(convert_numeric=True).dtypes
    Out[3]: 
    A      int64
    B    float64
    dtype: object
    

    This is a vectorized way of doing the conversion. The convert_numeric=True flag coerces the values: anything that cannot be converted to a number is marked as NaN.

    You can of course do this to a single column if you'd like.
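
Note that `convert_objects` is no longer available in recent pandas releases. A rough equivalent for a single column, assuming a current pandas version, is `pd.to_numeric` with `errors="coerce"`:

```python
import pandas as pd

myDF = pd.DataFrame(data=[[11, 11], [22, "2A"], [33, 33]], columns=["A", "B"])

# Convert one column; values that cannot be parsed as numbers become NaN
myDF["B"] = pd.to_numeric(myDF["B"], errors="coerce")

print(myDF)
print(myDF.dtypes)
```

Because the column now contains NaN, it is stored as float64, which matches the dtype shown in the output above.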
