Removing b'' from string column in a pandas dataframe

一向 2021-01-18 03:21

I have a data frame taken from the SDSS database. Example data is here.

I want to remove the character 'b' from data['class']. I tried

2 answers
  • 2021-01-18 03:41

    Further explanation:

    import pandas as pd
    df = pd.DataFrame([b'123'])  # create a DataFrame holding a single bytes element
    

    Now we can call

    df[0].str.decode('utf-8')  # returns a pd.Series with decode applied to each element
    df[0].decode('utf-8')      # raises AttributeError: a Series itself has no decode method
    

    Basically, the .str accessor (an attribute, not a call) applies the method to every element of the Series. It could also be written like this:

    df[0].apply(lambda x: x.decode('utf-8')) 
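
To make the comparison concrete, here is a minimal sketch that runs the element-wise variants side by side and checks that they agree. The sample values (GALAXY, STAR, QSO) are assumed for illustration and are not taken from the question's data.

```python
import pandas as pd

# Hypothetical sample values; the real data['class'] column may differ.
s = pd.Series([b"GALAXY", b"STAR", b"QSO"])

# 1) Vectorised string accessor: decode is applied to each element.
via_accessor = s.str.decode("utf-8")

# 2) Explicit element-wise apply, equivalent for an all-bytes column.
via_apply = s.apply(lambda x: x.decode("utf-8"))

# 3) Plain Python comprehension, rebuilt into a Series on the same index.
via_comprehension = pd.Series([x.decode("utf-8") for x in s], index=s.index)

same = via_accessor.tolist() == via_apply.tolist() == via_comprehension.tolist()

# Calling decode on the Series itself fails: the method lives on the
# bytes elements (or on the .str accessor), not on the Series object.
try:
    s.decode("utf-8")
    series_has_decode = True
except AttributeError:
    series_has_decode = False

print(via_accessor.tolist(), same, series_has_decode)
```

For a column that contains only bytes objects, the three forms give the same values, so the choice is mostly about readability.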
    
  • 2021-01-18 03:56

    You're working with byte strings. You might consider str.decode:

    data['class'] = data['class'].str.decode('utf-8') 
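
If more than one column arrives as byte strings, the same call can be repeated per column. The sketch below is only an illustration: `decode_bytes_columns` is a hypothetical helper, not a pandas function, and the `redshift` column and all values are assumed.

```python
import pandas as pd


def decode_bytes_columns(df, encoding="utf-8"):
    """Return a copy of df with every all-bytes object column decoded to str.

    Hypothetical helper for illustration; it is not part of pandas.
    """
    out = df.copy()
    for col in out.columns:
        values = out[col]
        # Only touch object columns in which every element is a bytes object.
        if values.dtype == object and values.map(lambda v: isinstance(v, bytes)).all():
            out[col] = values.str.decode(encoding)
    return out


# Assumed example frame; only the 'class' column name comes from the question.
data = pd.DataFrame({"class": [b"GALAXY", b"STAR"], "redshift": [0.12, 0.0003]})
clean = decode_bytes_columns(data)

print(clean["class"].tolist())
```

Working on a copy leaves the original frame untouched, which makes it easy to compare the before and after values while debugging.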
    