I have a data frame as taken from SDSS database. Example data is here.
I want to remove the character \'b\' from data[\'class\']
. I tried
Further explanation:
df = pd.DataFrame([b'123']) # create dataframe with b'' element
Now we can call
df[0].str.decode('utf-8') # returns a pd.series applying decode on str succesfully
df[0].decode('utf-8') # tries to decode the series and throws an error
Basically what you are doing with .str() is applying it for all elements. It could also be written like this:
df[0].apply(lambda x: x.decode('utf-8'))
You're working with byte strings. You might consider str.decode
:
data['class'] = data['class'].str.decode('utf-8')