I read data from a .csv file to a Pandas dataframe as below. For one of the columns, namely id, I want to specify the column type as int. The probl
It is now possible to create a pandas column containing NaNs with an integer dtype, since nullable integer support was officially added in pandas 0.24.0.
The pandas 0.24.x release notes say: "Pandas has gained the ability to hold integer dtypes with missing values."
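As a rough sketch of what that looks like in practice (assuming pandas 0.24 or later; the inline CSV text and the name column are made up for illustration and stand in for a real file), the nullable dtype can be requested directly when reading the data:

```python
import io
import pandas as pd

# Hypothetical CSV content standing in for data.csv; the second row has no id.
csv_text = "id,name\n1,a\n,b\n3,c\n"

# "Int64" (capital I) names the nullable integer extension dtype,
# as opposed to NumPy's "int64", which cannot hold missing values.
df = pd.read_csv(io.StringIO(csv_text), dtype={"id": "Int64"})

print(df["id"].dtype)         # Int64
print(df["id"].isna().sum())  # 1
```

Passing the dtype at read time avoids the intermediate float64 column that pandas would otherwise infer for an integer column with gaps.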
import pandas as pd
df = pd.read_csv("data.csv")
df['id'] = pd.to_numeric(df['id'])
My use case is munging data prior to loading into a DB table:
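import numpy as np

df[col] = df[col].fillna(-1)             # sentinel; assumes -1 never occurs in the data
df[col] = df[col].astype(int)            # safe now that no NaNs remain
df[col] = df[col].astype(str)            # '1.0' would otherwise reach the DB instead of '1'
df[col] = df[col].replace('-1', np.nan)  # turn the sentinel back into a missing value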
Remove NaNs, convert to int, convert to str, and then reinsert NaNs.
It's not pretty but it gets the job done!
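For reuse, the round trip above could be wrapped in a small helper. This is only a sketch: the function name ints_as_str_keep_nan and the sentinel check are my own additions, not part of the original answer. The check guards against the one case where the trick silently corrupts data, namely when the sentinel is already a real value in the column.

```python
import numpy as np
import pandas as pd

def ints_as_str_keep_nan(s, sentinel=-1):
    """Hypothetical helper: integer-looking strings, with NaNs preserved."""
    # Refuse to run if the sentinel is a genuine value; otherwise real
    # entries equal to the sentinel would be turned into NaN at the end.
    if (s == sentinel).any():
        raise ValueError("sentinel value already present in the column")
    out = s.fillna(sentinel).astype(int).astype(str)
    return out.replace(str(sentinel), np.nan)

# Made-up column: floats because of the missing value.
col = pd.Series([1.0, np.nan, 3.0])
result = ints_as_str_keep_nan(col)
print(result.tolist())
```

The result holds the strings '1' and '3' with a missing value between them, which is the shape the DB load in this use case expects.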
Use pd.to_numeric():
df["DateColumn"] = pd.to_numeric(df["DateColumn"])
Simple and clean.
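One caveat worth sketching, assuming the column contains missing values: pd.to_numeric() on its own then yields float64 rather than an integer dtype, so a further cast is needed to get integers. The sample Series below is made up for illustration.

```python
import pandas as pd

# Made-up raw column as it might come out of a CSV: strings plus a gap.
raw = pd.Series(["1", None, "3"], dtype=object)

# to_numeric parses the strings, but the missing value forces float64.
as_float = pd.to_numeric(raw)
print(as_float.dtype)  # float64

# Casting to the nullable "Int64" dtype (pandas 0.24+) gives integers back.
as_int = as_float.astype("Int64")
print(as_int.dtype)    # Int64
```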
Try this:
df[['id']] = df[['id']].astype(pd.Int64Dtype())
If you print its dtypes, you will get id Int64 instead of the normal int64.
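To see the difference for yourself, here is a small sketch on a made-up frame (assuming pandas 0.24 or later). It uses the single-column form of the cast, which should be equivalent for one column:

```python
import numpy as np
import pandas as pd

# Made-up frame; the NaN forces the column to start out as float64.
df = pd.DataFrame({"id": [1.0, np.nan, 3.0]})
print(df["id"].dtype)  # float64

# Single-column form of the cast shown above.
df["id"] = df["id"].astype(pd.Int64Dtype())
print(df["id"].dtype)  # Int64

# The missing entry survives the cast instead of raising an error.
print(df["id"].isna().tolist())  # [False, True, False]
```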