Is it possible to read in a CSV as a pandas DataFrame and set spaces (or empty cells) to 0 in one line? Below is an illustration of the problem.
Input:
df.replace(r'\s+', 0, regex=True)
a b c
0 0.0 a 0.0
1 0.0 b 1.0
2 1.5 c 2.5
3 2.1 d 3.0
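This is almost a one-liner, but it might not work on real data: because the replacement value is not a string, any cell that contains whitespace anywhere (for example, a name with a space in it) is replaced by 0 as a whole, not just the blank cells.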
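A tighter variant of the pattern above anchors the regex so that only empty or whitespace-only cells are replaced. This is a minimal sketch, not part of the original answer: the sample frame and its values are made up for illustration, and dtype=object is assumed so the columns can hold both strings and the integer 0.

```python
import pandas as pd

# Hypothetical sample data: one blank cell, one empty cell,
# and one cell with a space inside real text.
df = pd.DataFrame(
    {"a": [" ", "0.0", "1.5"], "b": ["x y", "b", ""]},
    dtype=object,
)

# ^\s*$ matches only strings that are empty or consist entirely of whitespace,
# so "x y" is left alone.
cleaned = df.replace(r"^\s*$", 0, regex=True)
print(cleaned)
```

Note that the untouched values in column a are still strings here, so a numeric conversion (for instance with pd.to_numeric) would still be needed afterwards.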
You can tell read_csv which strings to treat as missing (NaN) with the na_values parameter:
import pandas as pd
df = pd.read_csv('data.csv', na_values=" ")
yielding
a b c
0 NaN a 0.0
1 0.0 b 1.0
2 1.5 c 2.5
3 2.1 d 3.0
Then you can call fillna to change the NaNs to 0.
Hence, the following line does it all:
df = pd.read_csv('data.csv', na_values=" ").fillna(0)
gives
a b c
0 0.0 a 0.0
1 0.0 b 1.0
2 1.5 c 2.5
3 2.1 d 3.0
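To try this without a file on disk, the same call can be fed an in-memory buffer. This is only a sketch: the csv_text string is a hypothetical stand-in for data.csv, assuming comma-separated fields and a single space in the first cell of column a.

```python
import io

import pandas as pd

# Hypothetical contents of data.csv; the first value of column "a" is one space.
csv_text = "a,b,c\n ,a,0.0\n0.0,b,1.0\n1.5,c,2.5\n2.1,d,3.0\n"

# Treat a single space as missing, then fill every missing value with 0.
df = pd.read_csv(io.StringIO(csv_text), na_values=" ").fillna(0)
print(df)
```

na_values entries are matched as exact strings, so a cell holding two or more spaces would need its own entry in a list, such as na_values=[" ", "  "].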
Pandas automatically reads empty values as NaN, so from there you can just fill them with the fillna method, passing the desired new value (in this case 0).
import pandas as pd
df = pd.read_csv('data.csv').fillna(value = 0)
Which yields:
a b c
0 0.0 a 0.0
1 0.0 b 1.0
2 1.5 c 2.5
3 2.1 d 3.0
You can also set a different fill value for each column by passing a dict. Suppose the CSV file loads as the following DataFrame:
a b c
0 NaN a 0.0
1 0.0 b 1.0
2 1.5 NaN 2.5
3 2.1 d NaN
To get the same result as before, we should do:
pd.read_csv('data.csv').fillna(value={'a': 0, 'b': 'c', 'c': 3})
Yielding again:
a b c
0 0.0 a 0.0
1 0.0 b 1.0
2 1.5 c 2.5
3 2.1 d 3.0
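The per-column fill can be checked end to end with an in-memory CSV. This is a sketch under assumptions: csv_text is a hypothetical comma-separated version of the file, with empty fields where the NaN values appear above.

```python
import io

import pandas as pd

# Hypothetical CSV text with one empty field in each column.
csv_text = "a,b,c\n,a,0.0\n0.0,b,1.0\n1.5,,2.5\n2.1,d,\n"

# Each dict key is a column name; its value fills that column's missing cells.
fill_values = {"a": 0, "b": "c", "c": 3}
df = pd.read_csv(io.StringIO(csv_text)).fillna(value=fill_values)
print(df)
```

Columns that are not listed in the dict are left unfilled, so any NaN in them would remain.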