I am trying to create a new column based on both columns. Say I want to create a new column z that takes the value of y when it is not missing and the value of x when y is missing.
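For anyone who wants to reproduce the answers, here is a minimal sketch of the example frame. The values are reconstructed from the outputs printed in the answers below, so treat them as an assumption about the asker's data rather than the original data itself.

```python
import numpy as np
import pandas as pd

# Example data reconstructed from the outputs shown in the answers:
# x is always present, y is missing in the first and last rows.
df = pd.DataFrame({
    "x": [1, 2, 4, 8],
    "y": [np.nan, 8, 10, np.nan],
})
print(df)
```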
Let's say the DataFrame is called df. First copy the y column.
df["z"] = df["y"].copy()
Then fill the NaN locations of z with the values of x at those same locations.
import numpy as np
# Select the rows once with .loc to avoid chained assignment
mask = np.isnan(df["z"])
df.loc[mask, "z"] = df.loc[mask, "x"]
>>> df
   x    y   z
0  1  NaN   1
1  2    8   8
2  4   10  10
3  8  NaN   8
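I'm not sure if I understand the question, but would this be what you're looking for?
The check "y[i] is not None" falls back to x[i] when the value is None. A plain "if y[i]" would also skip 0 and would not catch NaN.
z = []
for i in range(len(x)):
    # Prefer y[i]; use x[i] only when y[i] is missing
    if y[i] is not None:
        z.append(y[i])
    else:
        z.append(x[i])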
The update method does almost exactly this. The only caveat is that update works in place, so you must first create a copy:
# Work on a standalone copy so the in-place update is not a chained call on df
z = df['x'].copy()
z.update(df['y'])
df['z'] = z
In the above example you start with x and replace each value with the corresponding value from y, as long as the new value is not NaN.
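To see the NaN-skipping behaviour in isolation, here is a small sketch on two standalone Series. The names x, y and z and the float values are assumptions chosen to mirror the example; floats are used so the dtype of the copy does not change during the update.

```python
import numpy as np
import pandas as pd

# Hypothetical standalone Series mirroring the example columns.
x = pd.Series([1.0, 2.0, 4.0, 8.0])
y = pd.Series([np.nan, 8.0, 10.0, np.nan])

z = x.copy()   # update modifies its target in place, so work on a copy
z.update(y)    # non-NaN entries of y overwrite z; NaN entries are skipped

print(z.tolist())
print(x.tolist())  # the original x is left untouched
```

Rows where y is NaN keep the value copied from x, which is the fallback the question asks for.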
Use np.where:
In [3]:
df['z'] = np.where(df['y'].isnull(), df['x'], df['y'])
df
Out[3]:
   x    y   z
0  1  NaN   1
1  2    8   8
2  4   10  10
3  8  NaN   8
Here np.where evaluates the boolean condition element by element and returns the value from df['x'] where it is true, otherwise the value from df['y'].
The new column 'z' gets its values from column 'y' using df['z'] = df['y']. This brings over the missing values, so fill them in with fillna using column 'x'. Chain these two actions:
>>> df['z'] = df['y'].fillna(df['x'])
>>> df
   x    y   z
0  1  NaN   1
1  2    8   8
2  4   10  10
3  8  NaN   8
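As a quick sanity check, the sketch below compares fillna with the np.where approach from the earlier answer, and with Series.combine_first, which is a different pandas method not mentioned in this thread. The frame is rebuilt from the printed output, so the data is an assumption.

```python
import numpy as np
import pandas as pd

# Example frame reconstructed from the printed output (assumed data).
df = pd.DataFrame({"x": [1, 2, 4, 8], "y": [np.nan, 8, 10, np.nan]})

# Three ways of taking y and falling back to x where y is missing.
z_fillna = df["y"].fillna(df["x"])
z_where = np.where(df["y"].isnull(), df["x"], df["y"])
z_combine = df["y"].combine_first(df["x"])

# Compare the plain values of all three results.
same = z_fillna.tolist() == z_where.tolist() == z_combine.tolist()
print(same)
```

On this data all three produce the same values, so the choice is mostly a matter of readability.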
You can use apply with the option axis=1. Then your solution is pretty concise.
import pandas as pd
df['z'] = df.apply(lambda row: row.y if pd.notnull(row.y) else row.x, axis=1)
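One thing worth keeping in mind: apply with axis=1 calls the lambda once per row in Python, so on large frames it is typically slower than the vectorized answers. A minimal sketch, assuming the same example data as above, that checks the row-wise result against fillna:

```python
import numpy as np
import pandas as pd

# Example frame reconstructed from the printed output (assumed data).
df = pd.DataFrame({"x": [1, 2, 4, 8], "y": [np.nan, 8, 10, np.nan]})

# Row-wise version: the lambda runs once per row.
z_apply = df.apply(
    lambda row: row["y"] if pd.notnull(row["y"]) else row["x"], axis=1
)

# Vectorized version for comparison.
z_vectorized = df["y"].fillna(df["x"])

matches = z_apply.tolist() == z_vectorized.tolist()
print(matches)
```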