Question
I have a DataFrame with three columns, "x", "y" and "z":
x y z
bn 12452 221
mb 14521 330
pl 12563 160
lo 22516 142
I need to create another column, derived by this formula:
m = z / (y + z)
So the new DataFrame should look something like this:
x y z m
bn 12452 221 .01743
mb 14521 330 .02222
pl 12563 160 .01257
lo 22516 142 .00626
Answer 1:
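# sqlContext is the SQLContext provided by the pyspark shell; on Spark 2.0+ the SparkSession `spark` offers the same createDataFrame method
df = sqlContext.createDataFrame([('bn', 12452, 221), ('mb', 14521, 330)], ['x', 'y', 'z'])
# Parenthesize the sum so that z is divided by (y + z) as a whole
df = df.withColumn('m', df['z'] / (df['y'] + df['z']))
df.head(2)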
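As a quick sanity check that does not need Spark, the same arithmetic can be sketched in plain Python against the four rows from the question. The helper names `ratio` and `truncate` are illustrative only and are not part of the original answer. Truncating to five decimals is an assumption, made because the sample values in the question look truncated rather than rounded.

```python
import math

# Rows copied from the question: (x, y, z)
rows = [
    ("bn", 12452, 221),
    ("mb", 14521, 330),
    ("pl", 12563, 160),
    ("lo", 22516, 142),
]


def ratio(y, z):
    """Return z / (y + z); the parentheses make the sum evaluate first."""
    return z / (y + z)


def truncate(value, places=5):
    """Drop digits beyond `places` decimals without rounding (assumed display rule)."""
    factor = 10 ** places
    return math.floor(value * factor) / factor


# Map each x label to its truncated m value
m_values = {x: truncate(ratio(y, z)) for x, y, z in rows}

# Without parentheses, Python parses the expression as (z / y) + z
unparenthesized = 221 / 12452 + 221

for x, m in m_values.items():
    print(x, m)
```

The unparenthesized form yields a value just above 221 for the first row, which is why the sum needs its own parentheses in both the formula and the `withColumn` expression.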
Source: https://stackoverflow.com/questions/40728017/how-to-do-mathematical-operation-with-two-column-in-dataframe-using-pyspark