I am new to Spark and PySpark.
I am reading a small CSV file (~40k) into a DataFrame.
from pyspark.sql import functions as F
df = sqlContext.read.forma
I assume you are running Spark 2.x or later. The code below should create your DataFrame from the CSV file:
df = spark.read.format("csv").option("header", "true").load("csvfile.csv")
Then you can recode the verified column to 1/0 with a conditional expression:
df = df.withColumn('verified', F.when(df['verified'] == 'Y', 1).otherwise(0))
After that, you can create df2 directly from df, without using Row and toDF().
Let me know if this works, or if you are using Spark 1.6. Thanks.