I am new to spark and was playing around with Pyspark.sql. According to the pyspark.sql documentation here, one can go about setting the Spark dataframe and schema like this:
It means if the column allows null values, true
for nullable, and false
for not nullable
StructField(name, dataType, nullable): Represents a field in a StructType. The name of a field is indicated by name. The data type of a field is indicated by dataType. nullable is used to indicate if values of this fields can have null values.
Refer to Spark SQL and DataFrame Guide for more informations.