Syntax while setting schema for Pyspark.sql using StructType

日久生厌 2021-01-31 17:47

I am new to Spark and have been playing around with pyspark.sql. According to the pyspark.sql documentation, one can set up a Spark DataFrame and its schema like this:

2 Answers
  • 2021-01-31 18:08

    It indicates whether the column allows null values: True for nullable, False for not nullable.

    StructField(name, dataType, nullable): Represents a field in a StructType. The name of the field is given by name, and its data type by dataType. nullable indicates whether values of this field can be null.

    Refer to the Spark SQL and DataFrame Guide for more information.

  • 2021-01-31 18:29

    You can also use a datatype string:

    schema = 'Name STRING, DateTime TIMESTAMP, Age INTEGER'
    

    There's not much documentation on datatype strings, but the docs do mention them. They're much more compact and readable than StructTypes.
