Creating a Pyspark Schema involving an ArrayType

挽巷 2021-02-14 06:06

I'm trying to create a schema for my new DataFrame. I have tried various combinations of brackets and keywords but have been unable to figure out how to make this work. My cu

1 answer
  •  梦如初夏
    2021-02-14 06:16

    The ArrayType column needs its own StructField, with the array's element type given as a nested StructType. This should work:

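    from pyspark.sql.types import (
        ArrayType, DoubleType, IntegerType, StringType, StructField, StructType,
    )

    schema = StructType([
        StructField("User", IntegerType()),
        # The array column is a StructField whose data type is ArrayType(...)
        StructField("My_array", ArrayType(
            # Each array element is a struct with three named fields
            StructType([
                StructField("user", StringType()),
                StructField("product", StringType()),
                StructField("rating", DoubleType()),
            ])
        )),
    ])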
    

    For more information, see: http://nadbordrozd.github.io/blog/2016/05/22/one-weird-trick-that-will-fix-your-pyspark-schemas/
