pyspark: TypeError: IntegerType can not accept object in type

爱一瞬间的悲伤 2021-02-20 17:11

I am programming with PySpark on a Spark cluster. The data is large and split into pieces, so it cannot be loaded into memory and its sanity cannot easily be checked.

basically it loo

2 answers
  •  北海茫月
    2021-02-20 17:52

    With Apache Spark 2.0 you can let Spark infer the schema of your data. Either way, you will still need to cast the values in your parser function, as argued above:

    "When schema is None, it will try to infer the schema (column names and types) from data, which should be an RDD of Row, or namedtuple, or dict."
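The casting step mentioned above could look roughly like the sketch below. It is only an illustration: the three-column `id,name,count` layout and the helper names `to_int` and `parse_line` are assumptions, not taken from the question. The idea is that every field destined for an integer column is converted to a Python `int` (or `None`) before Spark sees it, so a raw string never reaches an `IntegerType` field.

```python
def to_int(value, default=None):
    """Cast one raw field to int; return `default` if it is missing or malformed."""
    try:
        return int(value)
    except (TypeError, ValueError):
        return default


def parse_line(line, sep=","):
    """Parse one 'id,name,count' text line (hypothetical layout) into a dict.

    Integer columns are cast here, so their values are int or None, never str.
    """
    fields = line.split(sep)
    if len(fields) != 3:
        # Malformed record: signal it with None so the caller can filter it out.
        return None
    return {
        "id": to_int(fields[0].strip()),
        "name": fields[1].strip(),
        "count": to_int(fields[2].strip()),
    }
```

On the cluster, this would presumably be applied as `rows = lines.map(parse_line).filter(lambda r: r is not None)`, where `lines` is a hypothetical RDD of text lines, followed by `spark.createDataFrame(rows, schema)`, or with the schema omitted to let Spark infer it. Fields that may come back as `None` need to be nullable in an explicit schema.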
