pyspark: TypeError: IntegerType can not accept object in type


爱一瞬间的悲伤 2021-02-20 17:11

I am programming with pyspark on a Spark cluster. The data is large and split into pieces, so it cannot be loaded into memory and its sanity cannot be checked easily.

basically it loo

2 answers

北海茫月 (original poster)

2021-02-20 17:52

With Apache Spark 2.0 you can let Spark infer the schema of your data. Overall, you'll still need to cast the values in your parser function, as argued above:

"When schema is None, it will try to infer the schema (column names and types) from data, which should be an RDD of Row, or namedtuple, or dict."
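To illustrate the casting advice, here is a minimal sketch of a parser function that converts each field to its intended Python type before the data reaches Spark. The three-column layout (`id`, `name`, `score`), the comma separator, and the helper names `parse_line` and `to_dataframe` are all assumptions made for the example, not taken from the question. The error in the title typically appears when a field declared as `IntegerType` still holds a string, so casting with `int()` in the parser is one way to avoid it.

```python
def parse_line(line, sep=","):
    """Split one raw text line and cast each field to a Python type.

    The layout (id, name, score) is hypothetical. Lines that cannot be
    cast return None so that they can be filtered out later.
    """
    parts = line.strip().split(sep)
    if len(parts) != 3:
        return None
    try:
        # int()/float() turn the raw strings into real numbers, so an
        # integer column never receives a str object.
        return {"id": int(parts[0]), "name": parts[1], "score": float(parts[2])}
    except ValueError:
        return None


def to_dataframe(spark, lines_rdd):
    """Build a DataFrame from an RDD of raw lines without an explicit schema.

    `spark` is assumed to be an existing SparkSession. With schema=None,
    Spark infers column names and types from the parsed dicts.
    """
    records = lines_rdd.map(parse_line).filter(lambda r: r is not None)
    return spark.createDataFrame(records)


# The parser can be checked locally, without a cluster:
sample = ["1,alice,3.5", "2,bob,oops", "x,carol,1.0", "3,dave,7"]
parsed = [parse_line(s) for s in sample]
```

Because malformed lines come back as `None`, they can be counted or logged separately, which gives a cheap sanity check on data that is too large to inspect by hand.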
