PySpark serialization EOFError


谎友^ 2021-01-01 09:56

I am reading in a CSV as a Spark DataFrame and performing machine learning operations upon it. I keep getting a Python serialization EOFError - any idea why? I thought it mi

3 Answers

囚心锁ツ (OP)

2021-01-01 10:07
The error appears to be raised in PySpark's read_int function. Its code, taken from the Spark site, is as follows:
```
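import struct

def read_int(stream):
    # Read the 4-byte header that holds the integer.
    length = stream.read(4)
    # An empty read means the stream has ended.
    if not length:
        raise EOFError
    # "!i" = network (big-endian) byte order, signed 32-bit int.
    return struct.unpack("!i", length)[0]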
```
This means that when the function tries to read 4 bytes from the stream and gets 0 bytes back, it raises EOFError. The Python docs are here.
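To see that behaviour in isolation, here is a minimal sketch that runs the same read_int logic against in-memory streams, with no Spark involved. The try_read helper and the use of io.BytesIO are my own additions for illustration, not part of PySpark:

```python
import io
import struct


def read_int(stream):
    # Same logic as the function quoted above.
    length = stream.read(4)
    if not length:
        raise EOFError
    return struct.unpack("!i", length)[0]


def try_read(data):
    """Hypothetical helper: return the decoded int, or "EOF" if the stream is empty."""
    try:
        return read_int(io.BytesIO(data))
    except EOFError:
        return "EOF"


# A stream holding one packed big-endian int decodes normally.
print(try_read(struct.pack("!i", 42)))

# An empty stream returns b"" from read(4), so EOFError is raised.
print(try_read(b""))
```

So the EOFError by itself only says the stream was already empty when read_int was called. It does not say why the other end stopped sending data, so the underlying cause probably has to be looked for elsewhere in the logs.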