Spark: Self-suppression not permitted when writing big file to HDFS

Asked by 南笙 on 2021-01-15 23:07

I'm writing a large file to HDFS using Spark. Basically, what I was doing was to join 3 big files, convert the resulting dataframe to JSON using toJSON(), and then use s…
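
For reference, here is a minimal Scala sketch of the pipeline described above, assuming Spark 1.x (where toJSON returns an RDD[String]). The input paths, the join key, and the read format are hypothetical placeholders; only the output path is taken from the error message quoted in the answer below.

    import org.apache.spark.{SparkConf, SparkContext}
    import org.apache.spark.sql.SQLContext

    object WriteJsonToHdfs {
      def main(args: Array[String]): Unit = {
        val sc = new SparkContext(new SparkConf().setAppName("write-json-to-hdfs"))
        val sqlContext = new SQLContext(sc)

        // Placeholder inputs: the question joins three large dataframes.
        val df1 = sqlContext.read.parquet("hdfs:///data/input1")
        val df2 = sqlContext.read.parquet("hdfs:///data/input2")
        val df3 = sqlContext.read.parquet("hdfs:///data/input3")

        // Join, convert each row to a JSON string, and write the result to HDFS.
        // In Spark 1.x, DataFrame.toJSON returns an RDD[String].
        val joined = df1.join(df2, "id").join(df3, "id")
        joined.toJSON.saveAsTextFile("hdfs:///user/dawei/upid_json_all")
      }
    }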

1 Answer
  • Answered 2021-01-16 00:02

    From this error:

    Caused by: org.apache.hadoop.ipc.RemoteException(java.io.IOException): File 
    /user/dawei/upid_json_all/_temporary/0/_temporary/attempt_201512210857_0006_m_000037_361/
    part-00037 could only be replicated to 0 nodes instead of minReplication (=1).
    There are 5 datanode(s) running and no node(s) are excluded in this operation.
    

    It seems that replication is not happening. If you fix this error, things should fall into place.

    It may be due to one of the issues below (a sketch for checking points 3-5 from the driver follows this list):

    1. Inconsistency in your datanodes: restart your Hadoop cluster and see if this solves the problem.
    2. Communication between the datanodes and the namenode: network connectivity issues, or permission/firewall issues around port accessibility.
    3. Disk space may be full on a datanode.
    4. A datanode may be busy or unresponsive.
    5. Invalid configuration, such as a negative block size.
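
    Here is a minimal Scala sketch for checking points 3-5 from the driver side; it only assumes access to the cluster's Hadoop Configuration, the object name is hypothetical, and listing datanodes may require HDFS admin privileges:

        import org.apache.hadoop.conf.Configuration
        import org.apache.hadoop.fs.FileSystem
        import org.apache.hadoop.hdfs.DistributedFileSystem

        object HdfsHealthCheck {
          def main(args: Array[String]): Unit = {
            // Picks up core-site.xml / hdfs-site.xml from the classpath.
            val conf = new Configuration()
            val fs = FileSystem.get(conf)

            // Point 5: sanity-check the configured block size and replication factor
            // (null means the HDFS defaults apply).
            println("dfs.blocksize   = " + conf.get("dfs.blocksize"))
            println("dfs.replication = " + conf.get("dfs.replication"))

            // Points 3 and 4: per-datanode capacity and remaining space.
            fs match {
              case dfs: DistributedFileSystem =>
                dfs.getDataNodeStats.foreach { dn =>
                  val freeGb = dn.getRemaining / (1024.0 * 1024 * 1024)
                  val capGb = dn.getCapacity / (1024.0 * 1024 * 1024)
                  println(f"${dn.getHostName}%s remaining: $freeGb%.1f GB of $capGb%.1f GB")
                }
              case other =>
                println("Not an HDFS filesystem: " + other.getUri)
            }
          }
        }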

    Have a look at related SE questions on this topic too:

    HDFS error: could only be replicated to 0 nodes, instead of 1
