Spark: Self-suppression not permitted when writing big file to HDFS

Asked by 南笙 on 2021-01-15 23:07

I'm writing a large file to HDFS using Spark. Basically, what I was doing was to join 3 big files, convert the resulting dataframe to JSON using toJSON(), and then use s…
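
For reference, here is a minimal Scala sketch of the pipeline described above, assuming Spark 1.x (where toJSON returns an RDD[String]). The input paths, the join key, and the read format are hypothetical placeholders; only the output path is taken from the error message quoted in the answer below.

    import org.apache.spark.{SparkConf, SparkContext}
    import org.apache.spark.sql.SQLContext

    object WriteJsonToHdfs {
      def main(args: Array[String]): Unit = {
        val sc = new SparkContext(new SparkConf().setAppName("write-json-to-hdfs"))
        val sqlContext = new SQLContext(sc)

        // Placeholder inputs: the question joins three large dataframes.
        val df1 = sqlContext.read.parquet("hdfs:///data/input1")
        val df2 = sqlContext.read.parquet("hdfs:///data/input2")
        val df3 = sqlContext.read.parquet("hdfs:///data/input3")

        // Join, convert each row to a JSON string, and write the result to HDFS.
        // In Spark 1.x, DataFrame.toJSON returns an RDD[String].
        val joined = df1.join(df2, "id").join(df3, "id")
        joined.toJSON.saveAsTextFile("hdfs:///user/dawei/upid_json_all")
      }
    }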

1 Answer
  • Answered 2021-01-16 00:02

    From this error:

    Caused by: org.apache.hadoop.ipc.RemoteException(java.io.IOException): File 
    /user/dawei/upid_json_all/_temporary/0/_temporary/attempt_201512210857_0006_m_000037_361/
    part-00037 could only be replicated to 0 nodes instead of minReplication (=1).
    There are 5 datanode(s) running and no node(s) are excluded in this operation.
    

    It seems that replication is not happening. If you fix this error, things should fall into place.

    It may be due to one of the issues below (a sketch for checking points 3-5 from the driver follows this list):

    1. Inconsistency in your datanodes: restart your Hadoop cluster and see if this solves the problem.
    2. Communication between the datanodes and the namenode: network connectivity issues, or permission/firewall issues around port accessibility.
    3. Disk space may be full on a datanode.
    4. A datanode may be busy or unresponsive.
    5. Invalid configuration, such as a negative block size.
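
    Here is a minimal Scala sketch for checking points 3-5 from the driver side; it only assumes access to the cluster's Hadoop Configuration, the object name is hypothetical, and listing datanodes may require HDFS admin privileges:

        import org.apache.hadoop.conf.Configuration
        import org.apache.hadoop.fs.FileSystem
        import org.apache.hadoop.hdfs.DistributedFileSystem

        object HdfsHealthCheck {
          def main(args: Array[String]): Unit = {
            // Picks up core-site.xml / hdfs-site.xml from the classpath.
            val conf = new Configuration()
            val fs = FileSystem.get(conf)

            // Point 5: sanity-check the configured block size and replication factor
            // (null means the HDFS defaults apply).
            println("dfs.blocksize   = " + conf.get("dfs.blocksize"))
            println("dfs.replication = " + conf.get("dfs.replication"))

            // Points 3 and 4: per-datanode capacity and remaining space.
            fs match {
              case dfs: DistributedFileSystem =>
                dfs.getDataNodeStats.foreach { dn =>
                  val freeGb = dn.getRemaining / (1024.0 * 1024 * 1024)
                  val capGb = dn.getCapacity / (1024.0 * 1024 * 1024)
                  println(f"${dn.getHostName}%s remaining: $freeGb%.1f GB of $capGb%.1f GB")
                }
              case other =>
                println("Not an HDFS filesystem: " + other.getUri)
            }
          }
        }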

    Have a look at related SE questions on this topic too:

    HDFS error: could only be replicated to 0 nodes, instead of 1
