I'm writing a large file to HDFS using Spark. Basically, I join 3 big files, convert the resulting DataFrame to JSON using toJSON(), and then use s
From this error:
Caused by: org.apache.hadoop.ipc.RemoteException(java.io.IOException): File
/user/dawei/upid_json_all/_temporary/0/_temporary/attempt_201512210857_0006_m_000037_361/
part-00037 could only be replicated to 0 nodes instead of minReplication (=1).
There are 5 datanode(s) running and no node(s) are excluded in this operation.
It seems that replication is not happening: the write fails because the block could not be placed on any datanode. Once you fix this error, the rest should fall into place.
It may be due to the issues below:
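One cause commonly reported for this message is datanodes that are running but have too little free DFS space to accept a new block. As a minimal sketch of how to check for that, the Python below parses the text printed by `hdfs dfsadmin -report` and lists the datanodes whose "DFS Remaining" value is below a threshold. The helper names (`fetch_report`, `parse_dfs_remaining`, `nodes_below`) are hypothetical, the parsing assumes the usual `Name:` / `DFS Remaining:` line layout of the report, and the 128 MiB default threshold assumes the stock `dfs.blocksize`; adjust all of these to your cluster.

```python
import re
import subprocess

# Assumption: default HDFS block size (dfs.blocksize) of 128 MiB.
DEFAULT_BLOCK_BYTES = 128 * 1024 * 1024


def fetch_report():
    """Run `hdfs dfsadmin -report` and return its stdout as text.

    Assumes the `hdfs` CLI is on PATH; the command may require
    HDFS superuser privileges to list per-datanode details.
    """
    result = subprocess.run(
        ["hdfs", "dfsadmin", "-report"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


def parse_dfs_remaining(report_text):
    """Map each datanode name to its 'DFS Remaining' byte count.

    Assumes each datanode section starts with a 'Name: host:port' line
    and contains a 'DFS Remaining: <bytes> (...)' line.
    """
    remaining = {}
    current = None
    for raw in report_text.splitlines():
        line = raw.strip()
        name = re.match(r"Name:\s*(\S+)", line)
        if name:
            current = name.group(1)
            continue
        free = re.match(r"DFS Remaining:\s*(\d+)", line)
        # Skip the cluster-wide summary, which appears before any 'Name:' line.
        if free and current is not None:
            remaining[current] = int(free.group(1))
    return remaining


def nodes_below(remaining, min_bytes=DEFAULT_BLOCK_BYTES):
    """Return the sorted names of datanodes with less than min_bytes free."""
    return sorted(node for node, free in remaining.items() if free < min_bytes)
```

On a cluster node you could call `nodes_below(parse_dfs_remaining(fetch_report()))`. If every datanode shows up in the result, lack of space is a likely explanation; if none do, look at other causes such as datanodes that the client cannot reach.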
Have a look at the related SE questions on this topic too:
HDFS error: could only be replicated to 0 nodes, instead of 1