Spark on yarn jar upload problems

Asked by 死守一世寂寞 · 2021-01-13 17:34

I am trying to run a simple MapReduce Java program using Spark on YARN (Cloudera Hadoop 5.2 on CentOS). I have tried this two different ways. The first way is the following:

2 Answers
  •  不知归路
    2021-01-13 18:24

If you are getting this error, it usually means you are uploading the assembly jar yourself, either with the --jars option or by manually copying it to HDFS on each node. I ran into the same problem and fixed it with the approach below.

In yarn-cluster mode, spark-submit automatically uploads the assembly jar to a distributed cache that all executor containers read from, so there is no need to copy the assembly jar to every node manually (or pass it through --jars). The error suggests there are two versions of the same jar in your HDFS.
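
    As a sketch of that submission style (the application jar path and main class here are hypothetical, not from the question), a yarn-cluster submit that relies on the automatic upload and passes no --jars for the assembly would look like:

    ```shell
    # Spark 1.x (as shipped with CDH 5.2) uses "--master yarn-cluster";
    # newer Spark versions spell it "--master yarn --deploy-mode cluster".
    # The local jar is uploaded to the .sparkStaging directory on HDFS
    # for you, so executors on every node can fetch it.
    spark-submit \
      --master yarn-cluster \
      --class com.example.WordCount \
      /path/to/my-app.jar
    ```
    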

    Try removing all old jars from your .sparkStaging directory and submit again; it should work.
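
    A minimal cleanup sketch, assuming the default staging location under the submitting user's HDFS home directory (your path may differ):

    ```shell
    # Inspect the leftover per-application staging directories.
    hdfs dfs -ls /user/$USER/.sparkStaging

    # Remove the stale ones so the next submit uploads a single fresh jar.
    # (Only delete directories for applications that are no longer running.)
    hdfs dfs -rm -r /user/$USER/.sparkStaging/application_*
    ```
    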
