NoClassDefFoundError org.apache.hadoop.fs.FSDataInputStream when executing spark-shell

北荒 2020-11-30 02:07

I've downloaded the prebuilt version of Spark 1.4.0 without Hadoop (with user-provided Hadoop). When I ran the spark-shell command, I got this error:

> Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/hadoop/fs/FSDataInputStream
>         at org.apache.spark.deploy.SparkSubmitArguments$$anonfun$mergeDefaultSparkProperties$1.apply(SparkSubmitArguments.scala:111)

14 answers
  • 2020-11-30 02:16

    Linux (ENV as in a Dockerfile; use export in a plain shell):

    ENV SPARK_DIST_CLASSPATH="$HADOOP_HOME/etc/hadoop/*:$HADOOP_HOME/share/hadoop/common/lib/*:$HADOOP_HOME/share/hadoop/common/*:$HADOOP_HOME/share/hadoop/hdfs/*:$HADOOP_HOME/share/hadoop/hdfs/lib/*:$HADOOP_HOME/share/hadoop/hdfs/*:$HADOOP_HOME/share/hadoop/yarn/lib/*:$HADOOP_HOME/share/hadoop/yarn/*:$HADOOP_HOME/share/hadoop/mapreduce/lib/*:$HADOOP_HOME/share/hadoop/mapreduce/*:$HADOOP_HOME/share/hadoop/tools/lib/*"
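
    For a regular shell rather than a Dockerfile, the same value can be exported from conf/spark-env.sh so every spark-shell launch picks it up; a minimal sketch, assuming HADOOP_HOME is already set and Spark was unpacked from a "without Hadoop" build:

    # $SPARK_HOME/conf/spark-env.sh
    # Same Hadoop jar directories as above, exported for the Spark launcher scripts
    export SPARK_DIST_CLASSPATH="$HADOOP_HOME/etc/hadoop/*:$HADOOP_HOME/share/hadoop/common/lib/*:$HADOOP_HOME/share/hadoop/common/*:$HADOOP_HOME/share/hadoop/hdfs/*:$HADOOP_HOME/share/hadoop/hdfs/lib/*:$HADOOP_HOME/share/hadoop/yarn/lib/*:$HADOOP_HOME/share/hadoop/yarn/*:$HADOOP_HOME/share/hadoop/mapreduce/lib/*:$HADOOP_HOME/share/hadoop/mapreduce/*:$HADOOP_HOME/share/hadoop/tools/lib/*"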
    

    Windows:

    set SPARK_DIST_CLASSPATH=%HADOOP_HOME%\etc\hadoop\*;%HADOOP_HOME%\share\hadoop\common\lib\*;%HADOOP_HOME%\share\hadoop\common\*;%HADOOP_HOME%\share\hadoop\hdfs\*;%HADOOP_HOME%\share\hadoop\hdfs\lib\*;%HADOOP_HOME%\share\hadoop\hdfs\*;%HADOOP_HOME%\share\hadoop\yarn\lib\*;%HADOOP_HOME%\share\hadoop\yarn\*;%HADOOP_HOME%\share\hadoop\mapreduce\lib\*;%HADOOP_HOME%\share\hadoop\mapreduce\*;%HADOOP_HOME%\share\hadoop\tools\lib\*
    
  • 2020-11-30 02:19

    I had the same issue: Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/hadoop/fs/FSDataInputStream at org.apache.spark.deploy.SparkSubmitArguments$$anonfun$mergeDefaultSparkProperties$1.apply(SparkSubmitArguments.scala:111)... Then I realized that I had installed the Spark version without Hadoop. I installed the "with Hadoop" version and the problem went away.

  • 2020-11-30 02:21

    Run the following from your package directory just before running spark-submit:

    export SPARK_DIST_CLASSPATH=`hadoop classpath`
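
    For context, a minimal sketch of the whole sequence (assuming the hadoop command is on the PATH and $SPARK_HOME points at the "without Hadoop" Spark install):

    # Let the local Hadoop install describe its own jar locations,
    # then hand that classpath to Spark's launcher
    export SPARK_DIST_CLASSPATH=$(hadoop classpath)

    # The launcher can now resolve org.apache.hadoop.fs.FSDataInputStream
    $SPARK_HOME/bin/spark-shell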
    
  • 2020-11-30 02:22

    I got this error because the file was copied from Windows. Resolve it with:

    dos2unix file_name
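
    To confirm that Windows (CRLF) line endings are the problem before converting, a quick check, assuming the file and cat utilities of a typical Linux userland:

    # Reports "... with CRLF line terminators" for files copied from Windows
    file file_name

    # Shows line endings explicitly; lines ending in ^M$ still carry a carriage return
    cat -A file_name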
    
  • 2020-11-30 02:23

    I ran into the same error when trying to get familiar with Spark. My understanding of the error message is that while Spark doesn't need a Hadoop cluster to run, it does need some of the Hadoop classes. Since I was just playing around with Spark and didn't care which version of the Hadoop libraries was used, I just downloaded a Spark binary pre-built with a version of Hadoop (2.6), and things started working fine.
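
    As a rough illustration (the archive URL is an assumption; verify it against the Apache Spark download/archive pages), grabbing and running a build bundled with Hadoop 2.6 looks something like:

    # Download a Spark 1.4.0 release that already bundles the Hadoop 2.6 client jars
    wget https://archive.apache.org/dist/spark/spark-1.4.0/spark-1.4.0-bin-hadoop2.6.tgz
    tar -xzf spark-1.4.0-bin-hadoop2.6.tgz
    cd spark-1.4.0-bin-hadoop2.6
    ./bin/spark-shell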

  • 2020-11-30 02:23

    I finally found a solution that removes the exception.

    In spark-class2.cmd, add:

    set HADOOP_CLASS1=%HADOOP_HOME%\share\hadoop\common\*
    set HADOOP_CLASS2=%HADOOP_HOME%\share\hadoop\common\lib\*
    set HADOOP_CLASS3=%HADOOP_HOME%\share\hadoop\mapreduce\*
    set HADOOP_CLASS4=%HADOOP_HOME%\share\hadoop\mapreduce\lib\*
    set HADOOP_CLASS5=%HADOOP_HOME%\share\hadoop\yarn\*
    set HADOOP_CLASS6=%HADOOP_HOME%\share\hadoop\yarn\lib\*
    set HADOOP_CLASS7=%HADOOP_HOME%\share\hadoop\hdfs\*
    set HADOOP_CLASS8=%HADOOP_HOME%\share\hadoop\hdfs\lib\*
    
    set CLASSPATH=%HADOOP_CLASS1%;%HADOOP_CLASS2%;%HADOOP_CLASS3%;%HADOOP_CLASS4%;%HADOOP_CLASS5%;%HADOOP_CLASS6%;%HADOOP_CLASS7%;%HADOOP_CLASS8%;%LAUNCH_CLASSPATH%
    

    Then, change:

    "%RUNNER%" -cp %CLASSPATH%;%LAUNCH_CLASSPATH% org.apache.spark.launcher.Main %* > %LAUNCHER_OUTPUT%
    

    to:

    "%RUNNER%" -Dhadoop.home.dir=*hadoop-installation-folder* -cp %CLASSPATH% %JAVA_OPTS% %*
    

    It works fine for me, but I'm not sure it's the best solution.
