Spark History Server on S3A FileSystem: ClassNotFoundException

隐瞒了意图╮ 2021-02-06 12:03

Spark can use the Hadoop S3A file system org.apache.hadoop.fs.s3a.S3AFileSystem. By adding the appropriate settings to conf/spark-defaults.conf, I can get Spark applications to write their event logs to an S3A path, but the Spark History Server then fails with a ClassNotFoundException for org.apache.hadoop.fs.s3a.S3AFileSystem.

3 Answers
  •  醉话见心
    2021-02-06 12:12

    I added the following jars to my SPARK_HOME/jars directory and it works great (a quick verification sketch follows the jar list):

    • hadoop-aws-*.jar (the version must match the hadoop-common jar you already have)
    • aws-java-sdk-s3-*.jar (choose the one compatible with your hadoop-aws jar)
    • aws-java-sdk-*.jar (choose the same version as the one above)
    • aws-java-sdk-core-*.jar (choose the same version as the one above)
    • aws-java-sdk-dynamodb-*.jar (choose the same version as above; frankly, I am not sure why this one is needed, but it doesn't work for me without it)
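
    A quick way to confirm the jars are picked up is to query the bundled Hadoop version and try to resolve the S3A class from spark-shell. This is a minimal sketch, assuming a plain spark-shell session: org.apache.hadoop.util.VersionInfo ships with hadoop-common, and Class.forName throws the same ClassNotFoundException the History Server reports if hadoop-aws is still missing.

    // Paste into spark-shell after copying the jars into SPARK_HOME/jars.
    import org.apache.hadoop.util.VersionInfo

    // The Hadoop version this Spark build is linked against;
    // hadoop-aws-*.jar must match it exactly.
    println(s"Hadoop version: ${VersionInfo.getVersion}")

    // Resolves only if hadoop-aws (plus the AWS SDK jars above) is on the
    // classpath; otherwise this throws java.lang.ClassNotFoundException.
    Class.forName("org.apache.hadoop.fs.s3a.S3AFileSystem")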

    Edit:

    And my spark-defaults.conf has the following three parameters set:

    spark.eventLog.enabled           true
    spark.eventLog.dir               s3a://bucket_name/folder_name
    spark.history.fs.logDirectory    s3a://bucket_name/folder_name
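
    For completeness, the first two settings can also be applied programmatically; a minimal sketch, assuming a Scala application (the bucket_name/folder_name placeholder is carried over from the config above). Note that spark.history.fs.logDirectory is read by the History Server process, not by applications, so it stays in spark-defaults.conf.

    import org.apache.spark.SparkConf
    import org.apache.spark.sql.SparkSession

    // Same event-log settings as in spark-defaults.conf above, set in code.
    val conf = new SparkConf()
      .setAppName("event-log-to-s3a") // hypothetical app name, for illustration
      .set("spark.eventLog.enabled", "true")
      .set("spark.eventLog.dir", "s3a://bucket_name/folder_name") // placeholder path

    // Works only if the hadoop-aws and AWS SDK jars listed above are on the classpath.
    val spark = SparkSession.builder().config(conf).getOrCreate()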
    
