Pyspark S3 error: java.lang.NoClassDefFoundError: com/amazonaws/services/s3/model/MultiObjectDeleteException

Happy的楠姐 2021-01-24 06:29

I have been unsuccessful in setting up a Spark cluster that can read AWS S3 files. The software I used is as follows (a sketch of the typical PySpark wiring follows the list):

  1. hadoop-aws-3.2.0.jar
  2. aws-java-sdk-1.11.887.jar
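
A minimal sketch, assuming the two jars above were downloaded locally; the jar paths, bucket name, and credential values below are placeholders, not part of the original question:

    from pyspark.sql import SparkSession

    spark = (
        SparkSession.builder
        .appName("s3a-read-test")
        # Point Spark at the locally downloaded jars listed above.
        .config("spark.jars",
                "/opt/jars/hadoop-aws-3.2.0.jar,/opt/jars/aws-java-sdk-1.11.887.jar")
        # Credentials for the S3A connector (placeholders).
        .config("spark.hadoop.fs.s3a.access.key", "YOUR_ACCESS_KEY")
        .config("spark.hadoop.fs.s3a.secret.key", "YOUR_SECRET_KEY")
        .getOrCreate()
    )

    # With this jar combination, any s3a:// read fails with the
    # NoClassDefFoundError from the title.
    df = spark.read.csv("s3a://your-bucket/path/to/file.csv", header=True)
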
3 Answers
  • 2021-01-24 07:13

    Hadoop 3.2 was built against 1.11.563; stick the full shaded SDK of that specific version, "aws-java-sdk-bundle", on your classpath and all should be well.

    The SDK has been "fussy" in the past... and an upgrade invariably causes surprises. For the curious, see Qualifying an AWS SDK update. It's probably about time someone did it again.
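
    A minimal sketch of this suggestion, letting Spark resolve the matching pair from Maven Central instead of dropping jars in by hand; the coordinates are org.apache.hadoop:hadoop-aws:3.2.0 and com.amazonaws:aws-java-sdk-bundle:1.11.563, and the credential values are placeholders:

        from pyspark.sql import SparkSession

        spark = (
            SparkSession.builder
            .appName("s3a-with-matching-sdk")
            # Pull hadoop-aws together with the shaded SDK bundle version
            # this answer recommends, rather than a standalone aws-java-sdk jar.
            .config("spark.jars.packages",
                    "org.apache.hadoop:hadoop-aws:3.2.0,"
                    "com.amazonaws:aws-java-sdk-bundle:1.11.563")
            .config("spark.hadoop.fs.s3a.access.key", "YOUR_ACCESS_KEY")
            .config("spark.hadoop.fs.s3a.secret.key", "YOUR_SECRET_KEY")
            .getOrCreate()
        )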

  • 2021-01-24 07:24

    I was able to solve this issue on Spark 3.0 / Hadoop 3.2. I documented my answer here as well - AWS EKS Spark 3.0, Hadoop 3.2 Error - NoClassDefFoundError: com/amazonaws/services/s3/model/MultiObjectDeleteException

    Use the following AWS Java SDK bundle and this issue will be solved -

    aws-java-sdk-bundle-1.11.874.jar (https://mvnrepository.com/artifact/com.amazonaws/aws-java-sdk-bundle/1.11.874)
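
    A minimal way to check the fix, assuming aws-java-sdk-bundle-1.11.874.jar (alongside hadoop-aws-3.2.0.jar) is already on the classpath, e.g. copied under $SPARK_HOME/jars; the bucket path is a placeholder, credentials are assumed to come from the environment (for example the EKS pod's IAM role), and spark._jvm is an internal handle used here only for the check:

        from pyspark.sql import SparkSession

        spark = SparkSession.builder.appName("verify-s3a").getOrCreate()

        # The class from the stack trace should now be resolvable on the driver JVM.
        spark._jvm.java.lang.Class.forName(
            "com.amazonaws.services.s3.model.MultiObjectDeleteException")

        # If the class loads, s3a reads should no longer hit NoClassDefFoundError.
        spark.read.text("s3a://your-bucket/some/object.txt").show(5)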

  • 2021-01-24 07:32

    So I cleaned up everything and re-installed the following jar versions, and it worked: hadoop-aws-2.7.4.jar, aws-java-sdk-1.7.4.2.jar. Spark install version: spark-2.4.7-bin-hadoop2.7. Python version: Python 3.6.
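
    A sketch of the equivalent wiring for this older stack (spark-2.4.7-bin-hadoop2.7 with the two jars named above); the jar locations, credentials, and bucket are placeholders:

        from pyspark.sql import SparkSession

        spark = (
            SparkSession.builder
            .appName("s3a-on-hadoop27")
            .config("spark.jars",
                    "/opt/jars/hadoop-aws-2.7.4.jar,/opt/jars/aws-java-sdk-1.7.4.2.jar")
            # Commonly set explicitly in Spark 2.x / Hadoop 2.7 guides;
            # harmless if it is already the default.
            .config("spark.hadoop.fs.s3a.impl",
                    "org.apache.hadoop.fs.s3a.S3AFileSystem")
            .config("spark.hadoop.fs.s3a.access.key", "YOUR_ACCESS_KEY")
            .config("spark.hadoop.fs.s3a.secret.key", "YOUR_SECRET_KEY")
            .getOrCreate()
        )

        df = spark.read.csv("s3a://your-bucket/data.csv", header=True)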
