Unable to access S3 data using Spark 2.2

前端未结

关注

 1  721

I get a lot of data uploaded to an S3 bucket that I want so analyze/visualize using Spark and Zeppelin. Yet, I am still stuck at loading data from S3.

I did some rea

相关标签:

1条回答

慢半拍i

2021-01-13 03:26

Mixing and matching AWS SDK JARs with anything else is an exercise in futility, as you've discovered. You need the version of the AWS JARs Hadoop was built with, and the version of Jackson AWS was built with. Oh, and don't try mixing any of (different amazon-* JARs, different hadoop-* JARs, different jackson-* JARs); they all go in lock-sync.

For Spark 2.2.0 and Hadoop 2.7, use AWS 1.7.4 artifacts, and make sure that if you are on Java 8, that Joda time is > 2.8.0, such as 2.9.4. That can lead to 400 "bad auth problems".

Otherwise, try Troubleshooting S3A

0 讨论(0)
发布评论:

提交评论
- 加载中...