Unable to access S3 data using Spark 2.2

隐瞒了意图╮ 2021-01-13 02:28

I get a lot of data uploaded to an S3 bucket that I want to analyze and visualize using Spark and Zeppelin. However, I am still stuck at loading the data from S3.

I did some rea

1 Answer
  • 2021-01-13 03:26

    Mixing and matching AWS SDK JARs with anything else is an exercise in futility, as you've discovered. You need the version of the AWS JARs that Hadoop was built with, and the version of Jackson that the AWS SDK was built with. Also, don't mix versions within any of these groups: the amazon-* JARs, the hadoop-* JARs, or the jackson-* JARs. Each group has to move in lock-step.
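One way to spot this kind of version mixing is to scan the JAR file names on the classpath. The following is a minimal sketch, not part of the original answer: the function name `find_mixed_versions` and the name-prefix heuristic are assumptions made for illustration, and it only understands file names that end in a plain numeric version.

```python
import re
from collections import defaultdict

# JAR families that should each resolve to a single version on the classpath.
FAMILIES = ("hadoop-", "aws-java-sdk", "jackson-")

# Matches names such as "hadoop-aws-2.7.3.jar" -> ("hadoop-aws", "2.7.3").
JAR_PATTERN = re.compile(r"^(?P<name>.+?)-(?P<version>\d+(?:\.\d+)*)\.jar$")


def find_mixed_versions(jar_names):
    """Return {family: sorted versions} for families seen with more than one version."""
    versions = defaultdict(set)
    for jar in jar_names:
        match = JAR_PATTERN.match(jar)
        if not match:
            continue  # skip names without a plain numeric version suffix
        name = match.group("name")
        if name.endswith("-asl"):
            continue  # legacy Codehaus Jackson 1.x is a separate library line
        for family in FAMILIES:
            if name.startswith(family):
                versions[family].add(match.group("version"))
    return {family: sorted(found) for family, found in versions.items() if len(found) > 1}
```

Passing it the file names from a Spark `jars/` directory (for example via `os.listdir`) returns an empty dict when each family is consistent. Because it is a file-name heuristic, a non-empty result is a hint to investigate rather than proof of a conflict.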

    For Spark 2.2.0 with Hadoop 2.7, use the AWS SDK 1.7.4 artifacts. If you are on Java 8, also make sure that Joda-Time is newer than 2.8.0, such as 2.9.4; an older Joda-Time can lead to 400 "bad auth" errors.
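As a rough illustration of the versions named above, this sketch assembles the Maven coordinates for `spark-submit --packages`. The `hadoop-aws` version 2.7.3 is an assumption and should be replaced with the exact Hadoop 2.7.x release your Spark build uses; the helper names and the application file name are hypothetical.

```python
# Versions: the AWS SDK and Joda-Time values come from the answer above.
# HADOOP_VERSION is an assumption; match it to your cluster's Hadoop 2.7.x release.
HADOOP_VERSION = "2.7.3"
AWS_SDK_VERSION = "1.7.4"
JODA_TIME_VERSION = "2.9.4"


def s3a_packages(hadoop=HADOOP_VERSION, aws_sdk=AWS_SDK_VERSION, joda=JODA_TIME_VERSION):
    """Build the comma-separated Maven coordinates expected by --packages."""
    coordinates = [
        f"org.apache.hadoop:hadoop-aws:{hadoop}",
        f"com.amazonaws:aws-java-sdk:{aws_sdk}",
        f"joda-time:joda-time:{joda}",
    ]
    return ",".join(coordinates)


def spark_submit_command(application):
    """Return the argument list for launching `application` with the S3A dependencies."""
    return ["spark-submit", "--packages", s3a_packages(), application]


if __name__ == "__main__":
    # "my_job.py" is a placeholder application name.
    print(" ".join(spark_submit_command("my_job.py")))
```

Keeping the three versions in one place makes it harder to bump one of them without revisiting the others.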

    Otherwise, try the "Troubleshooting S3A" page in the Hadoop documentation.
