I'm wondering if PySpark supports S3 access using IAM roles. Specifically, I have a business constraint where I have to assume an AWS role in order to access a given bucket.
IAM role-based access to S3 is only supported by s3a, because s3a uses the AWS SDK.
You need to put the hadoop-aws JAR and the aws-java-sdk JAR (plus the third-party JARs bundled with it) on your CLASSPATH. Both are available from Maven Central.
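If you prefer not to manage the JARs by hand, Spark can also resolve them from Maven at startup through spark.jars.packages. A minimal sketch, assuming hadoop-aws 3.3.4 (pick the version that matches your Hadoop build; the AWS SDK bundle is pulled in transitively):

    from pyspark.sql import SparkSession

    # Resolve hadoop-aws (and, transitively, the bundled AWS SDK) from Maven Central
    # at startup instead of placing the JARs on the CLASSPATH manually.
    # The version (3.3.4) is only an example; match it to your Hadoop version.
    spark = (
        SparkSession.builder
        .appName("s3a-classpath-example")
        .config("spark.jars.packages", "org.apache.hadoop:hadoop-aws:3.3.4")
        .getOrCreate()
    )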
Then set this in core-site.xml:

    <property>
      <name>fs.s3.impl</name>
      <value>org.apache.hadoop.fs.s3a.S3AFileSystem</value>
    </property>
    <property>
      <name>fs.s3a.impl</name>
      <value>org.apache.hadoop.fs.s3a.S3AFileSystem</value>
    </property>
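These properties can also be passed at runtime with the spark.hadoop. prefix instead of editing core-site.xml, and for the role-assumption part of the question, hadoop-aws 3.1+ ships an AssumedRoleCredentialProvider for s3a. A minimal sketch, assuming a recent hadoop-aws on the classpath; the role ARN and bucket path are placeholders:

    from pyspark.sql import SparkSession

    spark = (
        SparkSession.builder
        .appName("s3a-assumed-role-example")
        # Equivalent to the core-site.xml entry above.
        .config("spark.hadoop.fs.s3a.impl",
                "org.apache.hadoop.fs.s3a.S3AFileSystem")
        # Ask s3a to assume a role: your base credentials (env vars, instance
        # profile, ...) are used to call STS and obtain session credentials.
        .config("spark.hadoop.fs.s3a.aws.credentials.provider",
                "org.apache.hadoop.fs.s3a.auth.AssumedRoleCredentialProvider")
        .config("spark.hadoop.fs.s3a.assumed.role.arn",
                "arn:aws:iam::123456789012:role/my-bucket-role")  # placeholder ARN
        .getOrCreate()
    )

    # Read from the bucket through the s3a connector (placeholder path).
    df = spark.read.csv("s3a://my-bucket/path/to/data.csv", header=True)
    df.show()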