Spark Scala read csv file using s3a

暗喜 2021-01-25 22:52

I am trying to read a csv (native) file from an S3 bucket using a locally running Spark - Scala. I am able to read the file using the http protocol but I intend to use the s3a p

1 answer
  • 2021-01-25 22:57

    For anyone else struggling with this: I had to update the version of hadoop-client.

    Additionally, the links below were quite helpful:

    • https://hadoop.apache.org/docs/current/hadoop-aws/tools/hadoop-aws/index.html

    • https://disqus.com/by/cfeduke/?utm_source=reply&utm_medium=email&utm_content=comment_author

    • http://docs.aws.amazon.com/general/latest/gr/rande.html#s3_region

    The relevant pom details are below:

    <properties>
        <spark.version>2.2.0</spark.version>
        <hadoop.version>2.8.0</hadoop.version>
    
    </properties>
    
    
    <dependencies>
        <!-- https://mvnrepository.com/artifact/org.apache.spark/spark-core_2.11 -->
        <dependency>
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-core_2.11</artifactId>
            <version>${spark.version}</version>
        </dependency>
        <dependency>
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-sql_2.11</artifactId>
            <version>${spark.version}</version>
        </dependency>
        <dependency>
            <groupId>org.apache.hadoop</groupId>
            <artifactId>hadoop-client</artifactId>
            <version>${hadoop.version}</version>
        </dependency>
        <dependency>
            <groupId>org.apache.hadoop</groupId>
            <artifactId>hadoop-aws</artifactId>
            <version>${hadoop.version}</version>
        </dependency>
    </dependencies>
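
    With those dependencies on the classpath, reading the CSV over s3a could look roughly like the sketch below. This is a minimal sketch, not the original poster's code: the bucket name, object path, endpoint region, and the environment variable names used for the credentials are all assumptions for illustration. The `fs.s3a.*` property names are the ones described in the hadoop-aws documentation linked above.

```scala
import org.apache.spark.sql.SparkSession

object ReadCsvFromS3a {
  def main(args: Array[String]): Unit = {
    // Local Spark session, matching the "locally running Spark" setup in the question.
    val spark = SparkSession.builder()
      .appName("read-csv-s3a")
      .master("local[*]")
      .getOrCreate()

    // S3A settings go on the Hadoop configuration used by Spark.
    val hadoopConf = spark.sparkContext.hadoopConfiguration
    // Assumption: credentials are exported in these environment variables.
    hadoopConf.set("fs.s3a.access.key", sys.env("AWS_ACCESS_KEY_ID"))
    hadoopConf.set("fs.s3a.secret.key", sys.env("AWS_SECRET_ACCESS_KEY"))
    // Assumption: the bucket lives in eu-west-1; use the endpoint for your bucket's region.
    hadoopConf.set("fs.s3a.endpoint", "s3.eu-west-1.amazonaws.com")

    // Hypothetical bucket and key; replace with your own.
    val df = spark.read
      .option("header", "true")      // treat the first row as column names
      .option("inferSchema", "true") // let Spark guess column types
      .csv("s3a://my-example-bucket/path/to/file.csv")

    df.printSchema()
    df.show(5)

    spark.stop()
  }
}
```

    Keeping the hadoop-client and hadoop-aws versions identical, as the shared `hadoop.version` property in the pom does, keeps the two artifacts consistent with each other, which matters because the S3A connector is built against a specific Hadoop release.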
    