Reading a file from HDFS using Spring Batch

感情败类 2021-01-06 21:18

I have to write a Spring Batch job that reads a file from HDFS and updates the data in a MySQL database.

The source file in HDFS contains report data in CSV format.

1 answer
  • 2021-01-06 21:40

    The FlatFileItemReader in Spring Batch works with any Spring Framework Resource implementation:

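    import org.springframework.batch.item.file.FlatFileItemReader;
    import org.springframework.batch.item.file.builder.FlatFileItemReaderBuilder;
    import org.springframework.batch.item.file.mapping.PassThroughLineMapper;
    import org.springframework.context.annotation.Bean;
    import org.springframework.core.io.Resource;

    @Bean
    public FlatFileItemReader<String> itemReader(Resource resource) { // resource is injected by Spring
        return new FlatFileItemReaderBuilder<String>()
                .name("itemReader") // a name is required while saveState is enabled (the default)
                .resource(resource)
                .lineMapper(new PassThroughLineMapper()) // returns each line as a raw String
                // set other reader properties
                .build();
    }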
    

    So if you manage to get a Resource handle pointing to an HDFS file, you are done.

    To get an HDFS resource, you can:

    • Use Spring for Hadoop. Once the HDFS file system is configured, you can get the resource from the application context with applicationContext.getResource("hdfs:data.csv");
    • Implement your own Resource using the Hadoop APIs (as shown in the answer by Michael Simons). I see that some folks already did this here
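
    For the second option, a minimal sketch of such a Resource could look like the following. It is not taken from the linked answers: the class name HdfsFileResource is hypothetical, and it assumes hadoop-client and spring-core are on the classpath. It extends Spring's AbstractResource and delegates to Hadoop's FileSystem for the existence check and for opening the stream.

```java
import java.io.IOException;
import java.io.InputStream;

import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.springframework.core.io.AbstractResource;

// Hypothetical read-only Resource backed by a Hadoop FileSystem (a sketch, not a library class).
class HdfsFileResource extends AbstractResource {

    private final FileSystem fileSystem;
    private final Path path;

    HdfsFileResource(FileSystem fileSystem, Path path) {
        this.fileSystem = fileSystem;
        this.path = path;
    }

    @Override
    public String getDescription() {
        return "HDFS resource [" + path + "]";
    }

    @Override
    public InputStream getInputStream() throws IOException {
        // FSDataInputStream is an InputStream, so it can be returned directly.
        return fileSystem.open(path);
    }

    @Override
    public boolean exists() {
        try {
            return fileSystem.exists(path);
        } catch (IOException e) {
            // Treat an unreachable file system as "resource not available".
            return false;
        }
    }

    @Override
    public String getFilename() {
        return path.getName();
    }
}
```

    Assuming a FileSystem obtained with something like FileSystem.get(URI.create("hdfs://namenode:8020"), new Configuration()) (the host and port are placeholders), an instance such as new HdfsFileResource(fs, new Path("/reports/data.csv")) can be passed to the reader's resource(...) method like any other Resource.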

    Hope this helps.
