Question
I have a CSV file in S3 and I want to transform some columns and put the result in another S3 bucket, or sometimes in the same bucket but under a different folder. Can I achieve this using Kiba? If not, do I need to store the CSV data in a database first, before the transformation and other processing?
Answer 1:
Thanks for using Kiba! There is no such implementation sample available today. I'll provide vendor-supported S3 components as part of Kiba Pro in the future.
That said, what you have in mind is definitely possible (I've done this for some clients) - and there is definitely no need to store the CSV data in a database first.
What you need to do is implement a Kiba S3 source and destination which will do that for you.
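As a rough illustration, such a source and destination could look like the sketch below. The class names, the bucket/key parameters, and the `client:` injection point are assumptions for this example, not part of Kiba's API; Kiba itself only expects sources to respond to `#each` (yielding rows) and destinations to respond to `#write(row)` and `#close`.

```ruby
require "csv"

# Hypothetical Kiba source: streams each row of a CSV object on S3 as a hash.
# `client:` accepts anything responding to #get_object in the aws-sdk-s3
# shape (an Aws::S3::Client in production, a stub in tests).
class S3CsvSource
  def initialize(bucket:, key:, client: nil)
    @bucket = bucket
    @key = key
    @client = client || begin
      require "aws-sdk-s3"
      Aws::S3::Client.new
    end
  end

  def each
    body = @client.get_object(bucket: @bucket, key: @key).body.read
    CSV.parse(body, headers: true).each { |row| yield row.to_h }
  end
end

# Hypothetical Kiba destination: buffers rows, serializes them to CSV,
# and uploads the result in one #put_object call when the job closes it.
class S3CsvDestination
  def initialize(bucket:, key:, client: nil)
    @bucket = bucket
    @key = key
    @client = client || begin
      require "aws-sdk-s3"
      Aws::S3::Client.new
    end
    @rows = []
  end

  def write(row)
    @rows << row
  end

  # Build the CSV payload: headers come from the first row's keys.
  def to_csv
    CSV.generate do |out|
      out << @rows.first.keys
      @rows.each { |r| out << r.values }
    end
  end

  def close
    return if @rows.empty?
    @client.put_object(bucket: @bucket, key: @key, body: to_csv)
  end
end
```

In a Kiba job you would then declare `source S3CsvSource, bucket: "...", key: "..."` and `destination S3CsvDestination, bucket: "...", key: "..."`, with your column `transform` blocks in between.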
I recommend that you check out the AWS Ruby SDK, and in particular the S3 Examples.
The following links will be particularly helpful:
- https://docs.aws.amazon.com/sdk-for-ruby/v3/developer-guide/s3-example-get-bucket-items.html to list the bucket items
- https://docs.aws.amazon.com/sdk-for-ruby/v3/developer-guide/s3-example-get-bucket-item.html to download the file locally before processing it
- https://docs.aws.amazon.com/sdk-for-ruby/v3/developer-guide/s3-example-upload-bucket-item.html to upload a file back to S3
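The three steps those links cover can be sketched with `aws-sdk-s3` roughly as follows. The helper names (`csv_keys`, `download`, `upload`) and the local-path/bucket arguments are illustrative, not from any library; `client` is an `Aws::S3::Client` (or any compatible stub).

```ruby
# 1. List the bucket items, keeping only the CSV object keys.
def csv_keys(client, bucket)
  client.list_objects_v2(bucket: bucket).contents
        .map(&:key)
        .select { |k| k.end_with?(".csv") }
end

# 2. Download one object to a local file before processing it.
def download(client, bucket, key, path)
  client.get_object(response_target: path, bucket: bucket, key: key)
end

# 3. Upload the processed file back to S3 (possibly to another
#    bucket, or to a different key prefix acting as a "folder").
def upload(client, bucket, key, path)
  client.put_object(bucket: bucket, key: key, body: File.open(path, "rb"))
end
```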
Hope this helps!
Source: https://stackoverflow.com/questions/48257335/is-there-a-sample-implementation-of-kiba-etl-job-using-s3-bucket-with-csv-files