Question
I have a CSV file in S3 and I want to transform some columns and put the result in another S3 bucket, or sometimes in the same bucket but under a different folder. Can I achieve this using Kiba? If not, do I need to store the CSV data in a database first, before the transformation and other processing?
Answer 1:
Thanks for using Kiba! There is no such implementation sample available today. I'll provide vendor-supported S3 components as part of Kiba Pro in the future.
That said, what you have in mind is definitely possible (I've done this for some clients) - and there is definitely no need to store the CSV data in a database first.
What you need to do is implement a Kiba S3 source and destination which will do that for you.
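As a rough illustration, such a source and destination could look like the sketch below. The class names, the bucket/key parameters, and the `client:` injection point are assumptions for this example, not part of Kiba's API; Kiba itself only expects sources to respond to `#each` (yielding rows) and destinations to respond to `#write(row)` and `#close`.

```ruby
require "csv"

# Hypothetical Kiba source: streams each row of a CSV object on S3 as a hash.
# `client:` accepts anything responding to #get_object in the aws-sdk-s3
# shape (an Aws::S3::Client in production, a stub in tests).
class S3CsvSource
  def initialize(bucket:, key:, client: nil)
    @bucket = bucket
    @key = key
    @client = client || begin
      require "aws-sdk-s3"
      Aws::S3::Client.new
    end
  end

  def each
    body = @client.get_object(bucket: @bucket, key: @key).body.read
    CSV.parse(body, headers: true).each { |row| yield row.to_h }
  end
end

# Hypothetical Kiba destination: buffers rows, serializes them to CSV,
# and uploads the result in one #put_object call when the job closes it.
class S3CsvDestination
  def initialize(bucket:, key:, client: nil)
    @bucket = bucket
    @key = key
    @client = client || begin
      require "aws-sdk-s3"
      Aws::S3::Client.new
    end
    @rows = []
  end

  def write(row)
    @rows << row
  end

  # Build the CSV payload: headers come from the first row's keys.
  def to_csv
    CSV.generate do |out|
      out << @rows.first.keys
      @rows.each { |r| out << r.values }
    end
  end

  def close
    return if @rows.empty?
    @client.put_object(bucket: @bucket, key: @key, body: to_csv)
  end
end
```

In a Kiba job you would then declare `source S3CsvSource, bucket: "...", key: "..."` and `destination S3CsvDestination, bucket: "...", key: "..."`, with your column `transform` blocks in between.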
I recommend that you check out the AWS Ruby SDK, and in particular the S3 Examples.
The following links will be particularly helpful:
- https://docs.aws.amazon.com/sdk-for-ruby/v3/developer-guide/s3-example-get-bucket-items.html to list the bucket items
- https://docs.aws.amazon.com/sdk-for-ruby/v3/developer-guide/s3-example-get-bucket-item.html to download the file locally before processing it
- https://docs.aws.amazon.com/sdk-for-ruby/v3/developer-guide/s3-example-upload-bucket-item.html to upload a file back to S3
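The three steps those links cover can be sketched with `aws-sdk-s3` roughly as follows. The helper names (`csv_keys`, `download`, `upload`) and the local-path/bucket arguments are illustrative, not from any library; `client` is an `Aws::S3::Client` (or any compatible stub).

```ruby
# 1. List the bucket items, keeping only the CSV object keys.
def csv_keys(client, bucket)
  client.list_objects_v2(bucket: bucket).contents
        .map(&:key)
        .select { |k| k.end_with?(".csv") }
end

# 2. Download one object to a local file before processing it.
def download(client, bucket, key, path)
  client.get_object(response_target: path, bucket: bucket, key: key)
end

# 3. Upload the processed file back to S3 (possibly to another
#    bucket, or to a different key prefix acting as a "folder").
def upload(client, bucket, key, path)
  client.put_object(bucket: bucket, key: key, body: File.open(path, "rb"))
end
```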
Hope this helps!
Source: https://stackoverflow.com/questions/48257335/is-there-a-sample-implementation-of-kiba-etl-job-using-s3-bucket-with-csv-files