How to use S3DistCp in Java code

Posted by 非 Y 不嫁゛ on 2020-01-24 09:33:11

Question


I want to copy the output of a job from an EMR cluster to Amazon S3 programmatically.

How can I use S3DistCp in Java code to do this?


Answer 1:


Hadoop's ToolRunner can run this, since S3DistCp implements the Hadoop Tool interface.

Below is the usage example:

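import java.util.Arrays;

import org.apache.commons.logging.Log;
import org.apache.commons.logging.LogFactory;
import org.apache.hadoop.util.ToolRunner;
import com.amazon.external.elasticmapreduce.s3distcp.S3DistCp;

public class CustomS3DistCP {
    private static final Log log = LogFactory.getLog(CustomS3DistCP.class);

    public static void main(String[] args) throws Exception {
        // Arrays.toString logs the argument values rather than the array reference.
        log.info("Running with args: " + Arrays.toString(args));

        // ToolRunner handles Hadoop's generic options, then runs S3DistCp;
        // the tool's return value becomes the process exit code.
        System.exit(ToolRunner.run(new S3DistCp(), args));
    }
}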

The s3distcp jar must be on your classpath. You can then call this program from a shell script.
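
If you would rather build the arguments in Java than pass them in from a shell script, something along these lines may work. It is only a sketch: S3DistCpArgs is a hypothetical helper (not part of S3DistCp), and the two locations are placeholders. It relies on the --src and --dest options that S3DistCp accepts, and it leaves the actual ToolRunner call in a comment so the snippet compiles without the Hadoop and s3distcp jars.

```java
import java.util.Arrays;

// Hypothetical helper, not part of S3DistCp: assembles the argument array
// that would be handed to ToolRunner.run(new S3DistCp(), args).
final class S3DistCpArgs {

    private S3DistCpArgs() {}

    // Returns {"--src", src, "--dest", dest}; rejects blank locations early.
    static String[] build(String src, String dest) {
        if (src == null || src.trim().isEmpty()
                || dest == null || dest.trim().isEmpty()) {
            throw new IllegalArgumentException("src and dest must be non-empty");
        }
        return new String[] {"--src", src, "--dest", dest};
    }

    public static void main(String[] unused) {
        // Placeholder locations (assumptions): use your own output path and bucket.
        String[] args = build("hdfs:///output", "s3://my-bucket/output");
        System.out.println(Arrays.toString(args));

        // With the Hadoop and s3distcp jars on the classpath, the copy would be:
        // int exitCode = ToolRunner.run(new S3DistCp(), args);
    }
}
```

Capturing the return value of ToolRunner.run instead of calling System.exit lets the surrounding program decide what to do when the copy fails.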

Hope that helps!



Source: https://stackoverflow.com/questions/18124845/how-to-use-s3distcp-in-java-code
