How to copy data from one HDFS to another HDFS?

日久生厌 2021-01-30 11:30

I have two HDFS setups and want to copy (not migrate or move) some tables from HDFS1 to HDFS2. How can I copy data from one HDFS to another? Is it possible via Sqoop or other c

6 Answers
  •  日久生厌
    2021-01-30 12:20

    distcp copies data to and from Hadoop filesystems in parallel. It is similar to the generic hadoop fs -cp command. Under the hood, distcp runs as a MapReduce job that uses only mappers (no reducers); each mapper copies a share of the files, so the copy proceeds in parallel across the cluster.
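
    Because the copying is done by map tasks, the degree of parallelism can be capped with distcp's -m option (the maximum number of simultaneous maps). A minimal sketch, assuming placeholder paths and an arbitrary limit of 20 maps; the helper name distcp_with_maps is made up for illustration and only prints the command rather than running it:

```shell
# distcp_with_maps MAPS SRC DST: print a distcp command capped at MAPS map tasks.
# The function name is hypothetical; -m is distcp's "maximum simultaneous maps" option.
distcp_with_maps() {
  printf 'hadoop distcp -m %s %s %s\n' "$1" "$2" "$3"
}

# Placeholder paths and map count, chosen only for illustration.
distcp_with_maps 20 /data/dir1 /backup/dir1
```

    A suitable map count depends on the cluster and the available bandwidth, so treat 20 as a stand-in value.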

    Usage:

    • copy one file to another

      % hadoop distcp file1 file2

    • copy directories from one location to another

      % hadoop distcp dir1 dir2

    If dir2 does not exist, distcp creates it and copies the contents of dir1 into it. If dir2 already exists, dir1 is copied under it. The -overwrite option forces files to be overwritten in place within the same folder, and the -update option copies only the files that have changed.

    • transferring data between two HDFS clusters

      % hadoop distcp -update -delete hdfs://nn1/dir1 hdfs://nn2/dir2

    The -delete option deletes files or directories from the destination that are not present in the source. It is only applicable together with -update or -overwrite.
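
    Since -delete removes data on the destination, it can help to look at the exact command before running it. The sketch below wraps the cross-cluster command in a small script that prints it by default and only executes it when DRY_RUN=0 is set. The function name distcp_sync_cmd and the DRY_RUN variable are assumptions made for this example, and nn1, nn2, dir1 and dir2 are the same placeholders used above:

```shell
#!/bin/sh
# distcp_sync_cmd SRC DST: print the distcp command that would sync SRC to DST.
# (Hypothetical helper name; it only formats the command, it does not run it.)
distcp_sync_cmd() {
  printf 'hadoop distcp -update -delete %s %s\n' "$1" "$2"
}

# Placeholder cluster addresses and paths from the example above.
SRC="hdfs://nn1/dir1"
DST="hdfs://nn2/dir2"

if [ "${DRY_RUN:-1}" = "1" ]; then
  # Default: show the command only, because -delete is destructive on DST.
  distcp_sync_cmd "$SRC" "$DST"
else
  # Explicit opt-in: actually run the copy.
  hadoop distcp -update -delete "$SRC" "$DST"
fi
```

    Running the script as-is only prints the command; running it with DRY_RUN=0 in the environment performs the copy.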
