Sqoop on Dataproc cannot export data to Avro format

Posted by 吃可爱长大的小学妹 on 2019-12-20 02:37:26

Question


I want to use Sqoop to pull data from a Postgres database, and I use Google Dataproc to run Sqoop. However, I get an error when I submit the Sqoop job.

I use the following commands:

Create a cluster with the 1.3.24-deb9 image version:

gcloud dataproc clusters create <CLUSTER_NAME> \
    --region=asia-southeast1 --zone=asia-southeast1-a \
    --properties=hive:hive.metastore.warehouse.dir=gs://<BUCKET>/hive-warehouse \
    --master-boot-disk-size=100

Submit a job:

gcloud dataproc jobs submit hadoop --cluster=<CLUSTER_NAME> \
    --region=asia-southeast1 \
    --class=org.apache.sqoop.Sqoop \
    --jars=gs://<BUCKET>/sqoop-1.4.7-hadoop260.jar,gs://<BUCKET>/avro-tools-1.8.2.jar,gs://<BUCKET>/postgresql-42.2.5.jar \
    -- \
    import -Dmapreduce.job.user.classpath.first=true \
    --connect=jdbc:postgresql://<HOST>:5432/<DATABASE> \
    --username=<USER> \
    --password-file=gs://<BUCKET>/pass.txt \
    --target-dir=gs://<BUCKET>/<OUTPUT> \
    --table=<TABLE> \
    --as-avrodatafile

Error

19/02/26 04:52:38 INFO mapreduce.Job: Running job: job_1551156514661_0001
19/02/26 04:52:48 INFO mapreduce.Job: Job job_xxx_0001 running in uber mode : false
19/02/26 04:52:48 INFO mapreduce.Job:  map 0% reduce 0%
19/02/26 04:52:48 INFO mapreduce.Job: Job job_xxx_0001 failed with state FAILED due to: Application application_xxx_0001 failed 2 times due to AM Container for appattempt_xxx_0001_000002 exited with  exitCode: 1
Failing this attempt.Diagnostics: [2019-02-26 04:52:47.771]Exception from container-launch.
Container id: container_xxx_0001_02_000001
Exit code: 1

[2019-02-26 04:52:47.779]Container exited with a non-zero exit code 1. Error file: prelaunch.err.
Last 4096 bytes of prelaunch.err :
Last 4096 bytes of stderr :
log4j:WARN No such property [containerLogFile] in org.apache.hadoop.yarn.ContainerLogAppender.
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/hadoop/yarn/nm-local-dir/usercache/root/filecache/10/libjars/avro-tools-1.8.2.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/lib/hadoop/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
log4j:WARN No appenders could be found for logger (org.apache.hadoop.mapreduce.v2.app.MRAppMaster).


[2019-02-26 04:52:47.780]Container exited with a non-zero exit code 1. Error file: prelaunch.err.
Last 4096 bytes of prelaunch.err :
Last 4096 bytes of stderr :
log4j:WARN No such property [containerLogFile] in org.apache.hadoop.yarn.ContainerLogAppender.
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/hadoop/yarn/nm-local-dir/usercache/root/filecache/10/libjars/avro-tools-1.8.2.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/lib/hadoop/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]

Answer 1:


The issue could be a mismatch between the Avro version in Dataproc's Hadoop (Avro 1.7.7) and the one Sqoop 1.4.7 depends on (Avro 1.8.1).
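One way to see which Avro jars Hadoop ships on a cluster node is to list them by file name. This is a minimal sketch, not a definitive check: it assumes the jars sit directly in a directory such as /usr/lib/hadoop/lib (the path that appears in the SLF4J lines of the log) and that the version is part of the file name. `list_avro_jars` is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: print the names of Avro jars found directly
# inside the given directory, one per line, sorted.
list_avro_jars() {
    dir="$1"
    # Nothing to report if the directory does not exist on this machine.
    [ -d "$dir" ] || return 0
    find "$dir" -maxdepth 1 -name 'avro*.jar' -exec basename {} \; | sort
}
```

On a cluster node this could be run as `list_avro_jars /usr/lib/hadoop/lib`. If the version printed there differs from the avro-tools jar passed in `--jars`, that would be consistent with the mismatch described above.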

You may want to try downgrading Sqoop to 1.4.6, which depends on Avro 1.7, and passing avro-tools-1.7.7.jar instead of avro-tools-1.8.2.jar when you submit the job.
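As a sketch of what that resubmission might look like, the function below reuses the flags from the question and only swaps the Sqoop and avro-tools jars. The file name sqoop-1.4.6-hadoop200.jar is an assumption about how the uploaded Sqoop 1.4.6 jar is named. The helper names `jars_arg` and `submit_sqoop_import` and the environment variables are hypothetical stand-ins for the placeholders in the question.

```shell
#!/bin/sh
# Hypothetical helper: build the comma-separated gs:// list expected by --jars.
jars_arg() {
    bucket="$1"; shift
    out=""
    for jar in "$@"; do
        # Prepend a comma only when the list already has an entry.
        out="${out:+$out,}gs://${bucket}/${jar}"
    done
    printf '%s\n' "$out"
}

# Resubmit the import with Sqoop 1.4.6 and avro-tools 1.7.7.
# CLUSTER_NAME, BUCKET, HOST, DATABASE, USER_NAME, TABLE and OUTPUT must be
# set beforehand; they replace the <...> placeholders used in the question.
# The Sqoop jar file name below is an assumption; adjust it to the real name.
submit_sqoop_import() {
    gcloud dataproc jobs submit hadoop --cluster="$CLUSTER_NAME" \
        --region=asia-southeast1 \
        --class=org.apache.sqoop.Sqoop \
        --jars="$(jars_arg "$BUCKET" sqoop-1.4.6-hadoop200.jar avro-tools-1.7.7.jar postgresql-42.2.5.jar)" \
        -- \
        import -Dmapreduce.job.user.classpath.first=true \
        --connect="jdbc:postgresql://${HOST}:5432/${DATABASE}" \
        --username="$USER_NAME" \
        --password-file="gs://${BUCKET}/pass.txt" \
        --target-dir="gs://${BUCKET}/${OUTPUT}" \
        --table="$TABLE" \
        --as-avrodatafile
}
```

After uploading the two downgraded jars to the bucket and setting the variables, calling `submit_sqoop_import` submits the job. Keeping the jar list in one helper makes it easy to swap versions again if the mismatch persists.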



Source: https://stackoverflow.com/questions/54878798/sqoop-on-dataproc-cannot-export-data-to-avro-format
