问题
I want to use Sqoop to pull data from Postgres database, I use Google Dataproc to execute Sqoop. However, I get an error when I submit the Sqoop job.
I use the following commands:
Create a cluster with 1.3.24-deb9 image version
gcloud dataproc clusters create <CLUSTER_NAME> \
--region=asia-southeast1 --zone=asia-southeast1-a \
--properties=hive:hive.metastore.warehouse.dir=gs://<BUCKET>/hive-warehouse \
--master-boot-disk-size=100
Submit a job
gcloud dataproc jobs submit hadoop --cluster=<CLUSTER_NAME> \
--region=asia-southeast1 \
--class=org.apache.sqoop.Sqoop \
--jars=gs://<BUCKET>/sqoop-1.4.7-hadoop260.jar,gs://<BUCKET>/avro-tools-1.8.2.jar,gs://<BUCKET>/postgresql-42.2.5.jar \
-- \
import -Dmapreduce.job.user.classpath.first=true \
--connect=jdbc:postgresql://<HOST>:5432/<DATABASE> \
--username=<USER> \
--password-file=gs://BUCKET/pass.txt \
--target-dir=gs://<BUCKET>/<OUTPUT> \
--table=<TABLE> \
--as-avrodatafile
Error
19/02/26 04:52:38 INFO mapreduce.Job: Running job: job_1551156514661_0001
19/02/26 04:52:48 INFO mapreduce.Job: Job job_xxx_0001 running in uber mode : false
19/02/26 04:52:48 INFO mapreduce.Job: map 0% reduce 0%
19/02/26 04:52:48 INFO mapreduce.Job: Job job_xxx_0001 failed with state FAILED due to: Application application_xxx_0001 failed 2 times due to AM Container for appattempt_xxx_0001_000002 exited with exitCode: 1
Failing this attempt.Diagnostics: [2019-02-26 04:52:47.771]Exception from container-launch.
Container id: container_xxx_0001_02_000001
Exit code: 1
[2019-02-26 04:52:47.779]Container exited with a non-zero exit code 1. Error file: prelaunch.err.
Last 4096 bytes of prelaunch.err :
Last 4096 bytes of stderr :
log4j:WARN No such property [containerLogFile] in org.apache.hadoop.yarn.ContainerLogAppender.
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/hadoop/yarn/nm-local-dir/usercache/root/filecache/10/libjars/avro-tools-1.8.2.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/lib/hadoop/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
log4j:WARN No appenders could be found for logger (org.apache.hadoop.mapreduce.v2.app.MRAppMaster).
[2019-02-26 04:52:47.780]Container exited with a non-zero exit code 1. Error file: prelaunch.err.
Last 4096 bytes of prelaunch.err :
Last 4096 bytes of stderr :
log4j:WARN No such property [containerLogFile] in org.apache.hadoop.yarn.ContainerLogAppender.
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/hadoop/yarn/nm-local-dir/usercache/root/filecache/10/libjars/avro-tools-1.8.2.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/lib/hadoop/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
回答1:
The issue could be in different Avro versions in Dataproc's Hadoop (Avro 1.7.7) and Sqoop 1.4.7 (Avro 1.8.1).
You may want to try to downgrade Sqoop to 1.4.6 that depends on Avro 1.7 and use avro-tools-1.7.7.jar
during job submission.
来源:https://stackoverflow.com/questions/54878798/sqoop-on-dataproc-cannot-export-data-to-avro-format