I want to use Sqoop to pull data from a Postgres database, and I use Google Dataproc to run Sqoop. However, I get an error when I submit the Sqoop job.
I use the followin
The issue could be a mismatch between Avro versions: Dataproc's Hadoop ships Avro 1.7.7, while Sqoop 1.4.7 depends on Avro 1.8.1.
You may want to try downgrading to Sqoop 1.4.6, which depends on Avro 1.7, and passing avro-tools-1.7.7.jar
during job submission.
Edited:
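To resolve the class-loading issue, set mapreduce.job.classloader=true, which makes MapReduce load the job's classes in an isolated classloader so the Avro version shipped with the job is used instead of Hadoop's,
when submitting the Dataproc job: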
gcloud dataproc jobs submit hadoop --cluster= \
--class=org.apache.sqoop.Sqoop \
--jars=gs:///sqoop-1.4.7-hadoop260.jar \
--properties=mapreduce.job.classloader=true \
-- \
. . .
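
If it helps to check the flags before submitting anything, here is a minimal dry-run sketch that only prints the command. The function name build_submit_cmd, the cluster name my-cluster, the bucket my-bucket, and the trailing Sqoop argument (version) are placeholders I made up for illustration; substitute your own values. The same helper could be used for the Sqoop 1.4.6 option by passing a different comma-separated jar list.

```shell
#!/usr/bin/env bash
# Dry run: prints the gcloud command instead of executing it.
# The cluster name, bucket and Sqoop arguments below are hypothetical.

build_submit_cmd() {
  local cluster="$1"   # Dataproc cluster name
  local jars="$2"      # comma-separated gs:// jar URIs
  shift 2              # everything left is passed to Sqoop unchanged
  local cmd=(gcloud dataproc jobs submit hadoop
    "--cluster=${cluster}"
    "--class=org.apache.sqoop.Sqoop"
    "--jars=${jars}"
    "--properties=mapreduce.job.classloader=true"
    --)
  cmd+=("$@")
  # Join the array with spaces and print it on one line.
  printf '%s\n' "${cmd[*]}"
}

build_submit_cmd "my-cluster" \
  "gs://my-bucket/sqoop-1.4.7-hadoop260.jar" \
  version
```

Once the printed command looks right, you can run it as-is or change the printf line to execute "${cmd[@]}" directly.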