Write data to Redshift using Spark 2.0.1

Submitted by 冷暖自知 on 2020-01-24 15:45:09


I am doing a POC in which I want to write a simple data set to Redshift.

I have the following sbt file:

name := "Spark_POC"

version := "1.0"

scalaVersion := "2.10.6"

libraryDependencies += "org.apache.spark" % "spark-core_2.10" % "2.0.1"

libraryDependencies += "org.apache.spark" % "spark-sql_2.10" % "2.0.1"

resolvers += "jitpack" at "https://jitpack.io"

libraryDependencies += "com.databricks" %% "spark-redshift" % "3.0.0-preview1"

and the following code:

import org.apache.spark.SparkConf
import org.apache.spark.sql.SparkSession

object Main extends App {

  val conf = new SparkConf().setAppName("Hello World").setMaster("local[2]")

  System.setProperty("hadoop.home.dir", "C:\\Users\\Srdjan Nikitovic\\Desktop\\scala\\hadoop")

  val spark = SparkSession
    .builder()
    .appName("Spark 1")
    .config(conf)
    .getOrCreate()

  val tempS3Dir = "s3n://access_key:secret_access_key@bucket_location"

  spark.sparkContext.hadoopConfiguration.set("fs.s3n.impl", "org.apache.hadoop.fs.s3native.NativeS3FileSystem")
  spark.sparkContext.hadoopConfiguration.set("fs.s3n.awsAccessKeyId", "access_key")
  spark.sparkContext.hadoopConfiguration.set("fs.s3n.awsSecretAccessKey", "secret_access_key")

  // Placeholder data set; substitute the data you actually want to write.
  val data = spark.range(3).toDF("id")

  data.write
    .format("com.databricks.spark.redshift")
    .option("url", "jdbc:redshift://redshift_server:5439/database?user=user_name&password=password")
    .option("dbtable", "public.testSpark")
    .option("tempdir", tempS3Dir)
    .save()
}

I am running the code from a local Windows machine, through IntelliJ.

I get the following error:

Exception in thread "main" java.lang.ClassNotFoundException: Could not load an Amazon Redshift JDBC driver; see the README for instructions on downloading and configuring the official Amazon driver.

I have tried almost all versions of the Spark-Redshift driver (1.0.0, 2.0.0, 2.0.1 and now 3.0.0-PREVIEW), and I can't get this code to work.

Any help?


You first need to download the Redshift JDBC driver from Amazon.

Then you must tell Spark about it in the environment where this code is running. E.g. for a spark-shell running on EMR:

spark-shell … --jars /usr/share/aws/redshift/jdbc/RedshiftJDBC41.jar
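
For a local sbt project run from IntelliJ, as in the question, there is no spark-shell command line to pass --jars to, so the driver JAR has to be on the application's own classpath. One option is sbt's unmanaged dependencies: JARs placed in the project's lib directory are picked up automatically. The following is a minimal sketch for checking that the driver is visible before any Spark code runs. The class name com.amazon.redshift.jdbc41.Driver is an assumption that matches RedshiftJDBC41.jar (other driver JARs use different class names), and DriverCheck and driverAvailable are hypothetical names used only for illustration.

```scala
import scala.util.Try

object DriverCheck {
  // Assumed class name for the JDBC 4.1 driver JAR (RedshiftJDBC41.jar);
  // adjust it to match the JAR you actually downloaded.
  val RedshiftDriverClass = "com.amazon.redshift.jdbc41.Driver"

  // Returns true if the named class can be loaded from the current classpath.
  def driverAvailable(className: String): Boolean =
    Try(Class.forName(className)).isSuccess

  def main(args: Array[String]): Unit = {
    if (driverAvailable(RedshiftDriverClass))
      println(s"Found $RedshiftDriverClass on the classpath")
    else
      println(s"$RedshiftDriverClass is missing; add the driver JAR to the classpath")
  }
}
```

If the check reports the class as missing, the ClassNotFoundException above is expected, whichever spark-redshift version is used. The spark-redshift README also describes a jdbcdriver option for naming the driver class explicitly, which may help when the class is present but not detected automatically.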

