Save CSV file to HBase table using Spark and Phoenix

Submitted by 瘦欲 on 2019-12-13 15:04:46

Question


Can someone point me to a working example of saving a CSV file to an HBase table using Spark 2.2? These are the options I tried, and all of them failed (note: all of them work with Spark 1.6 for me):

  1. phoenix-spark
  2. hbase-spark
  3. it.nerdammer.bigdata : spark-hbase-connector_2.10

After fixing everything else, all of them eventually fail with an error similar to this Spark HBase one.

Thanks


Answer 1:


Add the parameters below to your Spark job. Even though the job writes to HBase rather than to files, the Hadoop output committer still expects an output directory to be set, which Spark 2.x no longer does for you:

spark-submit \
--conf "spark.yarn.stagingDir=/somelocation" \
--conf "spark.hadoop.mapreduce.output.fileoutputformat.outputdir=/s‌​omelocation" \
--conf "spark.hadoop.mapred.output.dir=/somelocation"



Answer 2:


Phoenix has a Spark plugin and a JDBC thin client that can connect to (read/write) HBase; examples are at https://phoenix.apache.org/phoenix_spark.html

Option 1: Connect via ZooKeeper URL - phoenix-spark plugin

import org.apache.spark.SparkContext
import org.apache.spark.sql.SQLContext
import org.apache.phoenix.spark._

val sc = new SparkContext("local", "phoenix-test")
val sqlContext = new SQLContext(sc)

// Load the Phoenix table TABLE1 as a DataFrame, pointing the connector
// at the ZooKeeper quorum of the HBase cluster.
val df = sqlContext.read
  .format("org.apache.phoenix.spark")
  .options(Map("table" -> "TABLE1", "zkUrl" -> "phoenix-server:2181"))
  .load()

// Filter and project, then print the matching rows.
df
  .filter(df("COL1") === "test_row_1" && df("ID") === 1L)
  .select(df("ID"))
  .show()
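
Since the question is about saving a CSV, here is a minimal sketch of the write path with the same plugin: read the CSV into a DataFrame, then write it out through the phoenix-spark data source. The file path, the table name OUTPUT_TABLE, and the zkUrl are placeholders for your environment, and the target table must already exist in Phoenix with matching column names.

import org.apache.spark.sql.{SaveMode, SparkSession}

val spark = SparkSession.builder()
  .appName("csv-to-phoenix")
  .getOrCreate()

// Read the CSV; header column names should match the Phoenix table's
// columns (Phoenix upper-cases unquoted identifiers).
val csvDf = spark.read
  .option("header", "true")
  .option("inferSchema", "true")
  .csv("/path/to/input.csv")  // placeholder path

// Write through the phoenix-spark data source. This issues UPSERTs,
// so the target table must already exist in Phoenix.
csvDf.write
  .format("org.apache.phoenix.spark")
  .mode(SaveMode.Overwrite)
  .options(Map("table" -> "OUTPUT_TABLE", "zkUrl" -> "phoenix-server:2181"))
  .save()

Note that the plugin only accepts SaveMode.Overwrite, and the write is still an upsert per row rather than a truncate-and-reload.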

Option 2: Use the JDBC thin client provided by the Phoenix Query Server

More info at https://phoenix.apache.org/server.html

jdbc:phoenix:thin:url=http://localhost:8765;serialization=PROTOBUF
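
A minimal sketch of using that URL from Scala with plain JDBC, assuming the Phoenix Query Server is running on localhost:8765 and a hypothetical table MY_TABLE with columns ID and COL1:

import java.sql.DriverManager

// Thin-client driver, shipped in the phoenix-queryserver-client jar.
Class.forName("org.apache.phoenix.queryserver.client.Driver")

val conn = DriverManager.getConnection(
  "jdbc:phoenix:thin:url=http://localhost:8765;serialization=PROTOBUF")

// Phoenix uses UPSERT rather than INSERT; table/columns are hypothetical.
val stmt = conn.prepareStatement(
  "UPSERT INTO MY_TABLE (ID, COL1) VALUES (?, ?)")
stmt.setLong(1, 1L)
stmt.setString(2, "test_row_1")
stmt.executeUpdate()

// Phoenix connections do not auto-commit by default.
conn.commit()
conn.close()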


Source: https://stackoverflow.com/questions/46477932/save-csv-file-to-hbase-table-using-spark-and-phoenix
