Question
I am able to read the messages from Kafka using the below code:
val ssc = new StreamingContext(sc, Seconds(50))
val topicmap = Map("test" -> 1)
val lines = KafkaUtils.createStream(ssc, "127.0.0.1:2181", "test-consumer-group", topicmap)
However, I am trying to read each message from Kafka and put it into HBase. This is my code for writing to HBase, but it does not work:
lines.foreachRDD(rdd => {
  rdd.foreach(record => {
    val i = +1
    val hConf = new HBaseConfiguration()
    val hTable = new HTable(hConf, "test")
    val thePut = new Put(Bytes.toBytes(i))
    thePut.add(Bytes.toBytes("cf"), Bytes.toBytes("a"), Bytes.toBytes(record))
  })
})
Answer 1:
Well, you are not actually executing the Put; you are merely creating a Put request and adding data to it. What you are missing is this call:
hTable.put(thePut);
Answer 2:
Adding another answer: you can use foreachPartition to set up the HBase connection once per partition on the executor, instead of once per record, which is a costly operation.
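lines.foreachRDD(rdd => {
  rdd.foreachPartition(iter => {
    // Create the configuration and table handle once per partition, not once per record
    val hConf = new HBaseConfiguration()
    val hTable = new HTable(hConf, "test")
    iter.foreach(record => {
      // Note: `+1` is just the literal 1, so every Put targets the same row key
      val i = +1
      val thePut = new Put(Bytes.toBytes(i))
      // createStream yields (key, message) pairs; record._2 is the message
      thePut.add(Bytes.toBytes("cf"), Bytes.toBytes("a"), Bytes.toBytes(record._2))
      // missing part in your code
      hTable.put(thePut)
    })
    // Release the table handle once the partition has been written
    hTable.close()
  })
})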
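To make the cost difference concrete, here is a minimal plain-Scala sketch with no Spark or HBase dependency. It only counts how many table handles get created under each approach. `ConnectionCost`, `FakeTable`, `perRecord`, and `perPartition` are hypothetical names invented for this illustration, and an RDD's partitions are modeled as a simple `Seq[Seq[String]]`.

```scala
object ConnectionCost {
  // Number of FakeTable handles created since the last reset
  var opened = 0

  // Hypothetical stand-in for an HBase table handle (not a real HBase class)
  final class FakeTable {
    opened += 1 // constructing a handle models the costly connection setup
    private var rows = Vector.empty[String]
    def put(value: String): Unit = rows = rows :+ value
  }

  // One handle per record, as in the rdd.foreach version
  def perRecord(partitions: Seq[Seq[String]]): Int = {
    opened = 0
    partitions.foreach { part =>
      part.foreach { record =>
        val table = new FakeTable
        table.put(record)
      }
    }
    opened
  }

  // One handle per partition, as in the foreachPartition version
  def perPartition(partitions: Seq[Seq[String]]): Int = {
    opened = 0
    partitions.foreach { part =>
      val table = new FakeTable
      part.foreach(record => table.put(record))
    }
    opened
  }
}
```

With two partitions holding five records in total, `perRecord` creates five handles while `perPartition` creates two. In a real job the saving grows with the number of records per partition.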
Source: https://stackoverflow.com/questions/27246386/spark-rdd-write-to-hbase