Question
I created an HBase table whose column family keeps up to 10 versions of each cell:
create 'tablename',{NAME => 'cf', VERSIONS => 10}
and inserted two rows (row1 and row2):
put 'tablename','row1','cf:id','row1id'
put 'tablename','row1','cf:name','row1name'
put 'tablename','row2','cf:id','row2id'
put 'tablename','row2','cf:name','row2name'
put 'tablename','row2','cf:name','row2nameupdate'
put 'tablename','row2','cf:name','row2nameupdateagain'
put 'tablename','row2','cf:name','row2nameupdateonemoretime'
I then read the data back with a scan:
scan 'tablename',{RAW => true, VERSIONS => 10}
I can see every version of the data.
Next, I created a Hive external table that points to this HBase table:
CREATE EXTERNAL TABLE hive_timestampupdate(key int, value string)
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,cf:name")
TBLPROPERTIES ("hbase.table.name" = "tablename");
When I query the table hive_timestampupdate, I can see the data from the HBase table:
select * from hive_timestampupdate;
I want to query this data by timestamp. Is there a way to query from Hive based on the HBase cell timestamp?
Answer 1:
Unfortunately, no. According to the Hive HBase Integration documentation, there is currently no way to access the HBase timestamp attribute, and queries always access data with the latest timestamp.
There are some JIRA issues about timestamp-related functionality, but they don't really do what you are asking, and they haven't gotten a great reception :(
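If reading the data outside Hive is acceptable, the HBase shell can filter on timestamps directly. The following is only a sketch against the table from the question, not a Hive solution. The timestamp values are made-up placeholders in epoch milliseconds, so substitute the ones your own scan prints.

```
# List up to 10 stored versions of one cell, each shown with its timestamp
get 'tablename', 'row2', {COLUMN => 'cf:name', VERSIONS => 10}

# Return only cells written inside a time window (start inclusive, end exclusive).
# Both timestamps are placeholder values, not taken from the question.
scan 'tablename', {COLUMNS => 'cf:name', TIMERANGE => [1427846400000, 1427932800000], VERSIONS => 10}

# Fetch the version of a cell written at one exact timestamp (placeholder value)
get 'tablename', 'row2', {COLUMN => 'cf:name', TIMESTAMP => 1427846400000}
```

These commands run in the HBase shell rather than in Hive, so any joins or aggregation over the results would have to happen in a separate step.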
Source: https://stackoverflow.com/questions/29371987/accessing-hbase-table-data-from-hive-based-on-time-stamp