Continuous data migration from MySQL to HBase

好久不见. Submitted on 2019-12-24 18:05:12

Question


I have installed Hadoop and HBase for real-time analytics. The problem I face is migrating data online from MySQL to HBase.

The Sqoop tool is useful for bulk data migrations. Is there any way to transfer data from MySQL to HBase online, that is, as soon as an insert, update, or delete happens, so that real-time (not near-real-time) analytics can be achieved?

Please help me with this.


Answer 1:


I think you are facing the task of setting up replication between two different DBMSs. This is a case where the native replication mechanism cannot be used directly. The simplest solution would be to create a set of triggers on the tables you want to replicate and have them write the data to be replicated into an additional table. You can then monitor this table and apply the changes to HBase.
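The trigger-plus-change-table idea can be sketched as follows. This is only an illustration under loud assumptions: `sqlite3` stands in for MySQL so the example runs anywhere (MySQL trigger syntax differs slightly), a plain dict stands in for the HBase table, and the names `users`, `users_changelog`, `cf:name`, and `apply_changes` are hypothetical.

```python
import sqlite3

# ASSUMPTION: sqlite3 stands in for the MySQL source database.
src = sqlite3.connect(":memory:")
src.executescript("""
CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT);

-- Hypothetical change-log table filled by the triggers below.
CREATE TABLE users_changelog (
    seq    INTEGER PRIMARY KEY AUTOINCREMENT,
    op     TEXT NOT NULL,       -- 'I', 'U' or 'D'
    row_id INTEGER NOT NULL,
    name   TEXT
);

CREATE TRIGGER users_ai AFTER INSERT ON users BEGIN
    INSERT INTO users_changelog (op, row_id, name) VALUES ('I', NEW.id, NEW.name);
END;
CREATE TRIGGER users_au AFTER UPDATE ON users BEGIN
    INSERT INTO users_changelog (op, row_id, name) VALUES ('U', NEW.id, NEW.name);
END;
CREATE TRIGGER users_ad AFTER DELETE ON users BEGIN
    INSERT INTO users_changelog (op, row_id, name) VALUES ('D', OLD.id, NULL);
END;
""")

# ASSUMPTION: a dict stands in for an HBase table (row key -> {column: value}).
hbase_table = {}


def apply_changes(conn, target, last_seq):
    """Apply change-log rows newer than last_seq, in order; return the new checkpoint."""
    rows = conn.execute(
        "SELECT seq, op, row_id, name FROM users_changelog "
        "WHERE seq > ? ORDER BY seq",
        (last_seq,),
    ).fetchall()
    for seq, op, row_id, name in rows:
        row_key = str(row_id).encode()
        if op == "D":
            target.pop(row_key, None)  # delete the row if it exists
        else:
            # Inserts and updates are both applied as an overwrite (upsert).
            target[row_key] = {b"cf:name": (name or "").encode()}
        last_seq = seq
    return last_seq


# Changes on the source side are captured by the triggers...
src.execute("INSERT INTO users (id, name) VALUES (1, 'alice')")
src.execute("INSERT INTO users (id, name) VALUES (2, 'bob')")
src.execute("UPDATE users SET name = 'alicia' WHERE id = 1")
src.execute("DELETE FROM users WHERE id = 2")

# ...and one monitoring pass replays them against the target.
checkpoint = apply_changes(src, hbase_table, 0)
print(hbase_table, checkpoint)
```

In a real setup the monitoring pass would run repeatedly, persist the checkpoint, and write through an HBase client instead of a dict. Note that polling adds a delay of up to one polling interval, and this sketch does not handle updates that change the primary key.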
A more robust, but more complicated, solution would be to parse the MySQL log used by its native replication mechanism (the binary log) and apply the changes to HBase.
At the same time, it is not clear to me how HBase will give you real-time analytics. I wrote a bit about this issue here: Group by In HBase




Answer 2:


To add more information about where to use Hive in your project: there are multiple setups in which Hive and HBase can work together. For instance, if you use AWS, you can install HBase and Hive on the same Hadoop cluster and run join queries across Hive tables and HBase tables. Alternatively, you can put HBase and Hive on two different clusters and reference the HBase data from your Hive queries. If you use the Cloudera distribution, you can do the same thing.

References:

  • http://aws.typepad.com/aws/2012/06/apache-hbase-on-emr.html
  • http://www.cloudera.com/blog/2010/06/integrating-hive-and-hbase


Source: https://stackoverflow.com/questions/9919638/continuous-data-migration-from-mysql-to-hbase
