I need to dump a single HBase table to a text file or CSV. I have looked at the scan, export and get commands in HBase.
There are many ways to get data out of an HBase table, such as running the Export MapReduce job. You can read about this and other options here: http://blog.sematext.com/2011/03/11/hbase-backup-options/ If you want to control which rows/cells are written, you can do that with a Pig script:
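-- CSVExcelStorage ships in piggybank; adjust the jar path for your installation.
REGISTER piggybank.jar;
-- HBaseStorage takes the columns as ONE space-separated string, followed by the options string.
x = LOAD 'hbase://<sourceTableName>' USING org.apache.pig.backend.hadoop.hbase.HBaseStorage('<family:qualifier> <family:qualifier2>', '-loadKey true') AS (ID:bytearray, Value1:chararray, Value2:chararray);
-- Optional 2nd argument: 'YES_MULTILINE' or 'NO_MULTILINE'; optional 3rd: 'UNIX', 'WINDOWS' or 'UNCHANGED'.
STORE x INTO '<destFileName>' USING org.apache.pig.piggybank.storage.CSVExcelStorage('<delimiter>', 'NO_MULTILINE', 'UNIX');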
If you need more control than that, you can do it in Java code. I hope the link below is helpful:
https://gist.github.com/sakthiinfotec/102fca54c91b411f626a
This backs up a single HBase table in CSV format on the local filesystem. You need to pre-define the list of columns you want from a single column family. The code needs the HBase client jars to connect to the table, along with the OpenCSV jar to write the CSV records.
It assumes that all the columns hold string values only.
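To give an idea of the CSV-writing step without pulling in any jars, here is a minimal sketch using only the Java standard library. It is not the code from the gist: the class name CsvRowSketch, the helper names and the sample data are all made up for illustration, and the HBase scan itself is left out. In a real export the cell values would come from something like Bytes.toString(result.getValue(family, qualifier)) for each pre-defined column, and OpenCSV would normally handle the quoting for you.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, for illustration only: turns one row (row key plus
// string cell values) into a CSV line for a pre-defined list of columns.
class CsvRowSketch {

    // Quote a field if it contains a comma, a double quote, or a line break;
    // embedded double quotes are doubled. Missing cells become empty fields.
    static String escape(String field) {
        if (field == null) {
            return "";
        }
        boolean needsQuotes = field.contains(",") || field.contains("\"")
                || field.contains("\n") || field.contains("\r");
        String escaped = field.replace("\"", "\"\"");
        return needsQuotes ? "\"" + escaped + "\"" : escaped;
    }

    // Build one CSV line: the row key first, then the columns in the given order.
    static String toCsvLine(String rowKey, List<String> columns, Map<String, String> cells) {
        List<String> fields = new ArrayList<>();
        fields.add(escape(rowKey));
        for (String column : columns) {
            fields.add(escape(cells.get(column)));
        }
        return String.join(",", fields);
    }

    public static void main(String[] args) {
        // Assumed sample data standing in for one scanned HBase row.
        List<String> columns = Arrays.asList("name", "city");
        Map<String, String> cells = new LinkedHashMap<>();
        cells.put("name", "Doe, Jane");
        cells.put("city", "Paris");

        System.out.println("rowkey," + String.join(",", columns));
        System.out.println(toCsvLine("row1", columns, cells));
    }
}
```

Columns that are missing from a row come out as empty fields, so every line keeps the same number of columns as the header.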