How does HBase enable Random Access to HDFS?

感情败类 2021-02-08 12:42

Given that HBase is a database whose files are stored in HDFS, how does it enable random access to a single piece of data within HDFS? What mechanism makes this possible?

2 Answers
  • 2021-02-08 13:38

    HBase accesses its files in HDFS through the HFile format. See the HBase reference guide for details: http://hbase.apache.org/book/hfilev2.html

  • 2021-02-08 13:43

    HBase stores data in HFiles, whose entries are sorted by key and indexed. Given an arbitrary row key, the client can determine which region server to ask for the row. The region server determines which region holds the row, then does a binary search over that region's indexed data to reach the correct row. This works because enough metadata is kept to drive the search: the number of blocks, the block size, and the start and end keys.
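
The first step described above, mapping a row key to a region, can be sketched roughly as follows. This is a minimal illustration, not HBase client code: the `REGIONS` table, the server names, and `locate_region` are all hypothetical, and the sketch only assumes that regions are sorted by start key and that the first region starts at the empty key.

```python
from bisect import bisect_right

# Hypothetical region table: (start_key, region_server), sorted by start key.
# The first region starts at the empty key, so every key falls in some region.
REGIONS = [
    (b"", "rs1.example.com"),
    (b"g", "rs2.example.com"),
    (b"n", "rs3.example.com"),
    (b"t", "rs1.example.com"),
]


def locate_region(row_key: bytes):
    """Return (region_index, server) for the region whose range covers row_key."""
    start_keys = [start for start, _ in REGIONS]
    # bisect_right finds the first start key greater than row_key;
    # the region just before that position is the one covering row_key.
    idx = bisect_right(start_keys, row_key) - 1
    return idx, REGIONS[idx][1]
```

Calling `locate_region(b"house")` picks the region that starts at `b"g"`, because `b"house"` sorts after `b"g"` and before `b"n"`. The lookup is a binary search over start keys, so its cost grows with the logarithm of the number of regions rather than with the size of the table.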

    For example, a table may contain 10 TB of data, but the table is split into regions of, say, 4 GB each. Each region has a start key and an end key. The client can fetch the list of regions for the table and determine which region covers the key it is looking for. A region's data is in turn stored in blocks, so the region server can binary-search over the blocks instead of scanning everything. A block is essentially a long sorted list of (key, attribute, value, version) entries. If you know the starting key of each block, you can determine which file to access and the byte offset of the block to start reading from, so each step of the search reads only one small part of the file.
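
The block-level lookup can be illustrated with a toy file layout. This is only a sketch under stated assumptions: the `key=value` line encoding, `build_file`, and `get` are invented for illustration and are not the real HFile format or API. The one idea it takes from the answer is an index of (first key, byte offset) per block that is binary-searched before seeking.

```python
from bisect import bisect_right
import io


def build_file(blocks):
    """Serialize sorted blocks of (key, value) pairs; return (data, index).

    index holds one (first_key, byte_offset, length) entry per block.
    """
    buf, index = io.BytesIO(), []
    for block in blocks:
        payload = "\n".join(f"{k}={v}" for k, v in block).encode()
        index.append((block[0][0], buf.tell(), len(payload)))
        buf.write(payload)
    return buf.getvalue(), index


def get(data, index, key):
    """Binary-search the block index, then read only the matching block."""
    first_keys = [first for first, _, _ in index]
    i = bisect_right(first_keys, key) - 1
    if i < 0:
        return None  # key sorts before the first block
    _, offset, length = index[i]
    f = io.BytesIO(data)
    f.seek(offset)  # jump straight to the block's byte offset
    for line in f.read(length).decode().split("\n"):
        k, v = line.split("=", 1)
        if k == key:
            return v
    return None  # key would belong in this block but is absent


data, index = build_file([
    [("apple", "1"), ("banana", "2")],
    [("cherry", "3"), ("grape", "4")],
    [("melon", "5"), ("peach", "6")],
])
```

With this data, `get(data, index, "grape")` seeks directly to the second block and never touches the other two. The same shape applies when the bytes live in HDFS: the index is small enough to keep in memory, and only one block needs to be read per lookup.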
