Google Cloud Bigtable vs Google Cloud Datastore

末鹿安然 提交于 2019-11-28 14:58:17

问题


What is the difference between Google Cloud Bigtable and Google Cloud Datastore / App Engine datastore, and what are the main practical advantages/disadvantages? AFAIK Cloud Datastore is build on top of Bigtable.


回答1:


Based on experience with Datastore and reading the Bigtable docs, the main differences are:

  • Bigtable was originally designed for HBase compatibility, but now has client libraries in multiple languages. Datastore was originally more geared towards Python/Java/Go web app developers (originally App Engine)
  • Bigtable is 'a bit more IaaS' than Datastore in that it's not 'just there' but requires a cluster to be configured.
  • Bigtable supports only one index - the 'row key' (the entity key in Datastore)
    • This means queries are on the Key, unlike Datastore's indexed properties
  • Bigtable supports atomicity only on a single row - there are no transactions
  • Mutations and deletions appear not to be atomic in Bigtable, whereas Datastore provides eventual and strong consistency, depending on the read/query method
  • The billing model is very different:
    • Datastore charges for read/write operations, storage and bandwidth
    • Bigtable charges for 'nodes', storage and bandwidth



回答2:


Bigtable is optimized for high volumes of data and analytics

  • Cloud Bigtable doesn’t replicate data across zones or regions (data within a single cluster is replicated and durable), which means Bigtable is faster and more efficient, and costs are much lower, though it is less durable and available in the default configuration
  • It uses the HBase API - there’s no risk of lock-in or new paradigms to learn
  • It is integrated with the open-source Big Data tools, meaning you can analyze the data stored in Bigtable in most analytics tools customers use (Hadoop, Spark, etc.)
  • Bigtable is indexed by a single Row Key
  • Bigtable is in a single zone

Cloud Bigtable is designed for larger companies and enterprises who often have larger data needs with complex backend workloads.

Datastore is optimized to serve high-value transactional data to applications

  • Cloud Datastore has extremely high availability with replication and data synchronization
  • Datastore, because of its versatility and high availability, is more expensive
  • Datastore is slower writing data due to synchronous replication
  • Datastore has much better functionality around transactions and queries (since secondary indexes exist)



回答3:


Bigtable and Datastore are extremely different. Yes, the datastore is build on top of Bigtable, but that does not make it anything like it. That is kind of like saying a car is build on top of wheels, and so a car is not much different from wheels.

Bigtable and Datastore provide very different data models and very different semantics in how the data is changed.

The main difference is that the Datastore provides SQL-database-like ACID transactions on subsets of the data known as entity groups (though the query language GQL is much more restrictive than SQL). Bigtable is strictly NoSQL and comes with much weaker guarantees.




回答4:


If you read papers, BigTable is this and Datastore is MegaStore. Datastore is BigTable plus replication, transaction, and index. (and is much more expensive).




回答5:


I am going to try to summarize all the answers above plus what is given in Coursea Google Cloud Platform Big Data and Machine Learning Fundamentals

+---------------------+------------------------------------------------------------------+------------------------------------------+--+
|      Category       |                             BigTable                             |                Datastore                 |  |
+---------------------+------------------------------------------------------------------+------------------------------------------+--+
| Technology          | Based on HBase(uses HBase API)                                   | Uses BigTable itself                     |  |
| ----------------    |                                                                  |                                          |  |
| Access Mataphor     | Key/Value (column-families) like Hbase                           | Persistent hashmap                       |  |
| ----------------    |                                                                  |                                          |  |
| Read                | Scan Rows                                                        | Filter Objects on property               |  |
| ----------------    |                                                                  |                                          |  |
| Write               | Put Row                                                          | Put Object                               |  |
| ----------------    |                                                                  |                                          |  |
| Update Granularity  | can't update row ( you should write a new row, can't update one) | can update attribute                     |  |
| ----------------    |                                                                  |                                          |  |
| Capacity            | Petabytes                                                        | Terbytes                                 |  |
| ----------------    |                                                                  |                                          |  |
| Index               | Index key only (you should properly design the key)              | You can index any property of the object |  |
| Usage and use cases | High throughput, scalable flatten data                           | Structured data for Google App Engine    |  |
+---------------------+------------------------------------------------------------------+------------------------------------------+--+

Check this image too:




回答6:


A relatively minor point to consider, as of November 2016, bigtable python client library is still in Alpha, which means the future change might not be backward compatible. Also, bigtable python library is not compatible with App Engine's standard environment. You have to use the flexible one.




回答7:


I just found this useful analogy buried in the length page about eventual consistency in the Datastore documentation (emphasis mine):

One practice is to combine Cloud Datastore and BigQuery to fulfill different business requirements. Use Cloud Datastore for online transactional processing (OLTP) required for core application logic and use BigQuery for online analytical processing (OLAP) for backend operations. It may be necessary to implement a continuous data export flow from Cloud Datastore to BigQuery to move the data necessary for those queries.



来源:https://stackoverflow.com/questions/30085326/google-cloud-bigtable-vs-google-cloud-datastore

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!