Bigtable database design theory

野趣味 2021-01-31 23:37

I am very well versed in the theory and practice of relational database design.

I know what works and what doesn't, what is performant and what is maintainable (almost

4 answers
  • 2021-01-31 23:53

    ... are there books or academic research papers on designing databases for bigtable and similar database paradigms?

    Well, Bigtable is essentially a database itself, so I take it that your question is more about how to model, and to some extent design, your schema in these Bigtable-like databases. More specifically, you would like to know how to do this on Google's App Engine.

    With GAE you will be using the Datastore API, which adds a significant layer of abstraction over Bigtable, so to some extent you don't have to worry about low-level details as you would if you were using something like HBase. There are a few posts on SO (here's a great answer by a Google engineer who I think is part of the GAE team) that will guide you and offer hints on how to approach this new type of database system.

    Helpful Info:

    1. HBase was inspired by Google's Bigtable paper
    2. Hypertable was also inspired by the Bigtable paper
    3. Cassandra's data model was inspired by the Bigtable paper
    4. Hadoop was inspired by Google's GFS and MapReduce papers
  • 2021-02-01 00:04

    There's not much recent literature on non-relational database design that I'm aware of - though you might gain some valuable insights by digging up old papers from before the relational paradigm 'won'.

    The basic insight of databases like Bigtable is, of course, that in web apps and other read-heavy applications, given the availability of cheap disk storage, the best approach is to optimize for reads and do more work on writes. Normalization does the opposite - it minimizes replication of data on disk, which makes writes easier and cheaper, but reads harder. Pretty much all the differences from relational database design arise from this single fact.
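
    As a rough illustration of that trade-off, here is a minimal sketch using plain Python dicts as stand-in tables. The table and function names (`users`, `posts_by_id`, `rename_user`, and so on) are hypothetical and not part of any Bigtable or Datastore API; the point is only to show where the work moves when data is duplicated at write time.

```python
# Toy in-memory "tables"; every name here is a hypothetical illustration.

# Normalized: each fact is stored once, so a read has to combine tables.
users = {1: {"name": "alice"}}
posts = {100: {"author_id": 1, "text": "hello"}}


def read_post_normalized(post_id):
    post = posts[post_id]
    author = users[post["author_id"]]  # second lookup: a join at read time
    return {"author": author["name"], "text": post["text"]}


# Denormalized: the write copies the author name into the post row,
# so the read is a single key lookup.
posts_by_id = {}


def write_post_denormalized(post_id, author_id, text):
    posts_by_id[post_id] = {
        "author_id": author_id,
        "author": users[author_id]["name"],  # duplicated on purpose
        "text": text,
    }


def read_post_denormalized(post_id):
    row = posts_by_id[post_id]
    return {"author": row["author"], "text": row["text"]}


def rename_user(user_id, new_name):
    # The extra work lands on the write path: every copy must be updated.
    users[user_id]["name"] = new_name
    for row in posts_by_id.values():
        if row["author_id"] == user_id:
            row["author"] = new_name


write_post_denormalized(100, 1, "hello")
```

    Both read functions return the same result, but the denormalized read touches one row, while `rename_user` has to visit every post by that user.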

    The other consequence - one that could use more attention - is that when you optimize for reads, you have to know what type of reads you will be engaging in ahead of time, while normalized structures are more or less read-agnostic.
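
    To make the "know your reads ahead of time" point concrete, here is a small sketch that assumes the one query we care about is "the latest posts by a given user". It uses a toy sorted key-value store built on `bisect` as a stand-in for a store that keeps rows ordered by key; `SortedStore`, `post_key`, and `MAX_TS` are hypothetical names, and the reversed-timestamp key is one common convention rather than anything prescribed by the answer above.

```python
import bisect

MAX_TS = 10**10 - 1  # hypothetical upper bound on timestamps (seconds)


class SortedStore:
    """Toy stand-in for a store that keeps rows sorted by key."""

    def __init__(self):
        self._keys = []
        self._rows = {}

    def put(self, key, value):
        if key not in self._rows:
            bisect.insort(self._keys, key)
        self._rows[key] = value

    def scan_prefix(self, prefix, limit):
        # Jump to the first key >= prefix, then read rows in key order.
        start = bisect.bisect_left(self._keys, prefix)
        out = []
        for key in self._keys[start:]:
            if not key.startswith(prefix) or len(out) >= limit:
                break
            out.append(self._rows[key])
        return out


def post_key(user, ts):
    # Reversed, zero-padded timestamp: the newest post sorts first per user.
    return f"{user}#{MAX_TS - ts:010d}"


store = SortedStore()
store.put(post_key("alice", 1000), "first")
store.put(post_key("alice", 2000), "second")
store.put(post_key("bob", 1500), "other")

latest = store.scan_prefix("alice#", limit=1)
```

    This key makes "latest posts by alice" a short sequential scan, but it does nothing for a query such as "all posts from the last hour across users"; serving that read well would need a second copy of the data under a different key.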

  • 2021-02-01 00:09

    The search term is "column-oriented databases/datastores".

    Wikipedia

    In the beginning there was a debate over how to build databases. Row-oriented won.

    However, column-oriented is in a "revival" phase. It's best for big, read-only, distributed scenarios.

    There is a lot of theory to be found when you search for column-oriented databases/stores.

  • 2021-02-01 00:11

    Just to be sure... you did read Google's paper about Bigtable, right?

    Technologies like HBase, which runs on Hadoop, are based on this initial paper.
