Maxing out document storage in Firestore

青春惊慌失措 2021-01-16 23:08

I'm working on some posting forum projects and trying to figure out the ideal Firestore database structure. I read that documents have a max size of 1 MB but what are the p

1 answer
  • 2021-01-16 23:45

    You can likely store many posts in a single document, and depending on your application, there may be good reasons for doing so. Just keep a few things in mind:

    • Firestore always reads complete documents. So if you store 100 posts in a single 1MB document, to only display 10 of those posts, you may have reduced the read operations by 10x, but you've increased the bandwidth consumption by 10x. And your mobile users will likely also pay for that bandwidth.
    • Implementing your own sharding strategy is not always hard, but it's seldom related to application functionality.
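The read-versus-bandwidth trade-off in the first point can be put into numbers. This is a minimal sketch in plain Python, not Firestore code: the function name `read_cost` and the 10,000-byte average post size are assumptions chosen so that 100 posts roughly fill a 1 MB document.

```python
import math

def read_cost(posts_shown: int, posts_per_doc: int, avg_post_bytes: int) -> dict:
    """Estimate document reads and bytes transferred to show `posts_shown` posts
    when each document packs `posts_per_doc` posts."""
    docs_read = math.ceil(posts_shown / posts_per_doc)
    # Whole documents are fetched, so every packed post is transferred,
    # including the ones that are never displayed.
    bytes_transferred = docs_read * posts_per_doc * avg_post_bytes
    return {"reads": docs_read, "bytes": bytes_transferred}

# Assumed numbers: show 10 posts of about 10,000 bytes each.
one_per_doc = read_cost(posts_shown=10, posts_per_doc=1, avg_post_bytes=10_000)
packed = read_cost(posts_shown=10, posts_per_doc=100, avg_post_bytes=10_000)

print(one_per_doc)  # {'reads': 10, 'bytes': 100000}
print(packed)       # {'reads': 1, 'bytes': 1000000}
```

Packing cuts the reads from 10 to 1 but raises the transfer from about 100 KB to about 1 MB, which is the 10x in each direction described above.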

    My guidelines when modeling data in any NoSQL database are:

    • model application screens in your database

      I tend to model the data in my database after the screens that I have in my application. So if you typically show a list of headlines of recent articles when a user starts the app, I might actually create a document that contains just the headlines of recent articles. That way the app only has to read a single document with just the headlines, instead of having to read each individual post. This reduces not only the number of documents the app needs to read, but also the bandwidth it consumes.

    • don't be afraid to duplicate data

      This goes hand-in-hand with the previous guideline, and is very normal across all NoSQL databases, but goes against the core of what many of us have learned from relational databases. It is sometimes also referred to as denormalizing, as it counters the database normalization of relational database models.

      Continuing the example from before: you'll probably have a separate document for each post, just to make sure that each post has its own single point of definition. But you'll store parts of that post in many other places, such as in the document-of-recent-headlines that we had before. This means that we'll have to duplicate the data for each new post into that document, and possibly multiple other places. This process is known as fan-out, and there are some common strategies for updating this denormalized data.

      I find that this duplication causes no problems, as long as it is clear what the main point of definition for each entity is. So in our example: if the headline of a post ever differs between the post-document itself and the document-of-recent-headlines, I know that I should update the document-of-recent-headlines, since the post-document itself is my point-of-definition for the post.
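The fan-out and the "repair from the point of definition" rule can be sketched together. To keep the example runnable without any SDK, a plain Python dict keyed by document path stands in for the database. The paths `posts/<id>` and `screens/recent_headlines`, the cap `MAX_HEADLINES`, and both function names are hypothetical. In a real Firestore app the two writes in `publish_post` would probably go into a single batched write or transaction so that they land together.

```python
MAX_HEADLINES = 3  # hypothetical cap on the recent-headlines document

def publish_post(store: dict, post_id: str, title: str, body: str) -> None:
    """Write the post document, then fan its headline out to the headlines document."""
    # Point of definition: one document per post.
    store[f"posts/{post_id}"] = {"title": title, "body": body}
    # Denormalized copy: newest headline first, trimmed to the cap.
    recent = store.setdefault("screens/recent_headlines", {"items": []})
    recent["items"].insert(0, {"post_id": post_id, "title": title})
    del recent["items"][MAX_HEADLINES:]

def repair_headlines(store: dict) -> int:
    """Overwrite stale headline copies from the post documents; return how many changed."""
    fixed = 0
    for item in store["screens/recent_headlines"]["items"]:
        source_title = store[f"posts/{item['post_id']}"]["title"]
        if item["title"] != source_title:
            item["title"] = source_title
            fixed += 1
    return fixed

store = {}
publish_post(store, "p1", "First post", "body text")
publish_post(store, "p2", "Second post", "body text")

# Simulate an edit that reached the post document but not the copy.
store["posts/p1"]["title"] = "First post (edited)"
fixed = repair_headlines(store)
titles = [item["title"] for item in store["screens/recent_headlines"]["items"]]
print(fixed, titles)  # 1 ['Second post', 'First post (edited)']
```

The repair only ever copies in one direction, from the post document to the headlines document, which is what having a single point of definition buys you.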

    The result of all this is that I often see my database as part actual data storage, part prerendered fragments of application screens. As long as the points of definition are clear, that works quite well and allows me to define data models that scale efficiently both for users of the applications that consume the data and for the cost to operate them.
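As a rough illustration of the "prerendered fragment" idea, the sketch below derives a headlines document from full post documents and compares approximate sizes. It is plain Python with hypothetical field names (`title`, `body`, `created_at`). The size estimate is only the length of the JSON encoding, an assumed proxy rather than Firestore's own size calculation; the 1,048,576-byte figure is the documented 1 MiB per-document limit.

```python
import json

MAX_DOC_BYTES = 1_048_576  # Firestore's documented per-document limit (1 MiB)

def build_headlines_doc(posts: dict, limit: int) -> dict:
    """Prerender the 'recent headlines' screen from full post documents."""
    newest_first = sorted(posts.items(), key=lambda kv: kv[1]["created_at"], reverse=True)
    return {"items": [{"post_id": pid, "title": p["title"]} for pid, p in newest_first[:limit]]}

def approx_size_bytes(doc: dict) -> int:
    """Rough size proxy: length of the UTF-8 encoded JSON. Not Firestore's own formula."""
    return len(json.dumps(doc, ensure_ascii=False).encode("utf-8"))

# Hypothetical posts; created_at is a simple increasing number for the example.
posts = {
    "p1": {"title": "Hello", "body": "x" * 5_000, "created_at": 1},
    "p2": {"title": "World", "body": "y" * 5_000, "created_at": 2},
}
headlines = build_headlines_doc(posts, limit=10)

print([item["post_id"] for item in headlines["items"]])  # ['p2', 'p1']
print(approx_size_bytes(headlines) < approx_size_bytes(posts) < MAX_DOC_BYTES)  # True
```

Because the headlines document carries only ids and titles, it stays far smaller than the posts it summarizes, so a start screen can be served by one small read.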

    To learn more about NoSQL data modeling:

    • NoSQL data modeling
    • Getting to know Cloud Firestore, which contains many more examples of these prerendered application screens.