Sorry if this question is too simple; I'm only entering 9th grade.
I'm trying to learn about NoSQL database design. I want to design a Google Datastore model that min
Be aware of this caveat when using a repeated StructuredProperty:
Do not use repeated properties if you have more than 100-1000 values. (1000 is probably already pushing it.) They weren't designed for such use.
See Guido's answer in GAE ndb design, performance and use of repeated properties.
So while you may not hit the 1 MB entity limit with StructuredProperty, you may easily hit the 100-1000 suggested max.
What about:
    from google.appengine.ext import ndb

    class Comment(ndb.Model):
        pass  # various properties...

    class BlogPost(ndb.Model):
        # Each blog post holds a list of keys pointing at its Comment entities.
        comments = ndb.KeyProperty(Comment, repeated=True)
        # various other properties...
This way, you can store up to 5000 comments per blog post (the maximum number of repeated properties) independent of the size of each blog post. You won't need a query to fetch the comments for a blog post; you can just do ndb.get_multi(blog_post.comments). And for this operation, you can try to rely on ndb's memcache. Of course, whether that is a good assumption depends on your use case.
I could be wrong, but from what I understand, a StructuredProperty is just a property within an entity, but with sub-properties.
This means reading a BlogPost and all its comments would only cost one read. So when you render your page, you only need one read op for your entire page.
Each write would be cheaper too. You'll need one read op to get the BlogPost, and as long as you don't update any indexed properties, it'll just be one write op.
You can handle the comment sorting on your own after you read the entity out of the datastore.
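To illustrate sorting after the read, here is a minimal sketch in plain Python. The Comment tuple and its field names (author, text, created) are made up for the example; they stand in for whatever sub-properties your StructuredProperty actually has.

```python
from collections import namedtuple
from datetime import datetime

# Hypothetical stand-in for a comment embedded in the BlogPost entity.
Comment = namedtuple("Comment", ["author", "text", "created"])


def sort_comments(comments, newest_first=False):
    """Return a new list of comments ordered by creation time."""
    return sorted(comments, key=lambda c: c.created, reverse=newest_first)


# Comments as they might come back from the entity, in arbitrary order.
comments = [
    Comment("bob", "Second!", datetime(2013, 5, 2, 9, 30)),
    Comment("alice", "First!", datetime(2013, 5, 1, 12, 0)),
    Comment("carol", "Third.", datetime(2013, 5, 3, 8, 15)),
]

ordered = sort_comments(comments)
```

Since the sort happens in memory, it costs no extra datastore ops, and you can change the ordering without touching any index.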
You'll have to synchronize your comment updates/edits with transactions to make sure one comment doesn't overwrite another, since they are both modifying the same entity. You may run into unsolvable problems if everyone is commenting on and editing the same blog post at the same time.
In optimizing for cost though, you'll hit a wall with the maximum entity size of 1MB. This will limit the number of comments you can store per blog post.
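To get a feel for where that wall is, here is a rough back-of-the-envelope sketch. The average comment size and the post size are guesses you would have to replace with your own numbers, and the calculation ignores per-property storage overhead, so treat the result as an upper bound rather than a real limit.

```python
# Approximation of the 1 MB entity size limit mentioned above.
MAX_ENTITY_BYTES = 1024 * 1024


def max_comments(avg_comment_bytes, post_bytes=0):
    """Estimate how many comments fit in one entity alongside the post."""
    if avg_comment_bytes <= 0:
        raise ValueError("avg_comment_bytes must be positive")
    remaining = MAX_ENTITY_BYTES - post_bytes
    # If the post alone already exceeds the limit, no comments fit.
    return max(remaining // avg_comment_bytes, 0)


# Assumed numbers: a 20 KB post body and comments averaging 500 bytes.
estimate = max_comments(avg_comment_bytes=500, post_bytes=20000)
```

With those assumed numbers the estimate comes out at roughly two thousand comments, so short comments on a small post leave a fair amount of room before the size limit itself becomes the problem.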
Going with the KeyProperty would be quite a bit more expensive.
You'll need one read to get the blog post, plus 1 query plus 1 small read op for each comment.
Every comment is a new entity, so it'll be at least 4 write ops. You may want to index for sort order, so that'll end up costing even more write ops.
On the plus side, you'll have unlimited comments per blog post, and you don't have to worry about synchronizing new comments. You might need to worry about synchronization for editing comments, but if you limit editing to the creator, that shouldn't really be a problem. You don't have to do the sorting yourself either.
It's a cost vs features tradeoff.
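The op counts above can be tabulated with a small sketch like the one below. It only restates the figures from this answer (it is not a pricing calculator), and the function names and the extra_index_writes parameter are invented for the example.

```python
def render_ops_structured(num_comments):
    """Comments embedded in the BlogPost: one read, whatever the count."""
    return {"read": 1, "query": 0, "small": 0}


def render_ops_key_property(num_comments):
    """Separate Comment entities: post read, one query, one small op each."""
    return {"read": 1, "query": 1, "small": num_comments}


def add_comment_ops_structured():
    """Read the BlogPost, then write it back (no indexed properties changed)."""
    return {"read": 1, "write": 1}


def add_comment_ops_key_property(extra_index_writes=0):
    """A new Comment entity: at least 4 write ops, more per extra index."""
    return {"read": 0, "write": 4 + extra_index_writes}


# Compare rendering a post with 50 comments under both designs.
embedded = render_ops_structured(50)
separate = render_ops_key_property(50)
```

Plugging in your own expected comment counts makes it easier to see how quickly the per-comment ops add up against the flat cost of the embedded design.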