Am wondering if anyone might provide some conceptual advice on an efficient way to build a data model to accomplish the simple system described below. Am somewhat new to th
Many-to-many sounds reasonable. Perhaps you should try it first to see if it is actually expensive.
Good thing about G.A.E. is that it will tell you when you are using too many cycles. Profiling for free!
One possible way is with Expando, where you'd add a tag like:
setattr(entity, 'tag_'+tag_name, True)
Then you could query all the entities with a tag like:
def get_all_with_tag(model_class, tag):
    # Returns a query for every entity that has the dynamic property tag_<tag> set to True.
    return model_class.all().filter('tag_%s =' % tag, True)
Of course you have to clean up your tags to be proper Python identifiers. I haven't tried this, so I'm not sure if it's really a good solution.
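For the cleanup step, here is a minimal sketch of what it might look like. The helper name tag_property_name is made up for illustration and is not part of App Engine. It assumes lowercase ASCII names are acceptable and already includes the 'tag_' prefix, so the result would be passed to setattr directly.

```python
import re


def tag_property_name(tag_name):
    """Turn a free-form tag into a name usable as a dynamic property.

    Hypothetical helper for illustration, not an App Engine API.
    """
    # Lowercase, then collapse every run of characters other than
    # a-z, 0-9, and underscore into a single underscore.
    cleaned = re.sub(r'[^0-9a-z_]+', '_', tag_name.strip().lower())
    cleaned = cleaned.strip('_')
    if not cleaned:
        raise ValueError('tag has no usable characters: %r' % tag_name)
    # The prefix also guarantees the name never starts with a digit.
    return 'tag_' + cleaned
```

One thing to watch: different tags can collapse to the same name (for example "C" and "C++" both become tag_c here), so you may want a stricter rule or a lookup table if that matters.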
Pre-computing the counts is not only practical but also necessary, because count() returns a maximum of 1000. If write contention might be an issue, make sure to check out the sharded counter example.
http://code.google.com/appengine/articles/sharding_counters.html
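The idea behind that article, roughly: instead of one counter that every writer fights over, keep several shards, bump a random one on each write, and sum them all on read. The sketch below is a plain in-memory stand-in to show the shape of it, not the article's code. The class name, the default of 20 shards, and the use of list slots in place of datastore entities are all assumptions.

```python
import random


class ShardedCounter(object):
    """In-memory illustration of a sharded counter.

    In the datastore each shard would be its own entity, so concurrent
    writers rarely touch the same one. Here the shards are list slots.
    """

    def __init__(self, num_shards=20):
        self.shards = [0] * num_shards

    def increment(self, delta=1):
        # Pick one shard at random and update only that shard.
        index = random.randrange(len(self.shards))
        self.shards[index] += delta

    def total(self):
        # Reading the count means summing every shard.
        return sum(self.shards)
```

The trade-off is that writes get cheaper under contention while reads get a little more expensive, which is usually fine if the total is cached.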
Thanks to both of you for your suggestions. I've implemented (first iteration) as follows. Not sure if it's the best approach, but it's working.
Class A = Articles. Has a StringListProperty, which can be queried on its list elements.
Class B = Tags. One entity per tag, also keeps a running count of the total number of articles using each tag.
Data modifications to A are accompanied by maintenance work on B. Thinking that counts being pre-computed is a good approach in a read-heavy environment.
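The maintenance work on B boils down to something like the sketch below. It is a simplified illustration rather than the actual implementation: a plain dict stands in for the Tag entities, and the function name update_tag_counts is invented for the example. In the real app each adjustment would be a datastore write.

```python
def update_tag_counts(tag_counts, old_tags, new_tags):
    """Adjust per-tag article counts when one article's tags change.

    tag_counts is a plain dict standing in for the Tag entities.
    """
    old_set, new_set = set(old_tags), set(new_tags)
    # Tags the article gained: one more article uses each of them.
    for tag in new_set - old_set:
        tag_counts[tag] = tag_counts.get(tag, 0) + 1
    # Tags the article lost: one fewer article uses each of them.
    for tag in old_set - new_set:
        remaining = tag_counts.get(tag, 0) - 1
        if remaining > 0:
            tag_counts[tag] = remaining
        else:
            # Drop tags that no article uses any more.
            tag_counts.pop(tag, None)
    return tag_counts
```

Creating an article is the case where old_tags is empty, and deleting one is the case where new_tags is empty, so the same function covers all three kinds of modification.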