I\'m working on a multi-tenanted application in which some users can define their own data fields (via the admin) to collect additional data in forms and re
As of today, there are four available approaches, two of them requiring a certain storage backend:
Django-eav (the original package is no longer mantained but has some thriving forks)
This solution is based on Entity Attribute Value data model, essentially, it uses several tables to store dynamic attributes of objects. Great parts about this solution is that it:
allows you to effectively attach/detach dynamic attribute storage to Django model with simple commands like:
eav.unregister(Encounter)
eav.register(Patient)
Nicely integrates with Django admin;
At the same time being really powerful.
Downsides:
The usage is pretty straightforward:
import eav
from app.models import Patient, Encounter
eav.register(Encounter)
eav.register(Patient)
Attribute.objects.create(name='age', datatype=Attribute.TYPE_INT)
Attribute.objects.create(name='height', datatype=Attribute.TYPE_FLOAT)
Attribute.objects.create(name='weight', datatype=Attribute.TYPE_FLOAT)
Attribute.objects.create(name='city', datatype=Attribute.TYPE_TEXT)
Attribute.objects.create(name='country', datatype=Attribute.TYPE_TEXT)
self.yes = EnumValue.objects.create(value='yes')
self.no = EnumValue.objects.create(value='no')
self.unkown = EnumValue.objects.create(value='unkown')
ynu = EnumGroup.objects.create(name='Yes / No / Unknown')
ynu.enums.add(self.yes)
ynu.enums.add(self.no)
ynu.enums.add(self.unkown)
Attribute.objects.create(name='fever', datatype=Attribute.TYPE_ENUM,\
enum_group=ynu)
# When you register a model within EAV,
# you can access all of EAV attributes:
Patient.objects.create(name='Bob', eav__age=12,
eav__fever=no, eav__city='New York',
eav__country='USA')
# You can filter queries based on their EAV fields:
query1 = Patient.objects.filter(Q(eav__city__contains='Y'))
query2 = Q(eav__city__contains='Y') | Q(eav__fever=no)
Hstore, JSON or JSONB fields in PostgreSQL
PostgreSQL supports several more complex data types. Most are supported via third-party packages, but in recent years Django has adopted them into django.contrib.postgres.fields.
HStoreField:
Django-hstore was originally a third-party package, but Django 1.8 added HStoreField as a built-in, along with several other PostgreSQL-supported field types.
This approach is good in a sense that it lets you have the best of both worlds: dynamic fields and relational database. However, hstore is not ideal performance-wise, especially if you are going to end up storing thousands of items in one field. It also only supports strings for values.
#app/models.py
from django.contrib.postgres.fields import HStoreField
class Something(models.Model):
name = models.CharField(max_length=32)
data = models.HStoreField(db_index=True)
In Django's shell you can use it like this:
>>> instance = Something.objects.create(
name='something',
data={'a': '1', 'b': '2'}
)
>>> instance.data['a']
'1'
>>> empty = Something.objects.create(name='empty')
>>> empty.data
{}
>>> empty.data['a'] = '1'
>>> empty.save()
>>> Something.objects.get(name='something').data['a']
'1'
You can issue indexed queries against hstore fields:
# equivalence
Something.objects.filter(data={'a': '1', 'b': '2'})
# subset by key/value mapping
Something.objects.filter(data__a='1')
# subset by list of keys
Something.objects.filter(data__has_keys=['a', 'b'])
# subset by single key
Something.objects.filter(data__has_key='a')
JSONField:
JSON/JSONB fields support any JSON-encodable data type, not just key/value pairs, but also tend to be faster and (for JSONB) more compact than Hstore. Several packages implement JSON/JSONB fields including django-pgfields, but as of Django 1.9, JSONField is a built-in using JSONB for storage. JSONField is similar to HStoreField, and may perform better with large dictionaries. It also supports types other than strings, such as integers, booleans and nested dictionaries.
#app/models.py
from django.contrib.postgres.fields import JSONField
class Something(models.Model):
name = models.CharField(max_length=32)
data = JSONField(db_index=True)
Creating in the shell:
>>> instance = Something.objects.create(
name='something',
data={'a': 1, 'b': 2, 'nested': {'c':3}}
)
Indexed queries are nearly identical to HStoreField, except nesting is possible. Complex indexes may require manually creation (or a scripted migration).
>>> Something.objects.filter(data__a=1)
>>> Something.objects.filter(data__nested__c=3)
>>> Something.objects.filter(data__has_key='a')
Django MongoDB
Or other NoSQL Django adaptations -- with them you can have fully dynamic models.
NoSQL Django libraries are great, but keep in mind that they are not 100% the Django-compatible, for example, to migrate to Django-nonrel from standard Django you will need to replace ManyToMany with ListField among other things.
Checkout this Django MongoDB example:
from djangotoolbox.fields import DictField
class Image(models.Model):
exif = DictField()
...
>>> image = Image.objects.create(exif=get_exif_data(...))
>>> image.exif
{u'camera_model' : 'Spamcams 4242', 'exposure_time' : 0.3, ...}
You can even create embedded lists of any Django models:
class Container(models.Model):
stuff = ListField(EmbeddedModelField())
class FooModel(models.Model):
foo = models.IntegerField()
class BarModel(models.Model):
bar = models.CharField()
...
>>> Container.objects.create(
stuff=[FooModel(foo=42), BarModel(bar='spam')]
)
Django-mutant: Dynamic models based on syncdb and South-hooks
Django-mutant implements fully dynamic Foreign Key and m2m fields. And is inspired by incredible but somewhat hackish solutions by Will Hardy and Michael Hall.
All of these are based on Django South hooks, which, according to Will Hardy's talk at DjangoCon 2011 (watch it!) are nevertheless robust and tested in production (relevant source code).
First to implement this was Michael Hall.
Yes, this is magic, with these approaches you can achieve fully dynamic Django apps, models and fields with any relational database backend. But at what cost? Will stability of application suffer upon heavy use? These are the questions to be considered. You need to be sure to maintain a proper lock in order to allow simultaneous database altering requests.
If you are using Michael Halls lib, your code will look like this:
from dynamo import models
test_app, created = models.DynamicApp.objects.get_or_create(
name='dynamo'
)
test, created = models.DynamicModel.objects.get_or_create(
name='Test',
verbose_name='Test Model',
app=test_app
)
foo, created = models.DynamicModelField.objects.get_or_create(
name = 'foo',
verbose_name = 'Foo Field',
model = test,
field_type = 'dynamiccharfield',
null = True,
blank = True,
unique = False,
help_text = 'Test field for Foo',
)
bar, created = models.DynamicModelField.objects.get_or_create(
name = 'bar',
verbose_name = 'Bar Field',
model = test,
field_type = 'dynamicintegerfield',
null = True,
blank = True,
unique = False,
help_text = 'Test field for Bar',
)
I've been working on pushing the django-dynamo idea further. The project is still undocumented but you can read the code at https://github.com/charettes/django-mutant.
Actually FK and M2M fields (see contrib.related) also work and it's even possible to define wrapper for your own custom fields.
There's also support for model options such as unique_together and ordering plus Model bases so you can subclass model proxy, abstract or mixins.
I'm actually working on a not in-memory lock mechanism to make sure model definitions can be shared accross multiple django running instances while preventing them using obsolete definition.
The project is still very alpha but it's a cornerstone technology for one of my project so I'll have to take it to production ready. The big plan is supporting django-nonrel also so we can leverage the mongodb driver.
Further research reveals that this is a somewhat special case of Entity Attribute Value design pattern, which has been implemented for Django by a couple of packages.
First, there's the original eav-django project, which is on PyPi.
Second, there's a more recent fork of the first project, django-eav which is primarily a refactor to allow use of EAV with django's own models or models in third-party apps.