Example
I have Person, SpecialPerson, and User. Person and SpecialPerson are just people - they don't have a username or password on the site.
There are generally three ways of mapping object inheritance to database tables.
You can make one big table with all the fields from all the objects, plus a special field for the type. This is fast but wastes space, although modern databases mitigate that by not storing empty fields. And if you only want the users out of a table that holds every type of person, queries can get slow. Not all OR-mappers support this.
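A rough sketch of that layout, using the Person/SpecialPerson/User example from the question (column names here are illustrative, not from the original post):

```sql
-- Single table: every kind of person in one place, discriminated by PersonType.
-- Columns that don't apply to a given type are simply left NULL.
CREATE TABLE Person (
    PersonId         INT PRIMARY KEY,
    PersonType       VARCHAR(20)  NOT NULL,  -- 'Person', 'SpecialPerson' or 'User'
    Name             VARCHAR(100) NOT NULL,
    SpecialAttribute VARCHAR(100) NULL,      -- SpecialPerson-only (illustrative)
    Username         VARCHAR(30)  NULL,      -- User-only
    Password         VARCHAR(32)  NULL       -- User-only
);
```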
You can make a separate table for each child class, with every table containing the base-class fields as well. This is fine from a performance perspective, but not from a maintenance one: every time your base class changes, all the tables have to change with it.
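A sketch of the same example laid out that way (again with made-up column names); note how the base-class Name column is repeated in every table. The User class is mapped to a SiteUser table here only because USER is a reserved word in many databases:

```sql
-- One table per concrete class, each carrying its own copy of the base fields.
CREATE TABLE Person (
    PersonId INT PRIMARY KEY,
    Name     VARCHAR(100) NOT NULL
);

CREATE TABLE SpecialPerson (
    PersonId         INT PRIMARY KEY,
    Name             VARCHAR(100) NOT NULL,  -- duplicated base-class field
    SpecialAttribute VARCHAR(100)            -- subclass-only field (illustrative)
);

CREATE TABLE SiteUser (
    PersonId INT PRIMARY KEY,
    Name     VARCHAR(100) NOT NULL,          -- duplicated base-class field
    Username VARCHAR(30)  NOT NULL,
    Password VARCHAR(32)  NOT NULL
);
```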
You can also make a table per class, as you suggested. This way you need joins to get all the data, so it's less performant, but I think it's the cleanest solution.
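And a sketch of that table-per-class layout, including the join you'd need to load a complete User (same illustrative names as above):

```sql
-- Shared fields live once in Person; subclass tables share its primary key.
CREATE TABLE Person (
    PersonId INT PRIMARY KEY,
    Name     VARCHAR(100) NOT NULL
);

CREATE TABLE SpecialPerson (
    PersonId         INT PRIMARY KEY REFERENCES Person(PersonId),
    SpecialAttribute VARCHAR(100)
);

CREATE TABLE SiteUser (
    PersonId INT PRIMARY KEY REFERENCES Person(PersonId),
    Username VARCHAR(30) NOT NULL,
    Password VARCHAR(32) NOT NULL
);

-- Loading a full User requires joining back to the base table:
SELECT p.PersonId, p.Name, u.Username
FROM Person p
JOIN SiteUser u ON u.PersonId = p.PersonId;
```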
What you want to use depends of course on your situation. None of the solutions is perfect so you have to weigh the pros and cons.
Take a look at Martin Fowler's Patterns of Enterprise Application Architecture:
Single Table Inheritance:
When mapping to a relational database, we try to minimize the joins that can quickly mount up when processing an inheritance structure in multiple tables. Single Table Inheritance maps all fields of all classes of an inheritance structure into a single table.
Class Table Inheritance:
You want database structures that map clearly to the objects and allow links anywhere in the inheritance structure. Class Table Inheritance supports this by using one database table per class in the inheritance structure.
Concrete Table Inheritance:
Thinking of tables from an object instance point of view, a sensible route is to take each object in memory and map it to a single database row. This implies Concrete Table Inheritance, where there's a table for each concrete class in the inheritance hierarchy.
Personally, I would store all of these different user classes in a single table. You can then either have a field which stores a 'Type' value, or you can imply what type of person you're dealing with by what fields are filled in. For example, if UserID is NULL, then this record isn't a User.
You could link out to other tables using a one-to-one-or-none type of join, but then you'll be adding extra joins to every query.
The first method is also supported by LINQ-to-SQL if you decide to go down that route (they call it 'Table Per Hierarchy' or 'TPH').
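For illustration, here's roughly what those two single-table variants look like when querying, reusing the made-up column names from the first sketch above (the original post doesn't give exact names):

```sql
-- Variant 1: explicit discriminator column.
SELECT PersonId, Name, Username
FROM Person
WHERE PersonType = 'User';

-- Variant 2: the type is implied by which columns are populated.
SELECT PersonId, Name
FROM Person
WHERE Username IS NULL;   -- no user-specific data, so not a User
```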
In the past I've done it exactly as you suggest -- a Person table for the common stuff, with SpecialPerson linked to it for the derived class. However, I'm re-thinking that, as Linq2Sql wants a field in the same table to indicate the difference. I haven't looked at the entity model much, though -- I'm pretty sure it allows the other method.
There are three basic strategies for handling inheritance in a relational database, and a number of more complex or bespoke alternatives depending on your exact needs.
Each of these approaches raises its own issues about normalization, data access code, and data storage, although my personal preference is to use table per subclass unless there's a specific performance or structural reason to go with one of the alternatives.
This is an older post but I thought I'd weigh in from a conceptual, procedural, and performance standpoint.
The first question I would ask is what the relationship is between Person, SpecialPerson, and User, and whether it's possible for someone to be both a SpecialPerson and a User simultaneously, or any other of the four possible combinations (a + b, b + c, a + c, or a + b + c). If the class is stored as a value in a type field, it collapses those combinations; if that collapse is unacceptable, then I would think a secondary table allowing a one-to-many relationship is required. I've learned not to judge that until you've evaluated the usage and the cost of losing the combination information.
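As a rough sketch of that secondary table (the names are mine, not the poster's), a simple one-to-many type table keeps every combination without collapsing anything:

```sql
-- Each person can hold any combination of types.
CREATE TABLE Person (
    PersonId INT PRIMARY KEY,
    Name     VARCHAR(100) NOT NULL
);

CREATE TABLE PersonType (
    PersonId INT NOT NULL REFERENCES Person(PersonId),
    TypeName VARCHAR(20) NOT NULL,     -- 'Person', 'SpecialPerson', 'User'
    PRIMARY KEY (PersonId, TypeName)
);
```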
The other factor that makes me lean toward a single table is your description of the scenario. User is the only entity with a username (say varchar(30)) and password (say varchar(32)). If the common fields average 20 characters across 20 fields, then the column-size increase is 62 over 400, or about 15% - ten years ago this would have been more costly than it is with modern RDBMS systems, especially with a field type like varchar (e.g. in MySQL) available.
And, if security is a concern to you, it might be advantageous to have a secondary one-to-one table called credentials (user_id, username, password). This table would be invoked in a JOIN contextually, say at login time, but kept structurally separate from just "anyone" in the main table. And a LEFT JOIN is available for queries that might want to consider "registered users".
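A sketch of that credentials split, with the contextual JOIN at login and the LEFT JOIN for queries over everyone (the person table and its columns are assumptions for the example):

```sql
-- Main table: every person, registered or not.
CREATE TABLE person (
    person_id INT PRIMARY KEY,
    name      VARCHAR(100) NOT NULL
);

-- One-to-one(-or-none): only people who can log in get a row here.
CREATE TABLE credentials (
    user_id  INT PRIMARY KEY REFERENCES person(person_id),
    username VARCHAR(30) NOT NULL UNIQUE,
    password VARCHAR(32) NOT NULL
);

-- Invoked contextually, e.g. at login time:
SELECT p.person_id, p.name
FROM person p
JOIN credentials c ON c.user_id = p.person_id
WHERE c.username = ?;

-- LEFT JOIN keeps everyone and still shows who is a registered user:
SELECT p.person_id, p.name, c.username
FROM person p
LEFT JOIN credentials c ON c.user_id = p.person_id;
```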
My main consideration over the years is still the object's significance (and therefore its possible evolution) outside the DB and in the real world. In this case, all types of persons have beating hearts (I hope), and may also have hierarchical relationships to one another; so, in the back of my mind, even if not now, we may need to store such relationships by another method. That's not explicitly related to your question here, but it is another example of expressing an object's relationships. And by now (7 years later) you should have good insight into how your decision worked out anyway :)