I'm thinking about table layout for a Hibernate-managed class hierarchy, and certainly the table-per-subclass technique strikes me as the most appropriate in a general sense.
As long as you access your DB only through Hibernate and you either don't have important data or are ready to write a small migration script, you should be able to make the decision about table per subclass / hierarchy quite late in your development process. That's the beauty of an ORM: it abstracts the database structure...
On the other hand, I'm a big fan of "prefer composition over inheritance" (Prefer composition over inheritance?) and I am quite dubious that a model with 70 classes over 4-5 levels cannot be simplified ... but I'll let you think for yourself over that one; after all, I don't know what problem you are trying to solve.
After David M's helpful answer I decided to throw together a skeleton test.
I created an abstract superclass, ADTestA, and 25 concrete subclasses in a three-level hierarchy (I expect you can guess their names). Each class had a single integer field with a name corresponding to its letter, so for example class ADTestG has a single int field g in addition to the b field it inherits from its immediate parent ADTestB, and the pkey and a fields from the top-level abstract superclass.
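To make the shape of that test concrete, here is a minimal plain-Java sketch of one branch of the hierarchy (ADTestA, ADTestB, ADTestG). The Hibernate mapping metadata is left out, and the helper class HierarchySketch with its tablesFor method is hypothetical, added only for illustration. It assumes each class maps to a table of the same name, as the generated SQL suggests, and lists the tables that one ADTestG row would be spread across under table per subclass.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// One branch of the test hierarchy; mapping metadata deliberately omitted.
abstract class ADTestA {
    int pkey; // primary key, shared by every table in the hierarchy
    int a;
}

class ADTestB extends ADTestA {
    int b;
}

class ADTestG extends ADTestB {
    int g;
}

// Hypothetical helper, not part of Hibernate.
class HierarchySketch {

    // Walks up the superclass chain and returns the class names root-first.
    // Assumption: under table per subclass, each class has a table of the same name.
    static List<String> tablesFor(Class<?> type) {
        List<String> tables = new ArrayList<>();
        for (Class<?> c = type; c != null && c != Object.class; c = c.getSuperclass()) {
            tables.add(c.getSimpleName());
        }
        Collections.reverse(tables);
        return tables;
    }

    public static void main(String[] args) {
        // prints [ADTestA, ADTestB, ADTestG]
        System.out.println(tablesFor(ADTestG.class));
    }
}
```

So a single ADTestG instance holds four int values (pkey, a, b, g) stored across three tables joined on pkey.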
Issuing the HQL query "from ADTestA where pkey=1" resulted in the following SQL:
select adtesta0_.pkey as pkey0_, adtesta0_.a as a0_, adtesta0_1_.b as b1_,
adtesta0_2_.c as c2_, adtesta0_3_.d as d3_, adtesta0_4_.e as e4_,
adtesta0_5_.f as f5_, adtesta0_6_.g as g6_, adtesta0_7_.h as h7_,
adtesta0_8_.i as i8_, adtesta0_9_.j as j9_, adtesta0_10_.k as k10_,
adtesta0_11_.l as l11_, adtesta0_12_.m as m12_, adtesta0_13_.n as n13_,
adtesta0_14_.o as o14_, adtesta0_15_.p as p15_, adtesta0_16_.q as q16_,
adtesta0_17_.r as r17_, adtesta0_18_.s as s18_, adtesta0_19_.t as t19_,
adtesta0_20_.u as u20_, adtesta0_21_.v as v21_, adtesta0_22_.w as w22_,
adtesta0_23_.x as x23_, adtesta0_24_.y as y24_, adtesta0_25_.z as z25_,
case
when adtesta0_6_.pkey is not null then 6
when adtesta0_7_.pkey is not null then 7
when adtesta0_8_.pkey is not null then 8
when adtesta0_9_.pkey is not null then 9
when adtesta0_10_.pkey is not null then 10
when adtesta0_11_.pkey is not null then 11
when adtesta0_12_.pkey is not null then 12
when adtesta0_13_.pkey is not null then 13
when adtesta0_14_.pkey is not null then 14
when adtesta0_15_.pkey is not null then 15
when adtesta0_16_.pkey is not null then 16
when adtesta0_17_.pkey is not null then 17
when adtesta0_18_.pkey is not null then 18
when adtesta0_19_.pkey is not null then 19
when adtesta0_20_.pkey is not null then 20
when adtesta0_21_.pkey is not null then 21
when adtesta0_22_.pkey is not null then 22
when adtesta0_23_.pkey is not null then 23
when adtesta0_24_.pkey is not null then 24
when adtesta0_25_.pkey is not null then 25
when adtesta0_1_.pkey is not null then 1
when adtesta0_2_.pkey is not null then 2
when adtesta0_3_.pkey is not null then 3
when adtesta0_4_.pkey is not null then 4
when adtesta0_5_.pkey is not null then 5
when adtesta0_.pkey is not null then 0
end as clazz_
from ADTestA adtesta0_
left outer join ADTestB adtesta0_1_ on adtesta0_.pkey=adtesta0_1_.pkey
left outer join ADTestC adtesta0_2_ on adtesta0_.pkey=adtesta0_2_.pkey
left outer join ADTestD adtesta0_3_ on adtesta0_.pkey=adtesta0_3_.pkey
left outer join ADTestE adtesta0_4_ on adtesta0_.pkey=adtesta0_4_.pkey
left outer join ADTestF adtesta0_5_ on adtesta0_.pkey=adtesta0_5_.pkey
left outer join ADTestG adtesta0_6_ on adtesta0_.pkey=adtesta0_6_.pkey
left outer join ADTestH adtesta0_7_ on adtesta0_.pkey=adtesta0_7_.pkey
left outer join ADTestI adtesta0_8_ on adtesta0_.pkey=adtesta0_8_.pkey
left outer join ADTestJ adtesta0_9_ on adtesta0_.pkey=adtesta0_9_.pkey
left outer join ADTestK adtesta0_10_ on adtesta0_.pkey=adtesta0_10_.pkey
left outer join ADTestL adtesta0_11_ on adtesta0_.pkey=adtesta0_11_.pkey
left outer join ADTestM adtesta0_12_ on adtesta0_.pkey=adtesta0_12_.pkey
left outer join ADTestN adtesta0_13_ on adtesta0_.pkey=adtesta0_13_.pkey
left outer join ADTestO adtesta0_14_ on adtesta0_.pkey=adtesta0_14_.pkey
left outer join ADTestP adtesta0_15_ on adtesta0_.pkey=adtesta0_15_.pkey
left outer join ADTestQ adtesta0_16_ on adtesta0_.pkey=adtesta0_16_.pkey
left outer join ADTestR adtesta0_17_ on adtesta0_.pkey=adtesta0_17_.pkey
left outer join ADTestS adtesta0_18_ on adtesta0_.pkey=adtesta0_18_.pkey
left outer join ADTestT adtesta0_19_ on adtesta0_.pkey=adtesta0_19_.pkey
left outer join ADTestU adtesta0_20_ on adtesta0_.pkey=adtesta0_20_.pkey
left outer join ADTestV adtesta0_21_ on adtesta0_.pkey=adtesta0_21_.pkey
left outer join ADTestW adtesta0_22_ on adtesta0_.pkey=adtesta0_22_.pkey
left outer join ADTestX adtesta0_23_ on adtesta0_.pkey=adtesta0_23_.pkey
left outer join ADTestY adtesta0_24_ on adtesta0_.pkey=adtesta0_24_.pkey
left outer join ADTestZ adtesta0_25_ on adtesta0_.pkey=adtesta0_25_.pkey
where adtesta0_.pkey=1
That's not very pretty, and it does amount to effectively simulating the per-hierarchy table, which I had hoped could be avoided.
So it looks like these kinds of queries are going to be very expensive. I'll have a think about how often they would be needed (compared, say, to knowing that I want an instance of ADTestP and asking for one of those right off the bat, which only joins in the required parent tables). I have a feeling, however, that this will be unavoidable with references from other entities; in other words, a one-to-one mapping from a field of type ADTestA is always going to involve exactly this kind of lookup.
(On the other hand, the alternative strategies aren't shining beacons of hope either; going the table-per-hierarchy route and having literally hundreds of columns in a single table doesn't sound very efficient.)
You'll find that Hibernate writes the query for an unknown animal type with a series of LEFT JOIN statements, one per subclass. So the query will slow as the number of subclasses increases, and will attempt to return an ever wider result set. So you are correct, it doesn't scale well with large class hierarchies.
With HQL, yes, you can query the subclass directly and access its properties. That will then be rendered with a single INNER JOIN.
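To illustrate the difference in join count, here is a rough sketch that only mimics the shape of the two FROM clauses; it is not Hibernate's SQL generator, and the class JoinShapeSketch, its methods, and the t0/t1 aliases are all made up for the example. It assumes the 26-class ADTest hierarchy from the test above, that ADTestG is a leaf with ancestors ADTestB and ADTestA, and that a leaf query needs one inner join per ancestor table.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustrative only: reproduces the shape of the joins, not Hibernate's real output.
class JoinShapeSketch {

    // Appends one join of the given kind per table, all keyed on pkey.
    static String joinAll(String base, List<String> tables, String joinKeyword) {
        StringBuilder sql = new StringBuilder("from " + base + " t0");
        for (int i = 0; i < tables.size(); i++) {
            String alias = "t" + (i + 1);
            sql.append(' ').append(joinKeyword).append(' ')
               .append(tables.get(i)).append(' ').append(alias)
               .append(" on t0.pkey=").append(alias).append(".pkey");
        }
        return sql.toString();
    }

    // Query on the root type: one LEFT OUTER JOIN per subclass table.
    static String rootQuery(String root, List<String> subclassTables) {
        return joinAll(root, subclassTables, "left outer join");
    }

    // Query on a leaf type: one INNER JOIN per ancestor table (assumption).
    static String leafQuery(String leaf, List<String> ancestorTables) {
        return joinAll(leaf, ancestorTables, "inner join");
    }

    // Table names ADTestB .. ADTestZ, matching the test hierarchy.
    static List<String> subclassTables() {
        List<String> tables = new ArrayList<>();
        for (char c = 'B'; c <= 'Z'; c++) {
            tables.add("ADTest" + c);
        }
        return tables;
    }

    // Counts non-overlapping occurrences of needle in text.
    static int countOccurrences(String text, String needle) {
        int count = 0;
        int idx = 0;
        while ((idx = text.indexOf(needle, idx)) != -1) {
            count++;
            idx += needle.length();
        }
        return count;
    }

    public static void main(String[] args) {
        String root = rootQuery("ADTestA", subclassTables());
        String leaf = leafQuery("ADTestG", Arrays.asList("ADTestB", "ADTestA"));
        System.out.println(countOccurrences(root, "left outer join")); // prints 25
        System.out.println(countOccurrences(leaf, "inner join"));      // prints 2
    }
}
```

The root query grows by one join for every subclass added to the hierarchy, while the leaf query grows only with the depth of that one branch.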
I haven't tried this with multiple levels of inheritance. If the above hasn't put you off yet, I suggest you try it and see: you can turn on SQL debug output to see what is being sent to the database, or simply profile your database.
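For the SQL debug output, a minimal sketch of the two settings involved, built here as a plain java.util.Properties object. The property names hibernate.show_sql and hibernate.format_sql are standard Hibernate settings; the class SqlLoggingSketch is hypothetical, and how the properties reach Hibernate (hibernate.cfg.xml, hibernate.properties, or programmatic configuration) depends on your setup, so that step is left out.

```java
import java.util.Properties;

// Hypothetical holder for the logging settings; wiring into Hibernate is not shown.
class SqlLoggingSketch {

    static Properties sqlLoggingProperties() {
        Properties props = new Properties();
        // Echo every generated SQL statement to the console.
        props.setProperty("hibernate.show_sql", "true");
        // Pretty-print the statements so long join chains are readable.
        props.setProperty("hibernate.format_sql", "true");
        return props;
    }

    public static void main(String[] args) {
        System.out.println(sqlLoggingProperties().getProperty("hibernate.show_sql")); // prints true
    }
}
```

With these enabled, running the same HQL against the root type and against a leaf type should make the difference in joins visible directly in the log.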