Implementing inheritance in MySQL: alternatives and a table with only surrogate keys

问题

This is a question that has probably been asked before, but I'm having some difficulty to find exactly my case, so I'll explain my situation in search for some feedback:

I have an application that will be registering locations, I have several types of locations, each location type has a different set of attributes, but I need to associate notes to locations regardless of their type and also other types of content (mostly multimedia entries and comments) to said notes. With this in mind, I came up with a couple of solutions:

Create a table for each location type, and a "notes" table for every location table with a foreign key, this is pretty troublesome because I would have to create a multimedia and comments table for every comments table, e.g.:
- LocationTypeA
  - ID
  - Attr1
  - Attr2
- LocationTypeA_Notes
  - ID
  - Attr1
  - ...
  - LocationTypeA_fk
- LocationTypeA_Notes_Multimedia
  - ID
  - Attr1
  - ...
  - LocationTypeA_Notes_fk
And so on, this would be quite annoying to do, but after it's done, developing on this structure should not be so troublesome.
Create a table with a unique identifier for the location and point content there, like so:
- Location
  - ID
- LocationTypeA
  - ID
  - Attr1
  - Attr2
  - Location_fk
- Notes
  - ID
  - Attr1
  - ...
  - Location_fk
- Multimedia
  - ID
  - Attr1
  - ...
  - Notes_fk
As you see, this is far more simple and also easier to develop, but I just don't like the looks of that table with only IDs (yeah, that's truly the only objection I have to this, it's the option I like the most, to be honest).
Similar to option 2, but I would have an enormous table of attributes shaped like this:
- Location
  - ID
  - Type
- Attribute
  - Name
  - Value
And so on, or a table for each attribute; a la Drupal. This would be a pain to develop because then it would take several insert/update operations to do something on a location and the Attribute table would be several times bigger than the location table (or end up with an enormous amount of attribute tables); it also has the same issue of the surrogate-keys-only table (just it has a "type" now, which I would use to define the behavior of the location programmatically), but it's a pretty solution.

So, to the question: which would be a better solution performance and scalability-wise?, which would you go with or which alternatives would you propose? I don't have a problem implementing any of these, options 2 and 3 would be an interesting development, I've never done something like that, but I don't want to go with an option that will collapse on itself when the content grows a bit; you're probably thinking "why not just use Drupal if you know it works like you expect it to?", and I'm thinking "you obviously don't know how difficult it is to use Drupal, either that or you're an expert, which I'm most definitely not".

Also, now that I've written all of this, do you think option 2 is a good idea overall?, do you know of a better way to group entities / simulate inheritance? (please, don't say "just use inheritance!", I'm restricted to using MySQL).

Thanks for your feedback, I'm sorry if I wrote too much and meant too little.

回答1:

ORM systems usually use the following, mostly the same solutions as you listed there:

One table per hierarchy

Pros:

Simple approach.
Easy to add new classes, you just need to add new columns for the additional data.
Supports polymorphism by simply changing the type of the row.
Data access is fast because the data is in one table.
Ad-hoc reporting is very easy because all of the data is found in one table.

Cons:

Coupling within the class hierarchy is increased because all classes are directly coupled to the same table.
A change in one class can affect the table which can then affect the other classes in the hierarchy.
Space potentially wasted in the database.
Indicating the type becomes complex when significant overlap between types exists.
Table can grow quickly for large hierarchies.

When to use:

This is a good strategy for simple and/or shallow class hierarchies where there is little or no overlap between the types within the hierarchy.

One table per concrete class

Pros:

Easy to do ad-hoc reporting as all the data you need about a single class is stored in only one table.
Good performance to access a single object’s data.

Cons:

When you modify a class you need to modify its table and the table of any of its subclasses. For example if you were to add height and weight to the Person class you would need to add columns to the Customer, Employee, and Executive tables.
Whenever an object changes its role, perhaps you hire one of your customers, you need to copy the data into the appropriate table and assign it a new POID value (or perhaps you could reuse the existing POID value).
It is difficult to support multiple roles and still maintain data integrity. For example, where would you store the name of someone who is both a customer and an employee?

When to use:

When changing types and/or overlap between types is rare.

One table per class

Pros:

Easy to understand because of the one-to-one mapping.
Supports polymorphism very well as you merely have records in the appropriate tables for each type.
Very easy to modify superclasses and add new subclasses as you merely need to modify/add one table.
Data size grows in direct proportion to growth in the number of objects.

Cons:

There are many tables in the database, one for every class (plus tables to maintain relationships).
Potentially takes longer to read and write data using this technique because you need to access multiple tables. This problem can be alleviated if you organize your database intelligently by putting each table within a class hierarchy on different physical disk-drive platters (this assumes that the disk-drive heads all operate independently).
Ad-hoc reporting on your database is difficult, unless you add views to simulate the desired tables.

When to use:

When there is significant overlap between types or when changing types is common.

Generic Schema

Pros:

Works very well when database access is encapsulated by a robust persistence framework.
It can be extended to provide meta data to support a wide range of mappings, including relationship mappings. In short, it is the start at a mapping meta data engine.
It is incredibly flexible, enabling you to quickly change the way that you store objects because you merely need to update the meta data stored in the Class, Inheritance, Attribute, and AttributeType tables accordingly.

Cons:

Very advanced technique that can be difficult to implement at first.
It only works for small amounts of data because you need to access many database rows to build a single object.
You will likely want to build a small administration application to maintain the meta data.
Reporting against this data can be very difficult due to the need to access several rows to obtain the data for a single object.

When to use:

For complex applications that work with small amounts of data, or for applications where you data access isn’t very common or you can pre-load data into caches.

来源：https://stackoverflow.com/questions/18646837/implementing-inheritance-in-mysql-alternatives-and-a-table-with-only-surrogate

标签

mysql

database

database-design

scalability

database-performance