Question
I have a really simple mediaTypes table which contains the following columns:
id string
name string
Each mediaType record can have many "placements", which could easily be designed as follows:
Placements
id string
mediaTypeId string (links to mediaTypes.id)
name string
detail_col_1
detail_col_2
...etc
However depending on the media type, a placement can contain different details, so if I designed the schema this way I may end up with a lot of nullable columns.
To get around this, I could have an aPlacements table and a bPlacements table to match each different media type.
aPlacements
id string
mediaTypeId string (links to mediaTypes.id)
name string
placement_details_relevant_to_media_type_col_1
placement_details_relevant_to_media_type_col_2
bPlacements
id string
mediaTypeId string (links to mediaTypes.id)
name string
placement_details_relevant_to_media_type_col_1
placement_details_relevant_to_media_type_col_2
The drawback of this is how would I then find a placement by id as I'd have to query across all tables:
SELECT * FROM aPlacements WHERE id = '1234'
UNION ALL
SELECT * FROM bPlacements WHERE id = '1234'
etc
The whole design feels like a bit of a design smell. Any suggestions on how I could clean up this schema?
Answer 1:
Noting the Relational database tag.
"The whole design feels like a bit of a design smell"
Yes. It smells for two reasons.
- You have ids as Identifiers in each table. That will confuse you, and make for code that is easy to screw up. For an Identifier:
  - name it for the thing that it Identifies, e.g. mediaType, placementCode (they are strings, which is correct)
  - where it is located as a Foreign Key, name it exactly the same, so that there is no confusion about what the column is, and what PK it references
- "However depending on the mediaType, a placement can contain different details"
  What you are seeking, in logical terms, is an OR Gate. In Relational terms, it is a Subtype, here an Exclusive Subtype. That is, with complete integrity and constraints. mediaType is the Discriminator.
"if I designed the schema this way I may end up with a lot of nullable columns."
Yes, you are correct. Nullable columns indicate that the modelling exercise, Normalisation, is incomplete. Two Subtype tables is correct.
Relational Data Model
Note • Notation
All my data models are rendered in IDEF1X, the Standard for modelling Relational databases since 1993
My IDEF1X Introduction is essential reading for beginners
Note • Content
Exclusive Subtype
- Each Placement is either a PlacementA xor a PlacementB
- Refer to Subtype for full details on Subtype implementation.
Relational Key
- They are strings, as you have given.
- They are "made up from the data", as required by the Relational Model.
- Such Keys are Logical, they ensure the rows are unique.
- Further they provide Relational Integrity (as distinct from Referential Integrity), which cannot be shown here, in this small data model.
- Note that IDs that are manufactured by the system, which are NOT data, and NOT seen by the user, are physical, pointing to Records (not logical rows). They provide record uniqueness but not row uniqueness. They cannot provide Relational integrity.
- The RM requires that rows (not records) are unique.
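As a rough sketch of the Basetype and Subtype tables described above, the following uses Python's built-in sqlite3 module so that it can be run as-is. SQLite only stands in for whatever SQL platform is actually used, and the column names (ColumnCommon, ColumnA1, ColumnB1 and so on) are placeholders, not columns from a real schema.

```python
import sqlite3

# Hypothetical schema: every non-key column name here is a placeholder.
SCHEMA = """
CREATE TABLE MediaType (
    MediaType TEXT NOT NULL PRIMARY KEY,
    Name      TEXT NOT NULL
);
CREATE TABLE Placement (               -- Basetype: columns common to all placements
    Placement    TEXT NOT NULL PRIMARY KEY,
    MediaType    TEXT NOT NULL REFERENCES MediaType (MediaType),  -- Discriminator
    ColumnCommon TEXT NOT NULL
);
CREATE TABLE PlacementA (              -- Subtype A: only the A-specific columns
    Placement TEXT NOT NULL PRIMARY KEY REFERENCES Placement (Placement),
    ColumnA1  TEXT NOT NULL,
    ColumnA2  TEXT NOT NULL
);
CREATE TABLE PlacementB (              -- Subtype B: only the B-specific columns
    Placement TEXT NOT NULL PRIMARY KEY REFERENCES Placement (Placement),
    ColumnB1  TEXT NOT NULL,
    ColumnB2  TEXT NOT NULL,
    ColumnB3  TEXT NOT NULL
);
"""

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")   # SQLite leaves FK enforcement off by default
conn.executescript(SCHEMA)

conn.execute("INSERT INTO MediaType VALUES ('A', 'Media type A')")
conn.execute("INSERT INTO Placement VALUES ('ABCD', 'A', 'common value')")
conn.execute("INSERT INTO PlacementA VALUES ('ABCD', 'a1', 'a2')")

# Collect every column declared without NOT NULL; with this layout there are none.
nullable = [
    (table, col[1])
    for table in ("MediaType", "Placement", "PlacementA", "PlacementB")
    for col in conn.execute(f"PRAGMA table_info({table})")
    if col[3] == 0                         # col[3] is the "notnull" flag
]
subtype_rows = conn.execute("SELECT COUNT(*) FROM PlacementA").fetchone()[0]
```

The point of the split is visible in the last check: each Subtype carries only the columns that apply to it, so nothing has to be declared nullable.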
SQL
"The drawback of this is how would I then find a placement by id as I'd have to query across all tables:"
Upgraded as per above, that would be:
"The drawback of this is how would I then find the relevant Placement columns by the PK Placement, as I'd have to query across all tables:"
First, understand that SQL works perfectly for Relational databases, but it is, by its nature, a low-level language. Most of us in the real world use an IDE (I don't know anyone who does not), thus much of its cumbersomeness is eased, and many coding errors are eliminated.
Where we have to code SQL directly, yes, that is what you have to do. Get used to it. There are just two tables here.
Your code will not work: it assumes the columns are identical datatypes and in the same order (which is required for the UNION). They are not.
Do not force them to be, just to make your UNION succeed. There may well be additional columns in one or the other Subtype, later on, and then your code will break, badly, everywhere that it is deployed.
For code that is implemented, never use asterisk in a SELECT (it is fine for development only). That guarantees failure when the database changes. Always use a column list, and request only the columns you need.
SELECT Placement, ColumnA1, ColumnA2, ColumnB1 = "", ColumnB2 = "", ColumnB3 = ""
    FROM PlacementA
    WHERE Placement = 'ABCD'
--
UNION
--
SELECT Placement, "", "", ColumnB1, ColumnB2, ColumnB3
    FROM PlacementB
    WHERE Placement = 'ABCD'
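To try the padded UNION without a server at hand, here is a minimal sketch using Python's sqlite3 module. It assumes the placeholder columns from the query above and writes the padding as '' AS ColumnB1, because the ColumnB1 = "" alias form is dialect-specific and SQLite does not read it as an alias. The find_placement helper is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE PlacementA (
    Placement TEXT NOT NULL PRIMARY KEY,
    ColumnA1  TEXT NOT NULL,
    ColumnA2  TEXT NOT NULL
);
CREATE TABLE PlacementB (
    Placement TEXT NOT NULL PRIMARY KEY,
    ColumnB1  TEXT NOT NULL,
    ColumnB2  TEXT NOT NULL,
    ColumnB3  TEXT NOT NULL
);
INSERT INTO PlacementA VALUES ('ABCD', 'a1', 'a2');
INSERT INTO PlacementB VALUES ('WXYZ', 'b1', 'b2', 'b3');
""")

# Explicit column lists on both sides, padded so the two SELECTs line up.
FIND_PLACEMENT = """
SELECT Placement, ColumnA1, ColumnA2, '' AS ColumnB1, '' AS ColumnB2, '' AS ColumnB3
    FROM PlacementA
    WHERE Placement = :placement
UNION
SELECT Placement, '', '', ColumnB1, ColumnB2, ColumnB3
    FROM PlacementB
    WHERE Placement = :placement
"""

def find_placement(placement):
    """Return the padded row(s) for one key, from whichever Subtype holds it."""
    return conn.execute(FIND_PLACEMENT, {"placement": placement}).fetchall()

row_a = find_placement("ABCD")
row_b = find_placement("WXYZ")
missing = find_placement("NOPE")
```

Because both column lists are spelled out, adding a column to one Subtype later does not silently break the query the way SELECT * would.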
View
The Relational Model, and SQL its data sublanguage, has the concept of a View. This is how one would use it. Each Basetype and Subtype combination is considered a single unit, a single row.
CREATE VIEW PlacementA_V
AS
    SELECT BASE.Placement, MediaType, ColumnCommon, ColumnA1, ColumnA2
        FROM Placement  BASE
        JOIN PlacementA SUBA
            ON BASE.Placement = SUBA.Placement
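A small runnable sketch of that View, again with Python's sqlite3 module standing in for a real platform and with placeholder columns. It shows the Basetype and Subtype pair being read back as a single row, and that a Placement of the other Subtype simply does not appear in the View.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Placement (
    Placement    TEXT NOT NULL PRIMARY KEY,
    MediaType    TEXT NOT NULL,
    ColumnCommon TEXT NOT NULL
);
CREATE TABLE PlacementA (
    Placement TEXT NOT NULL PRIMARY KEY REFERENCES Placement (Placement),
    ColumnA1  TEXT NOT NULL,
    ColumnA2  TEXT NOT NULL
);
-- Placement exists in both tables, so it is qualified to avoid ambiguity.
CREATE VIEW PlacementA_V AS
    SELECT BASE.Placement AS Placement, MediaType, ColumnCommon, ColumnA1, ColumnA2
        FROM Placement  AS BASE
        JOIN PlacementA AS SUBA
            ON BASE.Placement = SUBA.Placement;
INSERT INTO Placement VALUES ('ABCD', 'A', 'common');
INSERT INTO Placement VALUES ('WXYZ', 'B', 'common');
INSERT INTO PlacementA VALUES ('ABCD', 'a1', 'a2');
""")

QUERY = "SELECT Placement, ColumnCommon, ColumnA1 FROM PlacementA_V WHERE Placement = ?"

found = conn.execute(QUERY, ("ABCD",)).fetchall()       # an A placement
not_an_a = conn.execute(QUERY, ("WXYZ",)).fetchall()    # a B placement
```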
Enjoy.
Comments
In postgres, is there a way I could set up a constraint where the placement can ONLY exist in either PlacementA OR PlacementB and not both?
That is Exclusivity.
If you read the linked Subtype doc, I have given a full explanation and technical details for implementation in SQL, including all code (follow the links in each document). It consists of a CONSTRAINT that calls a FUNCTION.
ALTER TABLE ProductBook    -- subtype
    ADD CONSTRAINT ProductBook_Excl_ck
        -- check an existential condition, which calls
        -- function using PK & discriminator
        CHECK ( dbo.ValidateExclusive_fn ( ProductId, "B" ) = 1 )
We have had that capability in SQL for over 15 years in my experience.
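The body of ValidateExclusive_fn lives in the linked document and is not reproduced here. As an illustration only, the sketch below uses a different, commonly seen technique rather than the CONSTRAINT-plus-FUNCTION one: the Discriminator is carried into each Subtype, pinned with a CHECK, and the Subtype references the Basetype through a composite Foreign Key. It runs on Python's sqlite3 module; the column names and the try_insert helper are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")   # SQLite leaves FK enforcement off by default
conn.executescript("""
CREATE TABLE Placement (
    Placement TEXT NOT NULL PRIMARY KEY,
    MediaType TEXT NOT NULL CHECK (MediaType IN ('A', 'B')),  -- Discriminator
    UNIQUE (Placement, MediaType)          -- target of the composite Foreign Keys
);
CREATE TABLE PlacementA (
    Placement TEXT NOT NULL PRIMARY KEY,
    MediaType TEXT NOT NULL CHECK (MediaType = 'A'),
    ColumnA1  TEXT NOT NULL,
    FOREIGN KEY (Placement, MediaType) REFERENCES Placement (Placement, MediaType)
);
CREATE TABLE PlacementB (
    Placement TEXT NOT NULL PRIMARY KEY,
    MediaType TEXT NOT NULL CHECK (MediaType = 'B'),
    ColumnB1  TEXT NOT NULL,
    FOREIGN KEY (Placement, MediaType) REFERENCES Placement (Placement, MediaType)
);
""")

def try_insert(sql):
    """Run one INSERT; report whether the constraints accepted it."""
    try:
        conn.execute(sql)
        return True
    except sqlite3.IntegrityError:
        return False

base_ok = try_insert("INSERT INTO Placement VALUES ('ABCD', 'A')")
sub_a_ok = try_insert("INSERT INTO PlacementA VALUES ('ABCD', 'A', 'a1')")

# Rejected by the CHECK: a PlacementB row must carry MediaType 'B'.
wrong_type = try_insert("INSERT INTO PlacementB VALUES ('ABCD', 'A', 'b1')")
# Rejected by the Foreign Key: Placement holds ('ABCD', 'A'), not ('ABCD', 'B').
second_subtype = try_insert("INSERT INTO PlacementB VALUES ('ABCD', 'B', 'b1')")

rows_in_b = conn.execute("SELECT COUNT(*) FROM PlacementB").fetchone()[0]
```

This only prevents a Placement from appearing in both Subtypes. It does not by itself guarantee that a Basetype row has a Subtype row at all.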
PusgreNONsql is not SQL compliant in many areas.
None of the freeware/shareware/vapourware/noware is SQL compliant (their use of the term SQL is fraudulent). They do not have a Server Architecture, most do not have ACID Transactions, etc.
Therefore, no. It cannot call a Function from DDL.
As long as you understand and implement Standards, such as Open Architecture, to the degree possible in your particular NONsql suite (it cannot be labelled a platform because it has no Server Architecture), that is the best you can do.
The Open Architecture Standard demands:
- no direct INSERT/UPDATE/DELETE to the tables
- all your writes to the db are done via OLTP Transactions
  - which in SQL means: Stored Procedures with BEGIN TRAN ... COMMIT/ROLLBACK TRAN
  - but in PusgreNONsql means: Functions which are supposed to be "atomic" (quotes because it is nowhere near the Atomic that is implemented in SQL ACID Transactions [the A in ACID stands for Atomic])
Therefore, take the Exclusivity code in the Function I have given in SQL, and:
- deploy it in every "atomic" Function that INSERT/DELETEs to the Basetype or Subtype tables in your pretend sql suite. (I do not allow UPDATE to a Key, refer CASCADE above.)
- while we are here, it must be mentioned, such "atomic" Functions need to likewise have code to ensure that the Basetype-Subtype pair is INSERT/DELETEd as a pair or not at all.
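The "as a pair or not at all" requirement can be sketched as a single transaction around both INSERTs. This is a minimal illustration with Python's sqlite3 module, not a stored procedure on a real platform; the add_placement_a helper and the column names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE Placement (
    Placement TEXT NOT NULL PRIMARY KEY,
    MediaType TEXT NOT NULL
);
CREATE TABLE PlacementA (
    Placement TEXT NOT NULL PRIMARY KEY REFERENCES Placement (Placement),
    ColumnA1  TEXT NOT NULL
);
""")

def add_placement_a(placement, column_a1):
    """Insert the Basetype row and its Subtype row as one unit of work."""
    try:
        with conn:   # commits if the block succeeds, rolls back if it raises
            conn.execute("INSERT INTO Placement VALUES (?, 'A')", (placement,))
            conn.execute("INSERT INTO PlacementA VALUES (?, ?)", (placement, column_a1))
        return True
    except sqlite3.IntegrityError:
        return False

ok = add_placement_a("ABCD", "a1")
# The Subtype INSERT violates NOT NULL, so the Basetype INSERT is rolled back too.
failed = add_placement_a("WXYZ", None)

base_rows = conn.execute("SELECT Placement FROM Placement ORDER BY Placement").fetchall()
```

If callers can only write through functions like this one, a Basetype row without its Subtype row never becomes visible.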
Answer 2:
Maybe this is a subjective solution. If the Placements table does not have many columns, e.g. (detail_col_1, detail_col_2, detail_col_3 ... detail_col_6), the table design is not that bad. I mean, it doesn't depend on how many null columns you have; maybe it looks ugly, but it should work. Now, if you want a more complex method, I'd suggest one of these:
- Simple Placements table with json in it:
MediaTypes
+ id
+ name
Placements
+ id
+ mediaTypeId
+ name
+ detail
In detail I can define my attributes as json, and set the correct values for each type:
row 1: {'attr1': valx, 'attr2': valy}
row 2: {'attr4': valz, 'attr1': valw}
Now, the problem here is query filtering: you cannot filter on the attributes. This should work if you only want to store extra info.
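A minimal sketch of the JSON-in-a-column idea, using Python's json and sqlite3 modules. The detail column is stored as plain text, so the database treats it as opaque and any filtering on an attribute happens in application code. The sample values and the placements_with helper are made up for illustration.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE Placements (
    id          TEXT NOT NULL PRIMARY KEY,
    mediaTypeId TEXT NOT NULL,
    name        TEXT NOT NULL,
    detail      TEXT NOT NULL      -- JSON document, one set of attributes per row
)
""")

sample = [
    ("p1", "A", "first",  {"attr1": "x", "attr2": "y"}),
    ("p2", "B", "second", {"attr4": "z", "attr1": "w"}),
]
conn.executemany(
    "INSERT INTO Placements VALUES (?, ?, ?, ?)",
    [(pid, media, name, json.dumps(detail)) for pid, media, name, detail in sample],
)

def placements_with(attr, value):
    """Return ids whose detail has attr == value; every row is read and decoded."""
    return [
        pid
        for pid, detail in conn.execute("SELECT id, detail FROM Placements ORDER BY id")
        if json.loads(detail).get(attr) == value
    ]

matches = placements_with("attr1", "w")
no_such_attr = placements_with("attr9", "w")
```

The cost of the approach shows in placements_with: the filter cannot use an index and has to decode every row.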
- An elegant way:
MediaTypes
+ id
+ name
Placements
+ id
+ mediaTypeId
+ name
DetailAttributes //table of attributes for any type
+ id
+ name
+ mediaTypeId
PlacementDetailAttributes //many to many rel between DetailAttributes&Placements
+ placementId
+ detailAttributeId
+ value
With this approach you can add as many attributes as you want. Filtering queries by attribute should work too!
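Here is a small sketch of that attribute-table layout with a filter on one attribute, using Python's sqlite3 module. The attribute names (width, height) and their values are invented for the example, and the composite primary key on PlacementDetailAttributes is an assumption.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Placements (
    id          TEXT NOT NULL PRIMARY KEY,
    mediaTypeId TEXT NOT NULL,
    name        TEXT NOT NULL
);
CREATE TABLE DetailAttributes (
    id          TEXT NOT NULL PRIMARY KEY,
    name        TEXT NOT NULL,
    mediaTypeId TEXT NOT NULL
);
CREATE TABLE PlacementDetailAttributes (
    placementId       TEXT NOT NULL REFERENCES Placements (id),
    detailAttributeId TEXT NOT NULL REFERENCES DetailAttributes (id),
    value             TEXT NOT NULL,
    PRIMARY KEY (placementId, detailAttributeId)
);
INSERT INTO Placements VALUES ('p1', 'A', 'first'), ('p2', 'A', 'second');
INSERT INTO DetailAttributes VALUES ('d1', 'width', 'A'), ('d2', 'height', 'A');
INSERT INTO PlacementDetailAttributes VALUES
    ('p1', 'd1', '300'), ('p1', 'd2', '250'), ('p2', 'd1', '728');
""")

# Find placements by attribute name and value, entirely in SQL.
FILTER_BY_ATTRIBUTE = """
SELECT p.id
    FROM Placements p
    JOIN PlacementDetailAttributes pda ON pda.placementId = p.id
    JOIN DetailAttributes da ON da.id = pda.detailAttributeId
    WHERE da.name = ? AND pda.value = ?
    ORDER BY p.id
"""

wide = [row[0] for row in conn.execute(FILTER_BY_ATTRIBUTE, ("width", "728"))]
tall = [row[0] for row in conn.execute(FILTER_BY_ATTRIBUTE, ("height", "250"))]
```

One trade-off to keep in mind: every value shares a single text column, so the database cannot enforce a per-attribute datatype.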
Source: https://stackoverflow.com/questions/58862651/how-do-i-get-around-this-relational-database-design-smell