Temporal database design, with a twist (live vs draft rows)

我们两清 提交于 2019-11-27 11:43:30

Anchor modeling is a nice way to implement a temporal dB -- see the Wikipedia article too. Takes some time to get used to, but work nice. There is an online modeling tool and if you load the supplied XML file [File -> Load Model from Local File] you should see something like this -- also use [Layout --> Togle Names].

The [Generate --> SQL Code] will produce DDL for tables, views and point-in-time functions. The code is quite long, so I am not posting it here. Check the code out -- you may need to modify it for your DB.

Here is the file to load into modeling tool.

<schema>
<knot mnemonic="EXP" descriptor="Expired" identity="smallint" dataRange="char(1)">
<identity generator="true"/>
<layout x="713.96" y="511.22" fixed="true"/>
</knot>
<anchor mnemonic="US" descriptor="User" identity="int">
<identity generator="true"/>
<attribute mnemonic="USN" descriptor="UserName" dataRange="varchar(32)">
<layout x="923.38" y="206.54" fixed="true"/>
</attribute>
<layout x="891.00" y="242.00" fixed="true"/>
</anchor>
<anchor mnemonic="PO" descriptor="Post" identity="int">
<identity generator="true"/>
<attribute mnemonic="TIT" descriptor="Title" dataRange="varchar(2)">
<layout x="828.00" y="562.00" fixed="true"/>
</attribute>
<layout x="855.00" y="471.00" fixed="true"/>
</anchor>
<anchor mnemonic="TG" descriptor="Tag" identity="int">
<identity generator="true"/>
<attribute mnemonic="TGT" descriptor="TagText" dataRange="varchar(32)">
<layout x="551.26" y="331.69" fixed="true"/>
</attribute>
<layout x="637.29" y="263.43" fixed="true"/>
</anchor>
<anchor mnemonic="BO" descriptor="Body" identity="int">
<identity generator="true"/>
<attribute mnemonic="BOT" descriptor="BodyText" dataRange="varchar(max)">
<layout x="1161.00" y="491.00" fixed="true"/>
</attribute>
<layout x="1052.00" y="465.00" fixed="true"/>
</anchor>
<tie timeRange="datetime">
<anchorRole role="IsTagged" type="PO" identifier="true"/>
<anchorRole role="IsAttached" type="TG" identifier="true"/>
<anchorRole role="BYAuthor" type="US" identifier="false"/>
<knotRole role="Until" type="EXP" identifier="false"/>
<layout x="722.00" y="397.00" fixed="true"/>
</tie>
<tie timeRange="datetime">
<anchorRole role="Contains" type="PO" identifier="true"/>
<anchorRole role="ContainedIn" type="BO" identifier="false"/>
<layout x="975.00" y="576.00" fixed="true"/>
</tie>
<tie>
<anchorRole role="CreatedBy" type="TG" identifier="true"/>
<anchorRole role="Author" type="US" identifier="false"/>
<layout x="755.10" y="195.17" fixed="true"/>
</tie>
<tie>
<anchorRole role="CreatedBy" type="PO" identifier="true"/>
<anchorRole role="Author" type="US" identifier="false"/>
<layout x="890.69" y="369.09" fixed="true"/>
</tie>
<tie>
<anchorRole role="ModifiedBy" type="BO" identifier="true"/>
<anchorRole role="Author" type="US" identifier="false"/>
<layout x="1061.81" y="322.34" fixed="true"/>
</tie>
</schema>

I've implemented a temporal database using SCD type 2 and PostgreSQL Rules and Triggers, and wrapped it in a self-contained package for ActiveRecord: http://github.com/ifad/chronomodel

The design is independent from the language / framework, though - you can create Rules and Triggers manually and the database will take care of the rest. Have a look at https://github.com/ifad/chronomodel/blob/master/README.sql.

Also efficient indexing and querying of temporal data using geometric operators is included as a bonus. :-)

post2tag_revs has a problem in that it is trying to express 2 fundamentally different concepts.

A tag applied to a draft post revision only ever applies to that one revision, unless the revision is ever published.

Once a tag is published (i.e. associated with a published post revision), it applies to every future revision of the post until it is revoked.

And associating with a published revision, or unasociating, isn't necessarily simultaneous with a revision being published, unless you artificially enforce this by cloning a revision just so you can associate tag additions or removals...

I'd change the model by making post2tag_revs.post_rev only relevant for draft tags. Once the revision is published (and the tag is live), I'd use a time stamp column to mark the beginning and end of the published validity. You may or may not want a new post2tag_revs entry to represent this change.

As you point out, this makes this relationship bi-temporal. You might improve performance in the "normal" case by adding a boolean to post2tag to indicate that the tag is currently associated with the post.

I think I nailed it. Basically, you add a (unique) draft field to the relevant tables, and you work on the drafts as if they were a new post/tag/etc.:

posts (
 id pkey,
 public,
 created_at stamptz,
 updated_at stamptz,
 updated_by int,
 draft int fkey posts (id) unique
)

post_revs (
 id,
 public,
 created_at,
 created_by,
 deleted_at,
 pkey (id, created_at)
)

tags (
 id pkey,
 public,
 created_at,
 updated_at,
 updated_by,
 draft fkey tags (id) unique
)


tag_revs (
 id,
 public,
 created_at,
 created_by,
 deleted_at,
 pkey (id, created_at)
)

post2tag (
 post_id fkey posts(id),
 tag_id fkey tags(id),
 public,
 created_at,
 updated_at,
 updated_by,
 pkey (post_id, tag_id)
)

post2tag_revs (
 post_id,
 tag_id,
 public,
 created_at,
 created_by,
 deleted_at,
 pkey (post_id, tag_id, created_at)
)

Use only 3 tables: posts, tags and post2tag.

Add start_time and end_time columns to all tables. Add unique index for key, start_time and end_time. Add unique index for key where end_time is null. Add trigers.

For current:

SELECT ... WHERE end_time IS NULL

At time:

WHERE (SELECT CASE WHEN end_time IS NULL
THEN (start_time <= at_time)
ELSE (start_time <= at_time AND end_time > at_time)
END)

Search of current data is not slow because of functional index.

Edit:

CREATE UNIQUE INDEX ... ON post2tag (post_id, tag_id) WHERE end_time IS NULL;
CREATE UNIQUE INDEX ... ON post2tag (post_id, tag_id, start_time, end_time);

FOREIGN KEY (post_id, start_time, end_time) REFERENCES posts (post_id, start_time, end_time) ON DELETE CASCADE ON UPDATE CASCADE;
FOREIGN KEY (tag_id, start_time, end_time) REFERENCES tags (tag_id, start_time, end_time) ON DELETE CASCADE ON UPDATE CASCADE;
易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!