How to implement a temporal table using JPA?

有刺的猬 2021-01-31 04:29

I would like to know how to implement temporal tables in JPA 2 with EclipseLink. By temporal I mean tables that define a validity period.

One problem that I'm facing is t

4 answers
  •  无人及你
    2021-01-31 04:49

    I am very interested in this topic. I have worked for several years on applications that use these patterns; in our case the idea came from a German diploma thesis.

    I didn't know the "DAO Fusion" framework; it provides interesting information and links, so thanks for pointing it out. The pattern page and the aspects page are especially good!

    To your questions: no, I cannot point to other sites, examples or frameworks. I am afraid you have to either use the DAO Fusion framework or implement this functionality yourself. You have to decide which kind of functionality you really need. In terms of the "DAO Fusion" framework: do you need both "valid temporal" and "record temporal"? Record temporal states when the change was applied to your database (usually used for auditing). Valid temporal states when the change occurred, or is valid, in real life (used by the application), which might differ from record temporal. In most cases one dimension is sufficient and the second is not needed.

    Anyway, temporal functionality has an impact on your database. As you stated: "which now their primary keys include the validity period". So how do you model the identity of an entity? I prefer to use surrogate keys. In that case this means:

    • one id for the entity
    • one id for the object in the database (the row)
    • the temporal columns

    The primary key for the table is the object id. Each entity has one or more (1-n) entries in a table, each identified by its object id. Linking between tables is based on the entity id. Since the temporal entries multiply the amount of data, standard relationships don't work. A standard 1-n relationship might become an x*1-y*n relationship.
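
    The three parts of that model (entity id, row id, temporal columns) can be sketched as a plain Java class. This is only an illustration: the class name, the field names and the INFINITY marker are assumptions, and the JPA mapping is left out so that the code compiles on its own.

```java
import java.time.LocalDate;

// Hypothetical row of a temporal PERSON table; every name here is illustrative.
// In a JPA mapping, rowId would be the @Id and entityId an ordinary indexed column.
final class PersonVersion {
    // Assumed marker for an open-ended interval ("infinity").
    static final LocalDate INFINITY = LocalDate.of(9999, 12, 31);

    final long rowId;          // id of the object in the database (the row), primary key
    final long entityId;       // id of the entity, shared by all of its versions
    final String name;
    final LocalDate validFrom; // inclusive start of validity
    final LocalDate validTo;   // exclusive end of validity

    PersonVersion(long rowId, long entityId, String name,
                  LocalDate validFrom, LocalDate validTo) {
        if (!validFrom.isBefore(validTo)) {
            throw new IllegalArgumentException("validFrom must be before validTo");
        }
        this.rowId = rowId;
        this.entityId = entityId;
        this.name = name;
        this.validFrom = validFrom;
        this.validTo = validTo;
    }

    // True when this version is valid on the given day: validFrom <= day < validTo.
    boolean isValidOn(LocalDate day) {
        return !day.isBefore(validFrom) && day.isBefore(validTo);
    }
}
```

    Two versions of the same person would share the entityId and differ in rowId and in their validity interval.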

    How do you solve this? The standard approach would be to introduce a mapping table, but this is not a natural approach. Just to edit one table (e.g. when a residence change occurs) you would also have to update or insert into the mapping table, which is strange for every programmer.

    The other approach is not to use a mapping table. In this case you cannot use referential integrity and foreign keys; each table stands alone, and the linking from one table to the others must be implemented manually instead of with JPA functionality.

    The functionality for initializing database objects should live within the objects (as in the DAO Fusion framework). I would not put it in a service. Whether you put it in a DAO or use the Active Record pattern is up to you.

    I am aware that my answer doesn't give you a "ready to use" framework. You are in a very complicated area, and in my experience resources for this usage scenario are very hard to find. Thanks for your question! I hope I have helped you with your design anyway.

    In this answer you will find the reference book "Developing Time-Oriented Database Applications in SQL", see https://stackoverflow.com/a/800516/734687

    Update: Example

    • Question: Let's say that I have a PERSON table that has a surrogate key, a field named "id". Every referencing table at this point will have that "ID" as a foreign key constraint. If I add temporal columns now, I have to change the primary key to "id+from_date+to_date". Before changing the primary key I would first have to drop every foreign key constraint from every referencing table to the referenced table (Person). Am I right? I believe that's what you mean by the surrogate key. ID is a generated key that could be generated by a sequence. The business key of the Person table is the SSN.
    • Answer: Not exactly. SSN would be a natural key, which I do not use for object identity. Also, "id+from_date+to_date" would be a composite key, which I would also avoid. If you look at the example, you would have two tables, person and residence, and for our example say we have a 1-n relationship with a foreign key in residence. Now we add temporal fields to each table. Yes, we drop every foreign key constraint. Person will get 2 IDs: one ID to identify the row (call it ROW_ID) and one ID to identify the person itself (call it ENTITY_ID), with an index on that id. Same for the residence. Of course your approach would work too, but in that case you would have operations which change the ROW_ID (when you close a time interval), which I would avoid.
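
    To make the point about closing a time interval concrete, here is a minimal in-memory sketch in Java, assuming valid time only. The class ResidenceHistory, its fields and the INFINITY marker are hypothetical, and a real implementation would go through JPA rather than a list. Closing the open interval only changes validTo; the ROW_ID of the existing row is never touched, and the new version gets a fresh ROW_ID with the same person ENTITY_ID.

```java
import java.time.LocalDate;
import java.util.ArrayList;
import java.util.List;

// Hypothetical in-memory stand-in for a temporal RESIDENCE table.
final class ResidenceHistory {
    static final LocalDate INFINITY = LocalDate.of(9999, 12, 31); // assumed open-end marker

    static final class Row {
        final long rowId;          // ROW_ID: never changes once assigned
        final long personEntityId; // ENTITY_ID of the person this row belongs to
        final String city;
        final LocalDate validFrom; // inclusive
        LocalDate validTo;         // exclusive; the only column updated on close

        Row(long rowId, long personEntityId, String city, LocalDate validFrom, LocalDate validTo) {
            this.rowId = rowId;
            this.personEntityId = personEntityId;
            this.city = city;
            this.validFrom = validFrom;
            this.validTo = validTo;
        }
    }

    private final List<Row> rows = new ArrayList<>();
    private long nextRowId = 1;

    // Closes the person's open interval at 'from' and appends a new open-ended row.
    Row move(long personEntityId, String city, LocalDate from) {
        for (Row r : rows) {
            if (r.personEntityId == personEntityId && r.validTo.equals(INFINITY)) {
                if (!from.isAfter(r.validFrom)) {
                    throw new IllegalArgumentException("new interval must start after the open one");
                }
                r.validTo = from; // close the interval; rowId stays the same
            }
        }
        Row added = new Row(nextRowId++, personEntityId, city, from, INFINITY);
        rows.add(added);
        return added;
    }

    List<Row> rows() {
        return rows;
    }
}
```

    If record time is tracked as well, the close would itself become an insert of a new row instead of an update, which this sketch deliberately leaves out.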

    To extend the example implemented with the assumptions above (2 tables, 1-n):

    • a query to show all entries in the database (all validity information and record - aka technical - information included):

      SELECT * FROM Person p, Residence r
      WHERE p.ENTITY_ID = r.FK_ENTITY_ID_PERSON          -- join on the entity id
    • a query to hide the record - aka technical - information. This shows all the validity changes of the entities.

      SELECT * FROM Person p, Residence r
      WHERE p.ENTITY_ID = r.FK_ENTITY_ID_PERSON AND
      p.recordTo=[infinity] AND r.recordTo=[infinity]    -- only current technical state
    • a query to show the current values.

      SELECT * FROM Person p, Residence r
      WHERE p.ENTITY_ID = r.FK_ENTITY_ID_PERSON AND
      p.recordTo=[infinity] AND r.recordTo=[infinity] AND
      p.validFrom <= [now] AND p.validTo > [now] AND        -- only current valid state person
      r.validFrom <= [now] AND r.validTo > [now]            -- only current valid state residence

    As you can see I never use the ROW_ID. Replace [now] with a timestamp to go back in time.
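
    The same filter can be written down in Java to show what the last query does with [infinity] and [now]. This is a sketch over in-memory lists, not JPA code; the classes Person and Residence, their fields and the INFINITY constant are assumptions made for illustration.

```java
import java.time.LocalDate;
import java.util.ArrayList;
import java.util.List;

final class AsOfQuery {
    static final LocalDate INFINITY = LocalDate.of(9999, 12, 31); // assumed stand-in for [infinity]

    // Temporal columns shared by both hypothetical tables.
    static class Temporal {
        final LocalDate validFrom; // inclusive
        final LocalDate validTo;   // exclusive
        final LocalDate recordTo;  // INFINITY marks the current technical state

        Temporal(LocalDate validFrom, LocalDate validTo, LocalDate recordTo) {
            this.validFrom = validFrom;
            this.validTo = validTo;
            this.recordTo = recordTo;
        }

        // recordTo = [infinity] AND validFrom <= asOf AND validTo > asOf
        boolean visibleAt(LocalDate asOf) {
            return recordTo.equals(INFINITY) && !asOf.isBefore(validFrom) && asOf.isBefore(validTo);
        }
    }

    static final class Person extends Temporal {
        final long entityId;
        final String name;

        Person(long entityId, String name, LocalDate validFrom, LocalDate validTo, LocalDate recordTo) {
            super(validFrom, validTo, recordTo);
            this.entityId = entityId;
            this.name = name;
        }
    }

    static final class Residence extends Temporal {
        final long personEntityId; // plays the role of FK_ENTITY_ID_PERSON
        final String city;

        Residence(long personEntityId, String city, LocalDate validFrom, LocalDate validTo, LocalDate recordTo) {
            super(validFrom, validTo, recordTo);
            this.personEntityId = personEntityId;
            this.city = city;
        }
    }

    // Joins on the entity id and keeps only rows visible at asOf.
    static List<String> residencesAt(List<Person> persons, List<Residence> residences, LocalDate asOf) {
        List<String> result = new ArrayList<>();
        for (Person p : persons) {
            if (!p.visibleAt(asOf)) {
                continue;
            }
            for (Residence r : residences) {
                if (r.personEntityId == p.entityId && r.visibleAt(asOf)) {
                    result.add(p.name + " -> " + r.city);
                }
            }
        }
        return result;
    }
}
```

    Passing an earlier date as asOf corresponds to replacing [now] with a timestamp in the SQL above.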

    Update to reflect your update
    I would recommend the following data model:

    Introduce a "PlaysInTeam" table:

    • ID
    • ID Team (foreign key to team)
    • ID Player (foreign key to player)
    • ValidFrom
    • ValidTo

    When you list the players of a team you have to query with the date for which the relationship should be valid; that date has to lie in [ValidFrom, ValidTo).
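
    A minimal Java sketch of that lookup, assuming the PlaysInTeam rows are already loaded into a list. TeamRoster and playersOf are hypothetical names; with JPA the same condition would normally sit in the WHERE clause of a query instead.

```java
import java.time.LocalDate;
import java.util.ArrayList;
import java.util.List;

final class TeamRoster {
    // Hypothetical in-memory row of the "PlaysInTeam" table.
    static final class PlaysInTeam {
        final long teamId;
        final long playerId;
        final LocalDate validFrom; // inclusive
        final LocalDate validTo;   // exclusive

        PlaysInTeam(long teamId, long playerId, LocalDate validFrom, LocalDate validTo) {
            this.teamId = teamId;
            this.playerId = playerId;
            this.validFrom = validFrom;
            this.validTo = validTo;
        }
    }

    // Returns the ids of all players whose membership in the team covers 'date',
    // i.e. date lies in the half-open interval [validFrom, validTo).
    static List<Long> playersOf(long teamId, LocalDate date, List<PlaysInTeam> rows) {
        List<Long> players = new ArrayList<>();
        for (PlaysInTeam row : rows) {
            boolean inInterval = !date.isBefore(row.validFrom) && date.isBefore(row.validTo);
            if (row.teamId == teamId && inInterval) {
                players.add(row.playerId);
            }
        }
        return players;
    }
}
```

    Because the interval is half-open, a player who changes teams on a given day appears only in the new team for that day.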

    To make the team temporal I have two approaches:

    Approach 1: Introduce a "Season" table which models a validity for a season

    • ID
    • Season name (e.g. Summer 2011)
    • From (maybe not necessary, because everyone knows when the season is)
    • To (maybe not necessary, because everyone knows when the season is)

    Split the team table. You will have fields which belong to the team and are not time relevant (name, address, ...) and fields which are time relevant for a season (win, loss, ...). In that case I would use Team and TeamInSeason. PlaysInTeam could link to TeamInSeason instead of Team (this has to be considered; I would let it point to Team).

    TeamInSeason

    • ID
    • ID Team
    • ID Season
    • Win
    • Loss
    • ...

    Approach 2: Do not model the season explicitly. Split the team table. You will have fields which belong to the team and are not time relevant (name, address, ...) and fields which are time relevant (win, loss, ...). In that case I would use Team and TeamInterval. TeamInterval would have the fields "from" and "to" for the interval. PlaysInTeam could link to TeamInterval instead of Team (I would leave it pointing to Team).

    TeamInterval

    • ID
    • ID Team
    • From
    • To
    • Win
    • Loss
    • ...

    In both approaches: if you do not need a separate team table for the fields that are not time relevant, do not split.
