JPA EntityManager: Why use persist() over merge()?

说谎 2020-11-22 03:09

EntityManager.merge() can insert new objects and update existing ones.

Why would one want to use persist() (which can only create new objects)?

15 Answers
  • 2020-11-22 03:16

    JPA is indisputably a great simplification in the domain of enterprise applications built on the Java platform. As a developer who had to cope with the intricacies of the old entity beans in J2EE, I see the inclusion of JPA among the Java EE specifications as a big leap forward. However, while delving deeper into the JPA details, I find things that are not so easy. In this article I compare the EntityManager’s merge and persist methods, whose overlapping behavior may confuse not only newcomers. Furthermore, I propose a generalization that sees both methods as special cases of a more general method, combine.

    Persisting entities

    In contrast to the merge method, the persist method is pretty straightforward and intuitive. The most common scenario of the persist method's usage can be summed up as follows:

    "A newly created instance of the entity class is passed to the persist method. After this method returns, the entity is managed and planned for insertion into the database. It may happen at or before the transaction commits or when the flush method is called. If the entity references another entity through a relationship marked with the PERSIST cascade strategy this procedure is applied to it also."

    The specification goes into more detail; however, remembering those details is not crucial, as they cover more or less exotic situations only.

    Merging entities

    In comparison to persist, the description of merge's behavior is not so simple. There is no main scenario, as there is in the case of persist, and a programmer must remember all scenarios in order to write correct code. It seems to me that the JPA designers wanted a method whose primary concern would be handling detached entities (as opposed to the persist method, which deals primarily with newly created entities). The merge method's major task is to transfer the state from an unmanaged entity (passed as the argument) to its managed counterpart within the persistence context. This task, however, divides further into several scenarios, which makes the overall behavior of the method harder to understand.

    Instead of repeating paragraphs from the JPA specification I have prepared a flow diagram that schematically depicts the behaviour of the merge method:

    [Figure: flow diagram of the merge method's behaviour]

    So, when should I use persist and when merge?

    persist

    • You want the method to always create a new entity and never update an existing one. If a row with the same primary key already exists, the operation fails with an exception caused by the primary key uniqueness violation.
    • Batch processes, handling entities in a stateful manner (see Gateway pattern).
    • Performance optimization

    merge

    • You want the method to either insert or update an entity in the database.
    • You want to handle entities in a stateless manner (data transfer objects in services).
    • You want to insert a new entity that may have a reference to another entity that may or may not have been created yet (the relationship must be marked MERGE). For example, inserting a new photo with a reference to either a new or a preexisting album.
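
    The insert-only versus insert-or-update distinction in the two lists above can be sketched with a toy in-memory store. This is plain Java, not the JPA API: `ToyStore`, its `persist` and `merge` methods, and the `Photo` class are hypothetical names that only mimic the contract described here. A real provider may also report the duplicate key later, at flush or commit time, rather than immediately.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model only: mimics "insert-only" vs. "insert-or-update".
// None of these types belong to JPA.
class ToyStoreDemo {
    public static void main(String[] args) {
        ToyStore store = new ToyStore();
        store.persist(new Photo(1L, "beach"));
        store.merge(new Photo(1L, "beach at dusk")); // key exists: update
        store.merge(new Photo(2L, "forest"));        // key absent: insert
        try {
            store.persist(new Photo(1L, "again"));   // key exists: rejected
        } catch (IllegalStateException e) {
            System.out.println("persist rejected: " + e.getMessage());
        }
        System.out.println(store.size() + " rows, row 1 = " + store.titleOf(1L));
    }
}

final class Photo {
    final long id;
    final String title;

    Photo(long id, String title) {
        this.id = id;
        this.title = title;
    }
}

class ToyStore {
    private final Map<Long, Photo> rows = new HashMap<>();

    // Like persist(): refuses to overwrite an existing primary key.
    void persist(Photo p) {
        if (rows.containsKey(p.id)) {
            throw new IllegalStateException("duplicate primary key " + p.id);
        }
        rows.put(p.id, p);
    }

    // Like merge(): inserts when the key is absent, updates otherwise.
    void merge(Photo p) {
        rows.put(p.id, p);
    }

    String titleOf(long id) {
        return rows.get(id).title;
    }

    int size() {
        return rows.size();
    }
}
```

    In this sketch the second `persist` with key 1 fails, while both `merge` calls succeed regardless of whether the key was already present.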
  • 2020-11-22 03:20

    Either way will add an entity to a PersistenceContext; the difference is in what you do with the entity afterwards.

    Persist takes an entity instance, adds it to the context and makes that instance managed (i.e. future updates to the entity will be tracked).

    Merge returns the managed instance that the state was merged into. It returns either an instance that already exists in the PersistenceContext or a newly created instance of your entity. In either case, it copies the state from the supplied entity and returns the managed copy. The instance you pass in will not be managed (any changes you make to it will not be part of the transaction, unless you call merge again), though you can use the returned (managed) instance.

    Maybe a code example will help.

    MyEntity e = new MyEntity();

    // scenario 1
    // transaction starts
    em.persist(e);
    e.setSomeField(someValue);
    // transaction ends, and the new value of someField is written to the database
    // (e itself is managed, so the change is tracked)

    // scenario 2
    // transaction starts
    e = new MyEntity();
    em.merge(e);
    e.setSomeField(anotherValue);
    // transaction ends, but the new value of someField is not written to the database
    // (the change was made to e *after* merging, and e is not the managed instance)

    // scenario 3
    // transaction starts
    e = new MyEntity();
    MyEntity e2 = em.merge(e);
    e2.setSomeField(anotherValue);
    // transaction ends, and the new value of someField is written to the database
    // (the change was made to e2, the managed instance, not to e)

    Scenarios 1 and 3 are roughly equivalent, but there are some situations where you'd want to use Scenario 2.
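
    If you want to run the three scenarios without a JPA provider, here is a minimal sketch that models only the point above: persist manages the instance you pass in, while merge manages a copy and returns it. `ToyContext`, `Note` and `flush()` are made-up names for illustration and are not JPA types; the "flush" simply collects the state of whatever is managed.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Set;

// Toy model only: tracks which object instances are "managed".
class ManagedIdentityDemo {
    public static void main(String[] args) {
        ToyContext ctx = new ToyContext();

        Note a = new Note();
        ctx.persist(a);                  // a itself is now managed
        a.text = "set after persist";

        Note b = new Note();
        Note managed = ctx.merge(b);     // a copy of b is managed, b is not
        b.text = "set on the merge argument";
        managed.text = "set on the returned copy";

        System.out.println(ctx.isManaged(a));       // true
        System.out.println(ctx.isManaged(b));       // false
        System.out.println(ctx.isManaged(managed)); // true
        System.out.println(ctx.flush());            // b's text is absent
    }
}

class Note {
    String text = "";
}

class ToyContext {
    // Identity-based set: membership depends on the instance, not on equals().
    private final Set<Note> managed =
            Collections.newSetFromMap(new IdentityHashMap<Note, Boolean>());

    // Like persist(): the passed instance becomes managed.
    void persist(Note n) {
        managed.add(n);
    }

    // Like merge(): state is copied into a managed instance, which is returned.
    Note merge(Note n) {
        Note copy = new Note();
        copy.text = n.text;
        managed.add(copy);
        return copy;
    }

    boolean isManaged(Note n) {
        return managed.contains(n);
    }

    // Simulated flush: only managed instances contribute their current state.
    List<String> flush() {
        List<String> written = new ArrayList<>();
        for (Note n : managed) {
            written.add(n.text);
        }
        Collections.sort(written);
        return written;
    }
}
```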

  • 2020-11-22 03:20

    Some more details about merge that will help you decide when to use merge over persist:

    Returning a managed instance other than the original entity is a critical part of the merge process. If an entity instance with the same identifier already exists in the persistence context, the provider will overwrite its state with the state of the entity that is being merged, but the managed version that existed already must be returned to the client so that it can be used. If the provider did not update the Employee instance in the persistence context, any references to that instance will become inconsistent with the new state being merged in.

    When merge() is invoked on a new entity, it behaves similarly to the persist() operation. It adds the entity to the persistence context, but instead of adding the original entity instance, it creates a new copy and manages that instance instead. The copy that is created by the merge() operation is persisted as if the persist() method were invoked on it.

    In the presence of relationships, the merge() operation will attempt to update the managed entity to point to managed versions of the entities referenced by the detached entity. If the entity has a relationship to an object that has no persistent identity, the outcome of the merge operation is undefined. Some providers might allow the managed copy to point to the non-persistent object, whereas others might throw an exception immediately. The merge() operation can be optionally cascaded in these cases to prevent an exception from occurring. We will cover cascading of the merge() operation later in this section. If an entity being merged points to a removed entity, an IllegalArgumentException exception will be thrown.

    Lazy-loading relationships are a special case in the merge operation. If a lazy-loading relationship was not triggered on an entity before it became detached, that relationship will be ignored when the entity is merged. If the relationship was triggered while managed and then set to null while the entity was detached, the managed version of the entity will likewise have the relationship cleared during the merge.

    All of the above information was taken from "Pro JPA 2: Mastering the Java™ Persistence API" by Mike Keith and Merrick Schincariol, Chapter 6, in the section on detachment and merging. This is actually the authors' second book devoted to JPA, and it contains much more information than the former one. I really recommend this book to anyone who will be seriously involved with JPA. I am sorry for posting my first answer anonymously.

  • 2020-11-22 03:22

    I noticed that when I used em.merge, I got a SELECT statement for every INSERT, even when there was no field that JPA was generating for me; the primary key field was a UUID that I set myself. I switched to em.persist(myEntityObject) and got just INSERT statements.
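
    To make the observation concrete, here is a toy statement log. It is only a sketch of the behaviour reported above, not how any provider is implemented: `ToyDatabase` and its methods are hypothetical. It assumes that merge has to look the row up first because an entity with a self-assigned id gives no hint whether it is new. It also logs the INSERT immediately, whereas a real provider would normally defer it until flush.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.UUID;

// Toy model only: records which kind of statement each call would issue.
class StatementCountDemo {
    public static void main(String[] args) {
        ToyDatabase db = new ToyDatabase();
        db.persist(UUID.randomUUID().toString(), "first");
        db.merge(UUID.randomUUID().toString(), "second");
        System.out.println(db.log()); // [INSERT, SELECT, INSERT]
    }
}

class ToyDatabase {
    private final Map<String, String> rows = new HashMap<>();
    private final List<String> log = new ArrayList<>();

    // Like persist(): the caller asserts the row is new, so no lookup is needed.
    void persist(String id, String value) {
        log.add("INSERT");
        rows.put(id, value);
    }

    // Like merge() with an assigned id: look the row up, then insert or update.
    void merge(String id, String value) {
        log.add("SELECT");
        boolean exists = rows.containsKey(id);
        log.add(exists ? "UPDATE" : "INSERT");
        rows.put(id, value);
    }

    List<String> log() {
        return log;
    }
}
```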

  • 2020-11-22 03:25

    If you're using the assigned generator, using merge instead of persist can cause a redundant SQL statement, therefore affecting performance.

    Calling merge for managed entities is also a mistake, since managed entities are automatically managed by Hibernate and their state is synchronized with the database record by the dirty checking mechanism upon flushing the Persistence Context.

    To understand how all this works, you should first know that Hibernate shifts the developer mindset from SQL statements to entity state transitions.

    Once an entity is actively managed by Hibernate, all changes are going to be automatically propagated to the database.

    Hibernate monitors currently attached entities. But for an entity to become managed, it must be in the right entity state.

    To understand the JPA state transitions better, you can visualize the following diagram:

    [Figure: JPA entity state transitions]

    Or if you use the Hibernate-specific API:

    [Figure: Hibernate entity state transitions]

    As illustrated by the above diagrams, an entity can be in one of the following four states:

    • New (Transient)

      A newly created object that hasn’t ever been associated with a Hibernate Session (a.k.a Persistence Context) and is not mapped to any database table row is considered to be in the New (Transient) state.

      For the entity to become persistent, we need to either explicitly call the EntityManager#persist method or make use of the transitive persistence mechanism.

    • Persistent (Managed)

      A persistent entity has been associated with a database table row and it’s being managed by the currently running Persistence Context. Any change made to such an entity is going to be detected and propagated to the database (during the Session flush-time). With Hibernate, we no longer have to execute INSERT/UPDATE/DELETE statements. Hibernate employs a transactional write-behind working style and changes are synchronized at the very last responsible moment, during the current Session flush-time.

    • Detached

      Once the currently running Persistence Context is closed all the previously managed entities become detached. Successive changes will no longer be tracked and no automatic database synchronization is going to happen.

      To associate a detached entity to an active Hibernate Session, you can choose one of the following options:

      • Reattaching

        Hibernate (but not JPA 2.1) supports reattaching through the Session#update method. A Hibernate Session can only associate one Entity object for a given database row. This is because the Persistence Context acts as an in-memory cache (first level cache) and only one value (entity) is associated with a given key (entity type and database identifier). An entity can be reattached only if there is no other JVM object (matching the same database row) already associated with the current Hibernate Session.

      • Merging

        The merge is going to copy the detached entity state (source) to a managed entity instance (destination). If the merging entity has no equivalent in the current Session, one will be fetched from the database. The detached object instance will continue to remain detached even after the merge operation.

    • Removed

      Although JPA demands that only managed entities may be removed, Hibernate can also delete detached entities (but only through a Session#delete method call). A removed entity is only scheduled for deletion, and the actual database DELETE statement will be executed during Session flush-time.
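
    The four states and the transitions named above can be written down as a small state machine. This is a simplified sketch, not part of JPA or Hibernate: the `EntityState` enum and its method names are invented for illustration, and every transition that is not described in the list above is deliberately left unmodeled and throws.

```java
// Simplified sketch of the four entity states; not a JPA or Hibernate type.
class EntityStateDemo {
    public static void main(String[] args) {
        EntityState s = EntityState.NEW.persist();        // NEW -> MANAGED
        s = s.contextClosed();                            // MANAGED -> DETACHED
        System.out.println(s);                            // prints DETACHED
        System.out.println(s.stateOfMergeResult());       // prints MANAGED
        System.out.println(EntityState.MANAGED.remove()); // prints REMOVED
    }
}

enum EntityState {
    NEW, MANAGED, DETACHED, REMOVED;

    // persist(): a new entity becomes managed.
    EntityState persist() {
        if (this == NEW) {
            return MANAGED;
        }
        throw notModeled("persist");
    }

    // Closing the persistence context detaches previously managed entities.
    EntityState contextClosed() {
        if (this == MANAGED) {
            return DETACHED;
        }
        throw notModeled("contextClosed");
    }

    // merge(): state of the instance that merge returns.
    // The instance passed to merge keeps its own state.
    EntityState stateOfMergeResult() {
        if (this == NEW || this == DETACHED) {
            return MANAGED;
        }
        throw notModeled("merge");
    }

    // remove(): a managed entity is scheduled for deletion.
    EntityState remove() {
        if (this == MANAGED) {
            return REMOVED;
        }
        throw notModeled("remove");
    }

    private IllegalStateException notModeled(String operation) {
        return new IllegalStateException(
                operation + " from " + this + " is not modeled in this sketch");
    }
}
```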

  • 2020-11-22 03:26

    Going through the answers, there are some details missing regarding 'Cascade' and id generation. See question

    Also, it is worth mentioning that you can have separate cascade settings for merging and persisting, CascadeType.MERGE and CascadeType.PERSIST, each of which is applied according to the method being called.

    The spec is your friend ;)
