I have a very simple scenario involving a database and a JMS in an application server (Glassfish). The scenario is dead simple:
You are experiencing the classic XA two-phase commit (2PC) race condition. It does happen in production environments.
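To make the race concrete, here is a minimal sketch. It assumes the symptom is that a message consumer runs before the database commit is visible. It uses hypothetical in-memory stand-ins (`FakeResource`, `consumerSeesRow`) and no real XA, JDBC, or JMS API. It only shows that the commit phase reaches the resources one at a time, so the order matters.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical in-memory stand-ins; no real XA, JDBC, or JMS API is used.
class CommitOrderSketch {

    /** A prepared resource whose changes become visible only on commit. */
    static class FakeResource {
        boolean committed = false;
        Runnable onCommit = () -> {};   // hook fired once the commit is visible

        void commit() {
            committed = true;
            onCommit.run();
        }
    }

    /**
     * Runs the commit phase over two prepared resources in the given order and
     * reports whether the message consumer found the database row.
     */
    static boolean consumerSeesRow(boolean jmsCommitsFirst) {
        FakeResource db = new FakeResource();
        FakeResource jms = new FakeResource();
        boolean[] rowSeen = {false};

        // The consumer is triggered as soon as the message becomes visible
        // and immediately looks for the row written in the same transaction.
        jms.onCommit = () -> rowSeen[0] = db.committed;

        List<FakeResource> order = new ArrayList<>();
        if (jmsCommitsFirst) {
            order.add(jms);
            order.add(db);
        } else {
            order.add(db);
            order.add(jms);
        }

        // Phase 2 of 2PC: commits are issued one resource at a time.
        for (FakeResource r : order) {
            r.commit();
        }
        return rowSeen[0];
    }

    public static void main(String[] args) {
        System.out.println("JMS first: consumer sees row = " + consumerSeesRow(true));
        System.out.println("DB first:  consumer sees row = " + consumerSeesRow(false));
    }
}
```

In a real server the two commits are separate network calls, so the window is a matter of timing rather than a fixed outcome. The sketch just pins the ordering down so the effect is repeatable.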
Three things come to mind.
WebLogic has the LLR (Logging Last Resource) optimization, which avoids this problem and still gives you all XA guarantees.