问题
I am having trouble recovering from a broken database connection. My test case uses the following libraries.
guice-persist 4.0
Hibernate core 4.3.1
HikariCP-java6-2.3.9 (connection pool).
My test runs a simple read in a loop with a small sleep between each read.
@com.google.inject.persist.Transactional
protected void simpleRead() {
dao = new MyDao(....)
dao.findBy(...)
}
The Dao acquires its entityManager like this
@Inject
protected com.google.inject.Provider<EntityManager> entityManager;
The guice module is bound like this
@Override
public void configure() {
install(new JpaPersistModule("test"));
bind(JPAInitializer.class).asEagerSingleton();
}
My persistence.xml defines the transaction-type as "RESOURCE-LOCAL".
And I have a few hikari settings, but Im not sure if these are relevant
<property name="hibernate.connection.provider_class" value="com.zaxxer.hikari.hibernate.HikariConnectionProvider" />
<property name="hibernate.hikari.maxLifetime" value="1800000" />
<property name="hibernate.hikari.connectionTestQuery" value="SELECT 1" />
<property name="hibernate.hikari.leakDetectionThreshold" value="300000" />
While the test is looping through the reads, I break the SQLServer connection by stopping the service from task manager. If the loop was not in the simplRead() method, at the time, it gracefully handles the error on the next attempted read and re-connects when the service is started again. However, if it was in the simpleRead() method when the service is stopped, it fails and never recovers. The first failure is reported as follows
org.hibernate.TransactionException: JDBC begin transaction failed:
javax.persistence.PersistenceException: org.hibernate.TransactionException: JDBC begin transaction failed:
at org.hibernate.jpa.spi.AbstractEntityManagerImpl.convert(AbstractEntityManagerImpl.java:1763)
at org.hibernate.jpa.spi.AbstractEntityManagerImpl.convert(AbstractEntityManagerImpl.java:1677)
at org.hibernate.jpa.spi.AbstractEntityManagerImpl.throwPersistenceException(AbstractEntityManagerImpl.java:1771)
at org.hibernate.jpa.internal.TransactionImpl.begin(TransactionImpl.java:64)
at com.google.inject.persist.jpa.JpaLocalTxnInterceptor.invoke(JpaLocalTxnInterceptor.java:66)
at net.impro.portal.server.model.dao.TestConnectionStuff.testConnectionBroken(TestConnectionStuff.java:40)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
at java.lang.reflect.Method.invoke(Unknown Source)
at org.testng.internal.MethodInvocationHelper.invokeMethod(MethodInvocationHelper.java:80)
at org.testng.internal.Invoker.invokeMethod(Invoker.java:714)
at org.testng.internal.Invoker.invokeTestMethod(Invoker.java:901)
at org.testng.internal.Invoker.invokeTestMethods(Invoker.java:1231)
at org.testng.internal.TestMethodWorker.invokeTestMethods(TestMethodWorker.java:127)
at org.testng.internal.TestMethodWorker.run(TestMethodWorker.java:111)
at org.testng.TestRunner.privateRun(TestRunner.java:767)
at org.testng.TestRunner.run(TestRunner.java:617)
at org.testng.SuiteRunner.runTest(SuiteRunner.java:334)
at org.testng.SuiteRunner.runSequentially(SuiteRunner.java:329)
at org.testng.SuiteRunner.privateRun(SuiteRunner.java:291)
at org.testng.SuiteRunner.run(SuiteRunner.java:240)
at org.testng.SuiteRunnerWorker.runSuite(SuiteRunnerWorker.java:52)
at org.testng.SuiteRunnerWorker.run(SuiteRunnerWorker.java:86)
at org.testng.TestNG.runSuitesSequentially(TestNG.java:1197)
at org.testng.TestNG.runSuitesLocally(TestNG.java:1122)
at org.testng.TestNG.run(TestNG.java:1030)
at org.testng.remote.RemoteTestNG.run(RemoteTestNG.java:111)
at org.testng.remote.RemoteTestNG.initAndRun(RemoteTestNG.java:204)
at org.testng.remote.RemoteTestNG.main(RemoteTestNG.java:175)
Caused by: org.hibernate.TransactionException: JDBC begin transaction failed:
at org.hibernate.engine.transaction.internal.jdbc.JdbcTransaction.doBegin(JdbcTransaction.java:76)
at org.hibernate.engine.transaction.spi.AbstractTransactionImpl.begin(AbstractTransactionImpl.java:162)
at org.hibernate.internal.SessionImpl.beginTransaction(SessionImpl.java:1431)
at org.hibernate.jpa.internal.TransactionImpl.begin(TransactionImpl.java:61)
... 26 more
Caused by: java.sql.SQLException: I/O Error: Connection reset by peer: socket write error
at net.sourceforge.jtds.jdbc.TdsCore.executeSQL(TdsCore.java:1059)
at net.sourceforge.jtds.jdbc.TdsCore.submitSQL(TdsCore.java:905)
at net.sourceforge.jtds.jdbc.JtdsConnection.setAutoCommit(JtdsConnection.java:2291)
at com.zaxxer.hikari.proxy.ConnectionProxy.setAutoCommit(ConnectionProxy.java:334)
at com.zaxxer.hikari.proxy.ConnectionJavassistProxy.setAutoCommit(ConnectionJavassistProxy.java)
at org.hibernate.engine.transaction.internal.jdbc.JdbcTransaction.doBegin(JdbcTransaction.java:72)
... 29 more
Caused by: java.net.SocketException: Connection reset by peer: socket write error
at java.net.SocketOutputStream.socketWrite0(Native Method)
at java.net.SocketOutputStream.socketWrite(Unknown Source)
at java.net.SocketOutputStream.write(Unknown Source)
at java.io.DataOutputStream.write(Unknown Source)
at net.sourceforge.jtds.jdbc.SharedSocket.sendNetPacket(SharedSocket.java:717)
at net.sourceforge.jtds.jdbc.RequestStream.putPacket(RequestStream.java:570)
at net.sourceforge.jtds.jdbc.RequestStream.flush(RequestStream.java:518)
at net.sourceforge.jtds.jdbc.TdsCore.executeSQL(TdsCore.java:1046)
... 34 more
...
And because that now doesnt clean up properly, the subsequent requests give fail like this
org.hibernate.TransactionException: Already have an associated managed connection
Im not finding it easy to determine which layer is responsible for this problem. From what I have read, the connection pool should test broken connections and clean things up, but I suspect its guice-persist which forever has the broken EntityManager reference associated with its thread-local object. Of course, it may well also be the way Im using them.
EDIT
After further investigation, I noticed that if I wait > 1000ms between each test read, then the code behaves as expected. I get the StackTrace below, and then after I restore the SQLServer service, it happily reconnects. Anything less than 1000ms and I get the error listed above in the original post. It seems hibernate does not handle the IO/Error in the same way and the connection is forever left in a broken state.
javax.persistence.QueryTimeoutException: Could not open connection
at org.hibernate.jpa.spi.AbstractEntityManagerImpl.convert(AbstractEntityManagerImpl.java:1725)
at org.hibernate.jpa.spi.AbstractEntityManagerImpl.convert(AbstractEntityManagerImpl.java:1677)
at org.hibernate.jpa.spi.AbstractEntityManagerImpl.throwPersistenceException(AbstractEntityManagerImpl.java:1771)
at org.hibernate.jpa.internal.TransactionImpl.begin(TransactionImpl.java:64)
at com.google.inject.persist.jpa.JpaLocalTxnInterceptor.invoke(JpaLocalTxnInterceptor.java:66)
at net.impro.portal.server.model.dao.TestConnectionStuff.testConnectionBroken(TestConnectionStuff.java:40)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
at java.lang.reflect.Method.invoke(Unknown Source)
at org.testng.internal.MethodInvocationHelper.invokeMethod(MethodInvocationHelper.java:80)
at org.testng.internal.Invoker.invokeMethod(Invoker.java:714)
at org.testng.internal.Invoker.invokeTestMethod(Invoker.java:901)
at org.testng.internal.Invoker.invokeTestMethods(Invoker.java:1231)
at org.testng.internal.TestMethodWorker.invokeTestMethods(TestMethodWorker.java:127)
at org.testng.internal.TestMethodWorker.run(TestMethodWorker.java:111)
at org.testng.TestRunner.privateRun(TestRunner.java:767)
at org.testng.TestRunner.run(TestRunner.java:617)
at org.testng.SuiteRunner.runTest(SuiteRunner.java:334)
at org.testng.SuiteRunner.runSequentially(SuiteRunner.java:329)
at org.testng.SuiteRunner.privateRun(SuiteRunner.java:291)
at org.testng.SuiteRunner.run(SuiteRunner.java:240)
at org.testng.SuiteRunnerWorker.runSuite(SuiteRunnerWorker.java:52)
at org.testng.SuiteRunnerWorker.run(SuiteRunnerWorker.java:86)
at org.testng.TestNG.runSuitesSequentially(TestNG.java:1197)
at org.testng.TestNG.runSuitesLocally(TestNG.java:1122)
at org.testng.TestNG.run(TestNG.java:1030)
at org.testng.remote.RemoteTestNG.run(RemoteTestNG.java:111)
at org.testng.remote.RemoteTestNG.initAndRun(RemoteTestNG.java:204)
at org.testng.remote.RemoteTestNG.main(RemoteTestNG.java:175)
Caused by: org.hibernate.QueryTimeoutException: Could not open connection
at org.hibernate.exception.internal.SQLExceptionTypeDelegate.convert(SQLExceptionTypeDelegate.java:83)
at org.hibernate.exception.internal.StandardSQLExceptionConverter.convert(StandardSQLExceptionConverter.java:49)
at org.hibernate.engine.jdbc.spi.SqlExceptionHelper.convert(SqlExceptionHelper.java:126)
at org.hibernate.engine.jdbc.spi.SqlExceptionHelper.convert(SqlExceptionHelper.java:112)
at org.hibernate.engine.jdbc.internal.LogicalConnectionImpl.obtainConnection(LogicalConnectionImpl.java:235)
at org.hibernate.engine.jdbc.internal.LogicalConnectionImpl.getConnection(LogicalConnectionImpl.java:171)
at org.hibernate.engine.transaction.internal.jdbc.JdbcTransaction.doBegin(JdbcTransaction.java:67)
at org.hibernate.engine.transaction.spi.AbstractTransactionImpl.begin(AbstractTransactionImpl.java:162)
at org.hibernate.internal.SessionImpl.beginTransaction(SessionImpl.java:1471)
at org.hibernate.jpa.internal.TransactionImpl.begin(TransactionImpl.java:61)
... 26 more
Caused by: java.sql.SQLTimeoutException: Timeout after 30004ms of waiting for a connection.
at com.zaxxer.hikari.pool.BaseHikariPool.getConnection(BaseHikariPool.java:233)
at com.zaxxer.hikari.pool.BaseHikariPool.getConnection(BaseHikariPool.java:183)
at com.zaxxer.hikari.HikariDataSource.getConnection(HikariDataSource.java:93)
at com.zaxxer.hikari.hibernate.HikariConnectionProvider.getConnection(HikariConnectionProvider.java:101)
at org.hibernate.internal.AbstractSessionImpl$NonContextualJdbcConnectionAccess.obtainConnection(AbstractSessionImpl.java:380)
at org.hibernate.engine.jdbc.internal.LogicalConnectionImpl.obtainConnection(LogicalConnectionImpl.java:228)
... 31 more
Caused by: java.sql.SQLException: Network error IOException: Connection refused: connect
at net.sourceforge.jtds.jdbc.JtdsConnection.<init>(JtdsConnection.java:436)
at net.sourceforge.jtds.jdbc.Driver.connect(Driver.java:184)
at com.zaxxer.hikari.util.DriverDataSource.getConnection(DriverDataSource.java:95)
at com.zaxxer.hikari.util.DriverDataSource.getConnection(DriverDataSource.java:101)
at com.zaxxer.hikari.pool.BaseHikariPool.addConnection(BaseHikariPool.java:444)
at com.zaxxer.hikari.pool.BaseHikariPool$1.run(BaseHikariPool.java:419)
at java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source)
at java.util.concurrent.FutureTask.run(Unknown Source)
at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
at java.lang.Thread.run(Unknown Source)
Caused by: java.net.ConnectException: Connection refused: connect
at java.net.DualStackPlainSocketImpl.waitForConnect(Native Method)
at java.net.DualStackPlainSocketImpl.socketConnect(Unknown Source)
at java.net.AbstractPlainSocketImpl.doConnect(Unknown Source)
at java.net.AbstractPlainSocketImpl.connectToAddress(Unknown Source)
at java.net.AbstractPlainSocketImpl.connect(Unknown Source)
at java.net.PlainSocketImpl.connect(Unknown Source)
at java.net.SocksSocketImpl.connect(Unknown Source)
at java.net.Socket.connect(Unknown Source)
at net.sourceforge.jtds.jdbc.SharedSocket.createSocketForJDBC3(SharedSocket.java:288)
at net.sourceforge.jtds.jdbc.SharedSocket.<init>(SharedSocket.java:251)
at net.sourceforge.jtds.jdbc.JtdsConnection.<init>(JtdsConnection.java:331)
... 10 more
回答1:
What version of jTDS are you using? It looks like that version of the driver is not wrapping the SocketException
with a SQLException
containing a SQLState value starting with '08' (indicating disconnection).
HikariCP checks for these SQLStates and flags the connection for eviction. That connection should be closed and never returned again.
Newer versions of the jTDS driver appears to handle it correctly, if I am reading their code correctly.
UPDATE: This of course assumes that Hibernate still closes the Connection after the exception (like in a finally block).
UPDATE2: I missed the second exception part, that looks like a Hibernate issue to me ... not properly recovering from the first exception. If you enable logging for the com.zaxxer.hikari
package, you should see a warning about the connection being evicted. Do you see it?
UPDATE3: I looked at jTDS 1.3.0, and it is doing the right thing. You can see here that it will generate a SQLException with SQLState="08S01", which will be checked by HikariCP and the Connection evicted.
More suspicion now lies on Hibernate. I do not have the time right now to look at their code, but I suggest you do, using the line numbers in the stacktrace.
回答2:
I finally found the problem and it is a known hibernate bug with "blocker" status.
Without any reasonable workaround, database recovery seems impossible or at least very impractical.
来源:https://stackoverflow.com/questions/31858194/recovering-from-broken-database-connection-with-hibernate