Recovering from broken database connection with hibernate

痞子三分冷 提交于 2019-12-12 01:25:33

问题


I am having trouble recovering from a broken database connection. My test case uses the following libraries.
guice-persist 4.0
Hibernate core 4.3.1
HikariCP-java6-2.3.9 (connection pool).

My test runs a simple read in a loop with a small sleep between each read.

@com.google.inject.persist.Transactional
protected void simpleRead() {
  dao = new MyDao(....)
  dao.findBy(...)
}

The Dao acquires its entityManager like this

@Inject
protected com.google.inject.Provider<EntityManager> entityManager;

The guice module is bound like this

@Override
public void configure() {
  install(new JpaPersistModule("test"));
  bind(JPAInitializer.class).asEagerSingleton();
}

My persistence.xml defines the transaction-type as "RESOURCE-LOCAL".

And I have a few hikari settings, but Im not sure if these are relevant

 <property name="hibernate.connection.provider_class" value="com.zaxxer.hikari.hibernate.HikariConnectionProvider" /> 
 <property name="hibernate.hikari.maxLifetime" value="1800000" />
 <property name="hibernate.hikari.connectionTestQuery" value="SELECT 1" />
 <property name="hibernate.hikari.leakDetectionThreshold" value="300000" />

While the test is looping through the reads, I break the SQLServer connection by stopping the service from task manager. If the loop was not in the simplRead() method, at the time, it gracefully handles the error on the next attempted read and re-connects when the service is started again. However, if it was in the simpleRead() method when the service is stopped, it fails and never recovers. The first failure is reported as follows


org.hibernate.TransactionException: JDBC begin transaction failed: javax.persistence.PersistenceException: org.hibernate.TransactionException: JDBC begin transaction failed: at org.hibernate.jpa.spi.AbstractEntityManagerImpl.convert(AbstractEntityManagerImpl.java:1763) at org.hibernate.jpa.spi.AbstractEntityManagerImpl.convert(AbstractEntityManagerImpl.java:1677) at org.hibernate.jpa.spi.AbstractEntityManagerImpl.throwPersistenceException(AbstractEntityManagerImpl.java:1771) at org.hibernate.jpa.internal.TransactionImpl.begin(TransactionImpl.java:64) at com.google.inject.persist.jpa.JpaLocalTxnInterceptor.invoke(JpaLocalTxnInterceptor.java:66) at net.impro.portal.server.model.dao.TestConnectionStuff.testConnectionBroken(TestConnectionStuff.java:40) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source) at java.lang.reflect.Method.invoke(Unknown Source) at org.testng.internal.MethodInvocationHelper.invokeMethod(MethodInvocationHelper.java:80) at org.testng.internal.Invoker.invokeMethod(Invoker.java:714) at org.testng.internal.Invoker.invokeTestMethod(Invoker.java:901) at org.testng.internal.Invoker.invokeTestMethods(Invoker.java:1231) at org.testng.internal.TestMethodWorker.invokeTestMethods(TestMethodWorker.java:127) at org.testng.internal.TestMethodWorker.run(TestMethodWorker.java:111) at org.testng.TestRunner.privateRun(TestRunner.java:767) at org.testng.TestRunner.run(TestRunner.java:617) at org.testng.SuiteRunner.runTest(SuiteRunner.java:334) at org.testng.SuiteRunner.runSequentially(SuiteRunner.java:329) at org.testng.SuiteRunner.privateRun(SuiteRunner.java:291) at org.testng.SuiteRunner.run(SuiteRunner.java:240) at org.testng.SuiteRunnerWorker.runSuite(SuiteRunnerWorker.java:52) at org.testng.SuiteRunnerWorker.run(SuiteRunnerWorker.java:86) at org.testng.TestNG.runSuitesSequentially(TestNG.java:1197) at org.testng.TestNG.runSuitesLocally(TestNG.java:1122) at org.testng.TestNG.run(TestNG.java:1030) at org.testng.remote.RemoteTestNG.run(RemoteTestNG.java:111) at org.testng.remote.RemoteTestNG.initAndRun(RemoteTestNG.java:204) at org.testng.remote.RemoteTestNG.main(RemoteTestNG.java:175) Caused by: org.hibernate.TransactionException: JDBC begin transaction failed: at org.hibernate.engine.transaction.internal.jdbc.JdbcTransaction.doBegin(JdbcTransaction.java:76) at org.hibernate.engine.transaction.spi.AbstractTransactionImpl.begin(AbstractTransactionImpl.java:162) at org.hibernate.internal.SessionImpl.beginTransaction(SessionImpl.java:1431) at org.hibernate.jpa.internal.TransactionImpl.begin(TransactionImpl.java:61) ... 26 more Caused by: java.sql.SQLException: I/O Error: Connection reset by peer: socket write error at net.sourceforge.jtds.jdbc.TdsCore.executeSQL(TdsCore.java:1059) at net.sourceforge.jtds.jdbc.TdsCore.submitSQL(TdsCore.java:905) at net.sourceforge.jtds.jdbc.JtdsConnection.setAutoCommit(JtdsConnection.java:2291) at com.zaxxer.hikari.proxy.ConnectionProxy.setAutoCommit(ConnectionProxy.java:334) at com.zaxxer.hikari.proxy.ConnectionJavassistProxy.setAutoCommit(ConnectionJavassistProxy.java) at org.hibernate.engine.transaction.internal.jdbc.JdbcTransaction.doBegin(JdbcTransaction.java:72) ... 29 more Caused by: java.net.SocketException: Connection reset by peer: socket write error at java.net.SocketOutputStream.socketWrite0(Native Method) at java.net.SocketOutputStream.socketWrite(Unknown Source) at java.net.SocketOutputStream.write(Unknown Source) at java.io.DataOutputStream.write(Unknown Source) at net.sourceforge.jtds.jdbc.SharedSocket.sendNetPacket(SharedSocket.java:717) at net.sourceforge.jtds.jdbc.RequestStream.putPacket(RequestStream.java:570) at net.sourceforge.jtds.jdbc.RequestStream.flush(RequestStream.java:518) at net.sourceforge.jtds.jdbc.TdsCore.executeSQL(TdsCore.java:1046) ... 34 more ...

And because that now doesnt clean up properly, the subsequent requests give fail like this

org.hibernate.TransactionException: Already have an associated managed connection

Im not finding it easy to determine which layer is responsible for this problem. From what I have read, the connection pool should test broken connections and clean things up, but I suspect its guice-persist which forever has the broken EntityManager reference associated with its thread-local object. Of course, it may well also be the way Im using them.

EDIT

After further investigation, I noticed that if I wait > 1000ms between each test read, then the code behaves as expected. I get the StackTrace below, and then after I restore the SQLServer service, it happily reconnects. Anything less than 1000ms and I get the error listed above in the original post. It seems hibernate does not handle the IO/Error in the same way and the connection is forever left in a broken state.

javax.persistence.QueryTimeoutException: Could not open connection
    at org.hibernate.jpa.spi.AbstractEntityManagerImpl.convert(AbstractEntityManagerImpl.java:1725)
    at org.hibernate.jpa.spi.AbstractEntityManagerImpl.convert(AbstractEntityManagerImpl.java:1677)
    at org.hibernate.jpa.spi.AbstractEntityManagerImpl.throwPersistenceException(AbstractEntityManagerImpl.java:1771)
    at org.hibernate.jpa.internal.TransactionImpl.begin(TransactionImpl.java:64)
    at com.google.inject.persist.jpa.JpaLocalTxnInterceptor.invoke(JpaLocalTxnInterceptor.java:66)
    at net.impro.portal.server.model.dao.TestConnectionStuff.testConnectionBroken(TestConnectionStuff.java:40)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
    at java.lang.reflect.Method.invoke(Unknown Source)
    at org.testng.internal.MethodInvocationHelper.invokeMethod(MethodInvocationHelper.java:80)
    at org.testng.internal.Invoker.invokeMethod(Invoker.java:714)
    at org.testng.internal.Invoker.invokeTestMethod(Invoker.java:901)
    at org.testng.internal.Invoker.invokeTestMethods(Invoker.java:1231)
    at org.testng.internal.TestMethodWorker.invokeTestMethods(TestMethodWorker.java:127)
    at org.testng.internal.TestMethodWorker.run(TestMethodWorker.java:111)
    at org.testng.TestRunner.privateRun(TestRunner.java:767)
    at org.testng.TestRunner.run(TestRunner.java:617)
    at org.testng.SuiteRunner.runTest(SuiteRunner.java:334)
    at org.testng.SuiteRunner.runSequentially(SuiteRunner.java:329)
    at org.testng.SuiteRunner.privateRun(SuiteRunner.java:291)
    at org.testng.SuiteRunner.run(SuiteRunner.java:240)
    at org.testng.SuiteRunnerWorker.runSuite(SuiteRunnerWorker.java:52)
    at org.testng.SuiteRunnerWorker.run(SuiteRunnerWorker.java:86)
    at org.testng.TestNG.runSuitesSequentially(TestNG.java:1197)
    at org.testng.TestNG.runSuitesLocally(TestNG.java:1122)
    at org.testng.TestNG.run(TestNG.java:1030)
    at org.testng.remote.RemoteTestNG.run(RemoteTestNG.java:111)
    at org.testng.remote.RemoteTestNG.initAndRun(RemoteTestNG.java:204)
    at org.testng.remote.RemoteTestNG.main(RemoteTestNG.java:175)
Caused by: org.hibernate.QueryTimeoutException: Could not open connection
    at org.hibernate.exception.internal.SQLExceptionTypeDelegate.convert(SQLExceptionTypeDelegate.java:83)
    at org.hibernate.exception.internal.StandardSQLExceptionConverter.convert(StandardSQLExceptionConverter.java:49)
    at org.hibernate.engine.jdbc.spi.SqlExceptionHelper.convert(SqlExceptionHelper.java:126)
    at org.hibernate.engine.jdbc.spi.SqlExceptionHelper.convert(SqlExceptionHelper.java:112)
    at org.hibernate.engine.jdbc.internal.LogicalConnectionImpl.obtainConnection(LogicalConnectionImpl.java:235)
    at org.hibernate.engine.jdbc.internal.LogicalConnectionImpl.getConnection(LogicalConnectionImpl.java:171)
    at org.hibernate.engine.transaction.internal.jdbc.JdbcTransaction.doBegin(JdbcTransaction.java:67)
    at org.hibernate.engine.transaction.spi.AbstractTransactionImpl.begin(AbstractTransactionImpl.java:162)
    at org.hibernate.internal.SessionImpl.beginTransaction(SessionImpl.java:1471)
    at org.hibernate.jpa.internal.TransactionImpl.begin(TransactionImpl.java:61)
    ... 26 more
Caused by: java.sql.SQLTimeoutException: Timeout after 30004ms of waiting for a connection.
    at com.zaxxer.hikari.pool.BaseHikariPool.getConnection(BaseHikariPool.java:233)
    at com.zaxxer.hikari.pool.BaseHikariPool.getConnection(BaseHikariPool.java:183)
    at com.zaxxer.hikari.HikariDataSource.getConnection(HikariDataSource.java:93)
    at com.zaxxer.hikari.hibernate.HikariConnectionProvider.getConnection(HikariConnectionProvider.java:101)
    at org.hibernate.internal.AbstractSessionImpl$NonContextualJdbcConnectionAccess.obtainConnection(AbstractSessionImpl.java:380)
    at org.hibernate.engine.jdbc.internal.LogicalConnectionImpl.obtainConnection(LogicalConnectionImpl.java:228)
    ... 31 more
Caused by: java.sql.SQLException: Network error IOException: Connection refused: connect
    at net.sourceforge.jtds.jdbc.JtdsConnection.<init>(JtdsConnection.java:436)
    at net.sourceforge.jtds.jdbc.Driver.connect(Driver.java:184)
    at com.zaxxer.hikari.util.DriverDataSource.getConnection(DriverDataSource.java:95)
    at com.zaxxer.hikari.util.DriverDataSource.getConnection(DriverDataSource.java:101)
    at com.zaxxer.hikari.pool.BaseHikariPool.addConnection(BaseHikariPool.java:444)
    at com.zaxxer.hikari.pool.BaseHikariPool$1.run(BaseHikariPool.java:419)
    at java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source)
    at java.util.concurrent.FutureTask.run(Unknown Source)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
    at java.lang.Thread.run(Unknown Source)
Caused by: java.net.ConnectException: Connection refused: connect
    at java.net.DualStackPlainSocketImpl.waitForConnect(Native Method)
    at java.net.DualStackPlainSocketImpl.socketConnect(Unknown Source)
    at java.net.AbstractPlainSocketImpl.doConnect(Unknown Source)
    at java.net.AbstractPlainSocketImpl.connectToAddress(Unknown Source)
    at java.net.AbstractPlainSocketImpl.connect(Unknown Source)
    at java.net.PlainSocketImpl.connect(Unknown Source)
    at java.net.SocksSocketImpl.connect(Unknown Source)
    at java.net.Socket.connect(Unknown Source)
    at net.sourceforge.jtds.jdbc.SharedSocket.createSocketForJDBC3(SharedSocket.java:288)
    at net.sourceforge.jtds.jdbc.SharedSocket.<init>(SharedSocket.java:251)
    at net.sourceforge.jtds.jdbc.JtdsConnection.<init>(JtdsConnection.java:331)
    ... 10 more

回答1:


What version of jTDS are you using? It looks like that version of the driver is not wrapping the SocketException with a SQLException containing a SQLState value starting with '08' (indicating disconnection).

HikariCP checks for these SQLStates and flags the connection for eviction. That connection should be closed and never returned again.

Newer versions of the jTDS driver appears to handle it correctly, if I am reading their code correctly.

UPDATE: This of course assumes that Hibernate still closes the Connection after the exception (like in a finally block).

UPDATE2: I missed the second exception part, that looks like a Hibernate issue to me ... not properly recovering from the first exception. If you enable logging for the com.zaxxer.hikari package, you should see a warning about the connection being evicted. Do you see it?

UPDATE3: I looked at jTDS 1.3.0, and it is doing the right thing. You can see here that it will generate a SQLException with SQLState="08S01", which will be checked by HikariCP and the Connection evicted.

More suspicion now lies on Hibernate. I do not have the time right now to look at their code, but I suggest you do, using the line numbers in the stacktrace.




回答2:


I finally found the problem and it is a known hibernate bug with "blocker" status.

Without any reasonable workaround, database recovery seems impossible or at least very impractical.



来源:https://stackoverflow.com/questions/31858194/recovering-from-broken-database-connection-with-hibernate

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!