Bugzilla – Bug 3685
RFT losing connection with MySQL
Last modified: 2006-02-10 15:31:17
This problem is closely related to bug http://bugzilla.globus.org/globus/show_bug.cgi?id=3191.

I've noticed that RFT doesn't check whether the connection is alive before executing an SQL query. With a Postgres DB this was a problem only when the DB server was restarted. With a MySQL DB it is a bigger issue: MySQL applies a timeout to inactive connections, and its default value is 28800 s (8 hours). If RFT doesn't use the DB for 8 hours, all subsequent queries fail and RFT reports an error (an example is listed at the end). Furthermore, there doesn't seem to be a way to set this timeout to infinite.

I worked around the problem by setting the timeout (the exact variable name is wait_timeout, see: http://dev.mysql.com/doc/mysql/en/server-system-variables.html) to a longer period. However, this is not really an elegant solution. Will this issue be resolved in future versions?

----------------------------------------------------
Here's an example of a failed RFT request:

$ globusrun-ws -submit -F remote.resource -s -c /bin/hostname
Delegating user credentials...Done.
Submitting job...Done.
Job ID: uuid:050d0c40-14a2-11da-84a2-000a5e1d7991
Termination time: 08/25/2005 13:21 GMT
Current job state: Active
Current job state: CleanUp-Hold
remote.resource
Current job state: CleanUp
Current job state: Failed
Destroying job...Done.
Cleaning up any delegated credentials...Done.
globusrun-ws: Job failed: Staging error for RSL element fileCleanUp.
java.rmi.RemoteException: Unable to create RFT resource; nested exception is:
    org.globus.transfer.reliable.service.database.RftDBException: Error inserting records into database [Caused by: No operations allowed after connection closed.Connection was implicitly closed due to underlying exception/error:

** BEGIN NESTED EXCEPTION **

com.mysql.jdbc.CommunicationsException
MESSAGE: Communications link failure due to underlying exception:

** BEGIN NESTED EXCEPTION **

java.io.EOFException

STACKTRACE:

java.io.EOFException
    at com.mysql.jdbc.MysqlIO.readFully(MysqlIO.java:1902)
    at com.mysql.jdbc.MysqlIO.reuseAndReadPacket(MysqlIO.java:2348)
    at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:2858)
    at com.mysql.jdbc.MysqlIO.sendCommand(MysqlIO.java:1570)
    at com.mysql.jdbc.MysqlIO.sqlQueryDirect(MysqlIO.java:1665)
    at com.mysql.jdbc.Connection.execSQL(Connection.java:2972)
    at com.mysql.jdbc.Connection.setAutoCommit(Connection.java:4870)
    at org.apache.commons.dbcp.DelegatingConnection.setAutoCommit(DelegatingConnection.java:237)
    at org.apache.commons.dbcp.PoolableConnectionFactory.activateObject(PoolableConnectionFactory.java:273)
    at org.apache.commons.pool.impl.GenericObjectPool.borrowObject(GenericObjectPool.java:753)
    at org.apache.commons.dbcp.PoolingDriver.connect(PoolingDriver.java:146)
    at java.sql.DriverManager.getConnection(DriverManager.java:512)
    at java.sql.DriverManager.getConnection(DriverManager.java:193)
    at org.globus.transfer.reliable.service.database.RFTDatabaseSetup.getDBConnection(RFTDatabaseSetup.java:198)
    at org.globus.transfer.reliable.service.database.ReliableFileTransferDbAdapter.getTotalNumberOfBytes(ReliableFileTransferDbAdapter.java:1854)
    at org.globus.transfer.reliable.service.factory.TotalNumberOfBytesTransferred.getTotalNumberOfBytesTransferred(TotalNumberOfBytesTransferred.java:51)
    at sun.reflect.GeneratedMethodAccessor113.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:324)
    at org.globus.wsrf.impl.ReflectionResourceProperty.getValueSimple(ReflectionResourceProperty.java:340)
    at org.globus.wsrf.impl.ReflectionResourceProperty.toSOAPElementSimple(ReflectionResourceProperty.java:431)
    at org.globus.wsrf.impl.ReflectionResourceProperty.toSOAPElements(ReflectionResourceProperty.java:320)
    at org.globus.wsrf.impl.properties.GetMultipleResourcePropertiesProvider.getMultipleResourceProperties(GetMultipleResourcePropertiesProvider.java:97)
    at sun.reflect.GeneratedMethodAccessor112.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:324)
    at org.apache.axis.providers.java.RPCProvider.invokeMethod(RPCProvider.java:384)
    at org.globus.axis.providers.RPCProvider.invokeMethodSub(RPCProvider.java:107)
    at org.globus.axis.providers.PrivilegedInvokeMethodAction.run(PrivilegedInvokeMethodAction.java:42)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:379)
    at org.globus.gsi.jaas.GlobusSubject.runAs(GlobusSubject.java:55)
    at org.globus.gsi.jaas.JaasSubject.doAs(JaasSubject.java:90)
    at org.globus.axis.providers.RPCProvider.invokeMethod(RPCProvider.java:97)
    at org.apache.axis.providers.java.RPCProvider.processMessage(RPCProvider.java:281)
    at org.apache.axis.providers.java.JavaProvider.invoke(JavaProvider.java:319)
    at org.apache.axis.strategies.InvocationStrategy.visit(InvocationStrategy.java:32)
    at org.apache.axis.SimpleChain.doVisiting(SimpleChain.java:118)
    at org.apache.axis.SimpleChain.invoke(SimpleChain.java:83)
    at org.apache.axis.handlers.soap.SOAPService.invoke(SOAPService.java:450)
    at org.apache.axis.server.AxisServer.invoke(AxisServer.java:285)
    at org.globus.wsrf.container.ServiceThread.doPost(ServiceThread.java:677)
    at org.globus.wsrf.container.ServiceThread.process(ServiceThread.java:398)
    at org.globus.wsrf.container.GSIServiceThread.process(GSIServiceThread.java:130)
    at org.globus.wsrf.container.ServiceThread.run(ServiceThread.java:302)

** END NESTED EXCEPTION **
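For illustration only, here is a minimal sketch of the "check that the connection is alive before using it" idea from the report. It is not the fix committed to RFT and does not reflect RFTDatabaseSetup.java. The class name ValidatingConnectionSource and its opener parameter are hypothetical, and the sketch assumes a JDBC 4.0 driver so that Connection.isValid is available. The trace above shows RFT obtaining connections through a commons-dbcp pool; this sketch uses a single cached plain JDBC connection to stay self-contained.

```java
import java.sql.Connection;
import java.sql.SQLException;
import java.util.concurrent.Callable;

/**
 * Hypothetical helper (not part of RFT): hands out a cached JDBC connection,
 * but first checks that it is still alive and reopens it if it is not.
 */
final class ValidatingConnectionSource {
    // Assumption: the caller supplies how to open a connection,
    // e.g. () -> DriverManager.getConnection(url, user, password).
    private final Callable<Connection> opener;
    private final int validationTimeoutSeconds;
    private Connection cached;

    ValidatingConnectionSource(Callable<Connection> opener, int validationTimeoutSeconds) {
        this.opener = opener;
        this.validationTimeoutSeconds = validationTimeoutSeconds;
    }

    /** Returns the cached connection, replacing it first if it is no longer usable. */
    synchronized Connection get() throws Exception {
        if (cached == null || !isAlive(cached)) {
            closeQuietly(cached);
            cached = opener.call();
        }
        return cached;
    }

    private boolean isAlive(Connection c) {
        try {
            // isValid asks the driver to verify the link, so a connection the
            // server dropped after wait_timeout should be reported as dead.
            return !c.isClosed() && c.isValid(validationTimeoutSeconds);
        } catch (SQLException e) {
            return false;
        }
    }

    private static void closeQuietly(Connection c) {
        if (c == null) {
            return;
        }
        try {
            c.close();
        } catch (SQLException ignored) {
            // The connection is already unusable; nothing more to do.
        }
    }
}
```

With a helper like this, a caller would ask for get() immediately before each query instead of holding on to a Connection across long idle periods. A connection pool with a validation step on borrow would serve the same purpose.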
Yes, it will be fixed in the next release.
Excellent. Would it be good to add to the documentation that wait_timeout has to be set to a longer period (just to prevent user questions in the future)?
A fix for this has been committed to trunk and the release branch.
Can you provide a patch for this bug against the Globus 4.0.1 source base? Our hope is to get this into the VDT as soon as possible for deployment on Open Science Grid, and we're currently basing our builds on Globus 4.0.1. Thanks!
Alain, try this patch and let me know if you see any problems: http://www-unix.mcs.anl.gov/~madduri/patch_for_roy
Ravi, the patch seems to apply correctly. I'll do a build soon and we can test it. Just to verify: you just changed RFTDatabaseSetup.java for this patch, and nothing else--correct? Thanks!
Yes, that is true.