Bug 3685 - RFT losing connection with MySQL
Status: RESOLVED FIXED
Product: RFT
Component: RFT
Version: 4.0.1
Platform: PC Linux
Importance: P1 major
Target Milestone: ---
: 4197
Reported: 2005-08-24 08:53 by
Modified: 2006-02-10 15:31 (History)



Description From 2005-08-24 08:53:12
This problem is closely related to the bug:
http://bugzilla.globus.org/globus/show_bug.cgi?id=3191. 

I've noticed that RFT doesn't check whether a connection is still alive before
executing an SQL query. With a Postgres DB this was a problem only when the DB
server was restarted.

With a MySQL DB this is a bigger issue. MySQL closes inactive connections after
a timeout whose default value is 28800 s (8 hours). If RFT doesn't use the DB
for 8 hours, all subsequent queries fail and RFT reports an error (an example is
listed at the end). Furthermore, there doesn't seem to be a way to set this
timeout to infinite.

I worked around the problem by setting the timeout (the exact variable name is
wait_timeout, see:
http://dev.mysql.com/doc/mysql/en/server-system-variables.html) to a longer
period. However, this is not really an elegant solution. Will this issue be
resolved in future versions?
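
To illustrate the kind of check described above, here is a minimal sketch in
plain JDBC. It is not RFT code and not the fix that was eventually committed:
the class ValidatingConnectionSource and its nested Opener interface are
hypothetical names made up for this example. It assumes a JDBC 4.0 driver, so
that Connection.isValid(int) is available; with an older driver a cheap query
such as "SELECT 1" would be a common substitute. The idea is simply to test a
cached connection before handing it out and to reopen it if the server has
dropped it in the meantime.

```java
import java.sql.Connection;
import java.sql.SQLException;

/** Hypothetical helper (not part of RFT): returns a connection that was alive when checked. */
final class ValidatingConnectionSource {

    /** Hypothetical factory hook, e.g. a wrapper around DriverManager.getConnection. */
    interface Opener {
        Connection open() throws SQLException;
    }

    private final Opener opener;
    private final int validationTimeoutSeconds;
    private Connection cached;

    ValidatingConnectionSource(Opener opener, int validationTimeoutSeconds) {
        this.opener = opener;
        this.validationTimeoutSeconds = validationTimeoutSeconds;
    }

    /** Reuses the cached connection if it is still valid, otherwise opens a new one. */
    synchronized Connection get() throws SQLException {
        if (cached != null && !isAlive(cached)) {
            closeQuietly(cached); // e.g. the server closed it after wait_timeout expired
            cached = null;
        }
        if (cached == null) {
            cached = opener.open();
        }
        return cached;
    }

    /** Treats any SQLException during validation as "connection is dead". */
    private boolean isAlive(Connection c) {
        try {
            return !c.isClosed() && c.isValid(validationTimeoutSeconds);
        } catch (SQLException e) {
            return false;
        }
    }

    /** Closing a dead connection may itself fail; that failure is not interesting here. */
    private static void closeQuietly(Connection c) {
        try {
            c.close();
        } catch (SQLException ignored) {
            // nothing useful to do with a connection that is already broken
        }
    }
}
```

An Opener would typically just call DriverManager.getConnection with the
site's own URL and credentials. The check narrows the window in which a stale
connection can be used but does not close it completely, since the connection
can still drop between validation and the query.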

----------------------------------------------------
Here's an example of a failed RFT request:

$ globusrun-ws -submit -F remote.resource -s -c /bin/hostname
Delegating user credentials...Done.
Submitting job...Done.
Job ID: uuid:050d0c40-14a2-11da-84a2-000a5e1d7991
Termination time: 08/25/2005 13:21 GMT
Current job state: Active
Current job state: CleanUp-Hold
remote.resource
Current job state: CleanUp
Current job state: Failed
Destroying job...Done.
Cleaning up any delegated credentials...Done.
globusrun-ws: Job failed: Staging error for RSL element fileCleanUp.




java.rmi.RemoteException: Unable to create RFT resource; nested exception is:
        org.globus.transfer.reliable.service.database.RftDBException: Error inserting records into database [Caused by: No operations allowed after connection closed.Connection was implicitly closed due to underlying exception/error:


** BEGIN NESTED EXCEPTION **

com.mysql.jdbc.CommunicationsException
MESSAGE: Communications link failure due to underlying exception:

** BEGIN NESTED EXCEPTION **

java.io.EOFException

STACKTRACE:

java.io.EOFException
        at com.mysql.jdbc.MysqlIO.readFully(MysqlIO.java:1902)
        at com.mysql.jdbc.MysqlIO.reuseAndReadPacket(MysqlIO.java:2348)
        at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:2858)
        at com.mysql.jdbc.MysqlIO.sendCommand(MysqlIO.java:1570)
        at com.mysql.jdbc.MysqlIO.sqlQueryDirect(MysqlIO.java:1665)
        at com.mysql.jdbc.Connection.execSQL(Connection.java:2972)
        at com.mysql.jdbc.Connection.setAutoCommit(Connection.java:4870)
        at org.apache.commons.dbcp.DelegatingConnection.setAutoCommit(DelegatingConnection.java:237)
        at org.apache.commons.dbcp.PoolableConnectionFactory.activateObject(PoolableConnectionFactory.java:273)
        at org.apache.commons.pool.impl.GenericObjectPool.borrowObject(GenericObjectPool.java:753)
        at org.apache.commons.dbcp.PoolingDriver.connect(PoolingDriver.java:146)
        at java.sql.DriverManager.getConnection(DriverManager.java:512)
        at java.sql.DriverManager.getConnection(DriverManager.java:193)
        at org.globus.transfer.reliable.service.database.RFTDatabaseSetup.getDBConnection(RFTDatabaseSetup.java:198)
        at org.globus.transfer.reliable.service.database.ReliableFileTransferDbAdapter.getTotalNumberOfBytes(ReliableFileTransferDbAdapter.java:1854)
        at org.globus.transfer.reliable.service.factory.TotalNumberOfBytesTransferred.getTotalNumberOfBytesTransferred(TotalNumberOfBytesTransferred.java:51)
        at sun.reflect.GeneratedMethodAccessor113.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:324)
        at org.globus.wsrf.impl.ReflectionResourceProperty.getValueSimple(ReflectionResourceProperty.java:340)
        at org.globus.wsrf.impl.ReflectionResourceProperty.toSOAPElementSimple(ReflectionResourceProperty.java:431)
        at org.globus.wsrf.impl.ReflectionResourceProperty.toSOAPElements(ReflectionResourceProperty.java:320)
        at org.globus.wsrf.impl.properties.GetMultipleResourcePropertiesProvider.getMultipleResourceProperties(GetMultipleResourcePropertiesProvider.java:97)
        at sun.reflect.GeneratedMethodAccessor112.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:324)
        at org.apache.axis.providers.java.RPCProvider.invokeMethod(RPCProvider.java:384)
        at org.globus.axis.providers.RPCProvider.invokeMethodSub(RPCProvider.java:107)
        at org.globus.axis.providers.PrivilegedInvokeMethodAction.run(PrivilegedInvokeMethodAction.java:42)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:379)
        at org.globus.gsi.jaas.GlobusSubject.runAs(GlobusSubject.java:55)
        at org.globus.gsi.jaas.JaasSubject.doAs(JaasSubject.java:90)
        at org.globus.axis.providers.RPCProvider.invokeMethod(RPCProvider.java:97)
        at org.apache.axis.providers.java.RPCProvider.processMessage(RPCProvider.java:281)
        at org.apache.axis.providers.java.JavaProvider.invoke(JavaProvider.java:319)
        at org.apache.axis.strategies.InvocationStrategy.visit(InvocationStrategy.java:32)
        at org.apache.axis.SimpleChain.doVisiting(SimpleChain.java:118)
        at org.apache.axis.SimpleChain.invoke(SimpleChain.java:83)
        at org.apache.axis.handlers.soap.SOAPService.invoke(SOAPService.java:450)
        at org.apache.axis.server.AxisServer.invoke(AxisServer.java:285)
        at org.globus.wsrf.container.ServiceThread.doPost(ServiceThread.java:677)
        at org.globus.wsrf.container.ServiceThread.process(ServiceThread.java:398)
        at org.globus.wsrf.container.GSIServiceThread.process(GSIServiceThread.java:130)
        at org.globus.wsrf.container.ServiceThread.run(ServiceThread.java:302)


** END NESTED EXCEPTION **
------- Comment #1 From 2005-08-24 11:34:09 -------
Yes, it will be fixed in the next release.
------- Comment #2 From 2005-08-25 09:31:10 -------
Excellent. 

It would be good to add a note to the documentation that wait_timeout has to be
set to a longer period (just to prevent user questions in the future).
------- Comment #3 From 2006-02-07 01:51:13 -------
A fix for this has been committed to trunk and the release branch.
------- Comment #4 From 2006-02-07 09:26:50 -------
Can you provide a patch for this bug against the Globus 4.0.1 source base? Our
hope is to get this into the VDT as soon as possible for deployment on Open
Science Grid, and we're currently basing our builds on Globus 4.0.1. 

Thanks!
------- Comment #5 From 2006-02-07 13:05:41 -------
Alain,
Try this patch and let me know if you see any problems:
http://www-unix.mcs.anl.gov/~madduri/patch_for_roy
------- Comment #6 From 2006-02-10 15:22:21 -------
Ravi, the patch seems to apply correctly. I'll do a build soon and we can test
it. 

Just to verify: you just changed RFTDatabaseSetup.java for this patch, and
nothing else--correct?

Thanks!
------- Comment #7 From 2006-02-10 15:31:17 -------
Yes, that is true.