Bugzilla – Bug 5762
resetting maxattempts value
Last modified: 2008-01-07 16:01:52
I was running scaling tests of WS-GRAM, staging output to a third-party GridFTP service. A fraction of the jobs always failed with FileStaging timeout errors. I was able to get rid of these timeout errors by adding the line

    control_preauth_timeout 300

to the $GL/etc/gridftp.conf file on the destination server.

It was suggested that I could also fix this problem either on the WS server (in the rft_database) or dynamically in the RSL file, via:

    <fileStageOut>
      <maxAttempts>5</maxAttempts>
      ...
    </fileStageOut>

I added the above, removed the fix from the gridftp.conf file, and found that this solution did not work: the rate of timeouts was the same as if I had not included the <maxAttempts> directive. I did check the database on the server, and rft_database.maxattempts did reflect the new value of '5'. I also changed the default for this field from '1' to '5'. Neither change had the effect that changing the 'control_preauth_timeout' value did.

-Jeff

WS Server Version: vdt-version ... Globus Toolkit, web-services, server 4.0.5 ...
WS Client Version: vdt-version ... Globus Toolkit, web-services, client 4.0.5 ...
Destination GridFTP server version: ../sbin/globus-gridftp-server -v
globus_gridftp_server: 2.3 (1144436882-63)
Jeff, after looking at the RFT code: the exception that is thrown in your case (i.e. when the authentication of a connection completes too late) is an RftException and not an RftTransientException. RFT only retries when an RftTransientException is caught. If an RftException is caught, the transfer fails immediately and no retries are done, even if you specified a number of retries in your transfer description. It looks to me as though the suggestion unfortunately does not work for now, and that you currently have to rely on the GridFTP configuration change, which seems to me the better solution anyway.
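To make the behaviour described in this comment concrete, here is a minimal Python sketch of a retry loop that retries only on a transient exception. It is not the RFT code (RFT is written in Java). The two exception classes only borrow the names mentioned above, and run_transfer, failing_transfer, and the reading of max_attempts as the total number of tries are assumptions made for illustration.

```python
class RftException(Exception):
    """Hypothetical stand-in for a failure that is not retried."""


class RftTransientException(Exception):
    """Hypothetical stand-in for a failure that is retried."""


def run_transfer(transfer, max_attempts):
    """Call transfer() until it succeeds; return the number of attempts used.

    Assumption: max_attempts counts total tries, not extra retries.
    """
    attempts = 0
    while True:
        attempts += 1
        try:
            transfer()
            return attempts
        except RftTransientException:
            # Transient failure: try again unless the budget is used up.
            if attempts >= max_attempts:
                raise
        # RftException is deliberately not caught here, so it propagates
        # on the first attempt regardless of max_attempts.


def failing_transfer(error_type, failures):
    """Build a fake transfer that raises error_type for the first `failures` calls."""
    state = {"calls": 0}

    def transfer():
        state["calls"] += 1
        if state["calls"] <= failures:
            raise error_type("simulated failure")

    transfer.state = state
    return transfer


if __name__ == "__main__":
    flaky = failing_transfer(RftTransientException, failures=2)
    print("transient failure, attempts used:", run_transfer(flaky, max_attempts=5))

    late_auth = failing_transfer(RftException, failures=1)
    try:
        run_transfer(late_auth, max_attempts=5)
    except RftException:
        print("non-transient failure, calls made:", late_auth.state["calls"])
```

In this sketch the second case gives up after a single call even though max_attempts is 5, which matches the symptom in the report: raising maxAttempts has no effect when the failure is not classified as transient.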
Subject: Re: resetting maxattempts value

(In reply to comment #1 from feller@mcs.anl.gov, 2008-01-07 16:01; http://bugzilla.globus.org/bugzilla/show_bug.cgi?id=5762)

Thanks Martin, I understand, and I do have a solution that works. The only possible drawback I see is that I cannot put this solution in one place: I have to hope that all gridftp servers I might contact have sufficiently long timeouts set. I do understand the default timeout will be set longer in the future, so at that point it shouldn't be an issue at all.

Jeff