Bug 5762 - resetting maxattempts value
Status: NEW
Product: RFT
Component: RFT
Version: 4.0.5
Platform: Open Science Grid (OSG) Linux
Importance: P3 normal
Reported: 2008-01-07 15:34 by
Modified: 2008-01-07 16:01


Description From 2008-01-07 15:34:54
I was running scaling tests of WS-GRAM, moving output to a 3rd party gridftp
service. I would always get FileStaging timeout errors on some fraction of
jobs. I was able to get rid of these timeout errors by adding the line 

control_preauth_timeout 300

to the $GL/etc/gridftp.conf file on the destination server.
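Because this fix lives in each destination server's gridftp.conf, it can help to check whether the option is actually set. The following is only a rough sketch: the class and method names are hypothetical, and it assumes the file uses one `name value` pair per line with `#` starting a comment line.

```java
import java.util.OptionalInt;

// Hypothetical helper, not part of the Globus Toolkit.
class PreauthTimeoutCheck {

    // Returns the first control_preauth_timeout value found in the given
    // gridftp.conf text, or empty if the option is absent or not a number.
    // Assumes "name value" lines and '#' comment lines.
    static OptionalInt preauthTimeout(String confText) {
        for (String raw : confText.split("\\R")) {
            String line = raw.trim();
            if (line.isEmpty() || line.startsWith("#")) {
                continue; // skip blank lines and comments
            }
            String[] parts = line.split("\\s+", 2);
            if (parts.length == 2 && parts[0].equals("control_preauth_timeout")) {
                try {
                    return OptionalInt.of(Integer.parseInt(parts[1].trim()));
                } catch (NumberFormatException e) {
                    return OptionalInt.empty();
                }
            }
        }
        return OptionalInt.empty();
    }

    public static void main(String[] args) {
        String conf = "# sample config\ncontrol_preauth_timeout 300\n";
        System.out.println(preauthTimeout(conf)); // prints OptionalInt[300]
    }
}
```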

It was suggested that I could also fix this problem either on the WS server (in
the rft_database) or dynamically in the RSL file, via:

     <fileStageOut>
         <maxAttempts>5</maxAttempts>
         ...
     </fileStageOut>

I added the above, removed the fix from the gridftp.conf file, and found that this
solution did not work. The rate of timeouts was the same as if I had not included
the <maxAttempts> directive. I checked the database on the server, and
rft_database.maxattempts did reflect the new value of '5'. I also changed the
default for this field from '1' to '5'. Neither change had the effect that
changing the 'control_preauth_timeout' value did.

 -Jeff


WS Server Version:

vdt-version
...
 Globus Toolkit, web-services, server 4.0.5
...

WS Client Version:

vdt-version
...
 Globus Toolkit, web-services, client 4.0.5
...


Destination Gridftp server version:

../sbin/globus-gridftp-server -v
globus_gridftp_server: 2.3 (1144436882-63)
------- Comment #1 From 2008-01-07 16:01:52 -------
Jeff,
after looking at the RFT code: the exception that is thrown
in your case (i.e. when the authentication of a connection is
done too late) is an RftException and not an RftTransientException.
But RFT only retries when an RftTransientException is caught.
If an RftException is caught, the transfer fails immediately
and no retries are done, even if you specified a number
of retries in your transfer description.
So the suggestion unfortunately does not work for now, and you
currently have to rely on the GridFTP configuration change,
which seems to me the better solution anyway.
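
The behavior described in this comment can be sketched roughly as follows. This is not the actual RFT source: the two exception classes are local stand-ins that only share the names mentioned above, their subclass relationship is an assumption, and `Transfer` and `attemptsUsed` are hypothetical names made up for the illustration.

```java
// Stand-in for the non-transient failure named in the comment.
class RftException extends Exception {
    RftException(String msg) { super(msg); }
}

// Assumption: modeled here as a subclass of RftException.
class RftTransientException extends RftException {
    RftTransientException(String msg) { super(msg); }
}

class RetrySketch {

    // One transfer attempt (hypothetical interface).
    interface Transfer {
        void run() throws RftException;
    }

    // Returns how many attempts were made before success or giving up.
    static int attemptsUsed(Transfer transfer, int maxAttempts) {
        int attempts = 0;
        while (attempts < maxAttempts) {
            attempts++;
            try {
                transfer.run();
                return attempts; // success
            } catch (RftTransientException e) {
                // transient failure: retry while attempts remain
            } catch (RftException e) {
                return attempts; // non-transient failure: stop immediately
            }
        }
        return attempts;
    }

    public static void main(String[] args) {
        Transfer lateAuth = () -> { throw new RftException("authentication done too late"); };
        Transfer flaky = () -> { throw new RftTransientException("temporary failure"); };
        System.out.println(attemptsUsed(lateAuth, 5)); // prints 1
        System.out.println(attemptsUsed(flaky, 5));    // prints 5
    }
}
```

Under these assumptions, raising maxAttempts changes nothing for the late-authentication case, because that failure never reaches the retry branch.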
------- Comment #2 From 2008-01-07 17:15:04 -------
Thanks Martin,

I understand, and I do have a solution that works. The only possible
drawback I see is that I cannot apply this solution in one place: I have
to hope that every gridftp server I might contact has a sufficiently long
timeout set. I understand the default timeout will be set longer in
the future, so at that point it shouldn't be an issue at all.

Jeff 

> http://bugzilla.globus.org/bugzilla/show_bug.cgi?id=5762
> ------- Comment #1 from feller@mcs.anl.gov  2008-01-07 16:01 -------