Bug 3599 - Job can restart with internal state "Restart"
Status: RESOLVED FIXED
Product: GRAM
Component: wsrf managed execution job service
Version: 4.0.0
Platform: PC Linux
Importance: P1 blocker
Target Milestone: 4.0.1
Reported: 2005-07-26 15:37
Modified: 2005-07-27 09:42


Description From 2005-07-26 15:37:32
A minor logic bug in the service code can cause a job that was in the process
of restarting when the container crashed to keep trying to restart
indefinitely.  Specifically, if the container crashes while restartInternalState
already holds a checkpointed state and internalState is set to "Restart", the
code copies "Restart" into restartInternalState.  This obliterates the
checkpointed internal state, and the state machine keeps restarting the job
because the checkpointed state now appears to be "Restart".
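
The failure mode described above can be illustrated with a minimal sketch.
This is not the service's actual code: only the field names internalState and
restartInternalState and the "Restart" value come from this report.  The
JobResource class, the begin_restart_buggy, begin_restart_guarded and resume
functions, and the "Active" state name are hypothetical, and the guard shown
is just one plausible way to avoid the overwrite; the committed fix is not
described here.

```python
from dataclasses import dataclass
from typing import Optional

RESTART = "Restart"


@dataclass
class JobResource:
    """Hypothetical stand-in for the persisted job resource."""
    internalState: str
    restartInternalState: Optional[str] = None


def begin_restart_buggy(job: JobResource) -> None:
    # Checkpoints the current state unconditionally, even if it is "Restart".
    job.restartInternalState = job.internalState
    job.internalState = RESTART


def begin_restart_guarded(job: JobResource) -> None:
    # Assumed guard: keep an existing checkpoint if a restart was already
    # in progress when the container went down.
    if job.internalState != RESTART:
        job.restartInternalState = job.internalState
    job.internalState = RESTART


def resume(job: JobResource) -> Optional[str]:
    """Move the job back to its checkpointed state and return that state."""
    job.internalState = job.restartInternalState
    return job.internalState


# State as persisted when the container crashed in the middle of a restart.
crashed = JobResource(internalState=RESTART, restartInternalState="Active")
begin_restart_buggy(crashed)
print(resume(crashed))    # "Restart": the job would be restarted again

crashed = JobResource(internalState=RESTART, restartInternalState="Active")
begin_restart_guarded(crashed)
print(resume(crashed))    # "Active": the checkpoint survived
```

With the unguarded version the checkpoint is lost on the second pass, so every
resume lands back in "Restart"; with the guard the original checkpoint is
preserved and the job can leave the restart path.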
------- Comment #1 From 2005-07-26 19:31:33 -------
My fix worked (an hour-long recoverability test completed successfully);
awaiting commit approval.
------- Comment #2 From 2005-07-27 09:42:11 -------
Fix committed to globus_4_0_branch.  The fix appears to have already been in
the trunk, so no change was needed there.