Bugzilla – Bug 2730
Job description variable fault lost
Last modified: 2005-08-03 16:48:40
When ManagedExecutableJobResource.resolveVariables() throws an exception, the caller (StateMachine) does not cause the job to fail; instead the job is submitted as normal.
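The symptom described above can be illustrated with a minimal sketch. It shows one way such a fault can be lost, not the actual GRAM code: every class and method name below (VariableResolutionException, JobResourceSketch, StateMachineSketch, runLosingFault, runFailingOnFault) is a hypothetical stand-in, and the "${UNDEFINED}" check is only an assumed trigger for the failure.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the exception raised during variable resolution.
class VariableResolutionException extends Exception {
    VariableResolutionException(String message) {
        super(message);
    }
}

// Hypothetical stand-in for ManagedExecutableJobResource.
class JobResourceSketch {
    private final String description;

    JobResourceSketch(String description) {
        this.description = description;
    }

    // Assumed trigger: the description references a variable that cannot be resolved.
    void resolveVariables() throws VariableResolutionException {
        if (description.contains("${UNDEFINED}")) {
            throw new VariableResolutionException("cannot resolve variable in: " + description);
        }
    }
}

// Hypothetical stand-in for the caller (StateMachine).
class StateMachineSketch {
    final List<String> transitions = new ArrayList<>();

    // Shape of the reported behavior: the fault is dropped and the job is submitted anyway.
    void runLosingFault(JobResourceSketch job) {
        try {
            job.resolveVariables();
        } catch (VariableResolutionException e) {
            // Fault is lost here: nothing records it and processing continues.
        }
        transitions.add("Submitted");
    }

    // Shape of the expected behavior: the fault fails the job and stops processing.
    void runFailingOnFault(JobResourceSketch job) {
        try {
            job.resolveVariables();
        } catch (VariableResolutionException e) {
            transitions.add("Failed: " + e.getMessage());
            return;
        }
        transitions.add("Submitted");
    }
}
```

With a description containing "${UNDEFINED}", runLosingFault records "Submitted" while runFailingOnFault records a "Failed: ..." entry and never reaches submission.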
Fix in gram_bug_2730_branch.
The fix for this does not quite work in the 4.0 branch. See the failures of test case submit205 in globus_wsrf_gram_scheduler_test on fork and most other schedulers. Right now it looks like the resource's fault is set, then the state machine is run in the systemCancel state. This causes some strange behavior because the job was never submitted or registered with the JSM. If streaming is being used, the fault trickles back to the user in a strange form; otherwise it is lost. I think it should instead just throw the ServiceLevelException fault to the client which called createManagedJob, and not create the resource or use the state machine at all.
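A rough sketch of the alternative proposed above, where the fault is thrown straight back to the createManagedJob caller before any resource exists. This is only an illustration under assumptions: ServiceLevelFaultSketch and JobFactorySketch are hypothetical names, the "${UNDEFINED}" check stands in for real variable resolution, and the real createManagedJob signature is not reproduced here.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the ServiceLevelException fault mentioned above.
class ServiceLevelFaultSketch extends Exception {
    ServiceLevelFaultSketch(String message) {
        super(message);
    }
}

// Hypothetical factory; the real createManagedJob signature is not shown in this report.
class JobFactorySketch {
    // Job resources that have actually been created, keyed by an illustrative id.
    final Map<String, String> resources = new HashMap<>();
    private int nextId = 1;

    String createManagedJob(String description) throws ServiceLevelFaultSketch {
        // Resolve variables first, so that a failure leaves no resource behind
        // and the state machine is never started for this job.
        if (description.contains("${UNDEFINED}")) {
            throw new ServiceLevelFaultSketch("cannot resolve variable in: " + description);
        }
        String id = "job-" + nextId++;
        resources.put(id, description);
        return id;
    }
}
```

In this shape the client sees the fault as the direct result of its create call, and there is no half-created job for a later cancel to clean up.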
If I'm remembering this bug correctly, there are problems with just throwing an exception but I can't remember what exactly. This was why I fixed it like I did. I don't understand why the JSM has anything to do with this if the fault occurs before the job is submitted to the scheduler.
Fix in trunk and globus_4_0_branch. There was a bug in the cancel() method whereby the system cancel service data wasn't being set.