Bugzilla – Bug 6028
globusrun-ws fails in termination in special situation
Last modified: 2008-04-30 15:53:29
I get the following error when I submit an interactive job using
globusrun-ws, shut down the container on the server side after the job has
started, and restart the container immediately. The CleanUp and Done
notifications are sent by the restarted container.
[martin@osg-test1 ~]$ globusrun-ws -submit -c /bin/sleep 30
Job ID: uuid:60930abc-0fa7-11dd-95bc-0013d4c3b957
Termination time: 04/21/3008 13:32 GMT
Current job state: Pending
Current job state: Active
Current job state: CleanUp
Current job state: Done
globusrun-ws: Unable to destroy job: globus_i_kill.c::88:
Error destroying job
Failed sending request
System error in writev: Broken pipe
globus_xio: A system call failed: Broken pipe
No faults in the server-side container log.
It works fine if I use batch mode for submission and kill the job later.
I've updated globus_xio's http driver to indicate persistent connection drop
failures when they occur, added an attribute to globus_c_ws_messaging to
automatically retry when those faults occur, and updated globusrun-ws to use
that attribute. The changes are committed to trunk.
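
For illustration only, the retry behaviour described above can be sketched roughly as follows. This is not the committed Globus code: it is a minimal Python sketch, and the names send_with_retry, conn, reconnect and max_retries are all hypothetical. It assumes a dropped persistent connection shows up as BrokenPipeError or ConnectionResetError on send, and that the request is safe to resend.

```python
def send_with_retry(conn, reconnect, request, max_retries=1):
    """Send `request` on a cached connection, reconnecting if it was dropped.

    `conn` is any object with a send(request) method and `reconnect` is a
    callable returning a fresh one. Both are hypothetical stand-ins and do
    not correspond to any Globus API.
    """
    attempts = 0
    while True:
        try:
            return conn.send(request)
        except (BrokenPipeError, ConnectionResetError):
            # The peer (for example a restarted container) closed the
            # persistent connection, so the cached handle is stale.
            if attempts >= max_retries:
                raise
            attempts += 1
            conn = reconnect()
```

The retry is bounded so that a server that is really down still produces an error instead of looping forever. Resending is only appropriate when the failure is known to have happened before the server processed the request, or when the request is harmless to repeat.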