Bugzilla – Bug 6043
Confusing JobID in timeout message
Last modified: 2012-09-05 11:43:47
globusrun-ws prints a job id after a job is submitted, but then if the job times out, the error message contains a different job id value than that used as the JobID element in the createManagedJob input message. This makes for odd reading. Also, it's more difficult to look in container logs to figure out what is going on with a particular job.

Example:

% globusrun-ws -submit -term +00:01 -c /bin/sleep 121
Submitting job...Done.
Job ID: uuid:f3f77c82-12c9-11dd-8881-0017f2cb7d49
Termination time: 04/25/2008 13:18 GMT
Current job state: Active
Current job state: Failed
globusrun-ws: Job failed: The lifetime of job f4460960-12c9-11dd-a497-c1270f93a800 expired. Resource will be in read-only mode and will be removed after all cleanup steps are completed.
What globusrun-ws prints seems to be the client-side generated uuid. The EPR that comes back from MJFS.createManagedJob contains the uuid created by GRAM, which is the one used in subsequent logging and messages on the server side. Maybe globusrun-ws should print the uuid that comes back from the service, since the client-side generated one is only used for reliable job submission: a client can resubmit a job with the same client-side generated uuid if no answer came back on the createManagedJob() call, and be sure that no new job is started if the first one is still active.
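To illustrate why the two ids differ, here is a minimal Python sketch of the resubmission idea described above. It is not the GRAM4 implementation: the class ToyJobService, its method create_managed_job, and the variable names are all hypothetical, and the sketch ignores the "still active" condition for brevity. It only shows that a client-generated submission id and a service-generated job key are separate values, and that resubmitting with the same submission id does not create a second job.

```python
import uuid


class ToyJobService:
    """Hypothetical stand-in for a job factory service; not the GRAM4 API."""

    def __init__(self):
        # Maps client-generated submission ids to service-generated job keys.
        self._by_submission_id = {}

    def create_managed_job(self, submission_id, command):
        # A repeated submission id returns the job created earlier
        # instead of starting a new one.
        if submission_id in self._by_submission_id:
            return self._by_submission_id[submission_id]
        job_key = str(uuid.uuid4())  # service-side id, used in server logs
        self._by_submission_id[submission_id] = job_key
        return job_key


service = ToyJobService()

# Client-side id, analogous to the "Job ID" printed at submission time.
submission_id = "uuid:" + str(uuid.uuid1())

first = service.create_managed_job(submission_id, ["/bin/sleep", "121"])
# Simulate a lost reply: the client retries with the same submission id.
retry = service.create_managed_job(submission_id, ["/bin/sleep", "121"])

print("client-side id: ", submission_id)
print("service-side id:", first)
print("retry returned the same job:", first == retry)
```

In this sketch the client would have to print the value returned by the service, rather than its own submission id, for its output to match what appears in server-side messages.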
The element is called JobID though in the XML schema. In my opinion, the client isn't really supposed to know about the internals of the service implementation to be able to make sense out of the fault message.
Doing some Bugzilla cleanup... Resolving old GRAM3 and GRAM4 issues that are no longer relevant since we've moved on to GRAM5. Also, we're now tracking issues in JIRA. Any new issues should be added here: http://jira.globus.org/secure/VersionBoard.jspa?selectedProjectId=10363