Bug 4684 - Loading persisted jobs with expired delegation resources causes stacktraces
Status: RESOLVED WONTFIX
Product: GRAM
Component: wsrf managed execution job service
Version: 4.0.2
Platform: Macintosh All
Importance: P2 normal
Target Milestone: 4.2.1
Reported: 2006-09-05 14:50
Modified: 2012-09-05 11:43


Description From 2006-09-05 14:50:18
We've gotten messages like this a few times:
http://www-unix.globus.org/mail_archive/gt4-friends/2005/09/msg00058.html

The trace looks like this:
2005-09-09 15:59:57,628 INFO  exec.RunQueue [Thread-3,<clinit>:54] Starting state machine with 16 run queues.
2005-09-09 15:59:59,274 ERROR delegation.DelegationUtil [Thread-11,getDelegationResource:253] Error getting delegation resource
org.globus.wsrf.NoSuchResourceException
        at org.globus.delegation.service.DelegationResource.load(DelegationResource.java:395)
        at org.globus.delegation.service.DelegationHome.find(DelegationHome.java:53)
        at org.globus.delegation.DelegationUtil.getDelegationResource(DelegationUtil.java:251)
        at org.globus.delegation.DelegationUtil.registerDelegationListener(DelegationUtil.java:166)
        at org.globus.exec.service.utils.DelegatedCredential.getDelegatedCredential(DelegatedCredential.java:178)
        at org.globus.exec.service.utils.DelegatedCredential.getDelegatedCredential(DelegatedCredential.java:79)
        at org.globus.exec.service.job.ManagedJobResourceImpl.getStagingCredential(ManagedJobResourceImpl.java:476)
        at org.globus.exec.service.exec.StateMachine.processRestartState(StateMachine.java:682)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:585)
        at org.globus.exec.service.exec.StateMachine.processState(StateMachine.java:364)
        at org.globus.exec.service.exec.RunQueue.run(RunQueue.java:158)
2005-09-09 15:59:59,278 ERROR exec.RunQueue [Thread-11,run:162] Unable to process state transition.
2005-09-09 16:00:00,777 INFO  exec.ManagedExecutableJobHome [Thread-3,recover:163] Recovered resource with ID b8958da2-20bc-11da-b66f-00301bab9d00.


I contend that this should be an error message more like "Error recovering job
<UUID> because the delegated credential associated with it has expired.  Either
refresh the credential at <EPR> or remove the file <wherever the persisted
resource is>." and there should be no delegation service stack trace shown.
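
To make the request concrete, here is a minimal sketch of the kind of handling being asked for: the anticipated "resource is gone" case is caught during recovery and reported as a single line naming the job, with no stack trace. Every name in it (MissingDelegationResourceException, CredentialLookup, JobRecovery) is a hypothetical stand-in invented for illustration and is not the actual GRAM or delegation service API.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical stand-in for the NoSuchResourceException seen in the trace.
class MissingDelegationResourceException extends Exception {
    MissingDelegationResourceException(String message) {
        super(message);
    }
}

// Hypothetical lookup interface; the real delegation API is not reproduced here.
interface CredentialLookup {
    Object find(String jobId) throws MissingDelegationResourceException;
}

class JobRecovery {
    private static final Logger LOG = Logger.getLogger(JobRecovery.class.getName());

    // Builds a one-line message that names the job and the cause.
    static String expiredCredentialMessage(String jobId) {
        return "Error recovering job " + jobId
            + " because the delegated credential associated with it has expired.";
    }

    // Returns true if the credential was found, false if it is gone.
    static boolean recover(String jobId, CredentialLookup lookup) {
        try {
            lookup.find(jobId);
            return true;
        } catch (MissingDelegationResourceException e) {
            // Anticipated condition: log the message only, without the throwable,
            // so no stack trace is printed.
            LOG.log(Level.SEVERE, expiredCredentialMessage(jobId));
            return false;
        }
    }
}
```

Calling JobRecovery.recover with a lookup that throws would log one line and return false, leaving the caller to decide what to do with the job.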
------- Comment #1 From 2007-02-22 09:09:11 -------
Refreshing an expired delegation EPR is not feasible. The lifetime of the
delegation resource (to which the EPR points) is set to the lifetime of the
delegated credential. So if the credential is not refreshed prior to
expiration, the resource is deleted and the EPR is no longer valid.

Also, when the resource is deleted, the persisted file is removed. So the user
can neither refresh the EPR nor clean up a file. If GRAM still needs the
delegated credential to finish the job, the job is going to fail.

I agree about not printing the trace and suggest we print a warning that the
credential associated with that recovered job has expired and attempt to finish
the job. 
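
A rough sketch of the "warn and attempt to finish" behaviour suggested here, under stated assumptions: PersistedJob is a hypothetical view of a recovered job rather than the real job resource class, and the hand-off to the state machine is omitted.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.logging.Logger;

class RecoveryPass {
    private static final Logger LOG = Logger.getLogger(RecoveryPass.class.getName());

    // Hypothetical view of a persisted job; not the real job resource class.
    static final class PersistedJob {
        final String id;
        final boolean credentialPresent;

        PersistedJob(String id, boolean credentialPresent) {
            this.id = id;
            this.credentialPresent = credentialPresent;
        }
    }

    // Warns once per job whose credential is gone and keeps going.
    // Returns the IDs of the jobs that were resumed without a credential.
    static List<String> resumeAll(List<PersistedJob> jobs) {
        List<String> resumedWithoutCredential = new ArrayList<>();
        for (PersistedJob job : jobs) {
            if (!job.credentialPresent) {
                LOG.warning("Delegated credential for recovered job " + job.id
                    + " has expired; attempting to finish the job without it.");
                resumedWithoutCredential.add(job.id);
            }
            // Either way the job would be handed back to the state machine here.
        }
        return resumedWithoutCredential;
    }
}
```

A job that later reaches a step requiring the credential would still fail at that step; the sketch only changes how recovery reports the condition.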

Stu and I have been talking about adding a section to the GRAM documentation
about ways to renew delegated credentials automatically, but there is not much
we can do at this point if the credential expires while the container is down.
------- Comment #2 From 2007-02-22 09:18:05 -------
Fair enough regarding the refresh - I think the main issue here is that GRAM
should handle the "expired deleg resource + not expired job" case itself
and not throw scare-inducing ERROR stacktraces for a situation that comes up
pretty regularly.
------- Comment #3 From 2007-06-25 09:19:06 -------
My 2 cents. I have seen this one occasionally, but cannot remember what effect
this has on the submitted job.  If this causes the job to not complete as
intended, then it should be an ERROR rather than a WARNING.  If it has no
effect, then who cares and I agree that GRAM should handle it internally.  As
for the stack traces, they should never occur unless it is a problem that was
NEVER anticipated.  A meaningful ERROR message for anticipated conditions is
best with emphasis on "meaningful".
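
The rule of thumb in this comment could be sketched roughly as follows. It uses java.util.logging purely for illustration, with Level.SEVERE standing in for ERROR; the class and parameter names are hypothetical and the container's actual logging setup may differ.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

class RecoveryReporting {
    // Anticipated condition: ERROR (SEVERE) if the job cannot complete as
    // intended, WARNING if the job is unaffected.
    static Level severity(boolean jobCanStillComplete) {
        return jobCanStillComplete ? Level.WARNING : Level.SEVERE;
    }

    // Stack traces are reserved for conditions that were never anticipated.
    static void report(Logger log, String message, Throwable cause,
                       boolean anticipated, boolean jobCanStillComplete) {
        if (anticipated) {
            // Message only; the throwable is deliberately not passed on.
            log.log(severity(jobCanStillComplete), message);
        } else {
            // Unanticipated: keep the full trace for debugging.
            log.log(Level.SEVERE, message, cause);
        }
    }
}
```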
------- Comment #4 From 2012-09-05 11:43:14 -------
Doing some Bugzilla cleanup...  Resolving old GRAM3 and GRAM4 issues that are
no longer relevant since we've moved on to GRAM5.  Also, we're now tracking
issues in JIRA.  Any new issues should be added here:

http://jira.globus.org/secure/VersionBoard.jspa?selectedProjectId=10363