Bug 5837 - Change processing the restart state of resources in a container start
Status: RESOLVED FIXED
Product: GRAM
Component: wsrf managed execution job service
Version: alpha
Platform: Macintosh All
Importance: P2 normal
Target Milestone: 4.2
Reported: 2008-02-05 12:47 by
Modified: 2008-02-06 14:17


Description From 2008-02-05 12:47:36
Currently, among other steps, the following happens to a job resource
during a container restart:

1. a job re-subscribes for state change notifications at the JobStateMonitor
   if a local job id is part of the resource data (i.e. if the job had already
   been submitted to the local resource manager)
2. the job credential (if any) is fetched and thus the job credential listener
   is re-constituted
3. the staging credential (if any) is fetched and thus the job credential 
   listener is re-constituted

ad 1.
  With audit logging, the local job id is not nullified once the
  job has been fully processed by the local resource manager. So in a
  recovery situation, jobs that had passed submission to the LRM and
  are in a final state (Done, Failed, UserTerminateDone,
  UserTerminateFailed) will also re-register at the JobStateMonitor.
  Jobs in a final state never un-register from the JSM, so they
  just burden the JobStateMonitor.
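
  The check suggested by this point could look roughly like the sketch
  below. It is only an illustration: RestartPolicy, shouldResubscribe and
  the JobState enum are hypothetical names, not the actual GRAM classes,
  and Pending and Active stand in for any non-final state.

```java
// Hypothetical sketch; none of these types are the real GRAM classes.
enum JobState {
    // Pending and Active are illustrative non-final states (assumption).
    Pending(false),
    Active(false),
    // The four final states named in this report.
    Done(true),
    Failed(true),
    UserTerminateDone(true),
    UserTerminateFailed(true);

    final boolean isFinal;

    JobState(boolean isFinal) {
        this.isFinal = isFinal;
    }
}

final class RestartPolicy {
    private RestartPolicy() {
    }

    // Re-subscribe at the state monitor only if the job was submitted to the
    // local resource manager (local job id present) and is not yet final.
    static boolean shouldResubscribe(String localJobId, JobState state) {
        return localJobId != null && !state.isFinal;
    }
}
```

  With a check like this, a job in state Done that still carries its
  local job id (because of audit logging) would be skipped on restart.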

ad 2.
  This is unnecessary because the credential has already been loaded during
  initialization of the resource, when it is loaded from disk by the
  PersistenceHelper. The only exception is a resource in internal state
  None; even then it is not necessary, because the credential will be
  loaded while processing the internal state None.

ad 3.
  This is unnecessary. We fetch the credential when we actually perform
  staging steps. So there's no need to do this in a restart.
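
  The "fetch when actually needed" behaviour described in ad 3 can be
  sketched as a lazily loaded holder. This is a minimal illustration
  under assumptions: LazyCredential and its loader are hypothetical and
  do not correspond to the real GRAM credential handling.

```java
import java.util.function.Supplier;

// Hypothetical sketch; not the real GRAM credential code.
final class LazyCredential<T> {
    private final Supplier<T> loader; // assumed to fetch the credential
    private T value;
    private boolean loaded;
    private int loadCount;

    LazyCredential(Supplier<T> loader) {
        this.loader = loader;
    }

    // Called from a staging step; the first call triggers the fetch,
    // later calls reuse the cached value.
    synchronized T get() {
        if (!loaded) {
            value = loader.get();
            loaded = true;
            loadCount++;
        }
        return value;
    }

    synchronized boolean isLoaded() {
        return loaded;
    }

    synchronized int loadCount() {
        return loadCount;
    }
}
```

  Constructing the holder during a restart does not touch the
  credential; only the first get() in a staging step does.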