Bug 5803 - Change of lifetime policy in gram4's job resources.
Status: RESOLVED FIXED
Product: GRAM
Component: wsrf managed execution job service
Version: 4.1.2
Platform: Macintosh All
Importance: P2 normal
Target Milestone: 4.2
Reported: 2008-01-21 16:55 by
Modified: 2008-04-04 07:53


Description From 2008-01-21 16:55:28
Currently a job resource never expires if the client does not
specify a lifetime. Combined with the fact that not all clients delete
their jobs, this leads to ever-growing persistence directories, and
admins cannot determine which data may be wiped and which may
still be needed.
TeraGrid admins and OSG would like to see this changed.

There will be 2 new parameters in Gram4’s JNDI configuration:

maxJobLifetime:
maximum lifetime a client can specify in the initial job submission and
in subsequent setTerminationTime calls.

timeToLiveAfterCompletion:
amount of time a job resource continues to exist after the job has
been fully processed, if the client did not specify a job lifetime.

Values are specified in seconds. A negative value means that there is no
limit.
The parameters are exposed as resource properties of the factory resources
so that a client can query the configured values.
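
A minimal sketch of how the maxJobLifetime check could look. The class and
method names are hypothetical and not taken from the GRAM4 code base, and it
assumes the limit is measured from the time of the request, which the
description above does not state explicitly:

```java
import java.time.Instant;

/** Hypothetical helper, not part of GRAM4's actual API. */
final class LifetimeLimit {

    /**
     * Returns true if a client may set the requested termination time.
     * A negative maxJobLifetimeSeconds means "no limit".
     * Assumption: the limit is counted from 'now' (the time of the request).
     */
    static boolean isAllowed(Instant now, Instant requested,
                             long maxJobLifetimeSeconds) {
        if (maxJobLifetimeSeconds < 0) {
            return true; // negative value: unlimited
        }
        Instant latestAllowed = now.plusSeconds(maxJobLifetimeSeconds);
        return !requested.isAfter(latestAllowed);
    }
}
```

The same check would apply both to the initial submission and to later
setTerminationTime calls.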

Scenarios for clients:

1. The client does not specify a lifetime:
The job does not expire until it is fully processed. After that the lifetime
is set to (now + timeToLiveAfterCompletion). This guarantees that the job
runs to completion (including fileStageOut and fileCleanUp) and that the
client can still query the status of the job for a while before the job
resource is removed.

2. The client specifies a lifetime:
The job will definitely be killed and removed when the lifetime expires,
regardless of the status of the job. The client can however extend the
lifetime (restricted by maxJobLifetime). To restrict clients that want to set
the lifetime to the year 2040 or so, an admin can (but need not) specify a
limit in the configuration parameter maxJobLifetime.
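
The two scenarios can be sketched as a single rule that is applied when a
job has been fully processed. This is only an illustration: the names are
hypothetical, and it assumes that a negative timeToLiveAfterCompletion means
the resource is kept indefinitely:

```java
import java.time.Instant;
import java.util.Optional;

/** Hypothetical sketch of the policy, not GRAM4's actual implementation. */
final class JobLifetimePolicy {

    /**
     * Termination time to use once the job is fully processed.
     *
     * @param clientTermination lifetime chosen by the client, or null if none
     * @param now               time at which processing finished
     * @param ttlSeconds        value of timeToLiveAfterCompletion
     * @return the termination time, or empty if the resource never expires
     */
    static Optional<Instant> onFullyProcessed(Instant clientTermination,
                                              Instant now, long ttlSeconds) {
        if (clientTermination != null) {
            // Scenario 2: the client's own lifetime stays in effect.
            return Optional.of(clientTermination);
        }
        if (ttlSeconds < 0) {
            // Assumption: negative means no limit, so no expiry is set.
            return Optional.empty();
        }
        // Scenario 1: keep the resource around for a grace period.
        return Optional.of(now.plusSeconds(ttlSeconds));
    }
}
```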

What we ignore in this approach:
Jobs that were submitted without a lifetime and stay in a hold state
forever, i.e. never finish processing, will never expire. This seems to
be an exceptional case.
An admin will however be able to identify these jobs from the persistence
data.
For now we assume that this is not a real problem.

Implementation is almost complete.
------- Comment #1 From 2008-04-04 07:53:51 -------
If the client does not set a lifetime, the ideal would be to set the
terminationTime RP to null, which Java WS Core interprets as infinite.
Unfortunately this causes problems with Axis in a recovery situation:
the xsd:dateTime field cannot be deserialized if it is null
(Axis CalendarDeserializer:
        // validate fixed portion of format
        if (source == null || source.length() == 0) {
            throw new NumberFormatException(
                    Messages.getMessage("badDateTime00"));
        }
)

For now I set the lifetime to (now + 1000 years), which is maybe
a bit hackish, but it does the job.
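
A rough sketch of that workaround, using a far-future Calendar value in
place of null so that the xsd:dateTime field always has something to
serialize. The class and method names are made up for illustration; the
actual change in GRAM4 may look different:

```java
import java.util.Calendar;

/** Hypothetical helper illustrating the "now + 1000 years" workaround. */
final class FarFutureTermination {

    static final int YEARS_AHEAD = 1000;

    /**
     * Returns a copy of 'now' shifted 1000 years into the future.
     * The argument itself is left unchanged.
     */
    static Calendar farFuture(Calendar now) {
        Calendar termination = (Calendar) now.clone();
        termination.add(Calendar.YEAR, YEARS_AHEAD);
        return termination;
    }
}
```

Calling farFuture(Calendar.getInstance()) would give a termination time
that is effectively infinite but still a valid dateTime value.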