Bug 6084 - pbs.pm does not recognize "C" state
Status: RESOLVED FIXED
Product: LRMA
Component: Jobmanagers
Version: unspecified
Platform: PC Linux
Importance: P1 blocker
Target Milestone: 4.0.9
Reported: 2008-05-15 13:51 by
Modified: 2009-01-23 15:13


Description From 2008-05-15 13:51:47
We had to make a minor change to pbs.pm to accommodate the PBS install at
Vanderbilt. A job can show up in qstat in state "C", meaning the job has
completed. Apparently, with a regular installation of PBS, this state almost
never shows up. However, there is an option that keeps jobs in "C" for
some period of time, and Vanderbilt uses it with a setting of one hour.
Because pbs.pm does not recognize "C", the job is not allowed to finish from
a GRAM point of view until an hour after its nominal end time. The patch to
pbs.pm's poll() subroutine is simple enough: return DONE if the job is seen
in the "C" state.

       elsif(/R|E/)
       {
           $state = Globus::GRAM::JobState::ACTIVE;
       }
       elsif(/C/)
       {
           $state = Globus::GRAM::JobState::DONE;
       }
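
The same mapping can be tried outside of GRAM with a minimal, self-contained sketch. It is not the actual pbs.pm code: the helper name gram_state_from_qstat, the %GRAM_STATE_FOR table, and the plain-string state names are hypothetical stand-ins for the Globus::GRAM::JobState constants, and only the R, E and C mappings shown in the patch are included. It assumes the "job_state = X" line that qstat -f prints.

```perl
use strict;
use warnings;

# Hypothetical stand-ins for the Globus::GRAM::JobState constants, so the
# sketch runs without the Globus Perl modules installed. Only the states
# shown in the patch are listed here.
my %GRAM_STATE_FOR = (
    R => 'ACTIVE',   # running
    E => 'ACTIVE',   # exiting
    C => 'DONE',     # completed, still listed by qstat
);

# Hypothetical helper: take the text of "qstat -f <jobid>" and return the
# matching state name, or 'UNKNOWN' for anything this sketch does not cover.
sub gram_state_from_qstat {
    my ($qstat_output) = @_;

    # qstat -f reports the state on a line of the form "job_state = X".
    my ($pbs_state) = $qstat_output =~ /^\s*job_state\s*=\s*(\S)/m;

    return 'UNKNOWN' unless defined $pbs_state;
    return exists $GRAM_STATE_FOR{$pbs_state}
        ? $GRAM_STATE_FOR{$pbs_state}
        : 'UNKNOWN';
}
```

Calling gram_state_from_qstat() on output containing "job_state = C" yields 'DONE' rather than falling through to the catch-all branch, which is the behaviour the patch adds to poll().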
------- Comment #1 From 2008-05-20 13:34:55 -------
This change could also be interesting for the scheduler event generator
parsing the pbs logs. Do you know how to configure pbs to allow jobs to
remain in state "C" for a while?
------- Comment #2 From 2008-05-20 13:38:45 -------
found a webpage about how to configure pbs to keep jobs in state C ...
------- Comment #3 From 2008-05-20 14:05:44 -------
It seems that PBS logs a job as done when it switches the status
from R to C, and the PBS-SEG treats such a job as "Done". So in Gram4 a
PBS configuration that keeps jobs in state C for a while is not a
problem.
I don't see a problem with this patch and suggest applying it. Joe can't
see immediate problems either. It probably can't make it into 4.2.0 at
this point, but 4.0.8 and 4.2.1 seem ok.
------- Comment #4 From 2008-07-25 10:50:27 -------
Hi Eric,

I am not sure if you saw JP's questions below.  Can you confirm that a job's
stdout/err has been transferred by PBS before the "C" state is detectable?

If this is going to be put in 4.0.8, this will need to be confirmed.

-Stu

Begin forwarded message:

From: JP Navarro <navarro@mcs.anl.gov>
Date: July 18, 2008 4:51:06 PM CDT
To: lrma-dev@globus.org
Cc: gram-dev <gram-dev@globus.org>, Charles Bacon <bacon@mcs.anl.gov>
Subject: [lrma-dev] Re: [gram-dev] Fwd: GRAM PBS Job Manager patch

Since they've only been running the patch for a day, should we wait several
weeks before integrating it so it's more thoroughly tested? Maybe put it on
the TODO before the next GT 4.0.x release?

Are there other aspects besides STDOUT/STDERR delivery and exit status that
we might want to verify still work?  Is job exit status currently fully
reported to PreWS and WS GRAM clients?

JP
------- Comment #5 From 2008-07-25 13:43:26 -------
No, I did not get the message (I'm not on the lists). Yes, STDERR and STDOUT
are delivered just fine with this patch. As I understand it, the "C" state is
just a placeholder so that a user knows their job did run and exit.
------- Comment #6 From 2008-10-09 10:11:51 -------
TeraGrid would like this fix.  Let's get this committed for the next point
release.
------- Comment #7 From 2009-01-23 15:13:30 -------
Committed to 4.0 branch (already in 4.2 and trunk)