Bugzilla – Bug 5143
DONE state never reported for Condor jobs when using Condor-G grid monitor
Last modified: 2012-09-12 10:42:16
There's a bug in the poll() function of condor.pm, the Condor interface Perl module of GRAM. It doesn't affect the job manager, but it does affect Condor-G's Grid Monitor. If all jobs share the same Condor user job log and the last event in that log is a terminated event, that event can be double-counted across multiple calls to poll(). When the number of terminated events then exceeds the GRAM job count, poll() returns ACTIVE instead of DONE. The cause is that $record is not declared with local (lexical) scope in poll(), so its contents are preserved from one call to poll() to the next. I will attach a patch.
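To illustrate the scoping problem, here is a minimal sketch. It is not the actual condor.pm code. The log format, the helper count_terminated(), and the names poll_buggy(), poll_fixed(), and $stale_record are all hypothetical. It only assumes a parser that flushes the previous record whenever a new event starts, and then flushes the final record after the loop.

```perl
use strict;
use warnings;

# Hypothetical, simplified user log shared by the jobs: each event starts
# with an "EVENT <type>" line. The real Condor log format differs.
my @log = (
    "EVENT execute\n",    "host a\n",
    "EVENT terminated\n", "status 0\n",
);

# Stand-in for the undeclared $record: package-scoped, so it survives
# from one call to the next.
our $stale_record;

# Count terminated events, accumulating lines in the caller's buffer.
sub count_terminated {
    my ($record_ref) = @_;
    my $count = 0;
    for my $line (@log) {
        if ($line =~ /^EVENT /) {
            # A new event starts: flush whatever was collected before it.
            $count++ if defined $$record_ref
                     && $$record_ref =~ /^EVENT terminated/;
            $$record_ref = '';
        }
        $$record_ref .= $line;
    }
    # Flush the final record in the log.
    $count++ if defined $$record_ref
             && $$record_ref =~ /^EVENT terminated/;
    return $count;
}

# Buggy pattern: the buffer still holds the last event of the previous call.
sub poll_buggy { return count_terminated(\$stale_record); }

# Fixed pattern: a lexical buffer starts out empty on every call.
sub poll_fixed { my $record; return count_terminated(\$record); }

print poll_buggy(), "\n";   # 1 (first call, buffer starts undefined)
print poll_buggy(), "\n";   # 2 (stale terminated event counted again)
print poll_fixed(), "\n";   # 1
print poll_fixed(), "\n";   # 1
```

In this sketch the log contains one terminated event, yet the second call to poll_buggy() reports two. With a GRAM job count of one, that is the "more terminated events than jobs" condition described above.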
Created an attachment (id=1218): Patch to fix double-counting of Condor log events
The VDT is unaffected by this bug because it uses a separate Condor user log for each job, as outlined in Bugzilla ticket 4199.