Bugzilla – Bug 4199
Patch pre-WS GRAM to use individual condor logs for jobs
Last modified: 2011-12-14 07:49:12
Users have recently noticed a problem with pre-WS GRAM caused by the change introduced in Globus 4.0 to use a single Condor job log for all jobs. The pre-WS GRAM poll function reads the entire Condor log each time it determines the state of a job. That is, the whole log is read from start to finish once per job, several times a minute. The log contains entries for all previous jobs and cannot easily be rotated, so after several thousand jobs have run, the work needed to poll each current job grows significantly because the entries for every previous job are reread. We have developed a patch that restores the old configuration of a separate Condor job log file for each job. This is done only for pre-WS GRAM: if the GRAM script for Condor detects that it is being called by WS GRAM, it still uses the single Condor log file read by the SEG. I understand that Globus plans to eventually have pre-WS GRAM use the SEG, which will also solve this problem, but I believe this patch serves as an important interim solution.
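To illustrate the cost difference described above, here is a rough Python sketch that counts how many log entries one polling round reads under each layout. It is a toy model only: the `Event` record, the state names, and the functions `poll_state`, `build_logs`, and `poll_cost` are all hypothetical and do not reflect the real Condor log format or the actual GRAM Condor script.

```python
from collections import namedtuple

# Hypothetical, simplified log record. Real Condor user logs use a richer
# multi-line event format; this stands in for one log entry.
Event = namedtuple("Event", ["job_id", "state"])


def poll_state(log, job_id):
    """Scan a log from the start; return (last state for job_id, entries read)."""
    state = None
    entries_read = 0
    for event in log:
        entries_read += 1
        if event.job_id == job_id:
            state = event.state
    return state, entries_read


def build_logs(num_finished, active_ids):
    """Build one shared log and a dict of per-job logs for the same history."""
    shared = []
    per_job = {}
    for job_id in range(num_finished):
        events = [Event(job_id, s) for s in ("SUBMITTED", "RUNNING", "DONE")]
        shared.extend(events)
        per_job[job_id] = events
    for job_id in active_ids:
        events = [Event(job_id, s) for s in ("SUBMITTED", "RUNNING")]
        shared.extend(events)
        per_job[job_id] = events
    return shared, per_job


def poll_cost(num_finished, num_active):
    """Entries read in one polling round over all active jobs, per layout."""
    active_ids = list(range(num_finished, num_finished + num_active))
    shared, per_job = build_logs(num_finished, active_ids)
    # Shared log: every active job rescans the full history of all jobs.
    shared_cost = sum(poll_state(shared, j)[1] for j in active_ids)
    # Per-job logs: each active job scans only its own entries.
    per_job_cost = sum(poll_state(per_job[j], j)[1] for j in active_ids)
    return shared_cost, per_job_cost
```

In this model, `poll_cost(5000, 10)` reads 150200 entries per round from the shared log but only 20 from the per-job logs, and the shared-log figure keeps growing with every finished job.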
Created an attachment (id=840) [details] Patch for separate Condor job log files for pre-WS GRAM
Alain - Is this something required in the VDT 1.3.10 release timeframe?
We're going to apply the patch on our own, so it is not a rush for you to patch it in Globus. It's something for you to consider for the future.
OK. Alain, next question: is this something that will be needed in the community branch? That is, will you need this in future GT 4.0 community branch versions? If not, then we will just need to reconcile this in the trunk along with other SEG enhancements.
Should it be in the community branch? Well, at some point the VDT plans to adopt the same version of Globus as TeraGrid, and that will be the community branch. I'm not sure of the exact timeline, but we hope to move there soon. So ideally it would be on the community branch. If it's not, it will simply be an extra patch for us, and that's not a huge deal. -alain
This was fixed in 5.0.2, but the bug was never marked as resolved.