Bug 4312 - WS GRAM resourceAllocationGroup bug
Summary: WS GRAM resourceAllocationGroup bug
Status: RESOLVED FIXED
Product: GRAM
Component: wsrf managed job factory service
Version: 4.0.1
Platform: All Linux
Importance: P3 major
Target Milestone: 4.2
Reported: 2006-03-27 23:18
Modified: 2006-04-07 16:40


Description From 2006-03-27 23:18:24
A TeraGrid collaborator found that when a job specifies:
        <resourceAllocationGroup>
             <hostType>ia32-compute</hostType>
             <hostCount>2</hostCount>
         </resourceAllocationGroup>
globusrun-ws fails with:
globusrun-ws: Job failed: The executable could not be started.
qsub: Job exceeds queue resource limits
I tracked down that the job was being rejected because it included:
  #PBS -l nodes=2:ia64-compute:ppn=0.5
where "ppn=0.5" isn't valid. If one specifies <cpusPerHost>1</cpusPerHost>, the
submit works.
When cpusPerHost isn't specified, ":ppn=.." should not be put in the job at
all. I think this makes the most sense.

Also, I found a documentation inconsistency at
http://www.globus.org/toolkit/docs/4.0/admin/docbook/ch11.html
where it talks about both cpusPerHost and cpusPerNode; the former is the
correct tag.

Thanks, JP
------- Comment #1 From 2006-03-28 13:59:58 -------
I'm having trouble updating the downloads page with a link to the new update
package, so here it is:

http://www-unix.mcs.anl.gov/~lane/Downloads/globus_gram_job_manager-7.13.tar.gz

With this update, ExtensionHandler.pm no longer adds a "ppn" clause if neither
cpuCount nor cpusPerHost is specified.
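
For readers who want to see the intended behavior spelled out, here is a minimal sketch of the rule described above. It is written in Python for illustration only and is not the Perl code in ExtensionHandler.pm; the function name pbs_nodes_clause and its parameters are hypothetical, and rounding cpuCount up to a whole number of processors per host is an assumption of this sketch, not something taken from the update package.

```python
from typing import Optional


def pbs_nodes_clause(host_type: str, host_count: int,
                     cpus_per_host: Optional[int] = None,
                     cpu_count: Optional[int] = None) -> str:
    """Build a '#PBS -l nodes=...' line for one resourceAllocationGroup.

    Hypothetical helper: mirrors the rule from this bug, not the real code.
    """
    clause = f"nodes={host_count}:{host_type}"

    if cpus_per_host is not None:
        # An explicit cpusPerHost is used as-is.
        ppn = cpus_per_host
    elif cpu_count is not None:
        # Assumption: spread cpuCount over the hosts and round up,
        # so ppn is always a whole number (never something like 0.5).
        ppn = -(-cpu_count // host_count)
    else:
        # Neither value given: leave the ppn clause out entirely.
        ppn = None

    if ppn is not None:
        clause += f":ppn={ppn}"
    return f"#PBS -l {clause}"


if __name__ == "__main__":
    print(pbs_nodes_clause("ia64-compute", 2))
    print(pbs_nodes_clause("ia64-compute", 2, cpus_per_host=1))
```

With neither value given, the sketch emits only the nodes and host type parts, which is the behavior requested in the description.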

I also fixed the doc typo locally, but a problem with CVS access to the docs is
preventing me from committing that too. Hopefully that will be fixed soon.