Bugzilla – Bug 3766
CAMPAIGN: WS-GRAM Job Description Extensions Handling Support
Last modified: 2005-09-26 12:31:53
Projects: ???

Technologies: Globus Resource Allocation Manager (GRAM)

Definition:

Adding custom WS-GRAM job description extensions is easy, but the current resource manager adapter implementation in Perl has no easy way of handling the extensions element XML blob. To make this easier, an additional Perl module should be created that is called to handle this XML blob. By default, this Perl module should handle simple attributes (begin tag, CDATA, end tag). It is also proposed that some equivalent of the attributes that are needed by TeraGrid and were available in GT2 and GT3 also be supported by default for PBS.

The extensions handler shall be invoked as follows:

    new Globus::GRAM::ExtensionsHandler($class, $description);

where $class is the specific JobManager subclass (e.g. Globus::GRAM::JobManager::fork) and $description is a JobDescription reference. All changes to the JobDescription reference are done within the parsing handler subs.

Here is the proposed schema for host selection along with some examples:

<group name="HostSelectionByTypeGroup">
    <sequence>
        <!-- Select hosts by amount and (optionally) type. -->
        <element name="hostType" type="xsd:string" minOccurs="0"/>
        <element name="hostCount" type="xsd:int" minOccurs="0"/>
    </sequence>
</group>

<group name="HostSelectionByNameGroup">
    <sequence>
        <!-- Select hosts by name. The number of hosts is then the
             number of hostName elements in the JDD. -->
        <element name="hostName" type="xsd:string" minOccurs="0" maxOccurs="unbounded"/>
    </sequence>
</group>

<group name="HostSelectionGroup">
    <!-- Select either by [type]/amount or name. -->
    <choice>
        <group ref="tns:HostSelectionByTypeGroup"/>
        <group ref="tns:HostSelectionByNameGroup"/>
    </choice>
</group>

<group name="ProcessAllocationGroup">
    <choice>
        <!-- Let adapter determine allocation to nodes (replaces "count"). -->
        <element name="processCount" type="xsd:int"/>
        <!-- Specify specific allocations per host.
             The total process count is then the number of hosts
             times processesPerHost. -->
        <element name="processesPerHost" type="xsd:int"/>
    </choice>
</group>

<complexType name="ResourceAllocationGroupType">
    <sequence>
        <!-- Select hosts for this RAG. -->
        <group ref="tns:HostSelectionGroup"/>
        <!-- Select the number of processes or their uniform
             distribution among the hosts in this RAG. -->
        <group ref="tns:ProcessAllocationGroup"/>
    </sequence>
</complexType>

<!-- If multiple resourceAllocationGroup elements are specified in the JDD,
     then the total host and process counts are the sums of the explicit or
     implied host and process counts of each RAG. -->
<element name="resourceAllocationGroup" type="tns:ResourceAllocationGroupType"/>

<!-- Example 1: Equivalent to count = 10 -->
<resourceAllocationGroup>
    <!-- default hostCount = 1 -->
    <processCount>10</processCount>
</resourceAllocationGroup>

<!-- Example 2: Host selection by type -->
<resourceAllocationGroup>
    <!-- 10 processes on 1 activemural host -->
    <hostType>activemural</hostType>
    <!-- default hostCount = 1 -->
    <processCount>10</processCount>
</resourceAllocationGroup>
<resourceAllocationGroup>
    <!-- 5 ia64-compute hosts, 2 processes per host -->
    <hostType>ia64-compute</hostType>
    <hostCount>5</hostCount>
    <processesPerHost>2</processesPerHost>
</resourceAllocationGroup>

<!-- Example 3: Host selection by name -->
<resourceAllocationGroup>
    <!-- 2 hosts with 5 processes per host -->
    <hostName>tg-v001</hostName>
    <hostName>tg-v002</hostName>
    <processCount>10</processCount>
</resourceAllocationGroup>
<resourceAllocationGroup>
    <!-- 2 hosts with 2 processes per host -->
    <hostName>tg-c024</hostName>
    <hostName>tg-c025</hostName>
    <processesPerHost>2</processesPerHost>
</resourceAllocationGroup>

Deliverables:

1) A new ExtensionsHandler.pm Perl module which handles the "extensions" JDD element.
2) An updated JobManager.pm which can call the new module if the "xmlextensions" flag attribute is set in the Perl job description.
3) An update package for 4.0.x since this module.
4) Documentation explaining how to take advantage of the new module, as well as a listing of supported constructs and well-known attributes.

Tasks:

1) Create the base ExtensionsHandler.pm with a parser that is handed the "extensions" element and references to basic parsing handler subs.
2) Implement code to convert basic tag-value-tag data into named strings.
3) Add support to the default plugin for converting multiple, sequential tag-value-tag elements with the same tag name into one named string array.
4) Add support to the default plugin for converting elements with a "name" attribute and a simple value into a named hashtable.
5) Add support to the default plugin for node selection elements according to the above-mentioned schema.
6) Verify unit tests pass.
7) Add a developer's guide sub-section on how to modify ExtensionsHandler.pm.
8) Add a developer's guide sub-section on default construct support.

Time Estimate: 7 days
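For illustration only, the default conversions described in tasks 2 through 4 can be sketched in Python. This is not the ExtensionsHandler.pm implementation (which is Perl). The function name parse_extensions, the sample element names (project, notify, setting), and the choice to reject mixed forms with an error are all assumptions made for this sketch. It also does not check that repeated tags are strictly adjacent.

```python
import xml.etree.ElementTree as ET


def parse_extensions(xml_text):
    """Convert simple children of an <extensions> element into a dict.

    <tag>value</tag> (once)       -> {"tag": "value"}           (named string)
    same tag repeated             -> {"tag": ["v1", "v2"]}      (named string array)
    <tag name="k">value</tag>     -> {"tag": {"k": "value"}}    (named hashtable)
    """
    root = ET.fromstring(xml_text)
    result = {}
    for child in root:
        if len(child):
            # Nested elements (e.g. node selection) are out of scope here.
            continue
        tag = child.tag
        value = (child.text or "").strip()
        key = child.get("name")
        if key is not None:
            table = result.setdefault(tag, {})
            if not isinstance(table, dict):
                raise ValueError(f"<{tag}> mixes named and unnamed forms")
            table[key] = value
        elif tag not in result:
            result[tag] = value
        else:
            existing = result[tag]
            if isinstance(existing, dict):
                raise ValueError(f"<{tag}> mixes named and unnamed forms")
            if isinstance(existing, str):
                # Second occurrence: promote the string to an array.
                result[tag] = [existing, value]
            else:
                existing.append(value)
    return result


# Hypothetical extension elements, used only to exercise the three cases.
SAMPLE = """
<extensions>
  <project>demo-project</project>
  <notify>alice@example.org</notify>
  <notify>bob@example.org</notify>
  <setting name="queue">debug</setting>
  <setting name="walltime">30</setting>
</extensions>
"""

parsed = parse_extensions(SAMPLE)
print(parsed)
```

A single dict keyed by tag name is used here because it gives an adapter one place to look up each extension, whichever of the three shapes it takes.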
Tasks 1-3 are done.
Just some notes for translation into PBS directives:

The format of PBS' node selection directive is as follows:

#PBS -l nodes=node_spec[+node_spec...]

HostSelectionByTypeGroup translates to host type directives like the following:

#PBS -l nodes=2:visnode+3:compnode

HostSelectionByNameGroup translates to host name directives like the following:

#PBS -l nodes=node001+node002+node003

ProcessAllocationGroup translates to directives with "ppn" like the following:

#PBS -l nodes=4:visnode:ppn=2+5:compnode:ppn=3
#PBS -l nodes=node001:ppn=2+node002:ppn=3+node003:ppn=3

Example 1 is equivalent to nodes=1:ppn=10
Example 2 is equivalent to nodes=activemural:ppn=10+5:ia64-compute:ppn=2
Example 3 is equivalent to nodes=tg-v001:ppn=5+tg-v002:ppn=5+tg-c024:ppn=2+tg-c025:ppn=2
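The translation in these notes can be sketched as follows. This is an illustrative Python sketch, not the adapter's actual Perl code. The function names are hypothetical, and so is the wrapping extensions element around the examples. Dividing processCount evenly across the hosts is an assumption inferred from Example 3 (10 processes on 2 named hosts gives ppn=5), and the sketch rejects counts that do not divide evenly. Several groups are joined with "+".

```python
import xml.etree.ElementTree as ET


def group_to_node_specs(group):
    """Translate one <resourceAllocationGroup> element into PBS node_spec strings."""
    host_names = [(e.text or "").strip() for e in group.findall("hostName")]
    host_type = (group.findtext("hostType") or "").strip()
    host_count_text = (group.findtext("hostCount") or "").strip()
    host_count = int(host_count_text) if host_count_text else 1  # default hostCount = 1
    process_count = (group.findtext("processCount") or "").strip()
    per_host = (group.findtext("processesPerHost") or "").strip()

    n_hosts = len(host_names) if host_names else host_count
    if per_host:
        ppn = int(per_host)
    elif process_count:
        total = int(process_count)
        # Assumption of this sketch: processes are spread evenly over the hosts.
        if total % n_hosts:
            raise ValueError("processCount is not divisible by the number of hosts")
        ppn = total // n_hosts
    else:
        raise ValueError("need processCount or processesPerHost")

    if host_names:
        # Selection by name: one node_spec per named host.
        return [f"{name}:ppn={ppn}" for name in host_names]
    if host_type:
        # Selection by type: the count prefix is emitted only when hostCount is given.
        head = f"{host_count}:{host_type}" if host_count_text else host_type
    else:
        head = str(host_count)
    return [f"{head}:ppn={ppn}"]


def jdd_to_pbs_directive(extensions_xml):
    """Join the node specs of every resourceAllocationGroup into one directive."""
    root = ET.fromstring(extensions_xml)
    specs = []
    for group in root.iter("resourceAllocationGroup"):
        specs.extend(group_to_node_specs(group))
    return "#PBS -l nodes=" + "+".join(specs)


EXAMPLE_2 = """
<extensions>
  <resourceAllocationGroup>
    <hostType>activemural</hostType>
    <processCount>10</processCount>
  </resourceAllocationGroup>
  <resourceAllocationGroup>
    <hostType>ia64-compute</hostType>
    <hostCount>5</hostCount>
    <processesPerHost>2</processesPerHost>
  </resourceAllocationGroup>
</extensions>
"""

EXAMPLE_3 = """
<extensions>
  <resourceAllocationGroup>
    <hostName>tg-v001</hostName>
    <hostName>tg-v002</hostName>
    <processCount>10</processCount>
  </resourceAllocationGroup>
  <resourceAllocationGroup>
    <hostName>tg-c024</hostName>
    <hostName>tg-c025</hostName>
    <processesPerHost>2</processesPerHost>
  </resourceAllocationGroup>
</extensions>
"""

print(jdd_to_pbs_directive(EXAMPLE_2))
print(jdd_to_pbs_directive(EXAMPLE_3))
```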
Task 5 is half done. I added support for only one resourceAllocationGroup for PBS. I still need to add support for multiple resourceAllocationGroups. I still need to do task 4.
Finished tasks 4 and 5.
I finished task 6. I forgot to add a task for creating and publishing the update package. This needs to be done before any doc regarding this can be written (only 4.0 doc currently).
The update packages are currently being published here: http://www-unix.globus.org/toolkit/downloads/development/ The links currently point to tarballs in my personal site path, but should be moved somewhere more specific to the toolkit.
Finished task #8, so all that's left is to add doc on how to customize ExtensionsHandler.pm. Here's the new supported construct doc: http://www-unix.globus.org/toolkit/docs/4.0/execution/wsgram/admin-index.html#s-wsgram-admin-extensions
Task 7 is finished and all deliverables have been fulfilled. This campaign should be closed.