Bug 3766 - CAMPAIGN: WS-GRAM Job Description Extensions Handling Support
Status: RESOLVED FIXED
Product: GRAM
Component: wsrf scheduler interface
Version: development
Platform: All All
Importance: P1 normal
Target Milestone: 4.2
Reported: 2005-09-20 16:17
Modified: 2005-09-26 12:31


Description From 2005-09-20 16:17:52
Projects: ???
        
Technologies:   Globus Resource Allocation Manager (GRAM)

Definition:

Adding custom WS-GRAM job description extensions is easy, but the current
resource manager adapter implementation in Perl has no easy way of handling
the extensions element XML blob. To make this easier, an additional Perl
module should be created that is called to handle this XML blob. By default,
this Perl module should handle simple attributes (begin tag, CDATA, end tag).
It is also proposed that equivalents of the attributes that TeraGrid needs,
and that were available in GT2 and GT3, also be supported by default for PBS.

The extensions handler shall be invoked as follows:

	new Globus::GRAM::ExtensionsHandler($class, $description);

where $class is the specific JobManager subclass (e.g.
Globus::GRAM::JobManager::fork) and $description is a JobDescription
reference.  All changes to the JobDescription reference are made within the
parsing handler subs.

Here is the proposed schema for host selection along with some examples:

<group name="HostSelectionByTypeGroup">
    <sequence>
        <!-- Select hosts by amount and (optionally) type. -->
        <element name="hostType" type="xsd:string" minOccurs="0"/>
        <element name="hostCount" type="xsd:int" minOccurs="0"/>
    </sequence>
</group>

<group name="HostSelectionByNameGroup">
    <sequence>
        <!-- Select hosts by name. The number of hosts is then the
             number of hostName elements in the JDD. -->
        <element name="hostName" type="xsd:string"
                 minOccurs="0"    maxOccurs="unbounded"/>
    </sequence>
</group>

<group name="HostSelectionGroup">
    <!-- Select either by [type]/amount or name. -->
    <choice>
        <group ref="tns:HostSelectionByTypeGroup"/>
        <group ref="tns:HostSelectionByNameGroup"/>
    </choice>
</group>

<group name="ProcessAllocationGroup">
    <choice>
        <!-- Let adapter determine allocation to nodes (replaces
            "count"). -->
        <element name="processCount" type="xsd:int"/>

        <!-- Specify specific allocations per host. The total process
             count is then the number of hosts times processesPerHost
        -->
        <element name="processesPerHost" type="xsd:int"/>
    </choice>
</group>

<complexType name="ResourceAllocationGroupType">
    <sequence>
        <!-- Select hosts for this RAG. -->
        <group ref="tns:HostSelectionGroup"/>

        <!-- Select the number of processes or their uniform
             distribution among the hosts in this RAG. -->
        <group ref="tns:ProcessAllocationGroup"/>
    </sequence>
</complexType>

<!-- If multiple resourceAllocationGroup elements are specified in the
     JDD, then there are total host and process counts based on the sum
     of the explicit or implied host and process counts for each RAG.
-->
<element name="resourceAllocationGroup" type="tns:ResourceAllocationGroupType"/>

<!-- Example 1: Equivalent to count = 10 -->
<resourceAllocationGroup>
    <!-- default hostCount = 1 -->
    <processCount>10</processCount>
</resourceAllocationGroup>

<!-- Example 2: Host selection by type -->
<resourceAllocationGroup>
    <!-- 10 processes on 1 activemural host -->

    <hostType>activemural</hostType>
    <!-- default hostCount = 1 -->
    <processCount>10</processCount>
</resourceAllocationGroup>

<resourceAllocationGroup>
    <!-- 5 ia64-compute hosts, 2 processes per host -->

    <hostType>ia64-compute</hostType>
    <hostCount>5</hostCount>
    <processesPerHost>2</processesPerHost>
</resourceAllocationGroup>

<!-- Example 3: Host selection by name -->
<resourceAllocationGroup>
    <!-- 2 hosts with 5 processes per host -->

    <hostName>tg-v001</hostName>
    <hostName>tg-v002</hostName>
    <processCount>10</processCount>
</resourceAllocationGroup>

<resourceAllocationGroup>
    <!-- 2 hosts with 2 processes per host -->

    <hostName>tg-c024</hostName>
    <hostName>tg-c025</hostName>
    <processesPerHost>2</processesPerHost>
</resourceAllocationGroup>
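
The counting rule stated in the schema comments (per-group host and process
counts, summed over all resourceAllocationGroup elements) can be sketched as
follows. This is only an illustration in Python, not the Perl code of the
adapter. It assumes the groups sit un-namespaced inside a single wrapper
element, and the function names rag_counts and total_counts are made up for
this sketch.

```python
import xml.etree.ElementTree as ET


def rag_counts(rag):
    """Return (hosts, processes) for one resourceAllocationGroup element."""
    names = rag.findall("hostName")
    if names:
        # Selection by name: the host count is the number of hostName elements.
        hosts = len(names)
    else:
        # Selection by type: hostCount defaults to 1 when omitted.
        hosts = int(rag.findtext("hostCount", default="1"))

    per_host = rag.findtext("processesPerHost")
    if per_host is not None:
        # Uniform distribution: total is hosts times processesPerHost.
        procs = hosts * int(per_host)
    else:
        # Explicit total; the adapter decides the allocation to nodes.
        procs = int(rag.findtext("processCount"))
    return hosts, procs


def total_counts(extensions_xml):
    """Sum host and process counts over every group in the wrapper element."""
    root = ET.fromstring(extensions_xml)
    total_hosts = total_procs = 0
    for rag in root.findall("resourceAllocationGroup"):
        hosts, procs = rag_counts(rag)
        total_hosts += hosts
        total_procs += procs
    return total_hosts, total_procs
```

Applied to Example 2, the first group contributes 1 host and 10 processes and
the second contributes 5 hosts and 10 processes.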


Deliverables:

1) A new ExtensionsHandler.pm perl module which handles the "extensions" JDD
   element.
2) An updated JobManager.pm which can call the new module if the
   "xmlextensions" flag attribute is set in the perl job description.
3) An update package for 4.0.x since this module.
4) Documentation explaining how to take advantage of the new module as well as
   a listing of supported constructs and well-known attributes.

Tasks:

1) Create the base ExtensionsHandler.pm with a parser that is handed the
  "extensions" element and references to basic parsing handler subs.
2) Implement code to convert basic tag-value-tag data into named strings.
3) Add support to the default plugin for converting multiple,
  sequential, tag-value-tag elements with the same tag name into one
  named string array.
4) Add support to the default plugin for converting elements with a
  "name" attribute and a simple value into a named hashtable.
5) Add support to the default plugin for node selection elements according
  to the schema above.
6) Verify unit tests pass.
7) Add developers guide sub-section on how to modify ExtensionsHandler.pm.
8) Add developers guide sub-section on default construct support.
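
For illustration, the default conversions in tasks 2-4 might behave like the
following sketch. It is written in Python rather than Perl and is not the
ExtensionsHandler.pm code. The function name parse_extensions and the sample
element names are hypothetical. Unlike task 3, it groups repeated tags even
when they are not adjacent, and it assumes a given tag is used in only one of
the three forms.

```python
import xml.etree.ElementTree as ET


def parse_extensions(extensions_xml):
    """Map simple child elements of the extensions element to Python values.

    - <tag>value</tag>                -> result[tag] = "value"
    - repeated <tag>value</tag>       -> result[tag] = ["v1", "v2", ...]
    - <tag name="key">value</tag>     -> result[tag] = {"key": "value", ...}
    """
    root = ET.fromstring(extensions_xml)
    result = {}
    for child in root:
        if len(child):
            # Elements with nested children are left to specialised handlers.
            continue
        value = (child.text or "").strip()
        key = child.get("name")
        if key is not None:
            table = result.setdefault(child.tag, {})
            if not isinstance(table, dict):
                raise ValueError("mixed use of tag: " + child.tag)
            table[key] = value
        elif child.tag not in result:
            result[child.tag] = value
        elif isinstance(result[child.tag], list):
            result[child.tag].append(value)
        elif isinstance(result[child.tag], str):
            # Second occurrence: promote the named string to a string array.
            result[child.tag] = [result[child.tag], value]
        else:
            raise ValueError("mixed use of tag: " + child.tag)
    return result
```

A resourceAllocationGroup element has nested children, so this sketch skips
it. In the proposal that construct is handled by the node selection support
from task 5.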

Time Estimate:

7 days
------- Comment #1 From 2005-09-20 16:19:50 -------
Tasks 1-3 are done.
------- Comment #2 From 2005-09-22 17:41:36 -------
Just some notes for translation into PBS directives:

The format of PBS' node selection directive is as follows:

    #PBS -l nodes=node_spec[+node_spec...]

HostSelectionByTypeGroup translates to host type directives like the following:

    #PBS -l nodes=2:visnode+3:compnode


HostSelectionByNameGroup translates to host name directives like the following:

    #PBS -l nodes=node001+node002+node003


ProcessAllocationGroup translates to directives with "ppn" like the following:

    #PBS -l nodes=4:visnode:ppn=2+5:compnode:ppn=3
    #PBS -l nodes=node001:ppn=2+node002:ppn=3+node003:ppn=3

Example 1 is equivalent to nodes=1:ppn=10

Example 2 is equivalent to nodes=activemural:ppn=10+5:ia64-compute:ppn=2

Example 3 is equivalent to nodes=tg-v001:ppn=5+tg-v002:ppn=5+tg-c024:ppn=2+tg-c025:ppn=2
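
A rough sketch of the translation described above, in Python and for
illustration only (the real adapter code is Perl). The function names are made
up. It assumes un-namespaced elements under one wrapper element, and it
assumes a processCount must divide evenly among the selected hosts. It also
always writes the host count explicitly, so Example 2 comes out as
1:activemural:ppn=10 rather than activemural:ppn=10.

```python
import xml.etree.ElementTree as ET


def group_to_node_specs(rag):
    """Translate one resourceAllocationGroup into PBS node_spec strings."""
    names = [(e.text or "").strip() for e in rag.findall("hostName")]
    host_type = rag.findtext("hostType")
    if names:
        host_count = len(names)
    else:
        host_count = int(rag.findtext("hostCount", default="1"))

    per_host = rag.findtext("processesPerHost")
    if per_host is not None:
        ppn = int(per_host)
    else:
        total = int(rag.findtext("processCount"))
        if total % host_count:
            # Assumption of this sketch: no uneven distribution.
            raise ValueError("processCount not divisible by host count")
        ppn = total // host_count

    if names:
        # Selection by name: one node_spec per named host.
        return ["%s:ppn=%d" % (name, ppn) for name in names]
    # Selection by type: count[:type]:ppn=N
    spec = str(host_count)
    if host_type:
        spec += ":" + host_type.strip()
    return ["%s:ppn=%d" % (spec, ppn)]


def pbs_nodes_directive(extensions_xml):
    """Join the node_specs of all groups into one '#PBS -l nodes=' line."""
    root = ET.fromstring(extensions_xml)
    specs = []
    for rag in root.findall("resourceAllocationGroup"):
        specs.extend(group_to_node_specs(rag))
    return "#PBS -l nodes=" + "+".join(specs)
```

Multiple groups are simply concatenated with "+", which matches the
node_spec[+node_spec...] format given at the top of this comment.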
------- Comment #3 From 2005-09-22 20:39:46 -------
Task 5 is half done.  I added support for only one resourceAllocationGroup for
PBS.  I still need to add support for multiple resourceAllocationGroups.  I
still need to do task 4.
------- Comment #4 From 2005-09-23 10:28:51 -------
Finished tasks 4 and 5.
------- Comment #5 From 2005-09-23 11:59:20 -------
I finished task 6.

I forgot to add a task for creating and publishing the update package.  This
needs to be done before any doc regarding this can be written (only 4.0 doc
currently).
------- Comment #6 From 2005-09-23 15:37:59 -------
The update packages are currently being published here:

http://www-unix.globus.org/toolkit/downloads/development/

The links currently point to tarballs in my personal site path, but should be
moved somewhere more specific to the toolkit.
------- Comment #7 From 2005-09-23 19:42:19 -------
Finished task #8, so all that's left is to add doc on how to customize
ExtensionsHandler.pm.  Here's the new supported construct doc:

http://www-unix.globus.org/toolkit/docs/4.0/execution/wsgram/admin-index.html#s-wsgram-admin-extensions
------- Comment #8 From 2005-09-26 12:31:53 -------
Task 7 is finished and all deliverables have been fulfilled.  This campaign
should be closed.