<?xml version="1.0" standalone="yes" ?>
<!DOCTYPE bugzilla SYSTEM "http://bugzilla.globus.org/bugzilla/bugzilla.dtd">

<bugzilla version="3.2.3"
          urlbase="http://bugzilla.globus.org/bugzilla/"
          maintainer="bacon@mcs.anl.gov"
>

    <bug>
          <bug_id>5610</bug_id>
          
          <creation_ts>2007-10-12 14:33</creation_ts>
          <short_desc>Improving GRAM4 service performance and reliability beyond 4.0.5</short_desc>
          <delta_ts>2012-09-05 11:43:38</delta_ts>
          <reporter_accessible>1</reporter_accessible>
          <cclist_accessible>1</cclist_accessible>
          <classification_id>1</classification_id>
          <classification>Unclassified</classification>
          <product>GRAM</product>
          <component>Roadmap</component>
          <version>unspecified</version>
          <rep_platform>Macintosh</rep_platform>
          <op_sys>All</op_sys>
          <bug_status>RESOLVED</bug_status>
          <resolution>WONTFIX</resolution>
          
          
          
          
          <priority>P3</priority>
          <bug_severity>normal</bug_severity>
          <target_milestone>---</target_milestone>
          <dependson>5247</dependson>
          <dependson>5611</dependson>
          <dependson>5612</dependson>
          <dependson>5642</dependson>
          <dependson>5660</dependson>
          
          
          <everconfirmed>1</everconfirmed>
          <reporter name="Stuart Martin">smartin@mcs.anl.gov</reporter>
          <assigned_to name="Stuart Martin">smartin@mcs.anl.gov</assigned_to>
          <cc>bester@mcs.anl.gov</cc>
          <cc>feller@mcs.anl.gov</cc>

      

      
          <long_desc isprivate="0">
            <who name="Stuart Martin">smartin@mcs.anl.gov</who>
            <bug_when>2007-10-12 14:33:07</bug_when>
            <thetext>A number of improvements have been made to the GRAM4 service, resulting in improved reliability and performance in GT 4.0.5. However, additional performance and reliability improvements are needed.  OSG, TG and other production grids are starting to investigate and use GRAM4 more.  The GRAM team has been working with these groups, which encompass a variety of client and service hosts, client programs, client execution scenarios, and so on.  There are many variables here!  We have compiled a new guide that has helped to minimize and/or resolve some issues: http://www-unix.globus.org/toolkit/docs/4.0/execution/wsgram/WS_GRAM_Performance_Guide.html  But there is more work to do to further improve GRAM4 reliability.

During stress tests, the client and/or service container has been shown to become unresponsive, resulting in client-side timeouts.  This has occurred with as few as 50 concurrent job submissions.  In other tests, it has taken 200 or 400 job submissions.  Adding some timings has shown that there could be a problem in the container code.  Given this, the first step seems clear: we need to test, analyze, and profile the container options (starting with the GT standalone container) for our GT services in order to provide a baseline for what our services can expect.  Once we have decided on a container option, we can investigate further which methods make sense to minimize failures and ensure a high level of reliability.  As part of this process, we will establish a testing baseline against which improvements can be measured and tracked.

A number of known issues have been documented in the &quot;reliability plan&quot; here: http://www-unix.mcs.anl.gov/~feller/Globus/  These will be converted into Bugzilla entries and prioritized.  Progress on these issues can be made while the container analysis is being done.</thetext>
          </long_desc>
          <long_desc isprivate="0">
            <who name="Stuart Martin">smartin@mcs.anl.gov</who>
            <bug_when>2012-09-05 11:43:38</bug_when>
            <thetext>Doing some Bugzilla cleanup...  Resolving old GRAM3 and GRAM4 issues that are no longer relevant since we&apos;ve moved on to GRAM5.  Also, we&apos;re now tracking issues in JIRA.  Any new issues should be added here:

http://jira.globus.org/secure/VersionBoard.jspa?selectedProjectId=10363</thetext>
          </long_desc>
      
      

    </bug>

</bugzilla>