<?xml version="1.0" standalone="yes" ?>
<!DOCTYPE bugzilla SYSTEM "http://bugzilla.globus.org/bugzilla/bugzilla.dtd">

<bugzilla version="3.2.3"
          urlbase="http://bugzilla.globus.org/bugzilla/"
          maintainer="bacon@mcs.anl.gov"
>

    <bug>
          <bug_id>747</bug_id>
          
          <creation_ts>2003-02-24 13:51</creation_ts>
          <short_desc>jobmanager replies success on failed requests</short_desc>
          <delta_ts>2003-03-10 11:11:30</delta_ts>
          <reporter_accessible>1</reporter_accessible>
          <cclist_accessible>1</cclist_accessible>
          <classification_id>1</classification_id>
          <classification>Unclassified</classification>
          <product>GRAM</product>
          <component>gt2 Gatekeeper/Jobmanager</component>
          <version>1.6</version>
          <rep_platform>All</rep_platform>
          <op_sys>All</op_sys>
          <bug_status>RESOLVED</bug_status>
          <resolution>FIXED</resolution>
          
          
          
          
          <priority>P2</priority>
          <bug_severity>normal</bug_severity>
          <target_milestone>---</target_milestone>
          
          
          
          <everconfirmed>1</everconfirmed>
          <reporter name="Jaime Frey">jfrey@cs.wisc.edu</reporter>
          <assigned_to name="Stuart Martin">smartin@mcs.anl.gov</assigned_to>
          <cc>adesmet@cs.wisc.edu</cc>
    
    <cc>bester@mcs.anl.gov</cc>
    
    <cc>dangulo@cs.uchicago.edu</cc>
    
    <cc>pfc@cs.wisc.edu</cc>
    
    <cc>roy@cs.wisc.edu</cc>
    
    <cc>tannenba@cs.wisc.edu</cc>

      

      
          <long_desc isprivate="0">
            <who name="Jaime Frey">jfrey@cs.wisc.edu</who>
            <bug_when>2003-02-24 13:51:48</bug_when>
            <thetext>In our latest CMS testbed run, I found one (probably more, but I haven&apos;t looked
at enough log files yet) jobmanager that failed to add stderr to the gass cache,
but replied to the client (Condor-G) with error 110 (WAITING_TO_COMMIT), which
is incorrect. Condor-G was then very confused when it couldn&apos;t commit the
submission (because the jobmanager had exited) and couldn&apos;t restart it (because
the jobmanager had removed the job state file).

I think the problem is in globus_l_gram_job_manager_reply() at
globus_gram_job_manager_state.c:2144. The code sets failure_code to
WAITING_TO_COMMIT if two-phase commit was requested, even if the request failed
(overwriting the existing failure code).</thetext>
          </long_desc>
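          <!--
          Editorial sketch, not taken from globus_gram_job_manager_state.c: the
          overwrite described in the report above can be illustrated roughly as
          follows. All function and constant names here are hypothetical; only
          the value 110 (WAITING_TO_COMMIT) comes from the report, and the guarded
          variant is one plausible shape of a fix, not necessarily the one shipped
          in the update package.

```c
#include <assert.h>

/* Hypothetical stand-ins; only the value 110 comes from the report. */
enum {
    ERR_NONE = 0,
    ERR_CACHE_FAILED = 1,        /* placeholder value, not a real GRAM code */
    ERR_WAITING_TO_COMMIT = 110
};

/* Behaviour as described in the report: when two-phase commit was
 * requested, the reply code is always replaced, so an earlier failure
 * is hidden from the client. */
int reply_code_buggy(int failure_code, int two_phase)
{
    if (two_phase) {
        failure_code = ERR_WAITING_TO_COMMIT;
    }
    return failure_code;
}

/* Guarded variant (an assumption): only ask the client to commit when
 * the request itself succeeded; otherwise keep the original failure. */
int reply_code_guarded(int failure_code, int two_phase)
{
    if (two_phase && failure_code == ERR_NONE) {
        failure_code = ERR_WAITING_TO_COMMIT;
    }
    return failure_code;
}
```

          With a failed request and two-phase commit requested,
          reply_code_buggy(ERR_CACHE_FAILED, 1) yields 110, while
          reply_code_guarded(ERR_CACHE_FAILED, 1) yields the original failure code.
          -->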
          <long_desc isprivate="0">
            <who name="Joe Bester">bester@mcs.anl.gov</who>
            <bug_when>2003-03-10 11:11:30</bug_when>
            <thetext>The update package at 
ftp://ftp.globus.org/pub/gt2/2.2/2.2.4/updates/src/globus_gram_job_manager-3.7.tar.gz 
resolves this problem. 
 
joe </thetext>
          </long_desc>
      
      

    </bug>

</bugzilla>