<?xml version="1.0" standalone="yes" ?>
<!DOCTYPE bugzilla SYSTEM "http://bugzilla.globus.org/bugzilla/bugzilla.dtd">

<bugzilla version="3.2.3"
          urlbase="http://bugzilla.globus.org/bugzilla/"
          maintainer="bacon@mcs.anl.gov"
>

    <bug>
          <bug_id>722</bug_id>
          
          <creation_ts>2003-02-17 17:23</creation_ts>
          <short_desc>gatekeeper reboot could spell doom for globus jobs submitted to Condor</short_desc>
          <delta_ts>2003-03-10 11:16:31</delta_ts>
          <reporter_accessible>1</reporter_accessible>
          <cclist_accessible>1</cclist_accessible>
          <classification_id>1</classification_id>
          <classification>Unclassified</classification>
          <product>GRAM</product>
          <component>gt2 Gatekeeper/Jobmanager</component>
          <version>1.6</version>
          <rep_platform>PC</rep_platform>
          <op_sys>All</op_sys>
          <bug_status>RESOLVED</bug_status>
          <resolution>FIXED</resolution>
          
          
          
          
          <priority>P2</priority>
          <bug_severity>critical</bug_severity>
          <target_milestone>---</target_milestone>
          
          
          
          <everconfirmed>1</everconfirmed>
          <reporter name="Todd Tannenbaum">tannenba@cs.wisc.edu</reporter>
          <assigned_to name="Stuart Martin">smartin@mcs.anl.gov</assigned_to>
          <cc>adesmet@cs.wisc.edu</cc>
          <cc>bester@mcs.anl.gov</cc>
          <cc>jfrey@cs.wisc.edu</cc>
          <cc>jms@mcs.anl.gov</cc>
          <cc>pfc@cs.wisc.edu</cc>
          <cc>roy@cs.wisc.edu</cc>
          <cc>vwelch@uiuc.edu</cc>

          <long_desc isprivate="0">
            <who name="Todd Tannenbaum">tannenba@cs.wisc.edu</who>
            <bug_when>2003-02-17 17:23:04</bug_when>
            <thetext>Hi -

In Globus 2.2.4 (and probably anything based on gram 1.6), there is 
a nasty bug in the source file:
   gram/jobmanager/setup/condor/condor.in
which ultimately gets installed as the script:
   lib/perl/Globus/GRAM/JobManager/condor.pm

The problem is the value used for the variable condor_logfile.
It looks like it uses tmpdir which is set to /tmp in Core::Paths, 
which ultimately tells Condor to place the job 
log someplace in /tmp/.   This is bad!!  The Condor job log
has the same requirements as the jobmanager state files -- specifically,
it:
  a) must be persistent even across system reboots (and thus /tmp is a bad idea)
  b) must NOT be on an NFS volume, because of problems with file locks on NFS.
Perhaps the condor_logfile variable should point to the same subdirectory
where the jobmanager state files go?

Thanks,
Todd
UW-Madison Condor Team</thetext>
          </long_desc>
          <long_desc isprivate="0">
            <who name="Stuart Martin">smartin@mcs.anl.gov</who>
            <bug_when>2003-02-18 13:09:17</bug_when>
            <thetext>Todd,

what about using the GLOBUS_LOCATION env var in the condor perl script?  For 
now that would probably be the best way to go.

So instead of this:

    $self-&gt;{condor_logfile} = $Globus::Core::Paths::tmpdir
                            . &quot;/gram_condor_log.&quot;
                            . $self-&gt;{JobDescription}-&gt;uniq_id();

Do this:

    $self-&gt;{condor_logfile} = $ENV{GLOBUS_LOCATION}
                            . &quot;/gram_condor_log.&quot;
                            . $self-&gt;{JobDescription}-&gt;uniq_id();

There is no guarantee that $GL is not on NFS, but I think it typically isn&apos;t.
The JM code writes the state file to $GL/tmp/... unless the JM is configured 
with &quot;-state-file-dir &lt;dir&gt;&quot;.  A more complete solution would be to set another 
env var like GLOBUS_JM_STATE_DIR; the condor script could then look for that 
and fall back to $GL.
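
A minimal sketch of that fallback, assuming the JM exported a variable named
GLOBUS_JM_STATE_DIR (hypothetical, it does not exist today). The helper name
condor_log_path is also made up for illustration:

```perl
use strict;
use warnings;

# Hypothetical helper: build the Condor log path for one job.
# Prefers GLOBUS_JM_STATE_DIR (an assumed variable, not set by the
# current JM) and falls back to GLOBUS_LOCATION when it is unset or empty.
sub condor_log_path {
    my ($uniq_id) = @_;
    my $dir = $ENV{GLOBUS_JM_STATE_DIR};
    if (!defined $dir or $dir eq '') {
        die "GLOBUS_LOCATION is not set\n"
            unless defined $ENV{GLOBUS_LOCATION};
        $dir = $ENV{GLOBUS_LOCATION};
    }
    return $dir . '/gram_condor_log.' . $uniq_id;
}
```

In condor.pm the result would be stored in the condor_logfile field, with the
job description&apos;s uniq_id() passed as the argument.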

Maybe better yet would be to change the name of the -state-file-dir argument 
and the env var that the JM sets to something that indicates its use, like 
-local-tmp-dir (env var GLOBUS_LOCAL_TMP_DIR).  Then the condor script could use 
the GLOBUS_LOCAL_TMP_DIR env var to set the location of the condor_logfile.

What do you think?

-Stu</thetext>
          </long_desc>
          <long_desc isprivate="0">
            <who name="Jaime Frey">jfrey@cs.wisc.edu</who>
            <bug_when>2003-02-18 14:05:01</bug_when>
            <thetext>As I suggested in my duplicate bug report (#721, sorry about that), it
seems most obvious to put the condor user log files in the same directory
as the job state files. Both represent persistent information about the
job. Also, both get locked so they can&apos;t be on NFS. I think local-tmp-dir
is a bad name, since it implies that files written there don&apos;t need to be
long-lived. I think job-state-dir or jm-state-dir is better.

-- Jaime</thetext>
          </long_desc>
          <long_desc isprivate="0">
            <who name="Stuart Martin">smartin@mcs.anl.gov</who>
            <bug_when>2003-02-18 14:27:34</bug_when>
            <thetext>*** Bug 721 has been marked as a duplicate of this bug. ***</thetext>
          </long_desc>
          <long_desc isprivate="0">
            <who name="Todd Tannenbaum">tannenba@cs.wisc.edu</who>
            <bug_when>2003-02-20 09:05:01</bug_when>
            <thetext>At 01:09 PM 2/18/2003, bugzilla-daemon@mcs.anl.gov wrote:
&gt;http://bugzilla.globus.org/bugzilla/show_bug.cgi?id=722
&gt;------- Additional Comments From smartin@mcs.anl.gov  2003-02-18 13:09 -------
&gt;Todd,
&gt;
&gt;what about using the GLOBUS_LOCATION env var in the condor perl script?  For
&gt;now that would probably be the best way to go.
&gt;
&gt;So instead of this:
&gt;
&gt;     $self-&gt;{condor_logfile} = $Globus::Core::Paths::tmpdir
&gt;                             . &quot;/gram_condor_log.&quot;
&gt;                             . $self-&gt;{JobDescription}-&gt;uniq_id();
&gt;
&gt;Do this:
&gt;
&gt;     $self-&gt;{condor_logfile} = $ENV{GLOBUS_LOCATION}
&gt;                             . &quot;/gram_condor_log.&quot;
&gt;                             . $self-&gt;{JobDescription}-&gt;uniq_id();

I don&apos;t think GLOBUS_LOCATION is a good idea, because wouldn&apos;t most sites 
disallow world writes to this directory?

&gt;It is no guarantee that the $GL is not on NFS, but I think it typically isn&apos;t.
&gt;The JM code writes the state file to $GL/tmp/...

I think it makes good sense to set condor_logfile to the same place 
jobmanager state files go ($GL/tmp), since they both have the same 
requirements: (a) must be local [not on nfs], (b) must be a subdirectory 
writable by all users in the gridmap file, (c) must be persistent until 
explicitly removed ( i.e. not be deleted at bootup or by cron or some such ).
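
As a rough sketch, a setup script could sanity-check a candidate directory
against (b) and part of (c). The helper name and the list of volatile
prefixes are assumptions, and (a) is not checked at all, since detecting NFS
portably needs platform-specific calls:

```perl
use strict;
use warnings;

# Hypothetical helper: return a list of problems found with $dir as a
# home for jobmanager state files and Condor logs. An empty list means
# nothing was detected; it does not prove the directory is safe.
sub spool_dir_problems {
    my ($dir) = @_;
    my @problems;
    # (c) locations commonly wiped at boot or by cron; this list is an
    # assumption and should be adjusted per site.
    for my $volatile ('/tmp') {
        push @problems, "$dir is under $volatile"
            if $dir eq $volatile or index($dir, "$volatile/") == 0;
    }
    if (!-d $dir) {
        push @problems, "$dir is not a directory";
    } elsif (!-w $dir) {
        # (b) this only tests the current user, not every gridmap user
        push @problems, "$dir is not writable by the current user";
    }
    return @problems;
}
```

Calling spool_dir_problems on the configured state file directory and
printing any returned messages would catch the /tmp case described above.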



&gt;Maybe better yet would be to change the name of the -state-file-dir argument
&gt;and ENV var that the JM sets to something that indicates it&apos;s use like -local-
&gt;tmp-dir (env var GLOBUS_LOCAL_TMP_DIR).  Then the condor script could use the
&gt;GLOBUS_LOCAL_TMP_DIR env var to set the location of the condor_logfile.
&gt;
&gt;What do you think?

Sure, that would probably be the cleanest.  But I would remove the term 
&quot;TMP&quot; from the actual name of the environment variable, since that implies 
to sysadmins it is safe to point it at /tmp, which is definitely *not* 
safe! Perhaps GLOBUS_SPOOL_DIR ?

very best regards from CERN,
Todd

p.s. Joe Bester is sitting here, and he says that he does _not_ want to 
say hi to you.  Perhaps you guys have worked together too much?  ;^)






</thetext>
          </long_desc>
          <long_desc isprivate="0">
            <who name="Joe Bester">bester@mcs.anl.gov</who>
            <bug_when>2003-03-10 11:16:31</bug_when>
            <thetext>Following update packages use the state file directory for condor log files. 
 
ftp://ftp.globus.org/pub/gt2/2.2/2.2.4/updates/src/globus_gram_job_manager-3.7.tar.gz 
ftp://ftp.globus.org/pub/gt2/2.2/2.2.4/updates/src/globus_gram_job_manager_setup_condor-1.2.tar.gz 
 
joe </thetext>
          </long_desc>
      
      

    </bug>

</bugzilla>