Bug 3773 - ws gram job description extensions for condor
Component: wsrf scheduler interface
Version: 4.0.1
Platform: Macintosh All
Priority/Severity: P3 enhancement
Target Milestone: 4.2
Assigned To:
Reported: 2005-09-22 08:16 by
Modified: 2006-06-26 11:46




Description From 2005-09-22 08:16:25
Work with the condor team to add extensions to the JDD to set the values below in
condor.pm. What other extensions should be considered?

    print SCRIPT_FILE "should_transfer_files = IF_NEEDED\n"; 
    print SCRIPT_FILE "WhenToTransferOutput = ON_EXIT\n\n";
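
For reference, those two print statements emit the following lines into the
generated Condor submit description file; a staging extension would presumably
add a transfer_input_files line alongside them (Condor's submit language spells
the attribute transfer_input_files, and the paths here are hypothetical):

    should_transfer_files = IF_NEEDED
    WhenToTransferOutput = ON_EXIT
    transfer_input_files = /home/user/input1.dat,/home/user/input2.dat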
------- Comment #1 From 2005-09-22 11:09:35 -------
You need to set 'transfer_input_file = <list of files staged in>'. The problem
is how to get the list of files to be transferred.

For pre-WS GRAM, the list of staged-in files from the RSL is available to the
proxy_relocate and stage_in perl subroutines, but not the submit one. One hack
would be to write out the list to a file in stage_in and read it back in submit.

WS GRAM looks to be more troublesome. None of the perl subroutines know the
list of staged-in files.

If the client specifies a custom directory to run the job from, the submit
subroutine could request to transfer all the files in that directory. This
assumes the client isn't using the same directory for many jobs and they're not
using the default of $(HOME).

A more intrusive change would be to change both pre-WS and WS GRAM to give the
list of staged-in files to the submit subroutine.

The real solution is to fully drop the shared file-system assumption and say
that all files transferred into a certain directory will be available to the
job on the execute machine. Any files outside that directory may or may not be
available. The default for the directory would be a unique directory for the
job, created and cleaned up automatically by GRAM. The client could specify a
different directory, but they should be aware of the consequences of sharing a
directory across many jobs.
------- Comment #2 From 2005-09-26 12:49:22 -------
With the new extensions handling code I just added to the Perl backend, you can
add multiple 'transfer_input_file' elements to list the files that need transferring:


$description->transfer_input_file() should then return

[ '/some/path/to/a/file', '/some/path/to/a/notherfile' ].
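
A sketch of the extension elements that would produce that array; placing them
inside the job description's <extensions> element (and the lack of a namespace
prefix) is an assumption here:

    <extensions>
      <transfer_input_file>/some/path/to/a/file</transfer_input_file>
      <transfer_input_file>/some/path/to/a/notherfile</transfer_input_file>
    </extensions>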

I can add support for this and the flag elements in the adapter pretty easily.
Users will simply have to explicitly indicate what files need staging to the
compute node. These should just be the file URLs from fileStageIn's sourceURL
elements without the 'file://'.
------- Comment #3 From 2005-09-26 13:37:26 -------
Updated packages for 4.0.1 are available from the following page:


Feel free to reopen this bug if I did something wrong. I don't know how to tell
whether it actually worked, but the extensions are getting through fine. Here's
the job description I used, which at the very least didn't cause any errors
according to globusrun-ws:

<?xml version="1.0" encoding="UTF-8"?>

You could also just have one transfer_input_file element with a comma-separated
list of files like the Condor job description uses.  The string value will just
be passed directly instead of converted from an array of values.
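
For example, the single-element, comma-separated form might look like this
(same assumption about the <extensions> wrapper; the paths are from the
earlier example):

    <extensions>
      <transfer_input_file>/some/path/to/a/file,/some/path/to/a/notherfile</transfer_input_file>
    </extensions>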
------- Comment #4 From 2005-09-26 14:40:57 -------
Can this new syntax be used to add arbitrary expressions to the condor submit
description file, or just attributes it knows about?

What happens if a job description with these condor attributes is submitted to
a different scheduler, such as PBS?

Are there plans to address the general problem of batch systems with no shared
file system?
------- Comment #5 From 2005-09-26 16:15:09 -------
The condor.pm still has support for the condor_submit parameter, so I believe
that if you do the following you can specify arbitrary parameters:

    <condor:condor_submit name="attrname">attrvalue</condor:condor_submit>

The generic extensions support only makes values available to the adapter
module.  It's up to the module to actually do something with it.  So if PBS
doesn't go looking for those parameters then it won't do anything.

Our stance on the general problem of systems with non-shared filesystems, as
far as I can remember, has been that it's too hard to have a general, automatic
solution. We provide the filesystem mapping mechanism to allow admins to
designate transfer nodes which share a filesystem with the compute nodes. The
user would then set up file staging directives that take advantage of this to
stage files to the transfer nodes, which makes the data accessible to the
compute nodes.
------- Comment #6 From 2006-06-26 07:37:34 -------

I'm trying to get my GT 4.0.2 setup under Linux to submit jobs to a Windows
Condor pool, without success.


Host1: GT 4.0.2 + Condor Central Manager + Condor Submit + Condor Executor
Host2: A Windows XP box set up as Condor Executor


reflects this correctly!

When I submit my job to WS GRAM it always specifies LINUX as a requirement, but
I want this job to run on the Windows box.

In the mail archive I found several hints like this one:


which basically says I should add this:

    <condorsubmit name="Requirements">Arch == "INTEL" &amp;&amp; OpSys

to my job description.
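
A complete element would presumably look like the following; the closing half
of the expression is missing from the quoted hint, so the OpSys value here is a
hypothetical guess for Windows XP:

    <condorsubmit name="Requirements">Arch == "INTEL" &amp;&amp; OpSys == "WINNT51"</condorsubmit>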
Unfortunately WS GRAM does not change anything in the ClassAd.

My second approach was to try the extension mechanism in another way as
described here:


I tried to add:


Also, no success.

Is the extension tag even interpreted in 4.0.2? Or is it maybe a bug?
------- Comment #7 From 2006-06-26 11:46:45 -------

Please do not ask semi-related user support questions in bug reports
(especially if the reports are already closed!). If you have questions about
GRAM, sign up for the gram-user mailing list and post those questions there.
Bug reports are for problems that are clearly bugs. If you're not sure, post to
gram-user first. If you're fairly certain it's a new bug, then file a new bug
report. We will determine if it's a duplicate or not.