Bug 2734 - non-shared FS scheduler file list
wsrf scheduler interface
: development
: Macintosh All
: P3 enhancement
: 4.4
Assigned To:
Reported: 2005-02-14 16:12 by
Modified: 2012-09-05 11:42 (History)




Description From 2005-02-14 16:12:43
Condor has a feature to transfer files if needed: given a file list in the
Condor job specification, it will transfer the files to the compute node.  This
is useful when the Condor pool is on a non-shared file system.  GRAM could be
extended to support this.  Investigation should be done to see if/how other
schedulers support this.  It could be that a new RSL element is needed that
contains a directory or a file list from which the file list can be passed to
Condor.  Note that the files specified in fileStageIn are not sufficient, since
the client can stage files separately from the GRAM job.  This enhancement will
not be considered in GT 4.0.
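
For illustration, here is a rough sketch of the Condor side of this.  The
helper name and its parameters are hypothetical and not part of GRAM; the
submit commands it emits (should_transfer_files, when_to_transfer_output,
transfer_input_files) are the ones Condor uses for its file transfer
mechanism, as far as I understand the submit description syntax.

```python
def condor_submit_description(executable, input_files, arguments=""):
    """Build a Condor submit description that transfers files only if needed.

    Hypothetical helper for illustration; not an existing GRAM function.
    """
    lines = [
        "universe = vanilla",
        f"executable = {executable}",
    ]
    if arguments:
        lines.append(f"arguments = {arguments}")
    lines += [
        # IF_NEEDED asks Condor to transfer files only when the submit and
        # execute machines do not share a file system.
        "should_transfer_files = IF_NEEDED",
        "when_to_transfer_output = ON_EXIT",
    ]
    if input_files:
        # The file list that a new RSL element would have to supply.
        lines.append("transfer_input_files = " + ", ".join(input_files))
    lines.append("queue")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(condor_submit_description("/bin/run.sh", ["a.dat", "b.dat"]))
```

The open question in this bug is where the input_files list would come from
in the job description.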
------- Comment #1 From 2005-02-14 19:38:56 -------
Sent via email from karl:

LSF also has a notion of staging files from the front-end (e.g. local
submitting) node to the compute node(s).  This is why I have asserted
that a generalized staging model has to account for several filespaces:

   1. filespace where the globusrun tool is running (for if we have
      embedded file server capability)

   2. filespace where GRAM invokes local scheduler adapter

   3. filespace of compute nodes

   4. and of course generalized global URL to 3rd party data sites

It is not clear how our current staging should interact w/ the Condor
or LSF staging mechanism.  As it is spec'd, the WS-GRAM really handles
remote->compute staging via the ftp mappings and side-steps the
front-end filespace entirely.  This is not true for the credential
handling, but it is true for all staging and cleanup.

We could support explicit job file movement from the front-end to
compute nodes as a sort of local "buffer" area, but then we would
probably need to support RFT staging into the front-end space in
addition to (or instead of?) staging to the compute nodes.  Basically, we
have no clear usage model for a system w/o uniform filespaces across
all compute nodes accessible from one ManagedJobFactory resource.

------- Comment #2 From 2005-11-02 19:02:42 -------
It looks like PBS also has a mechanism for doing this (SGE can do it in the
prolog and epilog sections via scripts).  I don't remember why we avoided this
before in our re-design for GT4 GRAM, but this seems like it could be fairly
straightforward.

I'd say we need two things: 1) a configuration option per RM/factory resource
that indicates whether second-level staging is enabled or not, and 2) a list of
files to transfer to/from the job's working directory.  If second-level staging
is enabled but there is no associated file list for a particular staging
direction, then the file list is generated from the file:// URLs in the
fileStageIn/fileStageOut elements.
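
A minimal sketch of that fallback rule, assuming the staging directives are
available as (source URL, destination URL) pairs.  The function name, its
parameters, and the example URLs are made up for illustration and do not
correspond to existing GRAM code.

```python
from urllib.parse import urlparse


def second_level_file_list(staging_enabled, explicit_files, transfers):
    """Return the local paths to hand to the scheduler's own staging.

    staging_enabled: the per-factory configuration option (hypothetical).
    explicit_files:  file list given for this staging direction, may be empty.
    transfers:       (source_url, dest_url) pairs taken from the
                     fileStageIn or fileStageOut element for this direction.
    """
    if not staging_enabled:
        return []
    if explicit_files:
        # An explicit list always wins over the derived one.
        return list(explicit_files)
    files = []
    for source, dest in transfers:
        for url in (source, dest):
            parsed = urlparse(url)
            # Only file:// URLs refer to the local filespace.
            if parsed.scheme == "file" and parsed.path not in files:
                files.append(parsed.path)
    return files


if __name__ == "__main__":
    stage_in = [("gsiftp://host.example.org/data/in.dat",
                 "file:///scratch/job1/in.dat")]
    print(second_level_file_list(True, [], stage_in))
```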
------- Comment #3 From 2005-11-02 20:09:25 -------
How about having a staging directory for each job, and all files in that
directory would be transferred?  Either the client could declare it (with the
option of it being relative to some admin-configured base directory) or the
server could declare it (and provide a macro for the client to refer to it in
transfer directives).  That way, clients could do their own staging if they
wanted to.  I believe this maps directly onto PBS's file staging support.

Better yet, maybe a combination of the two ideas: everything in the staging
directory plus everything in the staging directives in the job description
gets transferred.
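
Roughly, the combined approach amounts to taking the union of the two
sources.  This is only a sketch; the function name and arguments are
hypothetical, and it assumes the directive files have already been reduced
to local paths.

```python
import os


def files_to_transfer(staging_dir, directive_files):
    """Union of everything under the job's staging directory and the
    files named in the job description's staging directives."""
    found = set()
    for root, _dirs, names in os.walk(staging_dir):
        for name in names:
            found.add(os.path.join(root, name))
    # A file present in both places is only transferred once.
    return sorted(found | set(directive_files))
```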
------- Comment #4 From 2007-09-19 11:37:53 -------
Reassigning to current GRAM developer to close/fix as appropriate.
------- Comment #5 From 2008-02-04 14:35:37 -------
In 4.0 this is possible by adding extension elements for Condor.  A general
solution needs to be considered in context of a future GRAM4 JSDL
implementation or other OGF specifications like DMI.
------- Comment #6 From 2012-09-05 11:42:28 -------
Doing some bugzilla cleanup...  Resolving old GRAM3 and GRAM4 issues that are
no longer relevant since we've moved on to GRAM5.  Also, we're now tracking
issues in Jira.  Any new issues should be added here: