Bugzilla – Bug 3876
Automatic transfer of all files modified by a job in GRAM
Last modified: 2012-09-05 11:42:57
A feature I'd like to see is an option to automatically stage out all files
written by the job. There are a number of applications where the set of files
written depends heavily on some input files. The submitter may not know the
exact list of files that will be written but wants all of them. To work
properly, this would require that each job have its own execute directory,
which conflicts somewhat with the current ability for the submitter to specify
an arbitrary execute directory for the job.
One way you could implement this in the current framework would be to introduce
a directory that's always created and unique per job. If the submitter uses
that directory as the job's execute directory, they can use the
automatically-transfer-all-output option. If they specify a different
directory, they can't use the option, or they're told they risk transferring
files unrelated to the job.
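A minimal sketch of that rule, in Python for illustration only. The names
make_job_dir and can_stage_all_output are hypothetical and are not part of
GRAM; the sketch assumes the service owns a scratch root in which it can
create one directory per job.

```python
import os
import tempfile


def make_job_dir(scratch_root):
    """Create a directory that is unique to one job and return its path.

    Hypothetical helper: scratch_root is assumed to be a directory the
    service can write to.
    """
    return tempfile.mkdtemp(prefix="job_", dir=scratch_root)


def can_stage_all_output(execute_dir, job_dir):
    """Allow transfer-all-output only if the job runs in its own directory."""
    return os.path.realpath(execute_dir) == os.path.realpath(job_dir)
```

A submitter who overrides the execute directory would get False from
can_stage_all_output, at which point the service could either reject the
option or warn about transferring unrelated files.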
What's the problem with telling the application to spit out its data into a
specific directory and then simply transferring that whole directory back?
The main problem I see with that idea is that most of the time, input and
output files live in the same directory. You only want to transfer back the
output files. Transferring the input files as well would be especially bad if
they're very large.
Reassigning to current GRAM developer to close/fix as appropriate.
This functionality exists: specify a directory to stage out. Condor-G
manages this for the user.
See comment #2.
This doesn't provide a way to transfer the output files but not the input files
for the job.
I'm reviving this after a discussion with Jaime.
One possibility would be to check the job's working dir for any files that have
been modified after the LRM job submission took place. If so, then those files
are included in the list of files to be staged. In the GRAM job description a
new variable could be added that expands to the set of changed files.
Something like GLOBUS_MODIFIED_JOB_FILES could be specified in the source URL
along with a destination URL with the remote GridFTP dir where the files will
be transferred to.
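The modification-time check could be sketched roughly as follows, in Python
for illustration only. The function name modified_job_files is hypothetical,
and GLOBUS_MODIFIED_JOB_FILES is only the variable name proposed in this
comment, not an existing GRAM variable. The sketch assumes the submission
time is recorded as a Unix timestamp on a clock consistent with the file
system's.

```python
import os


def modified_job_files(work_dir, submit_time):
    """Return sorted paths (relative to work_dir) of files whose
    modification time is later than submit_time (a Unix timestamp)."""
    changed = []
    for root, _dirs, files in os.walk(work_dir):
        for name in files:
            path = os.path.join(root, name)
            # Keep only files touched after the LRM job was submitted.
            if os.path.getmtime(path) > submit_time:
                changed.append(os.path.relpath(path, work_dir))
    return sorted(changed)
```

Note that an input file the job rewrites in place would also be picked up by
this check, since only timestamps are compared.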
This feature would have been (and would still be) useful for processing NanoHub jobs on
Marking as 4.4 for now.
*** Bug 4397 has been marked as a duplicate of this bug. ***
Doing some Bugzilla cleanup... Resolving old GRAM3 and GRAM4 issues that are
no longer relevant since we've moved on to GRAM5. Also, we're now tracking
issues in JIRA. Any new issues should be added here: