| Summary: | Default scratchDirectory doesn't exist | ||
|---|---|---|---|
| Product: | GRAM | Reporter: | Jaime Frey <jfrey@cs.wisc.edu> |
| Component: | wsrf managed execution job service | Assignee: | Stuart Martin <smartin@mcs.anl.gov> |
| Status: | RESOLVED WONTFIX | ||
| Severity: | normal | CC: | dangulo@cs.uchicago.edu, greenc@fnal.gov, lane@mcs.anl.gov, madduri@mcs.anl.gov, roy@cs.wisc.edu, rynge@isi.edu, smartin@mcs.anl.gov |
| Priority: | P3 | ||
| Version: | 4.0.1 | ||
| Target Milestone: | 4.2.1 | ||
| Hardware: | All | ||
| OS: | All | ||
We deliberately decided that the scratch directory should not be managed by GRAM,
since we can't know the intentions of the administrator regarding the scratch
directory. The substitution variable is provided as a convenience to the
administrator for pointing people at the desired scratch area. If the
administrator didn't create the scratch directory, then we are assuming that it
shouldn't be used. I think the proper thing to do is to throw an error if
${SCRATCH_DIR} is used and scratchDirectory is not specified explicitly (i.e.
provide no default). We debated this idea when we were changing it from the
dangerous default of ~/, but people (myself included, IIRC) thought that a sane
default was warranted. It seems, though, that this just causes frustration and
false expectations.
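A minimal sketch of the "no default, fail early" behavior proposed above, written in Python purely for illustration. The function name `expand_scratch_dir`, the `ScratchDirError` exception, and the plain string replacement are all assumptions; none of this is GRAM's actual substitution code. The existence check goes slightly beyond the proposal and is included only to show how a missing directory could be reported immediately.

```python
import os
from typing import Optional

TOKEN = "${SCRATCH_DIR}"


class ScratchDirError(Exception):
    """Hypothetical error: ${SCRATCH_DIR} was used but cannot be resolved."""


def expand_scratch_dir(value: str, scratch_directory: Optional[str]) -> str:
    """Expand ${SCRATCH_DIR} in value, failing early instead of guessing a default.

    scratch_directory stands in for the administrator's explicit
    scratchDirectory setting; None or "" means it was never configured.
    """
    if TOKEN not in value:
        # The job never asked for scratch space, so there is nothing to check.
        return value
    if not scratch_directory:
        raise ScratchDirError(
            "${SCRATCH_DIR} is used but scratchDirectory is not configured"
        )
    if not os.path.isdir(scratch_directory):
        # Assumption beyond the proposal: also reject a configured path
        # that the administrator never created.
        raise ScratchDirError(
            f"scratchDirectory {scratch_directory!r} does not exist"
        )
    return value.replace(TOKEN, scratch_directory)
```

With a check like this, a job description that references `${SCRATCH_DIR}` would be rejected at submission time, before any file staging is attempted.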
I think part of the problem is a difference in how the scratchDirectory is perceived. I've also viewed it as a base directory, under which every job should have its own subdirectory. In writing Condor-G, I thought "Globus tells me where to create my job's scratch directory, but I create it as part of the job." Since the scratchDirectory is shared across all jobs, it makes no sense to store job files directly in it (rather than in subdirectories). So I expected the scratchDirectory to already exist. My design choice would be for GRAM to automatically create a scratch directory for every job and make it the job's default directory. The directory would be deleted along with the job. If the submitter wants the job to run in a different directory, they can say so, and at worst you have an extra empty directory for the life of the job.
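The per-job layout described above could look roughly like the following Python sketch. The helper name `job_scratch_dir` and the `<job_id>-<random>` naming scheme are assumptions for illustration; this is not how GRAM or Condor-G actually implements it. The shared base directory is expected to exist already, and each job gets its own subdirectory that is removed when the job ends.

```python
import shutil
import tempfile
from contextlib import contextmanager
from pathlib import Path
from typing import Iterator


@contextmanager
def job_scratch_dir(base: Path, job_id: str) -> Iterator[Path]:
    """Create a private scratch subdirectory for one job under a shared base.

    The base (the administrator's scratchDirectory) must already exist;
    it is never created or deleted here.
    """
    if not base.is_dir():
        raise FileNotFoundError(f"scratch base {base} does not exist")
    # mkdtemp guarantees a unique name, so concurrent jobs cannot collide.
    job_dir = Path(tempfile.mkdtemp(prefix=f"{job_id}-", dir=base))
    try:
        yield job_dir
    finally:
        # The directory lives and dies with the job.
        shutil.rmtree(job_dir, ignore_errors=True)
```

A caller would wrap the job's lifetime in `with job_scratch_dir(base, job_id) as workdir:` and run the job with `workdir` as its default directory. A job that runs elsewhere simply leaves `workdir` empty until cleanup.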
*** Bug 4768 has been marked as a duplicate of this bug. ***
Doing some Bugzilla cleanup... Resolving old GRAM3 and GRAM4 issues that are no longer relevant since we've moved on to GRAM5. Also, we're now tracking issues in Jira. Any new issues should be added here: http://jira.globus.org/secure/VersionBoard.jspa?selectedProjectId=10363
At some point, the default value for scratchDirectory in WS-GRAM (set in $GL/etc/gram-service-*/jndi-config.xml) was changed to ${GLOBUS_USER_HOME}/.globus/scratch. This directory doesn't exist, so jobs that try to use the scratch directory (like Condor-G) hang for long periods of time while RFT fails to create files under the directory. I don't know if or when RFT will time out or how GRAM will react. In the meantime, the client has no idea that there's a problem. The default value for scratchDirectory should either already exist or be created automatically by Globus.