Bugzilla – Bug 4207
Enabling dynamic job description variables using softenv
Last modified: 2007-08-22 15:28:11
Title: Enabling dynamic job description variables using softenv
Projects: TG
Technologies: WS GRAM

Definition:

TG uses softenv to define software environments across all TG resource providers. WS GRAM softenv extensions have been written to give TG users an easy way to get the same softenv functionality for a remote job submitted to a WS GRAM service. However, the softenv environment is not used for some of the functionality provided by WS GRAM. For example, softenv is currently not used in file staging.

There are a number of common TG softenv variables that a TG user might want to use in a WS GRAM job document, for example TG_GLOBAL_SCRATCH, TG_CLUSTER_SCRATCH, TG_NODE_SCRATCH, and TG_COMMUNITY. These are all variables that a TG user might want to use in stage-in job directives. For example:

  <fileStageIn>
    <transfer>
      <sourceUrl>gsiftp://lucky0.mcs.anl.gov:2811/bin/echo</sourceUrl>
      <destinationUrl>file:///${TG_GLOBAL_SCRATCH}/osg/osg_app</destinationUrl>
    </transfer>
  </fileStageIn>

The solution is to have WS GRAM support a generic variable/value config file that sets the JDD variable/value mapping used to resolve any matching variables found in a submitted job document. A few WS GRAM JDD variables are currently defined in the service jndi config file. Those variables should be moved into this file so that all variables are defined in one place. A TG setup package can then be written to add to or replace the default set of variables. The TG setup package will discover the environment variables defined for specific SoftEnv keys and create an entry in the config file for each environment variable.

A new service parameter is needed to set the path to the JDD variable file. This file should be checked periodically (at a configurable frequency) to see whether it has changed. If so, it should be reread, and the new content replaces the old.

There is one missing piece of functionality for setting the softenv environment for a TG user's job. This is separate from the JDD variables enhancement. It is desired for a WS GRAM service to be configured with a default softenv key that sets a default software environment for all jobs of all users for a specific WS GRAM service type (PBS, Fork, Condor, ...). A new service config parameter will be created for this purpose, e.g. "default_softenv_key" set to "@teragrid-basic". Using the previously provided softenv extension enhancements, a TG user will be able to override or extend the default software environment by setting a softenv key in a job description extension, e.g.

  <extensions>
    <softenv>@teragrid-basic-plus</softenv>
  </extensions>

Here is an example of a complete WS GRAM job description document using some TG variables.

  <job>
    <executable>${TG_COMMUNITY}/osg/osg_app</executable>
    <directory>${TG_GLOBAL_SCRATCH}/osg</directory>
    <argument>Hello</argument>
    <stdout>${GLOBUS_USER_HOME}/osg_app.stdout.${GLOBUS_JOB_ID}</stdout>
    <stderr>${GLOBUS_USER_HOME}/osg_app.stderr.${GLOBUS_JOB_ID}</stderr>
    <fileStageIn>
      <transfer>
        <sourceUrl>gsiftp://lucky0.mcs.anl.gov:2811/bin/echo</sourceUrl>
        <destinationUrl>file:///${TG_GLOBAL_SCRATCH}/osg/osg_app</destinationUrl>
      </transfer>
      <transfer>
        <sourceUrl>gsiftp://lucky0.mcs.anl.gov:2811/some/file1</sourceUrl>
        <destinationUrl>file:///${TG_NODE_SCRATCH}/file1</destinationUrl>
      </transfer>
    </fileStageIn>
    <fileStageOut>
      <transfer>
        <sourceUrl>file:///${GLOBUS_USER_HOME}/osg_app.stdout.${GLOBUS_JOB_ID}</sourceUrl>
        <destinationUrl>gsiftp://lucky0.mcs.anl.gov:2811/tmp/results/stdout.${GLOBUS_JOB_ID}</destinationUrl>
      </transfer>
    </fileStageOut>
    <fileCleanUp>
      <deletion>
        <file>file:///${TG_GLOBAL_SCRATCH}/osg/osg_app</file>
      </deletion>
      <deletion>
        <file>file:///${GLOBUS_USER_HOME}/osg_app.stdout.${GLOBUS_JOB_ID}</file>
      </deletion>
      <deletion>
        <file>file:///${GLOBUS_USER_HOME}/osg_app.stderr.${GLOBUS_JOB_ID}</file>
      </deletion>
    </fileCleanUp>
    <extensions>
      <softenv>@teragrid-basic</softenv>
    </extensions>
  </job>

Helpful TG links:
- Common Environment Variables section of: http://www.teragrid.org/userinfo/guide_environment.html
- The TG_COMMUNITY variable is listed at: http://www.teragrid.org/userinfo/guide_software_request.html

Deliverables:
1) New WS GRAM version with generic JDD variable support
   a) committed to trunk
   b) committed to community branch
2) New WS GRAM documentation describing new features

Tasks:
1) Create JDD variable config file path setting in jndi-config.xml.
2) Move any JDD variables defined in jndi-config.xml to the new config file.
3) Add JDD variable config file refresh period setting in jndi-config.xml.
3) Add sweeper thread code to reload the JDD variable config file according to the value from #3.
4) Add code to add JDD variable mappings from the config file before resolving JDD variable references.
5) Create TG setup package for adding to the JDD variable config file for specific softenv keys.
6) Add default softenv key setting to jndi-config.xml.
7) Add code to prepend the default softenv key to the list of <softenv> elements in the <extensions> section of the JDD.
8) Merge changes to the trunk (assuming changes were made directly to the community branch).
9) Test the refresh thread.
10) Test the default softenv key.

Time Estimate: 5 days
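The variable resolution proposed in the definition can be sketched roughly as follows. This is only an illustration in Python, not the WS GRAM implementation (which is Java): the `NAME=value` file format and the function names `parse_definitions` and `resolve` are assumptions made for the example, since the bug does not specify the config file syntax.

```python
import re

# Matches ${NAME} references as they appear in the job description examples.
_VAR = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")


def parse_definitions(text):
    """Parse a hypothetical definitions file: one NAME=value per line.

    Blank lines and lines starting with '#' are ignored (assumed syntax).
    """
    mapping = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        name, sep, value = line.partition("=")
        if sep:  # skip lines that contain no '=' at all
            mapping[name.strip()] = value.strip()
    return mapping


def resolve(text, mapping):
    """Replace every ${NAME} that has an entry in mapping.

    References with no entry are left untouched so that a later stage
    can still resolve them.
    """
    return _VAR.sub(lambda m: mapping.get(m.group(1), m.group(0)), text)


definitions = parse_definitions("TG_GLOBAL_SCRATCH=/scratch/global\n")
print(resolve("${TG_GLOBAL_SCRATCH}/osg", definitions))
```

Leaving unknown references in place is a design choice of this sketch: variables such as GLOBUS_USER_HOME and GLOBUS_JOB_ID are not expected to come from a static file, so the sketch does not treat a missing entry as an error.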
Tasks 1 through 3 are done. I included only GLOBUS_LOCATION in the new config file, as all the other substitution variables are set dynamically. The scratch directory setting is also used to populate a resource property, so it remains separate and will continue to be set dynamically. The default refresh period is 480 minutes (8 hours).
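For illustration, the periodic reload could look something like the sketch below. It is written in Python rather than the service's Java, and everything except the 480-minute default is an assumption: the class name `DefinitionsFile`, the `NAME=value` file syntax, and the use of the file's modification time to detect changes are all hypothetical.

```python
import os
import threading

DEFAULT_REFRESH_MINUTES = 480  # the default refresh period stated above


class DefinitionsFile:
    """Holds the substitution mapping read from a NAME=value file (assumed syntax)."""

    def __init__(self, path):
        self.path = path
        self.mapping = {}
        self._mtime = None
        self.reload_if_changed()

    def reload_if_changed(self):
        """Reread the file if its modification time changed. Returns True on reload."""
        try:
            mtime = os.stat(self.path).st_mtime_ns
        except OSError:
            return False  # file missing or unreadable: keep the old mapping
        if mtime == self._mtime:
            return False
        new_mapping = {}
        with open(self.path) as f:
            for line in f:
                line = line.strip()
                if line and not line.startswith("#") and "=" in line:
                    name, value = line.split("=", 1)
                    new_mapping[name.strip()] = value.strip()
        # The new content replaces the old wholesale, it is not merged.
        self.mapping = new_mapping
        self._mtime = mtime
        return True


def start_sweeper(definitions, minutes=DEFAULT_REFRESH_MINUTES):
    """Start a daemon thread that checks the file every `minutes` minutes.

    Returns an Event; setting it stops the thread.
    """
    stop = threading.Event()

    def run():
        # Event.wait returns False on timeout, so the loop runs once per period.
        while not stop.wait(minutes * 60):
            definitions.reload_if_changed()

    threading.Thread(target=run, daemon=True).start()
    return stop
```

Building a fresh dictionary and swapping it in with a single assignment means a reader never sees a half-updated mapping, which is one simple way to meet the "new content replaces the old" requirement.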
Correction: 1-3 are only done in the community branch.
Task 7 should be changed from this:

7) Add code to prepend the default softenv key to the list of <softenv> elements in the <extensions> section of the JDD.

to this:

7) Add code so that the default softenv key is not used if the user specifies a key for his job.

A user specifying any key for a job is like the user defining the keys for a .soft file, so the sys-admin default setting should be ignored in that case. This description of using the softenv extension should be in the documentation.
I just noticed that there are two #3 tasks, so I'll just relist them including Stu's change. I also added some documentation tasks that should have been included.

1) Create JDD variable config file path setting in jndi-config.xml.
2) Move any JDD variables defined in jndi-config.xml to the new config file.
3) Add JDD variable config file refresh period setting in jndi-config.xml.
4) Add sweeper thread code to reload the JDD variable config file according to the value from #3.
5) Add code to add JDD variable mappings from the config file before resolving JDD variable references.
6) Create TG setup package for adding to the JDD variable config file for specific softenv keys.
7) Add default softenv key setting to jndi-config.xml.
8) Add code so that the default softenv key is not used if the user specifies any keys in the extensions section.
9) Merge changes to the trunk (assuming changes were made directly to the community branch).
10) Test the refresh thread.
11) Test the default softenv key.
12) Add admin and user documentation describing the workings of the substitution definitions configuration.
13) Add admin and user documentation describing the workings of the default softenv key.
Tasks 3 & 4 are done.
Re the revised 7): when a user logs in for the first time, SoftEnv creates a new default .soft file. Instead of duplicating that information in the Globus JM, why not just have a Yes/No config flag? The default for that flag would be No. If the admin sets it to Yes or the user specifies SoftEnv keys, then SoftEnv would be initialized. If the user didn't specify SoftEnv keys, then it would be up to SoftEnv to define the default.
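The decision logic proposed here might be sketched as below. This is a Python illustration only: the function names, the `use_default_softenv` parameter, and the returned labels are hypothetical, and the sketch assumes a job description without XML namespaces, as in the examples in this bug.

```python
import xml.etree.ElementTree as ET


def softenv_keys(job_xml):
    """Return the <softenv> keys listed in the job's <extensions> section."""
    root = ET.fromstring(job_xml)
    return [e.text.strip()
            for e in root.findall("./extensions/softenv")
            if e.text and e.text.strip()]


def softenv_plan(job_xml, use_default_softenv=False):
    """Decide how SoftEnv should be initialized for a job.

    Returns a (mode, keys) pair, where mode is one of:
      "user-keys"       - the user gave keys; only those are used
      "softenv-default" - no user keys, flag is Yes; SoftEnv picks its default
      "none"            - no user keys, flag is No; SoftEnv is not initialized
    """
    keys = softenv_keys(job_xml)
    if keys:
        return ("user-keys", keys)
    if use_default_softenv:
        return ("softenv-default", [])
    return ("none", [])


job = """<job>
  <executable>/bin/echo</executable>
  <extensions><softenv>@teragrid-basic-plus</softenv></extensions>
</job>"""
print(softenv_plan(job))
```

Because user-specified keys are checked first, the admin flag only matters for jobs that carry no `<softenv>` elements, which matches the idea that user keys act like the user's own .soft file.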
Revised #7 and in so doing added another task...

1) Create JDD variable config file path setting in jndi-config.xml.
2) Move any JDD variables defined in jndi-config.xml to the new config file.
3) Add JDD variable config file refresh period setting in jndi-config.xml.
4) Add sweeper thread code to reload the JDD variable config file according to the value from #3.
5) Add code to add JDD variable mappings from the config file before resolving JDD variable references.
6) Create TG setup package for adding to the JDD variable config file for specific softenv keys.
7) Add default softenv usage flag to jndi-config.xml.
8) Add support in the service and the adapters for the default softenv usage flag.
9) Add code so that the default softenv environment is not used if the user specifies any keys in the extensions section.
10) Merge changes to the trunk (assuming changes were made directly to the community branch).
11) Test the refresh thread.
12) Test the default softenv key.
13) Add admin and user documentation describing the workings of the substitution definitions configuration.
14) Add admin and user documentation describing the workings of the default softenv key.
Work has been delayed this week on account of helping to get some GRAM and RFT issues resolved for OSG. I should be able to get back on this full time next week.
Tasks 5 and 7 are done.
Tasks 1-5, 7-9, and 11 are done. Task 12 is also done, but it should read "Test the default softenv usage flag".

Tasks that remain:

6) Create TG setup package for adding to the JDD variable config file for specific softenv keys.
10) Merge changes to the trunk (assuming changes were made directly to the community branch).
13) Add admin and user documentation describing the workings of the substitution definitions configuration.
14) Add admin and user documentation describing the workings of the default softenv key.
Task 10 is done except for the setup package, which isn't done for the community branch either. To continue, I need information about which specific substitutions are desired for TeraGrid so I can create the setup package (task 6). Stu, can you post that information here? Thanks!
I think all the TG env variables listed here would make good JDD variables. JP, can you confirm? Are there others besides these? http://www.teragrid.org/userinfo/guide_environment.html#variables
I would just include the following in the TeraGrid configuration example: TG_NODE_SCRATCH TG_CLUSTER_HOME TG_CLUSTER_SCRATCH TG_CLUSTER_PFS TG_CLUSTER_GPFS TG_CLUSTER_PVFS TG_CLUSTER_LUSTRE TG_APPS_PREFIX
Created an attachment (id=893) [details] common teragrid setup package Here's the setup package that will be used by future RM-specific setup packages to populate the substitution definitions file with the desired TG variables. Be aware that it's not in proper source package format (i.e. gpt-build will choke on it). I wanted to put it up here as it would appear in a CVS export if I had in fact committed it to CVS. I'm not sure who is going to keep this package, or where. Does anybody have an opinion on this?
Created an attachment (id=895) [details] Condor TeraGrid setup package
Created an attachment (id=896) [details] Fork TeraGrid setup package
Created an attachment (id=897) [details] LSF TeraGrid setup package
Created an attachment (id=898) [details] Multi TeraGrid setup package
Created an attachment (id=899) [details] PBS TeraGrid setup package
Can somebody from the GIG Pack group answer the question of where these setup packages should be committed? Thanks!
Created an attachment (id=900) [details] common teragrid setup package Fixed a bug in the detection of previously set TG variables.
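As a rough illustration of what such a setup step does, the sketch below appends TG variables to a definitions file while skipping any that are already set. It is not the attached package: the `NAME=value` syntax and the function name `add_tg_definitions` are assumptions, and the sketch only adds entries (it never replaces them). It also assumes the caller has already captured the environment with the desired SoftEnv key active; how that is done is outside the sketch.

```python
# Variable names suggested earlier in this bug for the TeraGrid configuration.
TG_VARIABLES = [
    "TG_NODE_SCRATCH", "TG_CLUSTER_HOME", "TG_CLUSTER_SCRATCH",
    "TG_CLUSTER_PFS", "TG_CLUSTER_GPFS", "TG_CLUSTER_PVFS",
    "TG_CLUSTER_LUSTRE", "TG_APPS_PREFIX",
]


def add_tg_definitions(existing_text, environ, names=TG_VARIABLES):
    """Return the definitions text with missing TG variables appended.

    existing_text: current file content (hypothetical NAME=value syntax).
    environ: mapping of environment variables captured for a SoftEnv key.
    Variables already defined in the file are detected and left as they are.
    """
    defined = set()
    for line in existing_text.splitlines():
        stripped = line.strip()
        if stripped and not stripped.startswith("#") and "=" in stripped:
            defined.add(stripped.split("=", 1)[0].strip())

    additions = [f"{name}={environ[name]}"
                 for name in names
                 if name in environ and name not in defined]
    if not additions:
        return existing_text

    # Make sure the appended entries start on their own line.
    separator = "" if not existing_text or existing_text.endswith("\n") else "\n"
    return existing_text + separator + "\n".join(additions) + "\n"
```

Detecting previously set names before appending makes the step idempotent, so running it more than once on the same file does not duplicate entries.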
I'm stalling this campaign. As usual, there's no place to put the documentation. If anybody wants to tell me what to do with the setup packages I created then I can do that. Otherwise there's nothing left for me to do at the moment.
As far as I'm concerned, you can keep them anywhere you'd like, as long as we know how to get them. Are these in a branch somewhere? If you could bootstrap them and make them available anywhere, we can pull them into the TeraGrid repo.
The 4.2-drafts doc branch is in place, but I haven't gotten a chance to get back to the doc part of this campaign. I'll update the status when I'm prepared to work on it.