Bugzilla – Bug 3242
Software selection thru WS GRAM RSL
Last modified: 2012-09-05 11:42:36
Modules and SoftEnv are two tools that users of Grids, like the TeraGrid, use to symbolically choose the software they need in their application environment. Symbolic software selection is particularly important for Grids because it standardizes software selection and hides the variability of where software is installed on individual grid resources. Users can currently describe software requirements before running applications in a shell and in batch jobs, but not through GRAM execution or job services. This is a request to extend the WS GRAM execution or job service RSL for GT 4.2 to include a verb for symbolically describing what software is required by the spawned process or job. The TeraGrid is already working with the Condor project to make SoftEnv-based symbolic software selection available in Condor.
Can you give an example of what this might look like? Is it truly just a custom string that can be used in a hacked scheduler script? If so, you could probably do this now with a little work using the <extensions> element in GT4 GRAM's job description schema.
In RSL syntax the parameter would look something like:

(software_environment=(@teragrid-basic +globus +mpich-g2))

where the value is an ordered list of strings. The Gatekeeper would need to execute something along the lines of

source "`jobmanager-env-setup \"<above_value>\"`"

before executing the user-specified program, regardless of jobmanager (fork, pbs, lsf, whatever). The jobmanager-env-setup script would be provided by GT but would need to be customized to implement environment setup using SoftEnv or Modules. Or perhaps better, we would need to create two add-on packages, similar to the JobManager add-on packages for PBS and other resource managers, to implement jobmanager-env-setup using SoftEnv or Modules. The critical element is that the setup must happen before the user-specified program is executed, so that the environment variables defined when the program runs include whatever is needed to use the selected software (@teragrid-basic +globus +mpich-g2 for example).
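As a rough illustration of the jobmanager-env-setup idea described above, here is a minimal sketch in Python. It is not part of GT or of either tool: the function names and the BACKENDS table are hypothetical, and it assumes that each entry in the list can be passed as-is to the tool's interactive "add" command ("soft add" for SoftEnv, "module add" for Modules). The sketch only generates the shell lines that would be sourced before the user program; it does not run them.

```python
import re
import shlex
from typing import List

# Hypothetical command templates. "soft add" and "module add" are the
# interactive commands of SoftEnv and Modules; passing every list entry
# straight to them is an assumption of this sketch.
BACKENDS = {
    "softenv": "soft add {key}",
    "modules": "module add {key}",
}


def parse_software_environment(rsl: str) -> List[str]:
    """Return the ordered entries of (software_environment=(...)), or []."""
    match = re.search(
        r"\(\s*software_environment\s*=\s*\(([^()]*)\)\s*\)", rsl
    )
    if match is None:
        return []
    # Entries are whitespace-separated and their order is preserved.
    return match.group(1).split()


def env_setup_commands(keys: List[str], backend: str) -> List[str]:
    """Build one shell command per entry for the chosen backend."""
    template = BACKENDS[backend]
    # shlex.quote guards against shell metacharacters in user-supplied keys.
    return [template.format(key=shlex.quote(key)) for key in keys]


if __name__ == "__main__":
    rsl = "&(executable=a.out)(software_environment=(@teragrid-basic +globus +mpich-g2))"
    for line in env_setup_commands(parse_software_environment(rsl), "softenv"):
        print(line)
```

Whatever the generated lines are, a site would place them in the job script after the scheduler directives and before the user program, which is the ordering requirement stated above.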
This is a critical feature for many large sites around the world, not "just" TeraGrid.
Olle, What do you mean by your comment? Can you describe what feature you are talking about? Is it that you want integrated softenv like TG? JP wants the softenv command(s) to be included in the scheduler command file after the scheduler commands but before the user application is executed. Is that what you want? Also, JP mentioned that softenv could be used to locate the scheduler commands called in the perl scripts to submit/cancel the job. e.g. qsub, qdel, ... This is problematic for TG as they install new PBS versions and the JM is still pointing to the old ones. If the perl scripts used softenv, then the correct version would automatically be used. -Stu
Bumping the priority as TG would like this done by end of November to be included in the upcoming TG production release.
Just to be sure, this is NOT going to be added to the 4.0 branch. This is a feature addition, not a bug fix (I've changed the severity to "enhancement", as it should be). If a CVS snapshot is to be used, then it concerns me that a TG software release is relying on untested development software.
The TeraGrid isn't running stock GT 4.0.1. We already have a collection of Globus patches that fix TeraGrid issues, and improvements that address TeraGrid requirements, all of which will be tested on the TeraGrid before going into production.
I was asked to add a comment, so I'll do that in all haste:

What JP describes makes perfect sense: as part of the Globus-generated submission script, do a callout to set the appropriate environment variables so that executables, libraries and licenses can be found. Note that this also requires a relaxation, in that GRAM cannot look for, and expect to find, the executable beforehand. I can assist in testing the integration with Modules, as that is what is used at KTH.

Just to be clear: in addition to being something that is statically configured by the site admin, the client should also be allowed to trigger this in the GRAM request, e.g. "&(software=gaussian-8.1)(executable=gaussian)"

I am already thinking of pushing this one step further: allowing this "software module" directive not only to set environment variables but also to fill in / override any other element in the job description, such as the name of the executable to run. This would allow for a great deal of simplification on the client side: stage in the right set of input files, specify (software="myprog-1.0") and let the local sysadmin fill out the rest.

/Olle
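The "fill in / override" idea above could be sketched as follows. This is only an illustration under stated assumptions: the job description is modelled as a plain Python dict, SITE_SOFTWARE is a hypothetical table maintained by the local sysadmin, the paths in it are made up, and the precedence rule (site entries win over client entries, environment variables are merged) is one possible reading of the proposal rather than anything GRAM implements.

```python
from typing import Any, Dict

# Hypothetical site-side table: software name -> job description fields
# that the local sysadmin supplies. Paths are illustrative only.
SITE_SOFTWARE: Dict[str, Dict[str, Any]] = {
    "myprog-1.0": {
        "executable": "/opt/myprog/1.0/bin/myprog",
        "environment": {"MYPROG_HOME": "/opt/myprog/1.0"},
    },
}


def apply_software_module(
    job: Dict[str, Any], site_table: Dict[str, Dict[str, Any]]
) -> Dict[str, Any]:
    """Return a copy of `job` with the site's fields filled in or overridden."""
    resolved = dict(job)  # shallow copy; the client's request is left untouched
    name = resolved.pop("software", None)
    if name is None:
        return resolved  # no software directive, nothing to do
    if name not in site_table:
        raise KeyError("unknown software module: %s" % name)
    for field, value in site_table[name].items():
        if field == "environment":
            # Merge variables; site values win on conflicts (an assumption).
            merged = dict(resolved.get("environment", {}))
            merged.update(value)
            resolved["environment"] = merged
        else:
            resolved[field] = value  # fill in or override the client's value
    return resolved


if __name__ == "__main__":
    request = {"software": "myprog-1.0", "arguments": ["input.dat"]}
    print(apply_software_module(request, SITE_SOFTWARE))
```

With this shape, the client only stages input files and names the software; the executable path and environment come from the site table.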
Can somebody provide documentation links for Modules and SoftEnv? I can't seem to find anything with Google. Thanks.
Follow the SoftEnv links at http://www-unix.mcs.anl.gov/systems/software/msys/. SoftEnv is a tool developed by MCS Systems. I'm currently supporting it for the TeraGrid. If we talk directly about what to do and how to do it, it will save you some time. I'm available any time starting next week.
http://modules.sourceforge.net/

Typical commands:

[~] $ module avail gcc
gcc/3.2.1(default) gcc/2.95.3 gcc/3.3.1
[~] $ module add gcc
[~] $ gcc --version
gcc (GCC) 3.2.1
Copyright (C) 2002 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
[~] $ module add gcc/3.3.1
[~] $ gcc --version
gcc (GCC) 3.3.1
Copyright (C) 2003 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
[~] $
I added a new Bugzilla entry (bug #3967) specifically for the TeraGrid SoftEnv extensions. I'm not marking this as a dependency to this bug since it's tailored for SoftEnv/is not generic and won't be distributed to anyone but TeraGrid.
Reassigning to current GRAM developer to close/fix as appropriate.
We've implemented a softenv specific solution for TeraGrid (and anyone using softenv). I don't see a more general solution happening in 4.2. Moving to 4.4. We'll reevaluate there.
Doing some Bugzilla cleanup... Resolving old GRAM3 and GRAM4 issues that are no longer relevant since we've moved on to GRAM5. Also, we're now tracking issues in JIRA. Any new issues should be added here: http://jira.globus.org/secure/VersionBoard.jspa?selectedProjectId=10363