Bugzilla – Bug 5744
Add a default factory
Last modified: 2008-02-06 14:25:30
Right now, globusrun-ws assumes that you're submitting to a Fork jobmanager. In GT2, the admin could point the default jobmanager at whatever backend jobmanager they wanted. So, if I wanted the default job submission to my site to go to ManagedFork (or PBS, or Condor, ...), I could make that happen. In GT4, the client needs to know what kind of scheduler is behind the GRAM service so they can use the -Ft option. This breaks the fundamental abstraction of GRAM. I propose we make globusrun-ws submit to the "Default" factory, which will be admin-configurable to point to Fork or PBS or ... The mapping of Default should also show up as an RP: for example, at this site, Default means "LSF".
This can be accomplished very easily; in fact, it's just a few lines of code. ManagedJobFactoryHome must override ResourceHomeImpl.find():

    public Resource find(ResourceKey key) throws ResourceException {
        if (key == null) {
            // Set key to the configured default.
            // For illustration purposes, a hardcoded default:
            key = ManagedJobHelper.getResourceKey("PBS");
        }
        return super.find(key);
    }

With this we don't even need a new factory type "Default". All a client must do is specify just the address in the factory EPR and omit the reference properties/parameters (which are optional according to the ws-addressing specification).

A job like the following would then be submitted to the default local resource manager:

    <job>
        <factoryEndpoint
            xmlns:gram="http://www.globus.org/namespaces/2004/10/gram/job"
            xmlns:wsa="http://schemas.xmlsoap.org/ws/2004/03/addressing">
            <wsa:Address>
                https://host:port/wsrf/services/ManagedJobFactoryService
            </wsa:Address>
        </factoryEndpoint>
        <executable>/bin/date</executable>
    </job>

ManagedJobFactoryHome.find() is the only piece of code that enables this. All invocations (job submissions, factory resource property queries, etc.) simply get redirected to the configured default factory type when the client specifies only the address in the factory EPR. So there's no need for a new factory resource "Default".

Also, the new RP "defaultLocalResourceManager" is not necessarily needed, because we get redirected to the default factory resource here too if no factory type is specified in a factory RP query:

    wsrf-get-property \
        -s https://host:port/wsrf/services/ManagedJobFactoryService \
        {http://www.globus.org/namespaces/2004/10/gram/job}localResourceManager

gives as response:

    <ns1:localResourceManager
        xmlns:ns1="http://www.globus.org/namespaces/2004/10/gram/job">
        PBS
    </ns1:localResourceManager>

I think it would be good to offer a factory RP "availableLocalResourceManagers" instead that gives a list of all configured LRMs.
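For anyone who wants to try the lookup logic outside a container, here is a minimal standalone sketch of the null-key fallback with a configurable (rather than hardcoded) default. It is not the GRAM code: DefaultingFactoryHome, register(), and the string-valued keys and resources are hypothetical stand-ins for ManagedJobFactoryHome, ResourceKey, and Resource, and the list method only illustrates what an "availableLocalResourceManagers" RP might report.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a factory home; none of these names are GT4 API.
class DefaultingFactoryHome {
    // Maps an LRM name (e.g. "Fork", "PBS") to its factory resource.
    private final Map<String, String> factories = new LinkedHashMap<>();
    private final String defaultLrm;

    DefaultingFactoryHome(String defaultLrm) {
        this.defaultLrm = defaultLrm;
    }

    void register(String lrm, String factoryResource) {
        factories.put(lrm, factoryResource);
    }

    // A null key models an EPR that carries only the address:
    // the lookup is redirected to the configured default.
    String find(String key) {
        String effective = (key == null) ? defaultLrm : key;
        String resource = factories.get(effective);
        if (resource == null) {
            throw new IllegalArgumentException("No factory resource for: " + effective);
        }
        return resource;
    }

    // Illustrates the content of a possible "availableLocalResourceManagers" RP.
    List<String> availableLocalResourceManagers() {
        return new ArrayList<>(factories.keySet());
    }

    public static void main(String[] args) {
        DefaultingFactoryHome home = new DefaultingFactoryHome("PBS");
        home.register("Fork", "fork-factory-resource");
        home.register("PBS", "pbs-factory-resource");
        System.out.println(home.find(null));    // address-only EPR
        System.out.println(home.find("Fork"));  // explicit factory type
        System.out.println(home.availableLocalResourceManagers());
    }
}
```

Because the fallback happens in the lookup itself, every caller that passes no key (job submission or RP query) is redirected the same way, which is why no separate "Default" resource is needed.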
AFAIK this change could also be applied to the 4.0 branch. Note however that all tests, GramJob, and globusrun-ws must be updated for this. So I'm not sure if we should have this in 4.0.6.
Very Cool! I think adding this for 4.2 is probably good enough. I like the idea of a list of available resource managers too. This is available now from the container's registry service. But I think it would be good to provide this in GRAM and not require that dependency on the registry service for this information.
committed to branch ws-gram-bug5247 which will be merged to trunk.
*** Bug 3829 has been marked as a duplicate of this bug. ***
Committed the following change to ws-gram-bug5247: A client does not need to provide reference parameters for multi-jobs. Regardless of what is configured as the default local resource manager, multi-jobs will always go to Multi. The sub-jobs don't need to specify reference parameters, but can. Also: an admin must not configure Multi as the default factory. Doing so results in an error message at container startup, and in an exception and error message for each submitted job.
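The routing rules in this comment can be sketched roughly as follows. This is a standalone illustration, not the committed change: DefaultLrmPolicy, validateDefault(), and route() are hypothetical names, and the sketch assumes that only jobs relying on the default are rejected when Multi is misconfigured as the default.

```java
// Hypothetical sketch of the routing rules; not the code on ws-gram-bug5247.
class DefaultLrmPolicy {
    static final String MULTI = "Multi";

    // Rejects a missing default and Multi as default.
    // A container could call this once at startup to log the error.
    static String validateDefault(String configuredDefault) {
        if (configuredDefault == null || configuredDefault.trim().isEmpty()) {
            throw new IllegalStateException("No default local resource manager configured");
        }
        if (MULTI.equals(configuredDefault)) {
            throw new IllegalStateException(
                "Multi must not be configured as the default local resource manager");
        }
        return configuredDefault;
    }

    // requestedLrm is null when the client sent no reference parameters.
    static String route(String requestedLrm, boolean isMultiJob, String configuredDefault) {
        if (isMultiJob) {
            return MULTI; // multi-jobs always go to Multi, whatever the default is
        }
        if (requestedLrm != null) {
            return requestedLrm; // sub-jobs and single jobs may still name an LRM
        }
        // Assumption: a bad default is detected again per job and raises an exception.
        return validateDefault(configuredDefault);
    }

    public static void main(String[] args) {
        System.out.println(route(null, true, "PBS"));
        System.out.println(route(null, false, "PBS"));
        System.out.println(route("Fork", false, "PBS"));
    }
}
```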