Bugzilla – Bug 5015
null pointer exception when submitting job to nightly build
Last modified: 2007-02-28 01:28:26
I get a null pointer exception, and shortly before it there is a warning:

2007-02-16T10:39:52,701-05:00 WARN factory.ManagedJobFactoryService4_0 [ServiceThread-25,createManagedJob:363] ComputingElement/any is null

I notice there is now a v4_2 ManagedJobFactoryService, but I'm just using the usual (4_0?) one. Anyway, I'll attach the container log with GRAM debug turned on.

I have tried submitting a job programmatically, from a remote 4.1.0 host using globusrun-ws, and locally using the nightly build's globusrun-ws program, and I get this exception with every method.

I have also been wondering: is the globusrun-ws command-line tool still the preferred one for job submission?

Thanks...
Adam
Created an attachment (id=1189) [details] container log w/ the null pointer exception in it
Created an attachment (id=1190) [details] the job description file
The warning

2007-02-16T10:39:52,701-05:00 WARN factory.ManagedJobFactoryService4_0 [ServiceThread-25,createManagedJob:363] ComputingElement/any is null

can be ignored; I'll change it to a debug statement soon.

v4_2/ManagedJobFactoryService is the Job Factory Service that accepts JSDL job descriptions, so you are right to use the usual one for jobs described in the WS-GRAM job description format.

For simple job submission, the globusrun-ws command is the preferred one. For concurrent job submissions, Condor-G can be used, but I'm not sure whether this works yet with a container from cvs HEAD.

Now, the null pointer exception: it is the extensions element in the job description that causes the problem. I guess you are submitting to Condor, since you use the should_transfer_files and when_to_transfer_output extensions? Otherwise you don't need them. I found that the problem seems to occur as soon as more than one extension element is specified in the job description document. We were not aware of this until now. I'll have a look at it, but I can't tell you right now when. Maybe you can avoid using them until we have fixed that?

Thanks for reporting this bug!
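To check whether a given job description would hit the condition described above (more than one element inside the extensions block), a small script along these lines may help. This is only a sketch: the sample document is hypothetical and is not the attached job description file, its values (YES, ON_EXIT) are illustrative, and the helper names `local_name` and `extension_names` are made up for this example. It assumes the extensions appear as child elements of an `extensions` element and ignores XML namespaces.

```python
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip any '{namespace}' prefix from an ElementTree tag."""
    return tag.rsplit("}", 1)[-1]


def extension_names(job_xml):
    """Return the local names of all elements directly inside <extensions>."""
    root = ET.fromstring(job_xml)
    names = []
    for elem in root.iter():
        if local_name(elem.tag) == "extensions":
            names.extend(local_name(child.tag) for child in elem)
    return names


# Hypothetical job description, for illustration only (assumed layout).
SAMPLE = """
<job>
  <executable>/bin/hostname</executable>
  <extensions>
    <should_transfer_files>YES</should_transfer_files>
    <when_to_transfer_output>ON_EXIT</when_to_transfer_output>
  </extensions>
</job>
"""

names = extension_names(SAMPLE)
if len(names) > 1:
    # This is the situation reported to trigger the exception.
    print("more than one extension element:", ", ".join(names))
```

A description that reports zero or one extension element should not be affected by the problem as described here.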
Unfortunately, we rely on the extensions block for submitting to Condor, PBS, etc., but thank you for the other information. Concurrent job submission with Condor-G interests me, and I would like to know whether there are any docs on what you are referring to. Finally, if there is any documentation describing the advantages of switching to JSDL, perhaps I could start future-proofing us.
Adam, I fixed that NullPointerException a few minutes ago. The warning in your initial description is now a debug statement as well. If you want to test it, get WS-GRAM from cvs HEAD (cvs co ws-gram). Unfortunately, you will have to recreate your database for the WS-GRAM persistence data once more: some datatypes were wrong for MySQL and PostgreSQL, which caused serialization problems.

Regarding Condor-G: we use it as a client to test WS-GRAM with large, concurrent job submissions, and we advise using it for this scenario. We wrote up some of the experience we gained during these tests in a recommendation paper. I'll send it to you via email; see if it's useful for you.

Regarding JSDL: from my point of view there is no real advantage, but some communities want it, and it is a standard job description language, so I guess it's a good thing to support it.

I'll mark this bug as fixed and close it. Don't hesitate to open new ones if you find things that don't work; we're glad to have people like you who point us to things we had not found so far.