Bugzilla – Bug 4191
globusrun-ws job submission hangs
Last modified: 2006-10-03 09:05:03
Here is a report from Charles and Tony Vu for a TeraGrid install.
New kind of failure on an SDSC node. The job isn't even being run, unlike the UC and IU cases we saw.
The relevant line in the container log appears to be
2006-02-01 16:14:59,887 ERROR exec.RunQueue [Thread-20,run:162] Unable to process state
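To pull lines like this out of a long container log, something along these lines can help. This is only a sketch: the function name container_errors is made up for illustration, and the pattern assumes every log line starts with the same timestamp-then-level layout as the line above.

```shell
# Hypothetical helper: print only the ERROR-level lines from a container log.
# Assumes each line begins "YYYY-MM-DD HH:MM:SS,mmm LEVEL ..." as in the
# line quoted above; adjust the pattern if the log layout differs.
container_errors() {
    grep -E '^[0-9]{4}-[0-9]{2}-[0-9]{2} [0-9:,]+ ERROR ' "$1"
}
```

Usage would be `container_errors container.log`, with the path pointing at wherever the container log is written on that install.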
Begin forwarded message:
From: Tony Vu <email@example.com>
Date: February 1, 2006 6:24:31 PM CST
To: Charles Bacon <firstname.lastname@example.org>
Subject: Re: globusrun-ws job submission hangs
Yes.. same case if I try /tmp/touched_it.
I've attached a fresh container.log file that contains only the messages since the last globusrun-ws I performed after turning on debugging for GRAM and restarting. There are several "User Cancel" request messages in there, probably from previous validation jobs I submitted. I'm not sure how to clear those out of the container "queue".
Thanks for your help.
On Feb 1, 2006, at 4:08 PM, Charles Bacon wrote:
On Feb 1, 2006, at 5:57 PM, Tony Vu wrote:
Thanks for the reply, Charles. The file isn't being touched either actually..
tonyv@tg-login1:~> globusrun-ws -submit -F $CONTACT -c /bin/touch touched_it
Job ID: uuid:cb18a804-937d-11da-ab98-0007e9d81263
Termination time: 02/02/2006 23:52 GMT
Current job state: Unsubmitted
tonyv@tg-login1:~> ls -l ~tonyv/touched_it
ls: /users/tonyv/touched_it: No such file or directory
Created an attachment (id=838)
Part of the problem here is that the 4.0.1 code has some error reporting problems. The branch code does a better job of this. We should really get that community branch up and running so we can have the latest bug fixes included.
As for the actual problem, it could be anything. I had the same problem with OSG earlier and it was just a permissions problem with the $GLOBUS_LOCATION/tmp directory.
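One way to rule that out is to check that the directory exists and is writable by the account the container runs as. A minimal sketch, assuming it is run as that account; the function name check_writable_dir is made up for illustration and is not part of any Globus tooling:

```shell
# Hypothetical helper: report whether a directory exists and is writable
# by the current user. Run it as the account that runs the container,
# since -w tests access for whoever invokes the script.
check_writable_dir() {
    dir="$1"
    if [ ! -d "$dir" ]; then
        echo "missing: $dir"
        return 1
    fi
    if [ ! -w "$dir" ]; then
        echo "not writable: $dir"
        return 2
    fi
    echo "ok: $dir"
}
```

For the case described here, the call would be `check_writable_dir "$GLOBUS_LOCATION/tmp"`. Note that on NFS-mounted paths a passing check on one host does not guarantee the same result on another.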
Can you merge the updated ws-gram/service/java and ws-gram/utils/source/java into the globus_4_0_community branch? This is where we'll deliver the TG code from, so it would be good to have the better error reporting merged in.
Ok, those two ws-gram sub dirs are merged. Is there any reason we aren't merging the rest of them? I thought that was part of the plan with the community branch.
I'm led to believe this was resolved as an NFS issue. Resolving as INVALID
since it appears no code changes affected the outcome.