Bugzilla – Bug 3449
Last modified: 2005-06-22 17:16:29
During intense workflows, I occasionally find the following error message in my
container.log (see: http://griodine.uchicago.edu/ivdgl1/400x1x7200/run0004/):
2005-06-01 13:20:12,132 ERROR container.GSIServiceThread [Thread-80,process:120]
Error processing request
While the workflows run to successful completion, the above message may be an
indication of some untimely delays we are seeing. I cannot tell whether the
client was globusrun-ws or Condor-G, though. The Condor-G client logs are
encapsulated inside GridmanagerLog.voeckler. The globusrun-ws messages are
unfortunately distributed over */ID*.dbg.
The GT checkout is from last Friday.
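To line these errors up against the delayed jobs, the container.log entries can be pulled out with their timestamps. The following is only a minimal sketch: the line layout is assumed from the single excerpt quoted above, and the names HEADER and find_errors are hypothetical helpers, not part of GT or Condor-G.

```python
import re
from datetime import datetime

# Header layout assumed from the one excerpt quoted in this report:
# "<date> <time>,<ms> <LEVEL> <logger> [<thread>,<method>:<line>] <message>"
HEADER = re.compile(
    r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}),(\d{3}) "
    r"(\w+)\s+(\S+) \[([^\]]*)\]\s*(.*)$"
)


def find_errors(lines, needle="Error processing request"):
    """Return (timestamp, logger, thread_info) for each matching ERROR entry."""
    lines = list(lines)
    hits = []
    for i, line in enumerate(lines):
        m = HEADER.match(line.rstrip("\n"))
        if not m or m.group(3) != "ERROR":
            continue
        message = m.group(6)
        # The message may sit on the following line, as in the excerpt above.
        if not message and i + 1 < len(lines):
            message = lines[i + 1].strip()
        if needle in message:
            ts = datetime.strptime(m.group(1), "%Y-%m-%d %H:%M:%S")
            ts = ts.replace(microsecond=int(m.group(2)) * 1000)
            hits.append((ts, m.group(4), m.group(5)))
    return hits
```

Calling find_errors(open("container.log")) would give a list of timestamps that can be compared by hand against the submission and completion times of the slow jobs in the client logs.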
Should this bug's priority be escalated? If this is the reason for the
occasional client delay that Jens gets when running his workflow tests, then I
think it should be. The background is that ~1 out of 10 jobs takes > 5 minutes
to execute. For one of the jobs we were able to see the message delivered to
the client side (turning on log4j MessageLoggingHandler=DEBUG), but the
notification did not seem to be recognized/received by the client.
I understand from Peter that he is experimenting with a new Axis jar Jarek is
working on that might fix this issue (the jar basically fixes issues that occur
when a large number of notifications are sent). Both Peter and Ravi report
these exceptions with large numbers of notifications. Once I hear from Peter,
if there are more issues I'll look into it.
Rachana, what are you waiting to get from me? I thought I had said on the mud
that I had not seen this since the axis.jar update. Should this be reassigned?
I understand from Jarek that the jar has been committed to trunk and the 4.0
branch. If the jar update fixes it, we can close the bug. Jens, can you update
to the latest jar and confirm this fixes the issue for you?
Btw, we were debugging this problem with Peter. I am almost certain it is
caused by a timeout exception on the other side of the connection. However, I
was not able to pinpoint exactly why the timeouts happened in the first
place. This still might need to be debugged further.
I was given to understand that the axis update (and hence the fixes to delays
in notifications) fixed this issue. I am reopening the bug, but is this
something to investigate with core/notifications or the security framework?
I do not know where the problem is (if there is one really) but I don't think
it's a security-specific issue. This error might be perfectly ok... just needs
to be investigated more to understand the real issue. The axis update might
have just hidden the issue a bit more.