Bug 3449 - ERROR container.GSIServiceThread
Summary: ERROR container.GSIServiceThread
Status: REOPENED
Product: Java WS Security
Component: Authentication
Version: development
Platform: PC Linux
Importance: P3 minor
Target Milestone: ---
Reported: 2005-06-01 18:14 by
Modified: 2005-06-22 17:16 (History)


Description From 2005-06-01 18:14:00
During intense workflows, I occasionally find the following error message in my
container.log (see: http://griodine.uchicago.edu/ivdgl1/400x1x7200/run0004/):

2005-06-01 13:20:12,132 ERROR container.GSIServiceThread [Thread-80,process:120] Error processing request
java.io.EOFException
        at org.globus.gsi.gssapi.net.impl.GSIGssInputStream.readHandshakeToken(GSIGssInputStream.java:56)
        at org.globus.gsi.gssapi.net.impl.GSIGssSocket.readToken(GSIGssSocket.java:60)
        at org.globus.gsi.gssapi.net.GssSocket.authenticateServer(GssSocket.java:122)
        at org.globus.gsi.gssapi.net.GssSocket.startHandshake(GssSocket.java:142)
        at org.globus.gsi.gssapi.net.GssSocket.getOutputStream(GssSocket.java:160)
        at org.globus.wsrf.container.GSIServiceThread.process(GSIServiceThread.java:91)
        at org.globus.wsrf.container.ServiceThread.run(ServiceThread.java:300)


While the workflow runs to successful completion, the above message may be an
indication of some untimely delays we are seeing. I cannot tell whether the
client was globusrun-ws or Condor-G, though. The Condor-G client logs are
encapsulated inside GridmanagerLog.voeckler. The globusrun-ws messages are
unfortunately distributed over */ID*.dbg.

The GT checkout is from last Friday.
------- Comment #1 From 2005-06-08 10:01:34 -------
Should this bug's priority be escalated? If this is the reason for the
occasional client delay that Jens gets when running his workflow tests, then I
think it should be. The background is that about 1 out of 10 jobs takes more
than 5 minutes to execute. For one of the jobs we were able to see the message
delivered to the client side (by turning on log4j MessageLoggingHandler=DEBUG),
but the notification did not seem to be recognized/received by the client.

Thoughts?
------- Comment #2 From 2005-06-09 08:04:32 -------
I understand from Peter that he is experimenting with a new Axis jar Jarek is
working on that might fix this issue (the jar basically fixes issues that occur
when a large number of notifications are sent). Both Peter and Ravi report these
exceptions with large numbers of notifications. Once I hear from Peter, I'll
look into it if there are more issues.
------- Comment #3 From 2005-06-14 15:22:34 -------
Rachana, what are you waiting to get from me?  I thought I had said on the mud
that I had not seen this since the axis.jar update.  Should this be reassigned
to Jarek?
------- Comment #4 From 2005-06-14 15:34:53 -------
I understand from Jarek that the jar has been committed to trunk and the 4.0
branch. If the jar update fixes it, we can close the bug. Jens, can you update
to the latest jar and confirm that this fixes the issue for you?
------- Comment #5 From 2005-06-22 15:10:42 -------
Btw, we were debugging this problem with Peter. I am almost certain it is
caused by a timeout exception on the other side of the connection. However, I
was not able to pinpoint exactly why the timeouts happened in the first
place. This still might need to be debugged further.
------- Comment #6 From 2005-06-22 17:00:19 -------
I was given to understand that the axis update (and hence the fixes to delays
in notifications) fixed this issue. I am reopening the bug, but is this
something to investigate with core/notifications or the security framework?
------- Comment #7 From 2005-06-22 17:16:29 -------
I do not know where the problem is (if there really is one), but I don't think
it's a security-specific issue. This error might be perfectly OK; it just needs
to be investigated more to understand the real issue. The axis update might
have just hidden the issue a bit more.
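
For reference, the hypothesis from Comment #5 (the peer gives up and drops the
connection before sending its handshake token) can be reproduced with plain
sockets. The sketch below is not the GSI code path: the class name
HandshakeEofDemo, the readToken helper, and the length-prefixed framing are all
made up for illustration. It only shows that a server-side read on a connection
the client has already closed surfaces as java.io.EOFException, which would be
consistent with the trace in the description.

```java
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;

public class HandshakeEofDemo {

    /**
     * Reads one length-prefixed token. The framing is a hypothetical stand-in
     * for a handshake token reader, not the actual GSI wire format.
     */
    static byte[] readToken(DataInputStream in) throws IOException {
        int length = in.readInt();   // throws EOFException if the peer closed first
        byte[] token = new byte[length];
        in.readFully(token);         // throws EOFException on a truncated token
        return token;
    }

    /** Simulates a client that connects and then drops the connection without sending data. */
    static String readTokenFromDroppedPeer() throws IOException {
        try (ServerSocket server = new ServerSocket(0, 1, InetAddress.getLoopbackAddress())) {
            // The client connects and closes immediately, as it might after a timeout.
            try (Socket client = new Socket(server.getInetAddress(), server.getLocalPort())) {
                // Intentionally write nothing.
            }
            // The server only now gets around to the connection and expects a token.
            try (Socket accepted = server.accept();
                 DataInputStream in = new DataInputStream(accepted.getInputStream())) {
                readToken(in);
                return "token received";
            } catch (EOFException e) {
                return e.getClass().getName();
            }
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(readTokenFromDroppedPeer());
    }
}
```

If the same thing happens in the container, the server-side exception would be
a symptom rather than the cause, and the place to look would be whatever makes
the client abandon the connection.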