Bug 4111 - locking / synch problems
Summary: locking / synch problems
Status: RESOLVED FIXED
Product: GRAM
Component: wsrf managed execution job service
Version: unspecified
Platform: PC Windows XP
Importance: P3 critical
Target Milestone: 4.0.2
Depends on: 3840 3959
 
Reported: 2005-12-15 13:09
Modified: 2006-04-07 16:43


Description From 2005-12-15 13:09:59
I think that something isn't being woken up properly in the GRAM code in trunk
(or there might be a deadlock), causing timeouts:

http://viz-login.isi.edu/gram-testing/test-details.php?uuid=f1dad2fb-f5f0-437b-8bb6-7dec75846bfe

The container.log contains periodic JVM stack traces. Note that only one job
can be created at a time.

Also, the cause might be that .notify() is called before the waiting thread
enters .wait(), so the wakeup is lost. If that's the problem, using Semaphore
(from core) would fix it.
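
To illustrate the suspected lost wakeup, here is a minimal standalone sketch. It is not taken from the GRAM code: the class and method names are hypothetical, and it uses java.util.concurrent.Semaphore as a stand-in for the Semaphore class in core, which may have a different API. The point is only that a notify() issued before wait() is discarded, while a released semaphore permit is remembered.

```java
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;

// Hypothetical illustration only; none of these names come from GRAM.
class LostWakeupSketch {
    private static final Object LOCK = new Object();

    // Signals first, then waits. The notify() has no waiter, so it is
    // discarded and the later wait() can only end by timing out.
    // timeoutMillis must be > 0 (wait(0) would block forever).
    // Returns how long the wait() call blocked, in milliseconds.
    static long notifyThenWait(long timeoutMillis) {
        synchronized (LOCK) {
            LOCK.notify(); // nobody is waiting yet: the signal is lost
        }
        long start = System.nanoTime();
        synchronized (LOCK) {
            try {
                LOCK.wait(timeoutMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return (System.nanoTime() - start) / 1_000_000;
    }

    // Same ordering with a semaphore: release() stores a permit, so the
    // later tryAcquire() succeeds immediately instead of timing out.
    static boolean releaseThenAcquire(long timeoutMillis) {
        Semaphore signal = new Semaphore(0);
        signal.release(); // permit is kept until someone acquires it
        try {
            return signal.tryAcquire(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("wait() blocked for ms: " + notifyThenWait(200));
        System.out.println("semaphore acquired: " + releaseThenAcquire(200));
    }
}
```

With bare wait/notify, the usual alternative is to guard the wait with a condition flag checked in a loop; a semaphore gets the same effect by counting signals.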
------- Comment #1 From 2005-12-15 17:41:20 -------
I can't reproduce this with throughput tests running simple job submissions.
Staging jobs are hanging, but according to my own thread dumps it looks like RFT
is hanging. I'll try this again once Ravi's related bugs (#3840 and #3959) are
resolved.
------- Comment #2 From 2006-03-30 09:24:57 -------
All the throughput tests have been succeeding for a while now. This problem
seems to have been resolved.