Bugzilla – Bug 5930
Make SchedulerEventGenerator.run() (Java) more robust
Last modified: 2008-03-20 16:48:15
Currently the SchedulerEventGenerator thread stops processing if either restarting the c-SEG throws an IOException or the communication between the c-SEG and SchedulerEventGenerator fails with an IOException. No error is printed, and the run() method terminates silently. As a result, job status information is no longer available for jobs submitted to the corresponding LRM, and nothing about the problem appears in the container log. I think we ran into this with LEAD: they stopped receiving PBS job status information, and the PBS SEG thread was missing from the thread dump. I will make this more robust and add more logging so that problems here become more obvious.
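For illustration only, here is a minimal sketch of the kind of hardening described above. It is not the actual patch and does not use the real SchedulerEventGenerator code: the EventSource interface, the RobustEventLoop class, the retry limit, and the retry delay are all hypothetical names and parameters invented for this example. The idea is simply that an IOException from a restart or a read is logged and retried instead of ending run() silently, and that every exit path leaves a log message.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.logging.Level;
import java.util.logging.Logger;

/** Hypothetical stand-in for the channel to the c-SEG process (not the real API). */
interface EventSource {
    /** (Re)starts the underlying event generator process. */
    void restart() throws IOException;

    /** Returns the next event, or null once the source has ended in an orderly way. */
    String readEvent() throws IOException;
}

/** Hypothetical event loop that logs and retries instead of dying silently. */
class RobustEventLoop implements Runnable {
    private static final Logger LOG =
            Logger.getLogger(RobustEventLoop.class.getName());

    private final EventSource source;
    private final int maxConsecutiveFailures; // assumed policy: give up after this many
    private final long retryDelayMillis;      // assumed pause between restart attempts
    private final List<String> processed = new ArrayList<>();
    private int failures = 0;

    RobustEventLoop(EventSource source, int maxConsecutiveFailures, long retryDelayMillis) {
        this.source = source;
        this.maxConsecutiveFailures = maxConsecutiveFailures;
        this.retryDelayMillis = retryDelayMillis;
    }

    @Override
    public void run() {
        try {
            while (failures < maxConsecutiveFailures) {
                try {
                    source.restart();
                    String event;
                    while ((event = source.readEvent()) != null) {
                        processed.add(event); // placeholder for real event handling
                        failures = 0;         // progress was made, so reset the counter
                    }
                    LOG.info("Event source reported end of stream");
                    return;
                } catch (IOException e) {
                    failures++;
                    LOG.log(Level.WARNING, "Event source failed (" + failures + " of "
                            + maxConsecutiveFailures + " consecutive failures); restarting", e);
                }
                Thread.sleep(retryDelayMillis);
            }
            LOG.severe("Giving up after " + failures + " consecutive failures");
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt for the caller
            LOG.warning("Interrupted while waiting to retry");
        } catch (RuntimeException e) {
            LOG.log(Level.SEVERE, "Unexpected error in event loop", e);
        } finally {
            // Every exit path is logged, so a missing thread can be explained from the log.
            LOG.info("Event loop terminating after " + processed.size() + " events");
        }
    }

    List<String> processedEvents() {
        return processed;
    }

    int consecutiveFailures() {
        return failures;
    }
}
```

With a source that fails once on restart and once on read before delivering events, this loop still processes the events; with a source that always fails, it stops after the configured number of attempts and says so in the log. Whether a bounded retry count is the right policy for the real thread is an open design question; retrying indefinitely with a delay would be the obvious alternative.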