Bugzilla – Bug 1566
Mutex not released on error in globus_l_gram_protocol_post, causing a hang after job submission failure
Last modified: 2004-02-25 14:52:14
Globus version: 2.4
Flavour: gcc32dbgpthr on Linux/x86
Function: globus_l_gram_protocol_post (around lines 1961-2023)
In the function just mentioned, the mutex named globus_i_gram_protocol_mutex is
not released if an error occurs after the mutex has been locked. This has caused
problems for me when a job submission fails because of a wrong (nonexistent) host
name for the GRAM. Attempting to release the callback contact after that hangs,
because the mutex is still locked. The error occurs for flavor gcc32dbgpthr but not
for gcc32dbg. It does not occur if I use globusrun, most likely because the
version of globusrun I use is built with gcc32dbg.
Steps to reproduce error:
- Register a callback with globus_gram_client_callback_allow
- Try to submit a job to a nonexistent host via globus_gram_client_job_request
(returns with failure, expected)
- Try to deactivate the callback with globus_gram_client_callback_disallow
(hangs because the mutex is locked, not expected)
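The failure mode described above can be illustrated with a small stand-alone sketch. It does not use the Globus sources: demo_mutex, post_buggy, post_fixed and mutex_is_leaked are hypothetical names that stand in for globus_i_gram_protocol_mutex and the error path in globus_l_gram_protocol_post, and plain POSIX threads are assumed in place of the Globus thread wrappers. The check uses pthread_mutex_trylock so that the leaked lock is reported rather than causing the hang itself.

```c
#include <errno.h>
#include <pthread.h>

/* Hypothetical stand-in for globus_i_gram_protocol_mutex. */
static pthread_mutex_t demo_mutex = PTHREAD_MUTEX_INITIALIZER;

/* Buggy shape: an early return on error leaves the mutex locked. */
static int post_buggy(int fail)
{
    pthread_mutex_lock(&demo_mutex);
    if (fail)
    {
        return -1;              /* BUG: mutex is still held here */
    }
    /* ... normal work would go here ... */
    pthread_mutex_unlock(&demo_mutex);
    return 0;
}

/* Fixed shape: every path after the lock goes through one unlock. */
static int post_fixed(int fail)
{
    int rc = 0;

    pthread_mutex_lock(&demo_mutex);
    if (fail)
    {
        rc = -1;
        goto out;               /* error path still reaches the unlock */
    }
    /* ... normal work would go here ... */
out:
    pthread_mutex_unlock(&demo_mutex);
    return rc;
}

/* Returns 1 if the mutex is still held, 0 if it is free.
 * trylock is used so that this check cannot block. */
static int mutex_is_leaked(void)
{
    int rc = pthread_mutex_trylock(&demo_mutex);

    if (rc == 0)
    {
        pthread_mutex_unlock(&demo_mutex);
        return 0;
    }
    return rc == EBUSY;
}
```

After post_fixed(1) the mutex is free again, whereas after post_buggy(1) any later blocking pthread_mutex_lock on demo_mutex would wait forever, which matches the hang seen in globus_gram_client_callback_disallow. The difference between the two flavors is presumably because locking does nothing in a non-threaded build; that is an assumption, not something checked against the Globus sources.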
The source I have looked at is from the bundle
Created an attachment (id=319)
Patch for releasing the mutex on error in globus_l_gram_protocol_post
Suggested patch for releasing the mutex on error in
globus_l_gram_protocol_post. I have tested it in my program and it solves the
Thanks for the excellent bug report and patch. That is committed to our CVS and
will be in the 3.2 release (coming soon).
I've also added a test case to the gram protocol library test suite for this.