Bugzilla – Bug 3798
Massive memory leaks in GRAM client library
Last modified: 2006-04-07 16:35:09
Dear all, I have found a number of memory leaks in the GRAM client. To be sure, I checked again using the GRAM client tests in the Globus Toolkit 4.0.1 source code. During the tests, three additional problems arose. Two of the tests, namely globus_gram_client_stdio_size_test and globus_gram_client_two_phase_commit_test, fail, reporting a problem in the Globus thread system. Another one, globus_gram_client_register_status_test, simply hangs. As a result, I was not able to report leaks for the functions tested by these three programs (even though I know there is a leak in globus_gram_client_register_job_status(), etc.). I am attaching a tar.gz file containing leak reports for each of the tests, together with the thread problems. I am using the latest GT 4.0.1 on Mac OS X 10.4.2 with gcc 4.0.0 and the latest GNU autotools. The code is compiled using the gcc32dbgpthr flavor. Best regards, Massimo Cafaro
Created an attachment (id=710): archive containing leak reports and thread problems
Do any of the leaks affect a real application you are running? Most of those test programs are very short-lived and are used to test some particular feature of the GRAM implementation, so the leaks aren't terribly important. joe
Subject: Re: Massive memory leaks in GRAM client library (In reply to the comment from bester@mcs.anl.gov, 2005-10-05 15:41.) Unfortunately, all of the leaks affect a server application. The server is multi-threaded, so even very small leaks affect its overall reliability and stability. I need many GRAM features, including submitting jobs, checking job status, refreshing job credentials, signaling the gatekeeper, canceling jobs, pinging remote gatekeepers, etc. For each of these capabilities, I need both the synchronous and the asynchronous version. In practice, I am using almost all of the GRAM client APIs, so I really need to get rid of the many leaks that appear in almost all of them. Consider that even activating the GRAM client module produces a leak: the corresponding deactivation call does not free a globus_io_secure_authorization_data_t variable that is malloced by the call to globus_io_secure_authorization_data_initialize() made during GRAM client module activation! Besides the leaks, what about the thread problems I am experiencing? Massimo
-- Massimo Cafaro, Ph.D., Assistant Professor, University of Lecce, Italy, Faculty of Engineering, Via per Monteroni, 73100 Lecce, Italy. Center for Advanced Computational Technologies (CACT), National Nanotechnology Laboratory (NNL) of INFM, SPACI Consortium. Voice +39 0832 297371, Fax +39 0832 297279. Web http://sara.unile.it/~cafaro E-mail massimo.cafaro@unile.it cafaro@cacr.caltech.edu
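The activation leak reported in this comment follows a common pattern: module activation allocates an attribute through an "initialize" call, and deactivation never calls the matching destroy. The following is a minimal sketch of the paired version, not the actual GRAM client code. The type auth_data_t and every function name in it are hypothetical stand-ins that only mimic the shape of an opaque Globus attribute, and the allocation counter exists purely to make the leak observable.

```c
#include <stdlib.h>

/* Hypothetical stand-in for an opaque attribute type; NOT the real
 * globus_io_secure_authorization_data_t. */
typedef struct auth_data_s { char *identity; } *auth_data_t;

static auth_data_t module_auth_data = NULL; /* owned by the module */
static int live_allocations = 0;            /* crude leak counter for the sketch */

/* Stand-in "initialize": allocates on behalf of the caller. */
static int auth_data_initialize(auth_data_t *data)
{
    *data = calloc(1, sizeof(**data));
    if (*data == NULL)
        return -1;
    live_allocations++;
    return 0;
}

/* Stand-in "destroy": releases everything initialize allocated. */
static void auth_data_destroy(auth_data_t *data)
{
    if (*data != NULL) {
        free((*data)->identity);
        free(*data);
        *data = NULL;
        live_allocations--;
    }
}

/* Activation allocates the attribute once. */
int module_activate(void)
{
    if (module_auth_data != NULL)
        return 0; /* already active, nothing new to allocate */
    return auth_data_initialize(&module_auth_data);
}

/* Deactivation must undo what activation did; omitting the destroy
 * call here is the kind of leak described in the comment. */
int module_deactivate(void)
{
    auth_data_destroy(&module_auth_data);
    return 0;
}

/* Number of attributes still allocated; 0 means nothing leaked. */
int module_live_allocations(void)
{
    return live_allocations;
}
```

After a module_activate() / module_deactivate() pair, module_live_allocations() should be back at zero. A leak checker run against the real library performs the same bookkeeping on the underlying malloc and free calls.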
Subject: Re: Massive memory leaks in GRAM client library (Following up on my comment of 2005-10-05 17:40.) Hi, still no reply to my mail... Are you removing the leaks in the GRAM Client library? Any news about the thread problems? Any news about the hanging problems? Massimo
Massimo, Joe's wife just had a baby, so he has been unable to take a look at this. Mike Link is going to step in and take a look. Is this a new service that you are writing? How come these leaks have not shown up before? -Stu
Subject: Re: Massive memory leaks in GRAM client library (In reply to the comment from smartin@mcs.anl.gov, 2005-10-11 08:04, which also changed AssignedTo from bester@mcs.anl.gov to mlink@mcs.anl.gov.) Congratulations to Joe. I understand perfectly what this means, since my second baby was born two months ago. The server I am writing is the back end of a new version of the Grid Resource Broker portal. As such, it will be used by some European grids, including SPACI and SEPAC. The leaks appeared after I switched to the latest GT 4.0.1. Before releasing software for production I always make sure that there are no leaks, etc. When testing the server I experienced leaks in almost all of the GRAM client function calls, both synchronous and asynchronous. To make sure that the leaks were not due to my code, I then decided to use the GRAM client tests that ship with the GRAM client source code. I also carefully checked the tests' source code to be sure that there were no obvious bugs (such as forgetting to free callback contacts or forgetting to destroy attributes, mutexes, condition variables, etc.). I was rather surprised to see such a massive number of memory leaks. They span practically all of the GRAM client APIs. But the really strange thing is that some of the tests simply hang and some fail due to thread problems. Why are Globus threads not working correctly on Mac OS X 10.4.2? To the best of my knowledge, pthreads work perfectly on OS X (at least in all of the pthreads applications I use or develop). Thanks for looking at these problems. This is really important to me, since leaks are a perfect recipe for disaster in multi-threaded servers like the one I am developing. I am particularly curious about the hanging and threading problems. Please let me know the related details. At least C/C++ support is not dead in favour of Java ;-) Cheers, Massimo
I've fixed the leaks and committed the fixes to CVS; the attached patch can also be applied to the 4.0.1 source installer. The thread problems were just typo bugs in the test apps; there is nothing wrong with the libraries thread-wise. The only problem I see is failures in the status and register-status tests, which seem to be MacOS-specific and happen with both threaded and non-threaded builds. These tests are fine on Linux. I'll need to look into this further. Are you having any specific problems with the status API, or were those problems noted just from running the tests while checking for leaks? Mike
Created an attachment (id=722): patch to 4.0.1 source installer
Subject: Re: Massive memory leaks in GRAM client library (In reply to the comment from mlink@mcs.anl.gov, 2005-10-20 11:58.) Mike, thank you for fixing the leaks. I will patch my Globus installation and check my source code again. From time to time I see the following problem with the status APIs: the return value of the status call is GLOBUS_SUCCESS, but the status returned is 0, a value that does not correspond to a valid status. Massimo
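Until the cause of that symptom is understood, a caller could guard against it by treating a successful return code with an unknown status as unusable. The following is a minimal sketch under stated assumptions: the JOB_STATE_* constants are local stand-ins whose values are assumed for illustration and should be replaced by the constants from the real GRAM protocol header, and CALL_SUCCESS stands in for GLOBUS_SUCCESS. Neither helper function is part of the GRAM client API.

```c
/* Local stand-ins for the GRAM job states. The numeric values are
 * assumptions for this sketch; use the real protocol header in practice. */
enum {
    JOB_STATE_PENDING     = 1,
    JOB_STATE_ACTIVE      = 2,
    JOB_STATE_FAILED      = 4,
    JOB_STATE_DONE        = 8,
    JOB_STATE_SUSPENDED   = 16,
    JOB_STATE_UNSUBMITTED = 32,
    JOB_STATE_STAGE_IN    = 64,
    JOB_STATE_STAGE_OUT   = 128
};

#define CALL_SUCCESS 0 /* stand-in for GLOBUS_SUCCESS */

/* Returns 1 if status is exactly one of the known job states, else 0. */
int job_status_is_valid(int status)
{
    switch (status) {
    case JOB_STATE_PENDING:
    case JOB_STATE_ACTIVE:
    case JOB_STATE_FAILED:
    case JOB_STATE_DONE:
    case JOB_STATE_SUSPENDED:
    case JOB_STATE_UNSUBMITTED:
    case JOB_STATE_STAGE_IN:
    case JOB_STATE_STAGE_OUT:
        return 1;
    default:
        return 0; /* includes the spurious 0 described in the comment */
    }
}

/* A reply is usable only if the call succeeded AND the status is known. */
int status_reply_is_usable(int rc, int status)
{
    return rc == CALL_SUCCESS && job_status_is_valid(status);
}
```

A caller would pass the return code and the status filled in by the status call to status_reply_is_usable(), and retry or log when it returns 0. This hides the symptom from the application; it does not explain why a 0 status is produced.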
Subject: Re: Massive memory leaks in GRAM client library (Following up on my comment of 2005-10-21 04:10.) Mike, I have patched the GT installer and tested the code. The leaks are gone, thank you again. Looking at the source code, I noticed that the Globus headers never define #define _POSIX_C_SOURCE 200112L, which is required on Mac OS X in order to get correct POSIX behaviour from the threads APIs. For instance, threads waiting on a condition variable are not canceled as they are supposed to be unless you define the macro. Of course this has a strong impact on Globus threaded code. Both POSIX and non-POSIX behaviours are available as of Tiger (Mac OS 10.4). If you are running on a newer MacOS X (e.g. you are running Tiger/Cambridge Update), then before including the headers, simply do a: #define _POSIX_C_SOURCE 200112L to ask for POSIX standard behaviour from the threads APIs. This will also ask all the other uses of variant APIs in that compilation unit to have more standard POSIX behaviour, and it will prevent the prototypes and manifest constants for non-POSIX APIs from being visible, so you may need to break up your compilation units if you need to use any non-POSIX APIs. Thus, GT threaded code must define the macro to get correct POSIX behaviour on Mac OS 10.4 Tiger. Defining the macro should not cause problems on other Unix platforms. Cheers, Massimo
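The cancellation behaviour mentioned in this comment can be checked with a small program. The following is a minimal sketch, not taken from the Globus sources: it requests POSIX behaviour through the feature-test macro before any system header is included, parks a thread in pthread_cond_wait(), cancels it, and reports whether the join result is PTHREAD_CANCELED. The function name cancel_blocked_waiter and all variable names are made up for illustration.

```c
/* Must precede every system header so it can select the POSIX variants. */
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L
#endif

#include <pthread.h>
#include <stddef.h>

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  cond = PTHREAD_COND_INITIALIZER;
static int waiter_ready = 0; /* set once the waiter holds the lock */
static int never_true   = 0; /* predicate that is never satisfied */

/* Cleanup handler: a thread canceled inside pthread_cond_wait() wakes up
 * owning the mutex, so it has to release it on the way out. */
static void unlock_on_cancel(void *arg)
{
    pthread_mutex_unlock((pthread_mutex_t *)arg);
}

static void *waiter(void *unused)
{
    (void)unused;
    pthread_mutex_lock(&lock);
    pthread_cleanup_push(unlock_on_cancel, &lock);
    waiter_ready = 1;
    pthread_cond_broadcast(&cond);       /* let the canceling thread proceed */
    while (!never_true)
        pthread_cond_wait(&cond, &lock); /* cancellation point under POSIX */
    pthread_cleanup_pop(1);
    return NULL;
}

/* Returns 1 if the blocked waiter was canceled, 0 if it ended some other
 * way, and -1 if a pthread call failed. */
int cancel_blocked_waiter(void)
{
    pthread_t tid;
    void *result = NULL;

    if (pthread_create(&tid, NULL, waiter, NULL) != 0)
        return -1;

    /* Wait until the waiter has reached its condition wait. */
    pthread_mutex_lock(&lock);
    while (!waiter_ready)
        pthread_cond_wait(&cond, &lock);
    pthread_mutex_unlock(&lock);

    if (pthread_cancel(tid) != 0)
        return -1;
    if (pthread_join(tid, &result) != 0)
        return -1;
    return result == PTHREAD_CANCELED ? 1 : 0;
}
```

On a platform with conforming cancellation semantics, cancel_blocked_waiter() returns 1. According to the comment above, on Mac OS X 10.4 the waiter is not canceled unless the macro is defined, in which case the join would block instead of returning.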
Closing this bug as leaks fixed. Opened a new issue at http://bugzilla.globus.org/bugzilla/show_bug.cgi?id=4035 to investigate the viability of setting _POSIX_C_SOURCE as a remedy to MacOS thread issues.