Bug 6787

Summary: credential memory leak on incoming connections in XIO
Product: GSI C
Reporter: Jaime Frey <jfrey@cs.wisc.edu>
Component: Authentication
Assignee: Joe Bester <bester@mcs.anl.gov>
Status: RESOLVED FIXED
Severity: normal
CC: bester@mcs.anl.gov, kettimut@mcs.anl.gov
Priority: P3
Keywords: robustness
Version: 4.2.1
Target Milestone: 4.2.2
Hardware: All
OS: All

Description From 2009-06-30 15:33:22
We have found a memory leak in the XIO module when using the GRAM client and
GASS server EZ libraries. It appears that each time an incoming connection is
authenticated, the local credentials are read from disk but never freed. When
we eliminate all incoming connections, the memory leak vanishes.
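
To make the suspected pattern concrete, here is a minimal, self-contained C
sketch of a per-connection credential read that is never released. It is a toy
model and not Globus code: toy_cred_t, toy_cred_read, toy_cred_release,
toy_accept_connection and toy_live_blocks are hypothetical names introduced
only for illustration, and the allocation counter merely stands in for what a
leak checker would report.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a credential loaded from disk. */
typedef struct {
    char *cert_chain; /* stands in for the certificate chain data */
} toy_cred_t;

/* Number of tracked allocations that have not been freed yet. */
static size_t live_blocks = 0;

static void *tracked_malloc(size_t n) {
    void *p = malloc(n);
    if (p != NULL) {
        live_blocks++;
    }
    return p;
}

static void tracked_free(void *p) {
    if (p != NULL) {
        assert(live_blocks > 0);
        live_blocks--;
        free(p);
    }
}

/* Simulates reading the local credential: two allocations per call. */
toy_cred_t *toy_cred_read(void) {
    toy_cred_t *cred = tracked_malloc(sizeof *cred);
    if (cred == NULL) {
        return NULL;
    }
    cred->cert_chain = tracked_malloc(64);
    if (cred->cert_chain == NULL) {
        tracked_free(cred);
        return NULL;
    }
    return cred;
}

/* Frees everything toy_cred_read allocated. */
void toy_cred_release(toy_cred_t *cred) {
    if (cred == NULL) {
        return;
    }
    tracked_free(cred->cert_chain);
    tracked_free(cred);
}

/*
 * Simulates authenticating one incoming connection. With
 * release_after_use == 0 the credential is dropped without being
 * released, which models the leak described above (assumption: the
 * real leak has the same acquire-without-release shape).
 */
int toy_accept_connection(int release_after_use) {
    toy_cred_t *cred = toy_cred_read();
    if (cred == NULL) {
        return -1;
    }
    /* ... authentication using the credential would happen here ... */
    if (release_after_use) {
        toy_cred_release(cred);
    }
    return 0;
}

size_t toy_live_blocks(void) {
    return live_blocks;
}
```

In this model, calling toy_accept_connection(0) once per incoming connection
grows toy_live_blocks() by two each time, while calling it with a nonzero
argument leaves the count unchanged. That matches the observation that the
growth is proportional to the number of incoming connections.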

We are observing this inside the Condor gahp_server using Globus 4.2.1. The BNL
ATLAS group has gahp_server processes that grow to over 1GB in memory usage
over the course of a day.

Here is an excerpt from valgrind:

==26280== 282,699 (2,020 direct, 280,679 indirect) bytes in 101 blocks are definitely lost in loss record 1,384 of 2,000
==26280==    at 0x40053C0: malloc (vg_replace_malloc.c:149)
==26280==    by 0x8189E88: default_malloc_ex (mem.c:79)
==26280==    by 0x818A503: CRYPTO_malloc (mem.c:304)
==26280==    by 0x81A9C61: sk_new (stack.c:125)
==26280==    by 0x81A9C33: sk_new_null (stack.c:117)
==26280==    by 0x8131528: globus_gsi_cred_get_cert_chain (globus_gsi_cred_handle.c:993)
==26280==    by 0x811E57C: globus_i_gsi_gss_create_cred (globus_i_gsi_gss_utils.c:1628)
==26280==    by 0x811DF9B: globus_i_gsi_gss_cred_read (globus_i_gsi_gss_utils.c:1401)
==26280==    by 0x811326A: gss_acquire_cred (acquire_cred.c:138)
==26280==    by 0x811C1E4: globus_i_gsi_gss_create_and_fill_context (globus_i_gsi_gss_utils.c:364)
==26280==    by 0x8112314: gss_accept_sec_context (accept_sec_context.c:127)
==26280==    by 0x80DC927: globus_l_xio_gsi_read_token_cb (globus_xio_gsi_driver.c:1189)
==26280==    by 0x80C41CB: globus_l_xio_driver_op_read_kickout (globus_xio_driver.c:639)
==26280==    by 0x80D4AF1: globus_xio_driver_finished_read (globus_xio_pass.c:1236)
==26280==    by 0x81000C4: globus_l_xio_tcp_finish_read (globus_xio_tcp_driver.c:2249)
==26280==    by 0x810019E: globus_l_xio_tcp_system_read_cb (globus_xio_tcp_driver.c:2265)
==26280==    by 0x81076F5: globus_l_xio_system_kickout (globus_xio_system_select.c:919)
==26280==    by 0x814BD72: globus_callback_space_poll (globus_callback_nothreads.c:1437)
==26280==    by 0x809381B: main (gahp_server.c:1946)
------- Comment #1 From 2009-07-16 10:29:17 -------
Any update on this ticket? The BNL ATLAS group is keen on seeing a fix.
------- Comment #2 From 2009-07-30 14:09:15 -------
I think I already fixed this in the GRAM5 branch. See if this patch to gssapi
handles it for you:

http://www.mcs.anl.gov/~bester/patches/6787.diff
------- Comment #3 From 2009-07-30 17:37:40 -------
A quick test with valgrind indicates the memory leak is gone. I'll give a copy
of the gahp_server with the patch to the BNL folks to confirm the problem is
solved over a long-term execution.
------- Comment #4 From 2009-08-03 07:48:23 -------
I committed this patch to the 4.2 branch. Reopen this issue if testing shows a
problem.