Bugzilla – Bug 3921
getaddrinfo() Segmentation faults in SuSE IA64 or AMD64 machine.
Last modified: 2007-02-26 19:46:47
getaddrinfo() Segmentation faults in SUSE IA64 or AMD64 machine.

I get a segmentation fault in getaddrinfo() in the following case:

- GT       : GT4.0.1 (with PreWS MDS)
- flavor   : gcc64dbgpthr
- compiler : gcc (GCC) 3.2.2
- CPU      : AMD Opteron x2 machine, Itanium 2 x4 machine
- OS       : SuSE Linux Enterprise Server 8 SP (Linux 2.4.21-251-smp, glibc-2.2.5-213) (bind9-utils-9.2.2-92, ...)
- other1   : Link the libldap_gcc64dbgpthr.so.2 library.
- other2   : Specify a host that is not registered in /etc/hosts, so that a DNS lookup is used.

getaddrinfo(3) (via globus_libc_getaddrinfo()) is called from Globus XIO and other modules of the Globus Toolkit. Thus, our middleware also hits a segmentation fault because of this problem.

I did not get the segmentation fault when I did any one of the following:

- Replace the flavor with gcc64dbg.
- Replace GT4 with GT3.0.2.
- Remove globus_openldap from the globus-makefile-header options.
- Change the host to IA64 (Red Hat Linux Advanced Workstation release 2.1AW) or another x86 Linux host.

I attach two cases that trigger the bug, with their stack traces. One is a short program. The other is a program using GT.

------------------------------------------------------------------------
$ cat getaddr-test.c
#include <stdlib.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netdb.h>

int
main(void)
{
    struct addrinfo hints, *ret = NULL;
    int result;

    memset((char *)&hints, 0, sizeof (hints));
    hints.ai_flags = 0;
    hints.ai_family = PF_INET;
    hints.ai_socktype = SOCK_STREAM;
    hints.ai_protocol = 0;

    result = getaddrinfo("www.globus.org", "80", &hints, &ret);
    printf("getaddrinfo() = %s\n", ((result == 0) ? "ok" : "err"));

    return result;
}
$ globus-version
4.0.1
$ gcc -g -m64 -o getaddr-test getaddr-test.c
$ ./getaddr-test
getaddrinfo() = ok
$ gcc -g -m64 -o getaddr-test getaddr-test.c -lbind -ldl -lpthread
$ ldd getaddr-test
        libbind.so.0 => /usr/lib64/libbind.so.0 (0x0000002a9566d000)
        libdl.so.2 => /lib64/libdl.so.2 (0x0000002a957c1000)
        libpthread.so.0 => /lib64/libpthread.so.0 (0x0000002a958c4000)
        libc.so.6 => /lib64/libc.so.6 (0x0000002a959e0000)
        /lib64/ld-linux-x86-64.so.2 => /lib64/ld-linux-x86-64.so.2 (0x0000002a95556000)
$ ./getaddr-test
getaddrinfo() = ok
$ gcc -g -m64 -o getaddr-test getaddr-test.c \
      -L/home/somewhere/gt4/lib -lldap_gcc64dbgpthr
$ ./getaddr-test
Segmentation fault (core dumped)
$ gdb getaddr-test core
GNU gdb 5.3
Copyright 2002 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "x86_64-suse-linux"...
Core was generated by `./getaddr-test'.
Program terminated with signal 11, Segmentation fault.
Reading symbols from /home/somewhere/gt4/lib/libldap_gcc64dbgpthr.so.2...done.
Loaded symbols for /home/somewhere/gt4/lib/libldap_gcc64dbgpthr.so.2
Reading symbols from /lib64/libc.so.6...done.
Loaded symbols for /lib64/libc.so.6
Reading symbols from /home/somewhere/gt4/lib/liblber_gcc64dbgpthr.so.2...done.
Loaded symbols for /home/somewhere/gt4/lib/liblber_gcc64dbgpthr.so.2
Reading symbols from /usr/lib64/libbind.so.0...done.
Loaded symbols for /usr/lib64/libbind.so.0
Reading symbols from /lib64/libdl.so.2...done.
Loaded symbols for /lib64/libdl.so.2
Reading symbols from /lib64/libpthread.so.0...done.
Loaded symbols for /lib64/libpthread.so.0
Reading symbols from /home/somewhere/gt4/lib/libsasl_gcc64dbgpthr.so.7...done.
Loaded symbols for /home/somewhere/gt4/lib/libsasl_gcc64dbgpthr.so.7
Reading symbols from /home/somewhere/gt4/lib/libssl_gcc64dbgpthr.so.0...done.
Loaded symbols for /home/somewhere/gt4/lib/libssl_gcc64dbgpthr.so.0
Reading symbols from /home/somewhere/gt4/lib/libcrypto_gcc64dbgpthr.so.0...done.
Loaded symbols for /home/somewhere/gt4/lib/libcrypto_gcc64dbgpthr.so.0
Reading symbols from /lib64/ld-linux-x86-64.so.2...done.
Loaded symbols for /lib64/ld-linux-x86-64.so.2
Reading symbols from /lib64/libcrypt.so.1...done.
Loaded symbols for /lib64/libcrypt.so.1
Reading symbols from /lib64/libnss_files.so.2...done.
Loaded symbols for /lib64/libnss_files.so.2
Reading symbols from /lib64/libnss_dns.so.2...done.
Loaded symbols for /lib64/libnss_dns.so.2
Reading symbols from /lib64/libresolv.so.2...done.
Loaded symbols for /lib64/libresolv.so.2
#0  0x0000002a95b23e1a in __res_nsend () from /usr/lib64/libbind.so.0
(gdb) where
#0  0x0000002a95b23e1a in __res_nsend () from /usr/lib64/libbind.so.0
#1  0x0000002a95b23239 in __res_nquery () from /usr/lib64/libbind.so.0
#2  0x0000002a95b23795 in __res_nquerydomain () from /usr/lib64/libbind.so.0
#3  0x0000002a95b23494 in __res_nsearch () from /usr/lib64/libbind.so.0
#4  0x0000002a9658a565 in _nss_dns_gethostbyname2_r () from /lib64/libnss_dns.so.2
#5  0x0000002a95879af7 in gethostbyname2_r@@GLIBC_2.2.5 () from /lib64/libc.so.6
#6  0x0000002a958598e4 in gaih_inet () from /lib64/libc.so.6
#7  0x0000002a9585a869 in getaddrinfo () from /lib64/libc.so.6
#8  0x00000000400006ae in main () at getaddr-test.c:20
#9  0x0000002a957ca087 in __libc_start_main () from /lib64/libc.so.6
#10 0x00000000400005ba in _start ()
(gdb)
------------------------------------------------------------------------

Below is the stack trace of the program using GT.
(The attached gt_gram_test.sh is slightly different from the one registered in Globus Bug 3917.)

------------------------------------------------------------------------
$ ./gt_gram_test.sh -f gcc64dbgpthr
...
$ ./gram_test my-gram-host.example.org
debug: ResourceManager : my-gram-host.example.org
debug: executing "/bin/hostname" by GRAM
debug: RSL : &(executable=/bin/hostname)(rsl_substitution = (GLOBUSRUN_GASS_URL https://my-client-host.example.org:46310 ) )(stderr = $(GLOBUSRUN_GASS_URL) # /dev/stderr )(stdout = $(GLOBUSRUN_GASS_URL) # /dev/stdout )
Segmentation fault (core dumped)
$ gdb gram_test core.18817
...
Core was generated by `./gram_test my-gram-host.example.org'.
Program terminated with signal 11, Segmentation fault.
...
#0  0x0000002a97c69e1a in __res_nsend () from /usr/lib64/libbind.so.0
(gdb) where
#0  0x0000002a97c69e1a in __res_nsend () from /usr/lib64/libbind.so.0
#1  0x0000002a97c69239 in __res_nquery () from /usr/lib64/libbind.so.0
#2  0x0000002a97c69795 in __res_nquerydomain () from /usr/lib64/libbind.so.0
#3  0x0000002a97c69494 in __res_nsearch () from /usr/lib64/libbind.so.0
#4  0x0000002a985c5565 in _nss_dns_gethostbyname2_r () from /lib64/libnss_dns.so.2
#5  0x0000002a97acdaf7 in gethostbyname2_r@@GLIBC_2.2.5 () from /lib64/libc.so.6
#6  0x0000002a97aad8e4 in gaih_inet () from /lib64/libc.so.6
#7  0x0000002a97aae869 in getaddrinfo () from /lib64/libc.so.6
#8  0x0000002a97074976 in globus_libc_getaddrinfo (
    node=0x40143810 "my-gram-host.example.org", service=0x401437f0 "2119",
    hints=0x7fbfffc5d0, res=0x7fbfffc608) at globus_libc.c:3184
#9  0x0000002a95c220cc in globus_l_xio_tcp_connect (op=0x40143500,
    handle=0x40143a30, attr=0x40143740,
    host=0x40143810 "my-gram-host.example.org", port=0x401437f0 "2119")
    at globus_xio_tcp_driver.c:1946
#10 0x0000002a95c22681 in globus_l_xio_tcp_open (contact_info=0x7fbfffc850,
    driver_link=0x0, driver_attr=0x40143740, op=0x40143500)
    at globus_xio_tcp_driver.c:2048
#11 0x0000002a95bea04f in globus_xio_driver_pass_open (in_op=0x40143500,
    contact_info=0x7fbfffc850, in_cb=0x2a95c003ce <globus_l_xio_gsi_open_cb>,
    in_user_arg=0x401438b0) at globus_xio_pass.c:88
#12 0x0000002a95c00fbc in globus_l_xio_gsi_open (contact_info=0x7fbfffc850,
    driver_link=0x0, driver_attr=0x401436e0, op=0x40143500)
    at globus_xio_gsi.c:1732
#13 0x0000002a95bea079 in globus_xio_driver_pass_open (in_op=0x40143500,
    contact_info=0x7fbfffc850,
    in_cb=0x2a95bd099e <globus_l_xio_open_close_callback>, in_user_arg=0x0)
    at globus_xio_pass.c:96
#14 0x0000002a95bd3d42 in globus_l_xio_register_open (op=0x40143500,
    contact_string=0x7fbfffc990 "my-gram-host.example.org:2119")
    at globus_xio_handle.c:1773
#15 0x0000002a95bd54c9 in globus_xio_register_open (handle=0x40143400,
    contact_string=0x7fbfffc990 "my-gram-host.example.org:2119",
    attr=0x40142f40, cb=0x2a95aaeef9 <globus_l_io_bounce_authz_cb>,
    user_arg=0x40142e00) at globus_xio_handle.c:2157
#16 0x0000002a95ab0acf in globus_l_io_tcp_register_connect (
    host=0x40142aa0 "my-gram-host.example.org", port=2119, socket=-1,
    attr=0x7fbfffcd50,
    callback=0x2a9587d419 <globus_l_gram_protocol_connect_callback>,
    callback_arg=0x40142d00, handle=0x40142ae0, blocking=0)
    at globus_io_xio_compat.c:2049
#17 0x0000002a95ab0be6 in globus_io_tcp_register_connect (
    host=0x40142aa0 "my-gram-host.example.org", port=2119,
    attr=0x7fbfffcd50,
    callback=0x2a9587d419 <globus_l_gram_protocol_connect_callback>,
    callback_arg=0x40142d00, handle=0x40142ae0)
    at globus_io_xio_compat.c:2092
#18 0x0000002a9587e60e in globus_l_gram_protocol_post (
    url=0x401424e0 "https://my-gram-host.example.org:2119/jobmanager",
    handle=0x7fbfffce08, attr=0x7fbfffcd50,
    message=0x40142890 "protocol-version: 2\r\njob-state-mask: 1048575\r\ncallback-url: https://my-client-host.example.org:46311/\r\nrsl: \"&(executable=/bin/hostname)(rsl_substitution = (GLOBUSRUN_GASS_URL https://my-client-host.example.org"...,
    message_size=308, keep_open=0, cred_handle=0x0, restriction_oids=0x0,
    restriction_buffers=0x0, req_flags=0, time_req=0,
    callback=0x2a95671fce <globus_l_gram_client_monitor_callback>,
    callback_arg=0x7fbfffcdc0) at globus_gram_protocol_io.c:2130
#19 0x0000002a9587cc35 in globus_gram_protocol_post (
    url=0x401424e0 "https://my-gram-host.example.org:2119/jobmanager",
    handle=0x7fbfffce08, attr=0x7fbfffcd50,
    message=0x40142890 "protocol-version: 2\r\njob-state-mask: 1048575\r\ncallback-url: https://my-client-host.example.org:46311/\r\nrsl: \"&(executable=/bin/hostname)(rsl_substitution = (GLOBUSRUN_GASS_URL https://my-client-host.example.org"...,
    message_size=308,
    callback=0x2a95671fce <globus_l_gram_client_monitor_callback>,
    callback_arg=0x7fbfffcdc0) at globus_gram_protocol_io.c:496
#20 0x0000002a95671c64 in globus_l_gram_client_job_request (
    resource_manager_contact=0x7fbffff3b2 "my-gram-host.example.org",
    description=0x7fbfffdec0 "&(executable=/bin/hostname)(rsl_substitution = (GLOBUSRUN_GASS_URL https://my-client-host.example.org:46310 ) )(stderr = $(GLOBUSRUN_GASS_URL) # /dev/stderr )(stdout = $(GLOBUSRUN_GASS_URL) # /dev/stdout )",
    job_state_mask=1048575, iattr=0x0,
    callback_contact=0x4013e430 "https://my-client-host.example.org:46311/",
    monitor=0x7fbfffcdc0) at globus_gram_client.c:2295
#21 0x0000002a956709a3 in globus_gram_client_job_request (
    resource_manager_contact=0x7fbffff3b2 "my-gram-host.example.org",
    description=0x7fbfffdec0 "&(executable=/bin/hostname)(rsl_substitution = (GLOBUSRUN_GASS_URL https://my-client-host.example.org:46310 ) )(stderr = $(GLOBUSRUN_GASS_URL) # /dev/stderr )(stdout = $(GLOBUSRUN_GASS_URL) # /dev/stdout )",
    job_state_mask=1048575,
    callback_contact=0x4013e430 "https://my-client-host.example.org:46311/",
    job_contact=0x7fbfffceb0) at globus_gram_client.c:900
#22 0x00000000400017de in invokeExecutable (jobState=0x7fbfffef00,
    gassServerUrl=0x4013ab70 "https://my-client-host.example.org:46310",
    remoteHostName=0x7fbffff3b2 "my-gram-host.example.org")
    at gram_test.c:190
#23 0x00000000400014ba in main (argc=2, argv=0x7fbffff038) at gram_test.c:80
#24 0x0000002a97a1e087 in __libc_start_main () from /lib64/libc.so.6
#25 0x000000004000132a in _start ()
(gdb)
------------------------------------------------------------------------

Thanks.
Created an attachment (id=753): gt_gram_test.sh
This is the program using GT.
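For anyone re-running the short test case from the original report, a slightly more verbose variant may make the non-crashing runs easier to compare: it reports the getaddrinfo() error text and the resolved IPv4 addresses, and releases the result list. This is only a sketch, not the reporter's program; try_resolve() is a hypothetical helper name, and the extra `flags` parameter is an assumption added so that the helper can also be exercised without DNS (AI_NUMERICHOST).

```c
#define _POSIX_C_SOURCE 200112L
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <netdb.h>

/*
 * try_resolve() is a hypothetical helper, not part of the original test.
 * It uses the same hints as getaddr-test.c (IPv4, SOCK_STREAM) plus
 * caller-supplied ai_flags, and returns the getaddrinfo() result code.
 */
int try_resolve(const char *node, const char *service, int flags)
{
    struct addrinfo hints, *res = NULL, *p;
    char text[INET_ADDRSTRLEN];
    int rc;

    memset(&hints, 0, sizeof(hints));
    hints.ai_flags = flags;
    hints.ai_family = AF_INET;
    hints.ai_socktype = SOCK_STREAM;

    rc = getaddrinfo(node, service, &hints, &res);
    if (rc != 0) {
        /* gai_strerror() turns the EAI_* code into readable text. */
        fprintf(stderr, "getaddrinfo(%s) failed: %s\n", node, gai_strerror(rc));
        return rc;
    }

    /* Walk the result list and print each IPv4 address. */
    for (p = res; p != NULL; p = p->ai_next) {
        struct sockaddr_in *sin = (struct sockaddr_in *)p->ai_addr;
        if (inet_ntop(AF_INET, &sin->sin_addr, text, sizeof(text)) != NULL)
            printf("%s -> %s\n", node, text);
    }

    freeaddrinfo(res);
    return 0;
}
```

Calling `try_resolve("www.globus.org", "80", 0)` from main() corresponds to the lookup in getaddr-test.c. Passing AI_NUMERICHOST with a literal address skips the DNS path entirely, which is the path where the crash was reported.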
The kernel is slightly different from the standard one: the SCore patch was applied.

IA64 machine  : Linux 2.4.21-score112-itanium2-smp
AMD64 machine : Linux 2.4.21-251-smp
Sorry, the OS name was incomplete. The correct value is:

- OS : SuSE Linux Enterprise Server 8 SP4
Do you still get this seg fault when you use globus-makefile-header properly? It seems you are using globus_openldap, which is part of MDS2 and deprecated, so you may not want that. I will let the MDS folks comment further.
Thanks for the response.

> do you still get this seg fault when you use
> globus-makefile-header properly?

The segmentation fault still occurs on Globus Toolkit 4.0.1. Does my script use globus-makefile-header incorrectly? In the simple case I don't use globus-makefile-header at all. I just load $GLOBUS_LOCATION/etc/globus-user-env.sh, link the ldap_gcc64dbgpthr library, and get the segmentation fault. Is there a mistake on my side? I think my usage is normal. I can report the globus-makefile-header output if desired.

> It seems you are using globus_openldap which
> is part of mds2 and deprecated, so you may not want that.
> I will let MDS folks comment further.

Yes, MDS2 is deprecated. But our software supports both MDS2 (Pre-WS MDS) and MDS4 (WS MDS). In a Grid environment there are various users and usages, and Grid software has to work across various sites, so there may well be users running MDS2 on GT4.

Anyway, this problem does not occur on GT3. (I tried Globus Toolkit 2.4.3 on the Opteron machine and Globus Toolkit 3.0.2 on the Itanium 2 machine.) This problem occurs only on GT4.
I have the same problem with MDS2 of GT 2.4.3.

- GT       : 2.4.3
- flavor   : gcc64dbgpthr
- compiler : gcc-3.2.2
- CPU      : AMD Opteron x 2
- OS       : SuSE SLES-8, Linux 2.4.21-292-smp
The OS on our machines has been updated, so I checked this problem again. The problem did not occur on SLES9.

- GT       : GT4.0.1 (with PreWS MDS)
- flavor   : gcc64dbgpthr
- compiler : gcc (GCC) 3.3.3 (SuSE Linux)
- CPU      : AMD Opteron x2 machine, Itanium 2 x4 machine
- OS       : SuSE Linux Enterprise Server 9 SP3
             Linux 2.6.5-7.257-smp x86_64
             Linux 2.6.5-7.257-default ia64
             bind-utils-9.2.3-76.19
             bind-utils-32bit-9-200512021458 (x86_64)
             bind-utils-x86-9-200512021455 (ia64)
Is this still a valid MDS issue? If this bug was filed against MDS because linking in the openldap library can trigger the segfault (along with the other conditions you describe), it is highly unlikely that the library will be updated to a newer version, since MDS 2.x support is now deprecated. Anyway, it doesn't seem to be a bug that can be isolated to the openldap library exclusively, so I'm closing it as an MDS issue.
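One detail visible in both stack traces is that the resolver frames (__res_nsend, __res_nquery, __res_nsearch) are in /usr/lib64/libbind.so.0, although /lib64/libresolv.so.2 is also loaded. Whether that is the cause is not established in this report, but checking which shared object a symbol is bound to in the failing process could help narrow it down. The following is a minimal sketch that assumes glibc's dlsym() and dladdr() extensions; symbol_owner() is a hypothetical helper name, and a lookup may legitimately return NULL for symbols that are not exported in the global scope.

```c
#define _GNU_SOURCE   /* needed for RTLD_DEFAULT and dladdr() on glibc */
#include <dlfcn.h>
#include <stdio.h>

/*
 * symbol_owner() is a hypothetical helper: it looks a symbol up in the
 * global scope of the running process and returns the path of the shared
 * object that provides it, or NULL if the symbol cannot be found.
 */
const char *symbol_owner(const char *symbol)
{
    Dl_info info;
    void *addr = dlsym(RTLD_DEFAULT, symbol);

    if (addr == NULL)
        return NULL;

    /* dladdr() returns 0 when the address matches no loaded object. */
    if (dladdr(addr, &info) == 0 || info.dli_fname == NULL)
        return NULL;

    return info.dli_fname;
}
```

For example, printing `symbol_owner("__res_nsend")` in a build linked the same way as the crashing test would show whether libbind or libresolv supplies that symbol. On older glibc versions the program has to be linked with -ldl.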