Bugzilla – Bug 3107
Fatal error: tcp_init(): globus_io_tcp_create_listener() failed
Last modified: 2005-04-28 11:19:27
Running:

globus-job-run lake.esc.cam.ac.uk/jobmanager-fork /bin/date

simple error output:

t0:p7121: Fatal error: tcp_init(): globus_io_tcp_create_listener() failed
/usr/local/globus/bin/globus-job-run: line 1: 7121 Abort trap /usr/local/globus/bin/globusrun -q -o -r "lake.esc.cam.ac.uk/jobmanager-fork" -f /tmp/globus_job_run.rbru03.rsl.7081

with the following additional environment variables:

export GLOBUS_ERROR_VERBOSE=1
export GLOBUS_TCP_PORT_RANGE=65000,65256
export GLOBUS_ERROR_OUTPUT=1

I get the following error:

globus_error_put():
globus_gsi_system_config.c:globus_i_gsi_sysconfig_check_certfile_unix:4815: File is not owned by current user: /etc/grid-security/hostcert.pem is not owned by current user

globus_error_put():
globus_gsi_system_config.c:globus_i_gsi_sysconfig_create_cert_string:417: Error with certificate filename
globus_gsi_system_config.c:globus_i_gsi_sysconfig_check_certfile_unix:4815: File is not owned by current user: /etc/grid-security/hostcert.pem is not owned by current user

globus_error_put():
globus_gsi_system_config.c:globus_gsi_sysconfig_get_host_cert_filename_unix:5405: Error with certificate filename
globus_gsi_system_config.c:globus_i_gsi_sysconfig_create_cert_string:417: Error with certificate filename
globus_gsi_system_config.c:globus_i_gsi_sysconfig_check_certfile_unix:4815: File is not owned by current user: /etc/grid-security/hostcert.pem is not owned by current user

globus_error_put():
globus_gsi_credential.c:globus_gsi_cred_read:443: Error reading host credential
globus_gsi_system_config.c:globus_gsi_sysconfig_get_host_cert_filename_unix:5405: Error with certificate filename
globus_gsi_system_config.c:globus_i_gsi_sysconfig_create_cert_string:417: Error with certificate filename
globus_gsi_system_config.c:globus_i_gsi_sysconfig_check_certfile_unix:4815: File is not owned by current user: /etc/grid-security/hostcert.pem is not owned by current user

globus_error_put():
globus_gsi_system_config.c:globus_gsi_sysconfig_dir_exists_unix:4550: File does not exist: /Users/rbru03/.globus/certificates is not a valid directory

globus_error_put():
globus_gsi_system_config.c:globus_i_gsi_sysconfig_create_cert_dir_string:367: Could not find a valid trusted CA certificates directory
globus_gsi_system_config.c:globus_gsi_sysconfig_dir_exists_unix:4550: File does not exist: /Users/rbru03/.globus/certificates is not a valid directory

globus_error_put():
globus_gsi_system_config.c:globus_i_gsi_sysconfig_file_exists_unix:4495: File is a directory: File: /etc/grid-security/certificates/.

globus_error_put():
globus_gsi_system_config.c:globus_i_gsi_sysconfig_file_exists_unix:4495: File is a directory: File: /etc/grid-security/certificates/..

globus_error_put():
globus_libc.c:globus_libc_getnameinfo:3292: System error: Unknown error: 0
globus_common: A system call failed: Unknown error: 0

globus_error_put():
globus_xio_tcp_driver.c:globus_l_xio_tcp_contact_string:961: globus_libc_addr_to_contact_string failed.
globus_libc.c:globus_libc_getnameinfo:3292: System error: Unknown error: 0
globus_common: A system call failed: Unknown error: 0

globus_error_put():
globus_xio_tcp_driver.c:globus_l_xio_tcp_server_init:1379: globus_l_xio_tcp_contact_string failed.
globus_xio_tcp_driver.c:globus_l_xio_tcp_contact_string:961: globus_libc_addr_to_contact_string failed.
globus_libc.c:globus_libc_getnameinfo:3292: System error: Unknown error: 0
globus_common: A system call failed: Unknown error: 0

globus_error_put():
globus_xio_tcp_driver.c:globus_l_xio_tcp_server_init:1379: globus_l_xio_tcp_contact_string failed.
globus_xio_tcp_driver.c:globus_l_xio_tcp_contact_string:961: globus_libc_addr_to_contact_string failed.
globus_libc.c:globus_libc_getnameinfo:3292: System error: Unknown error: 0
globus_common: A system call failed: Unknown error: 0

t0:p7435: Fatal error: tcp_init(): globus_io_tcp_create_listener() failed
/usr/local/globus/bin/globus-job-run: line 1: 7435 Abort trap /usr/local/globus/bin/globusrun -q -o -r "lake.esc.cam.ac.uk/jobmanager-fork" -f /tmp/globus_job_run.rbru03.rsl.7395
This is on an Apple, right? Can you send the output of sw_vers?
I'm not able to reproduce this on this machine:

$ sw_vers
ProductName: Mac OS X Server
ProductVersion: 10.3.8
BuildVersion: 7U16
$ uname -a
Darwin zoolander.mcs.anl.gov 7.8.0 Darwin Kernel Version 7.8.0: Wed Dec 22 14:26:17 PST 2004; root:xnu/xnu-517.11.1.obj~1/RELEASE_PPC Power Macintosh powerpc

There are some known issues with particular versions of MacOS having quirky getnameinfo implementations, and I believe this is what you're seeing. Without being able to reproduce the problem myself, I'll have a hard time working around their bug. My first advice would be to make sure you have all of the latest OS updates installed.
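One way to check whether getnameinfo misbehaves on a given machine, independently of Globus, is to repeat the same general sequence outside the toolkit: bind a TCP listener, read back its address, and pass that address to getnameinfo. The following is a minimal sketch under that assumption. It uses Python's standard socket module rather than the Globus code path, and the function name listener_contact_string is made up for illustration.

```python
import socket


def listener_contact_string(bind_addr="127.0.0.1", numeric=False):
    """Bind a throwaway TCP listener and resolve its address via getnameinfo.

    This is NOT the Globus implementation; it only exercises the same
    system call (getnameinfo) on the address of a listening socket.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.bind((bind_addr, 0))      # port 0: let the OS choose a free port
        s.listen(1)
        addr = s.getsockname()      # (ip, port) of the listener

        flags = socket.NI_NUMERICSERV   # keep the port numeric
        if numeric:
            flags |= socket.NI_NUMERICHOST  # skip the reverse lookup

        # Raises socket.gaierror if the underlying call reports a failure.
        host, port = socket.getnameinfo(addr, flags)
    return f"{host}:{port}"


if __name__ == "__main__":
    print("numeric :", listener_contact_string(numeric=True))
    print("resolved:", listener_contact_string(numeric=False))
```

If the numeric call succeeds but the resolved call raises an error or returns an unexpected name, that would point at name resolution on the host rather than at the listener itself.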
He replied via email:

Hi there,

Sorry for not replying to the previous email about sw_vers, but we have been having our computer room re-wired this week and so have not been able to turn the machine on.

The machine is called lagoon.esc.cam.ac.uk, and the output from sw_vers and uname -a is as follows:

-sh-2.05b# sw_vers
ProductName: Mac OS X Server
ProductVersion: 10.3.8
BuildVersion: 7U16
-sh-2.05b# uname -a
Darwin lagoon 7.8.0 Darwin Kernel Version 7.8.0: Wed Dec 22 14:26:17 PST 2004; root:xnu/xnu-517.11.1.obj~1/RELEASE_PPC Power Macintosh powerpc

When getting the reported error I had set the GLOBUS_HOSTNAME environment variable to lagoon.esc.cam.ac.uk, which I thought was used to get around problems caused by the machine thinking it is called something other than its FQDN. Is that right?

Running hostname returns 'lagoon' rather than the FQDN. Would this matter, especially considering that I am using the GLOBUS_HOSTNAME environment variable as above? Also, I would expect that simply having the hostname set wrongly would cause a more standard error message than the one I am getting. Is that a fair assumption, or could this problem be linked?

Any suggestions you can make would be greatly appreciated; the machine is already running the latest OS release. If you want, I can provide you with a username on the machine if this would help to debug the problem.

Thanks in advance for your help,
Rich
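To make the three names in question easy to compare side by side (the short hostname, the FQDN the resolver reports, and the GLOBUS_HOSTNAME override), a small report like the one below may help. This is only a sketch: hostname_report is a hypothetical helper, and it says nothing about how Globus itself consumes GLOBUS_HOSTNAME.

```python
import os
import socket


def hostname_report(short_name, fqdn, override=None):
    """Summarise the names a host reports (illustrative helper only)."""
    return {
        "hostname": short_name,
        "fqdn": fqdn,
        "override": override,
        # A bare name such as 'lagoon' contains no dot.
        "hostname_is_fqdn": "." in short_name,
        # True only when an override is set and equals the resolver's FQDN.
        "override_matches_fqdn": override is not None and override == fqdn,
    }


if __name__ == "__main__":
    report = hostname_report(
        socket.gethostname(),
        socket.getfqdn(),
        os.environ.get("GLOBUS_HOSTNAME"),
    )
    for key, value in report.items():
        print(f"{key}: {value}")
```

On the machine described above, one would expect hostname_is_fqdn to be false, since hostname returns 'lagoon'.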
Hello, Do you have any more ideas as to what I can try to get this working? Will I still experience this problem with the final release of GT4? Thanks Rich
4.0 does not have any fixes related to this; we'll release a patch if we're able to work around the problem. It might help to edit your /etc/hosts file and make sure the line with your FQDN doesn't have any aliases:

192.168.100.100 hostname.domain.com

instead of the more common

192.168.100.100 hostname.domain.com hostname
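The /etc/hosts check suggested above can be automated. The sketch below scans hosts-file text for lines that name the FQDN together with any other name; lines_with_aliases is a hypothetical helper, and the address and names in the usage example are the placeholders from the comment above, not real values.

```python
def lines_with_aliases(hosts_text, fqdn):
    """Return hosts-file lines that list `fqdn` alongside other names."""
    flagged = []
    for raw in hosts_text.splitlines():
        # Drop trailing comments, then split into address + names.
        fields = raw.split("#", 1)[0].split()
        names = fields[1:]
        # Flag the line if the FQDN shares it with at least one alias.
        if fqdn in names and len(names) > 1:
            flagged.append(raw.strip())
    return flagged


if __name__ == "__main__":
    sample = (
        "127.0.0.1 localhost\n"
        "192.168.100.100 hostname.domain.com hostname\n"
    )
    for line in lines_with_aliases(sample, "hostname.domain.com"):
        print("has aliases:", line)
```

To check a real machine, read the contents of /etc/hosts and pass them in along with the host's own FQDN.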