Bug 3107 - Fatal error: tcp_init(): globus_io_tcp_create_listener() failed
Summary: Fatal error: tcp_init(): globus_io_tcp_create_listener() failed
Status: ASSIGNED
Product: Toolkit Internals
Component: globus_common
Version: development
Platform: Macintosh MacOS
Importance: P3 normal
Target Milestone: ---
Reported: 2005-04-08 08:54
Modified: 2005-04-28 11:19


Description From 2005-04-08 08:54:05

    
------- Comment #1 From 2005-04-08 09:00:40 -------
Running: globus-job-run lake.esc.cam.ac.uk/jobmanager-fork /bin/date
produces this simple error output:

t0:p7121: Fatal error: tcp_init(): globus_io_tcp_create_listener() failed
/usr/local/globus/bin/globus-job-run: line 1:  7121 Abort trap             
/usr/local/globus/bin/globusrun -q -o -r "lake.esc.cam.ac.uk/jobmanager-fork" -f
/tmp/globus_job_run.rbru03.rsl.7081

With the following additional environment variables set:
export GLOBUS_ERROR_VERBOSE=1
export GLOBUS_TCP_PORT_RANGE=65000,65256
export GLOBUS_ERROR_OUTPUT=1

I get the following error:
globus_error_put():
globus_gsi_system_config.c:globus_i_gsi_sysconfig_check_certfile_unix:4815:
File is not owned by current user: /etc/grid-security/hostcert.pem is not owned
by current user

globus_error_put():
globus_gsi_system_config.c:globus_i_gsi_sysconfig_create_cert_string:417:
Error with certificate filename
globus_gsi_system_config.c:globus_i_gsi_sysconfig_check_certfile_unix:4815:
File is not owned by current user: /etc/grid-security/hostcert.pem is not owned
by current user

globus_error_put():
globus_gsi_system_config.c:globus_gsi_sysconfig_get_host_cert_filename_unix:5405:
Error with certificate filename
globus_gsi_system_config.c:globus_i_gsi_sysconfig_create_cert_string:417:
Error with certificate filename
globus_gsi_system_config.c:globus_i_gsi_sysconfig_check_certfile_unix:4815:
File is not owned by current user: /etc/grid-security/hostcert.pem is not owned
by current user

globus_error_put(): globus_gsi_credential.c:globus_gsi_cred_read:443:
Error reading host credential
globus_gsi_system_config.c:globus_gsi_sysconfig_get_host_cert_filename_unix:5405:
Error with certificate filename
globus_gsi_system_config.c:globus_i_gsi_sysconfig_create_cert_string:417:
Error with certificate filename
globus_gsi_system_config.c:globus_i_gsi_sysconfig_check_certfile_unix:4815:
File is not owned by current user: /etc/grid-security/hostcert.pem is not owned
by current user

globus_error_put():
globus_gsi_system_config.c:globus_gsi_sysconfig_dir_exists_unix:4550:
File does not exist: /Users/rbru03/.globus/certificates is not a valid directory

globus_error_put():
globus_gsi_system_config.c:globus_i_gsi_sysconfig_create_cert_dir_string:367:
Could not find a valid trusted CA certificates directory
globus_gsi_system_config.c:globus_gsi_sysconfig_dir_exists_unix:4550:
File does not exist: /Users/rbru03/.globus/certificates is not a valid directory

globus_error_put():
globus_gsi_system_config.c:globus_i_gsi_sysconfig_file_exists_unix:4495:
File is a directory: File: /etc/grid-security/certificates/.

globus_error_put():
globus_gsi_system_config.c:globus_i_gsi_sysconfig_file_exists_unix:4495:
File is a directory: File: /etc/grid-security/certificates/..

globus_error_put(): globus_libc.c:globus_libc_getnameinfo:3292:
System error: Unknown error: 0
globus_common: A system call failed: Unknown error: 0

globus_error_put(): globus_xio_tcp_driver.c:globus_l_xio_tcp_contact_string:961:
globus_libc_addr_to_contact_string failed.
globus_libc.c:globus_libc_getnameinfo:3292:
System error: Unknown error: 0
globus_common: A system call failed: Unknown error: 0

globus_error_put(): globus_xio_tcp_driver.c:globus_l_xio_tcp_server_init:1379:
globus_l_xio_tcp_contact_string failed.
globus_xio_tcp_driver.c:globus_l_xio_tcp_contact_string:961:
globus_libc_addr_to_contact_string failed.

globus_libc.c:globus_libc_getnameinfo:3292:
System error: Unknown error: 0
globus_common: A system call failed: Unknown error: 0

globus_error_put(): globus_xio_tcp_driver.c:globus_l_xio_tcp_server_init:1379:
globus_l_xio_tcp_contact_string failed.
globus_xio_tcp_driver.c:globus_l_xio_tcp_contact_string:961:
globus_libc_addr_to_contact_string failed.
globus_libc.c:globus_libc_getnameinfo:3292:
System error: Unknown error: 0
globus_common: A system call failed: Unknown error: 0

t0:p7435: Fatal error: tcp_init(): globus_io_tcp_create_listener() failed
/usr/local/globus/bin/globus-job-run: line 1:  7435 Abort trap             
/usr/local/globus/bin/globusrun -q -o -r "lake.esc.cam.ac.uk/jobmanager-fork" -f
/tmp/globus_job_run.rbru03.rsl.7395
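
Since the fatal error comes from creating a TCP listener, and the run above restricts ports with GLOBUS_TCP_PORT_RANGE=65000,65256, one quick sanity check is whether a plain listener can be opened in that range at all. The following is a minimal Python sketch of that check, not part of Globus; the function names are hypothetical, and it only assumes the "min,max" format shown in the export line above.

```python
import socket


def parse_port_range(value):
    """Parse a 'min,max' string (the GLOBUS_TCP_PORT_RANGE format above) into two ints."""
    low_text, high_text = value.split(",")
    low, high = int(low_text), int(high_text)
    if not (0 < low <= high <= 65535):
        raise ValueError(f"invalid port range: {value!r}")
    return low, high


def first_listenable_port(low, high, host=""):
    """Return the first port in [low, high] that accepts a listener, or None.

    host="" binds to all interfaces; this is an illustrative choice,
    not necessarily what the Globus libraries do.
    """
    for port in range(low, high + 1):
        sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        try:
            sock.bind((host, port))
            sock.listen(1)
            return port
        except OSError:
            continue  # port busy or not permitted, try the next one
        finally:
            sock.close()
    return None
```

If first_listenable_port(*parse_port_range("65000,65256")) returns a port number, the range itself is usable, which would point away from a port problem and toward the name lookup failures in the trace.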


------- Comment #2 From 2005-04-11 09:52:50 -------
This is on an Apple, right?  Can you send the output of sw_vers?
------- Comment #3 From 2005-04-14 17:45:36 -------
I'm not able to reproduce this on this machine:
$ sw_vers
ProductName:    Mac OS X Server
ProductVersion: 10.3.8
BuildVersion:   7U16
$ uname -a
Darwin zoolander.mcs.anl.gov 7.8.0 Darwin Kernel Version 7.8.0: Wed Dec 22 
14:26:17 PST 2004; root:xnu/xnu-517.11.1.obj~1/RELEASE_PPC  Power Macintosh 
powerpc

There are some known issues with particular versions of MacOS having quirky 
getnameinfo implementations, and I believe this is what you're seeing.  Without 
being able to reproduce the problem myself, I'll have a hard time working around 
their bug.  My first advice would be to make sure you have all of the latest OS 
updates installed.
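
One way to see whether getnameinfo misbehaves on a given machine, independent of Globus, is to call it directly. The sketch below uses Python's socket.getnameinfo, which wraps the system call; it is not the same code path as globus_libc_getnameinfo, so a clean result here does not rule out the problem. The function name lookup_name is hypothetical.

```python
import socket


def lookup_name(address, port=0, numeric=False):
    """Call getnameinfo for an address; return (host, service), or None on failure.

    With numeric=True no name resolution is attempted, which helps separate
    resolver problems from a failure of the call itself.
    """
    flags = (socket.NI_NUMERICHOST | socket.NI_NUMERICSERV) if numeric else 0
    try:
        return socket.getnameinfo((address, port), flags)
    except OSError as exc:  # socket.gaierror is a subclass of OSError
        print(f"getnameinfo failed for {address}: {exc}")
        return None
```

Calling lookup_name with the machine's own address, first with numeric=True and then with numeric=False, shows whether only the reverse name lookup fails or the numeric form fails as well.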
------- Comment #4 From 2005-04-15 07:07:25 -------
He replied via email:
Hi there,

Sorry for not replying to the previous email about sw_vers, but we have
been having our computer room re-wired this week and so have not been
able to turn the machine on.

The machine is called lagoon.esc.cam.ac.uk, and the output from sw_vers
and uname -a are as follows:

-sh-2.05b# sw_vers
ProductName:    Mac OS X Server
ProductVersion: 10.3.8
BuildVersion:   7U16
-sh-2.05b# uname -a
Darwin lagoon 7.8.0 Darwin Kernel Version 7.8.0: Wed Dec 22 14:26:17 PST 2004; root:xnu/xnu-517.11.1.obj~1/RELEASE_PPC Power Macintosh powerpc

When I get the reported error, I have the GLOBUS_HOSTNAME environment
variable set to lagoon.esc.cam.ac.uk, which I thought was used to get
around problems caused by the machine thinking it is called something
other than its FQDN. Is that right?

Running hostname returns 'lagoon' rather than the FQDN. Would this
matter, especially considering that I am using the GLOBUS_HOSTNAME
environment variable as above? Also, I would expect that simply having
the hostname set wrongly would cause a more standard error message than
the one I am getting. Is that a fair assumption, or could this problem
be linked?

Any suggestions you can make would be greatly appreciated. The machine
is already running the latest OS release. If you want, I can provide you
with a username on the machine if this would help to debug the problem.

Thanks in advance for your help,

Rich
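
The hostname question in the email above can be made concrete with a short check. This is a sketch only: it assumes, as the reporter does, that a non-empty GLOBUS_HOSTNAME takes precedence over the system hostname, which nothing in this thread confirms. Both function names are hypothetical.

```python
import os
import socket


def effective_hostname(environ=None):
    """Return GLOBUS_HOSTNAME if set and non-empty, else the system hostname.

    The precedence is an assumption taken from the report, not verified
    against the Globus source.
    """
    environ = os.environ if environ is None else environ
    return environ.get("GLOBUS_HOSTNAME") or socket.gethostname()


def looks_fully_qualified(name):
    """Heuristic: a fully qualified name has at least two non-empty dot-separated labels."""
    labels = name.rstrip(".").split(".")
    return len(labels) >= 2 and all(labels)
```

On the machine described above, looks_fully_qualified(socket.gethostname()) would be False for 'lagoon', while the value taken from GLOBUS_HOSTNAME would pass the check.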
------- Comment #5 From 2005-04-28 02:26:28 -------
Hello,

Do you have any more ideas as to what I can try to get this working? Will I
still experience this problem with the final release of GT4?

Thanks

Rich
------- Comment #6 From 2005-04-28 11:19:27 -------
4.0 does not have any fixes related to this. We'll release a patch if we're 
able to work around the problem.

It might help to edit your /etc/hosts file and make sure the line with your 
FQDN doesn't have any aliases:

192.168.100.100  hostname.domain.com
instead of the more common
192.168.100.100  hostname.domain.com  hostname
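
To find such lines without reading the file by eye, something like the following Python sketch could be used. It takes the hosts file contents as text and reports entries where the given FQDN is the first name and is followed by aliases. The function name is hypothetical, and the parsing assumes the usual "address name [aliases...]" layout with '#' comments.

```python
def entries_with_aliases(hosts_text, fqdn):
    """Return (address, aliases) pairs for lines whose first name is fqdn and that list aliases."""
    found = []
    for line in hosts_text.splitlines():
        # Drop any trailing comment, then split on whitespace.
        fields = line.split("#", 1)[0].split()
        if len(fields) >= 3 and fields[1].lower() == fqdn.lower():
            found.append((fields[0], fields[2:]))
    return found
```

For example, passing the text of /etc/hosts (read with open("/etc/hosts").read()) and the machine's FQDN returns an empty list when the entry has the alias-free form suggested above.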
------- Comment #7 From 2005-05-02 15:29:02 -------
Subject: Re:  Fatal error: tcp_init():
	globus_io_tcp_create_listener() failed

