Bug 2687 - RFT hangs while setting TCP buffer size
: RFT hangs while setting TCP buffer size
Status: VERIFIED FIXED
: GridFTP
GridFTP
: development
: Other Linux
: P3 critical
: ---
Assigned To:
:
:
:
:
  Show dependency treegraph
 
Reported: 2005-02-03 03:53 by
Modified: 2005-03-11 06:49 (History)


Attachments


Note

You need to log in before you can comment on or make changes to this bug.


Description From 2005-02-03 03:53:00
RFT client fails to transfer file on the same m/c. Where as file transfer using
globus-url-copy works fine. I observed this problem on RHEL 3.3 and SLES9 on
Linux on zSeries (s390) m/cs.Please find the steps followed and the server side
logs obtained after enabling debug option.
1.) Started the container as griduser after adding the line
log4j.category.org.globus.ftp=DEBUG in 
$GLOBUS_LOCATION/container-log4j.properties file
2.) Started the GridFTP server as root using 
$GLOBUS_LOCATION/sbin/globus-gridftp-server -s -p 2811
3.) Started Postgres DB as postgres user using 
/usr/local/pgsql/bin/postmaster -i -D /usr/local/pgsql/data/
4.) Execute rft client using below command.
$GLOBUS_LOCATION/bin/rft -z host -h lnxwsrf.ibm.com -f mytransfer.xfr

Got the following error on the server side.

[50]: https://127.0.0.1:8443/wsrf/services/ManagedJobFactoryService
2005-01-30 11:51:35,907 DEBUG vanilla.FTPControlChannel [Thread-7,open:112]
opening control channel to localhost : 2811
2005-01-30 11:51:35,925 DEBUG vanilla.Reply [Thread-7,<init>:58] read 1st line
2005-01-30 11:51:35,970 DEBUG vanilla.Reply [Thread-7,<init>:61] 1st line: 220
GridFTP Server lnxwsrf4.boeblingen.de.ibm.com 0.13 (gcc32dbg, 1103191677-1)
ready. ** Development Release **
2005-01-30 11:51:35,973 INFO  vanilla.FTPControlChannel [Thread-7,read:244]
Control channel received: 220 GridFTP Server lnxwsrf4.boeblingen.de.ibm.com 0.13
(gcc32dbg, 1103191677-1) ready. ** Development Release **
2005-01-30 11:51:35,976 DEBUG vanilla.FTPControlChannel [Thread-7,open:112]
opening control channel to localhost : 2811
2005-01-30 11:51:35,983 DEBUG vanilla.Reply [Thread-7,<init>:58] read 1st line
2005-01-30 11:51:36,026 DEBUG vanilla.Reply [Thread-7,<init>:61] 1st line: 220
GridFTP Server lnxwsrf4.boeblingen.de.ibm.com 0.13 (gcc32dbg, 1103191677-1)
ready. ** Development Release **
2005-01-30 11:51:36,029 INFO  vanilla.FTPControlChannel [Thread-7,read:244]
Control channel received: 220 GridFTP Server lnxwsrf4.boeblingen.de.ibm.com 0.13
(gcc32dbg, 1103191677-1) ready. ** Development Release **
2005-01-30 11:51:36,065 INFO  vanilla.FTPControlChannel [Thread-7,write:262]
Control channel sending: AUTH GSSAPI

2005-01-30 11:51:36,079 DEBUG vanilla.Reply [Thread-7,<init>:58] read 1st line
2005-01-30 11:51:36,081 DEBUG vanilla.Reply [Thread-7,<init>:61] 1st line: 334
Using authentication type; ADAT must follow.
2005-01-30 11:51:36,084 INFO  vanilla.FTPControlChannel [Thread-7,read:244]
Control channel received: 334 Using authentication type; ADAT must follow.
2005-01-30 11:51:38,973 DEBUG vanilla.Reply [Thread-7,<init>:58] read 1st line
2005-01-30 11:51:38,975 DEBUG vanilla.Reply [Thread-7,<init>:61] 1st line: 235
GSSAPI Authentication succeeded
2005-01-30 11:51:38,978 INFO  vanilla.FTPControlChannel [Thread-7,read:244]
Control channel received: 235 GSSAPI Authentication succeeded
2005-01-30 11:51:38,980 INFO  vanilla.FTPControlChannel [Thread-7,write:262]
Control channel sending: USER :globus-mapping:

2005-01-30 11:51:38,986 DEBUG vanilla.Reply [Thread-7,<init>:58] read 1st line
2005-01-30 11:51:39,030 DEBUG vanilla.Reply [Thread-7,<init>:61] 1st line: 331
Password required for :globus-mapping:.
2005-01-30 11:51:39,032 INFO  vanilla.FTPControlChannel [Thread-7,read:244]
Control channel received: 331 Password required for :globus-mapping:.
2005-01-30 11:51:39,035 INFO  vanilla.FTPControlChannel [Thread-7,write:262]
Control channel sending: PASS dummy

2005-01-30 11:51:39,040 DEBUG vanilla.Reply [Thread-7,<init>:58] read 1st line
2005-01-30 11:51:39,090 DEBUG vanilla.Reply [Thread-7,<init>:61] 1st line: 230
User ibmgrid logged in.
2005-01-30 11:51:39,093 INFO  vanilla.FTPControlChannel [Thread-7,read:244]
Control channel received: 230 User ibmgrid logged in.
2005-01-30 11:51:39,095 INFO  vanilla.FTPControlChannel [Thread-7,write:262]
Control channel sending: AUTH GSSAPI

2005-01-30 11:51:39,108 DEBUG vanilla.Reply [Thread-7,<init>:58] read 1st line
2005-01-30 11:51:39,111 DEBUG vanilla.Reply [Thread-7,<init>:61] 1st line: 334
Using authentication type; ADAT must follow.
2005-01-30 11:51:39,114 INFO  vanilla.FTPControlChannel [Thread-7,read:244]
Control channel received: 334 Using authentication type; ADAT must follow.
2005-01-30 11:51:40,244 DEBUG vanilla.Reply [Thread-7,<init>:58] read 1st line
2005-01-30 11:51:40,249 DEBUG vanilla.Reply [Thread-7,<init>:61] 1st line: 235
GSSAPI Authentication succeeded
2005-01-30 11:51:40,251 INFO  vanilla.FTPControlChannel [Thread-7,read:244]
Control channel received: 235 GSSAPI Authentication succeeded
2005-01-30 11:51:40,253 INFO  vanilla.FTPControlChannel [Thread-7,write:262]
Control channel sending: USER :globus-mapping:

2005-01-30 11:51:40,257 DEBUG vanilla.Reply [Thread-7,<init>:58] read 1st line
2005-01-30 11:51:40,297 DEBUG vanilla.Reply [Thread-7,<init>:61] 1st line: 331
Password required for :globus-mapping:.
2005-01-30 11:51:40,300 INFO  vanilla.FTPControlChannel [Thread-7,read:244]
Control channel received: 331 Password required for :globus-mapping:.
2005-01-30 11:51:40,302 INFO  vanilla.FTPControlChannel [Thread-7,write:262]
Control channel sending: PASS dummy

2005-01-30 11:51:40,305 DEBUG vanilla.Reply [Thread-7,<init>:58] read 1st line
2005-01-30 11:51:40,358 DEBUG vanilla.Reply [Thread-7,<init>:61] 1st line: 230
User ibmgrid logged in.
2005-01-30 11:51:40,361 INFO  vanilla.FTPControlChannel [Thread-7,read:244]
Control channel received: 230 User ibmgrid logged in.
Parallelsim:1
tcpBuuferSize:16000
Inside setSourceOptions
2005-01-30 11:51:40,366 DEBUG ftp.FTPClient [Thread-7,setOptions:908] setting
options: RETR Parallelism=1,1,1;
2005-01-30 11:51:40,368 INFO  vanilla.FTPControlChannel [Thread-7,write:262]
Control channel sending: OPTS RETR Parallelism=1,1,1;

2005-01-30 11:51:40,372 DEBUG vanilla.Reply [Thread-7,<init>:58] read 1st line
2005-01-30 11:51:40,406 DEBUG vanilla.Reply [Thread-7,<init>:61] 1st line: 200
OPTS Command Successful.
2005-01-30 11:51:40,409 INFO  vanilla.FTPControlChannel [Thread-7,read:244]
Control channel received: 200 OPTS Command Successful.
2005-01-30 11:51:40,412 DEBUG extended.GridFTPServerFacade
[Thread-7,setOptions:95] parallelism set to 1
After RetrieveOptions set to parallelism
2005-01-30 11:51:40,414 INFO  vanilla.FTPControlChannel [Thread-7,write:262]
Control channel sending: FEAT

2005-01-30 11:51:40,418 DEBUG vanilla.Reply [Thread-7,<init>:58] read 1st line
2005-01-30 11:51:40,476 DEBUG vanilla.Reply [Thread-7,<init>:61] 1st line:
211-Extensions supported
2005-01-30 11:51:40,478 DEBUG vanilla.Reply [Thread-7,<init>:118] multiline
reply; last line should start with ->211 <-
2005-01-30 11:51:40,481 DEBUG vanilla.Reply [Thread-7,<init>:121] lenght of
line.separator on this OS: 1
2005-01-30 11:51:40,483 DEBUG vanilla.Reply [Thread-7,<init>:126] read line
2005-01-30 11:51:40,485 DEBUG vanilla.Reply [Thread-7,<init>:138] line : -> UTF8<-
2005-01-30 11:51:40,488 DEBUG vanilla.Reply [Thread-7,<init>:126] read line
2005-01-30 11:51:40,490 DEBUG vanilla.Reply [Thread-7,<init>:138] line : -> LANG
EN<-
2005-01-30 11:51:40,492 DEBUG vanilla.Reply [Thread-7,<init>:126] read line
2005-01-30 11:51:40,494 DEBUG vanilla.Reply [Thread-7,<init>:138] line : -> DCAU<-
2005-01-30 11:51:40,497 DEBUG vanilla.Reply [Thread-7,<init>:126] read line
2005-01-30 11:51:40,499 DEBUG vanilla.Reply [Thread-7,<init>:138] line : ->
PARALLEL<-
2005-01-30 11:51:40,501 DEBUG vanilla.Reply [Thread-7,<init>:126] read line
2005-01-30 11:51:40,504 DEBUG vanilla.Reply [Thread-7,<init>:138] line : -> SIZE<-
2005-01-30 11:51:40,506 DEBUG vanilla.Reply [Thread-7,<init>:126] read line
2005-01-30 11:51:40,508 DEBUG vanilla.Reply [Thread-7,<init>:138] line : -> MLST
Type*;Size*;Modify*;Perm*;Charset;UNIX.mode*;Unique*;UNIX.slink*;<-
2005-01-30 11:51:40,510 DEBUG vanilla.Reply [Thread-7,<init>:126] read line
2005-01-30 11:51:40,513 DEBUG vanilla.Reply [Thread-7,<init>:138] line : -> ERET<-
2005-01-30 11:51:40,515 DEBUG vanilla.Reply [Thread-7,<init>:126] read line
2005-01-30 11:51:40,517 DEBUG vanilla.Reply [Thread-7,<init>:138] line : -> ESTO<-
2005-01-30 11:51:40,520 DEBUG vanilla.Reply [Thread-7,<init>:126] read line
2005-01-30 11:51:40,522 DEBUG vanilla.Reply [Thread-7,<init>:138] line : -> SPAS<-
2005-01-30 11:51:40,524 DEBUG vanilla.Reply [Thread-7,<init>:126] read line
2005-01-30 11:51:40,526 DEBUG vanilla.Reply [Thread-7,<init>:138] line : -> SPOR<-
2005-01-30 11:51:40,529 DEBUG vanilla.Reply [Thread-7,<init>:126] read line
2005-01-30 11:51:40,531 DEBUG vanilla.Reply [Thread-7,<init>:138] line : -> REST
STREAM<-
2005-01-30 11:51:40,533 DEBUG vanilla.Reply [Thread-7,<init>:126] read line
2005-01-30 11:51:40,536 DEBUG vanilla.Reply [Thread-7,<init>:138] line : -> MDTM<-
2005-01-30 11:51:40,538 DEBUG vanilla.Reply [Thread-7,<init>:126] read line
2005-01-30 11:51:40,540 DEBUG vanilla.Reply [Thread-7,<init>:138] line : ->211
End.<-
2005-01-30 11:51:40,542 DEBUG vanilla.Reply [Thread-7,<init>:143] end reached
2005-01-30 11:51:40,545 INFO  vanilla.FTPControlChannel [Thread-7,read:244]
Control channel received: 211-Extensions supported
 UTF8
 LANG EN
 DCAU
 PARALLEL
 SIZE
 MLST Type*;Size*;Modify*;Perm*;Charset;UNIX.mode*;Unique*;UNIX.slink*;
 ERET
 ESTO
 SPAS
 SPOR
 REST STREAM
 MDTM
211 End.
2005-01-30 11:51:40,552 INFO  vanilla.FTPControlChannel [Thread-7,write:262]
Control channel sending: SITE BUFSIZE 16000

2005-01-30 11:51:40,556 DEBUG vanilla.Reply [Thread-7,<init>:58] read 1st line
2005-01-30 11:51:40,598 DEBUG vanilla.Reply [Thread-7,<init>:61] 1st line: 500
Invalid command.
2005-01-30 11:51:40,600 INFO  vanilla.FTPControlChannel [Thread-7,read:244]
Control channel received: 500 Invalid command.
2005-01-30 11:51:40,602 INFO  vanilla.FTPControlChannel [Thread-7,write:262]
Control channel sending: SITE RETRBUFSIZE 16000

2005-01-30 11:51:40,610 ERROR service.TransferClient
[Thread-7,setRFTOptions:469] Unable to set transfer optionsBroken pipe
java.net.SocketException: Broken pipe
        at java.net.SocketOutputStream.socketWrite0(Native Method)
        at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:113)
        at java.net.SocketOutputStream.write(SocketOutputStream.java:145)
        at
org.globus.ftp.extended.GridFTPOutputStream.writeToken(GridFTPOutputStream.java:45)
        at
org.globus.ftp.extended.GridFTPOutputStream.flush(GridFTPOutputStream.java:30)
        at
org.globus.ftp.vanilla.FTPControlChannel.writeStr(FTPControlChannel.java:330)
        at
org.globus.ftp.vanilla.FTPControlChannel.write(FTPControlChannel.java:264)
        at
org.globus.ftp.vanilla.FTPControlChannel.exchange(FTPControlChannel.java:281)
        at org.globus.ftp.GridFTPClient.tryExecutingCommand(GridFTPClient.java:249)
        at
org.globus.ftp.GridFTPClient.tryExecutingTwoCommands(GridFTPClient.java:234)
        at org.globus.ftp.GridFTPClient.setTCPBufferSize(GridFTPClient.java:204)
        at
org.globus.transfer.reliable.service.TransferClient.setSourceOptions(TransferClient.java:480)
        at
org.globus.transfer.reliable.service.TransferClient.setRFTOptions(TransferClient.java:466)
        at
org.globus.transfer.reliable.service.TransferWork.getNewClient(TransferWork.java:379)
        at
org.globus.transfer.reliable.service.TransferWork.run(TransferWork.java:553)
        at
org.globus.wsrf.impl.work.WorkManagerImpl$WorkWrapper.run(WorkManagerImpl.java:342)
        at java.lang.Thread.run(Thread.java:568)
2005-01-30 11:51:40,619 ERROR service.TransferWork [Thread-7,run:555] Terminal
transfer errorUnable to set transfer optionsBroken pipe [Caused by: Broken pipe]
org.globus.transfer.reliable.service.exception.RftException: Unable to set
transfer optionsBroken pipe [Caused by: Broken pipe]. Caused by
java.net.SocketException: Broken pipe
        at java.net.SocketOutputStream.socketWrite0(Native Method)
        at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:113)
        at java.net.SocketOutputStream.write(SocketOutputStream.java:145)
        at
org.globus.ftp.extended.GridFTPOutputStream.writeToken(GridFTPOutputStream.java:45)
        at
org.globus.ftp.extended.GridFTPOutputStream.flush(GridFTPOutputStream.java:30)
        at
org.globus.ftp.vanilla.FTPControlChannel.writeStr(FTPControlChannel.java:330)
        at
org.globus.ftp.vanilla.FTPControlChannel.write(FTPControlChannel.java:264)
        at
org.globus.ftp.vanilla.FTPControlChannel.exchange(FTPControlChannel.java:281)
        at org.globus.ftp.GridFTPClient.tryExecutingCommand(GridFTPClient.java:249)
        at
org.globus.ftp.GridFTPClient.tryExecutingTwoCommands(GridFTPClient.java:234)
        at org.globus.ftp.GridFTPClient.setTCPBufferSize(GridFTPClient.java:204)
        at
org.globus.transfer.reliable.service.TransferClient.setSourceOptions(TransferClient.java:480)
        at
org.globus.transfer.reliable.service.TransferClient.setRFTOptions(TransferClient.java:466)
        at
org.globus.transfer.reliable.service.TransferWork.getNewClient(TransferWork.java:379)
        at
org.globus.transfer.reliable.service.TransferWork.run(TransferWork.java:553)
        at
org.globus.wsrf.impl.work.WorkManagerImpl$WorkWrapper.run(WorkManagerImpl.java:342)
        at java.lang.Thread.run(Thread.java:568)
2005-01-30 11:51:45,121 DEBUG vanilla.FTPControlChannel [Finalizer,close:161]
ftp socket closed
2005-01-30 11:51:45,125 DEBUG vanilla.FTPServerFacade [Finalizer,close:294]
close data channels
2005-01-30 11:51:45,126 DEBUG vanilla.FTPServerFacade [Finalizer,close:297]
close server socket
2005-01-30 11:51:45,127 DEBUG vanilla.FTPServerFacade [Finalizer,close:320] stop
master thread
2005-01-30 11:51:45,128 DEBUG vanilla.FTPControlChannel [Finalizer,close:161]
ftp socket closed
2005-01-30 11:51:45,131 DEBUG vanilla.FTPServerFacade [Finalizer,close:294]
close data channels
2005-01-30 11:51:45,132 DEBUG vanilla.FTPServerFacade [Finalizer,close:297]
close server socket
2005-01-30 11:51:45,133 DEBUG vanilla.FTPServerFacade [Finalizer,close:320] stop
master thread

The content of mytransfer.xfr is as below..

#true=binary false=ascii
true
#Block size in bytes
16000
#TCP Buffer size in bytes
16000
#Notpt (No thirdPartyTransfer)
false
#Number of parallel streams
1
#Data Channel Authentication (DCAU)
true
# Concurrency of the request
1
#Grid Subject name of the source gridftp server
/O=Grid/OU=GlobusTest/OU=grid1.com/CN=host/lnxwsrf.ibm.com
#Grid Subject name of the destination gridftp server
/O=Grid/OU=GlobusTest/OU=grid1.com/CN=host/lnxwsrf.ibm.com
#Transfer all or none of the transfers
false
#Maximum number of retries
10
#Source/Dest URL Pairs
gsiftp://localhost:2811/tmp/rftTest.tmp
gsiftp://localhost:2811/tmp/rftTest_Done.tmp
------- Comment #1 From 2005-02-03 11:29:57 -------
Do you see this error every time ?
------- Comment #2 From 2005-02-03 16:09:10 -------
Also, Can you please attach the logs of gridftp server/s involved in the
transfer  to this bug ? 
------- Comment #3 From 2005-02-05 05:04:25 -------
I am able to reproduce the problem on RHEL 3.3 and SLES9.0 on s390 m/c. And i 
am getting this error consistently on both the m/cs.
I tried to get the globus-gridftp logs after setting the options referring to 
http://www-
unix.globus.org/toolkit/docs/development/3.9.3/data/gridftp/GridFTP_Public_Inter
faces.html#config

But i could only see these messages in the log file
I tried with options -log-level 1, 255, ALL, INFO but i get only the below logs 
for all these options, don't know why.
Sat Feb  5 11:24:28 2005 :: Child process 1921 ended with rc = 11
Sat Feb  5 11:24:41 2005 :: Server is shutting down...
Sat Feb  5 11:24:41 2005 :: Child process 1922 ended with rc = 0
Sat Feb  5 11:29:12 2005 :: Child process 1987 ended with rc = 11
Sat Feb  5 11:33:00 2005 :: Child process 1988 ended with rc = 0
Sat Feb  5 11:44:43 2005 :: Child process 2073 ended with rc = 11
Sat Feb  5 11:44:58 2005 :: Server is shutting down...
Sat Feb  5 11:44:58 2005 :: Child process 2074 ended with rc = 0
Sat Feb  5 11:47:49 2005 :: Server is shutting down...

And also every time i run rft there are rows created in request and transfer 
tables in the postgres database.


------- Comment #4 From 2005-02-07 18:02:13 -------
Right, There will be rows created in the database everytime a request is
submitted because thats how 
RFT works. Anyways from the gridftp server log it looks like the server is
shutting itself down when you 
set tcp buffer size. I don't know why though. So I am adding Mike to the cc
list so he can throw some 
light on whats happenning.
------- Comment #5 From 2005-02-07 22:56:58 -------
Actually when it fails while setting the TCP Buffer Size it does not shutdown 
the GridFTP server it just hangs there. The shutdown message what you are 
seeing in the GridFTP logs is because i manually did a restart to try with 
different log level options (intension was to get more detailed logs on the 
GridFTP side).
------- Comment #6 From 2005-02-10 16:37:38 -------
child exiting with rc=11 means there was a segfault.  The only thing odd I see 
with the rft output is that mode E isn't set before the parallelism and tcp 
buffers, but I can't see how that would cause a segfault.

you'll need to set a log file with the -logfile option to see more, the daemon 
process won't print child logs to stdout.

also, try starting the server with the -debug flag and see if you get a core 
file in the directory you started it in. (make sure you're set to dump cores 
with ulimit -a or limit)... if so, email it to me.
------- Comment #7 From 2005-02-14 02:34:06 -------
This was a problem with the gridftp server, which caused a segfault upon the 
SITE BUFSIZE command.

Thanks for your help in tracking down the issue.  This was related to another 
bug in 3.9.4, the fix for which also fixed this problem.  The fix will be 
included in the upcoming 3.9.5 release.
------- Comment #8 From 2005-02-15 11:54:43 -------
*** Bug 2736 has been marked as a duplicate of this bug. ***