Bugzilla – Bug 2687
RFT hangs while setting TCP buffer size
Last modified: 2005-03-11 06:49:20
You need to log in before you can comment on or make changes to this bug.
RFT client fails to transfer file on the same m/c. Where as file transfer using globus-url-copy works fine. I observed this problem on RHEL 3.3 and SLES9 on Linux on zSeries (s390) m/cs.Please find the steps followed and the server side logs obtained after enabling debug option. 1.) Started the container as griduser after adding the line log4j.category.org.globus.ftp=DEBUG in $GLOBUS_LOCATION/container-log4j.properties file 2.) Started the GridFTP server as root using $GLOBUS_LOCATION/sbin/globus-gridftp-server -s -p 2811 3.) Started Postgres DB as postgres user using /usr/local/pgsql/bin/postmaster -i -D /usr/local/pgsql/data/ 4.) Execute rft client using below command. $GLOBUS_LOCATION/bin/rft -z host -h lnxwsrf.ibm.com -f mytransfer.xfr Got the following error on the server side. [50]: https://127.0.0.1:8443/wsrf/services/ManagedJobFactoryService 2005-01-30 11:51:35,907 DEBUG vanilla.FTPControlChannel [Thread-7,open:112] opening control channel to localhost : 2811 2005-01-30 11:51:35,925 DEBUG vanilla.Reply [Thread-7,<init>:58] read 1st line 2005-01-30 11:51:35,970 DEBUG vanilla.Reply [Thread-7,<init>:61] 1st line: 220 GridFTP Server lnxwsrf4.boeblingen.de.ibm.com 0.13 (gcc32dbg, 1103191677-1) ready. ** Development Release ** 2005-01-30 11:51:35,973 INFO vanilla.FTPControlChannel [Thread-7,read:244] Control channel received: 220 GridFTP Server lnxwsrf4.boeblingen.de.ibm.com 0.13 (gcc32dbg, 1103191677-1) ready. ** Development Release ** 2005-01-30 11:51:35,976 DEBUG vanilla.FTPControlChannel [Thread-7,open:112] opening control channel to localhost : 2811 2005-01-30 11:51:35,983 DEBUG vanilla.Reply [Thread-7,<init>:58] read 1st line 2005-01-30 11:51:36,026 DEBUG vanilla.Reply [Thread-7,<init>:61] 1st line: 220 GridFTP Server lnxwsrf4.boeblingen.de.ibm.com 0.13 (gcc32dbg, 1103191677-1) ready. ** Development Release ** 2005-01-30 11:51:36,029 INFO vanilla.FTPControlChannel [Thread-7,read:244] Control channel received: 220 GridFTP Server lnxwsrf4.boeblingen.de.ibm.com 0.13 (gcc32dbg, 1103191677-1) ready. ** Development Release ** 2005-01-30 11:51:36,065 INFO vanilla.FTPControlChannel [Thread-7,write:262] Control channel sending: AUTH GSSAPI 2005-01-30 11:51:36,079 DEBUG vanilla.Reply [Thread-7,<init>:58] read 1st line 2005-01-30 11:51:36,081 DEBUG vanilla.Reply [Thread-7,<init>:61] 1st line: 334 Using authentication type; ADAT must follow. 2005-01-30 11:51:36,084 INFO vanilla.FTPControlChannel [Thread-7,read:244] Control channel received: 334 Using authentication type; ADAT must follow. 2005-01-30 11:51:38,973 DEBUG vanilla.Reply [Thread-7,<init>:58] read 1st line 2005-01-30 11:51:38,975 DEBUG vanilla.Reply [Thread-7,<init>:61] 1st line: 235 GSSAPI Authentication succeeded 2005-01-30 11:51:38,978 INFO vanilla.FTPControlChannel [Thread-7,read:244] Control channel received: 235 GSSAPI Authentication succeeded 2005-01-30 11:51:38,980 INFO vanilla.FTPControlChannel [Thread-7,write:262] Control channel sending: USER :globus-mapping: 2005-01-30 11:51:38,986 DEBUG vanilla.Reply [Thread-7,<init>:58] read 1st line 2005-01-30 11:51:39,030 DEBUG vanilla.Reply [Thread-7,<init>:61] 1st line: 331 Password required for :globus-mapping:. 2005-01-30 11:51:39,032 INFO vanilla.FTPControlChannel [Thread-7,read:244] Control channel received: 331 Password required for :globus-mapping:. 2005-01-30 11:51:39,035 INFO vanilla.FTPControlChannel [Thread-7,write:262] Control channel sending: PASS dummy 2005-01-30 11:51:39,040 DEBUG vanilla.Reply [Thread-7,<init>:58] read 1st line 2005-01-30 11:51:39,090 DEBUG vanilla.Reply [Thread-7,<init>:61] 1st line: 230 User ibmgrid logged in. 2005-01-30 11:51:39,093 INFO vanilla.FTPControlChannel [Thread-7,read:244] Control channel received: 230 User ibmgrid logged in. 2005-01-30 11:51:39,095 INFO vanilla.FTPControlChannel [Thread-7,write:262] Control channel sending: AUTH GSSAPI 2005-01-30 11:51:39,108 DEBUG vanilla.Reply [Thread-7,<init>:58] read 1st line 2005-01-30 11:51:39,111 DEBUG vanilla.Reply [Thread-7,<init>:61] 1st line: 334 Using authentication type; ADAT must follow. 2005-01-30 11:51:39,114 INFO vanilla.FTPControlChannel [Thread-7,read:244] Control channel received: 334 Using authentication type; ADAT must follow. 2005-01-30 11:51:40,244 DEBUG vanilla.Reply [Thread-7,<init>:58] read 1st line 2005-01-30 11:51:40,249 DEBUG vanilla.Reply [Thread-7,<init>:61] 1st line: 235 GSSAPI Authentication succeeded 2005-01-30 11:51:40,251 INFO vanilla.FTPControlChannel [Thread-7,read:244] Control channel received: 235 GSSAPI Authentication succeeded 2005-01-30 11:51:40,253 INFO vanilla.FTPControlChannel [Thread-7,write:262] Control channel sending: USER :globus-mapping: 2005-01-30 11:51:40,257 DEBUG vanilla.Reply [Thread-7,<init>:58] read 1st line 2005-01-30 11:51:40,297 DEBUG vanilla.Reply [Thread-7,<init>:61] 1st line: 331 Password required for :globus-mapping:. 2005-01-30 11:51:40,300 INFO vanilla.FTPControlChannel [Thread-7,read:244] Control channel received: 331 Password required for :globus-mapping:. 2005-01-30 11:51:40,302 INFO vanilla.FTPControlChannel [Thread-7,write:262] Control channel sending: PASS dummy 2005-01-30 11:51:40,305 DEBUG vanilla.Reply [Thread-7,<init>:58] read 1st line 2005-01-30 11:51:40,358 DEBUG vanilla.Reply [Thread-7,<init>:61] 1st line: 230 User ibmgrid logged in. 2005-01-30 11:51:40,361 INFO vanilla.FTPControlChannel [Thread-7,read:244] Control channel received: 230 User ibmgrid logged in. Parallelsim:1 tcpBuuferSize:16000 Inside setSourceOptions 2005-01-30 11:51:40,366 DEBUG ftp.FTPClient [Thread-7,setOptions:908] setting options: RETR Parallelism=1,1,1; 2005-01-30 11:51:40,368 INFO vanilla.FTPControlChannel [Thread-7,write:262] Control channel sending: OPTS RETR Parallelism=1,1,1; 2005-01-30 11:51:40,372 DEBUG vanilla.Reply [Thread-7,<init>:58] read 1st line 2005-01-30 11:51:40,406 DEBUG vanilla.Reply [Thread-7,<init>:61] 1st line: 200 OPTS Command Successful. 2005-01-30 11:51:40,409 INFO vanilla.FTPControlChannel [Thread-7,read:244] Control channel received: 200 OPTS Command Successful. 2005-01-30 11:51:40,412 DEBUG extended.GridFTPServerFacade [Thread-7,setOptions:95] parallelism set to 1 After RetrieveOptions set to parallelism 2005-01-30 11:51:40,414 INFO vanilla.FTPControlChannel [Thread-7,write:262] Control channel sending: FEAT 2005-01-30 11:51:40,418 DEBUG vanilla.Reply [Thread-7,<init>:58] read 1st line 2005-01-30 11:51:40,476 DEBUG vanilla.Reply [Thread-7,<init>:61] 1st line: 211-Extensions supported 2005-01-30 11:51:40,478 DEBUG vanilla.Reply [Thread-7,<init>:118] multiline reply; last line should start with ->211 <- 2005-01-30 11:51:40,481 DEBUG vanilla.Reply [Thread-7,<init>:121] lenght of line.separator on this OS: 1 2005-01-30 11:51:40,483 DEBUG vanilla.Reply [Thread-7,<init>:126] read line 2005-01-30 11:51:40,485 DEBUG vanilla.Reply [Thread-7,<init>:138] line : -> UTF8<- 2005-01-30 11:51:40,488 DEBUG vanilla.Reply [Thread-7,<init>:126] read line 2005-01-30 11:51:40,490 DEBUG vanilla.Reply [Thread-7,<init>:138] line : -> LANG EN<- 2005-01-30 11:51:40,492 DEBUG vanilla.Reply [Thread-7,<init>:126] read line 2005-01-30 11:51:40,494 DEBUG vanilla.Reply [Thread-7,<init>:138] line : -> DCAU<- 2005-01-30 11:51:40,497 DEBUG vanilla.Reply [Thread-7,<init>:126] read line 2005-01-30 11:51:40,499 DEBUG vanilla.Reply [Thread-7,<init>:138] line : -> PARALLEL<- 2005-01-30 11:51:40,501 DEBUG vanilla.Reply [Thread-7,<init>:126] read line 2005-01-30 11:51:40,504 DEBUG vanilla.Reply [Thread-7,<init>:138] line : -> SIZE<- 2005-01-30 11:51:40,506 DEBUG vanilla.Reply [Thread-7,<init>:126] read line 2005-01-30 11:51:40,508 DEBUG vanilla.Reply [Thread-7,<init>:138] line : -> MLST Type*;Size*;Modify*;Perm*;Charset;UNIX.mode*;Unique*;UNIX.slink*;<- 2005-01-30 11:51:40,510 DEBUG vanilla.Reply [Thread-7,<init>:126] read line 2005-01-30 11:51:40,513 DEBUG vanilla.Reply [Thread-7,<init>:138] line : -> ERET<- 2005-01-30 11:51:40,515 DEBUG vanilla.Reply [Thread-7,<init>:126] read line 2005-01-30 11:51:40,517 DEBUG vanilla.Reply [Thread-7,<init>:138] line : -> ESTO<- 2005-01-30 11:51:40,520 DEBUG vanilla.Reply [Thread-7,<init>:126] read line 2005-01-30 11:51:40,522 DEBUG vanilla.Reply [Thread-7,<init>:138] line : -> SPAS<- 2005-01-30 11:51:40,524 DEBUG vanilla.Reply [Thread-7,<init>:126] read line 2005-01-30 11:51:40,526 DEBUG vanilla.Reply [Thread-7,<init>:138] line : -> SPOR<- 2005-01-30 11:51:40,529 DEBUG vanilla.Reply [Thread-7,<init>:126] read line 2005-01-30 11:51:40,531 DEBUG vanilla.Reply [Thread-7,<init>:138] line : -> REST STREAM<- 2005-01-30 11:51:40,533 DEBUG vanilla.Reply [Thread-7,<init>:126] read line 2005-01-30 11:51:40,536 DEBUG vanilla.Reply [Thread-7,<init>:138] line : -> MDTM<- 2005-01-30 11:51:40,538 DEBUG vanilla.Reply [Thread-7,<init>:126] read line 2005-01-30 11:51:40,540 DEBUG vanilla.Reply [Thread-7,<init>:138] line : ->211 End.<- 2005-01-30 11:51:40,542 DEBUG vanilla.Reply [Thread-7,<init>:143] end reached 2005-01-30 11:51:40,545 INFO vanilla.FTPControlChannel [Thread-7,read:244] Control channel received: 211-Extensions supported UTF8 LANG EN DCAU PARALLEL SIZE MLST Type*;Size*;Modify*;Perm*;Charset;UNIX.mode*;Unique*;UNIX.slink*; ERET ESTO SPAS SPOR REST STREAM MDTM 211 End. 2005-01-30 11:51:40,552 INFO vanilla.FTPControlChannel [Thread-7,write:262] Control channel sending: SITE BUFSIZE 16000 2005-01-30 11:51:40,556 DEBUG vanilla.Reply [Thread-7,<init>:58] read 1st line 2005-01-30 11:51:40,598 DEBUG vanilla.Reply [Thread-7,<init>:61] 1st line: 500 Invalid command. 2005-01-30 11:51:40,600 INFO vanilla.FTPControlChannel [Thread-7,read:244] Control channel received: 500 Invalid command. 2005-01-30 11:51:40,602 INFO vanilla.FTPControlChannel [Thread-7,write:262] Control channel sending: SITE RETRBUFSIZE 16000 2005-01-30 11:51:40,610 ERROR service.TransferClient [Thread-7,setRFTOptions:469] Unable to set transfer optionsBroken pipe java.net.SocketException: Broken pipe at java.net.SocketOutputStream.socketWrite0(Native Method) at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:113) at java.net.SocketOutputStream.write(SocketOutputStream.java:145) at org.globus.ftp.extended.GridFTPOutputStream.writeToken(GridFTPOutputStream.java:45) at org.globus.ftp.extended.GridFTPOutputStream.flush(GridFTPOutputStream.java:30) at org.globus.ftp.vanilla.FTPControlChannel.writeStr(FTPControlChannel.java:330) at org.globus.ftp.vanilla.FTPControlChannel.write(FTPControlChannel.java:264) at org.globus.ftp.vanilla.FTPControlChannel.exchange(FTPControlChannel.java:281) at org.globus.ftp.GridFTPClient.tryExecutingCommand(GridFTPClient.java:249) at org.globus.ftp.GridFTPClient.tryExecutingTwoCommands(GridFTPClient.java:234) at org.globus.ftp.GridFTPClient.setTCPBufferSize(GridFTPClient.java:204) at org.globus.transfer.reliable.service.TransferClient.setSourceOptions(TransferClient.java:480) at org.globus.transfer.reliable.service.TransferClient.setRFTOptions(TransferClient.java:466) at org.globus.transfer.reliable.service.TransferWork.getNewClient(TransferWork.java:379) at org.globus.transfer.reliable.service.TransferWork.run(TransferWork.java:553) at org.globus.wsrf.impl.work.WorkManagerImpl$WorkWrapper.run(WorkManagerImpl.java:342) at java.lang.Thread.run(Thread.java:568) 2005-01-30 11:51:40,619 ERROR service.TransferWork [Thread-7,run:555] Terminal transfer errorUnable to set transfer optionsBroken pipe [Caused by: Broken pipe] org.globus.transfer.reliable.service.exception.RftException: Unable to set transfer optionsBroken pipe [Caused by: Broken pipe]. Caused by java.net.SocketException: Broken pipe at java.net.SocketOutputStream.socketWrite0(Native Method) at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:113) at java.net.SocketOutputStream.write(SocketOutputStream.java:145) at org.globus.ftp.extended.GridFTPOutputStream.writeToken(GridFTPOutputStream.java:45) at org.globus.ftp.extended.GridFTPOutputStream.flush(GridFTPOutputStream.java:30) at org.globus.ftp.vanilla.FTPControlChannel.writeStr(FTPControlChannel.java:330) at org.globus.ftp.vanilla.FTPControlChannel.write(FTPControlChannel.java:264) at org.globus.ftp.vanilla.FTPControlChannel.exchange(FTPControlChannel.java:281) at org.globus.ftp.GridFTPClient.tryExecutingCommand(GridFTPClient.java:249) at org.globus.ftp.GridFTPClient.tryExecutingTwoCommands(GridFTPClient.java:234) at org.globus.ftp.GridFTPClient.setTCPBufferSize(GridFTPClient.java:204) at org.globus.transfer.reliable.service.TransferClient.setSourceOptions(TransferClient.java:480) at org.globus.transfer.reliable.service.TransferClient.setRFTOptions(TransferClient.java:466) at org.globus.transfer.reliable.service.TransferWork.getNewClient(TransferWork.java:379) at org.globus.transfer.reliable.service.TransferWork.run(TransferWork.java:553) at org.globus.wsrf.impl.work.WorkManagerImpl$WorkWrapper.run(WorkManagerImpl.java:342) at java.lang.Thread.run(Thread.java:568) 2005-01-30 11:51:45,121 DEBUG vanilla.FTPControlChannel [Finalizer,close:161] ftp socket closed 2005-01-30 11:51:45,125 DEBUG vanilla.FTPServerFacade [Finalizer,close:294] close data channels 2005-01-30 11:51:45,126 DEBUG vanilla.FTPServerFacade [Finalizer,close:297] close server socket 2005-01-30 11:51:45,127 DEBUG vanilla.FTPServerFacade [Finalizer,close:320] stop master thread 2005-01-30 11:51:45,128 DEBUG vanilla.FTPControlChannel [Finalizer,close:161] ftp socket closed 2005-01-30 11:51:45,131 DEBUG vanilla.FTPServerFacade [Finalizer,close:294] close data channels 2005-01-30 11:51:45,132 DEBUG vanilla.FTPServerFacade [Finalizer,close:297] close server socket 2005-01-30 11:51:45,133 DEBUG vanilla.FTPServerFacade [Finalizer,close:320] stop master thread The content of mytransfer.xfr is as below.. #true=binary false=ascii true #Block size in bytes 16000 #TCP Buffer size in bytes 16000 #Notpt (No thirdPartyTransfer) false #Number of parallel streams 1 #Data Channel Authentication (DCAU) true # Concurrency of the request 1 #Grid Subject name of the source gridftp server /O=Grid/OU=GlobusTest/OU=grid1.com/CN=host/lnxwsrf.ibm.com #Grid Subject name of the destination gridftp server /O=Grid/OU=GlobusTest/OU=grid1.com/CN=host/lnxwsrf.ibm.com #Transfer all or none of the transfers false #Maximum number of retries 10 #Source/Dest URL Pairs gsiftp://localhost:2811/tmp/rftTest.tmp gsiftp://localhost:2811/tmp/rftTest_Done.tmp
Do you see this error every time ?
Also, Can you please attach the logs of gridftp server/s involved in the transfer to this bug ?
I am able to reproduce the problem on RHEL 3.3 and SLES9.0 on s390 m/c. And i am getting this error consistently on both the m/cs. I tried to get the globus-gridftp logs after setting the options referring to http://www- unix.globus.org/toolkit/docs/development/3.9.3/data/gridftp/GridFTP_Public_Inter faces.html#config But i could only see these messages in the log file I tried with options -log-level 1, 255, ALL, INFO but i get only the below logs for all these options, don't know why. Sat Feb 5 11:24:28 2005 :: Child process 1921 ended with rc = 11 Sat Feb 5 11:24:41 2005 :: Server is shutting down... Sat Feb 5 11:24:41 2005 :: Child process 1922 ended with rc = 0 Sat Feb 5 11:29:12 2005 :: Child process 1987 ended with rc = 11 Sat Feb 5 11:33:00 2005 :: Child process 1988 ended with rc = 0 Sat Feb 5 11:44:43 2005 :: Child process 2073 ended with rc = 11 Sat Feb 5 11:44:58 2005 :: Server is shutting down... Sat Feb 5 11:44:58 2005 :: Child process 2074 ended with rc = 0 Sat Feb 5 11:47:49 2005 :: Server is shutting down... And also every time i run rft there are rows created in request and transfer tables in the postgres database.
Right, There will be rows created in the database everytime a request is submitted because thats how RFT works. Anyways from the gridftp server log it looks like the server is shutting itself down when you set tcp buffer size. I don't know why though. So I am adding Mike to the cc list so he can throw some light on whats happenning.
Actually when it fails while setting the TCP Buffer Size it does not shutdown the GridFTP server it just hangs there. The shutdown message what you are seeing in the GridFTP logs is because i manually did a restart to try with different log level options (intension was to get more detailed logs on the GridFTP side).
child exiting with rc=11 means there was a segfault. The only thing odd I see with the rft output is that mode E isn't set before the parallelism and tcp buffers, but I can't see how that would cause a segfault. you'll need to set a log file with the -logfile option to see more, the daemon process won't print child logs to stdout. also, try starting the server with the -debug flag and see if you get a core file in the directory you started it in. (make sure you're set to dump cores with ulimit -a or limit)... if so, email it to me.
This was a problem with the gridftp server, which caused a segfault upon the SITE BUFSIZE command. Thanks for your help in tracking down the issue. This was related to another bug in 3.9.4, the fix for which also fixed this problem. The fix will be included in the upcoming 3.9.5 release.
*** Bug 2736 has been marked as a duplicate of this bug. ***