Bug 3919 - GT 4.0.1 g-u-c dumps core in batch mode with restartability
Status: RESOLVED FIXED
Product: GridFTP
Component: GridFTP
Version: unspecified
Platform: PC Linux
Importance: P3 normal
Target Milestone: 4.0.2
Reported: 2005-11-17 16:35
Modified: 2006-04-18 14:15


Attachments
patch to gridftp/client/source (1.70 KB, patch)
2005-11-22 16:20, Mike Link

Description From 2005-11-17 16:35:48
Hi

I am using GT4.0.1 (from CVS as of 2005-11-10), and globus-url-copy 3.18 is
dumping core when invoked thus:

$ globus-url-copy -verbose -fast -c -rst -f tx-27502-5p0aEk 
Source: gsiftp://gainly.uchicago.edu:2812/home/voeckler/tmp/
Dest:   gsiftp://griodine.uchicago.edu:2812/home/voeckler/tmp/
  TEST.1
Source: gsiftp://gainly.uchicago.edu:2812/home/voeckler/tmp/
Dest:   gsiftp://griodine.uchicago.edu:2812/home/voeckler/tmp/
  TEST.3
Segmentation fault (core dumped)

$ cat tx-27502-5p0aEk
gsiftp://gainly.uchicago.edu:2812/home/voeckler/tmp/TEST.1 gsiftp://griodine.uchicago.edu:2812/home/voeckler/tmp/TEST.1
gsiftp://gainly.uchicago.edu:2812/home/voeckler/tmp/TEST.3 gsiftp://griodine.uchicago.edu:2812/home/voeckler/tmp/TEST.3
gsiftp://gainly.uchicago.edu:2812/home/voeckler/tmp/TEST.5 gsiftp://griodine.uchicago.edu:2812/home/voeckler/tmp/TEST.5
gsiftp://gainly.uchicago.edu:2812/home/voeckler/tmp/TEST.7 gsiftp://griodine.uchicago.edu:2812/home/voeckler/tmp/TEST.7
gsiftp://gainly.uchicago.edu:2812/home/voeckler/tmp/TEST.9 gsiftp://griodine.uchicago.edu:2812/home/voeckler/tmp/TEST.9

[Note: gainly is always the src, and griodine the dst, one pair per line]
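
As a rough illustration of the file layout noted above (one source/destination pair per line), the following sketch parses such a file into pairs. The function name `parse_url_file` is hypothetical, and the handling of blank lines and `#` comment lines is an assumption made for the sketch rather than a statement about what globus-url-copy itself accepts.

```python
def parse_url_file(text):
    """Parse URL-pair lines into (source, destination) tuples.

    Assumption: one whitespace-separated pair per line. Blank lines and
    lines starting with '#' are skipped. Quoted URLs are not handled.
    """
    pairs = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split()
        if len(fields) != 2:
            raise ValueError(
                f"line {lineno}: expected 2 URLs, got {len(fields)}"
            )
        pairs.append((fields[0], fields[1]))
    return pairs
```

A line holding only one URL raises `ValueError`, which makes a file that was accidentally split into one URL per line easy to spot.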

Without the -verbose option, it just hangs on the 2nd transfer: strace shows
no activity whatsoever, yet the process uses 100% CPU in userland.

$ gdb /opt/globus/latest/bin/globus-url-copy core.27642 
This GDB was configured as "i386-redhat-linux-gnu"...
Core was generated by `globus-url-copy -verbose -fast -c -rst -f tx-27502-5p0aEk'.
Program terminated with signal 11, Segmentation fault.
...
(gdb) where
#0  0x403e6970 in malloc_consolidate () from /lib/i686/libc.so.6
#1  0x403e6007 in _int_malloc () from /lib/i686/libc.so.6
#2  0x403e5201 in malloc () from /lib/i686/libc.so.6
#3  0x400762da in globus_l_ftp_control_read_cb (arg=0x8071360, handle=0x8071488, result=0,
    buf=0x8080108 "ObAOepJbaiquXwMKawhlq35tNGZPBs\r\n632-FwMAADDa7UZN09c3FPE19GEm7xNBBTS97JQbO09YPpXJVVDf5PAdK/z7H8kQsQH0\021",
    nbytes=100)
    at globus_ftp_control_client.c:766
#4  0x400a1318 in globus_l_io_bounce_io_cb (xio_handle=0x8081948, result=0,
    buffer=0x8080108 "ObAOepJbaiquXwMKawhlq35tNGZPBs\r\n632-FwMAADDa7UZN09c3FPE19GEm7xNBBTS97JQbO09YPpXJVVDf5PAdK/z7H8kQsQH0\021",
    len=100, nbytes=100, data_desc=0x80b9b90, user_arg=0x80ca178)
    at globus_io_xio_compat.c:798
#5  0x400ba654 in globus_l_xio_read_write_callback_kickout (user_arg=0x80b9b90)
    at globus_xio_handle.c:1148
#6  0x400ba50e in globus_i_xio_read_write_callback (op=0x80b9b90, result=0, nbytes=100, user_arg=0x0)
    at globus_xio_handle.c:1116
#7  0x400c34e0 in globus_l_xio_driver_op_read_kickout (user_arg=0x80b9b90)
    at globus_xio_driver.c:605
#8  0x400d2950 in globus_xio_driver_finished_read (in_op=0x80b9b90, result=0, nbytes=100)
    at globus_xio_pass.c:1227
#9  0x401019dd in globus_l_xio_tcp_finish_read (handle=0x808fe90, result=0, nbytes=100)
    at globus_xio_tcp_driver.c:2196
#10 0x40101a62 in globus_l_xio_tcp_system_read_cb (result=0, nbytes=100, user_arg=0x808fe90)
    at globus_xio_tcp_driver.c:2211
#11 0x400d5c92 in globus_l_xio_system_kickout (user_arg=0x8063a28)
    at globus_xio_system_select.c:1016
#12 0x401baca3 in globus_callback_space_poll (timestop=0x8050888, space=-2)
    at globus_callback_nothreads.c:1430
#13 0x0804b6e2 in globus_l_guc_transfer_files (guc_info=0xbfffcf40, source_gass_copy_attr=0xbfffd060,
    dest_gass_copy_attr=0xbfffd050, gass_copy_handle=0xbfffcfe0)
    at globus_url_copy.c:1205
#14 0x0804c960 in globus_l_guc_expand_urls (guc_info=0xbfffcf40, gass_copy_attr=0xbfffd060,
    dest_gass_copy_attr=0xbfffd050, gass_copy_handle=0xbfffcfe0)
    at globus_url_copy.c:1762
#15 0x0804a9ba in main (argc=7, argv=0xbfffd0d4) at globus_url_copy.c:633
#16 0x40387a67 in __libc_start_main () from /lib/i686/libc.so.6

Simple transfers work (to ensure that I can transfer): 

$ rm ~/tmp/TEST.?
$ globus-url-copy -f tx-27502-5p0aEk
$ echo $?
0
$ ls ~/tmp/TEST.? | cat
/home/voeckler/tmp/TEST.1
/home/voeckler/tmp/TEST.3
/home/voeckler/tmp/TEST.5
/home/voeckler/tmp/TEST.7
/home/voeckler/tmp/TEST.9


So, is the -fast option or are the restart options evil? 

"-f tx-27502-5p0aEk" works.
"-fast -f tx-27502-5p0aEk" works.
"-vb -fast -f tx-27502-5p0aEk" works. 
"-c -rst -f tx-27502-5p0aEk" stalls after the 3rd xfer.
"-fast -c -rst -f tx-27502-5p0aEk" stalls after the 2nd xfer.
"-verbose -fast -c -rst -f tx-27502-5p0aEk" stalls after the 2nd.
"-verbose -c -rst -f tx-27502-5p0aEk" dumps core after 2nd xfer.
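
The option matrix above can be re-run mechanically, which may help when checking a patched build. This is only a minimal sketch and not part of the original report: the names `OPTION_SETS`, `build_command` and `run_case` are hypothetical, the 120-second timeout is an assumed stand-in for "stalls", and it assumes globus-url-copy is on the PATH. The option combinations themselves are the ones listed above.

```python
import subprocess

# Option combinations copied from the list above.
OPTION_SETS = [
    [],
    ["-fast"],
    ["-vb", "-fast"],
    ["-c", "-rst"],
    ["-fast", "-c", "-rst"],
    ["-verbose", "-fast", "-c", "-rst"],
    ["-verbose", "-c", "-rst"],
]


def build_command(options, url_file, binary="globus-url-copy"):
    """Return the argv list for one test case."""
    return [binary, *options, "-f", url_file]


def run_case(argv, timeout_s=120):
    """Run one case and classify the outcome.

    Assumption: a run that exceeds timeout_s counts as a stall. On POSIX,
    a negative return code means the process died from a signal
    (for example -11 for SIGSEGV).
    """
    try:
        proc = subprocess.run(argv, timeout=timeout_s)
    except subprocess.TimeoutExpired:
        return "stalled"
    if proc.returncode < 0:
        return f"killed by signal {-proc.returncode}"
    return "ok" if proc.returncode == 0 else f"exit {proc.returncode}"


if __name__ == "__main__":
    for options in OPTION_SETS:
        argv = build_command(options, "tx-27502-5p0aEk")
        print(" ".join(argv), "->", run_case(argv))
```

Remove the destination files between cases (as with `rm ~/tmp/TEST.?` earlier) so that each run starts from the same state.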

Side question: With -fast and the above input (same src/dst host pairs), is the
data channel recycled in batch mode?
------- Comment #1 From 2005-11-21 18:58:16 -------
I haven't heard back on this bug. For us, it is a big deal, as there is another
VDT release in less than a week. We must either generate the right work-around
options, if we use g-u-c wrongly, or rely on a work-around from you guys, if we
use it as it's supposed to be used.
------- Comment #2 From 2005-11-21 19:04:09 -------
I just started looking at this today.  I don't expect any problems and I should 
have a fix tomorrow.

Mike
------- Comment #3 From 2005-11-22 16:20:05 -------
Created an attachment (id=754)
patch to gridftp/client/source
------- Comment #4 From 2005-11-22 16:22:37 -------
The bug was in the gridftp client lib's restart plugin.  Committed patch to 
CVS.  I'll put out an update package later today.

To answer your side question, yes, the data channel is reused for any 
consecutive transfers between the same source and dest, whether they come from 
the urlfile or from wildcard expansion/recursion.
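
To make "consecutive transfers between the same source and dest" concrete, the sketch below groups a list of URL pairs into runs that share the same endpoints. It is only an illustration and not the client library's actual logic: the function names are hypothetical, and treating "same source and dest" as "same host and port on both sides" is an assumption.

```python
from itertools import groupby
from urllib.parse import urlsplit


def endpoints(pair):
    """Return ((src_host, src_port), (dst_host, dst_port)) for a URL pair."""
    src, dst = (urlsplit(url) for url in pair)
    return (src.hostname, src.port), (dst.hostname, dst.port)


def reuse_runs(pairs):
    """Group consecutive pairs whose endpoints match.

    Each run is a candidate for sharing one data channel under the
    assumption stated above. Non-adjacent pairs are never merged.
    """
    return [(key, list(run)) for key, run in groupby(pairs, key=endpoints)]


if __name__ == "__main__":
    base_src = "gsiftp://gainly.uchicago.edu:2812/home/voeckler/tmp/"
    base_dst = "gsiftp://griodine.uchicago.edu:2812/home/voeckler/tmp/"
    pairs = [(base_src + f"TEST.{n}", base_dst + f"TEST.{n}") for n in (1, 3, 5, 7, 9)]
    for key, run in reuse_runs(pairs):
        print(key, len(run))
```

For the five pairs in the URL file from the description, this yields a single run of five transfers.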
------- Comment #5 From 2005-11-22 17:40:38 -------
http://www.globus.org/toolkit/advisories.html
------- Comment #6 From 2005-11-22 19:48:02 -------
OK, thank you. I've added the patch to my test systems, and it appears to work
well. The gpt-build -upgrade was rather painless :-)