Bugzilla – Bug 5192
Setting gridftp usage_stats_target not working
Last modified: 2007-08-27 18:45:02
With /soft/gridftp-4.0.3-jp1/etc/gridftp.conf:
  inetd 1
  log_module syslog
  log_level all
  disable_usage_stats 0
  usage_stats_target globus-usage.teragrid.org:4810,usage-stats.globus.org:4810
and inetd.conf:
  gsiftp stream tcp nowait root /usr/bin/env env GLOBUS_LOCATION=/soft/gridftp-4.0.3-jp1 GLOBUS_TCP_PORT_RANGE=50000,51000 LD_LIBRARY_PATH=/soft/gridftp-4.0.3-jp1/lib /soft/gridftp-4.0.3-jp1/sbin/globus-gridftp-server -c /soft/gridftp-4.0.3-jp1/etc/gridftp.conf
I've confirmed with tcpdump that usage packets are only being sent to usage-stats.globus.org.
This is the VDT 1.6.1x gridftp server based on Globus 4.0.3:
  220 tg-grid1.uc.teragrid.org GridFTP Server 2.3 (gcc64dbg, 1144436882-63) ready.
This issue is holding up TeraGrid's plans to establish our own usage stats listener service for GridFTP usage. This is a critical requirement for TG at the moment, and we had been expecting that this version of GridFTP would have the fix to an earlier bug, so we really need help figuring out what the problem is. Thanks!
Created an attachment (id=1224): Patch to gridftp/server/src/globus_i_gfs_log.c
It was an obvious bug in the server: this value was never being passed down to the usage sending module. Attached is a patch. As a workaround, you can also use the environment variable GLOBUS_USAGE_TARGETS. Set it to the same string you're using now.
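The environment-variable workaround might look roughly like this. It is only a sketch: it assumes the variable takes the same comma-separated host:port list as usage_stats_target, as the comment above says, and the placement inside the inetd entry is an assumption based on how GLOBUS_LOCATION is already passed in the original report.

```shell
# Same comma-separated host:port list as usage_stats_target in gridftp.conf.
GLOBUS_USAGE_TARGETS="globus-usage.teragrid.org:4810,usage-stats.globus.org:4810"
export GLOBUS_USAGE_TARGETS

# Under inetd (assumed placement), the assignment would instead be added to
# the existing env argument list of the gsiftp entry, for example:
#   /usr/bin/env env GLOBUS_LOCATION=/soft/gridftp-4.0.3-jp1 \
#       GLOBUS_USAGE_TARGETS=globus-usage.teragrid.org:4810,usage-stats.globus.org:4810 \
#       (remaining variables and the globus-gridftp-server command unchanged)

# Show what a server started from this shell would inherit.
echo "usage targets: $GLOBUS_USAGE_TARGETS"
```

With the patch applied, the usage_stats_target line in gridftp.conf should make this unnecessary.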
Fix committed to trunk and globus_4_0_branch.
It appears that when the remote_nodes config file entry is specified, or the -r command line argument, the GLOBUS_USAGE_TARGETS environment variable is ignored and the server only reports to the default usage-stats.globus.org. This is a critical issue for the TeraGrid. We upgraded GridFTP to be able to collect usage statistics and it now appears we can collect usage statistics for all our striped gridftp servers (which is most of the servers).
(In reply to comment #4)
> <snip>
> collect usage statistics and it now appears we can collect usage statistics for
> all our striped gridftp servers (which is most of the servers).
Typo: we can NOT collect usage statistics for all our striped gridftp servers.
In striped mode only the frontend is supposed to send the usage packets. The -r config option shouldn't affect the usage stats configuration at all, and I don't see an obvious bug. You are setting GLOBUS_USAGE_TARGETS on the frontend, right?
Yes, only on the frontend: I set GLOBUS_USAGE_TARGETS to two targets and then monitored with tcpdump for all outbound usage packets. With non-striped transfers, I saw outbound packets to all targets; with striped transfers, only one packet gets sent, to the default Globus listener.
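For anyone repeating the tcpdump check, one way to watch for packets to every configured target is to build the capture filter from the target string itself. This is a rough sketch, not the exact command used above: usage_filter is a hypothetical helper name, and the dst qualifiers assume only outbound traffic is of interest.

```shell
# Turn a comma-separated host:port list into a tcpdump filter expression.
# The body runs in a subshell so the IFS and globbing changes stay local.
usage_filter() (
    IFS=,
    set -f
    filter=""
    for target in $1; do
        host=${target%:*}    # text before the last colon
        port=${target##*:}   # text after the last colon
        clause="(dst host $host and dst port $port)"
        if [ -z "$filter" ]; then
            filter=$clause
        else
            filter="$filter or $clause"
        fi
    done
    printf '%s\n' "$filter"
)

# Print the filter for the two targets from the original report.
usage_filter "globus-usage.teragrid.org:4810,usage-stats.globus.org:4810"
```

The result can be passed to tcpdump as a single quoted argument, e.g. tcpdump -n "$(usage_filter "$GLOBUS_USAGE_TARGETS")", run as root on the node under test.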
To be more precise, the difference isn't between doing a striped vs non-striped transfer, but between configuring the gridftp server with remote_nodes versus not configuring it.
Actually, I think I see what is going on. It looks like (in my test at least) the backends are the ones sending the packets, and the frontend is sending none. So what you're seeing is the backend(s) sending to the default listener, and the frontend not even checking the listener config since it is not sending any packets. This is surely a bug, though I'll have to verify that it wasn't intentional. I'll have a patch soon. For now, you can get the data you want by configuring the listeners on the backend. For multi-stripe transfers you'll get an almost identical packet from each backend.
Setting GLOBUS_USAGE_TARGETS on the data movers will cause the first data mover to send usage records to all the targets. Interestingly, the other data movers DON'T generate usage records. Is the data from the first data mover complete with respect to the data moved by all the data movers?
Ah, right -- I saw two packets in my test and assumed one from each backend; in fact they were both from the first backend, one each for the send/retr of a third party transfer. The data is correct other than the bytes transferred (NBYTES), which unfortunately is only the count of bytes that pass through that node. If you really need to use this before a bugfix, you can know that the total transferred is somewhere between NBYTES and NBYTES*STRIPES, but that could get wildly inaccurate for files that don't divide evenly among stripes.
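The bound described above can be worked out per usage record as follows. This is a minimal sketch: nbytes_bounds is a hypothetical helper, and the numbers in the example are made up for illustration, not taken from a real usage packet.

```shell
# Print the lower and upper bound on total bytes transferred, given the
# NBYTES value reported by the first backend and the number of stripes.
nbytes_bounds() {
    nbytes=$1
    stripes=$2
    echo "$nbytes $((nbytes * stripes))"
}

# Illustrative values only: 1000 bytes reported by one backend, 4 stripes.
nbytes_bounds 1000 4
```

The true total lies somewhere in that range; as noted, the estimate is loose when the file does not divide evenly among the stripes.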
Thanks Mike. If it's possible, it would be great to get the complete set of usage stats bugfixes into the upcoming 4.0.5. I would also be interested in testing a 4.0.5 release candidate in case there are lingering issues.
This is fixed and will be in 4.0.5. All usage stats will come from the frontend. I believe the nightly installer from midnight tonight will be the first RC, if you wanted to test it out: http://www.globus.org/globus_4_0_branch/nightly.tar.gz
The TeraGrid is still having problems getting multiple usage targets working with GridFTP 4.0.5 on Solaris. We don't know if this problem persists on Linux. We'll let you know as soon as we find out.
I've opened bug 5503 for the Solaris issue, which is unrelated to this bug.