Bugzilla – Bug 6294
Non standard URL parsing for file: method (extra / required)
Last modified: 2008-08-28 13:01:51
The problem was encountered in srmcp, which uses the CoG libraries to parse URLs.
File URLs require an extra '/'.
'file:///abs_path/file_name' is failing to refer to '/abs_path/file_name' in a
POSIX file system, 'file:////abs_path/file_name' is referring to it
This differs from the behavior of C clients (such as globus-url-copy) and
other user clients (grid-related or not).
According to RFC 1738, '/' is a special character in URLs.
In a URL like file://host/path1/path2/file, the first two slashes follow the
method (scheme), the third terminates the host, and the remaining ones are part
of the file URL specification and separate hierarchy levels (the path is not a
POSIX path, even if it looks similar).
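
To make the slash counting concrete, here is a minimal Python sketch of the RFC 1738 reading described above. The function name split_file_url is hypothetical and is not part of CoG or any other library; it only handles the file://host/path form.

```python
def split_file_url(url):
    """Split a file URL of the form file://host/path (hypothetical helper).

    Returns (host, segments): the host may be empty, and segments are the
    hierarchy levels that follow the slash terminating the host part.
    """
    prefix = "file://"
    if not url.startswith(prefix):
        raise ValueError("expected a URL starting with file://")
    rest = url[len(prefix):]
    # The third slash terminates the (possibly empty) host.
    host, sep, url_path = rest.partition("/")
    if not sep:
        raise ValueError("missing '/' after the host part")
    # The remaining slashes only separate hierarchy levels.
    return host, url_path.split("/")


print(split_file_url("file://host/path1/path2/file"))
print(split_file_url("file:///abs_path/file_name"))
print(split_file_url("file:////abs_path/file_name"))
```

Under this reading, the four-slash form yields an empty first segment, which is where the two interpretations in this report diverge.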
(In reply to comment #0)
> File URLs require an extra '/'.
> 'file:///abs_path/file_name' is failing to refer to '/abs_path/file_name' in a
> POSIX file system, 'file:////abs_path/file_name' is referring to it
I thought that both file:///abs_path/file_name and file:/abs_path/file_name
were valid URIs, but that file://abs_path/file_name and
file:////abs_path/file_name were not.
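
As a point of comparison only, the sketch below runs the four forms through Python's standard urllib.parse.urlsplit. It shows how one widely used generic parser splits them; it says nothing about what jglobus or globus-url-copy do, and the helper name components is made up for this example.

```python
from urllib.parse import urlsplit

forms = [
    "file:/abs_path/file_name",
    "file://abs_path/file_name",
    "file:///abs_path/file_name",
    "file:////abs_path/file_name",
]


def components(url):
    """Return the (host, path) pair that urlsplit reports for a URL."""
    parts = urlsplit(url)
    return parts.netloc, parts.path


for url in forms:
    host, path = components(url)
    print(f"{url:32} host={host!r:12} path={path!r}")
```

With two slashes, abs_path is taken as the host. With four, the host is empty and the path keeps a doubled leading slash.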
It is not clear to me what you want. Current jglobus has:
refer to: path/file, and:
refer to /path/file.
From the RFC that you posted, this seems correct. Three slashes are needed: two
for the scheme separator :// and one for the empty host separator; from
there you start the path. Sure, that is quite a lot of slashes, and I imagine
that is why there are so many other conventions for file URLs, but I think it is
compliant. Is there some optional or additional behavior that you are looking
for?
I think Marco's point is that the RFC mentions that "/" is a separator if
present before the path and that it should not be considered meaningful beyond
that, so all paths, whether prefixed with a slash or not, should be
absolute. Also, the FTP URL RFC mentions that an absolute path should be
prefixed with %2f (presumably if the default FTP directory is different from
the root of the file system).
file:///etc/passwd should point to /etc/passwd.
ftp://host/etc/passwd should point to FTP_ROOT/etc/passwd
ftp://host/%2fetc/passwd should point to /etc/passwd
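
The three mappings above can be sketched in Python as follows. This is an illustration of the interpretation described in this comment, not of any existing client: the function name local_path and the value of FTP_ROOT are assumptions, and decoding the whole path in one step is a simplification.

```python
from urllib.parse import urlsplit, unquote

FTP_ROOT = "/home/ftp"  # assumed FTP login directory, for illustration only


def local_path(url, ftp_root=FTP_ROOT):
    """Map a file: or ftp: URL to a file system path (hypothetical helper)."""
    parts = urlsplit(url)
    # Drop the slash that only separates the host from the URL path.
    url_path = parts.path[1:] if parts.path.startswith("/") else parts.path
    # Simplification: decode the whole path at once, so %2f becomes '/'.
    decoded = unquote(url_path)
    if parts.scheme == "file":
        # File URL paths are always treated as absolute.
        return "/" + decoded
    if parts.scheme == "ftp":
        if decoded.startswith("/"):
            # A leading %2f marks a path relative to the file system root.
            return decoded
        return ftp_root.rstrip("/") + "/" + decoded
    raise ValueError("unsupported scheme: " + parts.scheme)


print(local_path("file:///etc/passwd"))
print(local_path("ftp://host/etc/passwd"))
print(local_path("ftp://host/%2fetc/passwd"))
```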
This is somewhat tricky because it makes it more difficult to express "relative
paths" in a uniform way, something that seems not to have been considered much
in the RFCs.
It is difficult to make a change to this parsing code after CoG has existed for
so long. Does anyone know the conventions or assumptions most users make? We
could overload the parsing functions so that a new method includes a flag
specifying the file URL parsing behavior. Would that work for your situation?
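
A rough sketch of what such a flag could look like, written in Python for brevity rather than against the actual CoG code. The function name file_url_to_path and the flag absolute_paths are hypothetical, and the default branch is modeled only on the behavior described in this report.

```python
def file_url_to_path(url, absolute_paths=False):
    """Convert a file:// URL to a path (hypothetical helper).

    absolute_paths=False keeps the behavior described in this report:
    whatever follows the host separator is used as-is, so three slashes
    give a relative path and four give an absolute one.
    absolute_paths=True treats every file URL path as absolute.
    """
    prefix = "file://"
    if not url.startswith(prefix):
        raise ValueError("expected a URL starting with file://")
    # The host is parsed but ignored in this sketch.
    host, sep, rest = url[len(prefix):].partition("/")
    if not sep:
        raise ValueError("missing '/' after the host part")
    if absolute_paths:
        return "/" + rest.lstrip("/")
    return rest


print(file_url_to_path("file:///path/file"))
print(file_url_to_path("file:////path/file"))
print(file_url_to_path("file:///path/file", absolute_paths=True))
```

Keeping the existing behavior as the default would leave current callers unaffected, while clients such as srmcp could opt in to the other interpretation.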