This is the mail archive of the
gdb-patches@sourceware.org
mailing list for the GDB project.
Re: Better realpath
On Wednesday 18 June 2008 22:54:13 Eli Zaretskii wrote:
> > The first question is about checking for file existence. Assuming we don't want
> > this check, we basically get to rewrite gdb_realpath from scratch, making it
> > operate on a purely syntactic basis.
>
> I don't think such radical measures would be necessary. We could
> either (a) use canonicalize_file_name, which doesn't check for
> existence,
Hmm, the documentation at
http://www.gnu.org/software/libc/manual/html_mono/libc.html
says:
Function: char * canonicalize_file_name (const char *name)
If any of the path components is missing the function returns a NULL pointer.
....
Function: char * realpath (const char *restrict name, char *restrict resolved)
A call to realpath where the resolved parameter is NULL behaves exactly like
canonicalize_file_name. The function allocates a buffer for the file name and
returns a pointer to it. If resolved is not NULL it points to a buffer into
which the result is copied. It is the callers responsibility to allocate a
buffer which is large enough. On systems which define PATH_MAX this means
the buffer must be large enough for a pathname of this size. For systems
without limitations on the pathname length the requirement cannot be met
and programs should not call realpath with anything but NULL for the
second parameter.
One other difference is that the buffer resolved (if nonzero) will contain the
part of the path component which does not exist or is not readable if the
function returns NULL and errno is set to EACCES or ENOENT.
From that, I don't quite understand why you think canonicalize_file_name does not
check for file existence. Is the documentation in error?
> or (b) use realpath on the argument's leading directories
> (i.e. call `dirname' to remove the last portion of the file name). Am
> I missing something?
And this will check dirname existence? These semantics are mid-way between checking
everything for existence and not checking anything. Is this really intuitive and
desirable?
> > Second is down-casing. If we don't want brute down-casing, and we want truly canonical
> > names of paths, then "C:/documents and settings" should become "C:/Documents and Settings",
> > and that requires actually poking at the file system to see what exact spelling is stored.
>
> No, that's not necessary either. All you need is run the result of
> GetFullPathName through GetLongPathName: if it fails, it means the
> file does not exist, and you need to return it in whatever letter-case
> it was passed to us; if it succeeds, it will return the file name as
> it's recorded in the filesystem.
That does not contradict what I say -- it *does* require poking at the file system,
so if some component of path is missing, you get no canonical representation.
The approach of returning paths that don't exist unmodified seems risky, in particular...
> For example, calling GetLongPathName
> with either "C:/documents and settings" or "C:/DOCUME~1" will return
> "C:\Documents and Settings".
... what will be the return value of GetLongPathName on "C:/DOCUME~1/nonexistent/nonexistent2"
and "C:/documents and settings/nonexistent/nonexistent2"? Presumably, GetLongPathName will
fail in both cases, and GDB will think those paths are unequal.
Also, GetLongPathName is not available on Windows 95. Is that an issue?
> > So, the approaches are:
> >
> > 1. Make lrealpath always check for file existence, and:
> > - Either revise the Windows case to get spelling from the filesystem, or
> > - Add a flag "I don't care about case differences, don't downcase"
> >
> > 2. Write another function that does purely syntactic normalization of paths.
> > It will not change case of paths on windows, naturally.
> >
> > Which of those approaches would you:
> >
> > - Be willing to accept?
> > - Be willing to hack on?
>
> I hope I answered those questions now. If not, please tell.
No, I don't think I have found clear answers to either of those questions. Your comments
suggested another approach, so here's the updated list of alternatives:
1. Make lrealpath always check for file existence. On Windows, make it use
GetLongPathName, instead of lowercasing, to get canonical path name.
2. Make lrealpath check for dirname existence only. The filename part will have to
be downcased on Windows.
3. Implement syntax-only simplification function; it won't be able to use the exact
spelling as stored in filesystem on Windows, and it won't be able to handle short
names.
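
For reference, alternative (3) could be sketched as below. This is an
illustration only, for POSIX-style paths: normalize_path_syntax is a
hypothetical name, it ignores drive letters, backslashes and letter case, and
removing ".." lexically gives the wrong answer when the preceding component is
a symlink.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: simplify PATH without touching the file system.
   Collapses repeated slashes, drops "." components and resolves ".."
   lexically.  Returns a malloc'ed string, or NULL if out of memory.  */
static char *
normalize_path_syntax (const char *path)
{
  size_t out = 0, root, start, len;
  const char *p, *end;
  char *res;
  int absolute, last_is_dotdot;

  assert (path != NULL);
  absolute = path[0] == '/';
  root = absolute ? 1 : 0;
  res = malloc (strlen (path) + 2);     /* extra byte for a lone "." */
  if (res == NULL)
    return NULL;
  if (absolute)
    res[out++] = '/';

  for (p = path; *p != '\0'; p = end)
    {
      while (*p == '/')                 /* skip separators */
        p++;
      for (end = p; *end != '\0' && *end != '/'; end++)
        ;
      len = (size_t) (end - p);
      if (len == 0 || (len == 1 && p[0] == '.'))
        continue;                       /* nothing to add */

      if (len == 2 && p[0] == '.' && p[1] == '.')
        {
          /* Find where the last component already emitted starts.  */
          for (start = out; start > root && res[start - 1] != '/'; start--)
            ;
          last_is_dotdot = (out - start == 2
                            && res[start] == '.' && res[start + 1] == '.');
          if (out > root && !last_is_dotdot)
            {
              /* Drop that component and the slash before it.  */
              out = start;
              if (out > root)
                out--;
            }
          else if (!absolute)
            {
              /* A relative path keeps its leading ".." components.  */
              if (out > 0)
                res[out++] = '/';
              res[out++] = '.';
              res[out++] = '.';
            }
          /* For an absolute path, ".." at the root stays at the root.  */
          continue;
        }

      if (out > root)
        res[out++] = '/';
      memcpy (res + out, p, len);
      out += len;
    }

  if (out == 0)
    res[out++] = '.';
  res[out] = '\0';
  return res;
}
```

With this sketch "/usr//local/./lib/../bin" becomes "/usr/local/bin" whether or
not any of those directories exist.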
It appears to me that (1) is the easiest approach, minimally disturbing for
existing code, and the one that I personally can implement and defend before libiberty
maintainers. Now, which of those approaches would you:
- Be willing to accept?
- Be willing to hack on, and push in libiberty?
- Volodya