Add cygwin_internal CW_GET_MODULE_PATH_FOR_ADDR

Charles Wilson
Fri Oct 14 06:09:00 GMT 2011

On 10/13/2011 10:42 AM, Corinna Vinschen wrote:
> On Oct 13 10:20, Charles Wilson wrote:
>>>   - It would be useful to have a Cygwin API that gives me the
>>>     file name behind one particular address in the current process.
>>>     This should not be that slow.
>> This patch is a proof of concept for the latter.  Naturally, it needs
>> additional work -- updating version.h, real changelog entries,
>> documentation somewhere, etc.  Is it worth the effort?  Is
>> something like this likely to be accepted?
> The first and foremost question is, what is the relocation support
> in libintl trying to accomplish?  Why does an internationalization
> library have to know the path of a module based on an address?
> Is that a functionality required on other POSIX systems?
> Can we discuss this on cygwin-developers first, please?  So far I doubt
> that this makes any sense on Cygwin.

Well, there are two separate but related issues:
1) the gnulib relocatable library/application support, as distinct from
2) how it's used (or even /whether/ it is used) in the official cygwin
libiconv and gettext/libintl packages.

Addressing #1 first.  There are two different cases that the relocatable
module has to address:

A) I build a "stack" of packages, all using the same prefix=/tmp/foo,
and then package them up for 'you' to install in prefix=/user/selected/.
 According to Bruno, this can easily work, without "costly" relocation,
because the (same) prefix relocation function can be used by the EXE and
all of its (relocatable) dependent DLLs. Basically, if I understand
correctly, the wrapper handles everything for the app and the dependent
dlls -- the wrapper code calls (in sequence)
lib*_set_relocation_prefix() for all the dependent libs, using the
application's install path.  Since the app and all the dlls were all
compiled with the same /tmp/foo prefix, the "relocation" regex needed by
all of them will be identical.  Problem solved.
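
A minimal sketch of case A, to make the "cheap" path concrete.  The
names libfoo_set_relocation_prefix(), libbar_set_relocation_prefix()
and wrapper_set_prefixes() are hypothetical stand-ins for whatever hooks
the real dependent DLLs and the wrapper provide, and the stored state is
only there so the effect can be observed:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical per-library state; a real relocatable library keeps its
   own private copy of the two prefixes.  */
struct reloc_prefix
{
  char orig[256];   /* compile-time prefix, e.g. /tmp/foo */
  char curr[256];   /* install-time prefix, e.g. /user/selected */
};

struct reloc_prefix foo_prefix, bar_prefix;

static void
store_prefix (struct reloc_prefix *p, const char *orig, const char *curr)
{
  assert (orig != NULL && curr != NULL);
  snprintf (p->orig, sizeof p->orig, "%s", orig);
  snprintf (p->curr, sizeof p->curr, "%s", curr);
}

/* Hypothetical stand-ins for the hooks two dependent DLLs would export.  */
void
libfoo_set_relocation_prefix (const char *orig, const char *curr)
{
  store_prefix (&foo_prefix, orig, curr);
}

void
libbar_set_relocation_prefix (const char *orig, const char *curr)
{
  store_prefix (&bar_prefix, orig, curr);
}

/* Case A: everything was built with the same prefix, so the wrapper can
   hand the same (orig, curr) pair to every dependent library in turn.  */
void
wrapper_set_prefixes (const char *orig_prefix, const char *curr_prefix)
{
  libfoo_set_relocation_prefix (orig_prefix, curr_prefix);
  libbar_set_relocation_prefix (orig_prefix, curr_prefix);
}
```

No library has to discover its own location here; the wrapper works it
out once and passes it down.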

B) I build a "stack" of packages, but each one uses its own
prefix=/tmp/reloc-Ag826u2/.  However, 'you' install them all into
prefix=/user/selected.  In this case, the EXE's relocation function
operates differently from DLL #1's relocation function, which is
different yet again from DLL #2's relocation function.  [[This is the
way I build libiconv and gettext/libintl, when I'm testing the
relocation functionality. But forget about libiconv/gettext for right now]]

In this case, the mechanisms employed by the various relocate()
functions ... relocate() [in the app/wrapper], libfoo_relocate(),
libbar_relocate(), etc ... are pretty complicated and all differ from
each other.  This is the "expensive" relocation.  Until Bruno's most
recent change, there was no distinction between "cheap" and "expensive"
-- if *_relocate() was present, it did the "expensive" relocation
operation -- even if all you needed was the "cheap" version.

[[ SIDE NOTE: even though our official libintl is built without
--enable-relocatable, its Makefile is *hardcoded* to define
ENABLE_RELOCATABLE.  Therefore the libintl_relocate() function IS
present, and IS called -- but because the compile-time prefix (/usr) was
the same as the installed prefix, it very expensively did a no-op.  With
Bruno's recent change, it will do a cheap -- but not entirely free --
no-op. ]]

Now, how the "expensive" relocation works: the "relocate()" function is
re-#defined to ${PKGNAME}_relocate (e.g. libcharset_relocate) -- but
it's still the same source code.  It calls (on linux, and
cygwin-with-expensive-enabled) a static function
find_shared_library_fullname().  So, each (relocatable) module contains
a separate copy of this static function.  The code in that function does

  fp = fopen ("/proc/self/maps", "r");
  if (fp)
    {
      unsigned long address = (unsigned long)
      ## parse fp looking for the vm segment in which 'address' is
      ## located.  Grab the pathname from the entry in fp for that
      ## vm segment
      fclose (fp);
    }

Then, that string is returned to, e.g. libiconv_relocate(), where it
can be compared against the compile-time "expected" installation location:
  runtime_path:                      /usr/bin/cygiconv-2.dll
  compile_path:   /tmp/reloc-iconv-261h2i/bin/cygiconv-2.dll
and the proper "replacement" regex [*] can be constructed.  Now, the
original path that it was trying to "relocate" can be...relocated:

     ---> process using regex -->

[*] not really a regex, just a replacement offset
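
To illustrate the replacement-offset idea, here is a rough sketch; it is
not the gnulib code.  The function names compute_curr_prefix() and
relocate_path() are made up for this example, and it assumes prefixes
are given without a trailing slash.  The run-time prefix is whatever is
left of the run-time module path once the tail it shares with the
compile-time path is stripped:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Derive the run-time prefix from the module's run-time path and its
   compile-time path.  Returns 0 on success, -1 if the paths do not line
   up or 'out' is too small.  */
int
compute_curr_prefix (const char *runtime_path, const char *compile_path,
                     const char *orig_prefix, char *out, size_t out_size)
{
  size_t orig_len = strlen (orig_prefix);
  size_t runtime_len = strlen (runtime_path);
  size_t tail_len, prefix_len;

  assert (out != NULL && out_size > 0);

  /* compile_path must be orig_prefix followed by "/something".  */
  if (strncmp (compile_path, orig_prefix, orig_len) != 0
      || compile_path[orig_len] != '/')
    return -1;
  tail_len = strlen (compile_path) - orig_len;

  /* The run-time path must end in that same tail.  */
  if (runtime_len < tail_len
      || strcmp (runtime_path + runtime_len - tail_len,
                 compile_path + orig_len) != 0)
    return -1;

  prefix_len = runtime_len - tail_len;
  if (prefix_len >= out_size)
    return -1;
  memcpy (out, runtime_path, prefix_len);
  out[prefix_len] = '\0';
  return 0;
}

/* Rewrite 'path' if it lies under orig_prefix; copy it unchanged
   otherwise.  Returns 0 on success, -1 if 'out' is too small.  */
int
relocate_path (const char *path, const char *orig_prefix,
               const char *curr_prefix, char *out, size_t out_size)
{
  size_t orig_len = strlen (orig_prefix);
  int n;

  if (strncmp (path, orig_prefix, orig_len) == 0
      && (path[orig_len] == '/' || path[orig_len] == '\0'))
    n = snprintf (out, out_size, "%s%s", curr_prefix, path + orig_len);
  else
    n = snprintf (out, out_size, "%s", path);
  return (n >= 0 && (size_t) n < out_size) ? 0 : -1;
}
```

With the two cygiconv-2.dll paths above and an original prefix of
/tmp/reloc-iconv-261h2i, compute_curr_prefix() yields /usr, and a path
such as /tmp/reloc-iconv-261h2i/share/locale (an illustrative path, not
taken from the real packages) would then become /usr/share/locale.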

But the key bit is that the libfoo_relocate() function needs to be able
to determine the current runtime path of
the-module-which-contains-'libfoo_relocate()' (well, technically, the
module which contains a specific static (private) copy of

Bruno and others have pointed out that parsing /proc/self/maps (well,
probably *generating* it in the first place) is very slow on cygwin:
10-20 times slower than linux.  It would be a lot faster -- when doing
"expensive" relocation -- if we could just get the data for one specific
entry IN /proc/self/maps without having to generate the whole file.
That's what my patch does.
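
For reference, the per-entry matching that the /proc/self/maps approach
has to do can be sketched roughly as follows.  This is neither the
gnulib code nor the patch; the function names are made up for the
example, and it assumes the usual "start-end perms offset dev inode
pathname" layout of each line:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* If 'line' (one maps entry) covers 'address' and names a file, copy
   the pathname into 'out' and return 1; otherwise return 0.  */
int
maps_line_path_for_addr (const char *line, unsigned long address,
                         char *out, size_t out_size)
{
  unsigned long start, end;
  int path_pos = -1;
  size_t len;

  assert (line != NULL && out != NULL);

  /* Skip perms, offset, dev and inode; %n records where the pathname
     (if any) begins.  */
  if (sscanf (line, "%lx-%lx %*s %*s %*s %*s %n",
              &start, &end, &path_pos) < 2
      || path_pos < 0)
    return 0;
  if (address < start || address >= end)
    return 0;

  /* The pathname runs to the end of the line; anonymous mappings have
     none.  */
  len = strcspn (line + path_pos, "\n");
  if (len == 0 || len >= out_size)
    return 0;
  memcpy (out, line + path_pos, len);
  out[len] = '\0';
  return 1;
}

/* Scan an already opened maps-style stream for the entry containing
   'address'.  Returns 1 and fills 'out' on a match, 0 otherwise.  */
int
find_path_for_addr (FILE *fp, unsigned long address,
                    char *out, size_t out_size)
{
  char line[4096];

  while (fgets (line, sizeof line, fp))
    if (maps_line_path_for_addr (line, address, out, out_size))
      return 1;
  return 0;
}
```

The matching itself is cheap; the cost on cygwin is in producing every
line of the file just so that one of them can be picked out.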

Addressing #2 -- you're right: the libintl and libiconv packages on
cygwin are not intended to be "relocatable" -- so Bruno's approach of
just short-circuiting all the "expensive" stuff in the existing
libintl_relocate() functions exported by those DLLs is fine.  [We
can't completely eliminate those functions because of API stability
issues: Bruno wants the same exports in the DLLs regardless of whether
they were built using --enable-relocatable or not; so we just make them
as 'cheap' as possible when there is no real need for the hard stuff.]

But...I assume that libiconv/libintl are not the only packages that will
ever use gnulib's relocatable support.  Some of those might want/need
the "expensive" relocation. It would be nice if it were not painfully slow.

