This is the mail archive of the cygwin mailing list for the Cygwin project.


Re: Clang is using the wrong memory model

Thanks a lot for your help in clarifying this.

When I complained here about the wasteful 64-bit addresses you said that it was an LLVM issue. When I complained to LLVM they said it was a Cygwin issue, and that you were using the wrong memory model.

All this confusion is due to a terrible lack of documentation of everything.
I had to do a lot of reverse engineering to figure out what is happening. What I have found out so far is listed below. Much of this is undocumented. Obviously, I would like to know if any of this is wrong, or if specific documentation is available other than the SysV ABI and the Windows ABI:

* Cygwin is using its own loader which is different from the Windows loader.
* The Cygwin loader emulates the behavior of Linux shared objects. This includes the ability to directly access a variable inside a DLL.
* Access to a variable in a different DLL requires a 64-bit address. This is obtained by using the medium memory model with a gcc or Clang compiler.
* The small memory model works differently on different targets. -mcmodel=small with a Linux target puts everything below 2GB addresses. 32-bit absolute addresses are allowed. -mcmodel=small with a Windows or Mac target allows addresses above 2GB, but limits the distance between code and data in the same executable to 2GB. 32-bit absolute addresses are not allowed. 32-bit relative addresses are used instead.
* The memory models work differently in gcc and Clang. Gcc with a medium or large memory model is using 64-bit address tables to access a variable in a different C/CPP file. Clang with a medium or large memory model is using 64-bit addresses not only for external variables, but also for local static data. This includes floating point constants, string constants, array initializers, jump tables, global variables, and more.
* Cygwin uses a medium memory model by default. The medium memory model is necessary only for a program that makes direct access to a variable in a different DLL. The medium memory model is wasteful, and more so with Clang than with gcc.
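
To make the 2GB limit of 32-bit relative addressing concrete, here is a minimal sketch in C. It assumes that the displacement is the target address minus the address of the following instruction, as in x86-64 RIP-relative addressing. The function names and the example addresses are hypothetical and chosen only for illustration; they are not taken from gcc, Clang, or Cygwin.

```c
#include <stdbool.h>
#include <stdint.h>

/* Displacement a relative reference would have to encode: the target
   address minus the address of the next instruction (modulo 2^64). */
static uint64_t rel_displacement(uint64_t next_insn_addr, uint64_t target_addr)
{
    return target_addr - next_insn_addr;
}

/* True if the displacement, read as a signed 64-bit value, lies in
   [-2^31, 2^31 - 1] and can therefore be stored in a 32-bit field. */
static bool fits_in_rel32(uint64_t disp)
{
    return disp <= UINT64_C(0x7fffffff)
        || disp >= UINT64_C(0xffffffff80000000);
}

/* Hypothetical case: code and data in the same image, 28 KB apart. */
static bool same_image_example(void)
{
    return fits_in_rel32(rel_displacement(UINT64_C(0x100401000),
                                          UINT64_C(0x100408000)));
}

/* Hypothetical case: data in a DLL mapped many gigabytes away. */
static bool far_dll_example(void)
{
    return fits_in_rel32(rel_displacement(UINT64_C(0x100401000),
                                          UINT64_C(0x400000000)));
}
```

With these made-up addresses, same_image_example() reports that the reference fits, while far_dll_example() reports that it does not. The second case is the one where a 64-bit address is needed.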

Now I am speculating about what we can do to avoid the wasteful 64-bit address-load instructions and improve the performance of Cygwin programs.

We can improve performance by using the small memory model when possible. The medium memory model is needed only for programs that link to a variable in a different DLL. The DLL that contains the link target does not need the medium memory model.

Direct access to a variable in a different DLL is considered bad programming practice by modern standards. This should occur only in old Linux code.

A link to a variable in a different DLL may be replaced by function calls (this is done with errno). In some cases, static linking can be an efficient alternative.
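
A minimal sketch in C of replacing an exported variable with a function call, similar in spirit to errno being a macro that expands to a function call. All names here (mylib_status and friends) are hypothetical, and the sketch ignores thread-local storage, which a real errno-like variable would need.

```c
/* Library side: the variable itself stays private to the DLL.
   (Hypothetical name; not part of any real library.) */
static int mylib_status_storage = 0;

/* Exported accessor. Callers import this function instead of the
   variable, so no data reference crosses the DLL boundary. */
int *mylib_status_location(void)
{
    return &mylib_status_storage;
}

/* Header side: existing code can keep writing `mylib_status` as if
   it were a plain variable, for both reads and writes. */
#define mylib_status (*mylib_status_location())
```

Client code such as `mylib_status = 7;` then compiles to a call to mylib_status_location() followed by a store through the returned pointer. The cost is one function call per access.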

It would be helpful if the Cygwin loader could print the name of the offending variable when relocation fails with the small memory model. This could help programmers remove any obstacles to using the more efficient small memory model.
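
As a rough sketch of what such a message could look like, the C fragment below formats a diagnostic that includes a symbol name. This is not Cygwin's actual loader code: the struct, the function name, and the assumption that the loader knows the symbol name at this point are all hypothetical. The wording of the message is modeled on the existing runtime error.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical record of a failed relocation. The real loader's data
   structures may look entirely different. */
struct reloc_failure {
    const char *symbol;   /* referenced variable, or NULL if unknown */
    uint64_t offset;      /* offset that did not fit into 32 bits */
    uint64_t address;     /* address of the relocation */
};

/* Write the diagnostic into buf. Returns what snprintf returns: the
   number of characters the full message needs, excluding the NUL. */
static int format_reloc_failure(char *buf, size_t size,
                                const struct reloc_failure *f)
{
    return snprintf(buf, size,
                    "Invalid relocation for '%s'.  Offset 0x%" PRIx64
                    " at address 0x%" PRIx64 " doesn't fit into 32 bits",
                    f->symbol ? f->symbol : "(unknown)",
                    f->offset, f->address);
}
```

A programmer who sees the variable name in the message can then decide whether to replace the direct access with a function call or to link statically.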


On 17/08/2019 10.16, Corinna Vinschen wrote:
On Aug 17 07:31, Agner Fog wrote:
So errno was a bad example but you can try accessing e.g. __ctype_ptr__,
__progname, optarg, h_errno, or use FE_DFL_ENV from another DLL, just
for kicks.
__ctype_ptr__ is a function

h_errno works like errno with an imported function

FE_DFL_ENV is a macro

__progname and optarg are local variables to each exe or dll
That would contradict what, e.g., __progname is for.  Here's a test:

$ cat > dll.c <<EOF
#include <stdio.h>

extern char *__progname;

void
printprog ()
{
  printf ("progname: %s\n", __progname);
}
EOF
$ cat > main.c <<EOF
extern void printprog();

int
main ()
{
  printprog ();
  return 0;
}
EOF
$ uname -a
CYGWIN_NT-10.0 vmbert10 3.1.0(0.340/5/3) 2019-08-16 14:36 x86_64 Cygwin

Let's try the medium model first:

   $ gcc -g -shared -mcmodel=medium -o dll.dll dll.c
   $ gcc -g -mcmodel=medium -o main main.c dll.dll
   $ ./main
   progname: main

Now let's try the small model:

   $ gcc -g -shared -mcmodel=small -o dll.dll dll.c
   $ gcc -g -mcmodel=small -o main main.c dll.dll
   $ ./main
   Cygwin runtime failure: /home/corinna/main.exe: Invalid relocation.  Offset
   0xfffffffd80348989 at address 0x40000103b doesn't fit into 32 bits

Now let's try without explicit mcmodel on the CLI:

   $ gcc -g -shared -o dll.dll dll.c
   $ gcc -g -o main main.c dll.dll
   $ ./main
   progname: main

gcc is using the small memory model by default in Cygwin64, and it works.
No, it's not, see above.

clang is using the small memory model by default when cross-compiling for a Cygwin64 target from Linux, and it works. *your* example code.

