
25.3 Writing A Cygwin Friendly Package

You can use the Cygwin support offered by GNU Autotools in two ways: by writing your own package with an eye towards having it compile nicely on both Unix and Windows, or by tweaking the configuration of existing packages which use GNU Autotools but which do not compile under Cygwin, or do not behave quite right after compilation. There are several things you need to be aware of in order to design a package to work seamlessly under Cygwin, and several more if portability to DOS and (non-Cygwin) Windows is important too. We discussed many of these issues in Unix/Windows Issues. In this section, we will expand on those issues with ways in which GNU Autotools can help deal with them.

If you only need to build executables and static libraries, then Cygwin provides an environment close enough to Unix that any packages which ship with a relatively recent configuration will compile pretty much out of the box, except for a few peculiarities of Windows which are discussed throughout the rest of this section. If you want to build a package which has not been maintained for a while, and which consequently uses an old Autoconf, then it is usually just a matter of removing the generated files, rebootstrapping the package with the installed (up to date!) Autoconf, and rerunning the ‘configure’ script. On occasion some tweaks will be needed in the ‘configure.in’ to satisfy the newer autoconf, but autoconf will almost always diagnose these for you while it is being run.



25.3.1 Text vs Binary Modes

As discussed in Text and Binary Files, text and binary files are different on Windows. Lines in a Windows text file end in a carriage return/line feed pair, but a C program reading the file in text mode will see a single line feed.
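
To make the difference concrete, here is a minimal sketch that counts the characters a C program sees when it reads a file in a given mode. The helper name count_chars_in_mode is ours and purely illustrative. On a system that distinguishes the two modes, a file holding the six bytes of "a\r\nb\r\n" would be expected to yield 6 in binary mode and 4 in text mode; on Unix both counts are the same.

```c
#include <assert.h>
#include <stdio.h>

/* Count the characters read from PATH when it is opened with MODE
   (for example "r" or "rb").  Returns -1 if the file cannot be
   opened.  The function name is illustrative, not a library call. */
long
count_chars_in_mode (const char *path, const char *mode)
{
  FILE *fp;
  long count = 0;

  assert (path != NULL && mode != NULL);

  fp = fopen (path, mode);
  if (fp == NULL)
    return -1;

  while (getc (fp) != EOF)
    ++count;

  fclose (fp);
  return count;
}
```

Comparing the result for "r" and "rb" on the same file is a quick way to see whether the host translates line endings.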

Cygwin has several ways to hide this dichotomy, and the solution(s) you choose will depend on how you plan to use your program. I will outline the relative tradeoffs you make with each choice:

mounting

Before installing an operating system to your hard drive, you must first organise the disk into partitions. Under Windows, you might only have a single partition on the disk, which would be called ‘C:’(63). Provided that some media is present, Windows allows you to access the contents of any drive letter – that is, you can access ‘A:’ when there is a floppy disk in the drive, and ‘F:’ provided you divided your available drives into sufficient partitions for that letter to be in use. With Unix, things are somewhat different: hard disks are still divided into partitions (typically several), but there is only a single filesystem mounted under the root directory. You can use the mount command to hook a partition (or floppy drive or CD-ROM, etc.) into a subdirectory of the root filesystem:

 
$ mount /dev/fd0 /mnt/floppy
$ cd /mnt/floppy

Until the directory is unmounted, the contents of the floppy disk will be available as part of the single Unix filesystem in the directory, ‘/mnt/floppy’. This is in contrast with Windows’ multiple root directories which can be accessed by changing filesystem root – to access the contents of a floppy disk:

 
C:\WINDOWS\> A:
A:> DIR
...

Cygwin has a mounting facility to allow Cygwin applications to see a single unified file system starting at the root directory, by mounting drive letters to subdirectories. When mounting a directory you can set a flag to determine whether files in that partition are treated as TEXT or BINARY mode files. Mounting a file system to treat TEXT files the same as BINARY files means that Cygwin programs can behave in the same way as they might on Unix and treat all files as equal. Mounting a file system to treat TEXT files properly will cause Cygwin programs to translate between Windows CR-LF line end sequences and Unix LF line endings, which plays havoc with file seeking, and with the many programs which make assumptions about the size of a char in a FILE stream. ‘binmode’ is the default method because it is the only way to interoperate between Windows binaries and Cygwin binaries. You can get a list of which drive letters are mounted to which directories, and the modes they are mounted with, by running the mount command without arguments:

 
BASH.EXE-2.04$ mount
Device              Directory            Type        flags
C:\cygwin           /                    user        binmode
C:\cygwin\bin       /usr/bin             user        binmode
C:\cygwin\lib       /usr/lib             user        binmode
D:\home             /home                user        binmode

As you can see, the Cygwin mount command allows you to ‘mount’ arbitrary Windows directories as well as simple drive letters into the single filesystem seen by Cygwin applications.

binmode

The CYGWIN environment variable holds a space separated list of setup options which exert some minor control over the way the ‘cygwin1.dll’ (or ‘cygwinb19.dll’ etc.) behaves. One such option is the ‘binmode’ setting; if CYGWIN contains the ‘binmode’ option, files which are opened through ‘cygwin1.dll’ without an explicit text or binary mode, will default to binary mode which is closest to how Unix behaves.
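
The format of the variable is simple enough to sketch in C. ‘cygwin1.dll’ interprets CYGWIN itself, so programs seldom need to; the hypothetical helper below (cygwin_option_set_p is our name, not a Cygwin function) is only meant to illustrate the space separated layout, matching whole words so that a longer word which merely contains the option does not count.

```c
#include <assert.h>
#include <string.h>

/* Return 1 if OPTION appears as a complete, space separated word in
   VALUE (the contents of the CYGWIN environment variable), else 0.
   A NULL VALUE means the variable is unset.  Hypothetical helper. */
int
cygwin_option_set_p (const char *value, const char *option)
{
  size_t len;
  const char *p;

  assert (option != NULL && *option != '\0');

  if (value == NULL)
    return 0;

  len = strlen (option);
  p = value;
  while (*p)
    {
      const char *end;

      while (*p == ' ')                 /* skip separators */
        ++p;
      end = p;
      while (*end && *end != ' ')       /* find the end of this word */
        ++end;
      if ((size_t) (end - p) == len && strncmp (p, option, len) == 0)
        return 1;
      p = end;
    }
  return 0;
}
```

A caller would typically pass the result of getenv ("CYGWIN") as the first argument.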

system calls

‘cygwin1.dll’, GNU libc and other modern C API implementations accept extra flags for fopen and open calls to determine in which mode a file is opened. On Unix it makes no difference, and sadly most Unix programmers are not aware of this subtlety, so this tends to be the first thing that needs to be fixed when porting a Unix program to Cygwin. The best way to use these calls portably is to use the following macros with a package’s ‘configure.in’ to be sure that the extra arguments are available:

 
# _AB_AC_FUNC_FOPEN(b | t, USE_FOPEN_BINARY | USE_FOPEN_TEXT)
# -----------------------------------------------------------
define([_AB_AC_FUNC_FOPEN],
[AC_CACHE_CHECK([whether fopen accepts "$1" mode], [ab_cv_func_fopen_$1],
[AC_TRY_RUN([#include <stdio.h>
int
main ()
{
   FILE *fp = fopen ("conftest.bin", "w$1");
   fprintf (fp, "\n");
   fclose (fp);
   return 0;
}],
            [ab_cv_func_fopen_$1=yes],
            [ab_cv_func_fopen_$1=no],
            [ab_cv_func_fopen_$1=no])])
if test x$ab_cv_func_fopen_$1 = xyes; then
  AC_DEFINE([$2], 1,
            [Define this if we can use the "$1" mode for fopen safely.])
fi[]dnl
])# _AB_AC_FUNC_FOPEN

# AB_AC_FUNC_FOPEN_BINARY
# -----------------------
# Test whether fopen accepts a "b" in the mode string for binary file
# opening.  This makes no difference on most unices, but some OSes
# convert every newline written to a file to two bytes (CR LF), and
# every CR LF read from a file is silently converted to a newline.
AC_DEFUN([AB_AC_FUNC_FOPEN_BINARY], [_AB_AC_FUNC_FOPEN(b, USE_FOPEN_BINARY)])

# AB_AC_FUNC_FOPEN_TEXT
# ---------------------
# Test whether fopen accepts a "t" in the mode string for text file
# opening.  This makes no difference on most unices, but other OSes
# use it to assert that every newline written to a file writes two
# bytes (CR LF), and every CR LF read from a file is silently
# converted to a newline.
AC_DEFUN([AB_AC_FUNC_FOPEN_TEXT],   [_AB_AC_FUNC_FOPEN(t, USE_FOPEN_TEXT)])


# _AB_AC_FUNC_OPEN(O_BINARY|O_TEXT)
# ---------------------------------
AC_DEFUN([_AB_AC_FUNC_OPEN],
[AC_CACHE_CHECK([whether fcntl.h defines $1], [ab_cv_header_fcntl_h_$1],
[AC_EGREP_CPP([$1],
              [#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
$1
],
              [ab_cv_header_fcntl_h_$1=no],
              [ab_cv_header_fcntl_h_$1=yes])
if test "x$ab_cv_header_fcntl_h_$1" = xno; then
  AC_EGREP_CPP([_$1],
               [#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
_$1
],
                [ab_cv_header_fcntl_h_$1=0],
                [ab_cv_header_fcntl_h_$1=_$1])
fi])
if test "x$ab_cv_header_fcntl_h_$1" != xyes; then
  AC_DEFINE_UNQUOTED([$1], [$ab_cv_header_fcntl_h_$1],
    [Define this to a usable value if the system provides none])
fi[]dnl
])# _AB_AC_FUNC_OPEN


# AB_AC_FUNC_OPEN_BINARY
# ----------------------
# Test whether open accepts O_BINARY among its flags for binary
# file opening.  This makes no difference on most unices, but some
# OSes convert every newline written to a file to two bytes (CR LF),
# and every CR LF read from a file is silently converted to a newline.
#
AC_DEFUN([AB_AC_FUNC_OPEN_BINARY], [_AB_AC_FUNC_OPEN([O_BINARY])])


# AB_AC_FUNC_OPEN_TEXT
# --------------------
# Test whether open accepts O_TEXT among its flags for text file
# opening.  This makes no difference on most unices, but other OSes
# use it to assert that every newline written to a file writes two
# bytes (CR LF), and every CR LF read from a file is silently
# converted to a newline.
#
AC_DEFUN([AB_AC_FUNC_OPEN_TEXT],   [_AB_AC_FUNC_OPEN([O_TEXT])])


Add the following preprocessor code to a common header file that will be included by any sources that use fopen calls:

 
#define fopen	rpl_fopen

Save the following function to a file, and link that into your program so that in combination with the preprocessor magic above, you can always specify text or binary mode to open and fopen, and let this code take care of removing the flags on machines which do not support them:

 
#if HAVE_CONFIG_H
#  include <config.h>
#endif

#include <stdio.h>

/* Use the system size_t if it has one, or fallback to config.h */
#if STDC_HEADERS || HAVE_STDDEF_H
#  include <stddef.h>
#endif
#if HAVE_SYS_TYPES_H
#  include <sys/types.h>
#endif

/* One of the following headers will have prototypes for malloc
   and free on most systems.  If not, we don't add explicit
   prototypes which may generate a compiler warning in some
   cases -- explicit  prototypes would certainly cause
   compilation to fail with a type clash on some platforms. */
#if STDC_HEADERS || HAVE_STDLIB_H
#  include <stdlib.h>
#endif
#if HAVE_MEMORY_H
#  include <memory.h>
#endif

#if HAVE_STRING_H
#  include <string.h>
#else
#  if HAVE_STRINGS_H
#    include <strings.h>
#  endif /* !HAVE_STRINGS_H */
#endif /* !HAVE_STRING_H */

#if ! HAVE_STRCHR

/* BSD based systems have index() instead of strchr() */
#  if HAVE_INDEX
#    define strchr index
#  else /* ! HAVE_INDEX */

/* Very old C libraries have neither index() nor strchr() */
#    define strchr rpl_strchr

static inline const char *strchr (const char *str, int ch);

static inline const char *
strchr (const char *str, int ch)
{
  const char *p = str;
  while (p && *p && *p != (char) ch)
    {
      ++p;
    }

  return (*p == (char) ch) ? p : 0;
}
#  endif /* HAVE_INDEX */

#endif /* HAVE_STRCHR */

/* BSD based systems have bcopy() instead of strcpy() */
#if ! HAVE_STRCPY
# define strcpy(dest, src)        bcopy(src, dest, strlen(src) + 1)
#endif

/* Very old C libraries have no strdup(). */
#if ! HAVE_STRDUP
# define strdup(str)                strcpy(malloc(strlen(str) + 1), str)
#endif

FILE *
rpl_fopen (const char *pathname, const char *mode)
{
    FILE *result = NULL;
    const char *p = mode;

    /* Scan to the end of mode until we find 'b' or 't'. */ 
    while (*p && *p != 'b' && *p != 't')
      {
        ++p;
      }

    if (!*p)
      {
        fprintf(stderr,
            "*WARNING* rpl_fopen called without mode 'b' or 't'\n");
      }

#if USE_FOPEN_BINARY && USE_FOPEN_TEXT
    result = fopen(pathname, mode);
#else
    {
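        char ignore[3]= "bt";
        char *newmode = strdup(mode);
        char *q       = newmode;

        if (!newmode)
          {
            return NULL;
          }

        p = newmode;

#  if USE_FOPEN_TEXT
        /* Text mode works, so only 'b' needs to be removed. */
        strcpy(ignore, "b");
#  endif
#  if USE_FOPEN_BINARY
        /* Binary mode works, so only 't' needs to be removed. */
        strcpy(ignore, "t");
#  endif

        /* Copy characters from mode to newmode missing out
           b and/or t.  The end of the string is tested first,
           because strchr also matches the terminating NUL. */
        while (*p)
          {
            if (!strchr(ignore, *p))
              {
                *q++ = *p;
              }
            ++p;
          }
        *q = '\0';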

        result = fopen(pathname, newmode);

        free(newmode);
    }
#endif /* USE_FOPEN_BINARY && USE_FOPEN_TEXT */

    return result;
}
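
The filtering step at the heart of rpl_fopen can be isolated and exercised on its own. The sketch below is a hypothetical helper (strip_mode_flags is our name) that copies a mode string while leaving out any unsupported flag characters. It tests each character once and stops at the terminating NUL, which matters because strchr treats the NUL as part of the string it searches.

```c
#include <assert.h>
#include <string.h>

/* Copy MODE into OUT, leaving out every character that appears in
   IGNORE.  OUT must have room for strlen (MODE) + 1 bytes.  Returns
   OUT.  Hypothetical helper, shown only to isolate the filtering. */
char *
strip_mode_flags (const char *mode, const char *ignore, char *out)
{
  char *q = out;

  assert (mode != NULL && ignore != NULL && out != NULL);

  for (; *mode; ++mode)
    {
      if (strchr (ignore, *mode) == NULL)
        *q++ = *mode;
    }
  *q = '\0';

  return out;
}
```

With an ignore string of "bt" the mode "wb" becomes "w", while with "t" alone the mode "rb" is passed through unchanged.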

The correct operation of the file above relies on several things having been checked by the configure script, so you will also need to ensure that the following macros are present in your ‘configure.in’ before you use this code:

 
# configure.in -- Process this file with autoconf to produce configure
AC_INIT(rpl_fopen.c)

AC_PROG_CC
AC_HEADER_STDC
AC_CHECK_HEADERS(string.h strings.h, break)
AC_CHECK_HEADERS(stdlib.h stddef.h sys/types.h memory.h)

AC_C_CONST
AC_TYPE_SIZE_T

AC_CHECK_FUNCS(strchr index strcpy strdup)
AB_AC_FUNC_FOPEN_BINARY
AB_AC_FUNC_FOPEN_TEXT



25.3.2 File System Limitations

We discussed some differences between Unix and Windows file systems in File system Issues. This section expands on that discussion, covering filename differences and separator and drive letter distinctions.



25.3.2.1 8.3 Filenames

As discussed earlier, DOS file systems have severe restrictions on possible file names: they must follow an 8.3 format. See section DOS Filename Restrictions.

This is quite a severe limitation, and affects some of the inner workings of GNU Autotools in two ways. The first is handled automatically, in that if .libs isn’t a legal directory name on the host system, Libtool and Automake will use the directory _libs instead. The other is that the traditional ‘config.h.in’ file is not legal under this scheme, and it must be worked around with a little known feature of Autoconf:

 
AC_CONFIG_HEADER(config.h:config.hin)
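
To see why ‘.libs’ and ‘config.h.in’ fall foul of the restriction while ‘_libs’ and ‘config.hin’ do not, consider the following minimal sketch. It assumes the usual reading of 8.3 (a base of one to eight characters, at most one dot, and an extension of up to three characters) and checks lengths only; the function name fits_8_3_p is ours.

```c
#include <assert.h>
#include <string.h>

/* Return 1 if NAME has the shape of an 8.3 file name: a base of one
   to eight characters, optionally followed by a single dot and an
   extension of one to three characters.  Only lengths are checked,
   not the characters that DOS forbids.  Illustrative helper. */
int
fits_8_3_p (const char *name)
{
  const char *dot;
  size_t base_len, ext_len;

  assert (name != NULL);

  dot = strchr (name, '.');
  if (dot == NULL)
    {
      base_len = strlen (name);
      return base_len >= 1 && base_len <= 8;
    }

  if (strchr (dot + 1, '.') != NULL)    /* more than one dot */
    return 0;

  base_len = (size_t) (dot - name);
  ext_len = strlen (dot + 1);
  return base_len >= 1 && base_len <= 8 && ext_len >= 1 && ext_len <= 3;
}
```

Under these assumptions ‘.libs’ is rejected for its empty base and ‘config.h.in’ for its second dot.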


25.3.2.2 Separators and Drive Letters

As discussed earlier (see section Windows Separators and Drive Letters), the Windows file systems use different delimiters for separating directories and path elements than their Unix cousins. There are three places where this has an effect:

the shell command line

Up until Cygwin b20.1, it was possible to refer to drive letter prefixed paths from the shell using the ‘//c/path/to/file’ syntax to refer to the directory root at ‘C:\path\to\file’. Unfortunately, the Windows kernel confused this with its own network share notation, causing the shell to pause for a short while to look for a machine named ‘c’ in its network neighbourhood. Since release 1.0 of Cygwin, the ‘//c/path/to/file’ notation now really does refer to a machine named ‘c’ from Cygwin as well as from Windows. To refer to drive letter rooted paths on the local machine from Cygwin there is a new hybrid ‘c:/path/to/file’ notation. This notation also works in Cygwin b20, and is probably the system you should use.

On the other hand, using the new hybrid notation in shell scripts means that they won’t run on old Cygwin releases. Shell code embedded in ‘configure.in’ scripts should test whether the hybrid notation works, and use an alternate macro to translate hybrid notation to the old style if necessary.

I must confess that from the command line I now use the longer ‘/cygdrive/c/path/to/file’ notation, since <TAB> completion doesn’t yet work for the newer hybrid notation. It is important to use the new notation in shell scripts however, or they will fail on the latest releases of Cygwin.

shell scripts

For a shell script to work correctly on non-Cygwin development environments, it needs to be aware of and handle Windows path and directory separator and drive letters. The Libtool scripts use the following idiom:

 
case "$path" in
# Accept absolute paths.
[\\/]* | [A-Za-z]:[\\/]*)
  # take care of absolute paths
  insert some code here
  ;;
*)
  # what is left must be a relative path
  insert some code here
  ;;
esac

source code

When porting Unix software to Cygwin, this is much less of an issue because these differences are hidden beneath the emulation layer, and by the mount command respectively; although I have found that GCC, for example, returns a mixed mode ‘/’ and ‘\’ delimited include path which upsets Automake’s dependency tracking on occasion.

Cygwin provides convenience functions to convert back and forth between the different notations, which we call POSIX paths or path lists, and WIN32 paths or path lists:

Function: int posix_path_list_p (const char *path)

Return ‘0’, unless path is a ‘/’ and ‘:’ separated path list. The determination is rather simplistic, in that a string which contains a ‘;’ or begins with a single letter followed by a ‘:’ causes the ‘0’ return.

Function: void cygwin_win32_to_posix_path_list (const char *win32, char *posix)

Converts the ‘\’ and ‘;’ delimiters in win32, into the equivalent ‘/’ and ‘:’ delimiters while copying into the buffer at address posix. This buffer must be preallocated before calling the function.

Function: void cygwin_conv_to_posix_path (const char *path, char *posix_path)

If path is a ‘\’ delimited path, the equivalent, ‘/’ delimited path is written to the buffer at address posix_path. This buffer must be preallocated before calling the function.

Function: void cygwin_conv_to_full_posix_path (const char *path, char *posix_path)

If path is a, possibly relative, ‘\’ delimited path, the equivalent, absolute, ‘/’ delimited path is written to the buffer at address posix_path. This buffer must be preallocated before calling the function.

Function: void cygwin_posix_to_win32_path_list (const char *posix, char *win32)

Converts the ‘/’ and ‘:’ delimiters in posix, into the equivalent ‘\’ and ‘;’ delimiters while copying into the buffer at address win32. This buffer must be preallocated before calling the function.

Function: void cygwin_conv_to_win32_path (const char *path, char *win32_path)

If path is a ‘/’ delimited path, the equivalent, ‘\’ delimited path is written to the buffer at address win32_path. This buffer must be preallocated before calling the function.

Function: void cygwin_conv_to_full_win32_path (const char *path, char *win32_path)

If path is a, possibly relative, ‘/’ delimited path, the equivalent, absolute, ‘\’ delimited path is written to the buffer at address win32_path. This buffer must be preallocated before calling the function.

You can use these functions something like this:

 
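#include <stdio.h>
#include <windows.h>      /* MAX_PATH */
#include <sys/cygwin.h>   /* cygwin_conv_to_full_posix_path */

void
display_canonical_path(const char *maybe_relative_or_win32)
{
    /* The conversion functions expect a preallocated buffer. */
    char buffer[MAX_PATH];
    cygwin_conv_to_full_posix_path(maybe_relative_or_win32,
                                   buffer);
    printf("canonical path for %s:  %s\n",
           maybe_relative_or_win32, buffer);
}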
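
The simplistic test described for posix_path_list_p is easy to approximate without any Cygwin support. The sketch below follows the description given earlier (a ‘;’ anywhere, or a leading letter followed by ‘:’, means the string is not a POSIX path list). It is a portable approximation written for illustration under the name looks_like_posix_path_list, not Cygwin’s implementation.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* Return 0 if PATH contains a ';' or begins with a single letter
   followed by a ':', and 1 otherwise.  A portable approximation of
   the documented behaviour, not Cygwin's own code. */
int
looks_like_posix_path_list (const char *path)
{
  assert (path != NULL);

  if (strchr (path, ';') != NULL)
    return 0;
  if (isalpha ((unsigned char) path[0]) && path[1] == ':')
    return 0;
  return 1;
}
```

Note that, like the function it imitates, this accepts any string that does not look like a WIN32 path, so it is a heuristic rather than a validator.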

For your code to be fully portable however, you cannot rely on these Cygwin functions as they are not implemented on Unix, or even mingw or DJGPP. Instead you should add the following to a shared header, and be careful to use it when processing and building paths and path lists:

 
#if defined __CYGWIN32__ && !defined __CYGWIN__
   /* For backwards compatibility with Cygwin b19 and
      earlier, we define __CYGWIN__ here, so that
      we can rely on checking just for that macro. */
#  define __CYGWIN__  __CYGWIN32__
#endif
#if defined _WIN32 && !defined __CYGWIN__
   /* Use Windows separators on all _WIN32 defining
      environments, except Cygwin. */
#  define DIR_SEPARATOR_CHAR		'\\'
#  define DIR_SEPARATOR_STR		"\\"
#  define PATH_SEPARATOR_CHAR		';'
#  define PATH_SEPARATOR_STR		";"
#endif
#ifndef DIR_SEPARATOR_CHAR
   /* Assume that not having this is an indicator that all
      are missing. */
#  define DIR_SEPARATOR_CHAR		'/'
#  define DIR_SEPARATOR_STR		"/"
#  define PATH_SEPARATOR_CHAR		':'
#  define PATH_SEPARATOR_STR		":"
#endif /* !DIR_SEPARATOR_CHAR */

With this in place we can use the macros defined above to write code which will compile and work just about anywhere:

 
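char path[MAXBUFLEN];
FILE *file;

/* Build "<sep>tmp<sep><foo>" using the host's directory separator. */
snprintf(path, MAXBUFLEN, "%ctmp%c%s",
         DIR_SEPARATOR_CHAR, DIR_SEPARATOR_CHAR, foo);
/* The mode string must begin with 'r', 'w' or 'a'; the 't' asks for
   text mode where the host supports it. */
file = fopen(path, "w+t");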
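
The test for absolute paths that Libtool performs in shell has a direct counterpart in C, and the separator macros make it straightforward to join path components. The following is a minimal sketch; absolute_path_p and join_path are hypothetical names, and the separator definition is repeated in reduced form only so that the example stands alone.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>

/* Reduced copy of the shared header so the sketch stands alone. */
#if defined _WIN32 && !defined __CYGWIN__
#  define DIR_SEPARATOR_STR "\\"
#else
#  define DIR_SEPARATOR_STR "/"
#endif

/* Return 1 if PATH is absolute: it starts with '/' or '\', or with a
   drive letter and ':' followed by '/' or '\'.  Illustrative helper. */
int
absolute_path_p (const char *path)
{
  assert (path != NULL);

  if (path[0] == '/' || path[0] == '\\')
    return 1;
  return isalpha ((unsigned char) path[0]) && path[1] == ':'
         && (path[2] == '/' || path[2] == '\\');
}

/* Write DIR, a separator and FILE into BUF (of SIZE bytes).  Returns
   0 on success, or -1 if the result did not fit.  Illustrative. */
int
join_path (char *buf, size_t size, const char *dir, const char *file)
{
  int n;

  assert (buf != NULL && dir != NULL && file != NULL);

  n = snprintf (buf, size, "%s%s%s", dir, DIR_SEPARATOR_STR, file);
  return (n >= 0 && (size_t) n < size) ? 0 : -1;
}
```

Checking the return value of join_path guards against silently truncated paths, which snprintf alone would not report as an error.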


25.3.3 Executable Filename Extensions

As I already noted in Package Installation, the fact that Windows requires that all program files be named with the extension ‘.exe’, is the cause of several inconsistencies in package behaviour between Windows and Unix.

For example, where Libtool is involved, if a package builds an executable which is linked against an as yet uninstalled library, libtool puts the real executable in the ‘.libs’ (or ‘_libs’) subdirectory, and writes a shell script to the original destination of the executable(64), which ensures the runtime library search paths are adjusted to find the correct (uninstalled) libraries that it depends upon. On Windows, only a PE-COFF executable is allowed to bear the ‘.exe’ extension, so the wrapper script has to be named differently to the executable it is substituted for (i.e. the script is only executed correctly by the operating system if it does not have an ‘.exe’ extension). The result of this confusion is that the ‘Makefile’ can’t see some of the executables it builds with Libtool because the generated rules assume an ‘.exe’ extension will be in evidence. This problem will be addressed in some future revision of Automake and Libtool. In the meantime, it is sometimes necessary to move the executables from the ‘.libs’ directory to their install destination by hand. The continual rebuilding of wrapped executables at each invocation of make is another symptom of using wrapper scripts with a different name to the executable which they represent.

It is very important to correctly add the ‘.exe’ extension to program file names in your ‘Makefile.am’, otherwise many of the generated rules will not work correctly while they await a file without the ‘.exe’ extension. Fortunately, Automake will do this for you wherever it is able to tell that a file is a program – everything listed in ‘bin_PROGRAMS’ for example. Occasionally you will find cases where there is no way for Automake to be sure of this, in which case you must be sure to add the ‘$(EXEEXT)’ suffix. By structuring your ‘Makefile.am’ carefully, this can be avoided in the majority of cases:

 
TESTS = $(check_SCRIPTS) script-test bin1-test$(EXEEXT)

could be rewritten as:

 
check_PROGRAMS = bin1-test
TESTS = $(check_SCRIPTS) script-test $(check_PROGRAMS)

The value of ‘EXEEXT’ is always set correctly with respect to the host machine if you use Libtool in your project. If you don’t use Libtool, you must manually call the Autoconf macro ‘AC_EXEEXT’ in your ‘configure.in’ to make sure that it is initialised correctly. If you don’t call this macro (either directly or implicitly with ‘AC_PROG_LIBTOOL’), your project will almost certainly not build correctly on Cygwin.



This document was generated by Ben Elliston on July 10, 2015 using texi2html 1.82.