An application which most developers try their hand at sooner or later is a Unix shell. There is a lot of functionality common to all traditional command line shells, which I thought I would push into a portable library to get you over the first hurdle when that moment is upon you. Before elaborating on any of this I need to name the project. I’ve called it sic, from the Latin for ‘so it is’, because like all good project names it is somewhat pretentious and it lends itself to the recursive acronym ‘sic is cumulative’.
The gory detail of the minutiae of the source is beyond the scope of this book, but to convey a feel for the need for Sic, some of the goals which influenced the design follow:
As I explained in Project Directory Structure, I’ll first create the project directories, a toplevel directory and a subdirectory to put the library sources into. I want to install the library header files to ‘/usr/local/include/sic’, so the library subdirectory must be named appropriately. See section C Header Files.
    $ mkdir sic
    $ mkdir sic/sic
    $ cd sic/sic
I will describe the files I add in this section in more detail than the project specific sources, because they comprise an infrastructure that I use relatively unchanged for all of my GNU Autotools projects. You could keep an archive of these files, and use them as a starting point each time you begin a new project of your own.
A good place to start with any project design is the error management facility. In Sic I will use a simple group of functions to display simple error messages. Here is ‘sic/error.h’:
    #ifndef SIC_ERROR_H
    #define SIC_ERROR_H 1

    #include <sic/common.h>

    BEGIN_C_DECLS

    extern const char *program_name;
    extern void set_program_name (const char *argv0);

    extern void sic_warning (const char *message);
    extern void sic_error   (const char *message);
    extern void sic_fatal   (const char *message);

    END_C_DECLS

    #endif /* !SIC_ERROR_H */
This header file follows the principles set out in C Header Files.
I am storing the program_name variable in the library that uses it, so that I can be sure that the library will build on architectures that don’t allow undefined symbols in libraries.
Keeping those preprocessor macro definitions designed to aid code portability together (in a single file), is a good way to maintain the readability of the rest of the code. For this project I will put that code in ‘common.h’:
    #ifndef SIC_COMMON_H
    #define SIC_COMMON_H 1

    #if HAVE_CONFIG_H
    #  include <config.h>
    #endif

    #include <stdio.h>
    #include <sys/types.h>

    #if STDC_HEADERS
    #  include <stdlib.h>
    #  include <string.h>
    #elif HAVE_STRINGS_H
    #  include <strings.h>
    #endif /*STDC_HEADERS*/

    #if HAVE_UNISTD_H
    #  include <unistd.h>
    #endif

    #if HAVE_ERRNO_H
    #  include <errno.h>
    #endif /*HAVE_ERRNO_H*/
    #ifndef errno
    /* Some systems #define this! */
    extern int errno;
    #endif

    #endif /* !SIC_COMMON_H */
You may recognise some snippets of code from the Autoconf manual here, in particular the inclusion of the project ‘config.h’, which will be generated shortly. Notice that I have been careful to conditionally include any headers which are not guaranteed to exist on every architecture. The rule of thumb here is that only ‘stdio.h’ is ubiquitous (though I have never heard of a machine that has no ‘sys/types.h’). You can find more details of some of these in section ‘Existing Tests’ of The GNU Autoconf Manual.
Here is a little more code from ‘common.h’:
    #ifndef EXIT_SUCCESS
    #  define EXIT_SUCCESS  0
    #  define EXIT_FAILURE  1
    #endif
The implementation of the error handling functions goes in ‘error.c’ and is very straightforward:
    #if HAVE_CONFIG_H
    #  include <config.h>
    #endif

    #include "common.h"
    #include "error.h"

    static void error (int exit_status, const char *mode,
                       const char *message);

    /* Print "PROGRAM: MODE: MESSAGE." and exit if EXIT_STATUS is not negative. */
    static void
    error (int exit_status, const char *mode, const char *message)
    {
      fprintf (stderr, "%s: %s: %s.\n", program_name, mode, message);

      if (exit_status >= 0)
        exit (exit_status);
    }

    void
    sic_warning (const char *message)
    {
      error (-1, "warning", message);
    }

    void
    sic_error (const char *message)
    {
      error (-1, "ERROR", message);
    }

    void
    sic_fatal (const char *message)
    {
      error (EXIT_FAILURE, "FATAL", message);
    }
I also need a definition of program_name; set_program_name copies the filename component of path into the exported data, program_name. The xstrdup function just calls strdup, but aborts if there is not enough memory to make the copy:
    const char *program_name = NULL;

    void
    set_program_name (const char *path)
    {
      if (!program_name)
        program_name = xstrdup (basename (path));
    }
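The listing above relies on a basename function that is not shown in this section. As a rough sketch only, and not the implementation shipped with Sic, the ‘filename component’ of a path can be found by looking for the last ‘/’. The name basename_sketch is hypothetical:

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-in for the basename() call used by set_program_name.
   Returns a pointer to the part of PATH after the final '/', or PATH
   itself when it contains no '/'.  The result points into PATH; nothing
   is copied, which is why set_program_name duplicates it with xstrdup. */
const char *
basename_sketch (const char *path)
{
  const char *slash;

  assert (path != NULL);
  slash = strrchr (path, '/');
  return slash ? slash + 1 : path;
}
```

Called with the argv[0] of a program started as ‘/usr/local/bin/sic’, this sketch yields ‘sic’, which is the form wanted at the front of an error message.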
A useful idiom common to many GNU projects is to wrap the memory management functions to localise out of memory handling, naming them with an ‘x’ prefix. By doing this, the rest of the project is relieved of having to remember to check for ‘NULL’ returns from the various memory functions. These wrappers use the error API to report memory exhaustion and abort the program. I have placed the implementation code in ‘xmalloc.c’:
    #if HAVE_CONFIG_H
    #  include <config.h>
    #endif

    #include "common.h"
    #include "error.h"

    void *
    xmalloc (size_t num)
    {
      void *new = malloc (num);
      if (!new)
        sic_fatal ("Memory exhausted");
      return new;
    }

    void *
    xrealloc (void *p, size_t num)
    {
      void *new;

      if (!p)
        return xmalloc (num);

      new = realloc (p, num);
      if (!new)
        sic_fatal ("Memory exhausted");

      return new;
    }

    void *
    xcalloc (size_t num, size_t size)
    {
      void *new = xmalloc (num * size);
      bzero (new, num * size);
      return new;
    }
Notice in the code above that xcalloc is implemented in terms of xmalloc, since calloc itself is not available in some older C libraries. Also, the bzero function is actually deprecated in favour of memset in modern C libraries – I’ll explain how to take this into account later in Beginnings of a ‘configure.in’.
Rather than create a separate ‘xmalloc.h’ file, which would need to be #included from almost everywhere else, the logical place to declare these functions is in ‘common.h’, since the wrappers will be called from most everywhere else in the code:
    #ifdef __cplusplus
    #  define BEGIN_C_DECLS  extern "C" {
    #  define END_C_DECLS    }
    #else
    #  define BEGIN_C_DECLS
    #  define END_C_DECLS
    #endif

    #define XCALLOC(type, num) \
            ((type *) xcalloc ((num), sizeof(type)))
    #define XMALLOC(type, num) \
            ((type *) xmalloc ((num) * sizeof(type)))
    #define XREALLOC(type, p, num) \
            ((type *) xrealloc ((p), (num) * sizeof(type)))
    #define XFREE(stale) do { \
            if (stale) { free (stale); stale = 0; } \
            } while (0)

    BEGIN_C_DECLS

    extern void *xcalloc   (size_t num, size_t size);
    extern void *xmalloc   (size_t num);
    extern void *xrealloc  (void *p, size_t num);
    extern char *xstrdup   (const char *string);
    extern char *xstrerror (int errnum);

    END_C_DECLS
By using the macros defined here, allocating and freeing heap memory is reduced from:
    char **argv = (char **) xmalloc (sizeof (char *) * 3);
    do_stuff (argv);
    if (argv)
      free (argv);
to the simpler and more readable:
    char **argv = XMALLOC (char *, 3);
    do_stuff (argv);
    XFREE (argv);
In the same spirit, I have borrowed ‘xstrdup.c’ and ‘xstrerror.c’ from project GNU’s libiberty. See section Fallback Function Implementations.
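To see what the allocation macros buy you, here is a small self-contained sketch. The XMALLOC and XFREE definitions follow the ones in ‘common.h’ above, but xmalloc_sketch and xfree_resets_pointer are hypothetical names, and the stand-in allocator calls abort directly rather than going through sic_fatal as the real library does:

```c
#include <stdio.h>
#include <stdlib.h>

/* Stand-in for the library's xmalloc: report exhaustion and abort.
   (An assumption for this sketch; Sic reports through sic_fatal.) */
static void *
xmalloc_sketch (size_t num)
{
  void *p = malloc (num);
  if (!p)
    {
      fprintf (stderr, "Memory exhausted\n");
      abort ();
    }
  return p;
}

#define XMALLOC(type, num) \
        ((type *) xmalloc_sketch ((num) * sizeof (type)))
#define XFREE(stale) do { \
        if (stale) { free (stale); stale = 0; } \
        } while (0)

/* Returns 1 if XFREE left the pointer set to NULL. */
int
xfree_resets_pointer (void)
{
  char **argv = XMALLOC (char *, 3);

  argv[0] = argv[1] = argv[2] = NULL;
  XFREE (argv);   /* frees the block and resets argv to 0 */
  XFREE (argv);   /* harmless: the macro skips a null pointer */
  return argv == NULL;
}
```

Because XFREE zeroes its argument, an accidental second release of the same variable does nothing instead of freeing the block twice.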
In many C programs you will see various implementations and re-implementations of lists and stacks, each tied to its own particular project. It is surprisingly simple to write a catch-all implementation, as I have done here with a generalised list operation API in ‘list.h’:
    #ifndef SIC_LIST_H
    #define SIC_LIST_H 1

    #include <sic/common.h>

    BEGIN_C_DECLS

    typedef struct list {
      struct list *next;    /* chain forward pointer */
      void *userdata;       /* in case you want to use raw Lists */
    } List;

    extern List *list_new    (void *userdata);
    extern List *list_cons   (List *head, List *tail);
    extern List *list_tail   (List *head);
    extern size_t list_length (List *head);

    END_C_DECLS

    #endif /* !SIC_LIST_H */
The trick is to ensure that any structures you want to chain together have their forward pointer in the first field. Having done that, the generic functions declared above can be used to manipulate any such chain by casting it to List * and back again as necessary. For example:
    struct foo {
      struct foo *next;

      char *bar;
      struct baz *qux;
      ...
    };

    ...
      struct foo *foo_list = NULL;

      foo_list = (struct foo *) list_cons ((List *) new_foo (),
                                           (List *) foo_list);
    ...
The implementation of the list manipulation functions is in ‘list.c’:
    #include "list.h"

    List *
    list_new (void *userdata)
    {
      List *new = XMALLOC (List, 1);

      new->next = NULL;
      new->userdata = userdata;

      return new;
    }

    List *
    list_cons (List *head, List *tail)
    {
      head->next = tail;
      return head;
    }

    List *
    list_tail (List *head)
    {
      return head->next;
    }

    size_t
    list_length (List *head)
    {
      size_t n;

      for (n = 0; head; ++n)
        head = head->next;

      return n;
    }
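The casting trick is easier to follow in a complete example. The sketch below repeats list_cons and list_length from the listing above so that it stands alone; struct word and count_words_sketch are hypothetical names invented for the illustration. It assumes, as the technique itself does, that a pointer to the client structure may be treated as a pointer to a List because the forward pointer is the first field:

```c
#include <stddef.h>

typedef struct list {
  struct list *next;    /* chain forward pointer */
  void *userdata;
} List;

static List *
list_cons (List *head, List *tail)
{
  head->next = tail;
  return head;
}

static size_t
list_length (List *head)
{
  size_t n;

  for (n = 0; head; ++n)
    head = head->next;
  return n;
}

/* Hypothetical client structure: the forward pointer comes first. */
struct word {
  struct word *next;
  const char *text;
};

/* Chain three words together with the generic functions and count them. */
size_t
count_words_sketch (void)
{
  struct word a = { NULL, "ls" }, b = { NULL, "-l" }, c = { NULL, "/tmp" };
  struct word *words = NULL;

  words = (struct word *) list_cons ((List *) &c, (List *) words);
  words = (struct word *) list_cons ((List *) &b, (List *) words);
  words = (struct word *) list_cons ((List *) &a, (List *) words);

  return list_length ((List *) words);
}
```

Only the next field is ever touched through the List type here, so the generic code never needs to know what else the client structure contains.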
In order to set the stage for later chapters which expand upon this example, in this subsection I will describe the purpose of the sources that combine to implement the shell library. I will not dissect the code introduced here; you can download the sources from the book’s webpages at http://sources.redhat.com/autobook/.
The remaining sources for the library, beyond the support files described in the previous subsection, are divided into four pairs of files:
Here are the functions for creating and managing sic parsers.
    #ifndef SIC_SIC_H
    #define SIC_SIC_H 1

    #include <sic/common.h>
    #include <sic/error.h>
    #include <sic/list.h>
    #include <sic/syntax.h>

    typedef struct sic {
      char *result;                 /* result string */
      size_t len;                   /* bytes used by result field */
      size_t lim;                   /* bytes allocated to result field */
      struct builtintab *builtins;  /* tables of builtin functions */
      SyntaxTable **syntax;         /* dispatch table for syntax of input */
      List *syntax_init;            /* stack of syntax state initialisers */
      List *syntax_finish;          /* stack of syntax state finalizers */
      SicState *state;              /* state data from syntax extensions */
    } Sic;

    #endif /* !SIC_SIC_H */
This structure has fields to store registered command (builtins) and syntax (syntax) handlers, along with other state information (state) that can be used to share information between various handlers, and some room to build a result or error string (result).
Here are the functions for managing tables of builtin commands in each Sic structure:
    typedef int (*builtin_handler) (Sic *sic,
                                    int argc, char *const argv[]);

    typedef struct {
      const char *name;
      builtin_handler func;
      int min, max;
    } Builtin;

    typedef struct builtintab BuiltinTab;

    extern Builtin *builtin_find (Sic *sic, const char *name);
    extern int builtin_install   (Sic *sic, Builtin *table);
    extern int builtin_remove    (Sic *sic, Builtin *table);
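The implementation of these functions is not shown, so the following is only a guess at the general shape of a table lookup, simplified so that it stands alone. Everything here is hypothetical: the handler takes no Sic argument, the table is a plain array ended by a NULL name, and I am assuming that min and max bound the number of arguments a command accepts, with a negative max meaning ‘no upper limit’:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Simplified handler type: the real one also receives a Sic pointer. */
typedef int (*handler_sketch) (int argc, char *const argv[]);

typedef struct {
  const char *name;
  handler_sketch func;
  int min, max;   /* assumed meaning: fewest and most arguments accepted */
} BuiltinSketch;

static int
do_nothing (int argc, char *const argv[])
{
  (void) argc;
  (void) argv;
  return 0;
}

/* A hypothetical table, terminated by an entry with a NULL name. */
static BuiltinSketch table_sketch[] = {
  { "echo",  do_nothing, 0, -1 },
  { "unset", do_nothing, 1,  1 },
  { NULL,    NULL,       0,  0 }
};

/* Linear search by command name; returns NULL when nothing matches. */
BuiltinSketch *
find_sketch (const char *name)
{
  BuiltinSketch *b;

  assert (name != NULL);
  for (b = table_sketch; b->name; ++b)
    if (strcmp (b->name, name) == 0)
      return b;
  return NULL;
}

/* Check an argument count against the (assumed) min/max bounds. */
int
arity_ok_sketch (const BuiltinSketch *b, int nargs)
{
  return nargs >= b->min && (b->max < 0 || nargs <= b->max);
}
```

Keeping the argument limits in the table means each handler can be written without repeating the same argument-count checks.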
Having created a Sic parser, and populated it with some Builtin handlers, a user of this library must tokenize and evaluate its input stream. These files define a structure for storing tokenized strings (Tokens), and functions for converting char * strings both to and from this structure type:
    #ifndef SIC_EVAL_H
    #define SIC_EVAL_H 1

    #include <sic/common.h>
    #include <sic/sic.h>

    BEGIN_C_DECLS

    typedef struct {
      int  argc;        /* number of elements in ARGV */
      char **argv;      /* array of pointers to elements */
      size_t lim;       /* number of bytes allocated */
    } Tokens;

    extern int eval       (Sic *sic, Tokens *tokens);
    extern int untokenize (Sic *sic, char **pcommand, Tokens *tokens);
    extern int tokenize   (Sic *sic, Tokens **ptokens, char **pcommand);

    END_C_DECLS

    #endif /* !SIC_EVAL_H */
These files also define the eval function, which examines a Tokens structure in the context of the given Sic parser, dispatching the argv array to a relevant Builtin handler, also written by the library user.
When tokenize splits a char * string into parts, by default it breaks the string into words delimited by whitespace. These files define the interface for changing this default behaviour, by registering callback functions which the parser will run when it meets an ‘interesting’ symbol in the input stream. Here are the declarations from ‘syntax.h’:
    BEGIN_C_DECLS

    typedef int SyntaxHandler (struct sic *sic, BufferIn *in,
                               BufferOut *out);

    typedef struct syntax {
      SyntaxHandler *handler;
      char *ch;
    } Syntax;

    extern int syntax_install (struct sic *sic, Syntax *table);
    extern SyntaxHandler *syntax_handler (struct sic *sic, int ch);

    END_C_DECLS
A SyntaxHandler is a function called by tokenize as it consumes its input to create a Tokens structure; the two functions associate a table of such handlers with a given Sic parser, and find the particular handler for a given character in that Sic parser, respectively.
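The default behaviour described above, splitting on whitespace when no handler intervenes, can be sketched in isolation. This is not the Sic tokenizer: TokensSketch, split_words_sketch and free_words_sketch are hypothetical names, there is no Sic parser argument and no syntax handlers, and allocation failure is reported with a return value rather than through the x-prefixed wrappers:

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

typedef struct {
  int argc;      /* number of words found */
  char **argv;   /* NULL-terminated array of copied words */
} TokensSketch;

/* Release everything held by T and leave it empty. */
void
free_words_sketch (TokensSketch *t)
{
  int i;

  for (i = 0; i < t->argc; ++i)
    free (t->argv[i]);
  free (t->argv);
  t->argv = NULL;
  t->argc = 0;
}

/* Split INPUT into whitespace-delimited words.  Returns 0 on success,
   or -1 (with OUT left empty) if memory runs out. */
int
split_words_sketch (const char *input, TokensSketch *out)
{
  const char *p = input;
  size_t cap;

  assert (input != NULL && out != NULL);

  /* A string of N characters holds at most N/2 + 1 words;
     one more slot is needed for the terminating NULL. */
  cap = strlen (input) / 2 + 2;
  out->argc = 0;
  out->argv = malloc (cap * sizeof *out->argv);
  if (!out->argv)
    return -1;

  while (*p)
    {
      const char *start;
      size_t len;
      char *word;

      while (*p && isspace ((unsigned char) *p))
        ++p;                          /* skip leading whitespace */
      if (!*p)
        break;
      start = p;
      while (*p && !isspace ((unsigned char) *p))
        ++p;                          /* scan to the end of the word */

      len = (size_t) (p - start);
      word = malloc (len + 1);
      if (!word)
        {
          free_words_sketch (out);
          return -1;
        }
      memcpy (word, start, len);
      word[len] = '\0';
      out->argv[out->argc++] = word;
    }

  out->argv[out->argc] = NULL;
  return 0;
}
```

A syntax handler would hook into the inner loops, taking over whenever the character at p has a handler registered for it.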
Now that I have some code, I can run autoscan to generate a preliminary ‘configure.in’. autoscan will examine all of the sources in the current directory tree looking for common points of non-portability, adding macros suitable for detecting the discovered problems. autoscan generates the following in ‘configure.scan’:
    # Process this file with autoconf to produce a configure script.
    AC_INIT(sic/eval.h)

    # Checks for programs.

    # Checks for libraries.

    # Checks for header files.
    AC_HEADER_STDC
    AC_CHECK_HEADERS(strings.h unistd.h)

    # Checks for typedefs, structures, and compiler characteristics.
    AC_C_CONST
    AC_TYPE_SIZE_T

    # Checks for library functions.
    AC_FUNC_VPRINTF
    AC_CHECK_FUNCS(strerror)

    AC_OUTPUT()
Since the generated ‘configure.scan’ does not overwrite your project’s ‘configure.in’, it is a good idea to run autoscan periodically even in established project source trees, and compare the two files. Sometimes autoscan will find some portability issue you have overlooked, or weren’t aware of.
Looking through the documentation for the macros in this ‘configure.scan’, AC_C_CONST and AC_TYPE_SIZE_T will take care of themselves (provided I ensure that ‘config.h’ is included into every source file), and AC_HEADER_STDC and AC_CHECK_HEADERS(unistd.h) are already taken care of in ‘common.h’.
autoscan is no silver bullet! Even here in this simple example, I need to manually add macros to check for the presence of ‘errno.h’:
    AC_CHECK_HEADERS(errno.h strings.h unistd.h)
I also need to manually add the Autoconf macro for generating ‘config.h’; a macro to initialise automake support; and a macro to check for the presence of ranlib. These should go close to the start of ‘configure.in’:
    ...
    AC_CONFIG_HEADER(config.h)
    AM_INIT_AUTOMAKE(sic, 0.5)

    AC_PROG_CC
    AC_PROG_RANLIB
    ...
Recall that the use of bzero in Memory Management is not entirely portable. The trick is to provide a bzero work-alike, depending on which functions Autoconf detects, by adding the following towards the end of ‘configure.in’:
    ...
    AC_CHECK_FUNCS(bzero memset, break)
    ...
With the addition of this small snippet of code to ‘common.h’, I can now make use of bzero even when linking with a C library that has no implementation of its own:
    #if !HAVE_BZERO && HAVE_MEMSET
    #  define bzero(buf, bytes)  ((void) memset (buf, 0, bytes))
    #endif
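To convince yourself that the fallback behaves like bzero, the same expansion can be tried under a different name so that it cannot clash with a bzero the system may already provide. The names bzero_sketch and zeroes_buffer_sketch are hypothetical and exist only for this illustration:

```c
#include <stddef.h>
#include <string.h>

/* Same expansion as the fallback macro in common.h, under another name. */
#define bzero_sketch(buf, bytes)  ((void) memset ((buf), 0, (bytes)))

/* Returns 1 if every byte of a scratch buffer is zero after the call. */
int
zeroes_buffer_sketch (void)
{
  unsigned char buf[16];
  size_t i;

  memset (buf, 0xff, sizeof buf);   /* start from a non-zero pattern */
  bzero_sketch (buf, sizeof buf);

  for (i = 0; i < sizeof buf; ++i)
    if (buf[i] != 0)
      return 0;
  return 1;
}
```

The cast to void in the macro discards the pointer that memset returns, so that the replacement, like bzero, has no useful value.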
An interesting macro suggested by autoscan is AC_CHECK_FUNCS(strerror). This tells me that I need to provide a replacement implementation of strerror for the benefit of architectures which don’t have it in their system libraries. This is resolved by providing a file with a fallback implementation for the named function, and creating a library from it and any others that ‘configure’ discovers to be lacking from the system library on the target host.
You will recall that ‘configure’ is the shell script the end user of this package will run on their machine to test that it has all the features the package wants to use. The library that is created will allow the rest of the project to be written in the knowledge that any functions required by the project but missing from the installer’s system libraries will be available nonetheless. GNU ‘libiberty’ comes to the rescue again – it already has an implementation of ‘strerror.c’ that I was able to use with a little modification.
Being able to supply a simple implementation of strerror, as the ‘strerror.c’ file from ‘libiberty’ does, relies on there being a well defined sys_errlist variable. It is a fair bet, however, that if the target host has no strerror implementation, the system sys_errlist will be broken or missing. I need to write a configure macro to check whether the system defines sys_errlist, and tailor the code in ‘strerror.c’ to use this knowledge.
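To give an idea of what ‘tailoring the code’ might look like, here is a sketch of a fallback that consults the system table only when a HAVE_SYS_ERRLIST macro says it is there. This is not the libiberty code: strerror_sketch is a hypothetical name, the ‘Error N’ message format is my own choice, and the declarations of sys_errlist and sys_nerr are the traditional ones, which I am assuming match the target system whenever the macro is defined:

```c
#include <stdio.h>

/* Hypothetical fallback: use the system's message table when configure
   found one, otherwise format the number into a static buffer. */
const char *
strerror_sketch (int errnum)
{
  static char buf[32];

#if HAVE_SYS_ERRLIST
  /* Traditional declarations; assumed to match the host's libc. */
  extern char *sys_errlist[];
  extern int sys_nerr;

  if (errnum >= 0 && errnum < sys_nerr)
    return sys_errlist[errnum];
#endif

  /* No table, or ERRNUM is out of range: fall back to a generic message. */
  snprintf (buf, sizeof buf, "Error %d", errnum);
  return buf;
}
```

When HAVE_SYS_ERRLIST is not defined, the preprocessor drops the table lookup entirely and every call takes the generic path.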
To avoid clutter in the top-level directory, I am a great believer in keeping as many of the configuration files as possible in their own sub-directory. First of all, I will create a new directory called ‘config’ inside the top-level directory, and put ‘sys_errlist.m4’ inside it:
    AC_DEFUN([SIC_VAR_SYS_ERRLIST],
    [AC_CACHE_CHECK([for sys_errlist],
    sic_cv_var_sys_errlist,
    [AC_TRY_LINK([int *p;], [extern int sys_errlist; p = &sys_errlist;],
                 sic_cv_var_sys_errlist=yes, sic_cv_var_sys_errlist=no)])
    if test x"$sic_cv_var_sys_errlist" = xyes; then
      AC_DEFINE(HAVE_SYS_ERRLIST, 1,
        [Define if your system libraries have a sys_errlist variable.])
    fi])
I must then add a call to this new macro in the ‘configure.in’ file, being careful to put it in the right place – somewhere between typedefs and structures and library functions according to the comments in ‘configure.scan’:
    SIC_VAR_SYS_ERRLIST
GNU Autotools can also be set to store most of their files in a subdirectory, by calling the AC_CONFIG_AUX_DIR macro near the top of ‘configure.in’, preferably right after AC_INIT:
    AC_INIT(sic/eval.c)
    AC_CONFIG_AUX_DIR(config)
    AM_CONFIG_HEADER(config.h)
    ...
Having made this change, many of the files added by running autoconf and automake --add-missing will be put in the aux_dir.
The source tree now looks like this:
    sic/
      +-- configure.scan
      +-- config/
      |     +-- sys_errlist.m4
      +-- replace/
      |     +-- strerror.c
      +-- sic/
            +-- builtin.c
            +-- builtin.h
            +-- common.h
            +-- error.c
            +-- error.h
            +-- eval.c
            +-- eval.h
            +-- list.c
            +-- list.h
            +-- sic.c
            +-- sic.h
            +-- syntax.c
            +-- syntax.h
            +-- xmalloc.c
            +-- xstrdup.c
            +-- xstrerror.c
In order to correctly utilise the fallback implementation, AC_CHECK_FUNCS(strerror) needs to be removed and strerror added to AC_REPLACE_FUNCS:
    # Checks for library functions.
    AC_REPLACE_FUNCS(strerror)
This will be clearer if you look at the ‘Makefile.am’ for the ‘replace’ subdirectory:
    ## Makefile.am -- Process this file with automake to produce Makefile.in

    INCLUDES             = -I$(top_builddir) -I$(top_srcdir)

    noinst_LIBRARIES     = libreplace.a
    libreplace_a_SOURCES =
    libreplace_a_LIBADD  = @LIBOBJS@
The code tells automake that I want to build a library for use within the build tree (i.e. not installed – ‘noinst’), and that has no source files by default. The clever part here is that when someone comes to install Sic, they will run configure which will test for strerror, and add ‘strerror.o’ to LIBOBJS if the target host environment is missing its own implementation. Now, when ‘configure’ creates ‘replace/Makefile’ (as I asked it to with AC_OUTPUT), ‘@LIBOBJS@’ is replaced by the list of objects required on the installer’s machine.
Having done all this at configure time, when my user runs make, the files required to replace functions missing from their target machine will be added to ‘libreplace.a’.
Unfortunately this is not quite enough to start building the project. First I need to add a top-level ‘Makefile.am’ from which to ultimately create a top-level ‘Makefile’ that will descend into the various subdirectories of the project:
    ## Makefile.am -- Process this file with automake to produce Makefile.in

    SUBDIRS = replace sic
And ‘configure.in’ must be told where it can find instances of Makefile.in:
    AC_OUTPUT(Makefile replace/Makefile sic/Makefile)
I have written a bootstrap script for Sic; for details see Bootstrapping:
    #! /bin/sh

    set -x
    aclocal -I config
    autoheader
    automake --foreign --add-missing --copy
    autoconf
The ‘--foreign’ option to automake tells it to relax the GNU standards for various files that should be present in a GNU distribution. Using this option saves me from having to create empty files as we did in A Minimal GNU Autotools Project.
Right. Let’s build the library! First, I’ll run bootstrap:
    $ ./bootstrap
    + aclocal -I config
    + autoheader
    + automake --foreign --add-missing --copy
    automake: configure.in: installing config/install-sh
    automake: configure.in: installing config/mkinstalldirs
    automake: configure.in: installing config/missing
    + autoconf
The project is now in the same state that an end-user would see, having unpacked a distribution tarball. What follows is what an end user might expect to see when building from that tarball:
    $ ./configure
    creating cache ./config.cache
    checking for a BSD compatible install... /usr/bin/install -c
    checking whether build environment is sane... yes
    checking whether make sets ${MAKE}... yes
    checking for working aclocal... found
    checking for working autoconf... found
    checking for working automake... found
    checking for working autoheader... found
    checking for working makeinfo... found
    checking for gcc... gcc
    checking whether the C compiler (gcc ) works... yes
    checking whether the C compiler (gcc ) is a cross-compiler... no
    checking whether we are using GNU C... yes
    checking whether gcc accepts -g... yes
    checking for ranlib... ranlib
    checking how to run the C preprocessor... gcc -E
    checking for ANSI C header files... yes
    checking for unistd.h... yes
    checking for errno.h... yes
    checking for string.h... yes
    checking for working const... yes
    checking for size_t... yes
    checking for strerror... yes
    updating cache ./config.cache
    creating ./config.status
    creating Makefile
    creating replace/Makefile
    creating sic/Makefile
    creating config.h
Compare this output with the contents of ‘configure.in’, and notice how each macro is ultimately responsible for one or more consecutive tests (via the Bourne shell code generated in ‘configure’). Now that the ‘Makefile’s have been successfully created, it is safe to call make to perform the actual compilation:
    $ make
    make all-recursive
    make[1]: Entering directory `/tmp/sic'
    Making all in replace
    make[2]: Entering directory `/tmp/sic/replace'
    rm -f libreplace.a
    ar cru libreplace.a
    ranlib libreplace.a
    make[2]: Leaving directory `/tmp/sic/replace'
    Making all in sic
    make[2]: Entering directory `/tmp/sic/sic'
    gcc -DHAVE_CONFIG_H -I. -I. -I.. -I.. -g -O2 -c builtin.c
    gcc -DHAVE_CONFIG_H -I. -I. -I.. -I.. -g -O2 -c error.c
    gcc -DHAVE_CONFIG_H -I. -I. -I.. -I.. -g -O2 -c eval.c
    gcc -DHAVE_CONFIG_H -I. -I. -I.. -I.. -g -O2 -c list.c
    gcc -DHAVE_CONFIG_H -I. -I. -I.. -I.. -g -O2 -c sic.c
    gcc -DHAVE_CONFIG_H -I. -I. -I.. -I.. -g -O2 -c syntax.c
    gcc -DHAVE_CONFIG_H -I. -I. -I.. -I.. -g -O2 -c xmalloc.c
    gcc -DHAVE_CONFIG_H -I. -I. -I.. -I.. -g -O2 -c xstrdup.c
    gcc -DHAVE_CONFIG_H -I. -I. -I.. -I.. -g -O2 -c xstrerror.c
    rm -f libsic.a
    ar cru libsic.a builtin.o error.o eval.o list.o sic.o syntax.o xmalloc.o xstrdup.o xstrerror.o
    ranlib libsic.a
    make[2]: Leaving directory `/tmp/sic/sic'
    make[1]: Leaving directory `/tmp/sic'
On this machine, as you can see from the output of configure above, I have no need of the fallback implementation of strerror, so ‘libreplace.a’ is empty. On another machine this might not be the case. In any event, I now have a compiled ‘libsic.a’ – so far, so good.
This document was generated by Ben Elliston on July 10, 2015 using texi2html 1.82.