Differences between revisions 12 and 13
Revision 12 as of 2009-03-02 19:27:23
Size: 11297
Revision 13 as of 2009-03-02 19:27:56
Size: 11294

DRAFT: Full Featured Printf Hooks Design


The intention of this page is to serve as a starting point for identifying the scope of the printf-hooks extension design.

Useful Definitions

  • A format string is a combination of valid format specifier characters. In total it is a directive which tells printf how to display an argument in an argument list as a string, e.g.

    • "%0.16llx" - the argument is a long long int that should be output in hexadecimal format with lower case letters.  The precision of 16 pads the value with leading zeros to at least 16 digits, so it is right justified; left justification would require the '-' flag.
  • A format specifier is a generic term for one or more characters which make up an operable grouping in a printf format string, e.g.

    • 'DD', and 'e' separately in "%DDe".
  • A conversion specification is a format specifier which identifies how to convert a data-type into a string. An overridden conversion specification may or may not be tied to an overridden length modifier.

  • A length modifier is a format specifier which identifies the data-type of an argument. A length modifier that is overridden is practically useless without an accompanying overridden conversion specification.

  • A flag character is a format specifier which affects the output of the string in various ways, e.g. it may affect justification, padding, minimum column width, etc.
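As a quick check of the "%0.16llx" example above, the following minimal sketch formats a value with the standard snprintf. The helper name render_hex is made up for illustration; the format string is passed in as a parameter so that other format strings can be tried.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical helper: format V into BUF (of size N) using FMT, which
   is assumed to contain exactly one conversion that takes an
   unsigned long long.  Returns whatever snprintf returns.  */
int
render_hex (char *buf, size_t n, const char *fmt, unsigned long long v)
{
  return snprintf (buf, n, fmt, v);
}
```

Calling render_hex (buf, sizeof buf, "%0.16llx", 0xabcULL) should store "0000000000000abc": the precision forces 16 digits and the padding zeros go on the left.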


  • Support overridden length modifiers.

    • Support single or multibyte format specifier, e.g. %H, %DD, %llv
    • The override arginfo function marks flags indicating the argument data-type.
    • Consumes zero through n arguments.
    • I'm not sure what zero would indicate.

  • Support overridden conversion specifications.

    • Support single or multibyte format specifier, e.g. 'e' in %DDe, (no example for multi-byte).

    • Override arginfo function reads flags to detect operable data-type.
    • Override either operates on a data type or doesn't as indicated by arginfo function callback return value '0' or '1'.
    • Perform va_arg peeling.
    • Invoke override_fn for matching length-modifiers and conversion-specifications.
  • Support other overridden flag characters.

    • DFP, VSX, VMX (Altivec), and AVX don't require these as far as I can tell so they're a lower priority.

Design Preclusions

There are a number of preclusions which dictate the direction of the design. They are either definite or questionable. Questionable design preclusions should be finalized before this design document leaves DRAFT.


  • Performance of the fast-path shall not be impacted by the hooks override.
  • Positional arguments indicate positional-path, i.e. non-fast-path branch.
  • Overrides take positional-path and must account for positional arguments.
  • Format identifier overrides may consume zero or more arguments.
    • zero: conversion-specification will not operate and default will take over.
    • one: Most likely case for most data-type specifications, e.g. DFP, Altivec.
    • n: I'm not sure of the origin of this ability.
  • Allocations for overrides should only happen if an override is registered.
  • Tests for overrides should have branch prediction:
    • use __builtin_expect(<test>,0)
  • Introduction of new data-types and unknown ABI issues which prevent type-punning require the registration of a user va_arg function callback for a conversion specification override, e.g.
    • The PowerPC ABI indicates that _Decimal128 data-types be stored in even-odd register pairs, e.g. f2-f3, f4-f5, f6-f7.
    • IBM long double 128, a congruently sized data-type, does not have such a register requirement.
    • long double userarg = va_arg(*ap,long double); where list contains a _Decimal128 stored in f2-f3 may result in f1-f2 being stored into userarg erroneously.
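The va_arg callback idea can be sketched as follows. This is only an illustration under assumptions: the callback type printf_va_arg_sketch_fn and its (void *, va_list *) signature are hypothetical, since the draft does not define printf_va_arg_function, and double stands in for a user data-type such as _Decimal128.

```c
#include <stdarg.h>

/* Hypothetical callback type: peel one argument off *AP and store it
   into MEM, which the caller has sized for the data-type.  */
typedef void printf_va_arg_sketch_fn (void *mem, va_list *ap);

/* User-supplied callback.  double is a stand-in here; a real override
   would name its own type in the va_arg call.  */
void
peel_double (void *mem, va_list *ap)
{
  *(double *) mem = va_arg (*ap, double);
}

/* Stand-in for the printf internals: peel COUNT arguments with FN and
   return their sum, so the effect of the callback is observable.  */
double
sum_peeled (printf_va_arg_sketch_fn *fn, int count, ...)
{
  va_list ap;
  double total = 0.0;

  va_start (ap, count);
  for (int i = 0; i < count; i++)
    {
      double slot;          /* Storage provided by the "internals".  */
      fn (&slot, &ap);      /* Only the callback names the real type.  */
      total += slot;
    }
  va_end (ap);
  return total;
}
```

The point of the design choice is that the compiler sees the true type inside the callback, so it applies the correct argument-passing rules instead of relying on a punned type of the same size.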

  • Data type size congruence is not consistent and therefore type punning may not work. For instance, when long double is double:
    • sizeof(_Decimal128) == 16
    • sizeof(long double) == 8
  • Overridden conversion specifications should still work for all data-types in the fast-path if the overridden conversion specification doesn't detect an overridden length-modifier, e.g.
    • register: length-modifier "DD" : marks flag as DECIMAL128
    • register: conversion-specification "e" : looks for flag DECIMAL128
    • printf("%DDe\n",data) - operate on "data" as a _Decimal128 : processing "DD" set flag DECIMAL128.  Processing 'e' found flag DECIMAL128.
    • printf("%e\n",data) - operate on "data" as a "double" since flag does not indicate DECIMAL128.
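The "DD" and 'e' interaction above, together with the branch-prediction hint, might look roughly like the sketch below. Every name in it (the SKETCH_ and sketch_ identifiers, the HANDLE_ values, the flag bit) is hypothetical, and __builtin_expect assumes a GCC-compatible compiler. It does not parse a full format string; it only looks at the characters following the '%'.

```c
#include <string.h>

#define SKETCH_FLAG_DECIMAL128 0x00010000u   /* Hypothetical flag bit.  */

enum sketch_handler { HANDLE_DEFAULT, HANDLE_DECIMAL128 };

/* Number of registered overrides; zero keeps printf on its usual path.  */
static int sketch_override_count;

/* Pretend that length-modifier "DD" and conversion 'e' were registered.  */
void
sketch_register_dd_e (void)
{
  sketch_override_count = 1;
}

/* SPEC points just past the '%'.  Return which handler would convert
   the argument.  */
enum sketch_handler
sketch_select (const char *spec)
{
  unsigned int flags = 0;

  /* Overrides are expected to be rare, so hint the branch as unlikely.  */
  if (__builtin_expect (sketch_override_count != 0, 0))
    {
      /* The length modifier only marks the argument.  */
      if (strncmp (spec, "DD", 2) == 0)
        {
          flags |= SKETCH_FLAG_DECIMAL128;
          spec += 2;
        }
      /* The conversion override acts only when it finds the flag.  */
      if (*spec == 'e' && (flags & SKETCH_FLAG_DECIMAL128) != 0)
        return HANDLE_DECIMAL128;
    }
  return HANDLE_DEFAULT;
}
```

With the registration in place, sketch_select ("DDe") picks the override while sketch_select ("e") still falls through to the default double handling.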


  • Structure definitions should not change if possible.
  • Format specifiers should only be ASCII range 32 to 127 (all inclusive). This would imply retrograde disabling of existing wchar_t 'spec' character in 'struct printf_info'.
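A registration function could enforce the ASCII restriction with a check along these lines. This is a sketch only; the function name is hypothetical and the draft does not say whether registration would reject or ignore out-of-range characters.

```c
#include <stddef.h>

/* Return 1 if SPEC_CHARS is non-empty and each of its NCHARS entries
   lies in the ASCII range 32..127 inclusive, else 0.  */
int
sketch_spec_chars_valid (const int *spec_chars, size_t nchars)
{
  if (spec_chars == NULL || nchars == 0)
    return 0;
  for (size_t i = 0; i < nchars; i++)
    if (spec_chars[i] < 32 || spec_chars[i] > 127)
      return 0;
  return 1;
}
```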


  • An arginfo function may indicate that a data-type consumes zero or more arguments. Zero arguments consumed indicates that the override has chosen to not operate on an argument for any number of reasons.
  • Length modifiers do not cause action by the printf internals, they are simply a way to mark an argument and allocate space for a va_arg peeling.
  • Conversion specifications indicate how a data-type is to be converted into a string, and they cause the actual action by the printf internals. An accompanying arginfo_fn looks at the identification flags marked for an argument, which identify its data-type. A va_arg function then peels that data-type off the argument list and stores it into storage indicated by the printf internals. Finally, the printf internals call the override_fn that was registered along with the conversion specification in order to convert the data-type to a string in the appropriate manner.
  • A sizeof parameter should be passed when registering a length modifier so that the printf internals know how much space to allocate for each consumed argument.
    •  What kind of bounds checking should this perform, i.e. max size? 

  • Introduction of a user member to struct printf_info will require the following:

    • Since struct printf_info is passed as const, the length-modifier arginfo_fn override sets length flags into the __argtypes in-out argument of arginfo_fn.

    • The printf internals will copy the length flags to the struct printf_info::user member.

    • When an override_fn is invoked for a conversion-specification it can read the flags out of struct printf_info::user to determine what data-type it is operating on.
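The three steps above can be sketched as follows. The struct is a reduced stand-in that carries only the proposed user member, and all sketch_ and SKETCH_ names, including the flag value, are hypothetical.

```c
#include <stddef.h>

#define PA_USER_MASK 0xffff0000u
#define SKETCH_PA_DECIMAL128 0x00010000   /* Hypothetical user flag.  */

/* Reduced stand-in for struct printf_info; only the new member.  */
struct sketch_printf_info
{
  unsigned int user;
};

/* Length-modifier arginfo override: INFO is const, so the flag goes
   out through ARGTYPES.  Returns the number of arguments consumed.  */
int
sketch_dd_arginfo (const struct sketch_printf_info *info, size_t n,
                   int *argtypes)
{
  (void) info;
  if (n > 0)
    argtypes[0] |= SKETCH_PA_DECIMAL128;
  return 1;
}

/* What the printf internals would do after the arginfo call.  */
void
sketch_copy_user_flags (struct sketch_printf_info *info, int argtype)
{
  info->user = (unsigned int) argtype & PA_USER_MASK;
}

/* Conversion-specification override_fn side: read the flag back.  */
int
sketch_is_decimal128 (const struct sketch_printf_info *info)
{
  return (info->user & SKETCH_PA_DECIMAL128) != 0;
}
```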

  • You must be able to have multiple registrations to the override functions, because you may want your runtime to support both VMX and VSX data-types.
  • The overridden conversion-specifications should not get in the way of the default operability, e.g. the following should work just fine:
    • double d = 1.234;
      _Decimal128 d128 = 3.45DL;



struct printf_info
{
  int prec;                     /* Precision.  */
  int width;                    /* Width.  */
  wchar_t spec;                 /* Format letter.  */
  unsigned int is_long_double:1;/* L flag.  */
  unsigned int is_short:1;      /* h flag.  */
  unsigned int is_long:1;       /* l flag.  */
  unsigned int alt:1;           /* # flag.  */
  unsigned int space:1;         /* Space flag.  */
  unsigned int left:1;          /* - flag.  */
  unsigned int showsign:1;      /* + flag.  */
  unsigned int group:1;         /* ' flag.  */
  unsigned int extra:1;         /* For special use.  */
  unsigned int is_char:1;       /* hh flag.  */
  unsigned int wide:1;          /* Nonzero for wide character streams.  */
  unsigned int i18n:1;          /* I flag.  */
  wchar_t pad;                  /* Padding character.  */
  unsigned int user;            /* 'flag character' or 'length modifier' override flags.  */
};

struct printf_overrides
{
  /* flag-character: Unknown.  */
  /* length-modifier: Used for setting user data-type flags.  */
  /* conversion-specification: Used for checking for user data-type flags.  */
  printf_arginfo_function     *arginfo_fn;

  /* flag-character: Unknown.  */
  /* length-modifier: sizeof(data-type).  Indicates how much space will be allocated
   *                  prior to a va_arg call-back invocation from a companion conv spec.  */
  /* conversion-specification: Unused.  */
  size_t                      size;

  /* flag-character: Unknown.  */
  /* length-modifier: Unused.  */
  /* conversion-specification: Used to peel user data-type from argument list.  */
  printf_va_arg_function      *va_arg_fn;

  /* flag-character: Unknown.  */
  /* length-modifier: Unused.  */
  /* conversion-specification: Invoked to convert user data-type to string.  */
  printf_function             *override_fn;
};

/* List of supported format overrides.  */
enum
{
  PF_NONE, /* Don't use.  */
  PF_LAST /* Don't use.  This is a place holder.  */
};

/* Flag bits that can be set by a 'flag character' or 'length modifier' override.
 * Corresponding bits are set into the arginfo function's __argtypes parameter
 * and are copied into the `struct printf_info::user' member after a valid override
 * is detected.  */

#define PA_USER_MASK            0xffff0000

/* SPEC_CHARS: string of characters denoting the 'format specifier' that is to be overridden.
 * NCHARS: the number of characters in the 'format specifier'.
 * TYPE: the type of 'format specifier' as indicated by the enums enumerated above.
 * PFO: a table of override data (which may or may not be applicable to a
 *          particular 'format specifier') and data-type size (if applicable).  */

extern int register_printf_override (int *spec_chars,
                                     int nchars,
                                     int type,
                                     struct printf_overrides *pfo);
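To make the intended use of this interface more concrete, here is a self-contained mock with the same parameter list. It is a sketch under several assumptions that the draft does not state: the SKETCH_PF_LENGTH_MOD and SKETCH_PF_CONV_SPEC type values, the reduced struct sketch_overrides, the fixed table sizes, and the return convention (0 on success, -1 on failure) are all hypothetical.

```c
#include <stddef.h>

/* The draft only names PF_NONE and PF_LAST; the middle values are
   hypothetical.  */
enum { PF_NONE, SKETCH_PF_LENGTH_MOD, SKETCH_PF_CONV_SPEC, PF_LAST };

/* Reduced stand-in for struct printf_overrides.  */
struct sketch_overrides
{
  size_t size;   /* sizeof(data-type) for a length modifier.  */
};

#define SKETCH_MAX_SPEC 4    /* Longest format specifier accepted.  */
#define SKETCH_MAX_REG 16    /* Capacity of the registration table.  */

struct sketch_entry
{
  int spec[SKETCH_MAX_SPEC];
  int nchars;
  int type;
  struct sketch_overrides pfo;
};

static struct sketch_entry sketch_table[SKETCH_MAX_REG];
static int sketch_count;

/* Mock with the same parameter list as register_printf_override.
   Returns 0 on success and -1 on bad input or a full table.  */
int
sketch_register_override (int *spec_chars, int nchars, int type,
                          struct sketch_overrides *pfo)
{
  if (spec_chars == NULL || pfo == NULL
      || nchars < 1 || nchars > SKETCH_MAX_SPEC
      || type <= PF_NONE || type >= PF_LAST
      || sketch_count == SKETCH_MAX_REG)
    return -1;

  struct sketch_entry *e = &sketch_table[sketch_count++];
  for (int i = 0; i < nchars; i++)
    e->spec[i] = spec_chars[i];
  e->nchars = nchars;
  e->type = type;
  e->pfo = *pfo;
  return 0;
}

/* Number of registrations accepted so far.  */
int
sketch_registered (void)
{
  return sketch_count;
}
```

Because each call appends to a table rather than replacing a single slot, several independent registrations (for example one set for VMX and one for VSX) can coexist.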

Issues and Questions

  • Will we preserve the existing printf hooks registration method?
  • Would changing the definition of struct printf_info by extending it with a flags 'word' and changing spec from wchar_t to int require a struct versioning interface for the registration functions?

  • Due to the __const qualifier on struct printf_info when calling the length-modifier arginfo_fn, the user cannot set the struct printf_info::flags member directly, but must set the length modifier flags into int * __argtypes, and the printf_parsemb internals must copy the user flags to the struct printf_info::flags member. Since the user flags must lie in the mask 0xffff0000, do we want to mask them with & and store them directly into struct printf_info::flags, or do we want to shift them right by 16?

  • By only allowing user flags in 0xffff0000 we limit the number of user flags to 16. This is probably not adequate. Perhaps the addition of an "int *__user" parameter to the arginfo function would work.
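The two storage options raised in the mask question can be compared with a small sketch. The function names are hypothetical; PA_USER_MASK is the value defined earlier on this page.

```c
#define PA_USER_MASK 0xffff0000u

/* Option 1: keep the user bits in place (upper 16 bits).  */
unsigned int
sketch_user_flags_masked (int argtype)
{
  return (unsigned int) argtype & PA_USER_MASK;
}

/* Option 2: move the user bits down into the lower 16 bits.  */
unsigned int
sketch_user_flags_shifted (int argtype)
{
  return ((unsigned int) argtype & PA_USER_MASK) >> 16;
}
```

For an argtype of 0x00030007, option 1 yields 0x00030000 and option 2 yields 0x3. Either way only 16 distinct user bits are available, which is the limitation noted above.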

None: PrintfHooksDesign (last edited 2009-03-02 19:27:56 by RyanScottArnold)