This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.

Re: Host endian independence

From: Damien Zammit <damien at zamaudio dot com>
To: Joseph Myers <joseph at codesourcery dot com>
Cc: libc-alpha <libc-alpha at sourceware dot org>
Date: Thu, 29 Aug 2019 14:45:11 +1000
Subject: Re: Host endian independence
References: <18c8a820-b2c4-8ab2-58d1-8d8c851dbf01@zamaudio.com> <alpine.DEB.2.21.1908271544290.18629@digraph.polyomino.org.uk>

Hi Joseph,

Thanks for your reply.

My goal is to introduce the endian-helpers and remove all code that relies on byte-swapping.
Once that is done, the existing byte-swapping interfaces may no longer be needed and could
potentially be removed.

On 28/8/19 1:54 am, Joseph Myers wrote:
> I don't think these changes are appropriate.
> 
> I think we should make more use of the *existing* byte-swap interfaces, 
> such as be32toh and be64toh in <endian.h>, rather than inventing new ones.

Firstly, endian-helpers.h is not a byte-swapping interface; it is a collection of functions
that read and write streams in a specified endianness, which is something currently missing
from glibc.  I can see that your preference is to reuse the existing byte-swap interfaces.
However, I am suggesting that byte-swapping, in general, is unnecessary and a kludge.
When code swaps based on the host machine's byte order, it is also very confusing for
someone reading it to work out which endianness a stream is stored in.
Knowing the endianness of a stream in advance lets you write one parsing function
for each integer type within a stream that works on any host, and the compiler can
optimize it.  The endianness of the stream can (and should only) be handled at the
interface where the stream is converted from bytes to integer types and vice versa,
rather than having structs containing conditionally swapped bytes.
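
To illustrate the idea, here is a minimal sketch of what such stream helpers could look
like.  The names read_be32, read_le32 and write_be32 are assumptions modelled on the
read_be32/read_be64 names mentioned in this thread; the bodies are illustrative only and
are not the actual contents of endian-helpers.h.

```c
#include <stdint.h>

/* Hypothetical helper: decode a 32-bit big-endian value from a byte stream.
   Only shifts and ORs on individual bytes are used, so the result does not
   depend on the host byte order and no BYTE_ORDER check is needed.  */
static inline uint32_t
read_be32 (const unsigned char *p)
{
  return ((uint32_t) p[0] << 24) | ((uint32_t) p[1] << 16)
         | ((uint32_t) p[2] << 8) | (uint32_t) p[3];
}

/* Hypothetical helper: the same for a little-endian stream.  */
static inline uint32_t
read_le32 (const unsigned char *p)
{
  return ((uint32_t) p[3] << 24) | ((uint32_t) p[2] << 16)
         | ((uint32_t) p[1] << 8) | (uint32_t) p[0];
}

/* Hypothetical helper: encode a 32-bit value into a big-endian stream.  */
static inline void
write_be32 (unsigned char *p, uint32_t v)
{
  p[0] = (unsigned char) (v >> 24);
  p[1] = (unsigned char) (v >> 16);
  p[2] = (unsigned char) (v >> 8);
  p[3] = (unsigned char) v;
}
```

Each function names the stream's byte order once, at the point where bytes become an
integer, and is written the same way for every host.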

> By using those interfaces, tzfile.c, for example, could lose some of its 
> existing endian checks (that would be a very small local change to the 
> implementations of the decode and decode64 functions, larger changes are 
> not needed and make the code less clean because the logical information 
> that certain data is stored in the files in big-endian format is best kept 
> local to the implementations of those two functions, rather than 
> hardcoding that information in lots of places with read_be32 and read_be64 
> names).

If a stream coming from a file is stored in big endian, why not be explicit in
the naming of the functions used to decode it, so that its endianness is clear?
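
For comparison, the two styles under discussion might look roughly like this for a single
32-bit big-endian field.  Both function names are hypothetical and neither body is taken
from tzfile.c; be32toh is the existing glibc interface from <endian.h>, and the sketch
assumes a glibc host with two's-complement int32_t.

```c
#ifndef _DEFAULT_SOURCE
# define _DEFAULT_SOURCE   /* for be32toh on glibc */
#endif
#include <endian.h>
#include <stdint.h>
#include <string.h>

/* Style 1 (hypothetical name): a generically named decode function that keeps
   the "this file is big-endian" knowledge inside its body, using the existing
   be32toh interface.  memcpy avoids unaligned or aliasing accesses.  */
static inline int32_t
decode_with_be32toh (const void *ptr)
{
  uint32_t raw;
  memcpy (&raw, ptr, sizeof raw);
  return (int32_t) be32toh (raw);
}

/* Style 2 (hypothetical name): the byte order is stated in the function name
   and the value is assembled byte by byte, with no reference to the host.  */
static inline int32_t
decode_explicit_be32 (const unsigned char *p)
{
  uint32_t v = ((uint32_t) p[0] << 24) | ((uint32_t) p[1] << 16)
               | ((uint32_t) p[2] << 8) | (uint32_t) p[3];
  return (int32_t) v;
}
```

Both return the same value for the same input bytes; the difference is only in where the
byte order of the stream is made visible to the reader.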

> ... (I'm not convinced that any changes in this area beyond very 
> minimal use of bswap_32 would improve the catgets code.)

I mostly agree, but I picked two directories that were easy targets for my
host-endian-independent demonstration.  The purpose was to keep the code mostly the same,
*except* that there is now no dependence on the host BYTE_ORDER anywhere in the code.

--
Damien Zammit

Follow-Ups:
- Re: Host endian independence
  - From: Joseph Myers
- Re: Host endian independence
  - From: Yann Droneaud

References:
- Host endian independence
  - From: Damien Zammit
- Re: Host endian independence
  - From: Joseph Myers
