This is the mail archive of the
binutils@sourceware.org
mailing list for the binutils project.
Re: ELF octets_per_byte
- From: Dan <dgisselq at verizon dot net>
- To: "Maciej W. Rozycki" <macro at linux-mips dot org>
- Cc: dgisselq at ieee dot org, binutils at sourceware dot org
- Date: Wed, 24 Feb 2016 21:40:38 -0500
- Subject: Re: ELF octets_per_byte
- Authentication-results: sourceware.org; auth=none
- References: <1456242622 dot 30661 dot 448 dot camel at jericho> <alpine dot LFD dot 2 dot 20 dot 1602232312590 dot 7431 at eddie dot linux-mips dot org>
- Reply-to: dgisselq at ieee dot org
Maciej,
> Please also note that the ELF gABI[1] is very explicit about a byte being
> 8-bits wide:
>
> "As described here, the object file format supports various processors
> with 8-bit bytes and either 32-bit or 64-bit architectures.
> Nevertheless, it is intended to be extensible to larger (or smaller)
> architectures. Object files therefore represent some control data with a
> machine-independent format, making it possible to identify object files
> and interpret their contents in a common way. Remaining data in an object
> file use the encoding of the target processor, regardless of the machine
> on which the file was created."
>
> so whenever it refers to a "byte" I think it really means an octet,
> although I do see an ambiguity here as sometimes it uses the term to mean
> a target byte.
>
Sigh. You don't need to convince me that "bytes" versus "octets" make
for a rather confusing nomenclature. They wouldn't be my first choice
of terms. I am all open to a better choice. In my own experience, as
in the above citation, "byte" is used to mean 8 bits. Within binutils,
and gas in particular, however, it appears to be the term used for
the minimum addressable unit.
> > I also propose that the following values are in units of target address
> > space "bytes":
> >
> > ELF header "entry" address
> > section header address
> > symbol value
> > symbol size
> > relocation offset
> > relocation addend
>
> These express target addresses or are directly related to them (e.g.
> offsets) and therefore I'm sure they're best expressed in whatever format
> your target uses. These IMHO certainly qualify as "remaining data"
> referred to in the gABI citation included above.
>
> So with the entry point for example I'd expect whatever representation a
> function pointer stored in memory would have on your target if the
> function pointed was the intended entry point. Likewise with VMAs and
> LMAs used in program headers, section headers, symbol tables, etc.
>
> These do not necessarily have to be "proper" memory addresses even, for
> example the MIPS processor encodes the execution mode in bit #0 of code
> addresses, so in certain cases the entry point in MIPS ELF binaries will
> have bit #0 set even though the memory location referred will have this
> bit clear. So it's really up to you to decide whatever encoding is the
> most appropriate for your architecture.
>
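
The bit #0 convention described in the quoted paragraph can be sketched in C as below. This is only an illustration: the helper names `mode_bit_set` and `memory_location` are hypothetical and are not taken from binutils or from any MIPS header.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: true when bit #0 of a code address is set, i.e.
   when the address carries the execution-mode flag described above.  */
static bool mode_bit_set(uint64_t code_addr)
{
    return (code_addr & 1u) != 0;
}

/* Hypothetical helper: the memory location the code address refers to,
   which always has bit #0 clear.  */
static uint64_t memory_location(uint64_t code_addr)
{
    return code_addr & ~(uint64_t)1;
}
```

The point is that the value stored in the ELF file is the target's own encoding of a code address, not necessarily the raw memory location.
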
> As to the symbol size I think it needs to be set to whatever the
> C-language's `sizeof' operator would return for a unit of storage of the
> same size.
>
As I mentioned earlier this evening, the Zip CPU features:
sizeof(char)=sizeof(short)=sizeof(int)=sizeof(void *)=1 // 32-bits
It's a ... unique architectural feature. :)
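
To make the distinction concrete, here is a minimal C sketch of converting between target "bytes" (minimum addressable units) and octets on a target where one addressable unit is 32 bits wide. The constant `OCTETS_PER_BYTE` and both helper names are hypothetical, chosen for illustration only; they are not binutils APIs.

```c
#include <stdint.h>

/* Assumption for illustration: one addressable unit is 32 bits wide,
   i.e. four octets, as on the target described above.  */
enum { OCTETS_PER_BYTE = 4 };

/* Hypothetical helper: size in octets of an object that occupies
   TARGET_BYTES addressable units.  */
static uint64_t target_bytes_to_octets(uint64_t target_bytes)
{
    return target_bytes * OCTETS_PER_BYTE;
}

/* Hypothetical helper: number of addressable units needed to hold
   OCTETS octets, rounding up to a whole unit.  */
static uint64_t octets_to_target_bytes(uint64_t octets)
{
    return (octets + OCTETS_PER_BYTE - 1) / OCTETS_PER_BYTE;
}
```

Under this assumption, a symbol for an array of three ints would have a size of 3 in target bytes, while its contents would occupy 12 octets in the file.
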
Dan
> References:
>
> [1] "System V Application Binary Interface" - DRAFT - 10 June 2013,
> Section "Data Representation"
> <http://www.sco.com/developers/gabi/latest/ch4.intro.html#data_representation>
>
> HTH,
>
> Maciej