6. Opcodes support

Opcodes support comes in the form of machine generated opcode tables as well as supporting routines.

6.1 Generated files

The basic interface is defined by ‘include/opcode/cgen.h’ which is included by the machine generated ‘<arch>-desc.h’. ‘opcode/cgen.h’ can stand on its own for the target independent stuff, but to get target specific parts of the interface use ‘<arch>-desc.h’.

The generated files are:

‘<arch>-desc.h’: Defines macros, enums, and types used to describe the chip.
‘<arch>-desc.c’: Tables of various things describing the chip. This does not include assembler syntax nor semantic information.
‘<arch>-ibld.c’: Routines for constructing and deconstructing instructions.
‘<arch>-opc.h’: Declarations necessary for assembly/disassembly that aren't used elsewhere and thus left out of ‘<arch>-desc.h’.
‘<arch>-opc.c’: Assembler syntax tables.
‘<arch>-asm.c’: Assembler support routines.
‘<arch>-dis.c’: Disassembler support routines.
‘<arch>-opinst.c’: Operand instance tables. These describe which hardware elements are read and which are written for each instruction. This file isn't generated for all architectures, only ones that can make use of the data. For example the M32R uses them to emit warnings if the output of one parallel instruction is the input of another, and to control creating parallel instructions during optimizing assembly.
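
To make the parallel-instruction check mentioned for ‘<arch>-opinst.c’ concrete, here is a minimal C sketch. It does not reproduce the layout of the generated operand instance tables: the insn_rw struct, the one-bit-per-register masks, and parallel_hazard are hypothetical names invented for illustration, assuming each instruction's reads and writes can be summarized as a register bitmask.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical summary of one instruction's operand instances:
   bit N set means general register N is read (or written).
   The real generated tables are laid out differently.  */
struct insn_rw
{
  uint32_t reads;
  uint32_t writes;
};

/* Return true if the output of one instruction is an input of the
   other, which is the situation the text says the M32R assembler
   warns about when the two are paired in parallel.  */
bool
parallel_hazard (const struct insn_rw *a, const struct insn_rw *b)
{
  return ((a->writes & b->reads) | (b->writes & a->reads)) != 0;
}
```

An assembler could run such a check before pairing two instructions, and fall back to sequential issue (or warn) when it returns true.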

6.2 The .opc file

Files with suffix ‘.opc’ (e.g. ‘m32r.opc’) contain target specific C code that accompanies the cpu description file. The ‘.opc’ file is split into 4 sections:

- opc.h

This section contains additions to the generated ‘$target-opc.h’ file.

Typically defined here are these macros:

#define CGEN_DIS_HASH_SIZE N
Specifies the size of the hash table to use during disassembly. A hash table is built of the selected mach's instructions in order to speed up disassembly.

#define CGEN_DIS_HASH(buffer, value)

Given BUFFER, a pointer to the instruction being disassembled and VALUE, the value of the instruction as a host integer, return an index into the hash chain for the instruction. The result must be in the range 0 to CGEN_DIS_HASH_SIZE-1.

VALUE is only usable if all instructions fit in a portable integer (32 bits).

N.B. The result must depend on opcode portions of the instruction only. Normally one wants to use between 6 and 8 bits of opcode info for the hash table. However, some instruction sets don't use the same set of bits for all insns. Certainly they'll have at least one opcode bit in common with all insns, but beyond that it can vary. Here's a possible definition for sparc.

#undef CGEN_DIS_HASH_SIZE
#define CGEN_DIS_HASH_SIZE 256
#undef CGEN_DIS_HASH
extern const unsigned int sparc_cgen_opcode_bits[];
#define CGEN_DIS_HASH(buffer, insn) \
((((insn) >> 24) & 0xc0) \
 | (((insn) & sparc_cgen_opcode_bits[((insn) >> 30) & 3]) >> 19))

sparc_cgen_opcode_bits would be defined in the ‘asm.c’ section as

/* It is important that we only look at insn code bits
   as that is how the opcode table is hashed.
   OPCODE_BITS is a table of valid bits for each of the
   main types (0,1,2,3).  */
const unsigned int sparc_cgen_opcode_bits[4] = {
  0x01c00000, 0x0, 0x01f80000, 0x01f80000
};
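
The effect of this definition can be checked outside the disassembler with a small standalone sketch. The function name sparc_dis_hash is made up for illustration; it performs the same computation as the macro but drops the BUFFER argument, which that definition never uses, and it assumes 32-bit instruction words.

```c
#include <stdint.h>

#define CGEN_DIS_HASH_SIZE 256

/* Valid opcode bits for each main instruction type (bits 31:30),
   copied from the table in the text.  */
const unsigned int sparc_cgen_opcode_bits[4] = {
  0x01c00000, 0x0, 0x01f80000, 0x01f80000
};

/* Hypothetical function form of the CGEN_DIS_HASH definition:
   bits 31:30 land in bits 7:6 of the result, and the masked
   opcode bits 24:19 land in bits 5:0.  */
unsigned int
sparc_dis_hash (uint32_t insn)
{
  return ((insn >> 24) & 0xc0)
         | ((insn & sparc_cgen_opcode_bits[(insn >> 30) & 3]) >> 19);
}
```

With this sketch, the words 0x86004002 and 0x86204002 (type 2, differing only in bits 24:19) hash to 128 and 132, while 0x40001234 and 0x40ffffff (type 1, where the table masks off every bit) both hash to 64. Non-opcode bits do not affect the result, and every value stays below CGEN_DIS_HASH_SIZE.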

- opc.c

This section contains additions to the generated ‘$target-opc.c’ file.

- asm.c

This section contains additions to the generated ‘$target-asm.c’ file. Typically defined here are functions used by operands whose define-operand entry specifies a parse handler.

- dis.c

This section contains additions to the generated ‘$target-dis.c’ file.

Typically defined here are these macros:
- #define CGEN_PRINT_NORMAL(cd, info, value, attrs, pc, length)
- #define CGEN_PRINT_ADDRESS(cd, info, value, attrs, pc, length)
- #define CGEN_PRINT_INSN function_name
- #define CGEN_BFD_ARCH bfd_arch_<name>
- #define CGEN_COMPUTE_ISA(info)

6.3 Special assembler parsing needs

Often parsing of assembly instructions requires more than what a program-generated assembler can handle. For example one version of an instruction may only accept certain registers, rather than the entire set.

Here's an example taken from the ‘m32r’ architecture.

32 bit addresses are built up with a two instruction sequence: one to load the high 16 bits of a register, and another to or-in the lower 16 bits.

seth r0,high(some_symbol)
or3  r0,r0,low(some_symbol)
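
The arithmetic behind this pair can be checked with a small C sketch. The helper names high16, low16, and seth_or3 are made up for illustration, and the sketch assumes that high() selects the upper 16 bits and low() the lower 16 bits of a 32-bit address, as the description above suggests. It models only the register value, not the relocations the assembler emits.

```c
#include <stdint.h>

/* Hypothetical helpers mirroring the high() and low() pseudo-ops.  */
uint32_t high16 (uint32_t addr) { return addr >> 16; }
uint32_t low16 (uint32_t addr) { return addr & 0xffff; }

/* Model of the two-instruction sequence: returns the final value
   of r0 after both instructions have executed.  */
uint32_t
seth_or3 (uint32_t addr)
{
  uint32_t r0 = high16 (addr) << 16;  /* seth r0,high(addr)  */
  r0 |= low16 (addr);                 /* or3  r0,r0,low(addr)  */
  return r0;
}
```

Because the low half is or-ed in as an unsigned quantity, the two halves recombine to the original address without any carry adjustment.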

When assembling, special code must be called to recognize the high and low pseudo-ops and generate the appropriate relocations. This is indicated by specifying a "parse handler" for the operand in question. Here is the define-operand for the lower 16 bit operand.

(define-operand
  (name ulo16)
  (comment "16 bit unsigned immediate, for low()")
  (attrs)
  (type h-ulo16)
  (index f-uimm16)
  (handlers (parse "ulo16"))
)

The generated parser will call a function named parse_ulo16 for the immediate operand of the or3 instruction. The name of the function is constructed by prepending "parse_" to the argument of the parse spec.

errmsg = parse_ulo16 (cd, strp, M32R_OPERAND_ULO16, &fields->f_uimm16);

But where does one put the parse_ulo16 function? Answer: in the ‘asm.c’ section of ‘m32r.opc’.
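
As a rough idea of what such a handler does, here is a standalone C sketch. It is not the function from ‘m32r.opc’: the name parse_ulo16_sketch is hypothetical, the cd and operand-index arguments shown in the call above are dropped, and only numeric constants are handled, whereas a real handler would also accept symbols and record a relocation. It keeps the convention visible in the call above of returning an error message.

```c
#include <stdlib.h>
#include <string.h>

/* Simplified stand-in for a parse handler.  On success it returns
   NULL, stores the parsed value in *VALUEP and advances *STRP past
   the operand; on failure it returns an error message and leaves
   *STRP unchanged.  */
const char *
parse_ulo16_sketch (const char **strp, unsigned long *valuep)
{
  const char *s = *strp;
  char *end;
  int in_low = 0;
  unsigned long value;

  /* Recognize the low() pseudo-op.  */
  if (strncmp (s, "low(", 4) == 0)
    {
      in_low = 1;
      s += 4;
    }

  value = strtoul (s, &end, 0);
  if (end == s)
    return "expected a numeric constant";
  s = end;

  if (in_low)
    {
      if (*s != ')')
        return "missing `)'";
      ++s;
      value &= 0xffff;          /* keep only the lower 16 bits */
    }
  else if (value > 0xffff)
    return "immediate out of range 0-65535";

  *valuep = value;
  *strp = s;
  return NULL;
}
```

Given the input "low(0x12345678),r1", the sketch stores 0x5678 and leaves the string pointer at ",r1", ready for the next operand.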

This document was generated by Doug Evans on January, 28 2010 using texi2html 1.78.

6.1 Generated files		List of generated files
6.2 The .opc file		Target specific C code
6.3 Special assembler parsing needs		Support for unusual syntax