This is the mail archive of the
cgen@sources.redhat.com
mailing list for the CGEN project.
[patch][rfa]: Decoding (not-so) ambiguous insns in sid/sim
- From: Dave Brolley <brolley at redhat dot com>
- To: sid at sources dot redhat dot com, cgen at sources dot redhat dot com
- Date: Wed, 02 Jan 2002 18:03:51 -0500
- Subject: [patch][rfa]: Decoding (not-so) ambiguous insns in sid/sim
Hi,
You may recall a patch which I submitted (and committed, after approval)
which allowed the decoder for the opcodes-based disassembler to to
distinguish between insns when the set of decodable bits of one insn is
a superset of the decodable bits of another.
http://sources.redhat.com/ml/binutils/2001-11/msg00406.html
This situation occurs when one insn is a specialization of another. This
patch adds the same capability to the decoders used by the cgen-based
simulators in the sim and sid source trees. While the disassembler
solution was as simple as sorting a list of insns, the simulators use a
switch statement which is generated by a tree constructed by a cgen
application to decode the insns. This patch modifes the cgen application
which generates the tree.
The patch can be divided into several pieces:
1) The current tree generator calls a function,
filter-harmlessly-ambiguous-insns, on each tree node before creating
further sub-trees from the insns remaining in the node. This function
removes any insn whose decodable bits are a strict superset of another
insn in the list. It turns out that these insns are not so "harmlessly
ambiguous" for all ISAs. For example, an architecture may define memory
access using a base register, but may also define that the base register
be incremented if a particular register (stack pointer) is used. That
particular instance (specialization) of the insn would have the same
decodable bits as the more general insn and would also require a
particular bit pattern for the field representing the base register. The
current tree generator removes the specialized insns completely by
calling "filter-harmlessly-ambiguous-insns". This patch removes that
call (decode.scm:516) and adds additional code to handle the apparently
ambiguous insns (decode.scm:574, decode.scm:595). This additional code
filters identical insns and chooses a specialized insn over its more
general cousin in the same tree node (The same preference used in the
previous opcodes patch). The more general insn will still be decodable,
since it will appear in other tree nodes which do not contain the
specialized insn.
2) Supporting code for the above is in the new functions in insn.scm
which perform the filtering. Note that
"filter-harmlessly-ambiguous-insns" is no longer used, but I did not
remove it (yet). I will remove it, if no one thinks it could possibly be
useful.
These changes alone result in a tree generator that works, however, not
filtering the previously-believed-to-be-harmlessly-ambiguous insns
exposes a problem in the tree generator which is that, for these insns
it goes on to attempt to use every single insn bit in an effort to
distinguish the insns before reaching the point where it applies the
additional logic that I added. For a 32 bit ISA, this results in 2^n
tree nodes generated (where n is the number of non-decodable bits),
which is clearly unacceptable, not to mention that the additional tree
nodes are all identical and resolve nothing! For one particular port,
the tree generation, which previously took a few minutes now took over
12 hours.
The root of the problem is that the heuristic which chooses bits to use
will eventually choose bits which have no effect on the decoding of the
insn. There is a heuristic function which computes a value representing
the usefulness of each bit to the decoding process. It does correctly
rate these bits as the least useful, but the fact remains that they will
eventually be chosen anyway, as the other usful bits are exhausted. It
turns out that the heuristic function assigns these bits a value of
zero. Unfortunately, the existing heuristic function also assigns the
value zero to the bits representing the specialized insn fields that
were interested in. Two changes were necessary:
1) The change to decode.scm:317 ignores bits which are assigned the
heuristic value of zero.
2) The heuristic function was changed to prioritize bits into 3 categories:
a) bits which are true decodable bits
b) bits which are used in specialization of an insn
c) useless bits
The previous heuristic counted the number of times in the ISA that a bit
was required to be zero (p0) and the number of times it was required to
be one (p1). The function then computed the geometric mean (sqrt (p0 *
p1)). You can see that the more often a bit is constant in an insn, the
higher this value will be. You can also see that if a bit is never
constant in an insn (useless for decoding), the result will be zero. For
a bit which is in a specialized field of an insn, however, one of p0 or
p1 will be zero, resulting in an overal heurstic value of zero. We need
to modify the function so that these bits receive a value greater than
zero, but less that the value assigned to a truly decodable bit. The new
function (decode.scm:241) works as follows:
if (p0 + p1 == 0) // useless for decoding
result = 0;
else if (p0 * p1 == 0) // specialization bit: heuristic range: 0 <=
h < num_insns
result = num_insns - (p0 + p1);
else // decodable bit -- heuristic range: num_insns < h
result = num_insns + (sqrt (p0 * p1));
So the heuristic now chooses decodable bits first, followed by
specialization bits and ignores the useless bits. Note also that a bit
which is always 1 or always 0 is also useless for decoding and will be
assigned a value of zero (since p0 or p1 will be equal to num_insns).
This brought the build time for the port which had ballooned to 12 hours
back down to its normal few minutes.
The result of this patch should be:
o No effect for ports with no specialized insns. The new heuristic will
result in the same prioritization of bits as before.
o Slightly larger decoder tree for ports with specialized insns. One
extra tree level for each node containing a specialized insn. I know of
only one such port. For that port, the specializations are meaningless
since the the semantics of the specialized insn are the as those of the
general insn. This is why the previous filtering implementation was not
a problem. I have tested this port with no regressions.
o The port I'm developing now works as expected :-)
I know this is a complex patch, so please feel free to ask questions.
I'm seeking approval to commit this.
Dave
2002-01-02 Dave Brolley <brolley@redhat.com>
* decode.scm (-distinguishing-bit-population): Compute num-insns, the
number of insns in the list. Update the population count function to
identify and prioritize 3 catgories of useful bits.
(-population-top-few): Don't consider bits with a population count of
zero.
(-build-decode-table-entry): Don't call
filter-harmlessly-ambiguous-insns. Filter out non-specialized and
identical insns at the next tree level.
* insn.scm (filter-harmlessly-ambiguous-insns): Note in a comment that
this function is no longer used.
(filter-non-specialized-ambiguous-insns): New function.
(filter-identical-ambiguous-insns): New function.
(find-identical-insn): New function.
Index: cgen/decode.scm
===================================================================
RCS file: /cvs/src/src/cgen/decode.scm,v
retrieving revision 1.3
diff -c -p -b -r1.3 decode.scm
*** cgen/decode.scm 2000/11/10 16:43:21 1.3
--- cgen/decode.scm 2002/01/02 20:58:04
***************
*** 222,228 ****
(define (-distinguishing-bit-population masks mask-lens values lsb0?)
(let* ((max-length (apply max mask-lens))
(0-population (make-vector max-length 0))
! (1-population (make-vector max-length 0)))
; Compute the 1- and 0-population vectors
(for-each (lambda (mask len value)
(logit 5 " population count mask=" (number->hex mask) " len=" len "\n")
--- 222,229 ----
(define (-distinguishing-bit-population masks mask-lens values lsb0?)
(let* ((max-length (apply max mask-lens))
(0-population (make-vector max-length 0))
! (1-population (make-vector max-length 0))
! (num-insns (length masks)))
; Compute the 1- and 0-population vectors
(for-each (lambda (mask len value)
(logit 5 " population count mask=" (number->hex mask) " len=" len "\n")
***************
*** 240,247 ****
(list->vector
(map (lambda (p0 p1)
(logit 4 p0 "/" p1 " ")
! ; (sqrt (+ p0 p1 (* p0 p1))) ; funny function - nice curve
! (sqrt (* p0 p1))) ; geometric mean
(vector->list 0-population) (vector->list 1-population))))
)
--- 241,262 ----
(list->vector
(map (lambda (p0 p1)
(logit 4 p0 "/" p1 " ")
! ; The most useful bits for decoding are those with counts in both
! ; p0 and p1. These are the bits which distinguish one insn from
! ; another. Assign these bits a high value (greater than num-insns).
! ;
! ; The next most useful bits are those with counts in either p0
! ; or p1. These bits represent specializations of other insns.
! ; Assign these bits a value between 0 and (num-insns - 1). Note that
! ; p0 + p1 is guaranteed to be <= num-insns.
! ;
! ; Bits with no count in either p0 or p1 are useless for decoding
! ; and should never be considered. Assigning these bits a value of
! ; 0 ensures this.
! (cond
! ((= (+ p0 p1) 0) 0)
! ((= (* p0 p1) 0) (- num-insns (+ p0 p1)))
! (else (+ num-insns (sqrt (* p0 p1))))))
(vector->list 0-population) (vector->list 1-population))))
)
***************
*** 302,311 ****
" picks=(" old-picks ") pop=(" remaining-population ")"
" threshold=" count-threshold " new-picks=(" new-picks ")\n")
(cond
; No new matches?
((null? new-picks)
(if (null? old-picks)
! (logit 1 "-population-top-few: No bits left to pick from!\n"))
old-picks)
; Way too many matches?
((> (+ (length new-picks) (length old-picks)) (+ size 3))
--- 317,332 ----
" picks=(" old-picks ") pop=(" remaining-population ")"
" threshold=" count-threshold " new-picks=(" new-picks ")\n")
(cond
+ ; No point picking bits with population count of zero. This leads to
+ ; the generation of layers of subtables which resolve nothing. Generating
+ ; these tables can slow the build by several orders of magnitude.
+ ((= 0 count-threshold)
+ (logit 2 "-population-top-few: count-threshold is zero!\n")
+ old-picks)
; No new matches?
((null? new-picks)
(if (null? old-picks)
! (logit 2 "-population-top-few: No bits left to pick from!\n"))
old-picks)
; Way too many matches?
((> (+ (length new-picks) (length old-picks)) (+ size 3))
***************
*** 495,501 ****
(define -build-decode-table-entry-args #f)
(define (-build-decode-table-entry insn-vec startbit decode-bitsize index index-list lsb0? invalid-insn)
! (let ((slot (filter-harmlessly-ambiguous-insns (vector-ref insn-vec index))))
(logit 2 "Processing decode entry "
(number->string index)
" in "
--- 516,522 ----
(define -build-decode-table-entry-args #f)
(define (-build-decode-table-entry insn-vec startbit decode-bitsize index index-list lsb0? invalid-insn)
! (let ((slot (vector-ref insn-vec index)))
(logit 2 "Processing decode entry "
(number->string index)
" in "
***************
*** 553,559 ****
; If bitnums is still nil there is an ambiguity.
(if (null? bitnums)
!
(begin
; If all insns are marked as DECODE-SPLIT, don't warn.
(if (not (all-true? (map (lambda (insn)
--- 574,589 ----
; If bitnums is still nil there is an ambiguity.
(if (null? bitnums)
! (begin
! ; Try filtering out insns which are more general cases of
! ; other insns in the slot. The filtered insns will appear
! ; in other slots as appropriate.
! (set! slot (filter-non-specialized-ambiguous-insns slot))
!
! (if (= 1 (length slot))
! ; Only 1 insn left in the slot, so take it.
! (dtable-entry-make index 'insn (car slot))
! ; There is still more than one insn in 'slot', so there is still an ambiguity.
(begin
; If all insns are marked as DECODE-SPLIT, don't warn.
(if (not (all-true? (map (lambda (insn)
***************
*** 565,571 ****
(string-append ", " (obj:name insn)))
slot))
"\n"))
! ; Things aren't entirely hopeless. See if any ifield-assertion
; specs are present.
; FIXME: For now we assume that if they all have an
; ifield-assertion spec, then there is no ambiguity (it's left
--- 595,608 ----
(string-append ", " (obj:name insn)))
slot))
"\n"))
! ; Things aren't entirely hopeless. We've warned about the ambiguity.
! ; Now, if there are any identical insns, filter them out. If only one
! ; remains, then use it.
! (set! slot (filter-identical-ambiguous-insns slot))
! (if (= 1 (length slot))
! ; Only 1 insn left in the slot, so take it.
! (dtable-entry-make index 'insn (car slot))
! ; Otherwise, see if any ifield-assertion
; specs are present.
; FIXME: For now we assume that if they all have an
; ifield-assertion spec, then there is no ambiguity (it's left
***************
*** 588,594 ****
(dtable-entry-make index 'expr
(exprtable-make
(-gen-exprtable-name exprtable-entries)
! exprtable-entries)))))
; There is no ambiguity so generate the subtable.
; Need to build `subtable' separately because we
--- 625,631 ----
(dtable-entry-make index 'expr
(exprtable-make
(-gen-exprtable-name exprtable-entries)
! exprtable-entries))))))))
; There is no ambiguity so generate the subtable.
; Need to build `subtable' separately because we
Index: cgen/insn.scm
===================================================================
RCS file: /cvs/src/src/cgen/insn.scm,v
retrieving revision 1.5
diff -c -p -b -r1.5 insn.scm
*** cgen/insn.scm 2001/09/18 03:17:12 1.5
--- cgen/insn.scm 2002/01/02 20:58:04
***************
*** 708,715 ****
; Filter out instructions whose ifield patterns are strict subsets of
! ; another. For decoding purpose, it is sufficient to consider the
; more general cousin.
(define (filter-harmlessly-ambiguous-insns insn-list)
(logit 3 "Filtering " (length insn-list) " instructions.\n")
--- 708,720 ----
; Filter out instructions whose ifield patterns are strict subsets of
! ; another. For decoding purposes, it is sufficient to consider the
; more general cousin.
+ ;
+ ; NOTE: Not currently used. This filtering is not harmless for ISAs
+ ; in which certain values of a given ifield change the semantics of
+ ; the insn. For example, encoding register 15 in a field normally
+ ; used for specifying a register may result in a different addressing mode.
(define (filter-harmlessly-ambiguous-insns insn-list)
(logit 3 "Filtering " (length insn-list) " instructions.\n")
***************
*** 735,740 ****
--- 740,799 ----
insn-list)
)
+ ; Filter out instructions whose ifield patterns are strict supersets of
+ ; another, keeping the less general cousin. Used to resolve ambiguity
+ ; when there are no more bits to consider.
+
+ (define (filter-non-specialized-ambiguous-insns insn-list)
+ (logit 3 "Filtering " (length insn-list) " instructions for non specializations.\n")
+ (find (lambda (insn)
+ (let* ((i-mask (insn-base-mask insn))
+ (i-mask-len (insn-base-mask-length insn))
+ (i-value (insn-value insn))
+ (subset-insn (find-first
+ (lambda (insn2) ; insn2: possible submatch (more mask bits)
+ (let ((i2-mask (insn-base-mask insn2))
+ (i2-mask-len (insn-base-mask-length insn2))
+ (i2-value (insn-value insn2)))
+ (and (not (eq? insn insn2))
+ (= i-mask-len i2-mask-len)
+ (mask-superset? i-mask i-value i2-mask i2-value))))
+ insn-list))
+ (keep? (not subset-insn)))
+ (if (not keep?)
+ (logit 2
+ "Instruction " (obj:name insn) " specialization-filtered by "
+ (obj:name subset-insn) "\n"))
+ keep?))
+ insn-list)
+ )
+
+ ; Filter out instructions whose ifield patterns are identical.
+
+ (define (filter-identical-ambiguous-insns insn-list)
+ (logit 3 "Filtering " (length insn-list) " instructions for identical variants.\n")
+ (let loop ((l insn-list) (result nil))
+ (cond ((null? l) (reverse! result))
+ ((find-identical-insn (car l) (cdr l)) (loop (cdr l) result))
+ (else (loop (cdr l) (cons (car l) result)))
+ )
+ )
+ )
+
+ (define (find-identical-insn insn insn-list)
+ (let ((i-mask (insn-base-mask insn))
+ (i-mask-len (insn-base-mask-length insn))
+ (i-value (insn-value insn)))
+ (find-first
+ (lambda (insn2)
+ (let ((i2-mask (insn-base-mask insn2))
+ (i2-mask-len (insn-base-mask-length insn2))
+ (i2-value (insn-value insn2)))
+ (and (= i-mask-len i2-mask-len)
+ (= i-mask i2-mask)
+ (= i-value i2-value))))
+ insn-list))
+ )
; Helper function for above: does (m1,v1) match a STRICT superset of (m2,v2) ?
;