Bug 29494 - Trailing jump table leads to "Error: unaligned opcodes detected in executable segment" on ARM thumb
Summary: Trailing jump table leads to "Error: unaligned opcodes detected in executable...
Alias: None
Product: binutils
Classification: Unclassified
Component: gas (show other bugs)
Version: 2.39
: P2 normal
Target Milestone: ---
Assignee: Nick Clifton
Depends on:
Reported: 2022-08-15 11:31 UTC by Angus Gratton
Modified: 2022-08-30 12:47 UTC (History)
2 users (show)

See Also:
Last reconfirmed: 2022-08-23 00:00:00

ltrans assembler listing that exhibits the problem (25.24 KB, text/plain)
2022-08-15 11:31 UTC, Angus Gratton
Proposed Patch (27.64 KB, patch)
2022-08-22 15:18 UTC, Nick Clifton
Details | Diff
Simpler patch (719 bytes, patch)
2022-08-23 07:00 UTC, Alan Modra
Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Angus Gratton 2022-08-15 11:31:52 UTC
Created attachment 14281 [details]
ltrans assembler listing that exhibits the problem

When LTO is enabled then it's possible for gcc to generate a jump table where the table data is the last thing in its executable section. (i.e. all of the jump table entries point to offsets earlier in the section, and no other instructions appear after it.)

On ARM thumb (min 2 byte instructions), if a jump table with single byte entries has an odd number of entries then the section ends at an odd numbered offset.

If also generating DWARF debug info, it appears the check in gas/dwarf2db.c scale_addr_delta() will then fail with "unaligned opcodes detected in executable segment".

However, the assembly listing is otherwise correct.

I ran across this in the MicroPython project, with a code snippet in objint_mpz.c:315 as follows:

        switch (op) {
            case MP_BINARY_OP_LESS:
                return mp_obj_new_bool(cmp < 0);
            case MP_BINARY_OP_MORE:
                return mp_obj_new_bool(cmp > 0);
            case MP_BINARY_OP_LESS_EQUAL:
                return mp_obj_new_bool(cmp <= 0);
            case MP_BINARY_OP_MORE_EQUAL:
                return mp_obj_new_bool(cmp >= 0);
            case MP_BINARY_OP_EQUAL:
                return mp_obj_new_bool(cmp == 0);

                return MP_OBJ_NULL; // op not supported

With -Os and no LTO, gcc 12.1 generates a jump table, and emits more assembly after it:

        .loc 1 315 9 view .LVU1204
        bl      __gnu_thumb1_case_uqi
        .byte   (.L109-.L105)/2
        .byte   (.L108-.L105)/2
        .byte   (.L107-.L105)/2
        .byte   (.L106-.L105)/2
        .byte   (.L104-.L105)/2
        .p2align 1
        .loc 1 317 17 is_stmt 1 view .LVU1205
        .loc 3 781 24 view .LVU1206
        .loc 3 782 5 view .LVU1207
        .loc 3 782 30 is_stmt 0 view .LVU1208
        cmp     r3, #0
        blt     .LCB2415
        b       .L70    @long jump

With LTO enabled, the local translations further optimise and rearrange until the jump table appears at the end of the executable listing for the enclosing function and section:

	.loc 2 315 9 is_stmt 1 view .LVU1105
	ldr	r2, [sp, #16]
	cmp	r2, #4
	bhi	.L13
	movs	r0, r2
	bl	__gnu_thumb1_case_sqi
	.byte	(.L109-.L111)/2
	.byte	(.L108-.L111)/2
	.byte	(.L118-.L111)/2
	.byte	(.L106-.L111)/2
	.byte	(.L104-.L111)/2
        .size   mp_obj_int_binary_op, .-mp_obj_int_binary_op
        .file 10 "<built-in>"
        .section        .debug_info,"",%progbits

As a result, the executable section ends with an odd length, and linking fails due to the assembler DWARF generation error:

build-NUCLEO_F091RC/firmware.elf.ltrans1932.ltrans.s: Assembler messages:
build-NUCLEO_F091RC/firmware.elf.ltrans1932.ltrans.s: Error: unaligned opcodes detected in executable segment
make[1]: *** [/tmp/ccfzQYxO.mk:3866: build-NUCLEO_F091RC/firmware.elf.ltrans1932.ltrans.o] Error 1

Manually running "arm-none-eabi-as firmware.elf.ltrans1932.ltrans.s" produces the same error, and inserting a padding ".byte 0" in the listing after the jump table will fix the error.

It might be the case that this is gcc's fault for generating an executable section with an odd length, but given the table is the last thing in the section it seems reasonable not to pad it.

Indeed, if "-g" isn't passed to the compiler then the assembler produces valid binary code and everything works, it's only DWARF generation which fails.

I am wondering if a suitable patch would be to ignore this check if there are no additional opcodes following the odd offset, i.e. ignore the failure condition if it's positioned at the end of a section. If so, I'm happy to try and write that patch. However, I'm not very familiar with gas or DWARF so I'm guessing it might be much more complex than this!

Looks like I can only attach one file, so rather than an archive I've attached the ltrans assembler listing which exhibits the issue. If it's helpful then I can provide more example files, or write a simpler assembler listing to reproduce.

Thanks in advance for any insights!
Comment 1 Nick Clifton 2022-08-17 14:30:55 UTC
Hi Angus,

  Hmmm, this is a tricky one.

  The easiest solution for us would be to have gcc always ensure that the end of the jump table is aligned to a 2-byte boundary.  (It does this if the table is in the middle of the code, so why not at the end as well ?)

  Since that is unlikely to happen ... the next easiest thing to do would be to downgrade the error message to a warning message.  But users will still complain, even if they are now getting working binaries.

  The problem is - how can the assembler determine that this is the last byte of a code section, and that it is an offset byte and not an instruction.  Not a simple task.  Also we really need a generic solution since there is no reason why this should just be an ARM specific problem.

  Hmmm - do we know if .cfi_endproc will always appear at the end of a function ?  If so then maybe that pseudo-op could be extended to align the end of the function up to the next suitable boundary ?  I think that I need to think on this some more...

Comment 2 Nick Clifton 2022-08-22 15:18:23 UTC
Created attachment 14287 [details]
Proposed Patch

Hi Angus,

  Would you mind trying out the uploaded patch ?

  I think that I have found a way to suppress the check at the end of an executable section, which should be safe since there will not be any following code.  (And when the linker combines input executable sections it will make sure that they start on an suitable alignment boundary, so padding will be inserted automatically if needed).

Comment 3 Alan Modra 2022-08-23 07:00:25 UTC
Created attachment 14290 [details]
Simpler patch

Nick, I think you can just make use of line_delta == INT_MAX as a flag rather than setting a new one everywhere.
Comment 4 Nick Clifton 2022-08-23 08:50:39 UTC
(In reply to Alan Modra from comment #3)
> Nick, I think you can just make use of line_delta == INT_MAX as a flag
> rather than setting a new one everywhere.

Yeah - that is a better solution, thanks.

Angus - would you be able to confirm that this patch solves your problem ?  I am pretty confident that it works, but I would like some real-world assurance before committing it.
Comment 5 Angus Gratton 2022-08-23 11:18:49 UTC
Thanks heaps Nick for the patch. It's going to be a couple of days until I can test this, sorry. But I will definitely do so!
Comment 6 Angus Gratton 2022-08-28 03:39:42 UTC
Thanks heaps for being patient and for looking into this so quickly.

I just tested Alan's "Simpler patch" and it seems to solve the issue entirely.
Comment 7 cvs-commit@gcc.gnu.org 2022-08-28 11:38:39 UTC
The master branch has been updated by Alan Modra <amodra@sourceware.org>:


commit 6f6f5b0adc9efd103c434fd316e8c880a259775d
Author: Alan Modra <amodra@gmail.com>
Date:   Tue Aug 23 16:18:25 2022 +0930

    PR29494 Trailing jump table on ARM
    out_inc_line_addr and relax_inc_line_addr are passed INT_MAX as
    line_delta to flag end of section.  This filters its way down to
    size_inc_line_addr and emit_inc_line_addr.  Pass line_delta on to
    scale_addr_delta where it can be used to omit an unaligned opcode
            PR 29494
            * dwarf2dbg.c (scale_addr_delta): Delete unnecessary forward decl.
            Add line_delta param.  Don't print error at end of section, just
            round the address down.
            (size_inc_line_addr, emit_inc_line_addr): Adjust calls.
Comment 8 Alan Modra 2022-08-28 11:39:52 UTC
Patch pushed, I'll leave it to Nick to push the testcase.
Comment 9 cvs-commit@gcc.gnu.org 2022-08-30 12:46:37 UTC
The master branch has been updated by Nick Clifton <nickc@sourceware.org>:


commit 6f4eb56ec7d986030ea4f81f3d9f05180964bbc1
Author: Nick Clifton <nickc@redhat.com>
Date:   Tue Aug 30 13:46:11 2022 +0100

    Add a testcase for PR 29494.
            PR 29494
            * testsuite/gas/arm/pr29494.s: New test source file.
            * testsuite/gas/arm/pr29494.d: New test driver.
Comment 10 Nick Clifton 2022-08-30 12:47:39 UTC
Test case added.