Bug 25677 - Changes to --*magic options for pdp11-aout target
Summary: Changes to --*magic options for pdp11-aout target
Status: RESOLVED FIXED
Alias: None
Product: binutils
Classification: Unclassified
Component: ld (show other bugs)
Version: 2.35
: P2 enhancement
Target Milestone: ---
Assignee: Not yet assigned to anyone
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2020-03-15 00:00 UTC by Stephen Casner
Modified: 2020-04-14 13:45 UTC (History)
1 user (show)

See Also:
Host:
Target: pdp11-aout
Build:
Last reconfirmed: 2020-03-25 00:00:00


Attachments
Proposed enhancement (3.92 KB, patch)
2020-03-15 00:00 UTC, Stephen Casner
Details | Diff
make check summary log (1.17 KB, text/plain)
2020-03-28 17:04 UTC, Stephen Casner
Details
ld make check summary log (2.58 KB, text/plain)
2020-03-29 00:17 UTC, Stephen Casner
Details
Completed patch including tests and doc (7.02 KB, patch)
2020-04-09 01:24 UTC, Stephen Casner
Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Stephen Casner 2020-03-15 00:00:37 UTC
Created attachment 12378 [details]
Proposed enhancement

This enhancement consists of two changes related to the ld magic
number options.

First, the default format for the pdp11-aout target should be --omagic
rather than --nmagic to be consistent with the behavior of the ld
programs both in Unix v6 at the beginning of Unix on the PDP11 and in
2.11 BSD at its end.  That format does not page-align the .data
section and generates magic number 0407 which indicates to the runtime
system that the .text section should not be read-only.  I've changed
the default only for the pdp11 emulation.

Second, I have added a new option --imagic to generate output with
magic number 0411 (defined as IMAGIC in Unix v6 ld.c) that puts both
the .text and .data sections starting at address 0 for loading into
the separate instruction and data address spaces of larger PDP11
models.  For ld on Unix v6 this format was selected with option -i,
but that option already has another meaning in binutils ld.  In 2.11
BSD the option -z was added as a synonym for -i, so within the pdp11
emulation I also implemented that option as a synonym for --imagic.

These changes affect only the pdp11 emulation and they are implemented
all within an expansion of the customization files for the pdp11
emulation except for two one-line changes in the common code as
explained below.

The expansion of the pdp11 customization consists of adding a pdp11.em
file containing _before_parse() to adjust the default config settings
consistent with --omagic.  It also contains functions _add_options(),
_list_options() and _handle_option() to implement the additional
options including printing pdp11-specific help for all of the --*magic
options.  I also needed a new _get_script() function that compiles-in
a sixth linker script where the .data section starts at address 0.
Since genscripts does not produce such a script, when compiling-in the
scripts I reuse the script that aligns the data to a page boundary but
edit it to set address 0 instead.

Related to this, I observe that the code in the adjust_*_magic()
functions in bfd/aoutx.h and bfd/pdp11.c that sets the VMA for data is
not effective because the alignment directive in the linker script
overrides it.

I also added a pdp11.sc file instead of using aout.sc so I could
remove some sections that are extraneous for the pdp11 emulation.

Comments are requested regarding the following design choices:

  - I have used the existing boolean ld_config_type::separate_code to
    indicate when --imagic is seen in the option parsing.  To me its
    meaning seemed consistent with the new way I'm using it, but I
    could add another boolean if that would be preferred.

  - Rather than define another flag bit in bfd_target::object_flags,
    I'm copying the state of the separate_code boolean into a local
    variable in bfd/pdp11.c at _final_link() time for reference later
    in adjust_sizes_and_vmas() as the sections are written.

  - Since the --imagic format intentionally causes the .text and .data
    sections to occupy the same memory addresses, I had to set
    command_line.check_section_addresses = 0.  Would that cause any
    other problems?

  - In my generated epdp11.c, the function name prefix is gldpdp11
    rather than gld_pdp11 since that is how the code was written in
    the emulation from which I borrowed code.  But I also saw the
    underscore included in a different emulation.  Is one of these
    considered correct?

The two one-line changes that I needed to make in common code are as
follows:

  - I needed to add i_magic to enum aout_magic in bfd/libaout.h.

  - In ld/lexsup.c, I needed to set config.text_read_only to TRUE in
    case 'n'.  This is the same as the default value of that boolean
    as initialized in ldmain.c so I assume my change won't affect any
    non-pdp11 emulation.  I need this change so that my change of the
    default value for pdp11 will work.  That boolean is initialized to
    FALSE or TRUE in case 'N' and case OPTION_NO_OMAGIC.

I have not yet implemented changes to the documentation or test
cases pending review of the code.  The diff is attached.
Comment 1 Nick Clifton 2020-03-25 17:42:41 UTC
Hi Stephen,

  Thanks very much for creating this patch.  The changes themselves
  look fine, although there is one issue - a linker testsuite
  regression:

    FAIL: extract symbols

  This is from the ld/testsuite/ld-scripts section of the tests and
  happened for me when testing with a toolchain configured for
  "pdp11-dec-aout".  An examination of the test and the log file
  suggests that the problem is that file containing the extracted
  symbol also contains some data.  I am not sure why this would be
  happening, but please could you investigate.

  Another important issue is that in order to be able to accept the
  patch we will need an FSF Copyright assignment from you.  You can
  start this process by filling out the form here and then emailing
  it off:

http://git.savannah.gnu.org/gitweb/?p=gnulib.git;a=blob_plain;f=doc/Copyright/request-assign.changes;hb=HEAD

  The patch will definitely need to be extended to include some 
  documentation in the ld/ld.texi file.  It would also help if you
  added a sentence or two to the ld/NEWS file mentioning the new
  feature.

  Ideally we would also like to see a new linker test to check the
  behaviour of the new code.  I am not sure how easy this will be
  to write, but please do try.

Cheers
  Nick
Comment 2 Stephen Casner 2020-03-28 00:37:09 UTC
I could use some guidance regarding running the testsuite.  After reading the binutils porting document I have installed dejagnu, but I don't know what information is needed in the global config, for example.  Although I have a few simulated PDP11 environments set up with different operating systems, none of them have FTP and TELNET servers operating, as required for remote target testing.  Was the test failure you observed found just in host-based validity checks on the output files?
Comment 3 Stephen Casner 2020-03-28 17:04:56 UTC
Created attachment 12412 [details]
make check summary log

I should add that I did run "make check" without any other preparation with the results as in the attached summary.  More than one error was reported, which is different than the one error you cited.  Also, I don't know if "Running target unix" and "Assuming target board is the local machine" are the desired conditions.
Comment 4 Stephen Casner 2020-03-29 00:17:29 UTC
Created attachment 12414 [details]
ld make check summary log
Comment 5 Nick Clifton 2020-03-30 12:12:50 UTC
Hi Stephen,

  Judging by the attached log, you seem to have gotten the hang of running the linker testsuite.  Do not worry that there are quite a few failures for the pdp11-aout target.  This is fairly normal with older targets that do not have active maintainers.

  The important issue is the extract symbols failure in the script.exp tests:

    Running ../../ld/testsuite/ld-scripts/script.exp ...
    [...]
    FAIL: extract symbols

According to my logs this failure is new.  Ie it only happens after applying your patch.  To find out more about the failure, have a look in the ld.log file which should be in the ld directory.  When the test passes the log should contain lines like this:

     [...]
    /home/nickc/work/builds/binutils/current/pdp11-dec-aout/ld/../binutils/size  tmpdir/extract
    Executing on host: sh -c {/home/nickc/work/builds/binutils/current/pdp11-dec-aout/ld/../binutils/size  tmpdir/extract 2>&1}  /dev/null ld.tmp (timeout = 300)
    spawn [open ...]
       text	   data	    bss	    dec	    hex	filename
          0	      0	      0	      0	      0	tmpdir/extract
       text	   data	    bss	    dec	    hex	filename
          0	      0	      0	      0	      0	tmpdir/extract
    PASS: extract symbols

Note how the data field is zero.  What I found with your patch applied is that this field is now 4.  So presumably some new data has been added to the "extract" binary.

It may be that the new behaviour is correct, the data should be there.  Note however that the extract binary has been created by running "objcopy --extract-symbol" on another file, (see earlier in ld.log), so normally you would only expect symbols to be present in the output file, not data.  So you need to work out if the presence of the data represents a bug or not.

If the data should be there you will need to update the ld/testsuite/ld-scripts/script.exp file.  The extract_symbol_test procedure (starting at line 110) is where you need to look, and the specific test starts at line 174.  There are some examples above of adding "expected failure" results for other targets (and for earlier parts of the test), so you can use those as a guide.

Cheers
  Nick
Comment 6 Stephen Casner 2020-03-30 16:10:40 UTC
The key point I did not know is that failures were expected.  I have been studying the detail log and the test scripts (I have programmed in Tcl, but it was about 30 years ago) to see why FAIL: extract scripts and the two FAIL: MEMORY failures preceding it occur.  The symbol extraction is probably caused by my introduction of pdp11.sc so I can investigate that.  The other two occur because the tests assume a word size and address space larger than 16 bits.  For example, in ld-scripts/script.exp the "MEMORY with symbols" test sets DATA_LENGTH=0x10000 but I think the choice of a number larger that 16 bits is arbitrary.  Would it be reasonable to introduce changes to make the tests compatible with a 16-bit address space?

For the test "MEMORY", the failure occurs because nm is not properly treating addresses as 16 bits, and that might be considered a real bug.  Here's the output:

00001004 D data_end
00001000 T data_start
00001000 D data_symbol
00001000 D fred
00000104 T text_end
00000100 T text_start
00000100 T text_symbol
00000100 t tmpdir/script.o
ffffffffffff8100 D tred

For the PDP-11 with 16-bit address space this should more properly be:

1004 D data_end
1000 T data_start
1000 D data_symbol
1000 D fred
0104 T text_end
0100 T text_start
0100 T text_symbol
0100 t tmpdir/script.o
8100 D tred

There is pretty clearly a bug where 0x8100 gets sign extended to 64 bits.

One side question you might be able to answer for me: where is "fail" implemented?  I searched but could not find it.
Comment 7 Nick Clifton 2020-03-30 16:30:01 UTC
Hi Stephen

(In reply to Stephen Casner from comment #6)
  
> One side question you might be able to answer for me: where is "fail"
> implemented?  I searched but could not find it.

Look in: /usr/share/dejagnu/framework.exp

Cheers
  Nick
Comment 8 Stephen Casner 2020-04-05 05:03:24 UTC
(In reply to Nick Clifton from comment #5)
>   The important issue is the extract symbols failure in the script.exp tests:
> 
>     Running ../../ld/testsuite/ld-scripts/script.exp ...
>     [...]
>     FAIL: extract symbols
> 
> According to my logs this failure is new.  Ie it only happens after applying
> your patch.

The aspect of my patch that exposed this failure is the change of the default output format to --omagic.  The test would have failed for the code without my patch if the previous test had done "ld --omagic -o script script.o".

The code that produces the erroneous data section size 4 rather than 0 is in bfd/pdp11.c function adjust_o_magic().  It was changed as part of the recent commit dda2980f54a by Alan Modra to fix PR 25569, but I'm not asserting that this commit was incorrect.  I'm not sure what should change.

For the failing test in question, objcopy is invoked with option --extract-symbol which should reduce the text and data sections to zero length.  In the output bfd section structures the size is set to zero but the VMAs are not.  Because objcopy also sets the user_set_vma flag to true, that causes the code in adjust_o_magic() to see that the difference between the vmas for data and bss is 4 while the size of data is 0, so it adds a padding of 4 to the data size.  That's what comes out as 4 in the a.out header.

Should objcopy also set the VMAs to zero?  Or should it not set the user_set_vma flags?
Comment 9 Stephen Casner 2020-04-05 17:27:11 UTC
Another solution, perhaps the most straightforwad one, would be to change the code in adjust_o_magic() to not pad the size of the data section if the size of the bss section is zero.
Comment 10 Nick Clifton 2020-04-07 10:08:40 UTC
(In reply to Stephen Casner from comment #9)
Hi Stephen,

> Another solution, perhaps the most straightforwad one, would be to change
> the code in adjust_o_magic() to not pad the size of the data section if the
> size of the bss section is zero.

That makes sense to me.  Would you care to submit a patch ?

Cheers
  Nick
Comment 11 Stephen Casner 2020-04-09 01:24:45 UTC
Created attachment 12448 [details]
Completed patch including tests and doc

Here is a patch against a current git pull that I believe includes all of the requested changes.
Comment 12 Nick Clifton 2020-04-09 11:11:50 UTC
(In reply to Stephen Casner from comment #11)
Hi Stephen

> Here is a patch against a current git pull that I believe includes all of
> the requested changes.

You appear to be submitting the same patch as Jan... :-)

The patch has the same problem as outlined here:

  https://sourceware.org/pipermail/binutils/2020-April/110597.html

Basically the definitions of coff_swap_scnhdr_[in|out] need to
be changed so that other COFF based targets will build.

Cheers
  Nick
Comment 13 Stephen Casner 2020-04-09 14:46:36 UTC
(In reply to Nick Clifton from comment #12)
> You appear to be submitting the same patch as Jan... :-)
> 
> The patch has the same problem as outlined here:
> 
>   https://sourceware.org/pipermail/binutils/2020-April/110597.html
> 
> Basically the definitions of coff_swap_scnhdr_[in|out] need to
> be changed so that other COFF based targets will build.

Nick, perhaps some wires got crossed here?  Perhaps you had two patches applied when you checked mine?  My patch should not have any effect on any COFF based targets because the changes are almost entirely within the files specific to the pdp11-aout target.  The only changes in common code are:

  - I needed to add i_magic to enum aout_magic in bfd/libaout.h.

  - In ld/lexsup.c, I needed to set config.text_read_only to TRUE in
    case 'n'.  This is the same as the default value of that boolean
    as initialized in ldmain.c so I assume my change won't affect any
    non-pdp11 emulation.  I need this change so that my change of the
    default value for pdp11 will work.  That boolean is initialized to
    FALSE or TRUE in case 'N' and case OPTION_NO_OMAGIC.

  - In include/aout/aout64.h I needed to define IMAGIC and add two lines
    in N_DATADDR to return 0 when the magic number is IMAGIC.
Comment 14 Sourceware Commits 2020-04-14 13:42:50 UTC
The master branch has been updated by Nick Clifton <nickc@sourceware.org>:

https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;h=fa1477dc34e6ce19b90ff0171074c295133730a3

commit fa1477dc34e6ce19b90ff0171074c295133730a3
Author: Stephen Casner <casner@acm.org>
Date:   Tue Apr 14 14:41:27 2020 +0100

    Fixes for the magic number used in PDP11 AOUT binaries.
    
            PR ld/25677
    include * aout/aout64.h (N_DATADDR): Add IMAGIC case.
    
    bfd     * pdp11.c: Add implementation of --imagic option.
            (adjust_o_magic): Fix objcopy --extract-symbol test.
            * libaout.h (enum aout_magic): Add i_magic.
    
    ld      * emulparams/pdp11.sh (SCRIPT_NAME): Change to pdp11.
            (EXTRA_EM_FILE): New, add emulation file pdp11.
            * scripttempl/pdp11.sc: New, derived from aout.sc without
            irrelevant input sections.
            * emultempl/pdp11.em (_add_options, _handle_option)
            (_list_options): New. Add options -z, --imagic for pdp11-aout.
            (_before_parse): Make --omagic be default instead of --nmagic.
            (_get_script): Modify special-case linker script for --imagic.
            * lexsup.c (parse_args): Explictly set config.text_read_only for -n.
            * ld.texi (Options): Add documentation of PDP11-specific options.
            (Options): Fix unrelated typo to --no-compact-branches.
            * gen-doc.texi: @set PDP11.
            * testsuite/ld-pdp11/pdp11.exp: New, start pdp11 testing.
            * testsuite/ld-pdp11/sections.s: New, source for options tests.
            * testsuite/ld-pdp11/imagic.d: New, test --imagic format.
            * testsuite/ld-pdp11/imagicz.d: New, test -z (imagic) format.
            * testsuite/ld-pdp11/nmagic.d: New, test --nmagic format.
            * testsuite/ld-pdp11/omagic.d: New, test --omagic format.
Comment 15 Nick Clifton 2020-04-14 13:45:56 UTC
Hi Stephen,

  Yes you were totally correct - I had the patches mixed up.  Sorry.

  I have now gone through your patch, found no problems and applied it.

Cheers
  Nick