Bug 32580 - [2.44 regression] Non-bash shell breaks many default linker scripts
Status: ASSIGNED
Alias: None
Product: binutils
Classification: Unclassified
Component: ld
Version: 2.44
Importance: P2 normal
Target Milestone: 2.44
Assignee: Nick Clifton
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2025-01-21 13:18 UTC by Rainer Orth
Modified: 2025-02-02 11:29 UTC
CC List: 2 users

See Also:
Host:
Target: *-*-solaris2.11
Build:
Last reconfirmed:


Attachments
proposed patch (hack?) (452 bytes, patch)
2025-01-21 15:26 UTC, Rainer Orth
Another proposed patch (411 bytes, patch)
2025-01-28 14:29 UTC, Nick Clifton

Description Rainer Orth 2025-01-21 13:18:16 UTC
When trying the binutils 2.44 branch on Solaris, I found that roughly half the
ld tests FAIL with

./ld-new:built in linker script:0: syntax error

While this error is as useless as it gets, it turns out that the same problem
happens with the default linker script files, too.  In particular, a link with
gcc -shared fails as above, while the same link without -shared works.

Comparing the generated linker scripts between 2.43 and 2.44, I see that e.g.
elf_i386_sol2.xs is heavily truncated, like

@@ -94,144 +94,3 @@
   /* Adjust the address for the data segment.  We want to adjust up to
      the same address within the page on the next page up.  */
   . = DATA_SEGMENT_ALIGN (CONSTANT (MAXPAGESIZE), CONSTANT (COMMONPAGESIZE));
-  /* Exception handling.  */
-  .eh_frame       : ONLY_IF_RW { KEEP (*(.eh_frame)) *(.eh_frame.*) }
-  .sframe         : ONLY_IF_RW { *(.sframe) *(.sframe.*) }
[...]

As it turns out, this only happens when CONFIG_SHELL is /bin/ksh (or /bin/sh),
which is ksh93.  With /bin/bash, the linker scripts are generated correctly.
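
A quick way to spot affected scripts after a build is to check whether the braces still balance. This is only a rough sketch: the helper name looks_truncated is made up for illustration, and it assumes a script cut off inside SECTIONS is missing at least one closing brace and that no braces appear inside comments.

```shell
# Hypothetical helper (not part of binutils): succeed when the number of
# '{' and '}' characters in the file differs, which hints at truncation.
looks_truncated() {
  # $(( ... )) strips the leading blanks some wc implementations print.
  opens=$(( $(tr -cd '{' < "$1" | wc -c) ))
  closes=$(( $(tr -cd '}' < "$1" | wc -c) ))
  test "$opens" -ne "$closes"
}
```

For example, `looks_truncated ldscripts/elf_i386_sol2.xs && echo truncated` could be run over each generated script in the build tree.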

This is a regression from 2.43.

I'm still trying to figure out what exactly caused this.  So far I've found
that the problem started with

commit fe217087a4b8aa214a221ca9f033c5fcdbcee90e
Author: Nick Clifton <nickc@redhat.com>
Date:   Wed Nov 27 11:23:38 2024 +0000

    Tidy up the default ELF linker script

One thing I noticed is that in two places (emit_noinit, emit_persistent)
the cat <<EOF construct has trailing whitespace.   However, this doesn't seem
to be the problem.
Comment 1 Rainer Orth 2025-01-21 15:26:04 UTC
> I'm still trying to figure out what exactly caused this.  So far I've found
> that the problem started with
>
> commit fe217087a4b8aa214a221ca9f033c5fcdbcee90e
> Author: Nick Clifton <nickc@redhat.com>
> Date:   Wed Nov 27 11:23:38 2024 +0000
>
>     Tidy up the default ELF linker script
>
> One thing I noticed is that in two places (emit_noinit, emit_persistent)
> the cat <<EOF construct has trailing whitespace.   However, this doesn't seem
> to be the problem.

Found it, though I don't understand why this is a problem: by bisecting
the contents of emit_data, which emits the missing script snippet
starting from /* Exception handling.  */, I found that the call to
align_to_default_symbol_alignment at the end of the function causes the
problem.

Either inlining it here (ugly) or (better) moving it to the caller fixes
the problem.  I see that this has already been done for the second call
to align_to_default_symbol_alignment in the body of elf.sc, without an
explanation for the difference.
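
For illustration, the two arrangements look roughly like this. It is a simplified sketch, not the actual elf.sc code: align_helper stands in for align_to_default_symbol_alignment, and the section line and the ALIGN value are placeholders. In a conforming shell both variants produce identical output, so moving the call to after the here doc should not change the generated script.

```shell
# Stand-in for align_to_default_symbol_alignment (placeholder body).
align_helper() {
  echo "  . = ALIGN(8);"
}

# Variant 1: helper invoked by command substitution as the last
# line of the here doc (the arrangement that fails under ksh93).
emit_inline() {
  cat <<EOF
  .data : { *(.data) }
$(align_helper)
EOF
}

# Variant 2: here doc closed first, helper then run as a regular command.
emit_after() {
  cat <<EOF
  .data : { *(.data) }
EOF
  align_helper
}
```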
Comment 2 Rainer Orth 2025-01-21 15:26:58 UTC
Created attachment 15894 [details]
proposed patch (hack?)
Comment 3 Nick Clifton 2025-01-22 11:04:52 UTC
Hi Rainer,

  Hmm, there seems to be some inconsistency here.  If $(align_to_default_symbol_alignment) does not work inside a cat <<EOF..EOF block then why do $(align_to_default_section_alignment) or $(emit_large_bss 0) or any other function invocations ?  (Maybe they don't ?)

  Patch approved, especially for the 2.44 branch, because I would like to have the release work with Solaris.  But I think that it might need some more investigation.

  Would a workaround be to always require bash to process the script(s) ?  I am not sure if this is actually feasible as I bet that there are systems out there that do not have bash available. :-(

Cheers
  Nick
Comment 4 Sam James 2025-01-22 20:51:53 UTC
> 
>   Hmm, there seems to be some inconsistency here.  If
> $(align_to_default_symbol_alignment) does not work inside a cat <<EOF..EOF
> block then why do $(align_to_default_section_alignment) or $(emit_large_bss
> 0) or any other function invocations ?  (Maybe they don't ?)
> 

I'm not following this bit either yet.
Comment 5 Sam James 2025-01-22 20:55:43 UTC
I see one other issue so far (== is not POSIX):
```
--- a/ld/scripttempl/elf.sc
+++ b/ld/scripttempl/elf.sc
@@ -912,7 +912,7 @@ emit_large_bss()
     return
   fi

-  if test "$1" == "0"; then
+  if test "$1" = "0"; then
     if test -n "${LARGE_BSS_AFTER_BSS}"; then
       return
     fi
```
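
As a small standalone sketch of the portable form (is_zero is a made-up name, not a function from elf.sc): POSIX specifies `=` as the string equality operator for test, while `==` is an extension that some shells accept and others reject.

```shell
# Hypothetical helper: succeeds only when the first argument is the string "0".
is_zero() {
  # '=' is the operator POSIX defines for string equality in test.
  test "$1" = "0"
}
```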
Comment 6 Sourceware Commits 2025-01-23 00:05:03 UTC
The master branch has been updated by Sam James <sjames@sourceware.org>:

https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;h=6999916e6c7fe6ba3a7661d852757f59223416a3

commit 6999916e6c7fe6ba3a7661d852757f59223416a3
Author: Sam James <sam@gentoo.org>
Date:   Thu Jan 23 00:03:07 2025 +0000

    ld: fix bashism in scripttempl/elf.sc
    
    ld/
            PR ld/32580
    
            * scripttempl/elf.sc: Fix '==' bashism.
Comment 7 Sourceware Commits 2025-01-23 00:06:48 UTC
The binutils-2_44-branch branch has been updated by Sam James <sjames@sourceware.org>:

https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;h=3f1585d9c8e809cbdfc725583f361d51f2d50744

commit 3f1585d9c8e809cbdfc725583f361d51f2d50744
Author: Sam James <sam@gentoo.org>
Date:   Thu Jan 23 00:03:07 2025 +0000

    ld: fix bashism in scripttempl/elf.sc
    
    ld/
            PR ld/32580
    
            * scripttempl/elf.sc: Fix '==' bashism.
    
    (cherry picked from commit 6999916e6c7fe6ba3a7661d852757f59223416a3)
Comment 8 Alan Modra 2025-01-23 03:29:15 UTC
Why are we using "$(func)" rather than just plain "func"?  Maybe that tickles a ksh bug?
Comment 9 Andreas Schwab 2025-01-23 08:41:40 UTC
Because this is part of a here doc where func would be just a literal.
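
A minimal sketch of the difference, using a throwaway function named func rather than anything from elf.sc: inside an unquoted here doc a bare word is copied through verbatim, and only the $(...) form runs the function.

```shell
func() {
  echo "expanded"
}

show_heredoc() {
  cat <<EOF
func
$(func)
EOF
}
```

Calling show_heredoc prints the literal word func on the first line and the function's output on the second.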
Comment 10 Andreas Schwab 2025-01-23 08:59:15 UTC
But of course, if it is at the end of the here doc anyway, it could also be moved out and called as a regular command after it.
Comment 11 Nick Clifton 2025-01-23 11:08:39 UTC
(In reply to Andreas Schwab from comment #10)
> But of course, if it is at the end of the here doc anyway, it could also be
> moved out and called as a regular command after it.

True - but there are other function invocations inside the here doc, not just at the end....
Comment 12 Andreas Schwab 2025-01-23 11:55:04 UTC
But the others are apparently not triggering the ksh bug.
Comment 13 Nick Clifton 2025-01-28 13:50:11 UTC
(In reply to Andreas Schwab from comment #12)
> But the others are apparently not triggering the ksh bug.

How strange.  But if not ending a here block with a function-like invocation is the solution then let's go with that.
Comment 14 Nick Clifton 2025-01-28 14:29:07 UTC
Created attachment 15903 [details]
Another proposed patch

Hi Rainer,

  Please could you try out this alternative patch and let
  me know if it works ?

Cheers
  Nick
Comment 15 Rainer Orth 2025-01-30 12:46:34 UTC
> --- Comment #14 from Nick Clifton <nickc at redhat dot com> ---
> Created attachment 15903 [details]
>   --> https://sourceware.org/bugzilla/attachment.cgi?id=15903&action=edit
> Another proposed patch

Hi Nick,

sorry for the delay: I was swamped.

>   Please could you try out this alternative patch and let
>   me know if it works ?

It does for the 32-bit scripts, but a large number of the 64-bit ones
are now cut short like this:

diff -ruwp ldscripts/elf_x86_64_sol2.xc ldscripts.ksh2/elf_x86_64_sol2.xc
--- ldscripts/elf_x86_64_sol2.xc        2025-01-30 13:21:50.118764636 +0100
+++ ldscripts.ksh2/elf_x86_64_sol2.xc   2025-01-30 13:14:46.555082683 +0100
@@ -203,67 +203,3 @@ SECTIONS
   . = ALIGN(64 / 8);
   /* Start of the Large Data region.  */
   . = SEGMENT_START("ldata-segment", .);
-  .lrodata   ALIGN(CONSTANT (MAXPAGESIZE)) + (. & (CONSTANT (MAXPAGESIZE) - 1)) :
-  {
-    *(.lrodata .lrodata.* .gnu.linkonce.lr.*)
-  }
[...]
Comment 16 Nick Clifton 2025-01-30 16:53:45 UTC
OK, I am baffled.  The missing text seems to have nothing to do with function-like invocations either inside or outside here blocks.  Could there be two ksh bugs ?
Comment 17 Sourceware Commits 2025-02-02 11:29:59 UTC
The binutils-2_44-branch branch has been updated by Nick Clifton <nickc@sourceware.org>:

https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;h=da36679b3240a7ffe9497ed63f3f522823b65d52

commit da36679b3240a7ffe9497ed63f3f522823b65d52
Author: Nick Clifton <nickc@redhat.com>
Date:   Sun Feb 2 11:29:51 2025 +0000

    PR 32580: Partial fix for problems with the ksh shell and the elf linker script