[PATCH] Introduce struct packed template, fix -fsanitize=thread for per_cu fields

Thu Jul 7 15:26:02 GMT 2022

On 2022-07-07 11:18 a.m., Tom de Vries wrote:
> On 7/6/22 21:20, Pedro Alves wrote:
>> On 2022-07-04 8:45 p.m., Tom de Vries via Gdb-patches wrote:
>>> On 7/4/22 20:32, Tom Tromey wrote:
>>>>>>>>> "Tom" == Tom de Vries <tdevries@suse.de> writes:
>>>>
>>>> Tom>  /* The number of bits needed to represent all languages, with enough
>>>> Tom>     padding to allow for reasonable growth.  */
>>>> Tom> -#define LANGUAGE_BITS 5
>>>> Tom> +#define LANGUAGE_BITS 8
>>>>
>>>> This will negatively affect the size of symbols and so I think it should
>>>> be avoided.
>>>>
>>>
>>> Ack, Pedro suggested a way to avoid this:
>>> ...
>>> +  struct {
>>> +    /* The language of this CU.  */
>>> +    ENUM_BITFIELD (language) m_lang : LANGUAGE_BITS;
>>> +  };
>>> ...
>>>
>>
>> It actually doesn't avoid it in this case,
> 
> We were merely discussing the usage of LANGUAGE_BITS for general_symbol_info::m_language, and indeed using the "struct { ... };" approach avoids changing the LANGUAGE_BITS and introducing a penalty on symbol size (which is a more numerous entity than CUs).
> 

Yeah, sorry, I realized it after sending and decided I'd deserve the incoming cluebat.  :-)

> Still, of course it's also good to keep the dwarf2_per_cu_data struct as small as possible, so thanks for looking into this.

It's that, but also the desire to settle on some infrastructure or approach that we can reuse
going forward.

>> I have not actually tested this with -fsanitize=thread, though.  Would you
>> be up for testing that, Tom, if this approach looks reasonable?
>>
> 
> Yes, of course.
> 
> I've applied the patch and then started with my latest approach which avoid locks and uses atomics:

Thanks.

> ...
> diff --git a/gdb/dwarf2/read.h b/gdb/dwarf2/read.h
> index f98d8b27649..bc1af0ec2d3 100644
> --- a/gdb/dwarf2/read.h
> +++ b/gdb/dwarf2/read.h
> @@ -108,6 +108,7 @@ struct dwarf2_per_cu_data
>        m_header_read_in (false),
>        mark (false),
>        files_read (false),
> +      m_lang (language_unknown),
>        scanned (false)
>    {
>    }
> @@ -180,7 +181,7 @@ struct dwarf2_per_cu_data
>    packed<dwarf_unit_type, 1> m_unit_type = (dwarf_unit_type) 0;
> 
>    /* The language of this CU.  */
> -  packed<language, LANGUAGE_BYTES> m_lang = language_unknown;
> +  std::atomic<language> m_lang __attribute__((packed));
> 
>  public:
>    /* True if this CU has been scanned by the indexer; false if
> @@ -332,11 +333,13 @@ struct dwarf2_per_cu_data
> 
>    void set_lang (enum language lang)
>    {
> -    /* We'd like to be more strict here, similar to what is done in
> -       set_unit_type,  but currently a partial unit can go from unknown to
> -       minimal to ada to c.  */
> -    if (m_lang != lang)
> -      m_lang = lang;
> +    enum language nope = language_unknown;
> +    if (m_lang.compare_exchange_strong (nope, lang))
> +      return;
> +    nope = lang;
> +    if (m_lang.compare_exchange_strong (nope, lang))
> +      return;
> +    gdb_assert_not_reached ();
>    }
> 
>    /* Free any cached file names.  */
> ...
> 
> I've tried both:
> ...
>   packed<std::atomic<language>, LANGUAGE_BYTES> m_lang
>     = language_unknown;
> ...
> and:
> ...
>   std::atomic<packed<language, LANGUAGE_BYTES>> m_lang
>     = language_unknown;
> ...
> and both give compilation errors:
> ...
> src/gdb/dwarf2/read.h:184:58: error: could not convert ‘language_unknown’ from ‘language’ to ‘std::atomic<packed<language, 1> >’
>    std::atomic<packed<language, LANGUAGE_BYTES>> m_lang = language_unknown;
>                                                           ^~~~~~~~~~~~~~~~
> ...
> and:
> ...
> src/gdb/../gdbsupport/packed.h:84:47: error: bit-field ‘std::atomic<language> packed<std::atomic<language>, 1>::m_val’ with non-integral type
> ...
> 
> Maybe one of the two should work and the pack template needs further changes, I'm not sure.

Yes, I think std::atomic<packed<language, LANGUAGE_BYTES>> should work.  We need to write
the initialized using {}, like this:

  std::atomic<packed<language, LANGUAGE_BYTES>> m_lang {language_unknown};

and then we run into errors when comparing m_lang with enum language.  That is because
the preexisting operator==/operator!= would require converting from enum language to 
packed<language, LANGUAGE_BYTES>, and then from packed<language, LANGUAGE_BYTES> to
std::atomic<packed<language, LANGUAGE_BYTES>>.  That is two implicit conversions, but
C++ only does one automatically.  We can fix that by adding some operator==/operator!=
implementations.

I've done that in patch #1 attached.  I've also ditched the non-attribute-packed implementation.

> 
> Note btw that the attribute packed works here:
> ...
> +  std::atomic<language> m_lang __attribute__((packed));
> ...
> in the sense that it's got alignment 1:
> ...
>         struct atomic<language>    m_lang \
>           __attribute__((__aligned__(1))); /*    16     4 */
> ...
> but given that there's no LANGUAGE_BITS/BYTES, we're back to size 4 for the m_lang field, and size 128 overall.
> 
> So for now I've settled for:
> ...
> +  std::atomic<LANGUAGE_CONTAINER> m_lang;
> ...
> which does get me back to size 120.
> 
> WIP patch attached.

Please find attached 3 patches:

#1 - Introduce struct packed template
#2 - your original patch, but using struct packed, split to a separate patch. commit log updated.
#3 - a version of your std::atomic WIP patch that uses std::atomic<packed>

Patches #1 and #2 pass the testsuite cleanly for me.  Patch #3 compiles, but
runs into a couple regressions due to the gdb_assert_not_reached in set_lang
being reached.  I am not surprised since that set_lang code in your patch
looked WIP and I just blindly converted to the new approach to show the code
compiles.

-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0003-std-atomic-packed.patch
Type: text/x-patch
Size: 1597 bytes
Desc: not available
URL: <https://sourceware.org/pipermail/gdb-patches/attachments/20220707/18014e77/attachment.bin>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0002-gdb-symtab-Fix-fsanitize-thread-for-per_cu-fields.patch
Type: text/x-patch
Size: 5151 bytes
Desc: not available
URL: <https://sourceware.org/pipermail/gdb-patches/attachments/20220707/18014e77/attachment-0001.bin>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0001-Introduce-struct-packed-template.patch
Type: text/x-patch
Size: 7099 bytes
Desc: not available
URL: <https://sourceware.org/pipermail/gdb-patches/attachments/20220707/18014e77/attachment-0002.bin>