[PING] [PATCH] gdb/linespec: Improve sliding breakpoints
Six, Lancelot
Lancelot.Six@amd.com
Wed Jul 27 13:11:39 GMT 2022
[Public]
Hi
I was wondering if anyone has comments and or opinions about the change proposed in this patch.
Best,
Lancelot.
> -----Original Message-----
> From: Six, Lancelot <Lancelot.Six@amd.com>
> Sent: 17 June 2022 17:08
> To: gdb-patches@sourceware.org
> Cc: lsix@lancelotsix.com; Six, Lancelot <Lancelot.Six@amd.com>
> Subject: [PATCH] gdb/linespec: Improve sliding breakpoints
>
> Hi,
>
> Here is proposition to slightly update the breakpoint sliding algorithm
> in GDB.
>
> This patch must be applied on top of
> https://sourceware.org/pipermail/gdb-patches/2022-April/187686.html
> which was submitted by Simon Marchi a couple of months. This needs
> small updates as the BLOCK_ENTRY_PC and SYMBOL_BLOCK_VALUE are no
> longer used. As a consequence
>
> BLOCK_ENTRY_PC (SYMBOL_BLOCK_VALUE (sym))
>
> should be replaced with
>
> sym->value_block ()->entry_pc ()
>
> This patch have been regression tested on x86_64-Linux. All comments
> and feedbacks are welcome.
>
> Best,
> Lancelot.
>
> ---
>
> The algorithm used to place a breakpoint by line approximately goes as
> follow:
>
> - Look for all the PCs which maps directly to the desired line
> - If any PC is found, place a breakpoint at each found PC.
> - If no PC maps to the desired line:
> - Search for the lower line number in the line table which is higher
> than the desired line, and has some object code associated to it.
> - List all PCs which maps to this line number.
> - Place a breakpoint at each PC in the list above.
>
> There are situations where this heuristic is not sufficient. If we
> consider the following c++ code:
>
> 12 template<typename T>
> 13 T func (T a, T b)
> 14 {
> 15 /* Break here. */
> 16 if constexpr (std::is_integral_t<T>)
> 17 return a;
> 18 else
> 19 return b;
> 20 }
>
> Given the current algorithm, if the user tries to set a breakpoint at
> line 15, here is what happens:
>
> - Search PCs mapping to L15, there is none ;
> - Search for the first line number above L16 with code, this is L17 ;
> - List all PCs mapping to L17 and a breakpoints at insert a breakpoint at
> each ;
>
> Now lets suppose that this templated function has been instantiated both
> for an integral type (int) and non-integral type (float). If we follow
> the algorithm, no breakpoint will ever be inserted in func<float>,
> while there will be a breakpoint in func<int>.
>
> I came across a less obvious variation of this case where an inlined
> function had different optimization opportunities for different instances.
> As a consequence, some instances had the first actual line of code
> different than the other, making the breakpoint be missed and the user
> confused. To add to the confusion, the instances which have code mapped
> to the lowest line above the user specified linespec only appeared in a
> shared library conditionally loaded, making the breakpoint stop depend on
> whether or not the said shared library was loaded or not.
>
> This patch is a proposition to adapt the algorithm in order to cover the
> scenario exposed above. To do so, this patch ends up scanning
> independently for the best candidate in the various blocks composing
> the line table structures (a block being a succession of
> linetable_entries terminated by an entry with line == 0).
>
> The line table for the example program above looks something like:
>
> (gdb) maintenance info line-table missed
> symtab: /[...]/missed_pb.cc ((struct symtab *) 0x6210000f0500)
> linetable: ((struct linetable *) 0x6210001a1220):
> INDEX LINE ADDRESS IS-STMT PROLOGUE-END
> ...
> 17 END 0x00000000000011d3 Y
> 18 13 0x00000000000011d3 Y <-- func<float>
> 19 19 0x00000000000011e5 Y
> 20 20 0x00000000000011ea Y
> 21 END 0x00000000000011ec Y
> 22 13 0x00000000000011ec Y <-- func<int>
> 23 17 0x00000000000011fa Y
> 24 20 0x00000000000011fd Y
> 25 END 0x00000000000011ff Y
>
> The first block listed (starting at 0x11d3) is func<float>, and the
> second starting at 0x11ec is func<int>.
>
> The proposed algorithm analyzes each block independently. The first step
> is to decide if a block contains the line the user is interested in (a.k.a
> the target). In our example, this is L15. A block is considered to match
> if it contains at least one instruction mapped to a line before the target
> and at least one instruction mapped to a line after the target. In other
> words this means that will select all blocks where:
>
> - there is at least one instruction mapped to a line L such as L <= target
> - there is at least one instruction mapped to a line M such as M >= target
>
> For each block which satisfies this condition, we select the lowest line
> number above our target, and select all instructions mapping to this line
> number (there might be multiple). In our example the func<float> block
> satisfies the condition (we have line 13 and 19), so we insert breakpoints
> at the PCs mapped to L19. Similarly, the func<int> block is selected
> (thanks to L13 and L17), and we insert breakpoints at all PCs mapped to
> L17.
>
> The core of this heuristic is implemented in find_best_le_for_symtab_line.
>
> There are however some cases which are still problematic, where the
> first line after the target is the first line of the block we are
> looking for. In such situation the algorithm would find nothing where
> we would want to select 0x11d3 and 0x11ec (both mapping to L13). In
> order to solve this, the algorithm first looks for the lowest line
> number following the user specified target as this is done before this
> patch (here the first line with associated code following L12 which is
> L13). Once this is figured out, this new target (L13) is passed to the
> algorithm above (find_best_le_for_symtab_line).
>
> This is safe to do because we know that there is no line between the user
> target (included) and this new target. Therefore, whatever
> find_best_le_for_symtab_line returns, it must map to lines numbers above
> (inclusive) the new target.
>
> Finally, the testsuite showed an interesting regression. Let's suppose
> we have:
>
> 1 struct Foo { Foo () {} };
> 2 Foo f1;
> 3 void bar ()
> 4 {
> 5 /* break here. */
> 6 do_something ();
> 7 }
> 8 Foo f2;
>
> Inspecting the linetable, we have a code block containing L2 and L8 (so
> over line 5). If a user was to insert a breakpoint somewhere in L5, the
> algorithm shown above would also insert a breakpoint at L8. To solve
> this, this patch uses the fact that L2 and L8 are, from dwarf's
> perspective in a artificial block which corresponds to the static
> initialization. So if among multiple candidates, some are part of a
> virtual function and some are not, we only keep the ones in the non
> virtual function.
>
> This proposed new algorithm will not solve all possible cases. For
> example, if the func function is inlined multiple times in the same block
> for different template parameters, the lookup will process all of those as
> part of the same scope. Solving such situation would require heavier
> works. I think that this change, while not covering every scenarios is
> still a step in the right direction.
>
> All feedbacks and thoughts are welcome.
>
> Regression tested on x86_64 GNU/Linux.
>
> Change-Id: I8291f9e1a3522934a0f418dfc565cebb86cff730
> ---
> gdb/linespec.c | 93 ++++++++++++++++++-
> gdb/symtab.c | 45 +++++++++
> gdb/symtab.h | 14 +++
> .../gdb.linespec/breakpoint-sliding.cc | 70 ++++++++++++++
> .../gdb.linespec/breakpoint-sliding.exp | 82 ++++++++++++++++
> 5 files changed, 301 insertions(+), 3 deletions(-)
> create mode 100644 gdb/testsuite/gdb.linespec/breakpoint-sliding.cc
> create mode 100644 gdb/testsuite/gdb.linespec/breakpoint-sliding.exp
>
> diff --git a/gdb/linespec.c b/gdb/linespec.c
> index da9e35c5035..1e8679cd027 100644
> --- a/gdb/linespec.c
> +++ b/gdb/linespec.c
> @@ -2092,9 +2092,42 @@ create_sals_line_offset (struct linespec_state
> *self,
> if (intermediate_results.empty () && best_entry != NULL)
> {
> was_exact = false;
> - intermediate_results = decode_digits_ordinary (self, ls,
> - best_entry->line,
> - &best_entry);
> + /* We did not find an exact match for the line we are looking for.
> + As a consequence, we are now looking for the next best line.
> + However, the next best line does not have to be the same across
> + the entire program. If we consider the following code:
> +
> + 12 template<typename T>
> + 13 T select (T a, T b)
> + 14 {
> + 15 // Break here.
> + 16 if constexpr (std::is_integral_t<T>)
> + 17 return a;
> + 18 else
> + 19 return b;
> + 20 }
> +
> + For a function like the above, the next best line is L17 if T is
> + an integral type, but should be L19 if T is not. As a
> + consequence we try to search for the next best line independently
> + in each code block (a code block being a contiguous section of a
> + line table terminated by an entry with line == 0). */
> + for (const auto &elt : ls->file_symtabs)
> + {
> + program_space *pspace = elt->compunit ()->objfile ()->pspace;
> + for (const linetable_entry *le
> + : find_best_le_for_symtab_line (elt->linetable (),
> + best_entry->line))
> + {
> + symtab_and_line sal;
> + sal.pspace = pspace;
> + sal.symtab = elt;
> + sal.line = le->line;
> + sal.explicit_line = true;
> + sal.pc = le->pc;
> + intermediate_results.push_back (std::move (sal));
> + }
> + }
> }
>
> /* For optimized code, the compiler can scatter one source line
> @@ -2110,6 +2143,25 @@ create_sals_line_offset (struct linespec_state
> *self,
> gdb::def_vector<int> filter (intermediate_results.size ());
> gdb::def_vector<const block *> blocks (intermediate_results.size ());
>
> + /* Artificial code blocks can produce false positives. Consider
> +
> + 1 struct Foo { Foo () {} };
> + 2 Foo f1;
> + 3 void bar ()
> + 4 {
> + 5 // break here.
> + 6 do_something ();
> + 7 }
> + 8 Foo f2;
> +
> + In this example, if the user asked for a breakpoint in L5, the above
> + algorithm will select L6 and L8. L6 is selected because the we have
> + an artificial block (marked by DW_AT_artificial) generated to cover
> + the static initializations and this block covers code from line 2 to
> + line 8. To avoid this false positive, reject candidates in artificial
> + blocks if we also have candidates in non artificial blocks. */
> + bool any_nonartificial = false;
> +
> for (i = 0; i < intermediate_results.size (); ++i)
> {
> set_current_program_space (intermediate_results[i].pspace);
> @@ -2117,6 +2169,13 @@ create_sals_line_offset (struct linespec_state
> *self,
> filter[i] = 1;
> blocks[i] = block_for_pc_sect (intermediate_results[i].pc,
> intermediate_results[i].section);
> +
> + if (!any_nonartificial && blocks[i] != nullptr)
> + {
> + struct symbol *func = block_containing_function (blocks[i]);
> + any_nonartificial
> + = (func != nullptr && !func->is_artificial ());
> + }
> }
>
> for (i = 0; i < intermediate_results.size (); ++i)
> @@ -2171,6 +2230,34 @@ create_sals_line_offset (struct linespec_state
> *self,
> && val.line < sym->line ())
> continue;
>
> + /* If we did not find an exact match, the above heuristic might
> + find some false positives.
> +
> + Lets consider (with Foo having a non trivial ctor):
> +
> + 9 struct Foo { Foo () {} };
> + 10 Foo f1;
> + 11 int bar ()
> + 12 {
> + 13 // Break here.
> + 14 return 42;
> + 15 }
> + 16 Foo f2;
> +
> + When the user tries to insert a breakpoint it L13, the
> + algorithm sees a code path from L10 to L16 (in static
> + initialization), and will consider L16 as a candidate.
> + However the user clearly intended to break in bar.
> Therefore,
> + if we see that there is at least 1 breakpoint in a non
> + artificial function (bar), ignore the hit in the static
> + initialization.
> + */
> + if (!was_exact
> + && any_nonartificial
> + && sym != nullptr
> + && sym->is_artificial ())
> + continue;
> +
> if (self->funfirstline)
> skip_prologue_sal (sal);
>
> diff --git a/gdb/symtab.c b/gdb/symtab.c
> index 03aa2a96b87..1b03e6fce8e 100644
> --- a/gdb/symtab.c
> +++ b/gdb/symtab.c
> @@ -3541,6 +3541,51 @@ find_pcs_for_symtab_line (struct symtab
> *symtab, int line,
> return result;
> }
>
> +/* See symtab.h. */
> +
> +std::vector<linetable_entry *>
> +find_best_le_for_symtab_line (struct linetable *lt, int line)
> +{
> + std::vector<linetable_entry *> res;
> +
> + if (lt == nullptr || line < 0)
> + return res;
> +
> + /* Tells if in the current block we found at least one PC mapping to a line
> + before LINE. */
> + bool found_preceding = false;
> + const int nitems = lt->nitems;
> + std::vector<linetable_entry *> best_candidates;
> +
> + for (int i = 0; i < nitems; ++i)
> + {
> + struct linetable_entry &item = lt->item[i];
> + found_preceding
> + = found_preceding || (item.line > 0 && item.line <= line);
> +
> + if (item.is_stmt && item.line >= line
> + && (best_candidates.empty ()
> + || item.line <= best_candidates.front ()->line))
> + {
> + if (!best_candidates.empty ()
> + && item.line < best_candidates.front ()->line)
> + best_candidates.clear ();
> + best_candidates.push_back (&item);
> + }
> + else if (item.line == 0)
> + {
> + if (found_preceding)
> + res.insert (res.end (), best_candidates.begin (),
> + best_candidates.end ());
> +
> + found_preceding = false;
> + best_candidates.clear ();
> + }
> + }
> +
> + return res;
> +}
> +
>
>
>
> /* Set the PC value for a given source file and line number and return true.
> Returns false for invalid line number (and sets the PC to 0).
> diff --git a/gdb/symtab.h b/gdb/symtab.h
> index ac902a4cbbe..94bea02cb43 100644
> --- a/gdb/symtab.h
> +++ b/gdb/symtab.h
> @@ -2624,6 +2624,20 @@ void iterate_over_symtabs (const char *name,
> std::vector<CORE_ADDR> find_pcs_for_symtab_line
> (struct symtab *symtab, int line, struct linetable_entry **best_entry);
>
> +/* Given a linetable, look for entries which can map to line LINE.
> +
> + The result is the set of entries which:
> + 1) Are part of a block (i.e. a sequence of entries terminated by an entry
> + with line == 0) containing at least one statement entry with line above
> + or equal to LINE, and at least one statement entry with line below or
> + equal to LINE.
> + 2) Are statement entries (is_stmt == 1)
> + 3) Have line equal to the lowest statement line number greater or equal to
> + LINE within the block they are part of. */
> +
> +std::vector<linetable_entry *> find_best_le_for_symtab_line
> + (struct linetable *lt, int line);
> +
> /* Prototype for callbacks for LA_ITERATE_OVER_SYMBOLS. The callback
> is called once per matching symbol SYM. The callback should return
> true to indicate that LA_ITERATE_OVER_SYMBOLS should continue
> diff --git a/gdb/testsuite/gdb.linespec/breakpoint-sliding.cc
> b/gdb/testsuite/gdb.linespec/breakpoint-sliding.cc
> new file mode 100644
> index 00000000000..3ae6b563cdf
> --- /dev/null
> +++ b/gdb/testsuite/gdb.linespec/breakpoint-sliding.cc
> @@ -0,0 +1,70 @@
> +/* This testcase is part of GDB, the GNU debugger.
> +
> + Copyright (C) 2022 Free Software Foundation, Inc.
> +
> + This program is free software; you can redistribute it and/or modify
> + it under the terms of the GNU General Public License as published by
> + the Free Software Foundation; either version 3 of the License, or
> + (at your option) any later version.
> +
> + This program is distributed in the hope that it will be useful,
> + but WITHOUT ANY WARRANTY; without even the implied warranty of
> + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
> + GNU General Public License for more details.
> +
> + You should have received a copy of the GNU General Public License
> + along with this program. If not, see <http://www.gnu.org/licenses/>. */
> +
> +#if __cplusplus >= 201703L
> +#include <type_traits>
> +
> +template<typename T>
> +T
> +foo (T a, T b)
> +{
> + T ret;
> + /* In foo break here. */
> + if constexpr (std::is_integral_v<T>)
> + ret = a; /* break in foo<int>. */
> + else
> + ret = b; /* break in foo<NonIntegral>. */
> +
> + return ret;
> +}
> +
> +struct NonIntegral
> +{
> + int m;
> +};
> +
> +#endif
> +
> +struct St
> +{
> + St ()
> + { /* Empty. */ }
> +};
> +
> +St st1;
> +
> +static int
> +between_static_init ()
> +{
> + /* between_static_init break here. */
> + return 42; /* break in between_static_init. */
> +}
> +
> +St st2;
> +
> +int main ()
> +{
> +
> +#if __cplusplus >= 201703L
> + foo<int> (1, 2);
> + foo<NonIntegral> ({ 1 }, { 2 });
> +#endif
> +
> + between_static_init ();
> +
> + return 0;
> +}
> diff --git a/gdb/testsuite/gdb.linespec/breakpoint-sliding.exp
> b/gdb/testsuite/gdb.linespec/breakpoint-sliding.exp
> new file mode 100644
> index 00000000000..9b79c419e9b
> --- /dev/null
> +++ b/gdb/testsuite/gdb.linespec/breakpoint-sliding.exp
> @@ -0,0 +1,82 @@
> +# Copyright (C) 2022 Free Software Foundation, Inc.
> +
> +# This program is free software; you can redistribute it and/or modify
> +# it under the terms of the GNU General Public License as published by
> +# the Free Software Foundation; either version 3 of the License, or
> +# (at your option) any later version.
> +#
> +# This program is distributed in the hope that it will be useful,
> +# but WITHOUT ANY WARRANTY; without even the implied warranty of
> +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
> +# GNU General Public License for more details.
> +#
> +# You should have received a copy of the GNU General Public License
> +# along with this program. If not, see <http://www.gnu.org/licenses/>.
> +
> +# This testcase exercises breakpoint sliding when the user tries to set
> +# a breakpoint at a line number which has no code associated to it. GDB
> +# should then place breakpoint at the first line following the one given
> +# by the user in each code block composing the program space, given that
> +# the code block "covers" the line asked by the user.
> +
> +if { [skip_cplus_tests] } {
> + return -1
> +}
> +
> +standard_testfile .cc
> +
> +# Part of the test require c++17. If the used compiler does not support
> +# it, fallback only doing part of the tests.
> +set skip_cpp17 0
> +
> +if {[prepare_for_testing "failed to prepare with c++17" $testfile \
> + $srcfile {debug c++ additional_flags=-std=c++17}]} {
> + set skip_cpp17 1
> + if {[prepare_for_testing "failed to prepare" $testfile $srcfile \
> + {debug c++}]} {
> + return -1
> + }
> +}
> +
> +proc_with_prefix test_sliding_per_block {} {
> + if {$::skip_cpp17} {
> + return
> + }
> +
> + set loc [gdb_get_line_number "In foo break here."]
> + gdb_test "break $loc" \
> + "Breakpoint $::decimal.* \\(2 locations\\)" \
> + "slide to if constexpr"
> +
> + set loc_in_int_instance [gdb_get_line_number "break in foo<int>"]
> + set loc_in_struct_instance [gdb_get_line_number "break in
> foo<NonIntegral>"]
> + set seen_int_pb 0
> + set seen_struct_bp 0
> + gdb_test_multiple "info breakpoint" "info breakpoint" {
> + -re "in foo<int>\\(int, int\\) at
> \[^\n\r\]*$::srcfile:$loc_in_int_instance" {
> + set seen_int_pb 1
> + verbose -log "found int"
> + exp_continue
> + }
> + -re "in foo<NonIntegral>\\(NonIntegral, NonIntegral\\) at
> \[^\n\r\]*$::srcfile:$loc_in_struct_instance" {
> + set seen_struct_bp 1
> + verbose -log "found struct"
> + exp_continue
> + }
> + -re -wrap "" {
> + gdb_assert {$seen_int_pb && $seen_struct_bp} $gdb_test_name
> + }
> + }
> +}
> +
> +proc_with_prefix test_ignore_artificial_block {} {
> + set loc [gdb_get_line_number "between_static_init break here"]
> + set expected_loc [gdb_get_line_number "break in between_static_init."]
> + gdb_test "break $loc" \
> + "Breakpoint $::decimal at $::hex: file .*/$::srcfile, line
> $expected_loc\\." \
> + "ignore static block"
> +}
> +
> +clean_restart $::binfile
> +test_sliding_per_block
> +test_ignore_artificial_block
>
> base-commit: 5fb28d2607a8325559b44a5dc0c8760236c81218
> prerequisite-patch-id: 695c74160156ff2c66b3a88462be22a36c32f141
> --
> 2.25.1
More information about the Gdb-patches
mailing list