get_pc_function_start (CORE_ADDR pc) tries to get the function start for a
given pc, but lookup_minimal_symbol_by_pc (CORE_ADDR pc) may return a
minimal_symbol that is not a function (e.g. a label in assembler code).
So fstart is not a function start address either.

This may cause a problem: in the following code, GDB cannot stop when I try
to next over Line 1.  (lop2 and lop3 are mistaken for functions, so GDB
thinks that it stepped into a new function, sets a breakpoint at the address
stored in register $ra, and runs to it.)

Is this correct?

==========================
#.globl hardware_hazard_hook
        .text
        .globl  _start
        .ent    _start
_start:
        .set    noreorder
        addiu   v0, 1
        addiu   v0, 1
lop3:
        addiu   v0, 1
        addiu   v0, 1
lop2:
        addiu   v0, 1           // Line 1
        addiu   v0, 1
lop1:
        addiu   v0, 1
        addiu   v0, 1
        addiu   v0, 1
        addiu   v0, 1
        nop
...
-------------------------------
gdb/minsyms.c

CORE_ADDR
get_pc_function_start (CORE_ADDR pc)
{
  struct block *bl;
  struct minimal_symbol *msymbol;

  bl = block_for_pc (pc);
  if (bl)
    {
      struct symbol *symbol = block_linkage_function (bl);

      if (symbol)
        {
          bl = SYMBOL_BLOCK_VALUE (symbol);
          return BLOCK_START (bl);
        }
    }

  msymbol = lookup_minimal_symbol_by_pc (pc);
  if (msymbol)
    {
      CORE_ADDR fstart = SYMBOL_VALUE_ADDRESS (msymbol);

      if (find_pc_section (fstart))
        return fstart;
    }

  return 0;
}
> get_pc_function_start (CORE_ADDR pc) tries to get the function start for a
> given pc, but lookup_minimal_symbol_by_pc (CORE_ADDR pc) may return a
> minimal_symbol that is not a function (e.g. a label in assembler code).
> So fstart is not a function start address either.

The only way to tell a label apart from a function is by the minimal
symbol's size.  Try stepping through lookup_minimal_symbol_by_pc_section_1,
and see the comments there.

> This may cause a problem: in the following code, GDB cannot stop when I try
> to next over Line 1.  (lop2 and lop3 are mistaken for functions, so GDB
> thinks that it stepped into a new function, sets a breakpoint at the address
> stored in register $ra, and runs to it.)

Sounds like something else might be tricking GDB into thinking you stepped
into a new function.  See the code just below the "Check for subroutine
calls." part of infrun.c.  That's where the logic to detect if the program
called a new function is.

I wonder if this is related to the outermost heuristics, or something odd in
the unwinder/backtrace.  What does "bt" show when the program is stopped at
the instruction just before lop2, and then again "bt" when you stepi to lop2?
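
To make the remark about symbol sizes concrete, here is a minimal sketch of a
size-aware lookup.  It is a simplified model, not GDB's actual code: struct
msym, lookup_by_pc, and the exact policy (prefer a sized symbol whose range
covers the pc, otherwise fall back to the nearest preceding zero-sized label)
are assumptions made for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a minimal symbol.  Assembler labels such
   as lop2 are assumed to have size 0 here.  */
struct msym
{
  const char *name;
  uint32_t addr;
  uint32_t size;
};

/* Return the entry of TABLE (N entries, sorted by ascending address)
   that best describes PC, or NULL if no symbol precedes PC.

   Assumed policy: scan backwards from the nearest preceding symbol.
   Remember the first zero-sized symbol seen (the closest label), but
   keep looking for a sized symbol whose range contains PC.  A sized
   symbol that does not contain PC ends the search.  */
const struct msym *
lookup_by_pc (const struct msym *table, size_t n, uint32_t pc)
{
  const struct msym *label = NULL;
  size_t i = n;

  while (i > 0)
    {
      const struct msym *m = &table[--i];

      if (m->addr > pc)
        continue;               /* Symbol starts after PC.  */

      if (m->size == 0)
        {
          if (label == NULL)
            label = m;          /* Closest preceding label.  */
          continue;
        }

      if (pc - m->addr < m->size)
        return m;               /* Sized symbol covers PC.  */

      break;                    /* Sized symbol ends before PC.  */
    }

  return label;
}
```

Under this model, a pc between lop3 and lop2 resolves to _start when _start
carries a size that covers it, and to lop3 when none of the symbols has a
size.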
> "bt" show when the program is stopped at the instruction just before lop2:

#0  _start () at crt0.S:93

> then stepi

(gdb) si
warning: GDB can't find the start of the function at 0xfffffffc.

> and then again "bt" when stepi to lop2?

#0  lop2 () at crt0.S:95
#1  0xfffffffe in ?? ()

> Sounds like something else might be tricking GDB into thinking you stepped
> into a new function.  See the code just below the "Check for subroutine
> calls." part of infrun.c.  That's where the logic to detect if the program
> called a new function is.

GDB uses the frame id to "Check for subroutine calls", and frame_id_eq ()
checks .code_addr: if the .code addresses are different, the frames are
different.  If lop3 and lop2 are mistaken for function start addresses, the
code addresses are different.  So GDB thinks the program stepped into a new
function.

gdb/frame.c:

int
frame_id_eq (struct frame_id l, struct frame_id r)
{
  int eq;

  if (!l.stack_addr_p && l.special_addr_p
      && !r.stack_addr_p && r.special_addr_p)
    /* The outermost frame marker is equal to itself.  This is the
       dodgy thing about outer_frame_id, since between execution steps
       we might step into another function - from which we can't
       unwind either.  More thought required to get rid of
       outer_frame_id.  */
    eq = 1;
  else if (!l.stack_addr_p || !r.stack_addr_p)
    /* Like a NaN, if either ID is invalid, the result is false.
       Note that a frame ID is invalid iff it is the null frame ID.  */
    eq = 0;
  else if (l.stack_addr != r.stack_addr)
    /* If .stack addresses are different, the frames are different.  */
    eq = 0;
  else if (l.code_addr_p && r.code_addr_p && l.code_addr != r.code_addr)
    /* An invalid code addr is a wild card.  If .code addresses are
       different, the frames are different.  */
    eq = 0;
  else if (l.special_addr_p && r.special_addr_p
           && l.special_addr != r.special_addr)
    /* An invalid special addr is a wild card (or unused).  Otherwise
       if special addresses are different, the frames are different.  */
    eq = 0;
  else if (l.artificial_depth != r.artificial_depth)
    /* If artifical depths are different, the frames must be different.  */
    eq = 0;
  else
    /* Frames are equal.  */
    eq = 1;

  if (frame_debug)
    {
      fprintf_unfiltered (gdb_stdlog, "{ frame_id_eq (l=");
      fprint_frame_id (gdb_stdlog, l);
      fprintf_unfiltered (gdb_stdlog, ",r=");
      fprint_frame_id (gdb_stdlog, r);
      fprintf_unfiltered (gdb_stdlog, ") -> %d }\n", eq);
    }
  return eq;
}
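
The code-address comparison can be tried in isolation with a reduced,
standalone model of the function above.  This is only a sketch:
struct model_frame_id, model_id, and model_frame_id_eq are hypothetical names,
and the special address, artificial depth, outer-frame case, and debug output
are left out on purpose.

```c
#include <stdbool.h>
#include <stdint.h>

/* Reduced, hypothetical model of a frame id: only the stack and code
   addresses and their validity flags are kept.  */
struct model_frame_id
{
  uint32_t stack_addr;
  uint32_t code_addr;
  bool stack_addr_p;
  bool code_addr_p;
};

/* Build an id whose stack and code addresses are both valid.  */
struct model_frame_id
model_id (uint32_t stack, uint32_t code)
{
  struct model_frame_id id = { stack, code, true, true };
  return id;
}

/* Same ordering of checks as the stack and code address branches of
   frame_id_eq quoted above.  */
bool
model_frame_id_eq (struct model_frame_id l, struct model_frame_id r)
{
  if (!l.stack_addr_p || !r.stack_addr_p)
    return false;               /* An invalid id never compares equal.  */
  if (l.stack_addr != r.stack_addr)
    return false;               /* Different stack address.  */
  if (l.code_addr_p && r.code_addr_p && l.code_addr != r.code_addr)
    return false;               /* Different code address.  */
  return true;                  /* Invalid code address is a wild card.  */
}
```

With the same stack address but code addresses taken from two different
labels, model_frame_id_eq returns false; if either side has code_addr_p
cleared, the code address is ignored and the ids compare equal.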
Look at the code in get_pc_function_start (CORE_ADDR pc):

> CORE_ADDR
> get_pc_function_start (CORE_ADDR pc)
> {
>   ...........
>   ...........
>   ...........
>   ...........
>   ...........
>   ...........
>   msymbol = lookup_minimal_symbol_by_pc (pc);
>   if (msymbol)
>     {
>       CORE_ADDR fstart = SYMBOL_VALUE_ADDRESS (msymbol);
>       if (find_pc_section (fstart))
>         return fstart;
>     }
>   return 0;
> }

The labels lop2 and lop3 have address values.  If the pc value is equal to
the address of lop2 or lop3, the msymbol returned from
lookup_minimal_symbol_by_pc () must be lop2 or lop3.  The code then uses
SYMBOL_VALUE_ADDRESS (msymbol) to get the address, and treats that address
as the function start address.
I think this is the problem.  Is that accurate?
> the code addresses are different.  So GDB thinks the program stepped into a
> new function.

That's not sufficient: the frame that was frame #0 before the step must be
frame #1 after the step for GDB to consider this a subroutine call.
That's this part of the condition:

      && (frame_id_eq (frame_unwind_caller_id (get_current_frame ()),
                       ecs->event_thread->control.step_stack_frame_id)

If before the stepi you have:

#0  _start () at crt0.S:93

and then after you have:

#0  lop2 () at crt0.S:95
#1  0xfffffffe in ?? ()

Then I don't understand how that frame_id_eq returned true.  Well, unless
both were outer_frame_id.  Please check that.

I also don't understand why GDB thinks the function is _start just before
the stepi, instead of lop3.  What's different between lop3 and lop2?  You
need to step through lookup_minimal_symbol_by_pc_section_1 and understand
that.

> The labels lop2 and lop3 have address values.  If the pc value is equal to
> the address of lop2 or lop3, the msymbol returned from
> lookup_minimal_symbol_by_pc () must be lop2 or lop3.  The code then uses
> SYMBOL_VALUE_ADDRESS (msymbol) to get the address, and treats that address
> as the function start address.
> I think this is the problem.  Is that accurate?

Not exactly.  lookup_minimal_symbol_by_pc, if not returning the "real"
function, should be returning the closest label.  That is, for all
instructions between lop3 and lop2, it should return lop3, etc.  But that
shouldn't be a problem on its own; the other checks in the "Check for
subroutine calls" bit should catch that.  Unless, again, this is really the
outer_frame_id bits triggering.  outer_frame_id really should die...
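
The two requirements described above (frame #0 has changed, and the old
frame #0 is now frame #1) can be written down as a small standalone model.
This is a sketch, not the infrun.c condition: struct fid, fid_eq, and
looks_like_subroutine_call are hypothetical names, the ids carry only a stack
and a code address, and the outer_frame_id handling is omitted.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, reduced frame id: stack address plus code address.  */
struct fid
{
  uint32_t stack_addr;
  uint32_t code_addr;
};

bool
fid_eq (struct fid a, struct fid b)
{
  return a.stack_addr == b.stack_addr && a.code_addr == b.code_addr;
}

/* ORIGINAL is frame #0 before the step, CURRENT is frame #0 after the
   step, and CALLER is frame #1 after the step (what the unwinder
   reports as the caller of CURRENT).

   A step is treated as a call only if the current frame differs from
   the original one AND the unwound caller is the original frame.  */
bool
looks_like_subroutine_call (struct fid original, struct fid current,
                            struct fid caller)
{
  return !fid_eq (current, original) && fid_eq (caller, original);
}
```

In this model a changed code address alone is not enough: if the unwinder
reports some unrelated frame #1, the function returns false.  It returns true
only when the unwound caller matches the frame the step started in.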
> Unless, again, this is really the outer_frame_id bits triggering.
> outer_frame_id really should die...

BTW, if this is the case, this means that this issue only triggers when
stepping through code in the outermost frame (the entry point).  IOW, if
your _start was actually some other function that was called by _start (so
that it wouldn't be the outermost frame), this issue wouldn't trigger.
> Then I don't understand how that frame_id_eq returned true

frame_id_eq returns false (eq == 0) because of the following condition:

-----
  else if (l.code_addr_p && r.code_addr_p && l.code_addr != r.code_addr)
    /* An invalid code addr is a wild card.  If .code addresses are
       different, the frames are different.  */
    eq = 0;
-----

When I delete this code, the problem disappears.

> What's different between lop3 and lop2?

There is no difference between lop2 and lop3; they are just two labels.

------------------
Another case: when I single-step at line 196, the program runs until it
exits:

Breakpoint 1, zerobss () at crt0.S:196
196             sw      v0, 0(s0)
(gdb) l
191             nop
192
193             # Tell other cores it's ready
194             li      v0, 1
195             LA      (s0, flag_ready)
196             sw      v0, 0(s0)
197
198     all_wait_1:
199             LA      (s0, flag_ready)
200             lw      v0, 0(s0)
(gdb) l 196
191             nop
192
193             # Tell other cores it's ready
194             li      v0, 1
195             LA      (s0, flag_ready)
196             sw      v0, 0(s0)
197
198     all_wait_1:
199             LA      (s0, flag_ready)
200             lw      v0, 0(s0)
(gdb) set debug infrun 1
(gdb) s
=pc:===ffffffffbfc0012c==== =func start===ffffffffbfc000d8====
infrun: clear_proceed_status_thread (Thread 1)
infrun: proceed (addr=0xffffffff, signal=144, step=1)
infrun: resume (step=1, signal=0), trap_expected=1, current thread [Thread 1] at 0xbfc0012c
infrun: wait_for_inferior ()
infrun: target_wait (-1, status) =
infrun:   42000 [Thread 1],
infrun:   status->kind = stopped, signal = SIGTRAP
infrun: infwait_normal_state
infrun: TARGET_WAITKIND_STOPPED
infrun: stop_pc = 0xbfc00130
=pc:===ffffffffbfc00130==== =func start===ffffffffbfc00130====
=pc:===ffffffffbfc0011f==== =func start===ffffffffbfc000d8====
infrun: stepped into subroutine
infrun: inserting step-resume breakpoint at 0xbfc00004
infrun: resume (step=0, signal=0), trap_expected=0, current thread [Thread 1] at 0xbfc00130
infrun: prepare_to_wait
infrun: target_wait (-1, status) =
infrun:   42000 [Remote target],
infrun:   status->kind = exited, status = 0
infrun: infwait_normal_state
infrun: TARGET_WAITKIND_EXITED
[Inferior 1 (Remote target) exited normally]
> BTW, if this is the case, this means that this issue only triggers when
> stepping through code in the outermost frame (the entry point).  IOW, if
> your _start was actually some other function that was called by _start (so
> that it wouldn't be the outermost frame), this issue wouldn't trigger.

You are right!  This issue only triggers when debugging assembler code!
It feels like you're either ignoring half my suggestions, or not reading
carefully.  It makes it hard for me to help you.

> > What's different between lop3 and lop2?
> There is no difference between lop2 and lop3; they are just two labels.

I'm well aware they're two labels.  But what makes it so that for
instructions between lop3 and lop2, gdb believes the function is _start,
not lop3?

You still haven't checked for outer_frame_id.

> > BTW, if this is the case, this means that this issue only triggers when
> > stepping through code in the outermost frame (the entry point).  IOW, if
> > your _start was actually some other function that was called by _start (so
> > that it wouldn't be the outermost frame), this issue wouldn't trigger.
> You are right!  This issue only triggers when debugging assembler code!

Sure, except that's not what I said.
> That's not sufficient: the frame that was frame #0 before the step must be
> frame #1 after the step for GDB to consider this a subroutine call.
> That's this part of the condition:
>
>       && (frame_id_eq (frame_unwind_caller_id (get_current_frame ()),
>                        ecs->event_thread->control.step_stack_frame_id)

> Then I don't understand how that frame_id_eq returned true.  Well, unless
> both were outer_frame_id.  Please check that.

I have checked it: frame_id_eq () returns true, but neither
frame_unwind_caller_id () nor ecs->event_thread->control.step_stack_frame_id
is outer_frame_id.

If frame_unwind_caller_id () can find a valid function address in register
$ra, the returned frame id is equal to
ecs->event_thread->control.step_stack_frame_id, and the value is:

struct frame_id
{
  stack_addr=0xffffffff;
  code_addr=0x80001470;         //The address of _start(entry point)
  special_addr=0x0;
  stack_addr_p=0x1;
  code_addr_p=0x1;
  special_addr_p=0x0;
  artificial_depth=0x0;
};

If it can't find a valid function address in register $ra, GDB prints:

============
warning: GDB can't find the start of the function at 0xfffffffc.

    GDB is unable to find the start of the function at 0xfffffffc
and thus can't determine the size of that function's stack frame.
This means that GDB may be unable to access that stack frame, or
the frames below it.
    This problem is most likely caused by an invalid program counter or
stack pointer.
    However, if you think GDB should simply search farther back
from 0xfffffffc for code which looks like the beginning of a
function, you can increase the range of the search using the `set
heuristic-fence-post' command.
============

So I think you are right, maybe there is something odd in the unwinder.
But I am not familiar with the unwinder.  Can you give me some advice?