Memmove causing program crashes, giving SIGTRAP in GDB(?)
KENNON J CONRAD
kennonconrad@comcast.net
Fri Feb 27 02:27:02 GMT 2026
Yes, I still have the gdb session running for the crash.
Five threads are in the function "add_suffix", stopped at the following code lines:
1. node_ptr = &nodes[*sibling_node_num_ptr];
2. while (node_ptr->child_node_num != 0) {
3. new_node_ptr->sibling_node_num[1] = 0;
4. return;
5. if (*(node_symbol_ptr + 1) == *(in_symbol_ptr + 1)) {
One thread is in rank_scores_thread and is the one getting the SIGTRAP in the memmove function.
Mainline is in score_base_node_tree_cap at this line:
node_instances = node_ptr->instances;
Threads 1 - 5 do not have calls to memmove, memcpy or memset in the C code, although I'd need to check the assembly code to be sure none of these are called. Mainline does have some mem library calls, but these are only used at points in the code where all other threads have exited. So I don't immediately see anything that looks particularly suspect.
For now I'm going to investigate this information from Google AI, since the errors are occurring on a Haswell architecture i7-4790K:
Intel Haswell (and related architectures) processors may experience stability issues, including machine check errors (MCEs), due to a microcode bug related to REP MOVS (specifically REP MOVSB or REP MOVSQ) handling. These issues often cause system crashes or lockups, leading to microcode, BIOS/UEFI updates to resolve them.
Issue: A high-rate of interrupts or specific memory operations can cause REP MOVS instructions to trigger Machine Check Errors (MCE) or internal errors (IERR) on older processors.
Affected Processors: The bug primarily impacts older Intel processors, including Haswell and Broadwell architectures.
Fix/Mitigation: The primary solution is to apply the latest motherboard BIOS/UEFI update, which contains the corrected microcode update (often labelled 20180108 or later).
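One way I could try to separate a hardware or runtime problem from a bug in my own code is a small standalone stress test that does nothing except overlapping upward memmove calls from several threads and checks every result. The following is only a sketch under my own assumptions: the names overlap_worker and run_stress, the buffer size and the shift distance are made up for illustration and are not taken from my program, and I am assuming the library picks its backward-copy path when the destination overlaps above the source.

```c
#include <pthread.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define BUF_WORDS   4096  /* hypothetical buffer size, in 64-bit words */
#define SHIFT_WORDS 8     /* dest = src + 8 words, so the regions overlap */
#define MAX_THREADS 64    /* hypothetical upper limit for this sketch */

typedef struct {
    long iterations;  /* number of memmove calls this thread makes */
    long mismatches;  /* result: bad copies seen, or -1 if malloc failed */
} worker_arg;

/* Fill a buffer with a known pattern, shift it upward with memmove,
   then verify that every word arrived intact. */
static void *overlap_worker(void *p)
{
    worker_arg *arg = p;
    uint64_t *buf = malloc((BUF_WORDS + SHIFT_WORDS) * sizeof *buf);

    arg->mismatches = 0;
    if (buf == NULL) {
        arg->mismatches = -1;
        return NULL;
    }
    for (long it = 0; it < arg->iterations; it++) {
        for (size_t i = 0; i < BUF_WORDS; i++)
            buf[i] = (uint64_t)it * BUF_WORDS + i;

        /* Destination overlaps above the source, so a plain forward
           copy would overwrite words before they are read. */
        memmove(buf + SHIFT_WORDS, buf, BUF_WORDS * sizeof *buf);

        for (size_t i = 0; i < BUF_WORDS; i++) {
            if (buf[i + SHIFT_WORDS] != (uint64_t)it * BUF_WORDS + i) {
                arg->mismatches++;  /* count at most one per iteration */
                break;
            }
        }
    }
    free(buf);
    return NULL;
}

/* Run the check on nthreads threads at once. Returns the total number
   of bad copies, or -1 on a setup error. */
long run_stress(int nthreads, long iterations)
{
    pthread_t tid[MAX_THREADS];
    worker_arg args[MAX_THREADS];
    long total = 0;
    int created = 0, failed = 0;

    if (nthreads < 1 || nthreads > MAX_THREADS)
        return -1;
    for (int t = 0; t < nthreads; t++) {
        args[t].iterations = iterations;
        args[t].mismatches = 0;
        if (pthread_create(&tid[t], NULL, overlap_worker, &args[t]) != 0) {
            failed = 1;
            break;
        }
        created++;
    }
    for (int t = 0; t < created; t++) {
        pthread_join(tid[t], NULL);
        if (args[t].mismatches < 0)
            failed = 1;
        else
            total += args[t].mismatches;
    }
    return failed ? -1 : total;
}
```

Calling run_stress(7, N) from a small main with N in the millions would roughly match the thread count at the time of the crash. A clean run would not prove much given how rare the failure is, but a SIGTRAP or a nonzero return in a program this simple would point away from my own code.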
Best Regards,
Kennon
> On 02/26/2026 1:42 PM PST Dimitry Andric <dimitry@unified-streaming.com> wrote:
>
>
> If such a crash occurs, can you do a "thread apply all bt" in gdb? This will show what all the other threads are doing. I'm betting some other thread is calling memcpy or some other function that is messing with the direction flag.
>
> -Dimitry
>
> > On 26 Feb 2026, at 21:47, KENNON J CONRAD <kennonconrad@comcast.net> wrote:
> >
> > Yes, lots. 7 threads were running at the point of the crash, giving 87% load on my i7-4790K. I did a little research since the last post. The memmove code where the crash occurs is:
> >
> > 0x00007ff96ba812a8 <+136>: std
> > => 0x00007ff96ba812a9 <+137>: rep movsq %ds:(%rsi),%es:(%rdi)
> > 0x00007ff96ba812ac <+140>: cld
> >
> > This sets the direction flag immediately before the rep movsq and clears the direction flag immediately after the rep movsq. Yet when gdb breaks it shows the direction flag is not set:
> >
> > eflags 0x246 [ PF ZF IF ]
> >
> > Would a forward move on overlapping data cause the SIGTRAP? Could the code have moved to a different core? Or could it have been interrupted by some other task that corrupts the flag? As I mentioned earlier, the rep movsq is only failing once per several million times memmove is called so it seems likely to be something along those lines.
> >
> > -Kennon
> >
> >
> >> On 02/26/2026 12:20 PM PST Dimitry Andric <dimitry@unified-streaming.com> wrote:
> >>
> >>
> >> Is there some concurrency going on? Maybe some other part of the program is flipping the direction flag?
> >>
> >> -Dimitry
> >>