Bug 28258

Summary: fseek(3) under certain circumstances can loop endlessly with wide character streams
Product: glibc Reporter: Ruud Harmsen <infor>
Component: stdio Assignee: Not yet assigned to anyone <unassigned>
Status: UNCONFIRMED ---    
Severity: normal    
Priority: P1    
Version: 2.34   
Target Milestone: ---   
Host: Target:
Build: Last reconfirmed:

Description Ruud Harmsen 2021-08-23 16:09:56 UTC
When a valid UTF-8 character has just been read with fgetwc, and fseek is then called to move to a position just after the first byte of that same character, fseek gets into an infinite loop.

Happens in glibc 2.28 (Debian), 2.31 (Mint 20.1 & Ubuntu 20.04), 2.33 (Ubuntu 20.10), 2.34 (Mint; glibc compiled from freshly downloaded GNU sources).
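
The sequence described above can be sketched as follows. This is an illustrative reconstruction under stated assumptions, not one of the linked demonstration programs: the helper name probe_fseek_mid_char, the temporary file path and the locale names are all hypothetical choices. The fseek call runs in a child process guarded by alarm(), so the probe itself terminates even if fseek does not. Whether the loop is actually triggered may depend on the glibc version and on which UTF-8 locales are installed.

```c
#define _POSIX_C_SOURCE 200809L
#include <locale.h>
#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>
#include <wchar.h>

/* Hypothetical probe. Returns 0 if fseek returned, 1 if fseek was still
   running when the alarm fired, -1 if the setup failed.
   timeout_sec should be nonzero (alarm(0) would cancel the alarm). */
int probe_fseek_mid_char(unsigned timeout_sec)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* Child: a UTF-8 locale is assumed to be installed under one
           of these names. */
        if (!setlocale(LC_CTYPE, "C.UTF-8") &&
            !setlocale(LC_CTYPE, "en_US.UTF-8"))
            _exit(2);

        char path[] = "/tmp/fseekprobeXXXXXX";
        int fd = mkstemp(path);
        if (fd < 0)
            _exit(2);
        unlink(path);

        /* U+00E9 encoded as the two bytes C3 A9, then a newline. */
        static const char data[] = "\xC3\xA9\n";
        if (write(fd, data, 3) != 3 || lseek(fd, 0, SEEK_SET) != 0)
            _exit(2);

        FILE *fp = fdopen(fd, "r");
        if (fp == NULL)
            _exit(2);

        /* Read the whole two-byte character as one wide character. */
        if (fgetwc(fp) != (wint_t)0xE9)
            _exit(2);

        /* Seek to offset 1, the second byte of that character. */
        alarm(timeout_sec);
        fseek(fp, 1L, SEEK_SET);
        _exit(0);
    }

    /* Parent: classify how the child ended. */
    int status;
    if (waitpid(pid, &status, 0) != pid)
        return -1;
    if (WIFSIGNALED(status) && WTERMSIG(status) == SIGALRM)
        return 1;
    if (WIFEXITED(status) && WEXITSTATUS(status) == 0)
        return 0;
    return -1;
}
```

Calling probe_fseek_mid_char(2) from a small main and printing the result gives a quick check on a given system: 1 would be consistent with the behaviour reported here, 0 means fseek returned within the timeout.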

Full description in
https://rudhar.com/sfreview/siworin/siworin14.htm .

Self-contained demonstration programs are in
https://rudhar.com/sfreview/siworin/src/ .
Comment 1 Ruud Harmsen 2021-09-29 12:29:08 UTC
Earlier, the endless loop in fseek happened when I did an fseek to an invalid byte after the start of a valid character.
Now, however, I have found that it can also happen when seeking to a valid byte that comes after an invalid character.
That is the reverse situation.
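
To make the terms concrete, here is a minimal sketch of how bytes in a UTF-8 stream can be classified. The helper names (utf8_is_continuation, utf8_seq_len, utf8_offset_is_boundary) are hypothetical and appear in neither glibc nor the linked programs. The sketch only looks at single bytes; it does not validate whole sequences. A seek target that lands on a continuation byte is a position inside a character, which is the kind of offset involved in both situations described above.

```c
#include <stddef.h>

/* Continuation bytes have the bit pattern 10xxxxxx. */
int utf8_is_continuation(unsigned char b)
{
    return (b & 0xC0) == 0x80;
}

/* Length of the sequence introduced by lead byte b, or 0 if b cannot
   start a UTF-8 sequence (continuation byte, C0, C1, F5..FF). */
int utf8_seq_len(unsigned char b)
{
    if (b < 0x80)
        return 1;
    if (b >= 0xC2 && b <= 0xDF)
        return 2;
    if (b >= 0xE0 && b <= 0xEF)
        return 3;
    if (b >= 0xF0 && b <= 0xF4)
        return 4;
    return 0;
}

/* Returns 1 if offset off in buf[0..len) does not point at a
   continuation byte, i.e. it is not in the middle of a character.
   The end of the buffer (off == len) counts as a boundary. */
int utf8_offset_is_boundary(const unsigned char *buf, size_t len, size_t off)
{
    if (off > len)
        return 0;
    if (off == len)
        return 1;
    return !utf8_is_continuation(buf[off]);
}
```

For the three bytes C3 A9 0A, offsets 0 and 2 are boundaries and offset 1 is not.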

Also, before, it happened in a test program that was not a realistic representation of real-life use.
But now the loop occurred when I tried to make a real-life application robust against invalid input. Every high-quality program should have such robustness. With GNU glibc 2.31, I can now only achieve that robustness in a weird, possibly even illegal way, because the proper way to do it causes this endless loop.

Therefore in my opinion this bug deserves more attention than it has received so far, and it should be fixed, urgently.

A more detailed explanation is in my web article https://rudhar.com/sfreview/siworin/siworin17.htm . A minimal self-contained demonstration of the bug is in https://rudhar.com/sfreview/siworin/src/, under the name siworin17.c . 

Meanwhile, I have found out exactly where in the library sources the infinite loop occurs. More on that later.
Comment 2 Ruud Harmsen 2021-09-29 13:31:55 UTC
The loop is in source file libio/wfileops.c, function adjust_wide_data, and the repeated lines are 567, 568, 576, 582. Applies to GNU glibc versions 2.31 and 2.34.

More comments: http://rhar.info/sfreview/siworin/siworin18.htm .