According to POSIX, the type regoff_t should be able to hold at least as many bytes as off_t and ssize_t; see: http://www.opengroup.org/onlinepubs/009695399/basedefs/regex.h.html It is defined in regex.h as 'int', so it cannot hold an off_t on a 64-bit machine, or on a 32-bit machine where 64-bit off_t support is enabled (#define _FILE_OFFSET_BITS 64).
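For anyone who wants to check their own system, here is a minimal sketch that compares the sizes involved. The function names regoff_wide_enough and print_regoff_sizes are made up for this example; only regoff_t, off_t and ssize_t come from the system headers. The result depends on the platform and C library, so no particular output is claimed.

```c
#include <regex.h>
#include <stdio.h>
#include <sys/types.h>

/* Hypothetical helper: nonzero if regoff_t is at least as wide as
   both off_t and ssize_t, which is what the report asks for. */
int regoff_wide_enough(void)
{
    return sizeof(regoff_t) >= sizeof(off_t)
        && sizeof(regoff_t) >= sizeof(ssize_t);
}

/* Hypothetical helper: print the sizes so any mismatch is visible. */
void print_regoff_sizes(void)
{
    printf("regoff_t: %zu bytes\n", sizeof(regoff_t));
    printf("off_t:    %zu bytes\n", sizeof(off_t));
    printf("ssize_t:  %zu bytes\n", sizeof(ssize_t));
    printf("wide enough: %s\n", regoff_wide_enough() ? "yes" : "no");
}
```

Calling print_regoff_sizes() from main, once built normally and once with -D_FILE_OFFSET_BITS=64 on a 32-bit target, should show whether the off_t size changes while regoff_t stays the same.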
This is known but obviously cannot easily be fixed. Suspended until somebody takes this serious to actually take a stab at a solution.
You mean, it cannot be easily fixed because it breaks the ABI?
(In reply to comment #2)
> You mean, it cannot be easily fixed because it breaks the ABI?
Yes.
On bug-gnulib, the following suggestion was made by Bruno Haible:
> [glibc could] offer some preprocessor macro that makes regoff_t 64-bit wide -
> like it was done for off_t.
>
> Would glibc need to export additional symbols for this? Yes.
>
> Would a compiled glibc need to contain two copies of the regex code? No, the
> 32-bit version could be a thin wrapper around the 64-bit version.
I guess this would count as "somebody takes this serious to actually take a stab at a solution". Would _REGEX_OFFSET_BITS be okay for you as a macro?
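To make the shape of that suggestion concrete, here is a rough sketch, not glibc code. Every name in it (demo_exec32, demo_exec64, demo_regmatch32_t, demo_regmatch64_t, DEMO_REGEX_OFFSET_BITS) is hypothetical, and a plain strstr search stands in for the regex engine so the example stays small. The overflow check in the narrow wrapper is my own assumption about how a wrapper might behave; the quoted suggestion does not specify it.

```c
#include <limits.h>
#include <string.h>

/* Hypothetical match records with wide and narrow offsets. */
typedef struct { long long rm_so, rm_eo; } demo_regmatch64_t;
typedef struct { int rm_so, rm_eo; } demo_regmatch32_t;

/* Wide entry point: the only copy of the matching logic.
   A substring search stands in for the real regex engine.
   Returns 0 on a match, 1 on no match. */
int demo_exec64(const char *pattern, const char *text, demo_regmatch64_t *m)
{
    const char *hit = strstr(text, pattern);
    if (hit == NULL)
        return 1;
    m->rm_so = (long long)(hit - text);
    m->rm_eo = m->rm_so + (long long)strlen(pattern);
    return 0;
}

/* Narrow entry point: a thin wrapper around the wide one.
   Returns 2 (an assumed error code) rather than truncating. */
int demo_exec32(const char *pattern, const char *text, demo_regmatch32_t *m)
{
    demo_regmatch64_t wide;
    int rc = demo_exec64(pattern, text, &wide);
    if (rc != 0)
        return rc;
    if (wide.rm_eo > INT_MAX)
        return 2;               /* offset does not fit in int */
    m->rm_so = (int)wide.rm_so;
    m->rm_eo = (int)wide.rm_eo;
    return 0;
}

/* Header-side selection, in the style of _FILE_OFFSET_BITS. */
#if defined(DEMO_REGEX_OFFSET_BITS) && DEMO_REGEX_OFFSET_BITS == 64
typedef demo_regmatch64_t demo_regmatch_t;
#define demo_exec demo_exec64
#else
typedef demo_regmatch32_t demo_regmatch_t;
#define demo_exec demo_exec32
#endif
```

A caller would write demo_regmatch_t and demo_exec(...) in both configurations and pick the width by defining the macro before including the header. Both symbols stay exported, which matches the "additional symbols" answer in the quote.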
*** Bug 12900 has been marked as a duplicate of this bug. ***
The original bug report is old, and POSIX has changed in the meantime: regoff_t is now required to be at least as large as ptrdiff_t and ssize_t. (Previously this was off_t and ssize_t.) See: http://pubs.opengroup.org/onlinepubs/9699919799/basedefs/regex.h.html
Yes, thanks for updating/clarifying that. Is there any chance of this ever getting fixed? I suspect there may even be obscure vulnerabilities related to this: if you can somehow pass a string longer than 4 GB to regexec and cause the match offsets to get truncated, the caller may either dereference memory at a negative offset, exposing data it should not, or treat non-matching data early in the string as a match. Obviously these could be closed by making the interface even more non-conforming and rejecting offsets that would overflow, but I think the proper solution is to add a versioned symbol and fix the type.
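To illustrate the arithmetic behind that concern, here is a small sketch of what happens when a 64-bit offset is squeezed into a 32-bit signed value. It assumes a 32-bit int with two's-complement wraparound and does not claim that regexec actually produces these values; the function names wrap_to_int32 and offset_fits_int32 are invented for the example.

```c
#include <stdint.h>

/* Hypothetical helper: the value a 64-bit offset would take if only
   its low 32 bits were kept and read as a signed 32-bit integer.
   Written so that no implementation-defined conversion is used. */
int32_t wrap_to_int32(int64_t off)
{
    uint32_t low = (uint32_t)off;            /* well-defined: modulo 2^32 */
    if (low <= (uint32_t)INT32_MAX)
        return (int32_t)low;
    return (int32_t)((int64_t)low - 4294967296LL);
}

/* Hypothetical helper: the "reject offsets that would overflow"
   mitigation, expressed as a range check. */
int offset_fits_int32(int64_t off)
{
    return off >= 0 && off <= INT32_MAX;
}
```

Under these assumptions an offset of 3 GiB (3221225472) comes out as -1073741824, which is the negative-offset case, and an offset of 4 GiB + 10 (4294967306) comes out as 10, which points at unrelated data near the start of the string.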