This is the mail archive of the
libc-alpha@sourceware.org
mailing list for the glibc project.
Re: [PATCH] manual: Document optind zero set behaviour (BZ#23157)
On 23/05/2018 14:52, Carlos O'Donell wrote:
> On 05/21/2018 03:38 PM, Adhemerval Zanella wrote:
>> POSIX [1] does not explicit state the expected way to rescans the same
>> vector more than once. FreeBSD [2], for instance, exports a non-standard
>> variable 'optreset' which must be set to '1' prior the second and each
>> additional call to 'getopt'. GLIBC in turn requires the program to
>> reset 'optind' to 0 instead (and POSIX states the behavior is unspecified).
>
> I see 5 getopt test cases that use optind = 1 to reparse the options.
>
> Is it optind = 1 or optind = 0?
>
> Can you verify the exact behaviour by adding a specific test case that
> tests for *just* this particular behaviour?
The issue is that getopt uses state tied to the internal contents of argv to
determine whether to advance to the next argv element. The testcase from
BZ#23157 does the following:
---
#include <stdio.h>
#include <string.h>
#include <unistd.h>

void test_getopt(int argc, char **argv, const char *optstring, int expected) {
  int oc = getopt(argc, argv, optstring);
  if (oc == expected) {
    printf("PASS getopt = %c (%d)\n", oc, oc);
  } else {
    printf("FAIL getopt = %c (%d), expected %c (%d)\n",
           oc, oc, expected, expected);
  }
}

int main(int ac, char **av) {
  int argc;
  char *argv[16];
  char test[16];

  argv[0] = "ignored-1";
  strcpy(test, "-a");
  argv[1] = test;
  argv[2] = "non-option-1";
  argv[3] = NULL;
  argc = 3;
  /* As expected */
  test_getopt(argc, argv, "ab", 'a');
  /* As expected */
  test_getopt(argc, argv, "ab", -1);

  /* Switch to a second vector and rewrite the buffer the first one used. */
  argv[0] = "ignored-2";
  argv[1] = "non-option-2";
  argv[2] = NULL;
  argc = 2;
  optind = 1;
  strcpy(test, "-ab");
  /* Fails, as __nextchar is still pointing into 'test' */
  test_getopt(argc, argv, "ab", -1);
  return 0;
}
---
After the second getopt call made through test_getopt, __nextchar points to
&test[2], which is '\0'. If 'test' were left unchanged, that would indicate that
the next argv element should be used, as in:
posix/getopt.c (line 492 onwards):
  if (d->__nextchar == NULL || *d->__nextchar == '\0')
    {
      /* Advance to the next ARGV-element. */

      /* Give FIRST_NONOPT & LAST_NONOPT rational values if OPTIND has been
         moved back by the user (who may also have changed the arguments). */
      if (d->__last_nonopt > d->optind)
However, since the program explicitly changed the contents of 'test', reading
__nextchar on the third getopt invocation yields 'b' instead, which basically
invalidates the algorithm. Without changing the contents of 'test', setting
optind to 1 works, except in the cases noted in the man-pages.
Now I am not sure whether the program is abusing getopt semantics, whether
glibc is tying its state to information it should not (the input arguments),
or whether this is just undefined behaviour.
>
>> Unfortunately this is not documented on the manual, only on man-pages [3]
>> (on NOTES). This patch adds an explanation of this behavior on manual.
>>
>> * manual/getopt.texi: Document optind zero set behaviour.
>>
>> [1] http://pubs.opengroup.org/onlinepubs/9699919799/
>> [2] https://www.freebsd.org/cgi/man.cgi?getopt(3)
>> [3] http://man7.org/linux/man-pages/man3/getopt.3.html
>> ---
>> ChangeLog | 4 ++++
>> manual/getopt.texi | 6 ++++++
>> 2 files changed, 10 insertions(+)
>>
>> diff --git a/manual/getopt.texi b/manual/getopt.texi
>> index 5485fc4..a4f6366 100644
>> --- a/manual/getopt.texi
>> +++ b/manual/getopt.texi
>> @@ -45,6 +45,12 @@ of the @var{argv} array to be processed. Once @code{getopt} has found
>> all of the option arguments, you can use this variable to determine
>> where the remaining non-option arguments begin. The initial value of
>> this variable is @code{1}.
>> +
>> +Resetting the variable value to @code{0} forces the invocation of an
>> +internal initialization routine and it is used mainly when a program
>> +wants to rescan the same vector more than once. It also should be used
>> +to scan multiple argument vectors or if @code{POSIXLY_CORRECT} is changed
>> +between scans.
>> @end deftypevar
>
> Suggest:
> Resetting the variable's value to @code{0} forces the invocation of an
> internal initialization routine and, on subsequent calls to getopt, causes
> the program to rescan the same vector more than once. This behaviour may
> also be used to scan multiple argument vectors or if @code{POSIXLY_CORRECT}
> is changed between scans.
I would add that it is also required if the contents of argv are changed
between calls.
>
>> @deftypevar {char *} optarg
>>
>