This is the mail archive of the
libc-alpha@sourceware.org
mailing list for the glibc project.
Re: misc/tst-ttyname time outs
On 1/2/19 10:08 AM, Florian Weimer wrote:
> * Carlos O'Donell:
>
>> On 1/2/19 9:48 AM, Florian Weimer wrote:
>>> * Szabolcs Nagy:
>>>
>>>> in run_chroot_tests the following loop times out for me:
>>>>
>>>> /* keep creating PTYs until we get a name collision */
>>>> while (stat (slavename, &st) < 0)
>>>>   posix_openpt (O_RDWR|O_NOCTTY|O_NONBLOCK);
>>>>
>>>> it seems posix_openpt can fail with EMFILE or ENOSPC in the
>>>> loop and then it never finishes.
>>>>
>>>> example strace:
>>>> [pid 24510] newfstatat(AT_FDCWD, "/dev/pts/1789", 0xffffdd528fe0, 0) = -1 ENOENT (No such file or directory)
>>>> [pid 24510] openat(AT_FDCWD, "/dev/ptmx", O_RDWR|O_NOCTTY|O_NONBLOCK) = -1 ENOSPC (No space left on device)
>>>> [pid 24510] newfstatat(AT_FDCWD, "/dev/pts/1789", 0xffffdd528fe0, 0) = -1 ENOENT (No such file or directory)
>>>> [pid 24510] openat(AT_FDCWD, "/dev/ptmx", O_RDWR|O_NOCTTY|O_NONBLOCK) = -1 ENOSPC (No space left on device)
>>>> [pid 24510] newfstatat(AT_FDCWD, "/dev/pts/1789", 0xffffdd528fe0, 0) = -1 ENOENT (No such file or directory)
>>>> ...
>>>>
>>>> i'm not sure what can cause such failures, but it happens
>>>> regularly on the aarch64 build bot instance recently, let
>>>> me know if somebody knows how to make that loop or the
>>>> runtime environment more reliable.
>>>
>>> I think we should detect that posix_openpt fails and treat ENOSPC,
>>> EMFILE, ENFILE as unsupported, along with an error message that we could
>>> not create a PTY collision for slavename. Other errors should result in
>>> a hard failure.
>>>
>>> I can write a patch if you are able to test it.
>>
>> I agree that's the best option. You should just check in a fix, it's
>> obvious that if we're failing with ENOSPC, EMFILE, or ENFILE that we're
>> unable to complete the test.
>>
>> However, I would add another requirement:
>>
>> * Print a message indicating possible ways in which this can be fixed
>> to work.
>>
>> For example, should we try to determine what needs fixing and print an
>> appropriate informative message?
>
> I think for the three error codes, the remediative action is implied
> (increase the limit). But how to do that depends on the computing
> substrate being used. I don't think we can provide helpful generic
> guidance here.

I think we can do just a bit better than just UNSUPPORTED.

Let me be more specific with my suggestion.

The test could print an informative message of the form:

"info: posix_openpt: Failed with errno %d. Consider increasing limits."

This will then end up in the *.out for the test and help a developer that
is running into problems.

The *.test-result will just say:

UNSUPPORTED: misc/tst-ttyname
original exit status 77

Which doesn't help when reviewing the UNSUPPORTED test.

I will note that we are inconsistent in our use of UNSUPPORTED for ENOSPC
in that we often hard-fail the ENOSPC failures because we expect to have
infinite disk space for testing.

Perhaps just EMFILE and ENFILE should be handled as UNSUPPORTED?
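
As a rough sketch only (not a tested patch), the loop could bail out on the
first failed open and pick the exit status from errno. The helper names
openpt_failure_status and create_pty_collision, and the open_pty callback
parameter, are made up for illustration; the callback exists only so the
sketch can be exercised without exhausting real PTYs. This variant treats
only EMFILE and ENFILE as UNSUPPORTED and leaves ENOSPC as a hard failure:

```c
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/stat.h>

/* Exit status reported for UNSUPPORTED in the *.test-result.  */
#define EXIT_UNSUPPORTED 77

/* Hypothetical helper: print a message for the *.out file and choose
   the exit status for a failed posix_openpt call.  */
static int
openpt_failure_status (int err)
{
  switch (err)
    {
    case EMFILE:
    case ENFILE:
      printf ("info: posix_openpt: Failed with errno %d (%s)."
              " Consider increasing limits.\n", err, strerror (err));
      return EXIT_UNSUPPORTED;
    default:
      /* ENOSPC lands here in this variant; adding a case label above
         would make it UNSUPPORTED instead.  */
      printf ("error: posix_openpt: Failed with errno %d (%s).\n",
              err, strerror (err));
      return EXIT_FAILURE;
    }
}

/* Hypothetical replacement for the loop: keep opening PTYs until
   SLAVENAME exists, but stop on the first failed open instead of
   spinning forever.  Returns 0 once SLAVENAME exists.  */
static int
create_pty_collision (const char *slavename, int (*open_pty) (int))
{
  struct stat st;
  while (stat (slavename, &st) < 0)
    if (open_pty (O_RDWR | O_NOCTTY | O_NONBLOCK) < 0)
      return openpt_failure_status (errno);
  return 0;
}
```

In the test itself the callback would simply be posix_openpt, and a nonzero
return value would become the exit status of the chroot subprocess.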
--
Cheers,
Carlos.