The __opensock contains two static variables (last_type, last_family) that can race, triggering an assert on line 63 of opensock.c. We should either add some memory barrier, making it thread-local, or just remove the optimization.
The following lines are present in socket.texi for {unsigned int} if_nametoindex (const char *@var{ifname}) 491 @c It opens a socket to use ioctl on the fd to get the index. 492 @c opensock may call socket and access multiple times until it finds a 493 @c socket family that works. The Linux implementation has a potential 494 @c concurrency issue WRT last_type and last_family not being updated 495 @c atomically, but it is harmless; the generic implementation, OTOH, 496 @c takes a lock, which makes all callers AS- and AC-Unsafe. However, the explanation has been commented and not displayed in HTML file. Had the information been present in HTML, it would be easier to debug and synchronize multithreaded access on the caller side. Can the above explanation be uncommented to get it reflected in HTML? Thanks
This commit removes the problematic code from glibc 2.35: commit 5bf07e1b3a74232bfb8332275110be1a5da50f83 Author: Florian Weimer <fweimer@redhat.com> Date: Tue Sep 28 18:55:49 2021 +0200 Linux: Simplify __opensock and fix race condition [BZ #28353] AF_NETLINK support is not quite optional on modern Linux systems anymore, so it is likely that the first attempt will always succeed. Consequently, there is no need to cache the result. Keep AF_UNIX and the Internet address families as a fallback, for the rare case that AF_NETLINK is missing. The other address families previously probed are totally obsolete be now, so remove them. Use this simplified version as the generic implementation, disabling Netlink support as needed.
@florian, do you think this should be back-ported to older releases such as 2.33, 2.34 ? It is a rare race but still a bug. Also, should @pgowda send a patch to fix up the documentation: $ rg "concurrency issue WRT last_type and last_family" manual/socket.texi 494:@c concurrency issue WRT last_type and last_family not being updated Thanks! ../Randy
Sorry, missed the ping. 2.33 and 2.34 backports are now done.