This is the mail archive of the
mailing list for the glibc project.
Re: RFC: IDN support in getaddrinfo().
Ulrich Drepper <firstname.lastname@example.org> writes:
>> Having the libidn
>> API available via libc has some benefits though, because there are
>> many programs that will need stringprep functionality in the future
>> (e.g., iSCSI, XMPP instant messaging, Kerberos, SASL).
> This is something I absolutely don't want. We've been burned badly by
> the inability of RFC authors to come up with stable interfaces in the
> past. The libidn functionality would be hidden from the world and be
> only available through getaddrinfo (and getnameinfo, I guess).
> Yes, other programs might need it, too. But until the appropriate
> interfaces are standardized by a respectable standards body they'll have
> to live in a separate library. If it turns out, later, that the
> interface is stable we can make the interfaces in the separate library
> simple forwarders to the glibc functions.
I understand and agree.
>> I have been thinking about a dlopen() approach, to reduce the code
>> size in libc. E.g., the application requests IDN, then libc tries to
>> dlopen("idn"). The libc IDN code patch would only amount to, say,
>> less than 100 lines. Any thoughts on this? Is it feasible at all?
> It is feasible but probably not needed.
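As a rough illustration of the dlopen() idea above, the following C sketch loads an IDN library on first use and looks up a single conversion function. Everything named here is an assumption for illustration: the library name and symbol are supplied by the caller, the function-pointer signature is modelled on libidn's idna_to_ascii_8z, and load_idn and idn_to_ascii_fn are hypothetical names, not anything in libc.

```c
#include <dlfcn.h>
#include <stddef.h>

/* Assumed signature, modelled on libidn's idna_to_ascii_8z(). */
typedef int (*idn_to_ascii_fn)(const char *input, char **output, int flags);

static void *idn_handle;             /* handle kept for the process lifetime */
static idn_to_ascii_fn idn_to_ascii; /* resolved lazily by load_idn() */

/* Load the IDN library the first time it is needed.
   Returns 0 if the function is available, -1 otherwise. */
int load_idn(const char *soname, const char *symbol)
{
  if (idn_to_ascii != NULL)
    return 0;                        /* already loaded */

  idn_handle = dlopen(soname, RTLD_LAZY);
  if (idn_handle == NULL)
    return -1;                       /* library not installed */

  /* POSIX requires this void* to function-pointer conversion to work. */
  idn_to_ascii = (idn_to_ascii_fn) dlsym(idn_handle, symbol);
  if (idn_to_ascii == NULL)
    {
      dlclose(idn_handle);
      idn_handle = NULL;
      return -1;                     /* symbol missing */
    }
  return 0;
}
```

A caller would invoke load_idn() only when the application requests IDN, and fall back to plain ASCII lookups when it returns -1. On older systems this needs linking with -ldl.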
> What I suggest to do is to move the data structures in a separate file.
> This file can then be mmaped in case getaddrinfo is called with the
> appropriate option. This adds some overhead to users/programs using
> this functionality but that's life. Adding 100k+ to glibc's size for
> every user is not acceptable.
> The code part seems small enough not to cause big problems. I would
> insist on stripping down the code to the absolute minimum, though.
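To make the separate-data-file suggestion concrete, here is a minimal C sketch of mapping a table file read-only on demand. It is only an illustration under assumptions: the file path, the struct mapped_table type, and the map_table/unmap_table names are hypothetical, and nothing is assumed about the layout of the data inside the file.

```c
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical descriptor for a table file mapped into memory. */
struct mapped_table
{
  const void *data;
  size_t size;
};

/* Map the whole file at PATH read-only.  Returns 0 on success, -1 on
   any failure (missing file, empty file, mmap error). */
int map_table(const char *path, struct mapped_table *out)
{
  struct stat st;
  void *p;
  int fd = open(path, O_RDONLY);

  if (fd < 0)
    return -1;
  if (fstat(fd, &st) != 0 || st.st_size <= 0)
    {
      close(fd);
      return -1;
    }

  p = mmap(NULL, (size_t) st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
  close(fd);                         /* the mapping outlives the descriptor */
  if (p == MAP_FAILED)
    return -1;

  out->data = p;
  out->size = (size_t) st.st_size;
  return 0;
}

/* Release a mapping obtained from map_table(). */
void unmap_table(struct mapped_table *t)
{
  munmap((void *) t->data, t->size);
  t->data = NULL;
  t->size = 0;
}
```

Programs that never ask for IDN never pay for the tables, and the pages of a read-only mapping can be shared between processes that do.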
>> but some utility functions to convert between UTF-8
>> and UCS-4 are used internally, so make it ~10 API functions. (Perhaps
>> those functions already exist elsewhere in libc though?)
> We obviously have code internally to map from and to UTF-8. These would
> have to be used. They are a bit lower-level than iconv.
Can you give a pointer? I couldn't find anything in the libc manual,
but I guess they aren't exported.
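For readers unfamiliar with the kind of utility function meant here, the following is a small self-contained C sketch of decoding one UTF-8 sequence into a UCS-4 code point. It is not the glibc-internal code referred to above (which is not shown in this thread); utf8_to_ucs4 is a hypothetical name used only for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Decode one UTF-8 sequence from S (N bytes available) into *OUT.
   Returns the number of bytes consumed, or 0 if the input is empty,
   truncated, overlong, a surrogate, or beyond U+10FFFF. */
size_t utf8_to_ucs4(const unsigned char *s, size_t n, uint32_t *out)
{
  size_t len, i;
  uint32_t cp, min;

  if (n == 0)
    return 0;

  if (s[0] < 0x80)
    {
      *out = s[0];                   /* plain ASCII */
      return 1;
    }
  else if ((s[0] & 0xE0) == 0xC0)
    { len = 2; cp = s[0] & 0x1F; min = 0x80; }
  else if ((s[0] & 0xF0) == 0xE0)
    { len = 3; cp = s[0] & 0x0F; min = 0x800; }
  else if ((s[0] & 0xF8) == 0xF0)
    { len = 4; cp = s[0] & 0x07; min = 0x10000; }
  else
    return 0;                        /* stray continuation or invalid lead */

  if (n < len)
    return 0;                        /* truncated sequence */

  for (i = 1; i < len; i++)
    {
      if ((s[i] & 0xC0) != 0x80)
        return 0;                    /* not a continuation byte */
      cp = (cp << 6) | (uint32_t) (s[i] & 0x3F);
    }

  /* Reject overlong encodings, surrogates and out-of-range values. */
  if (cp < min || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
    return 0;

  *out = cp;
  return len;
}
```

Calling it in a loop, advancing by the returned length each time, converts a whole UTF-8 string to UCS-4.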
>> Libidn currently supports non-IDN related stringprep profiles as well,
>> but they re-use the IDN-related stringprep tables. They add only
>> about ~20 lines of initialization in a static const table (100-200
>> bytes? Dunno.)
> Should go. Any bit of code which is removed cannot be misused.
Yes, and it wouldn't serve any purpose if getaddrinfo is the only
> If you send the code where the data is loaded from a file (just use read
> for now) and the iconv uses are marked, I'll change those parts to use
> the glibc-internal interfaces which are not so easy to find.
Do you have any opinion on the format of the external files? Text or
binary? Platform dependent or independent binary? Export both
Unicode NFKC and RFC 3454 tables, or just one of them?
This will likely take some time to implement: the tables are exported
to several closely intertwined C variables, and they are accessed from
many functions. It is more complicated than exporting one C variable
to a file and then updating the one function that reads from the variable.
Perhaps I only export the large tables, and leave some minor tables in
> You have assigned the code to the FSF already, I assume? Are there any
> collaborators? Is the license LGPL?
Libidn is not an FSF-copyrighted package, but I can assign the libc
patch to the FSF. I have not written all of the code; here are the
exceptions. The license is LGPL. If someone who knows about the
license issues could evaluate the following statements, I think that
would be good.
punycode.*, copied from RFC 3492 (bis), with this license text:
* Disclaimer and license: Regarding this entire document or any
* portion of it (including the pseudocode and C code), the author
* makes no guarantees and is not responsible for any damage resulting
* from its use. The author grants irrevocable permission to anyone
* to use, modify, and distribute it in any way that does not diminish
* the rights of anyone else to use, modify, and distribute it,
* provided that redistributed derivative works do not contain
* misleading author or version information. Derivative works need
* not be licensed under similar terms.
* Copyright (C) The Internet Society (2003). All Rights Reserved.
* This document and translations of it may be copied and furnished to
* others, and derivative works that comment on or otherwise explain it
* or assist in its implementation may be prepared, copied, published
* and distributed, in whole or in part, without restriction of any
* kind, provided that the above copyright notice and this paragraph are
* included on all such copies and derivative works. However, this
* document itself may not be modified in any way, such as by removing
* the copyright notice or references to the Internet Society or other
* Internet organizations, except as needed for the purpose of
* developing Internet standards in which case the procedures for
* copyrights defined in the Internet Standards process must be
* followed, or as required to translate it into languages other than
* English.
* The limited permissions granted above are perpetual and will not be
* revoked by the Internet Society or its successors or assigns.
* This document and the information contained herein is provided on an
* "AS IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING
* TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING
* BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION
* HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF
* MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.
nfkc.c contains some functions based on functions in GLIB gutf8.c and
gunidecomp.c. Some of them might not be needed, if I find
replacements within libc. The license text:
* Copyright (C) 1999, 2000 Tom Tromey
* Copyright (C) 2000 Red Hat, Inc.
* This library is free software; you can redistribute it and/or
* modify it under the terms of the GNU Lesser General Public
* License as published by the Free Software Foundation; either
* version 2 of the License, or (at your option) any later version.
* This library is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
* Lesser General Public License for more details.
* You should have received a copy of the GNU Lesser General Public
* License along with this library; if not, write to the
* Free Software Foundation, Inc., 59 Temple Place - Suite 330,
* Boston, MA 02111-1307, USA.
gen-unicode-tables.pl, also from GLIB:
# Copyright (C) 1998, 1999 Tom Tromey
# Copyright (C) 2001 Red Hat Software
# This program is free software; you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation; either version 2, or (at your option)
# any later version.
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
# You should have received a copy of the GNU General Public License
# along with this program; if not, write to the Free Software
# Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA
# 02111-1307, USA.
# Andrew Taylor <email@example.com>
# I consider the output of this program to be unrestricted. Use it as
# you will.
Perhaps gen-unicode-tables.pl doesn't have to be included, only its