Proposed projects for Google Summer of Code 2014
The GNU C library presently has a large monolithic manual that documents most of the interfaces implemented in the library. The problem with the manual is that it is monolithic: the documentation lives apart from the source it describes. We would like someone to explore an implementation of a dynamic documentation system that allows documentation to be written inline with the code. Such a system could generate documentation directly from the source markup, and its output could then be integrated with the existing texi documentation and included in the final monolithic manual for compatibility. Ultimately, we want our documentation to become easier to maintain and more closely integrated with the source.
ISO C11 threads
The GNU C library implements the canonical POSIX Threads (pthread) implementation (NPTL) used on most Linux systems. The recent ISO C11 standard defines a set of threading functions that could easily be implemented with the existing glibc pthread functions. We would like someone to implement the ISO C11 thread functions on top of pthreads and integrate them into libpthread.so, so that these interfaces are available to programs wishing to use ISO C11 features. The work itself involves writing the header for the functions, implementing the functions on top of pthreads, and documenting the functions in the manual. The work can stop at any time, with preference given to completing one function at a time before moving on to the next.
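As a rough illustration of the layering described above, the following sketch wraps pthread_create and pthread_join in C11-style create and join functions. All sketch_* names are hypothetical and chosen to avoid clashing with a real <threads.h>; the error mapping is simplified and is only an assumption about how a real implementation might behave.

```c
#include <pthread.h>
#include <stdint.h>
#include <stdlib.h>

/* All sketch_* names are hypothetical; a real implementation would use
   the ISO C11 names (thrd_t, thrd_create, thrd_join, ...) in <threads.h>.  */
typedef pthread_t sketch_thrd_t;
typedef int (*sketch_thrd_start_t) (void *);
enum { sketch_thrd_success, sketch_thrd_nomem, sketch_thrd_error };

/* C11 start routines return int, pthread start routines return void *,
   so the function and its argument are passed through a trampoline.  */
struct sketch_start_args
{
  sketch_thrd_start_t func;
  void *arg;
};

static void *
sketch_trampoline (void *p)
{
  struct sketch_start_args args = *(struct sketch_start_args *) p;
  free (p);
  /* Carry the int result back through the void * return value.  */
  return (void *) (intptr_t) args.func (args.arg);
}

int
sketch_thrd_create (sketch_thrd_t *thr, sketch_thrd_start_t func, void *arg)
{
  struct sketch_start_args *args = malloc (sizeof *args);
  if (args == NULL)
    return sketch_thrd_nomem;
  args->func = func;
  args->arg = arg;
  if (pthread_create (thr, NULL, sketch_trampoline, args) != 0)
    {
      free (args);
      return sketch_thrd_error;
    }
  return sketch_thrd_success;
}

int
sketch_thrd_join (sketch_thrd_t thr, int *res)
{
  void *ret;
  if (pthread_join (thr, &ret) != 0)
    return sketch_thrd_error;
  if (res != NULL)
    *res = (int) (intptr_t) ret;
  return sketch_thrd_success;
}

/* Example start routine: returns its int argument plus one.  */
int
sketch_add_one (void *arg)
{
  return *(int *) arg + 1;
}
```

Creating a thread with sketch_thrd_create (&t, sketch_add_one, &n) and joining it with sketch_thrd_join (t, &out) leaves n + 1 in out. The trampoline is one possible way to bridge the differing start routine signatures; other designs are possible.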
The GNU C library implements the system math library "libm" following the relevant standards. Testing the math functions is a difficult task, and the present set of test points is not sufficient to cover all of the different ranges for each of the defined functions. We would like someone to review the math library routines function by function, determine the internal ranges of each function's implementation, and then add points to the testsuite to cover those ranges. This will also require using higher-precision libraries like GMP and MPFR to validate the accuracy of the answer for each new test point. The goal is to have at least one test point per range of a given function, for all functions. This work can stop at any point and we will still have made progress in testing the current implementation.
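To make "validate the accuracy of the answer" concrete, here is a minimal sketch that measures the error of a double result in units in the last place (ULPs) against a more precise reference. The helper name sketch_ulp_error is hypothetical, and long double merely stands in for a reference value that the project would obtain from MPFR; this is not how the glibc testsuite itself is written.

```c
#include <float.h>
#include <math.h>

/* Hypothetical helper: error of GOT, in units in the last place of a
   double, relative to a more precise reference REF.  Assumes REF is
   finite, nonzero and in the normal double range, and that long double
   is wider than double (true on x86, not on every target).  */
long double
sketch_ulp_error (double got, long double ref)
{
  int exp;
  (void) frexpl (ref, &exp);    /* ref = m * 2^exp with 0.5 <= |m| < 1.  */
  /* Spacing of doubles in the binade that contains REF.  */
  long double ulp = ldexpl (1.0L, exp - DBL_MANT_DIG);
  return fabsl ((long double) got - ref) / ulp;
}
```

A test point for a given range would then pair an input with its high-precision reference and check that the measured error stays within the bound accepted for that function.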
Instrument glibc with AddressSanitizer
Idea for glibc GSoC project: instrument the glibc source with AddressSanitizer (asan).
goal #1: test glibc itself for bugs like stack or global buffer overflow.
goal #2: improve the testing for projects that use glibc (i.e. all projects on Linux). E.g. if a program passes a pointer to invalid memory to glibc, asan-instrumented glibc will detect it. Today asan solves this problem partially by intercepting the most interesting functions (e.g. memset), but a complete solution is more than welcome.
Bonus level 1: the same thing for ThreadSanitizer to detect more races in user programs.
Bonus level 2: use Clang instrumentation as an alternative to the GCC instrumentation (this may turn out to be a huge amount of work, but it would be very welcome).
Bonus level 3: (requires Bonus level 2): instrument glibc with MemorySanitizer to detect uses of uninitialized memory.
Implement Missing Interfaces for GNU Hurd
In glibc's Linux kernel port, most simple POSIX interfaces are in fact just forwarded to (implemented by) Linux kernel system calls. In contrast, in the GNU Hurd port, the POSIX (and other) interfaces are actually implemented in glibc on top of the Hurd RPC protocols. A few examples: getuid, open, rmdir, setresuid, socketpair.
When new interfaces are added to glibc (new editions of POSIX and similar standards, support for new editions of C/C++ standards, new GNU-specific extensions), ENOSYS stubs are generally added and used as long as there is no real implementation. Often the real implementations are done only for the Linux kernel port and not for GNU Hurd, because most of the contributors are primarily interested in using glibc on Linux-based systems. There is also quite a backlog of missing implementations for GNU Hurd.
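For readers unfamiliar with ENOSYS stubs, the sketch below shows the behavior a caller observes from one. The function name is hypothetical, and real glibc stubs are written with glibc-internal macros rather than as plain functions like this; the example only mimics the visible result.

```c
#include <errno.h>

/* Hypothetical interface name, for illustration only.  This standalone
   version only mimics what a caller of an unimplemented interface sees.  */
int
sketch_new_interface (int fd)
{
  (void) fd;            /* Arguments are ignored by a stub.  */
  errno = ENOSYS;       /* "Function not implemented".  */
  return -1;
}
```

Implementing a missing interface for GNU Hurd means replacing such a stub with code that performs the operation on top of the Hurd RPC protocols.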
In coordination with the GNU Hurd developers, you'd work on implementing such missing interfaces.
Pretty Printing Support for gdb
glibc maintains a significant amount of state for applications in a number of its modules, like the dynamic linker, I/O streams, malloc, etc. To read that state, one must have an understanding of the internal implementation of these modules, which is quite difficult.
This is where pretty printing support for glibc in gdb would be handy. This would require a student to understand the internal implementation of one module and write a Python pretty-printing plugin for that module. David Malcolm has written gdb-heap, which is a great starting point for writing such a plugin for gdb.
Problems to think about and address in code:
- Where do the plugins reside in the source?
- How are they made available to gdb in various distributions?
- Which modules could/should be targeted?
Thread-safe and timezone aware date and time functions
In a multi-threaded environment, if you want to print a UTC time as local time you must set the TZ environment variable, call tzset, and then print the output. This has the serious problem that it sets the timezone for all threads. There needs to be a way to print local time from a single thread using a specific timezone. The best practice at the moment is the set of NetBSD interfaces localtime_rz, ctime_rz, mktime_z, tzalloc, and tzfree. Implementing these functions in glibc would provide a thread-safe way to print local time and would make the zone information a first-class object, e.g. timezone_t. The last problem to solve is how this zone information can be serialized so that another process, or a future process, can use the same timezone for printing similar messages. In general this is solved by the const char *zone passed to tzalloc, but at present there is no way to discover the zone you are currently in as a const char *zone, so we need another function for that, e.g. const char *tzget(timezone_t tz).
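The current TZ and tzset pattern described above can be sketched as follows. The helper name sketch_format_in_zone and the output format are assumptions made for illustration; the point is that the helper has to change process-wide state to do its job.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdlib.h>
#include <time.h>

/* Hypothetical helper showing today's TZ + tzset pattern.  It formats T
   as local time in ZONE, but does so by changing process-wide state, so
   every other thread sees the new time zone as well.  */
int
sketch_format_in_zone (const char *zone, time_t t, char *buf, size_t len)
{
  struct tm tm;
  if (setenv ("TZ", zone, 1) != 0)
    return -1;
  tzset ();                     /* Re-read TZ for the whole process.  */
  if (localtime_r (&t, &tm) == NULL)
    return -1;
  return strftime (buf, len, "%Y-%m-%d %H:%M:%S", &tm) == 0 ? -1 : 0;
}
```

Calling the helper with the POSIX zone string "EST5" and then with "UTC0" formats the same time_t two different ways, but each call also changes what every concurrent localtime call in other threads returns. A per-zone object such as the proposed timezone_t would let the zone be passed as an argument instead.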