readdir from dirent.h produces a segmentation fault if passed a NULL value, which can easily happen if the result of opendir is not checked against NULL.

    #include <dirent.h>

    int main(void)
    {
        readdir(NULL);
        return 0;
    }

According to the man pages: "The readdir() function returns a pointer to a dirent structure, or NULL if an error occurs or end of the directory stream is reached. On error, errno is set appropriately."

Fedora 10:
glibc-devel-2.9-3.x86_64
glibc-2.9-3.x86_64
glibc-headers-2.9-3.x86_64
glibc-common-2.9-3.x86_64
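For reference, a minimal sketch of the check the report alludes to: test opendir's result before handing it to readdir (the "/tmp" path is just an example):

    #include <dirent.h>
    #include <stdio.h>

    int main(void)
    {
        DIR *dir = opendir("/tmp");   /* arbitrary example directory */
        if (dir == NULL) {            /* opendir failed; errno says why */
            perror("opendir");
            return 1;
        }

        struct dirent *entry;
        while ((entry = readdir(dir)) != NULL)
            printf("%s\n", entry->d_name);

        closedir(dir);
        return 0;
    }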
> which can easily happen if the result of opendir is not checked against NULL.

It's the caller's responsibility to check for this.
Would it not be more robust to check for NULL and return -1, setting errno to EFAULT or EINVAL?
No, it would hide bugs in your code and encourage the writing of non-portable programs. You cannot haphazardly pass NULL to functions that expect valid pointers and ignore the contract of the function's interface. Crashing immediately is the best possible behavior because it forces you to fix the bug. BTW, by the same argument, would you want all the stdio functions to check for NULL FILE* arguments? Even getc/putc?
Rich, I've heard that argument before; I'm not convinced, though. I'm not sure if I'm missing something. Could I ask why returning -1 and setting errno to EFAULT when readdir is passed NULL would hide bugs? My view is that it actually highlights them very clearly to the application.

Re stdio, yes, some functions are already stable: take printf as a good example. In glibc it is safe and robust; a NULL pointer won't cause a crash. It checks its parameters, returning -1 and setting errno to EINVAL. Contrast printf with puts, which currently SEGVs.

NULL is a special case: it's the value people initialise their pointers to, and also set them to after releasing memory. If pointers are not going to be checked, the same logic could be applied to file handles, letting them cause crashes too. Currently, passing read() a bad fd is robustly handled by setting errno to EBADF and returning -1.
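A small test of the behaviour described above. Note that passing NULL to either function is undefined behaviour per the C standard, so the printf result is a glibc-specific extension rather than anything guaranteed:

    #include <stdio.h>

    int main(void)
    {
        /* glibc renders this as "(null)" instead of crashing; other
           C libraries are free to behave differently. */
        printf("%s\n", (char *)NULL);

        /* On the glibc version under discussion, this line SEGVs,
           so the program never reaches the return below. */
        puts(NULL);

        return 0;
    }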
> Re stdio, yes, some functions are already stable: take printf as a good
> example. In glibc it is safe and robust.

Thank you for proving my point. Do you have any idea how many buggy applications pass NULL to printf because glibc allows it? Do you have any idea what a pain that is to people trying to use such broken software on other standards-conformant systems that (correctly) don't allow you to printf NULL as "(null)" with %s? If it crashed right away, people would be forced to fix the bugs in their code.

My argument is correct, and it's rather akin to the argument for browser standards. As long as every browser accepts invalid HTML and renders it its own way, matching what the author *using that specific browser* intended it to look like, standards go to hell and every browser ends up having to copy all the other browsers' insane behavior in order to render broken sites "correctly". And often, if several browsers have different and incompatible renderings of invalid HTML, there's no solution without trying to detect which browser the author of the HTML was targeting. The correct solution here is to refuse to render invalid HTML *at all* and simply display an error message in its place.

> If pointers are not going to be checked, the same logic could be applied to
> file handles, letting them cause crashes too. Currently, passing read() a
> bad fd is robustly handled by setting errno to EBADF and returning -1.

No, this is not allowed. File descriptors are completely different from pointers. They refer to shared/inherited resources, and the standard specifies very specific behavior for what happens when you pass a value which does not refer to an open file. Note that, unlike pointers, it is fundamentally possible, for any integer, to determine whether that integer is a valid file descriptor in the current process, and the interfaces allow you to do this. *Any* invalid file descriptor, not just -1, is detectable as invalid and has well-defined behavior.

With pointers, on the other hand, and with DIR * in particular, there is fundamentally no way to detect whether a particular pointer value refers to an open directory stream (without imposing major restrictions on the implementation). The same applies to FILE pointers and many other types of resources.

Finally, read again what I said about FILE * and getc/putc. They cannot and will not check for NULL because it would make them significantly slower for no benefit. As-is, glibc's DIR * handling and FILE * handling are analogous; neither does the nonsensical check for NULL.
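The claim that any integer can be tested for fd validity is easy to demonstrate with fcntl; the F_GETFD query is a standard idiom for this. A sketch (no analogous test exists for an arbitrary DIR * or FILE * value):

    #include <errno.h>
    #include <fcntl.h>
    #include <stdio.h>

    /* Returns 1 if fd is an open file descriptor in this process, else 0. */
    static int fd_is_valid(int fd)
    {
        return fcntl(fd, F_GETFD) != -1 || errno != EBADF;
    }

    int main(void)
    {
        printf("fd 0:     %d\n", fd_is_valid(0));      /* stdin: usually valid */
        printf("fd 12345: %d\n", fd_is_valid(12345));  /* usually not open */
        printf("fd -1:    %d\n", fd_is_valid(-1));     /* always invalid */
        return 0;
    }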
Are you sure printf isn't supposed to be robust and prevent dereferencing a NULL ptr? It's only following accepted wisdom and POSIX. Many other standard functions check their params for a NULL ptr and return EFAULT. Take read() as a good example: open a file, pass the handle with a NULL buf, and you get -1 and errno == EFAULT. This is the correct behaviour.

Just looking, I see quite a few glibc functions exhibiting best practice: read, write, fopen; printf is robust too, though it sets errno to EINVAL rather than EFAULT, likewise closedir. So that's 5 cases where users are not punished while developers are still informed about the bad param. SEGV for bad params definitely is not in the contract of these POSIX functions... Just see the man pages. I encourage you to take a look at "man read", which describes when EFAULT is returned.

Re standards specifying what happens to a handle which doesn't refer to a file, the man pages are pretty clear: "man read" and "man write" say errno = EFAULT and -1 is returned for bad pointers. errno could be set to EBADF for a NULL fp pointer. I don't see why fwrite can't check that the FILE * is valid in a table covering the whole range of fp handles in use, in the same way that write() does with fd; the implementation knows what the valid handles are, and could check this when validating params. The existing robustness should be applied across the glibc interfaces that are not already stable. There could even be a policy document outlining the position clearly and describing this best practice (setting errno to EFAULT).

Re the overhead of puts() checking its param for NULL: have you made measurements? It's just an if(), a couple of machine code instructions; it won't be noticeable in any way unless someone makes 100,000 calls.

Browser standards and quirks mode are an interesting point. That's actually the reason tolerant, robust browsers do so well. You've kind of proved my point: glibc should be robust and tolerant in all cases, indestructible if possible, not pedantic and downright difficult, doing an undocumented SEGV instead. There is a balance to find, and printf detecting the bad pointer is that balance: it doesn't use it, the developer realises from errno that there was no output and fixes the code, solved. The user did not suffer in the meantime.

Another argument akin to this one is making -Werror the default: fine in a small environment, but not in a big codebase, where someone's change in one build will block everyone else from working until that warning is fixed.
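The read() behaviour being cited is easy to reproduce; a sketch (it opens "/etc/hostname" purely as an example file, and note that Linux typically returns EFAULT here while POSIX does not actually guarantee it):

    #include <errno.h>
    #include <fcntl.h>
    #include <stdio.h>
    #include <string.h>
    #include <unistd.h>

    int main(void)
    {
        int fd = open("/etc/hostname", O_RDONLY);  /* arbitrary example file */
        if (fd == -1) {
            perror("open");
            return 1;
        }

        /* On Linux this fails with EFAULT rather than crashing. */
        ssize_t n = read(fd, NULL, 16);
        if (n == -1)
            printf("read: -1, errno=%d (%s)\n", errno, strerror(errno));

        close(fd);
        return 0;
    }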
Yes, I'm sure. You're comparing apples to oranges. read, etc. are SYSTEM CALLS, which can verify whether the address is valid at no additional cost because they're already running in kernelspace. The EFAULT does not come from glibc; it comes from the kernel. In fact it's much EASIER to just return -EFAULT than it would be to set up a SIGSEGV siginfo_t and raise a signal.

Note that, still, no protection can be provided against clobbering memory by passing an incorrect (e.g. uninitialized) pointer whose value happens to correspond to some valid, currently-mapped virtual address range; all that can be detected is unmapped memory, and NULL just happens to be a special case of this.

Also note that there is NO REQUIREMENT that read, etc. return -1 with errno set to EFAULT when an invalid address is passed. The behavior, as always, is undefined, and in fact they might someday crash if there happened to be some need for userspace code to access the memory before or after the underlying syscall takes place. It just happens that the simplest implementation of the underlying undefined behavior is returning an error code, under the current implementation.
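One way to see that the EFAULT originates in the kernel is to bypass the glibc wrapper entirely via syscall(2): the raw SYS_read call produces the same result, with no libc-level validation anywhere. A Linux-specific sketch ("/etc/hostname" is again just an example file):

    #define _GNU_SOURCE
    #include <errno.h>
    #include <fcntl.h>
    #include <stdio.h>
    #include <sys/syscall.h>
    #include <unistd.h>

    int main(void)
    {
        int fd = open("/etc/hostname", O_RDONLY);
        if (fd == -1) {
            perror("open");
            return 1;
        }

        /* No glibc-level check happens here: the -1/EFAULT comes
           straight back from the kernel's address-space check. */
        long n = syscall(SYS_read, fd, NULL, 16);
        printf("raw SYS_read: %ld, errno=%d\n", n, errno);

        close(fd);
        return 0;
    }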
The C standard specifies: "The macro NULL is defined in <stddef.h> (and other headers) as a null pointer constant; see 7.17." It's in stddef.h; NULL is not an uninitialised value, NULL is part of the language. It's often set by the developer, and other times it's returned by functions like fopen. For these reasons the "null pointer constant" is provided: to be checked, not to be a feature of the language that is ignored.

Re read vs fread, I would have thought an if (NULL == ptr) check would cost the same few instructions in user space as it does in the kernel. If read() checking for NULL and returning an error code is not standard, it really should be, IMHO. It could be in the POSIX standard.
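For what it's worth, the sentinel idiom described above, initialise pointers to NULL, check returns such as fopen's against NULL, and reset to NULL after releasing memory, looks like this ("data.txt" is a made-up filename):

    #include <stdio.h>
    #include <stdlib.h>

    int main(void)
    {
        char *buf = NULL;                    /* initialise to the sentinel */
        FILE *fp = fopen("data.txt", "r");   /* hypothetical input file */

        if (fp == NULL) {                    /* fopen signals failure with NULL */
            perror("fopen");
            return 1;
        }

        buf = malloc(256);
        if (buf != NULL) {
            if (fgets(buf, 256, fp) != NULL)
                printf("%s", buf);
            free(buf);
            buf = NULL;   /* reset after release so NULL checks stay meaningful */
        }

        fclose(fp);
        return 0;
    }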