This is the mail archive of the libc-help@sourceware.org mailing list for the glibc project.



Asking for Help on Seeking to End of File


Dear developers,

I recently noticed that glibc does not move the file cursor directly to
the end of the file on fseek(fp, 0, SEEK_END). Instead, it seeks to an
earlier position and reads the remaining bytes.

Here is a simple reproducer (tested with glibc-2.17):

$ cat foo.c
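#include <stdio.h>
int main(int argc, char * const *argv)
{
   if (argc < 2)
      return 1;
   FILE *fp = fopen(argv[1], "rb");
   if (fp == NULL) {
      perror("fopen");
      return 1;
   }
   fseek(fp, 0, SEEK_END);  /* the call under investigation */
   fclose(fp);
   return 0;
}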
$ gcc foo.c
$ strace ./a.out ~/Research/Data/GRCh38.fa  # a large test file (3GB)
...
open("/home/yanll/Research/Data/GRCh38.fa", O_RDONLY) = 3
fstat(3, {st_mode=S_IFREG|0644, st_size=3255188431, ...}) = 0
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7fbcc4f70000
fstat(3, {st_mode=S_IFREG|0644, st_size=3255188431, ...}) = 0
lseek(3, 3255185408, SEEK_SET)          = 3255185408
read(3, "CTGATCTTCTCCCGTTGAATTAGTTCCTAAAC"..., 3023) = 3023
close(3)                                = 0
munmap(0x7fbcc4f70000, 4096)            = 0
...

Here, fseek(fp, 0, SEEK_END) was translated into the system calls
lseek(3, 3255185408, SEEK_SET) and read(3, "...", 3023). I guess this is
related to the buffering mechanism of the fopen/fseek functions.

However, the offset passed to lseek() seems to depend on the file
system or storage device. In our case, on a large storage system (about
90TB, xfs file system) on our high-performance server, the number of
bytes read() after lseek() can be about 1GB (for a 4GB file), which
makes some applications very inefficient (even slower than on a desktop
PC).

I would like to know which factors affect the offset passed to lseek().
How can I solve this performance problem on our storage? Could you give
me some suggestions, please? Thanks!

Best wishes!
Linlin

