glibc aio performance
Amos P Waterland
waterland@us.ibm.com
Mon Jun 3 12:28:00 GMT 2002
Don:
Thanks for the expanded analysis: I appreciate it.
> The difference in performance for async I/O is around 7 percent on
> write/rewrite and around 2 percent for read/re-read.
>
> This is expected because:
>
> 1. There is only one disk and multiple threads don't help
Even with a single disk, shouldn't rapidly enqueuing many I/O operations
enable the disk scheduler to greatly improve random access throughput by
reordering requests? That is, if a single thread does this (pseudo-code):
b[0] = read(); b[1] = read(); ... ; b[n] = read();
the disk scheduler does not get the i-th read request until the (i - 1)-th
has completed, so the best that it can do is predictive scheduling (as you
point out: great for sequential access, but bad for random access). But if
the thread does this:
b[0] = aio_read(); b[1] = aio_read(); ... ; b[n] = aio_read();
then the disk scheduler will have approximately n I/O operations to work
with in scheduling the head seek pattern. So shouldn't there be some
instances in which using multiple threads on a single disk with AIO
outperforms SIO? (I tried to test this hypothesis with IOzone, and the -i
2 test with AIO does seem to outperform -i 2 with SIO.)
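To make the second pattern concrete, here is a minimal sketch of queuing a batch of reads before waiting on any of them. The names read_batch, NREQ and BLOCK are my own illustrative choices, not anything from IOzone or glibc, and whether the requests actually reach the disk scheduler together depends on how the AIO library dispatches them.

```c
#include <aio.h>
#include <errno.h>
#include <signal.h>
#include <string.h>
#include <sys/types.h>

#define NREQ  8      /* requests queued per batch (illustrative value) */
#define BLOCK 4096   /* bytes per request (illustrative value) */

/* Queue NREQ reads at the given offsets before waiting on any of them,
 * then collect the results.  Returns total bytes read, or -1 on error. */
ssize_t read_batch( int fd, const off_t offs[NREQ], char bufs[NREQ][BLOCK] )
{
    struct aiocb cbs[NREQ];
    const struct aiocb *list[1];
    ssize_t n, total = 0;
    int i, queued, failed = 0;

    memset( cbs, 0, sizeof cbs );

    /* Submit everything up front so all requests are outstanding at once. */
    for (i = 0; i < NREQ; i++) {
        cbs[i].aio_fildes = fd;
        cbs[i].aio_offset = offs[i];
        cbs[i].aio_buf    = bufs[i];
        cbs[i].aio_nbytes = BLOCK;
        cbs[i].aio_sigevent.sigev_notify = SIGEV_NONE;
        if (aio_read( &cbs[i] )) { failed = 1; break; }
    }
    queued = i;

    /* Reap each queued request; aio_suspend() sleeps until it completes. */
    for (i = 0; i < queued; i++) {
        list[0] = &cbs[i];
        while (aio_error( &cbs[i] ) == EINPROGRESS) {
            aio_suspend( list, 1, NULL );
        }
        n = aio_return( &cbs[i] );
        if (n < 0) { failed = 1; } else { total += n; }
    }
    return failed ? -1 : total;
}
```

A caller would fill offs[] with random block-aligned offsets and compare the elapsed time against the same offsets read one at a time with pread().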
> 5. Normal read/write have a nice fast I/O completion notification model
> that is implemented in the operating system. POSIX async I/O was a
> "group think" design and has a very poor I/O completion model that
> slows things down. Polling for I/O completion or signals was and is a
> very poor design. Most vendors have custom async I/O routines that
> have a fast I/O completion mechanism. (call back notification)
While I am not completely enamoured with the POSIX design, I do believe
they did include a facility for callbacks. I wrote a small program that
shows how to use POSIX real-time signals to pass a signal handler an
arbitrary pointer upon completion of an AIO write:
% make test0004
gcc -lrt -Wall test0004.c -o test0004
% ./test0004
--> signal handler to be passed pointer: 0x8049180
<-- signal handler passed pointer: 0x8049180
Z
Z
Z
Z
(localhost) test% cat test0004.c
/* An example of how to use signal notification as callback.
 * Amos Waterland <waterland@us.ibm.com>
 * 3 June 2002
 */

#include <aio.h>
#include <errno.h>
#include <fcntl.h>    /* open(), O_CREAT, O_WRONLY */
#include <signal.h>   /* sigaction(), siginfo_t, SIGRTMIN */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>   /* memset() */
#include <unistd.h>

#define BYTES 20485760
#define TEMP "/tmp/test0004.dat"

static char buff[BYTES];

/* Note: printf() is not async-signal-safe; it is used here only to keep
 * the demonstration short. */
void handler( int val, siginfo_t *ptr, void *ignore )
{
    int i;
    void *data = (ptr->si_value).sival_ptr;

    printf( "<-- signal handler passed pointer: %p\n", data );
    fflush( stdout );

    /* print the first four bytes of the array pointed to by the pointer passed */
    for (i = 0; i < 4; i++) { printf( "%c\n", ((char *)data)[i] ); }
}

int main( int argc, char *argv[] )
{
    int i, fd;
    struct aiocb cb;
    struct sigaction sa;

    for (i = 0; i < BYTES; i++) { buff[i] = 'Z'; }

    if ((fd = open( TEMP, O_CREAT | O_WRONLY, 0600 )) < 0) {
        perror( "error opening file" ); exit( 1 );
    }

    /* zero the structure and start from an empty signal mask */
    memset( &sa, 0, sizeof sa );
    sigemptyset( &sa.sa_mask );
    sa.sa_sigaction = handler;
    sa.sa_flags = SA_SIGINFO;

    if (sigaction( SIGRTMIN, &sa, NULL )) {
        fputs( "error setting up signal handler\n", stderr ); exit( 2 );
    }

    printf( "--> signal handler to be passed pointer: %p\n", (void *)buff );
    fflush( stdout );

    /* zero the control block so that unused fields are well defined */
    memset( &cb, 0, sizeof cb );
    cb.aio_fildes = fd;
    cb.aio_offset = 0;
    cb.aio_buf = buff;
    cb.aio_nbytes = BYTES;
    cb.aio_reqprio = 0;
    cb.aio_sigevent.sigev_notify = SIGEV_SIGNAL;
    cb.aio_sigevent.sigev_signo = SIGRTMIN;
    cb.aio_sigevent.sigev_value.sival_ptr = (void *)buff;

    if (aio_write( &cb )) { perror( "error writing to file" ); exit( 3 ); }

    while (aio_error( &cb ) == EINPROGRESS) { usleep( 10 ); }

    if (aio_return( &cb ) != BYTES) { perror( "error returned" ); exit( 4 ); }

    if (close( fd )) { fputs( "error closing file\n", stderr ); exit( 5 ); }
    if (unlink( TEMP )) { perror( "error unlinking file" ); exit( 6 ); }

    return 0;
}