This is the mail archive of the mailing list for the glibc project.

RE: [PATCH] Native POSIX Thread Library(NPTL) ARM Supporting Patches (1/3)

> From: Philip Blundell
> > I wonder what the performance impact is of having a system call in
> > THREAD_SELF.  If it turns out to be too great, it may be possible to
> > reduce the overhead by adding some more support to the kernel.  What
> > I've been thinking of is a way for applications to supply a pointer to
> > the kernel (via a new system call), and have it store the current thread
> > ID at that address during context switch.  That way, retrieving the
> > thread descriptor would be just a regular memory access.  I think it'd
> > only add a handful of cycles to the context switch path, and that's a
> > comparatively heavyweight operation already so it would probably
> > disappear in the noise.

Similar but not the same: here is a wild and probably too wasteful
idea I have been toying with lately.

Wouldn't it be useful to always map the second page of a process (page
1) to a per-task/thread read-only page that holds that information,
along with other things we usually ask the kernel for, where a system
call is kind of overkill (for example, the PID, the date?, and some
other statistics/data about the process)?

It is like a generalization of yours to any piece of read-only data
passed from the kernel to user space. The big drawback, however, is
that we waste yet another page per process if we pin it in physical
memory to ease access from the kernel. (Leaving it unpinned is OK, but
kind of defeats the purpose: it would make it harder for the kernel to
update the statistics that make the most sense to have in this page,
such as CPU time, which as far as I remember is changed from within
atomic/spinlocked regions.)

So the kernel defines:

struct task_info_page {
	int pid;
	void *pthread_self;
	void *tls;
	/* ... more stuff, like for example, CPU time, etc ... */
};

And maps that at user-space address PAGE_SIZE, with a different
mapping for each thread (how to do that when CLONE_VM is active??).

This way, for example, getpid would be as simple as:

struct task_info_page *task_info_page =
   (struct task_info_page *) PAGE_SIZE;

int sys_getpid (void)
{
	return task_info_page->pid;
}
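
For anyone who wants to play with the idea without touching the
kernel, here is a rough user-space sketch. It is only a simulation
under several assumptions: task_info_page_setup() and fast_getpid()
are hypothetical names made up for this example, the page is placed
wherever mmap() chooses rather than at the fixed address PAGE_SIZE,
and the process fills the page in itself and then makes it read-only,
standing in for what the kernel would do.

```c
#define _DEFAULT_SOURCE   /* for MAP_ANONYMOUS on glibc */
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical layout, following the struct in this message. */
struct task_info_page {
	int pid;
	void *pthread_self;
	void *tls;
};

/* In the proposal this would be the fixed address PAGE_SIZE;
   here it is whatever address mmap() hands back. */
static const struct task_info_page *task_info_page;

/* Stand-in for the kernel: allocate one page, fill it in, then
   make it read-only.  Returns 0 on success, -1 on failure. */
int task_info_page_setup(void)
{
	long page_size = sysconf(_SC_PAGESIZE);
	struct task_info_page *p;

	if (page_size < 0)
		return -1;
	assert(sizeof(*p) <= (size_t) page_size);

	p = mmap(NULL, (size_t) page_size, PROT_READ | PROT_WRITE,
		 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (p == MAP_FAILED)
		return -1;

	p->pid = (int) getpid();   /* the only real system call for the PID */
	p->pthread_self = NULL;    /* would be set at thread startup */
	p->tls = NULL;

	if (mprotect(p, (size_t) page_size, PROT_READ) != 0) {
		munmap(p, (size_t) page_size);
		return -1;
	}
	task_info_page = p;
	return 0;
}

/* After setup, fetching the PID is a plain memory load. */
int fast_getpid(void)
{
	return task_info_page->pid;
}
```

Call task_info_page_setup() once at startup and fast_getpid()
afterwards. Note that this user-space copy goes stale after fork(),
and it is per-process rather than per-thread; a kernel-maintained
page would not have the first problem, and the second is exactly the
CLONE_VM question above.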

pthread_self() and TLS would be along the same lines: at thread
startup you use a system call to set the pthread descriptor pointer
or the TLS pointer, or embed the TLS in what pthread_self points
to ...

pthread_t pthread_self (void)
{
	return (pthread_t) task_info_page->pthread_self;
}

// I just made this one up
void * __pthread_get_tls (void)
{
	return task_info_page->tls;
	// or: return pthread_self()->tls;
}

This would solve the TLS issues for most (if not all)
platforms, and improve the "system call" performance of
many of them for certain calls. However, I don't know
whether it is worth that extra page, or how much useful
information we can stuff in there so that not too much
space is wasted.

Iñaky Pérez-González -- Not speaking for Intel -- all opinions are my own
(and my fault)
