This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.

Re: [RFC] Statistics of non-ASCII characters in strings

From: "H.J. Lu" <hjl dot tools at gmail dot com>
To: "Carlos O'Donell" <carlos at redhat dot com>
Cc: Wilco Dijkstra <wdijkstr at arm dot com>, GNU C Library <libc-alpha at sourceware dot org>
Date: Tue, 23 Dec 2014 08:50:34 -0800
Subject: Re: [RFC] Statistics of non-ASCII characters in strings
Authentication-results: sourceware.org; auth=none
References: <001401d01df6$0f7cc5a0$2e7650e0$ at com> <54998EA5 dot 3020606 at redhat dot com>

On Tue, Dec 23, 2014 at 7:47 AM, Carlos O'Donell <carlos@redhat.com> wrote:
> On 12/22/2014 09:46 AM, Wilco Dijkstra wrote:
>> Does anyone have statistics on how often strings contain non-ASCII
>> characters? I'm asking because it's feasible to make many string
>> functions faster when their inputs are predominantly ASCII, by using a
>> different check for the null byte. So if, say, 80-90% of the strings
>> passed to strcpy/strlen are ASCII, it would be well worth optimizing for.
>
> I don't know that anyone has this data.
>
> However, it brings us to a discussion on whole system benchmarking and
> data gathering.
>
> Your particular question is about the average workload, for which there
> is no real consensus yet. Note that Ondrej has posted patches for a whole
> system benchmarking framework based on his LD_PRELOAD libraries. I think
> that or a systemtap-based framework are sensible solutions. I don't care
> which goes forward really, but with such a path forward we might start
> getting users to run the whole system benchmark in data-gathering mode
> with a global LD_PRELOAD and provide us with raw or aggregate data.
>

You can use LD_AUDIT to collect such information on your
system.
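
Whichever interception mechanism is used, the statistic itself is cheap
to compute per string. The sketch below is only an illustration and not
code from glibc or from any patch in this thread: has_non_ascii() is a
hypothetical helper that a collection hook could call on each string
argument, and the two word-at-a-time tests show one well-known reading
of the "different check for the null byte" mentioned in the quoted text.
That this is the check Wilco has in mind is an assumption. All names
here are made up for the example.

```c
#include <stdint.h>

#define ONES  UINT64_C(0x0101010101010101)
#define HIGHS UINT64_C(0x8080808080808080)

/* Exact test: nonzero iff some byte of x is zero, for any byte values. */
static inline uint64_t has_zero_exact(uint64_t x)
{
    return (x - ONES) & ~x & HIGHS;
}

/* Cheaper test (one operation fewer). It never misses a zero byte, but
   it can report a false positive when x contains a byte above 0x80, so
   it is exact only for ASCII data. */
static inline uint64_t has_zero_ascii(uint64_t x)
{
    return (x - ONES) & HIGHS;
}

/* Hypothetical statistic for a collection hook:
   returns 1 if s contains any byte >= 0x80, else 0. */
static int has_non_ascii(const char *s)
{
    for (const unsigned char *p = (const unsigned char *)s; *p != 0; p++)
        if (*p & 0x80)
            return 1;
    return 0;
}
```

A string routine could run has_zero_ascii() on each word and fall back
to has_zero_exact() only when it fires. Whether that pays off depends on
how rarely the fallback is taken, which is the number has_non_ascii()
would let you estimate from real workloads.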


-- 
H.J.

Follow-Ups:
- Re: [RFC] Statistics of non-ASCII characters in strings
  - From: Carlos O'Donell

References:
- [RFC] Statistics of non-ASCII characters in strings
  - From: Wilco Dijkstra
- Re: [RFC] Statistics of non-ASCII characters in strings
  - From: Carlos O'Donell
