This is the mail archive of the libc-help@sourceware.org mailing list for the glibc project.



Re: Identifying when collations change


On 09 Jul 2015 10:15, Ondřej Bílka wrote:
> On Wed, Jul 08, 2015 at 02:10:40AM -0400, Mike Frysinger wrote:
> > On 03 Jul 2015 15:16, Craig Ringer wrote:
> > > The PostgreSQL database relies on the collation support of the
> > > underlying platform, which in GNU/Linux is glibc. This works very well
> > > for most purposes, but a problem arises when the collation rules are
> > > updated by the platform due to bug fixes or changes in accepted
> > > language rules.
> > > 
> > > PostgreSQL builds persistent on-disk b-tree indexes by executing the
> > > system C library collation functions - strcoll or strcoll_l. Correct
> > > searching of these indexes requires that the C library collation
> > > function behaviour be pure and immutable, i.e. that any two calls over
> > > any time period will return the same result for any given input.
> > > Collation updates break that assumption, and indexes must be rebuilt
> > > (REINDEXed) to ensure correct queries.
> > > 
> > > If PostgreSQL had a way to detect when the collation definition an
> > > index was built with differed from the current collation definition it
> > > would be very helpful, as we could then alert users to the situation,
> > > or even repair the index if we could tell *what* changed, not just
> > > that something changed.
> > 
> > i don't know about a portable answer, but perhaps extending nl_langinfo would
> > be more on the painless side of things ?  adding a GNU-specific keyword that'd
> > return a hash of the collation data so you could easily check. </naive>
> 
> A simple solution would be checking libc.so timestamp and reindexing when it
> changes, would reindexing once per year if user regularly updates
> matter?

that requires figuring out the right active libc.so file (certainly hardcoding
something like /usr/lib/libc.so would be a terrible idea), and it would trigger
more reindexing than necessary.  i don't think collation updates are that
frequent, and they certainly tend to be restricted to specific languages rather
than all of them.
-mike
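
For illustration only: the "hash of the collation data" idea above is a
proposal, and no such nl_langinfo keyword is assumed to exist. A rough
user-space approximation is to hash the strxfrm_l() sort keys of a fixed set
of probe strings per locale and compare the result with a stored value. This
is a minimal sketch under stated assumptions: the function name
collation_fingerprint and the probe list are hypothetical, and the check can
only notice changes that affect the probes, so a matching fingerprint does
not prove the collation is unchanged.

```c
#define _POSIX_C_SOURCE 200809L
#include <locale.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical probe set; a real one would need far broader coverage. */
static const char *const probes[] = {
    "a", "B", "ab", "a b", "e", "z", "co-op", "coop"
};

/* Fold n bytes into a running 64-bit FNV-1a hash. */
static uint64_t fnv1a(uint64_t h, const unsigned char *p, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        h ^= p[i];
        h *= 0x100000001b3ULL;
    }
    return h;
}

/*
 * Hash the sort keys that the named locale's LC_COLLATE rules produce for
 * the probe strings. Returns 0 and stores the hash in *out on success,
 * or -1 if the locale cannot be loaded or memory runs out.
 */
int collation_fingerprint(const char *locale_name, uint64_t *out)
{
    locale_t loc = newlocale(LC_COLLATE_MASK, locale_name, (locale_t)0);
    if (loc == (locale_t)0)
        return -1;

    uint64_t h = 0xcbf29ce484222325ULL;   /* FNV-1a offset basis */
    int rc = 0;

    for (size_t i = 0; i < sizeof probes / sizeof probes[0]; i++) {
        /* First call only measures the key length (excluding the NUL). */
        size_t need = strxfrm_l(NULL, probes[i], 0, loc);
        unsigned char *key = malloc(need + 1);
        if (key == NULL) {
            rc = -1;
            break;
        }
        strxfrm_l((char *)key, probes[i], need + 1, loc);
        /* Include the terminating NUL so adjacent keys stay delimited. */
        h = fnv1a(h, key, need + 1);
        free(key);
    }

    freelocale(loc);
    if (rc == 0)
        *out = h;
    return rc;
}
```

A caller would store the fingerprint next to each index at build time and
compare it at startup. Because the check is per locale, a change in one
language's rules would only flag indexes built with that locale, rather than
everything whenever libc.so is touched.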


