This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.


Re: utmp

From: Albert ARIBAUD <albert dot aribaud at 3adev dot fr>
To: Arnd Bergmann <arnd at arndb dot de>
Cc: libc-alpha at sourceware dot org, Joseph Myers <joseph at codesourcery dot com>, Szabolcs Nagy <szabolcs dot nagy at arm dot com>, Yury Norov <ynorov at caviumnetworks dot com>, Andreas Schwab <schwab at suse dot de>, nd at arm dot com, vapier at gentoo dot org, cmetcalf at tilera dot com, pinskia at gmail dot com, cmetcalf at mellanox dot com, bamvor dot zhangjian at huawei dot com, catalin dot marinas at arm dot com, fweimer at redhat dot com, Prasun dot Kapoor at cavium dot com, maxim dot kuvyrkov at linaro dot org
Date: Wed, 29 Jun 2016 10:41:00 +0200
Subject: Re: utmp
Authentication-results: sourceware.org; auth=none
References: <1467103498-24243-1-git-send-email-ynorov at caviumnetworks dot com> <6475909 dot qlLPY2oMo7 at wuerfel> <20160629090502 dot 32ad1b09 dot albert dot aribaud at 3adev dot fr> <19862959 dot F2JjHcFzWe at wuerfel>

Hi Arnd,

On Wed, 29 Jun 2016 10:12:59 +0200, Arnd Bergmann <arnd@arndb.de> wrote:

> On Wednesday, June 29, 2016 9:05:02 AM CEST Albert ARIBAUD wrote:
> > On Tue, 28 Jun 2016 23:49:11 +0200, Arnd Bergmann <arnd@arndb.de> wrote:
> > > On Tuesday, June 28, 2016 11:08:53 PM CEST Albert ARIBAUD wrote:  
>  [...]  
> > > 
> > > Nothing wrong with this approach, just another idea from how we handle
> > > upgrading on-disk structures elsewhere:
> > > 
> > > In ext4, we use the two upper bits of the 32-bit nanoseconds to
> > > extend the seconds. This way a structure that stores signed
> > > seconds as 32 bit and can normally represent the range between 1902
> > > and 2038 gets extended by another 3*136 years, so we can go until
> > > 2446 with a backwards-compatible extension. If we treat the on-disk
> > > seconds as unsigned (we know that pre-1970 times are all guaranteed
> > > to be invalid for utmp), we get another 68 years.
> > > 
> > > Obviously this is a trick that only works when you have full control
> > > over all code that reads or writes the timestamps, and it doesn't
> > > solve the endianness problem, but it avoids introducing an incompatible
> > > format. Also, there is no reason for new architectures to do that,
> > > they should just do what you describe above.  
> > 
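[A minimal sketch of the arithmetic behind the extension described above, assuming a signed 32-bit seconds field plus a 2-bit epoch counter kept in spare bits of the nanoseconds field. The name `extend_seconds` is hypothetical, and the exact bit layout ext4 uses on disk is not reproduced here.]

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: widen a signed 32-bit seconds value using a
   2-bit epoch counter.  Each epoch step adds 2^32 seconds (about 136
   years), so epochs 1..3 add up to 3 * 136 years beyond 2038.  */
int64_t
extend_seconds (int32_t sec32, unsigned int epoch)
{
  assert (epoch < 4);   /* only two bits are available */
  return (int64_t) sec32 + ((int64_t) epoch << 32);
}
```

With epoch 3 and seconds at INT32_MAX the result is 15032385535 seconds after 1970, which falls in the year 2446.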
> > As far as choosing between tricks, I would favor switching to an
> > unsigned 32-bit timestamp, which would be much simpler to code and less
> > run-time-error prone.
> > 
> > But this only buys us more time beyond Y2038 to make a clean
> > transition to the full 64-bit solution, and we already have 22 years
> > for that. In my experience, pushing the 'doomsdate' further out
> > just makes the issue more likely to be considered somebody else's
> > problem (SEP) and not be fixed in time.
> > 
> > Plus, this transition would still be temporary, and we would still need
> > a transition from the 'old' (now 'possibly extended') format to the
> > 'new' one.  
> 
> The idea would be that the transition can be done simply by
> reinterpreting the current values. If we were to pick the
> simplest method (using unsigned seconds instead of signed),
> the only changes we would see are:
> 
> - timestamps previously written with an incorrect pre-1970 timestamp
>   would become interpreted as incorrect post-2038 timestamps with
>   a new glibc. They are incorrect either way, so we can choose to
>   ignore this case
> 
> - correct post-2038 timestamps written by a new glibc would get
>   interpreted as incorrect pre-1970 timestamps by an old glibc
>   build. This is slightly more relevant than the first, but you
>   could argue that it is no worse than not being able to read the
>   file at all, which is what you get with an incompatible
>   format change.
> 
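[A minimal sketch of the two readings compared above, assuming the stored field is 32 bits wide. The function names are hypothetical and are not part of any glibc interface.]

```c
#include <stdint.h>

/* 'Old' reading: the stored 32 bits are a signed count of seconds.
   Converting an out-of-range value to int32_t is implementation-defined
   in C; this sketch assumes the usual two's-complement wraparound.  */
int64_t
read_seconds_signed (uint32_t raw)
{
  return (int64_t) (int32_t) raw;
}

/* 'New' reading: the same 32 bits are an unsigned count of seconds.  */
int64_t
read_seconds_unsigned (uint32_t raw)
{
  return (int64_t) raw;
}
```

Both readings agree for every value up to INT32_MAX, so existing valid records are unaffected. They differ only above that: the bit pattern 0x80000000 is the first second past the 2038 limit under the unsigned reading and a date in December 1901 under the signed one, which is exactly the pair of mismatches listed above.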
> > Actually, the problem is only between a GLIBC upgrade and the next
> > reboot, where utmp will be re-created, including the shutdown phase
> > itself, where some code might want to access the ('old') utmp file but
> > might accidentally use the new GLIBC.  
> 
> What about the case of having multiple glibc installations on the
> same machine: 
> 
> With a multiarch installation that has both 32-bit and 64-bit
> binaries, both environments are accessing the same file, but
> they might not be the same version.
> 
> > A solution could be to reserve utmp, wtmp and btmp for the 'old' format
> > files and use utmpx, wtmpx and btmpx for the 'new' format files (which
> > would better match the POSIX header and struct name btw), keep the
> > 'old' API implementations alongside the 'new' ones, and choose 'old' or
> > 'new' as follows:
> > 
> > - (future) nominal case: if the utmp file does not exist, only use the
> >   utmpx file.
> > 
> > - transitioning: if the utmp file exists, use it for reading, and write
> >   to both the utmp and utmpx files.
> > 
> > Pros:
> > 
> > - as long as the utmp file exists (and as long as we don't reach
> >   Y2038...) the system remains (as) compatible (as it was before) with
> >   any application code written to handle utmp records.
> > 
> > - development versions of applications which read utmp can be tested
> >   against the utmpx file while the system itself still runs on the utmp
> >   file.
> > 
> > - transitioning is under system control: remove utmp when the whole
> >   system is ready for it.
> > 
> > Cons:
> > 
> > - this forces the GLIBC utmp code to keep the 'old' code for some
> >   time, which is a code phrase for 'future bitrot'.
> > 
> > - this doubles utmp-related file I/O. This might not be desirable on
> >   systems where there is a high rate of logins and logouts.
> > 
> > While we cannot avoid bitrot, we can at least make sure the whole
> > compatible code is kept within conditionals so that it can be easily
> > disabled, then removed; the trigger for disabling/removing should be
> > explicitly stated (in code as well as in docs) as "make sure this
> > is removed before Y2038 happens".
> > 
> > Regarding the file I/O rise, it cannot be avoided, and will have to be
> > contained by planning the system upgrade so that the transition is as
> > short as possible.  
> 
> I think this approach can work, but I find the utmpx naming really confusing
> here: As you say, it uses the same naming as the POSIX utmpx structure,
> but it really does something else, as we'd likely keep both struct utmp
> and struct utmpx identical, except that we'd get a new version of both
> for the library interface.

Fair points. How about this:

- use file names utmp and utmp.trans

- (future) nominal case: if the utmp.trans file does not exist, then use
  the utmp file for reading and writing in the 'new' format.

- transitioning: if the utmp.trans file exists, then use the utmp file
  for reading in the 'old' format, and write to both the utmp file in
  'old' format and the utmp.trans file in 'new' format respectively.

Switching from transitional state to nominal state would amount to
renaming the utmp.trans file as utmp.

This way, the 'new' file name becomes unrelated to the POSIX struct
name, and any 'old' and 'new' GLIBCs on the system will agree on the
utmp file structure ('old' for all) while the system is in the
transitioning state.

Of course, the switch from transitioning to nominal state requires that
all GLIBCs on the system be 'new', since at this point, the utmp file
will have 'new' format for all.
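
To make the two states concrete, here is a rough sketch of the selection logic, not a proposed implementation. All names (`utmp_plan`, `utmp_choose_plan`, the enum values) are made up for illustration, and the path of the utmp.trans file is passed in by the caller rather than assumed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <unistd.h>

enum utmp_format { UTMP_FORMAT_OLD, UTMP_FORMAT_NEW };

/* Hypothetical description of what the library should do.  */
struct utmp_plan
{
  enum utmp_format utmp_file_format;  /* format used for the utmp file */
  bool write_trans;                   /* also write 'new' records to utmp.trans */
};

/* Decide between nominal and transitioning state from the mere
   existence of the utmp.trans file.  Any access() failure is treated
   as "file does not exist", which is a simplification.  */
struct utmp_plan
utmp_choose_plan (const char *trans_path)
{
  struct utmp_plan plan;

  assert (trans_path != NULL);
  if (access (trans_path, F_OK) == 0)
    {
      /* Transitioning: utmp stays 'old', utmp.trans receives 'new'.  */
      plan.utmp_file_format = UTMP_FORMAT_OLD;
      plan.write_trans = true;
    }
  else
    {
      /* Nominal: utmp is read and written in the 'new' format only.  */
      plan.utmp_file_format = UTMP_FORMAT_NEW;
      plan.write_trans = false;
    }
  return plan;
}
```

The point of keying on the file's existence is that the switch to the nominal state needs no configuration change: renaming utmp.trans to utmp flips the result of the check.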

Regards,
Albert ARIBAUD
3ADEV

References:
- [RFC PATCH] AARCH64/ILP32: introduce kernel time types
  - From: Yury Norov
- Re: utmp (was: [RFC PATCH] AARCH64/ILP32: introduce kernel time types)
  - From: Arnd Bergmann
- Re: utmp
  - From: Albert ARIBAUD
- Re: utmp
  - From: Arnd Bergmann
