Bug 6634

Summary: pututline_file() corrupts utmp file when internal_getut_r() returns -1 due to LOCK_FILE timeout.
Product: glibc Reporter: Ryan S. Arnold <rsa>
Component: libcAssignee: Ulrich Drepper <drepper.fsp>
Status: RESOLVED FIXED    
Severity: normal CC: fweimer, garyhade, glibc-bugs, suzuki
Priority: P2 Flags: fweimer: security-
Version: unspecified   
Target Milestone: ---   
Host: Target:
Build: Last reconfirmed:
Attachments: pututline() currupts the utmp file on fcntl() lock time out
Patch for the pututline() for fcntl() lock time out

Description Ryan S. Arnold 2008-06-11 21:19:32 UTC
The 'pututline' utmp entry insertion function can corrupt
the utmp file with duplicate entries when the system is busy.

Root cause:

If the utmp entry already exists and the timeout for the lock (re:LOCK_FILE) in
internal_getur_r() expires internal_getur_r() returns -1 to pututline_file().


The pututline_file() function incorrectly interprets the -1 return value as
"entry not found" rather than "lock timed out" AND then incorrectly appends a
duplicate entry to the utmp file.

This can happen whenever there are simultaneous pututline executions and the
system is under a high enough load to cause the lock to time out.

The program logic needs to be repaired to assure that a lock timeout will never
cause a duplicate entry to be added.

Perhaps a backoff algorithm could be used to retry the lock or the lock timeout
could be increased as well to accommodate systems under greater stress.

The problem was identified on an x86 machine.

The macro (FILE_LOCK) expanded version of internal_getur_r follows:

internal_getut_r (const struct utmp *id, struct utmp *buffer)
{
  int result = -1;

  {
    struct flock fl;
    struct sigaction action, old_action;
    unsigned int old_timeout;
    old_timeout = alarm (0);
    action.__sigaction_handler.sa_handler = timeout_handler;
    (__builtin_memset (&action.sa_mask, '\0', sizeof (sigset_t)), 0);
    action.sa_flags = 0;
    __sigaction (14, &action, &old_action);
    alarm (1);
    memset (&fl, '\0', sizeof (struct flock));
    fl.l_type = (0);
    fl.l_whence = 0;
    if (__fcntl_nocancel ((file_fd), 7, &fl) < 0)
      goto unalarm_return;

    if (id->ut_type == 1 || id->ut_type == 2
        || id->ut_type == 4 || id->ut_type == 3)
      {
        while (1)
          {
            if (__read_nocancel (file_fd, buffer, sizeof (struct utmp))
                != sizeof (struct utmp))
              {
                (__libc_errno = (3));
                file_offset = -1l;
                goto unlock_return;
              }
            file_offset += sizeof (struct utmp);

            if (id->ut_type == buffer->ut_type)
              break;
          }
      }
    else
      {
        while (1)
          {
            if (__read_nocancel (file_fd, buffer, sizeof (struct utmp))
                != sizeof (struct utmp))
              {
                (__libc_errno = (3));
                file_offset = -1l;
                goto unlock_return;
              }
            file_offset += sizeof (struct utmp);
            if (__utmp_equal (buffer, id))
              break;
          }
      }
    result = 0;

  unlock_return:
    fl.l_type = 2;
    __fcntl_nocancel ((file_fd), 7, &fl);
  unalarm_return:alarm (0);
    __sigaction (14, &old_action, ((void *) 0));
    if (old_timeout != 0)
      alarm (old_timeout);
  }
  while (0);

  return result;
}
Comment 1 Halesh S 2008-06-26 07:01:54 UTC
Created attachment 2800 [details]
pututline() currupts the utmp file on fcntl() lock time out


In some scenarios pututline (3) may corrupt the utmp like with heavy system
load environment and using pututline frequently.

The failure analysis is in login/utmp_file.c internal_getut_r() function is not
retaining the  lock error because of fcntl() failure so, pututline() is failing
to differentiate between utline not exists and utmp lock time out. As its
returning -1 in both cases. In both cases it treats as utline not found it
appends at the last.

Copy /var/run/utmp to current directory where you are executing tests.

who o/p Before Executing tests
==============================

$ who ./utmp
chill	 tty1	      May 15 18:53
halesh	 pts/2	      Jun 20 12:28
halesh	 pts/4	      Jun 20 13:00

who o/p After executing tests
=============================

$ who ./utmp
chill	 tty1	      May 15 18:53
halesh	 pts/2	      Jun 20 12:28
halesh	 pts/4	      Jun 20 13:00
chill	 tty1	      May 15 18:53     <- ** THE FIRST LOGIN HAS BEEN APPENDED.


**utmp got currupted becuase of adding firstlogin entry again at the last
instead of replacing it.

If you are not able to reproduce please increase the LOOP macro in testcase.
Comment 2 Halesh S 2008-06-27 07:13:06 UTC
Created attachment 2802 [details]
Patch for the pututline() for fcntl() lock time out


After applaying the attached patch..

who o/p Before Executing tests
==============================
$who ./utmp
chill	 tty1	      May 15 18:53
halesh	 pts/4	      Jun 20 13:00 (43.88.101.161)


who o/p After executing tests
=============================
$who ./utmp
chill	 tty1	      May 15 18:53
halesh	 pts/4	      Jun 20 13:00 

I have tested regression tests for adding new entry and replacing entries using
pututline() with patch and wroking fine.
Comment 3 Halesh S 2008-06-27 07:22:39 UTC
(In reply to comment #2)
> Created an attachment (id=2802)
> Patch for the pututline() for fcntl() lock time out
> 
> 
> After applaying the attached patch..
> 
> who o/p Before Executing tests
> ==============================
> $who ./utmp
> chill	 tty1	      May 15 18:53
> halesh	 pts/4	      Jun 20 13:00 (43.88.101.161)
> 
> 
> who o/p After executing tests
> =============================
> $who ./utmp
> chill	 tty1	      May 15 18:53
> halesh	 pts/4	      Jun 20 13:00 

Last part of who(1) o/p of second entry got missed while copying in text box...

 who o/p After Executing tests
 ==============================
 $who ./utmp
 chill	 tty1	      May 15 18:53
 halesh	 pts/4	      Jun 20 13:00 (43.88.101.161)

> 
> I have tested regression tests for adding new entry and replacing entries 
using
> pututline() with patch and wroking fine.

Comment 4 Ulrich Drepper 2008-08-14 04:24:32 UTC
I checked in a slightly modified patch.
Comment 5 Halesh S 2008-08-14 09:26:59 UTC
Thanks.