Bug 6634 - pututline_file() corrupts utmp file when internal_getut_r() returns -1 due to LOCK_FILE timeout.
Summary: pututline_file() corrupts utmp file when internal_getut_r() returns -1 due to...
Status: RESOLVED FIXED
Alias: None
Product: glibc
Classification: Unclassified
Component: libc (show other bugs)
Version: unspecified
: P2 normal
Target Milestone: ---
Assignee: Ulrich Drepper
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2008-06-11 21:19 UTC by Ryan S. Arnold
Modified: 2014-07-04 06:55 UTC (History)
4 users (show)

See Also:
Host:
Target:
Build:
Last reconfirmed:
fweimer: security-


Attachments
pututline() currupts the utmp file on fcntl() lock time out (514 bytes, text/plain)
2008-06-26 07:01 UTC, Halesh S
Details
Patch for the pututline() for fcntl() lock time out (607 bytes, patch)
2008-06-27 07:13 UTC, Halesh S
Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Ryan S. Arnold 2008-06-11 21:19:32 UTC
The 'pututline' utmp entry insertion function can corrupt
the utmp file with duplicate entries when the system is busy.

Root cause:

If the utmp entry already exists and the timeout for the lock (re:LOCK_FILE) in
internal_getur_r() expires internal_getur_r() returns -1 to pututline_file().


The pututline_file() function incorrectly interprets the -1 return value as
"entry not found" rather than "lock timed out" AND then incorrectly appends a
duplicate entry to the utmp file.

This can happen whenever there are simultaneous pututline executions and the
system is under a high enough load to cause the lock to time out.

The program logic needs to be repaired to assure that a lock timeout will never
cause a duplicate entry to be added.

Perhaps a backoff algorithm could be used to retry the lock or the lock timeout
could be increased as well to accommodate systems under greater stress.

The problem was identified on an x86 machine.

The macro (FILE_LOCK) expanded version of internal_getur_r follows:

internal_getut_r (const struct utmp *id, struct utmp *buffer)
{
  int result = -1;

  {
    struct flock fl;
    struct sigaction action, old_action;
    unsigned int old_timeout;
    old_timeout = alarm (0);
    action.__sigaction_handler.sa_handler = timeout_handler;
    (__builtin_memset (&action.sa_mask, '\0', sizeof (sigset_t)), 0);
    action.sa_flags = 0;
    __sigaction (14, &action, &old_action);
    alarm (1);
    memset (&fl, '\0', sizeof (struct flock));
    fl.l_type = (0);
    fl.l_whence = 0;
    if (__fcntl_nocancel ((file_fd), 7, &fl) < 0)
      goto unalarm_return;

    if (id->ut_type == 1 || id->ut_type == 2
        || id->ut_type == 4 || id->ut_type == 3)
      {
        while (1)
          {
            if (__read_nocancel (file_fd, buffer, sizeof (struct utmp))
                != sizeof (struct utmp))
              {
                (__libc_errno = (3));
                file_offset = -1l;
                goto unlock_return;
              }
            file_offset += sizeof (struct utmp);

            if (id->ut_type == buffer->ut_type)
              break;
          }
      }
    else
      {
        while (1)
          {
            if (__read_nocancel (file_fd, buffer, sizeof (struct utmp))
                != sizeof (struct utmp))
              {
                (__libc_errno = (3));
                file_offset = -1l;
                goto unlock_return;
              }
            file_offset += sizeof (struct utmp);
            if (__utmp_equal (buffer, id))
              break;
          }
      }
    result = 0;

  unlock_return:
    fl.l_type = 2;
    __fcntl_nocancel ((file_fd), 7, &fl);
  unalarm_return:alarm (0);
    __sigaction (14, &old_action, ((void *) 0));
    if (old_timeout != 0)
      alarm (old_timeout);
  }
  while (0);

  return result;
}
Comment 1 Halesh S 2008-06-26 07:01:54 UTC
Created attachment 2800 [details]
pututline() currupts the utmp file on fcntl() lock time out


In some scenarios pututline (3) may corrupt the utmp like with heavy system
load environment and using pututline frequently.

The failure analysis is in login/utmp_file.c internal_getut_r() function is not
retaining the  lock error because of fcntl() failure so, pututline() is failing
to differentiate between utline not exists and utmp lock time out. As its
returning -1 in both cases. In both cases it treats as utline not found it
appends at the last.

Copy /var/run/utmp to current directory where you are executing tests.

who o/p Before Executing tests
==============================

$ who ./utmp
chill	 tty1	      May 15 18:53
halesh	 pts/2	      Jun 20 12:28
halesh	 pts/4	      Jun 20 13:00

who o/p After executing tests
=============================

$ who ./utmp
chill	 tty1	      May 15 18:53
halesh	 pts/2	      Jun 20 12:28
halesh	 pts/4	      Jun 20 13:00
chill	 tty1	      May 15 18:53     <- ** THE FIRST LOGIN HAS BEEN APPENDED.


**utmp got currupted becuase of adding firstlogin entry again at the last
instead of replacing it.

If you are not able to reproduce please increase the LOOP macro in testcase.
Comment 2 Halesh S 2008-06-27 07:13:06 UTC
Created attachment 2802 [details]
Patch for the pututline() for fcntl() lock time out


After applaying the attached patch..

who o/p Before Executing tests
==============================
$who ./utmp
chill	 tty1	      May 15 18:53
halesh	 pts/4	      Jun 20 13:00 (43.88.101.161)


who o/p After executing tests
=============================
$who ./utmp
chill	 tty1	      May 15 18:53
halesh	 pts/4	      Jun 20 13:00 

I have tested regression tests for adding new entry and replacing entries using
pututline() with patch and wroking fine.
Comment 3 Halesh S 2008-06-27 07:22:39 UTC
(In reply to comment #2)
> Created an attachment (id=2802)
> Patch for the pututline() for fcntl() lock time out
> 
> 
> After applaying the attached patch..
> 
> who o/p Before Executing tests
> ==============================
> $who ./utmp
> chill	 tty1	      May 15 18:53
> halesh	 pts/4	      Jun 20 13:00 (43.88.101.161)
> 
> 
> who o/p After executing tests
> =============================
> $who ./utmp
> chill	 tty1	      May 15 18:53
> halesh	 pts/4	      Jun 20 13:00 

Last part of who(1) o/p of second entry got missed while copying in text box...

 who o/p After Executing tests
 ==============================
 $who ./utmp
 chill	 tty1	      May 15 18:53
 halesh	 pts/4	      Jun 20 13:00 (43.88.101.161)

> 
> I have tested regression tests for adding new entry and replacing entries 
using
> pututline() with patch and wroking fine.

Comment 4 Ulrich Drepper 2008-08-14 04:24:32 UTC
I checked in a slightly modified patch.
Comment 5 Halesh S 2008-08-14 09:26:59 UTC
Thanks.