[PATCH] LC_COLLATE: Fix last character ellipsis handling (Bug 22668)

Carlos O'Donell carlos@redhat.com
Mon Apr 26 12:39:51 GMT 2021


On 3/1/21 12:18 PM, Florian Weimer wrote:
> * Carlos O'Donell via Libc-alpha:
> 
>> From: Hanataka Shinya <hanataka.shinya@gmail.com>
>>
>> During ellipsis processing the collation cursor was not moved to
>> the end of the ellipsis once the expansion was done.
>>
>> The code inserted the new entry after the cursor, but before the
>> real end of the ellipsis:
>>                                 [cursor]
>> ... element_t <-> element_t <-> element_t <-> element_t
>>                   "<U0000>"     "<U0001>"     "<U007F>"
>>                   startp                      endp
>>
>> At the end of the function we have:
>>
>>                   [cursor]
>> ... element_t <-> element_t <-> element_t
>>                   "<U007E>"     "<U007F>"
>>                                 endp
>>
>> The cursor should be pointing at endp, the last element in the
>> doubly-linked list, otherwise when execution returns to the
>> caller we will start inserting the next line after <U007E>.
>>
>> Subsequent operations end up unlinking the ellipsis end entry or
>> leaving it dangling at the end of the list.  This kind of
>> dangling is immediately visible in C.UTF-8 with the following
>> sort order from strcoll:
>> <U0010FFFF>
>> <U0000FFFF>
>> <U000007FF>
>> <U0000007F>
>>
>> With the cursor adjusted, the end entry is placed at the right
>> location and therefore gets the right weight.
>>
>> No regressions on x86_64 and i686.
>>
>> Co-authored-by: Carlos O'Donell <carlos@redhat.com>
>> ---
>>  locale/programs/ld-collate.c | 3 +++
>>  1 file changed, 3 insertions(+)
>>
>> diff --git a/locale/programs/ld-collate.c b/locale/programs/ld-collate.c
>> index 0af21e05e2..b6406b775d 100644
>> --- a/locale/programs/ld-collate.c
>> +++ b/locale/programs/ld-collate.c
>> @@ -1483,6 +1483,9 @@ order for `%.*s' already defined at %s:%Zu"),
>>  	    }
>>  	}
>>      }
>> +  /* Move the cursor to the last entry in the ellipsis.
>> +     Subsequent operations need to start from the last entry.  */
>> +  collate->cursor = endp;
>>  }
> 
> I do not completely understand the code, but I double-checked a few
> things, and this looks consistent.  So I guess it's okay to check this
> in.

Re-tested, no regressions. Pushed. Thanks.
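
For anyone reading the archive later, the cursor problem described in the
commit message can be illustrated with a small stand-alone sketch.  This is
only a simplified model, not the ld-collate.c code: `struct elem`,
`struct coll`, `insert_after_cursor` and `demo_order` are hypothetical names,
the ellipsis expands to a single middle entry, and `<U0080>` stands in for
whatever line follows the ellipsis in the locale source.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical, simplified stand-ins for the localedef structures.  */
struct elem
{
  const char *name;
  struct elem *prev, *next;
};

struct coll
{
  struct elem *cursor;	/* New entries are linked in after this node.  */
};

/* Link E into the list right after the cursor and advance the cursor.  */
static void
insert_after_cursor (struct coll *c, struct elem *e)
{
  e->prev = c->cursor;
  e->next = c->cursor->next;
  if (c->cursor->next != NULL)
    c->cursor->next->prev = e;
  c->cursor->next = e;
  c->cursor = e;
}

/* Expand "<U0000> ... <U007F>" with one middle entry, then add the
   entry for the following line.  The resulting list order is written
   to OUT as a space-separated string.  */
static const char *
demo_order (int move_cursor_to_endp, char *out, size_t outlen)
{
  struct elem startp = { "<U0000>", NULL, NULL };
  struct elem mid = { "<U0001>", NULL, NULL };
  struct elem endp = { "<U007F>", NULL, NULL };
  struct elem next_line = { "<U0080>", NULL, NULL };
  struct coll c;

  assert (outlen > 0);

  /* Both ends of the ellipsis are already linked together.  */
  startp.next = &endp;
  endp.prev = &startp;

  /* Ellipsis expansion: the cursor ends up on the last generated
     entry, which sits just before endp.  */
  c.cursor = &startp;
  insert_after_cursor (&c, &mid);

  /* The idea of the fix: continue from the real end of the ellipsis.  */
  if (move_cursor_to_endp)
    c.cursor = &endp;

  /* The caller then processes the next line.  */
  insert_after_cursor (&c, &next_line);

  out[0] = '\0';
  for (const struct elem *e = &startp; e != NULL; e = e->next)
    {
      size_t used = strlen (out);
      snprintf (out + used, outlen - used, "%s%s", used ? " " : "", e->name);
    }
  return out;
}
```

Calling `demo_order (0, buf, sizeof buf)` with a 64-byte buffer shows the
following entry landing in front of `<U007F>`, while `demo_order (1, ...)`
keeps `<U007F>` in place and appends the new entry after it.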


-- 
Cheers,
Carlos.


