This is the mail archive of the mailing list for the elfutils project.


`strings` mis-behavior


I thought about reading strings.c, but the code turned out to be 
surprisingly complex. I understand that it's tuned for speed, but from a 
security POV I doubt that I would want my default `strings` to be that 
complex. I ended up looking mostly at the process_chunk and 
read_block_no_mmap functions (probably not the most tested code path). 
Some notes follow. I don't think any of them is a security issue.

1. You cannot override one -e option with another. E.g. `strings -e b 
-e l` still leaves "big_endian = true".
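
For illustration, a minimal sketch of -e handling in which the last option 
wins. The struct and the set_encoding helper are hypothetical names, not 
taken from strings.c; the selector letters are the usual s/S/b/l/B/L ones. 
Every field is assigned on each call, so nothing from an earlier -e can 
leak through.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical option state; the field names are illustrative and not
   taken from strings.c.  */
struct encoding
{
  size_t bytes_per_char;
  bool big_endian;
  bool char_7bit;
};

/* Apply one -e selector.  All fields are rebuilt from scratch, so a
   later -e fully overrides an earlier one.  Returns false (and leaves
   *ENC untouched) for an unknown selector.  */
bool
set_encoding (struct encoding *enc, char selector)
{
  struct encoding next = { 1, false, false };

  switch (selector)
    {
    case 's': next.char_7bit = true; break;
    case 'S': break;
    case 'b': next.bytes_per_char = 2; next.big_endian = true; break;
    case 'l': next.bytes_per_char = 2; break;
    case 'B': next.bytes_per_char = 4; next.big_endian = true; break;
    case 'L': next.bytes_per_char = 4; break;
    default: return false;
    }

  *enc = next;
  return true;
}
```

Calling set_encoding for 'b' and then for 'l' ends with big_endian false, 
which is what I would expect from `strings -e b -e l`.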

2. A string at the end of the file is not printed:

$ printf abcd | ./strings
(no output)
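
A minimal sketch of the behavior I would expect, assuming a simple 
single-byte scanner (count_strings is a made-up name, not a function from 
strings.c): end of input terminates a pending run just like a 
non-printable byte does.

```c
#include <ctype.h>
#include <stddef.h>

/* Count runs of at least MIN_LEN printable bytes in BUF.  Illustrative
   only; the name and interface are not from strings.c.  */
size_t
count_strings (const unsigned char *buf, size_t len, size_t min_len)
{
  size_t count = 0;
  size_t run = 0;

  for (size_t i = 0; i < len; ++i)
    {
      if (isprint (buf[i]) || buf[i] == '\t')
        ++run;
      else
        {
          if (run >= min_len)
            ++count;
          run = 0;
        }
    }

  /* End of input also terminates a run.  Without this check a trailing
     string such as "abcd" is silently dropped.  */
  if (run >= min_len)
    ++count;

  return count;
}
```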

3. Things generally don't work well across blocks. E.g. strings about 
CHUNKSIZE long trigger an assertion failure:

$ printf '%65536s\n' x | ./strings
strings: strings.c:530: read_block_no_mmap: Assertion `unprinted == ((void *)0) || ntrailer == 0' failed.

Or with a delay between blocks:

$ (printf abcd; sleep 1; echo e) | ./strings
strings: strings.c:530: read_block_no_mmap: Assertion `unprinted == ((void *)0) || ntrailer == 0' failed.
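
One way to make the result independent of where reads happen to split the 
input is to keep the scanner state between chunks. The following is only a 
sketch under my own assumptions, not how strings.c is structured; struct 
scanner, scanner_feed and scanner_finish are hypothetical names. Only the 
first min_len bytes of a candidate are held back, the rest is streamed, so 
the length of a string is not tied to the chunk size.

```c
#include <ctype.h>
#include <stddef.h>
#include <stdio.h>

#define MIN_LEN_MAX 16

/* Scanner state that survives between reads.  All names here are
   hypothetical and not taken from strings.c.  */
struct scanner
{
  size_t min_len;                   /* 1 <= min_len <= MIN_LEN_MAX.  */
  size_t run;                       /* Length of the current run.  */
  unsigned char head[MIN_LEN_MAX];  /* First bytes of the run.  */
  size_t nstrings;                  /* Strings reported so far.  */
  FILE *out;                        /* NULL suppresses output.  */
};

static void
emit (struct scanner *s, const unsigned char *p, size_t n)
{
  if (s->out != NULL)
    fwrite (p, 1, n, s->out);
}

/* Process one chunk.  A run may start in one chunk and end in another.
   For brevity bytes are emitted one at a time; a tuned version would
   emit whole spans.  */
void
scanner_feed (struct scanner *s, const unsigned char *buf, size_t len)
{
  for (size_t i = 0; i < len; ++i)
    {
      unsigned char c = buf[i];
      if (isprint (c) || c == '\t')
        {
          if (s->run < s->min_len)
            {
              /* Still undecided: hold the byte back.  */
              s->head[s->run++] = c;
              if (s->run == s->min_len)
                {
                  emit (s, s->head, s->min_len);
                  ++s->nstrings;
                }
            }
          else
            {
              /* Already known to be a string: stream it.  */
              emit (s, &c, 1);
              ++s->run;
            }
        }
      else
        {
          if (s->run >= s->min_len)
            emit (s, (const unsigned char *) "\n", 1);
          s->run = 0;
        }
    }
}

/* Call once at end of input to terminate a pending run.  */
void
scanner_finish (struct scanner *s)
{
  if (s->run >= s->min_len)
    emit (s, (const unsigned char *) "\n", 1);
  s->run = 0;
}
```

Feeding "abcd" and then "e\n" as two separate chunks reports one string, 
the same as feeding "abcde\n" at once.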

4. Strings longer than about 2 * CHUNKSIZE are not handled (only the tail is 
printed):

$ printf '%132000s\n' x > test
$ cat test | ./strings | wc -L

5. Multibyte support seems to be completely broken:

$ printf abcdef | iconv -t ucs-4le | ./strings -n 1 -e L
(it should print "abcdef").

A first step could probably be to replace "++buf" with "buf += 
bytes_per_char" in process_chunk_mb, which should make it work in easy 
cases, assuming that NULs are ignored in the output.
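
To show what I mean by stepping in whole characters, here is a rough 
sketch. decode_unit and count_wide_strings are hypothetical helpers, not 
code from strings.c, and treating only printable ASCII code points as 
string characters is my simplifying assumption.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Decode one character of BYTES_PER_CHAR bytes (1, 2 or 4).  */
uint32_t
decode_unit (const unsigned char *p, size_t bytes_per_char, bool big_endian)
{
  uint32_t v = 0;
  for (size_t i = 0; i < bytes_per_char; ++i)
    {
      /* Visit bytes from most to least significant.  */
      size_t idx = big_endian ? i : bytes_per_char - 1 - i;
      v = (v << 8) | p[idx];
    }
  return v;
}

/* Count runs of at least MIN_LEN printable ASCII characters, advancing
   by whole characters rather than single bytes.  Trailing bytes that do
   not form a complete character are ignored here.  */
size_t
count_wide_strings (const unsigned char *buf, size_t len,
                    size_t bytes_per_char, bool big_endian, size_t min_len)
{
  size_t count = 0;
  size_t run = 0;

  for (size_t off = 0; off + bytes_per_char <= len; off += bytes_per_char)
    {
      uint32_t c = decode_unit (buf + off, bytes_per_char, big_endian);
      if ((c >= 0x20 && c < 0x7f) || c == '\t')
        ++run;
      else
        {
          if (run >= min_len)
            ++count;
          run = 0;
        }
    }

  if (run >= min_len)
    ++count;
  return count;
}
```

With the 24 bytes of "abcdef" in UCS-4LE, 4 bytes per character and 
little-endian decoding, this finds a single run of six characters.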

And I don't understand this comment:

    378                /* There is no sane way of printing the string.  If we
    379                   assume the file data is encoded in UCS-2/UTF-16 or
    380                   UCS-4/UTF-32 respectively we could covert the string.
    381                   But there is no such guarantee.  */
    382                fwrite_unlocked (start, 1, buf - start, stdout);

6. There is the following code in read_block_no_mmap:

    548            /* We only use complete characters.  */
    549            nb &= ~(bytes_per_char - 1);

but there seems to be no compensation for the dropped bytes. If we split 
the input from the example in item 5 into chunks of 6 bytes, we can see 
the loss:

$ printf abcdef | iconv -t ucs-4le | perl -e '$|=1; while (read STDIN, 
$s, 6) { print $s; select undef, undef, undef, .1; }' | ./strings -n 1 -e L
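
For a 6-byte read with 4-byte characters the mask gives 6 & ~3 = 4, so 2 
bytes are left over each time. A sketch of one possible compensation, 
assuming the leftover bytes are saved and put in front of the next read 
(struct carry, carry_restore and carry_save are hypothetical helpers, not 
part of strings.c; bytes_per_char is assumed to be 1, 2 or 4):

```c
#include <stddef.h>
#include <string.h>

/* Leftover bytes of an incomplete character.  Hypothetical helper, not
   part of strings.c.  */
struct carry
{
  unsigned char bytes[4];
  size_t n;
};

/* Copy the saved bytes to the start of BUF and return how many were
   copied; the caller then reads new data at BUF + that offset.  */
size_t
carry_restore (struct carry *c, unsigned char *buf)
{
  size_t n = c->n;
  memcpy (buf, c->bytes, n);
  c->n = 0;
  return n;
}

/* BUF holds NB bytes.  Return the length of the prefix that consists of
   complete characters and save the remaining bytes (fewer than
   BYTES_PER_CHAR) in C instead of dropping them.  */
size_t
carry_save (struct carry *c, const unsigned char *buf, size_t nb,
            size_t bytes_per_char)
{
  size_t complete = nb & ~(bytes_per_char - 1);
  c->n = nb - complete;
  memcpy (c->bytes, buf + complete, c->n);
  return complete;
}
```

With four 6-byte reads of the 24-byte input, the complete prefixes are 4, 
8, 4 and 8 bytes, so all 24 bytes are processed; with the mask alone each 
read would keep only 4 of its 6 bytes.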

Alexander Cherepanov
