[PATCH setup] Add new option '--compact-os'

Christian Franke Christian.Franke@t-online.de
Thu May 13 14:42:58 GMT 2021


Corinna Vinschen via Cygwin-apps wrote:
> On May 12 16:14, Jon Turney wrote:
>> On 08/05/2021 21:03, Christian Franke wrote:
>> [...]
>>> +bool io_stream_cygfile::compact_os_is_available = (OSMajorVersion () >= 10);
>> The documentation seems a bit vague, but are we really expecting this to
>> work on Windows 10 1507?
> I think this could even work under 8.1 from what I can see on MSDN.

I skipped all Win8*, so I didn't test with 8.1 :-)

This page says "Available starting with Windows 10":
https://docs.microsoft.com/en-us/windows-hardware/drivers/ddi/ntifs/ns-ntifs-_file_provider_external_info_v0

It also says "Header: ntifs.h" but in recent "Windows Kits" all required 
defines are in winioctl.h.

These defines are enabled even for '>= _WIN32_WINNT_WIN7'. According to 
a test I did some time ago, Win7 could not read these files.


>
>>> +{
>>> +  const char * const p = name.c_str();
>>> +  if (!(!strncmp (p, "/bin/", 5) || !strncmp (p, "/sbin/", 6) || !strncmp (p, "/usr/", 5)))
>>> +    return true; /* File is not in R/O tree. */
>>> +  const size_t len = name.size(); /* >= 5 */
>>> +  if (!strcmp (p + (len - 4), ".dll") || !strcmp (p + (len - 3), ".so"))
>>> +    return true; /* Rebase will open file for writing which uncompresses the file. */
>>> +  if (!strcmp (p + (len - 3), ".gz") || !strcmp (p + (len - 3), ".xz"))
>>> +    return true; /* File is already compressed. */
>> Is this an assertion that there are no .bz2, .lzma, .zst etc. files in the
>> install?
> Another question is this: FILE_PROVIDER_COMPRESSION_LZX
> "This algorithm is designed to be highly compact, and provides for small
>   footprint for infrequently accessed data."
>
> When running a shell script, certain executables (especially coreutils,
> gawk, sed, grep, find) are not so very infrequently accessed.  Is this
> compression really feasible for these binaries?  Did you compare shell
> script performance with non-compressed, XPRESS16K and LZX compressed
> /bin dir?

Good point. I have now run a test with a ./configure script after a
reboot: There was no significant difference with /bin/*.exe (only)
uncompressed, NTFS-, XPRESS16K- or LZX-compressed. Time was always
around 23s.

Here is a read speed test with fast and slow storage on a 10+ year old
i7-2600K (4C/8T). The 256MiB test file was generated by concatenating
various EXE files. All file accesses were the first after reboot. AV
(Defender) was turned off:


  Compression MiB      T1     T2   T3,T4
  ======================================
  None        256   0.69s  10.1s  <0.02s
  NTFS        159   1.03s   8.1s  <0.02s
  XPRESS4K    138   -
  XPRESS8K    128   -
  XPRESS16K   123   0.64s   5.4s  <0.02s
  LZX          97   0.79s   4.8s  <0.02s

T1,T2: Read whole file: time dd if=FILE bs=FILESIZE of=/dev/null
T3,T4: Read last byte: time dd if=FILE bs=1 skip=FILESIZE-1 of=/dev/null

T1,T3: SATA SSD, raw read speed with dd bs=1M: ~520MB/s
T2,T4: USB3 flash drive via USB2, raw read speed: ~27MB/s


As expected, compression helps to improve 'virtual' read speed on slow 
storage. Otherwise, it depends on storage speed, CPU speed, system load, ...
Unexpectedly (for me), even LZX seems to be suitable for the random
reads that occur when EXE files are preloaded or paged in.
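
To make the 'virtual' read speed concrete, here is a small sketch of the
arithmetic behind the table. The function names are made up for this
illustration and are not part of the patch; the inputs are simply the
MiB and time columns above.

```cpp
#include <cassert>
#include <cmath>

/* Effective ('virtual') read speed: uncompressed MiB delivered to the
   reader per second of wall-clock time. */
double effective_mib_per_s (double uncompressed_mib, double seconds)
{
  return uncompressed_mib / seconds;
}

/* Fraction of the original size that is still occupied on disk. */
double compressed_fraction (double compressed_mib, double uncompressed_mib)
{
  return compressed_mib / uncompressed_mib;
}
```

With the T2 column (slow USB storage), the uncompressed file gives
256 / 10.1, roughly 25.3 MiB/s, while the LZX-compressed file gives
256 / 4.8, roughly 53.3 MiB/s, with about 38% (97 / 256) of the original
size on disk.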

If the files were already cached, all read times were similar: ~0.135s 
for the whole file.

For more flexibility, I will provide a new version of the patch with a
'--compact-os ALGORITHM' option.
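
A rough sketch of how the ALGORITHM argument might be mapped to a
compression algorithm number follows. This is only an illustration, not
the code of the upcoming patch: the function name, the accepted
spellings and the -1 error value are assumptions. The numeric values are
meant to mirror the FILE_PROVIDER_COMPRESSION_* defines from winioctl.h
and should be checked against the installed Windows Kit.

```cpp
#include <cassert>
#include <cstring>

/* Hypothetical helper: map an option argument to an algorithm number.
   Values are assumed to match FILE_PROVIDER_COMPRESSION_XPRESS4K (0),
   _LZX (1), _XPRESS8K (2) and _XPRESS16K (3) from winioctl.h.
   Returns -1 if the name is missing or not recognized. */
int parse_compact_os_algorithm (const char *arg)
{
  static const struct { const char *name; int value; } table[] = {
    { "XPRESS4K",  0 },
    { "LZX",       1 },
    { "XPRESS8K",  2 },
    { "XPRESS16K", 3 },
  };
  if (!arg)
    return -1;
  for (const auto &entry : table)
    if (!strcmp (arg, entry.name))
      return entry.value;
  return -1; /* Unknown algorithm name. */
}
```

The comparison is case-sensitive here to keep the sketch portable; the
real option may well accept other spellings.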

Thanks,
Christian


