[cygwin] DD bug fails to wipe last 48 sectors of a disk

Brian Inglis Brian.Inglis@SystematicSw.ab.ca
Sun Jun 28 20:28:04 GMT 2020


On 2020-06-28 11:50, Jason Pyeron wrote:
> On Sunday, June 28, 2020 10:35 AM, Christian Franke wrote:
>> Andrey Repin wrote:
>>>> dd if=/dev/zero of=/dev/sda iflag=fullblock bs=4M status=progress
>> The root of the problem is that the Windows WriteFile() function
>> apparently does not support truncated writes at EOM. If seek_position +
>> write_size > disk_size, then WriteFile() does nothing and returns an error.
>>> oflag=direct
>>> Although I'm unsure how Cygwin/Windows handles it. But without this flag, the
>>> write is cached, and the problem may be outside dd, or even Cygwin.
>> If 'oflag=direct' is used, dd passes O_DIRECT flag to open() call of
>> output file. Cygwin's open() function then passes
>> FILE_NO_INTERMEDIATE_BUFFERING to NtCreateFile() and the write()
>> function calls WriteFile() directly with original address and size.
>> Without O_DIRECT, Cygwin ensures that address and size passed to
>> WriteFile() are both aligned to sector size. All writes are then done
>> through a 64KiB internal buffer.
>> As a consequence, oflag=direct in the above dd command may increase
>> speed but would also let the final 4MiB WriteFile() fail. Without
>> oflag=direct, only the last 64KiB WriteFile() fails.
>> To clear the last sectors of the disk, use an appropriate small block
>> size. I did this several times with Cygwin 'dd seek=... bs=512 ...' to
>> get rid of Intel RST RAID metadata.
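
The small-block tail wipe described above could be set up roughly as follows.
This is only a sketch: the drive size is an example value, /dev/sdX is a
placeholder rather than a real device, and the 128-sector tail (one 64KiB
buffer) is an assumption based on the buffer size mentioned above. The script
only prints the dd command and does not run it.

```shell
# Assumed example values; substitute the real size of the target drive.
disk_bytes=160041885696   # example drive size in bytes
sector=512                # sector size in bytes
tail_sectors=128          # 64KiB / 512, the region the last buffered write covers

# Work in whole sectors so that seek= and count= are exact.
total_sectors=$(( disk_bytes / sector ))
seek=$(( total_sectors - tail_sectors ))

# Print (do not execute) the command that would zero only the tail.
tail_wipe_cmd="dd if=/dev/zero of=/dev/sdX bs=$sector seek=$seek count=$tail_sectors"
echo "$tail_wipe_cmd"
```

With bs=512 each write covers exactly one sector, so no single write can
extend past the end of the device.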

It appears that write(2) allows:

"An error return value while performing write() using direct I/O does not mean
the entire write has failed. Partial data may be written and the data at the
file offset on which the write() was attempted should be considered inconsistent."

and write(3p) can return -1 with errno EAGAIN to suggest a retry with a smaller
buffer size. That text concerns pipes and FIFOs, but the same behaviour could
be allowed on devices:

"Deferred: ret=−1, errno=[EAGAIN]
This error indicates that a later request may succeed. It does not indicate
that it shall succeed, even if nbyte≤{PIPE_BUF}, because if no process reads
from the pipe or FIFO, the write never succeeds. An application could usefully
count the number of times [EAGAIN] is caused by a particular value of
nbyte>{PIPE_BUF} and perhaps do later writes with a smaller value, on the
assumption that the effective size of the pipe may have decreased.
Partial and deferred writes are only possible with O_NONBLOCK set."

"There is no exception regarding partial writes when O_NONBLOCK is set. With the
exception of writing to an empty pipe, this volume of POSIX.1‐2008 does not
specify exactly when a partial write is performed since that would require
specifying internal details of the implementation. Every application should be
prepared to handle partial writes when O_NONBLOCK is set and the requested
amount is greater than {PIPE_BUF}, just as every application should be prepared
to handle partial writes on other kinds of file descriptors."

So Cygwin write() could return -1 with errno EAGAIN when a write at the end of
a device fails, and the application (dd) should be prepared to retry with a
smaller block size: halve the size while it is even, then round up to an even
size, as I/O sizes are often (always?) required to be even or binary multiples.
For example, if the last block below returned an error, 39072726/2 -> 19536363,
so +1 -> 19536364. To round n up to a larger binary multiple m such as 512,
64K, or 1M, use integer arithmetic: ((n - 1)/m + 1)*m, e.g. in the trivial case
above: ((19536363 - 1)/2 + 1)*2 -> 19536364. Retries would continue either
until a write succeeded, or until the size was small enough, e.g. 512, that the
failure should be returned to the caller.
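
The size arithmetic for such a retry loop could be sketched in shell as below.
The helper names round_up and next_size are made up for illustration, and
nothing here is taken from dd or Cygwin. It uses the standard integer round-up
idiom ((n - 1)/m + 1)*m, which agrees with the m = 2 example above.

```shell
# round_up N M: smallest multiple of M that is >= N (integer arithmetic only).
round_up() {
    echo $(( (($1 - 1) / $2 + 1) * $2 ))
}

# next_size N M: halve N, then round the result up to a multiple of M.
next_size() {
    round_up $(( $1 / 2 )) "$2"
}

# Hypothetical retry schedule: start at 4MiB and shrink until 512 bytes.
size=$(( 4 * 1024 * 1024 ))
align=512
while [ "$size" -gt "$align" ]; do
    size=$(next_size "$size" "$align")
    echo "retry with $size bytes"
done
```

A real implementation would attempt the write at each size and stop at the
first success; the loop here only prints the candidate sizes.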

> The reason I have never encountered this is because I use a block size which 
> is the largest practical GCD of the drive size and 512 bytes (typically
> between 32 MB and 64 MB).
> 
> E.g. I have a drive that is 160,041,885,696 bytes, which divides 312,581,808
> times evenly into 512. I would use a block size of 39,072,726 bytes, which
> gives 4,096 blocks to write.
I am somewhat surprised that Unix and Cygwin support non-blocksize aligned
writes, but it seems you have used this successfully, correct?

A useful way to approach this uses bc and factor (the latter from coreutils):

	$ bc <<< 160041885696/512	# blocks
	312581808
	$ factor 312581808		# prime factors
	312581808: 2 2 2 2 3 3 3 7 11 9397

After checking and splitting off the non-binary (odd) factors, we have to pick
some binary multiple of their product in a reasonable range:

	$ bc <<< 9397*11*7*3^3*2^4\;9397*11*7*3^3\;2^4\;9397*11*7*3^3*2
	312581808
	19536363
	16
	39072726
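
The same selection can be sketched as a small script working on the size in
bytes. The 32MiB to 64MiB window is an assumed reading of the "between 32 MB
and 64 MB" range quoted above, and the variable names are illustrative. Note
that the resulting block size is not a multiple of 512, so whether the device
accepts it is a separate question.

```shell
disk_bytes=160041885696            # example drive size in bytes
min_bs=$(( 32 * 1024 * 1024 ))     # assumed lower bound for the block size
max_bs=$(( 64 * 1024 * 1024 ))     # assumed upper bound for the block size

# Split off all factors of 2, leaving the odd (non-binary) part of the size.
odd=$disk_bytes
while [ $(( odd % 2 )) -eq 0 ]; do
    odd=$(( odd / 2 ))
done

# Double the odd part until it reaches the lower bound.
bs=$odd
while [ "$bs" -lt "$min_bs" ]; do
    bs=$(( bs * 2 ))
done

# Accept the candidate only if it is in range and divides the disk exactly.
count=""
if [ "$bs" -le "$max_bs" ] && [ $(( disk_bytes % bs )) -eq 0 ]; then
    count=$(( disk_bytes / bs ))
    echo "bs=$bs count=$count"
else
    echo "no exact block size in the requested range" >&2
fi
```

For the example size this picks the same 39072726-byte block as the bc output
above, giving 4096 blocks.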

-- 
Take care. Thanks, Brian Inglis, Calgary, Alberta, Canada

This email may be disturbing to some readers as it contains
too much technical detail. Reader discretion is advised.
[Data in IEC units and prefixes, physical quantities in SI.]

