ls/stat on OneDrive causes download of files
Jeffrey Altman
jaltman@secure-endpoints.com
Wed Mar 6 18:55:17 GMT 2024
On 3/6/2024 12:19 PM, Corinna Vinschen via Cygwin wrote:
> We can add an explicit call to
>
> RtlSetProcessPlaceholderCompatibilityMode (PHCM_EXPOSE_PLACEHOLDERS);
>
> and we can recognize the IO_REPARSE_TAG_FILE_PLACEHOLDER and
> IO_REPARSE_TAG_CLOUD_* tags during symlink evaluation, but even then
> we still have to know what the reparse point buffer actually contains.
>
> Given that the content of reparse points with these reparse tags are
> undocumented, some people using cloud services should examine these
> reparse points so we can add some suitable code to Cygwin.
>
>
> Corinna

I'm not an expert in this area by any means, but here are my
recollections from when Microsoft presented in person on cloud
placeholders to filter and filesystem developers many years ago.

Files and directories that are placeholders should have either the
FILE_ATTRIBUTE_RECALL_ON_DATA_ACCESS or FILE_ATTRIBUTE_RECALL_ON_OPEN
file attributes set. When these attributes are set, applications and
mini filters are advised not to "read" or "open" the files or
directories unless they absolutely need to because doing so will cause
the placeholder to be replaced by an object containing the actual data
which might take a long time to fetch, might cost the end user money, or
might fail depending upon the network connectivity. In particular,
anti-malware should ignore them during scans and only analyze the data
when it is fetched locally by an end user application.
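
A minimal sketch of the attribute test described above, in C. The helper
name attrs_indicate_recall is hypothetical. The two flag values are the
ones from the Windows SDK's winnt.h, repeated under #ifndef so the
fragment compiles without the SDK. How the attribute mask is obtained is
deliberately left out.

```c
#include <stdbool.h>
#include <stdint.h>

/* Values as defined in the Windows SDK (winnt.h); repeated here only so
   the sketch builds without the SDK headers. */
#ifndef FILE_ATTRIBUTE_RECALL_ON_OPEN
#define FILE_ATTRIBUTE_RECALL_ON_OPEN        0x00040000u
#endif
#ifndef FILE_ATTRIBUTE_RECALL_ON_DATA_ACCESS
#define FILE_ATTRIBUTE_RECALL_ON_DATA_ACCESS 0x00400000u
#endif

/* Hypothetical helper: true if either recall attribute is set, i.e. the
   object should be treated as a placeholder that is not fully local. */
static bool attrs_indicate_recall(uint32_t file_attributes)
{
    const uint32_t recall_mask = FILE_ATTRIBUTE_RECALL_ON_OPEN |
                                 FILE_ATTRIBUTE_RECALL_ON_DATA_ACCESS;
    return (file_attributes & recall_mask) != 0;
}
```

On Windows the mask could come, for example, from the dwFileAttributes
member of a WIN32_FIND_DATA filled in by a directory enumeration, assuming
the process's placeholder compatibility mode exposes the flags.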

I believe that IO_REPARSE_TAG_FILE_PLACEHOLDER was replaced by
IO_REPARSE_TAG_CLOUD_1 through IO_REPARSE_TAG_CLOUD_F. Any reparse tag
attached to a placeholder object is for the interpretation of the filter
associated with the back-end storage and not for the consumption of
applications. The content of the reparse points can be back-end
proprietary: different reparse data for OneDrive, iCloud, Dropbox, etc.
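
If a tag only needs to be recognized, not interpreted, a range check is
enough. The following is a sketch in C. The function names are
hypothetical, and the constants are the winnt.h values repeated under
#ifndef so the fragment builds without the SDK. Note that the mask test
also matches the base IO_REPARSE_TAG_CLOUD tag, not only _1 through _F.

```c
#include <stdbool.h>
#include <stdint.h>

/* Values as defined in the Windows SDK (winnt.h). */
#ifndef IO_REPARSE_TAG_FILE_PLACEHOLDER
#define IO_REPARSE_TAG_FILE_PLACEHOLDER 0x80000015u
#endif
#ifndef IO_REPARSE_TAG_CLOUD
#define IO_REPARSE_TAG_CLOUD            0x9000001Au
#endif
#ifndef IO_REPARSE_TAG_CLOUD_MASK
#define IO_REPARSE_TAG_CLOUD_MASK       0x0000F000u
#endif

/* Hypothetical helper: true for IO_REPARSE_TAG_CLOUD and for
   IO_REPARSE_TAG_CLOUD_1 .. IO_REPARSE_TAG_CLOUD_F, which differ from
   the base tag only in the nibble selected by the mask. */
static bool is_cloud_reparse_tag(uint32_t tag)
{
    return (tag & ~(uint32_t)IO_REPARSE_TAG_CLOUD_MASK) ==
           (uint32_t)IO_REPARSE_TAG_CLOUD;
}

/* Hypothetical helper: true for the older file placeholder tag. */
static bool is_legacy_placeholder_tag(uint32_t tag)
{
    return tag == (uint32_t)IO_REPARSE_TAG_FILE_PLACEHOLDER;
}
```

Neither helper looks at the reparse data buffer, which stays opaque to
the application.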

The default ProcessPlaceholderCompatibilityMode is
PHCM_EXPOSE_PLACEHOLDERS, which makes the FILE_ATTRIBUTE flags and
reparse tags visible. Microsoft maintains a database of processes for
which PHCM_DISGUISE_PLACEHOLDER is set, which hides that information. It's
unclear to me that explicitly setting the placeholder compatibility mode
is useful.

I'm not sure that exposing the object as a symlink is a good idea. A
POSIX symlink is an object whose type and target information cannot
change. In the case of a placeholder, the placeholder is silently
replaced by the actual object either when the object is opened or the
object's data is accessed. An application that believes it knows that
the object is a symlink will be mighty confused when it turns out to be
a file or a directory.

Perhaps the question that needs to be asked is whether there are opens
that can be skipped if an object is known not to be locally present
(that is, either of the FILE_ATTRIBUTE flags is set)?
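
One way to frame that question in code is a predicate that says whether
a planned access could trigger a recall. This is only a sketch in C. The
enum and function names are hypothetical, and the mapping from flags to
access kinds is an assumption taken from the flag names and from the
description above that replacement happens either on open or on data
access. Actual provider behaviour may differ.

```c
#include <stdbool.h>
#include <stdint.h>

/* Values as defined in the Windows SDK (winnt.h). */
#ifndef FILE_ATTRIBUTE_RECALL_ON_OPEN
#define FILE_ATTRIBUTE_RECALL_ON_OPEN        0x00040000u
#endif
#ifndef FILE_ATTRIBUTE_RECALL_ON_DATA_ACCESS
#define FILE_ATTRIBUTE_RECALL_ON_DATA_ACCESS 0x00400000u
#endif

/* Hypothetical classification of what the caller intends to do. */
enum access_kind {
    ACCESS_OPEN_ONLY,  /* open the object without reading its data */
    ACCESS_READ_DATA   /* open the object and read its contents */
};

/* Hypothetical predicate (assumed mapping): RECALL_ON_OPEN makes any
   open suspect, RECALL_ON_DATA_ACCESS only an access that reads data. */
static bool may_trigger_recall(uint32_t attrs, enum access_kind kind)
{
    if (attrs & FILE_ATTRIBUTE_RECALL_ON_OPEN)
        return true;
    if ((attrs & FILE_ATTRIBUTE_RECALL_ON_DATA_ACCESS) &&
        kind == ACCESS_READ_DATA)
        return true;
    return false;
}
```

A caller could consult such a predicate before an optional open and skip
the open when it returns true.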
Jeffrey Altman