Fix fuse_dfs flush skipping hdfsHFlush when fi->flags is 0 #8545
shanrui wants to merge 3 commits into
|
@ajfabbri May I ask for a review of this? |
|
💔 -1 overall
This message was automatically generated. |
|
Thanks for the PR. This needs a Jira that describes the bug and how it manifests. Please use our PR template for the main description here and follow its instructions. The PR should also include a test case. Once you've done this and Yetus is happy, I'll take another look. I'm not familiar with why newer FUSE implementations would clear those flags, so any reference to that information would be helpful as well. |
|
@ajfabbri Thanks for the feedback. There is a related historical Jira, HDFS-2551. The root cause there is different, but the observed behavior is similar: if data is not properly flushed (especially when HDFS flush() is invoked rather than hflush()), HDFS cannot guarantee that newly written data is immediately visible to readers. Here, skipping hdfsHFlush() in dfs_flush() can lead to the same symptom: a reader that opens the file after the write may see an empty file or stale data until the writer closes the file.

Regarding fi->flags, I found that as early as FUSE 2.4 the documentation did not guarantee that the original open flags are preserved across all callback invocations (https://github.com/libfuse/libfuse/blob/fuse_2_4_0/include/fuse_common.h#L36). Because of this, using fi->flags inside flush() to determine whether the file was opened for writing may not be reliable. In my environment, fi->flags was zero in dfs_flush(), which caused hdfsHFlush() to be skipped entirely.

I'm working on creating a Jira and adding a test case as suggested. |
Problem:
In modern libfuse implementations, fi->flags in the flush() callback
may be reset to 0. The original fuse_dfs code relies on:
if (fi->flags & O_WRONLY)
When fi->flags is 0 this check is false, so hdfsHFlush/hdfsHSync is
skipped and newly written data stays invisible to readers until the
file is closed.
Root cause:
fi->flags is not guaranteed to be preserved in flush().
Fix:
Store the open flags in dfs_fh during dfs_open(), and use the stored
value in flush() instead of fi->flags.
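
The idea behind the fix can be sketched in isolation. The following is a minimal illustration, not the actual patch: struct demo_fh, struct demo_file_info, demo_open(), demo_flush(), and stub_hflush() are hypothetical stand-ins for dfs_fh, the FUSE file info argument, dfs_open(), dfs_flush(), and hdfsHFlush(). The O_ACCMODE-based write check is also an assumption of this sketch; the original code tests fi->flags & O_WRONLY.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>

/* Hypothetical stand-in for the per-file handle; not the real dfs_fh layout. */
struct demo_fh {
    int open_flags;   /* flags captured once, at open time */
    int hflush_calls; /* how many times the stubbed hflush ran */
};

/* Hypothetical stand-in for the file info struct FUSE passes to callbacks. */
struct demo_file_info {
    int flags;          /* may be reset to 0 by the time flush() runs */
    struct demo_fh *fh; /* per-file handle set up in open */
};

/* Stub standing in for hdfsHFlush(); it only records that it was called. */
int stub_hflush(struct demo_fh *fh) {
    fh->hflush_calls++;
    return 0;
}

/* True when the access mode allows writing (O_WRONLY or O_RDWR). */
int opened_for_write(int flags) {
    return (flags & O_ACCMODE) != O_RDONLY;
}

/* Open: remember the flags in the handle, where later callbacks can read them. */
int demo_open(struct demo_file_info *fi, struct demo_fh *fh) {
    assert(fi != NULL && fh != NULL);
    fh->open_flags = fi->flags;
    fh->hflush_calls = 0;
    fi->fh = fh;
    return 0;
}

/* Flush: decide from the stored flags and ignore fi->flags entirely. */
int demo_flush(struct demo_file_info *fi) {
    assert(fi != NULL && fi->fh != NULL);
    if (opened_for_write(fi->fh->open_flags)) {
        return stub_hflush(fi->fh);
    }
    return 0;
}
```

With this shape, setting fi->flags to 0 after demo_open() no longer changes what demo_flush() does, which is the behavior the fix is after.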
Testing:
Verified write -> flush -> read consistency
Verified hdfsHFlush is triggered correctly
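
One possible way to check write -> flush -> read consistency from user space is sketched below. This is an assumption about how such a check could look, not the test used for this PR; visible_after_flush() is a hypothetical helper. It relies on the documented FUSE behavior that flush() is called on each close() of a file descriptor, so closing a duplicate descriptor triggers flush() while the writer stays open.

```c
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/*
 * Writes msg to path, triggers a flush while the writer is still open,
 * then reads the data back through a second descriptor.
 * Returns 1 if the reader sees msg, 0 if it does not, -1 on error.
 */
int visible_after_flush(const char *path, const char *msg) {
    char buf[256];
    size_t len;
    ssize_t n;
    int wfd, dupfd, rfd;
    int result = -1;

    assert(path != NULL && msg != NULL);
    len = strlen(msg);
    if (len == 0 || len > sizeof(buf)) {
        return -1;
    }

    wfd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (wfd < 0) {
        return -1;
    }

    if (write(wfd, msg, len) == (ssize_t)len) {
        /* Closing a duplicate descriptor causes flush() without release(). */
        dupfd = dup(wfd);
        if (dupfd >= 0 && close(dupfd) == 0) {
            rfd = open(path, O_RDONLY);
            if (rfd >= 0) {
                /* A single read() is assumed to suffice for a short message. */
                n = read(rfd, buf, len);
                close(rfd);
                result = (n == (ssize_t)len && memcmp(buf, msg, len) == 0) ? 1 : 0;
            }
        }
    }

    close(wfd); /* the writer is closed only after the reader has finished */
    return result;
}
```

Pointing path at a file under a fuse_dfs mount would exercise dfs_flush(); a return value of 0 would correspond to the symptom described under Problem.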