Skip to content

Replace time based logic with atomci rename for write#2897

Open
mamazzol wants to merge 1 commit into
open-telemetry:mainfrom
mamazzol:disk-buffering-atomic-rename
Open

Replace time based logic with atomci rename for write#2897
mamazzol wants to merge 1 commit into
open-telemetry:mainfrom
mamazzol:disk-buffering-atomic-rename

Conversation

@mamazzol
Copy link
Copy Markdown

@mamazzol mamazzol commented Jun 2, 2026

Description:

Feature addition: This PR replaces the time-based separation with an atomic rename

  • The writer always writes to .tmp. On close the temporary file is atomically renamed to its final name via File.renameTo, which delegates to POSIX rename(2) for same-directory renames.
  • FolderManager now sweeps for orphan temporary files at construction and promotes them.
  • A read attempt that finds no ready file will force-close an expired writer instead of waiting for the next append.

As a consequence, minFileAgeForReadMillis is no longer required for correctness (default is now 0) and the build-time validation minFileAgeForRead > maxFileAgeForWrite has been dropped.

Existing Issue(s):

#2889

Testing:

./gradlew :disk-buffering:check is green.

  • New WritableFileTest case for the staging-then-rename behavior.
  • New FolderManagerTest cases for orphan recovery and force-close-on-read with an expired writer.

Documentation:

  • README.md: updated the writer/reader synchronization explanation and the minFileAgeForReadMillis description.
  • DESIGN.md: added a short section on the rename-on-close model.
  • FileStorageConfiguration#getMinFileAgeForReadMillis Javadoc updated to describe the new default and use case.

Copilot AI review requested due to automatic review settings June 2, 2026 12:23
@mamazzol mamazzol requested a review from a team as a code owner June 2, 2026 12:23
@linux-foundation-easycla
Copy link
Copy Markdown

CLA Not Signed

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Note

Copilot was unable to run its full agentic suite in this review.

This PR changes disk-buffering writer/reader coordination to use a “stage then rename-on-close” workflow, reducing reliance on time-based read delays and improving crash recovery for in-flight writes.

Changes:

  • Writer now appends to <timestamp>.tmp and promotes to the final numeric filename on close.
  • Folder startup recovers orphan *.tmp files by promoting them to final names when safe.
  • Configuration/docs updated to make minFileAgeForReadMillis optional (default 0) and update tests accordingly.

Reviewed changes

Copilot reviewed 9 out of 9 changed files in this pull request and generated 6 comments.

Show a summary per file
File Description
disk-buffering/src/main/java/io/opentelemetry/contrib/disk/buffering/internal/storage/files/WritableFile.java Implements staging + rename-on-close promotion behavior.
disk-buffering/src/main/java/io/opentelemetry/contrib/disk/buffering/internal/storage/FolderManager.java Recovers orphan temp files and allows reader to close expired writer to surface data.
disk-buffering/src/main/java/io/opentelemetry/contrib/disk/buffering/storage/impl/FileStorageConfiguration.java Updates defaults/docs and removes min-read-age vs max-write-age validation.
disk-buffering/src/test/java/io/opentelemetry/contrib/disk/buffering/internal/storage/files/WritableFileTest.java Adds/updates tests for staged writes and promotion-on-close behavior.
disk-buffering/src/test/java/io/opentelemetry/contrib/disk/buffering/internal/storage/StorageTest.java Updates expectations to account for visible .tmp files while writing.
disk-buffering/src/test/java/io/opentelemetry/contrib/disk/buffering/internal/storage/FolderManagerTest.java Adds coverage for orphan *.tmp recovery and adjusts filename expectations.
disk-buffering/README.md Updates user-facing explanation for rename-on-close synchronization.
disk-buffering/DESIGN.md Documents the new synchronization approach and crash recovery.
CHANGELOG.md Notes behavioral/config changes for the release notes.

Comment on lines 93 to 100
public synchronized void close() throws IOException {
if (isClosed.compareAndSet(false, true)) {
out.close();
if (!staging.renameTo(destination)) {
throw new IOException("Could not rename " + staging + " to " + destination);
}
}
}
Comment on lines 93 to 100
public synchronized void close() throws IOException {
if (isClosed.compareAndSet(false, true)) {
out.close();
if (!staging.renameTo(destination)) {
throw new IOException("Could not rename " + staging + " to " + destination);
}
}
}
Comment on lines +58 to 85
private void recoverOrphanTempFiles() {
File[] existingFiles = folder.listFiles();
if (existingFiles == null) {
return;
}
for (File file : existingFiles) {
String name = file.getName();
if (!name.endsWith(STAGING_SUFFIX)) {
continue;
}
File target = new File(folder, name.substring(0, name.length() - STAGING_SUFFIX.length()));
if (!NUMBER_PATTERN.matcher(target.getName()).matches()) {
continue;
}
if (target.exists()) {
if (!file.delete()) {
logger.warning("Could not delete duplicate orphan temp file: '" + file.getName() + "'");
}
continue;
}
if (file.renameTo(target)) {
logger.fine(
"Recovered orphan temp file: '" + file.getName() + "' -> '" + target.getName() + "'");
} else {
logger.warning("Could not promote orphan temp file: '" + file.getName() + "'");
}
}
}
Comment on lines +124 to +127
/*
* If the current writable file has expired, close it and return true.
* This allows to have a readeable file without waiting for the next write to trigger the check.
*/
Comment on lines 86 to 88
public final FileStorageConfiguration build() {
FileStorageConfiguration configuration = autoBuild();
if (configuration.getMinFileAgeForReadMillis()
<= configuration.getMaxFileAgeForWriteMillis()) {
throw new IllegalArgumentException(
"The configured max file age for writing must be lower than the configured min file age for reading");
}
return configuration;
return autoBuild();
}
Comment thread disk-buffering/README.md
Comment on lines +184 to +188
The writing and reading processes can run in parallel because they target disjoint files: the
writer appends to `<timestamp>.tmp` and only files whose name is *entirely* numeric are visible to
the reader (the reader applies `Matcher.matches()` on `\d+`, so the trailing `.tmp` excludes
in-flight files). Each rollover atomically renames the temp file to its final name via
`rename(2)`, so the reader can never observe a partially-written file. If the reader is invoked
Copy link
Copy Markdown
Contributor

@LikeTheSalad LikeTheSalad left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you, @mamazzol 🙏

Please take a look at Copilot's suggestions because I think those are valid points. But apart from those, I think it's good to go.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants