fix(ilp): fix disk-full handling for SF buffer creation #25

Open
mtopolnik wants to merge 2 commits into main from mt_allocate-cross-platform

Summary

Splits the responsibilities of Files.openCleanRW and Files.allocate and unifies the allocate contract across Linux, macOS, and Windows.

The two functions previously overlapped: both advanced logical EOF, and the old allocate on POSIX simply called ftruncate (leaving the file sparse) while on macOS it also over-allocated by currentSize on non-empty files. The single production caller pattern -- openCleanRW(path, sz) then allocate(fd, sz) -- left a sparse file on POSIX and a correctly-reserved one on Windows. When the disk filled later, a producer thread storing into the mmap'd region would trigger a SIGBUS that aborts the JVM (Linux/macOS) or an in-page exception (Windows).

After:

  • openCleanRW(path) owns the file lifecycle: open RW, truncate to empty. No size parameter.
  • allocate(fd, size) owns "extend EOF and reserve real disk blocks for [0, target)." target = max(size, currentSize); the file never shrinks. ENOSPC / ERROR_DISK_FULL surface as a clean false return. Same observable behaviour on all three platforms.
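The two-step contract above can be sketched in plain Java. This is a stand-in under stated assumptions, not the project's `Files` class: the class name `SplitContractSketch` and the `demo` helper are hypothetical, `java.io.RandomAccessFile` replaces the native file descriptor, and zero-filling replaces the native block reservation, which plain Java cannot request directly. The sketch also maps every `IOException` to `false`, whereas the PR describes that return only for ENOSPC / ERROR_DISK_FULL.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.io.UncheckedIOException;
import java.nio.file.Files; // the JDK's Files, not the project's Files class
import java.nio.file.Path;

// Hypothetical stand-in for the split contract described above.
final class SplitContractSketch {

    // Analogue of openCleanRW(path): open read-write, truncate to empty. No size parameter.
    static RandomAccessFile openCleanRW(Path path) throws IOException {
        RandomAccessFile file = new RandomAccessFile(path.toFile(), "rw");
        try {
            file.setLength(0);
            return file;
        } catch (IOException e) {
            file.close();
            throw e;
        }
    }

    // Analogue of allocate(fd, size): extends EOF to max(size, currentSize) and
    // never shrinks. Zero-filling is an assumption standing in for the native calls.
    static boolean allocate(RandomAccessFile file, long size) {
        try {
            long currentSize = file.length();
            long target = Math.max(size, currentSize);
            byte[] zeros = new byte[8192];
            file.seek(currentSize);
            long remaining = target - currentSize;
            while (remaining > 0) {
                int chunk = (int) Math.min(zeros.length, remaining);
                file.write(zeros, 0, chunk);
                remaining -= chunk;
            }
            return true;
        } catch (IOException e) {
            // Simplification: any I/O failure, including a full disk, becomes false.
            return false;
        }
    }

    // Returns {grewOk, sizeAfterGrow, smallerRequestOk, sizeAfterSmallerRequest}.
    static long[] demo() {
        try {
            Path path = Files.createTempFile("sf-buffer", ".bin");
            try (RandomAccessFile file = openCleanRW(path)) {
                boolean grew = allocate(file, 4096);
                long afterGrow = file.length();
                boolean kept = allocate(file, 1024); // smaller request must not shrink
                long afterSmaller = file.length();
                return new long[]{grew ? 1 : 0, afterGrow, kept ? 1 : 0, afterSmaller};
            } finally {
                Files.deleteIfExists(path);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

In `demo`, the second call asks for less than the current size, so the target stays at the current size and the file length is unchanged.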

End-user impact: SF buffer creation hitting a full disk now fails synchronously with a MmapSegmentException the sender can recover from, instead of crashing the JVM later.

Tradeoffs

  • API change: Files.openCleanRW and FilesFacade.openCleanRW lost their size parameter. Callers that need a sized file follow with Files.allocate (reserves blocks; fails on ENOSPC) or Files.truncate (sparse; faster). Two production call sites and several test sites updated.
  • On Linux/macOS, if the filesystem rejects posix_fallocate / F_PREALLOCATE, allocate falls back to ftruncate and leaves the file sparse; the SIGBUS risk re-emerges on that filesystem only. This is documented in the Files.allocate Javadoc. Windows has no equivalent fallback.
  • macOS allocate now passes target - currentSize to F_PREALLOCATE (the correct beyond-EOF length under F_PEOFPOSMODE) instead of the full target. Behaviour change on non-empty files: stops requesting duplicate allocation for the existing region.
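The macOS length change in the last bullet comes down to simple arithmetic, shown here as a minimal sketch. The class and method names are hypothetical; only the formulas (`target = max(size, currentSize)`, and the request being `target - currentSize` rather than `target`) come from the description above.

```java
// Hypothetical helper names; only the arithmetic is taken from the PR description.
final class PreallocateLengthSketch {

    // Never-shrink target: max(size, currentSize).
    static long target(long size, long currentSize) {
        return Math.max(size, currentSize);
    }

    // New behaviour: request only the bytes beyond the current EOF.
    static long newRequest(long size, long currentSize) {
        return target(size, currentSize) - currentSize;
    }

    // Old behaviour: request the full target, although [0, currentSize) already exists.
    static long oldRequest(long size, long currentSize) {
        return target(size, currentSize);
    }

    // Bytes requested in excess by the old behaviour.
    static long excess(long size, long currentSize) {
        return oldRequest(size, currentSize) - newRequest(size, currentSize);
    }
}
```

For a 4096-byte file grown to 16384 bytes, the new request is 12288 bytes and the old one was 16384, so the excess is 4096, which equals `currentSize`. On an empty file the two requests are identical, consistent with the change affecting non-empty files only.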

Test plan

  • `mvn -pl core test` passes (2219 / 2219, 1 skipped, no failures)
  • FilesTest pins the no-shrink and zero-on-fresh-file invariants on `allocate`
  • MmapSegmentTest covers the full create path including fault-injected `openCleanRW` failure and post-`allocate` cleanup
  • AckWatermarkTest covers the wrong-/missing-size branch that now uses `openCleanRW + allocate`

mtopolnik and others added 2 commits May 14, 2026 15:29
openCleanRW(path, size) and allocate(fd, size) both advanced the
logical EOF, which became a problem once allocate gained a
never-shrinks contract: callers that did openCleanRW(path, sz) then
allocate(fd, sz) (the only production pattern) hit the
target == currentSize short-circuit and got no block reservation.
SIGBUS protection on mmap stores was silently disabled.
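
The short-circuit described above can be reproduced with a small in-memory model. This is an illustrative sketch, not the project's code: the class `ShortCircuitModel`, its two counters, and the helper `reservedAfter` are hypothetical, and "reserved" simply counts bytes that a real allocation call would have been asked to back with blocks.

```java
// Hypothetical in-memory model of the interaction; no real files are touched.
final class ShortCircuitModel {
    long eof;      // logical end of file
    long reserved; // bytes a reservation call was issued for

    // Old openCleanRW(path, size): truncate, then advance EOF without reserving blocks.
    void oldOpenCleanRW(long size) {
        eof = size;
        reserved = 0;
    }

    // New openCleanRW(path): truncate to empty, nothing else.
    void newOpenCleanRW() {
        eof = 0;
        reserved = 0;
    }

    // allocate with the never-shrinks contract and the target == currentSize short-circuit.
    boolean allocate(long size) {
        long target = Math.max(size, eof);
        if (target == eof) {
            return true; // short-circuit: no reservation is requested
        }
        reserved += target - eof;
        eof = target;
        return true;
    }

    // Runs one of the two caller patterns and reports how many bytes were reserved.
    static long reservedAfter(boolean oldPattern, long size) {
        ShortCircuitModel file = new ShortCircuitModel();
        if (oldPattern) {
            file.oldOpenCleanRW(size);
        } else {
            file.newOpenCleanRW();
        }
        file.allocate(size);
        return file.reserved;
    }
}
```

With a 4096-byte request, the old pattern ends with 0 reserved bytes because EOF already equals the target, while the new pattern ends with 4096.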

Clean split of responsibilities:

- openCleanRW(path): sole owner of file lifecycle -- create or
  truncate to empty, open RW. The size parameter is gone from the
  JNI, Java, and FilesFacade API.

- allocate(fd, size): sole owner of "extend EOF and reserve real
  blocks for [0, target)." Cross-platform contract:
    * Linux: posix_fallocate(fd, currentSize, target - currentSize),
      followed by ftruncate only on the sparse-fallback path.
    * macOS: fcntl(F_PREALLOCATE) passes newBytes (not the full
      target) to fst_length, fixing a long-standing over-allocation
      on non-empty files; ftruncate(fd, target) advances EOF.
    * Windows: FILE_ALLOCATION_INFO + FILE_END_OF_FILE_INFO,
      short-circuiting when the file is already at target.

Production callers updated:

- MmapSegment.create: openCleanRW(ptr) + allocate(fd, sizeBytes).
  Restores ENOSPC-at-create semantics for SF buffers.
- AckWatermark.open: openCleanRW(path) + allocate(fd, FILE_SIZE) on
  the wrong-/missing-size branch; the correct-size branch still uses
  openRW to preserve the previous session's watermark.
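
The branch logic described for AckWatermark.open can be sketched as follows. This is an assumption-laden illustration, not the actual implementation: `WatermarkOpenSketch`, the 16-byte `FILE_SIZE`, and the `demo` helper are hypothetical, `RandomAccessFile` stands in for the project's file descriptors, and zero-filling stands in for the native `allocate`.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.io.UncheckedIOException;
import java.nio.file.Files; // the JDK's Files, not the project's Files class
import java.nio.file.Path;

// Hypothetical sketch of "keep a correctly sized file, recreate anything else".
final class WatermarkOpenSketch {
    static final long FILE_SIZE = 16; // assumed value, for illustration only

    static RandomAccessFile open(Path path) throws IOException {
        boolean correctSize = Files.exists(path) && Files.size(path) == FILE_SIZE;
        RandomAccessFile file = new RandomAccessFile(path.toFile(), "rw");
        if (correctSize) {
            return file; // openRW analogue: previous contents are preserved
        }
        try {
            file.setLength(0);                     // openCleanRW analogue
            file.write(new byte[(int) FILE_SIZE]); // allocate analogue (zero-fill)
            file.seek(0);
            return file;
        } catch (IOException e) {
            file.close();
            throw e;
        }
    }

    // Returns {valueReadOnCorrectSizeBranch, valueReadAfterReset, lengthAfterReset}.
    static long[] demo() {
        try {
            Path path = Files.createTempFile("ack", ".wm"); // created empty: wrong size
            try {
                try (RandomAccessFile f = open(path)) {
                    f.writeLong(42L); // store a watermark in the fresh file
                }
                long preserved;
                try (RandomAccessFile f = open(path)) {
                    preserved = f.readLong(); // correct size: contents survive
                }
                Files.write(path, new byte[3]); // corrupt the size
                try (RandomAccessFile f = open(path)) {
                    return new long[]{preserved, f.readLong(), f.length()};
                }
            } finally {
                Files.deleteIfExists(path);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

In `demo`, the value written in the first session is read back on the correct-size branch, and after the size is corrupted the file is recreated at `FILE_SIZE` with zeroed contents.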

Tests updated for the new signatures; the full client test suite
passes (2219 / 2219).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>