
disagg: limit S3 read amplification on compute nodes #10771

Open
JaySon-Huang wants to merge 36 commits into pingcap:master from JaySon-Huang:disagg_network_limit

Conversation

@JaySon-Huang
Contributor

@JaySon-Huang JaySon-Huang commented Mar 25, 2026

What problem does this PR solve?

Issue Number: close #10752

Problem Summary:

In disaggregated read workloads, compute nodes can generate uncontrolled S3 traffic when cold reads and repeated cache misses amplify remote reads on the same node. We need a production-safe way to bound S3 traffic, reduce duplicate downloads for the same object, and make the resulting behavior observable and tunable.

What is changed and how it works?

  • add a node-level S3 byte limiter and wire it into both direct reads and FileCache downloads
  • avoid duplicate S3 downloads for the same key by adding a bounded wait on in-flight FileCache entries
  • add phase-2 remote-cache observability for the bounded wait and background download stages
  • refine the FileCache and S3 read paths for maintainability, comments, failure handling, and reload consistency
  • add an English design document for the latest disagg S3 backpressure plan

More specifically:

  • add S3ReadLimiter to cap node-level S3 read bytes for both direct reads and FileCache downloads
  • add storage.io_rate_limit.s3_max_read_bytes_per_sec
  • remove the earlier s3_max_get_object_streams stream-cap design after validation showed that a reader-lifetime stream token can stall disaggregated query progress
  • publish the shared limiter through ClientFactory so all S3 clients on the node observe the same node-level byte budget
  • keep the limiter object stable across config reloads; reloading to 0 now updates the existing limiter instance instead of replacing it, so in-flight readers observe the disable and the related metric is updated consistently
  • make S3RandomAccessFile use limiter-aware chunked read/seek paths while sharing common read/seek tail logic
  • make FileCache downloads reuse the same limiter, add profiles.default.dt_filecache_wait_on_downloading_ms, and let follower requests perform bounded wait on an in-flight same-key download instead of immediately creating more remote-read pressure
  • handle bounded-wait download failures by removing failed placeholders eagerly so later retries can proceed
  • add remote-cache observability for:
    • bounded wait result / bytes / wait-seconds by result x file_type
    • background download stage seconds by stage x file_type
    • reject counts for too_many_download by file_type
    • background download queue depth
  • add an English design doc under docs/design/2026-03-24-disagg-s3-node-level-backpressure-and-filecache-dedup.md
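The core of the change is a token bucket over S3 read bytes. The sketch below illustrates the idea; the names loosely mirror the PR's S3ReadLimiter (requestBytes, getSuggestedChunkSize), but every signature, the 100 ms refill period, and the debt-based overdraft policy are assumptions for illustration, not the actual implementation.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <cstdint>
#include <mutex>

// Simplified node-level token bucket for S3 read bytes (illustrative only).
class SimpleS3ReadLimiter
{
public:
    explicit SimpleS3ReadLimiter(uint64_t max_bytes_per_sec_, uint64_t refill_period_ms_ = 100)
        : max_bytes_per_sec(max_bytes_per_sec_)
        , refill_period_ms(refill_period_ms_)
        , tokens(static_cast<int64_t>(refillAmount()))
    {}

    // Block until some tokens are available, then consume `bytes`.
    // The balance may go negative; the debt delays later callers.
    void requestBytes(uint64_t bytes)
    {
        std::unique_lock lock(mu);
        if (max_bytes_per_sec == 0) // 0 disables limiting
            return;
        while (tokens <= 0)
        {
            refill();
            if (tokens > 0)
                break;
            cv.wait_for(lock, std::chrono::milliseconds(refill_period_ms));
        }
        tokens -= static_cast<int64_t>(bytes);
    }

    // Cap a caller's chunk size at one refill period's budget so a single
    // large read cannot drain the whole bucket in one request.
    uint64_t getSuggestedChunkSize(uint64_t preferred) const
    {
        if (max_bytes_per_sec == 0)
            return preferred;
        return std::min<uint64_t>(preferred, std::max<uint64_t>(1, refillAmount()));
    }

private:
    uint64_t refillAmount() const { return max_bytes_per_sec * refill_period_ms / 1000; }

    // Top up tokens for elapsed refill periods; burst cap is one period's budget.
    void refill()
    {
        auto now = std::chrono::steady_clock::now();
        auto elapsed_ms = std::chrono::duration_cast<std::chrono::milliseconds>(now - last_refill).count();
        auto period = static_cast<int64_t>(refill_period_ms);
        if (elapsed_ms >= period)
        {
            auto burst_cap = static_cast<int64_t>(refillAmount());
            tokens = std::min(tokens + (elapsed_ms / period) * burst_cap, burst_cap);
            last_refill = now;
        }
    }

    std::mutex mu;
    std::condition_variable cv;
    const uint64_t max_bytes_per_sec;
    const uint64_t refill_period_ms;
    int64_t tokens;
    std::chrono::steady_clock::time_point last_refill = std::chrono::steady_clock::now();
};
```

Because both direct reads and FileCache downloads draw from the same bucket, the node's total S3 receive rate converges on the configured budget regardless of which path dominates.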

Check List

Tests

  • Unit test
  • Integration test
  • Manual test (add detailed scripts or steps below)
  • No code

Manual test

Tested with the following Python script:

# Round 1: cold-cache baseline without node-level S3 byte limiting and without FileCache bounded wait.
# Round 2: cold-cache baseline with two concurrent queries and no parameter changes.
# Round 3: cold-cache run with node-level S3 byte limiting enabled at 150 MiB/s.
# Round 4: keep the same S3 byte limit, enable FileCache bounded wait at 5000ms.
# Round 5: raise the S3 byte limit from 150 MiB/s to 200 MiB/s and rerun the cold-cache query.
# Round 6: keep the same config as round 5 and run the target query twice concurrently.
The six-round run_cold_disagg_read.py experiment clearly reproduces the target scenario described in the design doc: under cold-cache conditions, two concurrent queries without any limiter push the host bond1 receive peak to about 2.48 GB/s, while direct reads and FileCache downloads both surge at the same time. This is exactly the node-level S3 read amplification problem the proposal is meant to control. After enabling s3_max_read_bytes_per_sec, TiFlash shows sustained s3_read_byte pending activity, and the host-level network peak is consistently reduced to the few-hundred-MB/s range. That confirms the node-level byte limiter is effective not only in internal accounting but also at the real machine network level, turning the system from “unbounded amplification that can blow up the node network” into a bounded, observable, and tunable degraded state.

In this workload, 150 MiB/s is too conservative and significantly stretches TiFlash execution time. Raising the limit to 200 MiB/s noticeably improves both tiflash_task execution time and overall slowlog wall time in the single-query case, while still keeping network usage under control. For the current sample, s3_max_read_bytes_per_sec = 209715200 together with dt_filecache_wait_on_downloading_ms = 5000 is the most balanced configuration observed so far. Bounded wait helps mainly on the merged path in single-query runs, but under two-query concurrency it also starts producing substantial cold-data hits/timeouts, which indicates that its benefit grows once read overlap becomes strong enough. At this point, the dominant remaining bottleneck is no longer uncontrolled network traffic but FileCache background-download supply: bg_downloading_count, bg_download_queue_count, and too_many_download stay high across multiple rounds. The next optimization priority should therefore be downloader concurrency / queue / admission behavior rather than further increasing the bounded-wait budget.
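The sensitivity to the cap value follows from simple arithmetic: under a node-level byte limit, a cold read of N bytes needs at least N / cap seconds of S3 transfer time. A tiny sketch (the 60 GiB working-set size below is a made-up illustration, not a measurement from this test):

```cpp
#include <cassert>
#include <cstdint>

// Lower bound on wall time for a cold read under a node-level byte cap.
constexpr double minWallSeconds(uint64_t total_bytes, uint64_t bytes_per_sec)
{
    return static_cast<double>(total_bytes) / static_cast<double>(bytes_per_sec);
}
```

For the hypothetical 60 GiB cold read, the floor drops from about 409.6 s at 150 MiB/s to 307.2 s at 200 MiB/s, consistent with the direction of the improvement seen when the limit was raised.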

Side effects

  • Performance regression: Consumes more CPU
  • Performance regression: Consumes more Memory
  • Breaking backward compatibility

Documentation

  • Affects user behaviors
  • Contains syntax changes
  • Contains variable changes
    New configs
profiles.default.dt_filecache_wait_on_downloading_ms
storage.io_rate_limit.s3_max_read_bytes_per_sec
  • Contains experimental features
  • Changes MySQL compatibility
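For reference, a hypothetical config fragment wiring both new knobs; the values echo the "most balanced" settings from the manual test above and are workload-specific examples, not general recommendations:

```toml
[storage.io_rate_limit]
# Node-level cap on S3 read bytes per second; 0 disables the limiter.
s3_max_read_bytes_per_sec = 209715200  # 200 MiB/s

[profiles.default]
# How long a follower request may wait on an in-flight same-key FileCache
# download before being treated as a miss; 0 disables the bounded wait.
dt_filecache_wait_on_downloading_ms = 5000
```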

Release note

limit S3 read amplification on compute nodes

Summary by CodeRabbit

  • New Features

    • Node-level S3 read-rate limiter; bounded follower wait for in-flight FileCache downloads; two per-node HTTP cache-evict endpoints.
  • Improvements

    • Limiter-aware chunked reads/seeks and downloads, background-download queue/stage timing, failed-placeholder cleanup, expanded remote-cache & I/O-limiter metrics, runtime wait tuning setting.
  • Tests

    • New unit/integration coverage for limiter, FileCache wait semantics, background-failure and scheduling cleanup paths, and limiter publication.
  • Documentation

    • Added design doc and HTTP API docs.

@JaySon-Huang JaySon-Huang self-assigned this Mar 25, 2026
@ti-chi-bot
Contributor

ti-chi-bot bot commented Mar 25, 2026

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

@ti-chi-bot ti-chi-bot bot added the do-not-merge/needs-linked-issue, do-not-merge/work-in-progress (indicates that a PR should not merge because it is a work in progress), and release-note-none (denotes a PR that doesn't merit a release note) labels on Mar 25, 2026
@ti-chi-bot
Contributor

ti-chi-bot bot commented Mar 25, 2026

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please ask for approval from jayson-huang and additionally assign yudongusa for approval. For more information see the Code Review Process.
Please ensure that each of them provides their approval before proceeding.

The full list of commands accepted by this bot can be found here.

Details: Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@coderabbitai

coderabbitai bot commented Mar 25, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • ✅ Review completed - (🔄 Check again to review again)
📝 Walkthrough

Walkthrough

Adds a node-level S3 read token-bucket limiter and S3 read metrics/recording; wires the limiter through IORateLimiter → ClientFactory → TiFlashS3Client → S3 clients; makes S3 reads and FileCache downloads limiter-aware (chunked reads, bg downloads), adds bounded follower waits, new metrics, failpoints, tests, and an HTTP cache-evict API.

Changes

Cohort / File(s) Summary
Fail Points
dbms/src/Common/FailPoint.cpp
Register new fail-points file_cache_bg_download_fail, file_cache_bg_download_schedule_fail.
Metrics
dbms/src/Common/TiFlashMetrics.h, dbms/src/Common/TiFlashMetrics.cpp
Add remote-cache enums, metric families (counters/histograms), bucket boundaries, and accessor methods for per-(file_type × result/reason/stage) metrics.
Rate-Limit Config
dbms/src/IO/BaseFile/IORateLimitConfig.h, dbms/src/IO/BaseFile/IORateLimitConfig.cpp
Add s3_max_read_bytes_per_sec field, parsing, toString, and equality comparison updates.
IORateLimiter
dbms/src/IO/BaseFile/RateLimiter.h, dbms/src/IO/BaseFile/RateLimiter.cpp
Add s3_read_limiter member, public getS3ReadLimiter(), create/update logic from config, and propagate stop.
S3 Read Limiter (API & Impl)
dbms/src/Storages/S3/S3ReadLimiter.h, dbms/src/Storages/S3/S3ReadLimiter.cpp, dbms/src/Storages/S3/S3ReadLimiter_fwd.h
Introduce S3ReadLimiter (token-bucket), S3ReadMetricsRecorder, refill/update, blocking requestBytes, suggested-chunk helper, stop semantics, and metrics recording.
S3 Client / Factory
dbms/src/Storages/S3/S3Common.h, dbms/src/Storages/S3/S3Common.cpp, dbms/src/Storages/S3/MockS3Client.h
Propagate shared S3ReadLimiter/metrics-recorder into TiFlashS3Client/MockS3Client; add client accessors and ClientFactory setters to publish shared limiter/recorder.
S3 RandomAccessFile
dbms/src/Storages/S3/S3RandomAccessFile.h, dbms/src/Storages/S3/S3RandomAccessFile.cpp
Add limiter-aware chunked read/seek flows, finalize helpers, integrate read_limiter/metrics-recorder, and centralize post-read/seek handling.
FileCache
dbms/src/Storages/S3/FileCache.h, dbms/src/Storages/S3/FileCache.cpp
Add bounded follower wait (waitForNotEmptyFor), pass enqueue_time to bg executor, integrate S3ReadLimiter into download path, record bg queue/stage metrics, add failpoint-triggered bg failure handling and cleanup, and runtime wait_on_downloading_ms.
Server wiring
dbms/src/Server/Server.cpp
Publish IORateLimiter’s S3ReadLimiter into ClientFactory after init and on config reload.
Settings
dbms/src/Interpreters/Settings.h
Add dt_filecache_wait_on_downloading_ms setting (UInt64, default 0).
Tests
tests across dbms/src/IO/BaseFile/tests/*, dbms/src/Server/tests/*, dbms/src/Storages/S3/tests/*
Add unit/integration tests for S3ReadLimiter behavior, config parsing, FileCache bounded-wait semantics, failpoint-driven bg failures, and publication of the shared limiter.
HTTP status / Evict API
dbms/src/Storages/KVStore/FFI/ProxyFFIStatusService.cpp, .../ProxyFFIStatusService.h, tests, docs
Add /tiflash/cache/evict/mark and /tiflash/cache/evict/minmax handlers, parsing helpers, tests, and documentation.
Misc — robustness & logging
various files (ChecksumBuffer.cpp, DMFileReader.cpp, etc.)
Tighter seek error logging and added exception logging/handling around seeks.
Docs / Design
docs/design/...-disagg-s3-node-level-backpressure-and-filecache-dedup.md, docs/tiflash_http_api.md
Add design doc describing node-level S3 backpressure, FileCache deduplication, metrics, and new HTTP API doc.

Sequence Diagram(s)

sequenceDiagram
    participant Client
    participant S3RandomAccessFile as S3RandomAccessFile
    participant S3ReadLimiter as S3ReadLimiter
    participant IStream as S3_IStream

    Client->>S3RandomAccessFile: readImpl(buf, size)
    alt Limiter active
        S3RandomAccessFile->>S3ReadLimiter: getSuggestedChunkSize(...)
        loop per chunk
            S3RandomAccessFile->>S3ReadLimiter: requestBytes(chunk, DirectRead)
            S3ReadLimiter-->>S3RandomAccessFile: grant (or block until tokens)
            S3RandomAccessFile->>IStream: read(chunk)
            IStream-->>S3RandomAccessFile: bytes_read
            S3RandomAccessFile->>S3RandomAccessFile: finalizeRead(...)
        end
    else Limiter disabled
        S3RandomAccessFile->>IStream: read(size)
        IStream-->>S3RandomAccessFile: bytes_read
        S3RandomAccessFile->>S3RandomAccessFile: finalizeRead(...)
    end
    S3RandomAccessFile-->>Client: return bytes_read
sequenceDiagram
    actor Leader
    actor Follower
    participant FileCache
    participant FileSegment
    participant BGDownloader as BG_Downloader
    participant S3ReadLimiter as S3ReadLimiter

    Leader->>FileCache: get(key) — insert Empty placeholder
    FileCache->>BGDownloader: schedule bgDownloadExecutor(enqueue_time)
    BGDownloader->>S3ReadLimiter: requestBytes(..., FileCacheDownload)
    Note over S3ReadLimiter: may block until budget available (token-bucket)
    BGDownloader->>BGDownloader: downloadToLocal(...) (chunked when limiter active)
    BGDownloader-->>FileSegment: setState(Complete) and notify waiters
    Leader-->>FileCache: return miss

    Follower->>FileCache: get(key) — sees Empty
    FileCache->>FileSegment: waitForNotEmptyFor(timeout)
    alt Background finishes within timeout
        FileSegment-->>Follower: Complete (hit)
    else Timeout or Failed
        FileSegment-->>Follower: Miss (timeout/failed)
    end
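The follower-side bounded wait in the diagram above can be sketched with a condition variable. The names (waitForNotEmptyFor, setStatus, the Empty/Complete/Failed states) follow the walkthrough, but the struct below is a simplified illustration and the PR's actual signatures may differ.

```cpp
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <mutex>

// Minimal stand-in for a FileCache segment that followers can wait on.
struct MiniFileSegment
{
    enum class Status { Empty, Complete, Failed };

    // Called by the background downloader when the download finishes.
    void setStatus(Status s)
    {
        {
            std::lock_guard lock(mu);
            status = s;
        }
        cv.notify_all();
    }

    // Wait up to `timeout` for the state to leave Empty. Returning Empty
    // means the follower timed out and should be treated as a cache miss.
    Status waitForNotEmptyFor(std::chrono::milliseconds timeout)
    {
        std::unique_lock lock(mu);
        cv.wait_for(lock, timeout, [this] { return status != Status::Empty; });
        return status;
    }

    std::mutex mu;
    std::condition_variable cv;
    Status status = Status::Empty;
};
```

A timed-out or failed wait degrades to a cache miss, so the wait bounds the extra latency a follower pays instead of stalling the query indefinitely.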

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~75 minutes

Possibly related PRs

Suggested reviewers

  • JinheLin
  • CalvinNeo
  • wshwsh12

Poem

🐇 I nibbled tokens, tiny and bright,
I kept S3 rivers from rushing the night.
Followers pause a beat or two,
Background hops, then finishes through.
Quiet bytes — my carrot delight.

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 5.67% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Title check ✅ Passed The title 'disagg: limit S3 read amplification on compute nodes' clearly summarizes the main change, which is adding node-level S3 byte limiting to prevent read amplification in disaggregated workloads.
Linked Issues check ✅ Passed The PR fully addresses the objectives from issue #10752: prevents MPP query failures from network saturation by implementing a node-level S3ReadLimiter, reduces duplicate downloads via bounded FileCache waits, provides tunable controls via s3_max_read_bytes_per_sec and dt_filecache_wait_on_downloading_ms, and adds observability metrics for S3/remote-cache activity.
Out of Scope Changes check ✅ Passed All code changes are within scope: S3ReadLimiter implementation, FileCache bounded-wait support, configuration changes, metrics integration, and HTTP cache eviction endpoints are all required by the linked objectives. Design documentation and test expansions are appropriate supporting changes.
Description check ✅ Passed The pull request description is comprehensive, well-structured, and covers all required sections including problem statement, detailed changes, test coverage, and documentation.


@ti-chi-bot ti-chi-bot bot added the size/XXL label (denotes a PR that changes 1000+ lines, ignoring generated files) and removed the do-not-merge/needs-linked-issue label on Mar 25, 2026
@JaySon-Huang JaySon-Huang force-pushed the disagg_network_limit branch 4 times, most recently from 4253d99 to cd65279 on April 3, 2026 09:24
Signed-off-by: JaySon-Huang <tshent@qq.com>
@JaySon-Huang JaySon-Huang force-pushed the disagg_network_limit branch from cd65279 to df14a20 on April 3, 2026 10:09
@JaySon-Huang JaySon-Huang changed the title from "disagg: throttle S3 readers" to "disagg: limit S3 read amplification on compute nodes" on Apr 4, 2026
@ti-chi-bot ti-chi-bot bot added the release-note label (denotes a PR that will be considered when it comes time to generate release notes) and removed the release-note-none label (denotes a PR that doesn't merit a release note) on Apr 5, 2026

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 3

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
dbms/src/Storages/S3/FileCache.cpp (1)

1317-1342: ⚠️ Potential issue | 🔴 Critical

Failed downloads can corrupt cache_used.

finalizeReservedSize() rebases the reservation to content_length, but the segment keeps its old estimated size until Line 1357. The new cleanup on Lines 1395-1398 / 1450-1451 removes the failed segment and releases file_seg->getSize(), so any failure after Line 1317 with content_length != estimated_size will either leak cache capacity or underflow cache_used.

Persist the finalized reserved size before any later throw point, and add a regression that calls get(...) with std::nullopt or an intentionally wrong size.

Also applies to: 1355-1357, 1390-1399, 1446-1452

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@dbms/src/Storages/S3/FileCache.cpp` around lines 1317 - 1342,
finalizeReservedSize updates the reservation but the FileSegment still holds the
old estimated size, so persist the finalized size into the segment immediately
after a successful finalizeReservedSize(...) call (i.e. set
file_seg->setSize(content_length) or equivalent on the FileSegment) before any
later operations that can throw (e.g. before prepareParentDir,
toTemporaryFilename, downloadToLocal, or std::filesystem::rename) and ensure
failed paths release the same size that was reserved; also add a regression test
that calls get(...) with std::nullopt and another test that requests an
intentionally wrong size to verify cache_used/cleanup behavior on download
failures.
🧹 Nitpick comments (1)
dbms/src/Storages/S3/S3RandomAccessFile.cpp (1)

144-171: Refresh the suggested chunk size inside the loop.

Both chunked paths snapshot getSuggestedChunkSize() once before the loop. If s3_max_read_bytes_per_sec is lowered while a large read/seek is already in flight, the remaining iterations keep using the old chunk size and only pick up the tighter burst cap on the next call.

♻️ Minimal change
-    const auto chunk_size = read_limiter->getSuggestedChunkSize(s3_read_limiter_preferred_chunk_size);
     size_t total_gcount = 0;
     while (total_gcount < size)
     {
+        const auto chunk_size = read_limiter->getSuggestedChunkSize(s3_read_limiter_preferred_chunk_size);
         auto to_read = std::min(size - total_gcount, static_cast<size_t>(chunk_size));
         read_limiter->requestBytes(to_read, S3ReadSource::DirectRead);
         istr.read(buf + total_gcount, to_read);
         auto gcount = istr.gcount();
         total_gcount += gcount;
@@
-    const auto chunk_size = read_limiter->getSuggestedChunkSize(s3_read_limiter_preferred_chunk_size);
     size_t total_ignored = 0;
     const auto bytes_to_ignore = static_cast<size_t>(offset - cur_offset);
     while (total_ignored < bytes_to_ignore)
     {
+        const auto chunk_size = read_limiter->getSuggestedChunkSize(s3_read_limiter_preferred_chunk_size);
         auto to_ignore = std::min(bytes_to_ignore - total_ignored, static_cast<size_t>(chunk_size));
         read_limiter->requestBytes(to_ignore, S3ReadSource::DirectRead);
         istr.ignore(to_ignore);
         auto ignored = istr.gcount();
         total_ignored += ignored;

Also applies to: 287-310

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@dbms/src/Storages/S3/S3RandomAccessFile.cpp` around lines 144 - 171, The code
in S3RandomAccessFile::readChunked captures
read_limiter->getSuggestedChunkSize(...) once before the loop so if
s3_max_read_bytes_per_sec is lowered during a large in-flight read the loop
continues using the stale chunk_size; update the implementation to call
read_limiter->getSuggestedChunkSize(s3_read_limiter_preferred_chunk_size) inside
the while loop before computing to_read so each iteration observes the current
limiter suggestion, and apply the same change to the other chunked path
referenced in the review (the second chunked read block around lines 287-310).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@metrics/grafana/tiflash_summary.json`:
- Around line 14015-14024: The Y-axis for the "I/O Limiter Pending Duration"
panel is configured with format "short" but should use seconds; update the
panel's first yaxes entry (the object under "yaxes" for the panel named "I/O
Limiter Pending Duration") by changing "format": "short" to "format": "s" so the
histogram quantiles render with seconds units.
- Around line 20558-20572: The two queries using histogram_quantile both use
0.999 but their legendFormat indicates different percentiles; update the expr
for the query with refId "D" to use histogram_quantile(0.99, ...) (i.e., replace
0.999 with 0.99) so the metric computed by the histogram_quantile call matches
the legendFormat "99%-{{stage}}-{{file_type}} {{$additional_groupby}}"; ensure
you only change the numeric quantile in the expression for refId "D" and keep
the rest of the selector and labels identical to the refId "B" query.
- Around line 20790-20800: The legendFormat in the Grafana target is using
{{additional_groupby}} which prevents template variable interpolation; update
the legendFormat for the target (the object with "refId": "A" and "expr":
"sum(rate(tiflash_storage_remote_cache_wait_on_downloading_result...") to use
the templated variable form {{$additional_groupby}} instead of
{{additional_groupby}} so Grafana will substitute the variable at render time.


ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 21ce3b0d-3139-4d04-9ef8-019ec8a0c894

📥 Commits

Reviewing files that changed from the base of the PR and between a1a397d and 9412be4.

📒 Files selected for processing (16)
  • dbms/src/IO/Checksum/ChecksumBuffer.cpp
  • dbms/src/Interpreters/Context.cpp
  • dbms/src/Interpreters/Context.h
  • dbms/src/Server/Server.cpp
  • dbms/src/Storages/DeltaMerge/File/DMFileReader.cpp
  • dbms/src/Storages/KVStore/FFI/ProxyFFIStatusService.cpp
  • dbms/src/Storages/KVStore/FFI/ProxyFFIStatusService.h
  • dbms/src/Storages/KVStore/FFI/tests/gtest_status_server.cpp
  • dbms/src/Storages/S3/FileCache.cpp
  • dbms/src/Storages/S3/S3Common.cpp
  • dbms/src/Storages/S3/S3RandomAccessFile.cpp
  • dbms/src/Storages/S3/S3ReadLimiter.cpp
  • dbms/src/Storages/S3/S3ReadLimiter.h
  • dbms/src/Storages/S3/tests/gtest_filecache.cpp
  • docs/tiflash_http_api.md
  • metrics/grafana/tiflash_summary.json
✅ Files skipped from review due to trivial changes (2)
  • dbms/src/Storages/KVStore/FFI/tests/gtest_status_server.cpp
  • dbms/src/Server/Server.cpp
🚧 Files skipped from review as they are similar to previous changes (2)
  • dbms/src/Storages/KVStore/FFI/ProxyFFIStatusService.h
  • dbms/src/Storages/KVStore/FFI/ProxyFFIStatusService.cpp


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 4

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@dbms/src/Storages/S3/FileCache.cpp`:
- Around line 1272-1278: The code only calls
s3_read_metrics_recorder->recordBytes(rbuf.count(), ...) after copyData()
succeeds, so partial reads during exceptions are not recorded; update the
FileCache download logic (the block using ReadBufferFromIStream rbuf,
WriteBufferFromWritableFile wbuf, copyData and wbuf.sync) to record rbuf.count()
in an unconditional cleanup path (e.g., use a try/finally or RAII guard) so
recordBytes is invoked whether copyData() throws or not; apply the same change
to the other symmetric block around lines handling
ReadBufferFromIStream/copyData (the 1289-1294 case) so both success and
partial-failure byte counts are reported.
- Around line 1396-1403: The current failure cleanup calls remove(s3_key, true)
which flows into removeImpl() and increments eviction metrics; change the code
paths (around FileCache::FileSegment failure handling at the shown block and the
similar block at lines ~1452-1455) to unpublish the failed placeholder and
release its reservation without counting it as an eviction—either call an
existing lighter-weight API (e.g. an unpublishPlaceholder/unpublish or
releaseReservation method if present) or add a new helper (e.g.
unpublishPlaceholder(s3_key) or removeImpl(s3_key, /*count_metrics*/ false))
that removes the published placeholder without increasing
type_dtfile_evict/type_dtfile_evict_bytes; ensure
FileSegment::setStatus(FileSegment::Status::Failed) and bg_download_fail_count
handling remain the same and only replace the remove(..., force=true) call with
this non-metric-cleanup path.

In `@dbms/src/Storages/S3/tests/gtest_filecache.cpp`:
- Around line 1268-1304: The test currently only calls
waitForBgDownload(file_cache) which can pass even if background downloads
failed; update the test (TEST_F FileCacheTest
BgDownloadWorksWithSharedS3ReadLimiter) to explicitly assert the downloads
succeeded by verifying the cached entries for the two generated objects (use the
same objects from genObjects) are present and valid after waitForBgDownload —
e.g., call file_cache.get(S3FilenameView::fromKey(objects[0].key),
objects[0].size) and file_cache.get(S3FilenameView::fromKey(objects[1].key),
objects[1].size) and ASSERT they return non-null / valid results (or assert on
the on-disk cache file existence) to ensure the paused downloads actually
completed successfully.

In `@metrics/grafana/tiflash_summary.json`:
- Around line 21106-21117: The legendFormat in the target (inside the "targets"
entry with "refId": "A") is using "{{additional_groupby}}" which prevents
Grafana template interpolation; update the legendFormat value to use the
template variable with a dollar prefix (i.e., "{{$additional_groupby}}") so it
matches the expr usage and renders the variable instead of literal text.
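The "unconditional cleanup path (e.g., use a try/finally or RAII guard)" suggested for recordBytes above can be sketched as a small scope guard. Everything here is illustrative; the PR's actual recorder API may differ.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <stdexcept>
#include <utility>

// Records however many bytes the wrapped read produced when the enclosing
// scope exits, whether copyData() returned normally or threw.
class RecordBytesOnExit
{
public:
    RecordBytesOnExit(std::function<uint64_t()> count_fn_, std::function<void(uint64_t)> record_fn_)
        : count_fn(std::move(count_fn_))
        , record_fn(std::move(record_fn_))
    {}

    // record_fn must not throw: the destructor also runs during stack unwinding.
    ~RecordBytesOnExit() { record_fn(count_fn()); }

private:
    std::function<uint64_t()> count_fn;
    std::function<void(uint64_t)> record_fn;
};
```

In the FileCache download path the guard would capture rbuf.count(), so partial reads are still reported even when copyData() throws mid-transfer.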

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 313aa7ad-b13b-4365-a774-ed266e69586b

📥 Commits

Reviewing files that changed from the base of the PR and between a1a397d and d440f12.

📒 Files selected for processing (17)
  • dbms/src/IO/Checksum/ChecksumBuffer.cpp
  • dbms/src/Interpreters/Context.cpp
  • dbms/src/Interpreters/Context.h
  • dbms/src/Server/Server.cpp
  • dbms/src/Storages/DeltaMerge/File/DMFileReader.cpp
  • dbms/src/Storages/KVStore/FFI/ProxyFFIStatusService.cpp
  • dbms/src/Storages/KVStore/FFI/ProxyFFIStatusService.h
  • dbms/src/Storages/KVStore/FFI/tests/gtest_status_server.cpp
  • dbms/src/Storages/S3/FileCache.cpp
  • dbms/src/Storages/S3/FileCache.h
  • dbms/src/Storages/S3/S3Common.cpp
  • dbms/src/Storages/S3/S3RandomAccessFile.cpp
  • dbms/src/Storages/S3/S3ReadLimiter.cpp
  • dbms/src/Storages/S3/S3ReadLimiter.h
  • dbms/src/Storages/S3/tests/gtest_filecache.cpp
  • docs/tiflash_http_api.md
  • metrics/grafana/tiflash_summary.json
✅ Files skipped from review due to trivial changes (2)
  • dbms/src/Storages/KVStore/FFI/tests/gtest_status_server.cpp
  • dbms/src/Storages/S3/S3RandomAccessFile.cpp
🚧 Files skipped from review as they are similar to previous changes (2)
  • dbms/src/Storages/KVStore/FFI/ProxyFFIStatusService.h
  • dbms/src/Storages/S3/FileCache.h

Signed-off-by: JaySon-Huang <tshent@qq.com>
Contributor

Copilot AI left a comment


Pull request overview

Copilot reviewed 33 out of 34 changed files in this pull request and generated 4 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

@JaySon-Huang JaySon-Huang requested a review from Copilot April 5, 2026 17:25
Contributor

Copilot AI left a comment


Pull request overview

Copilot reviewed 33 out of 34 changed files in this pull request and generated 2 comments.



Contributor

Copilot AI left a comment


Pull request overview

Copilot reviewed 33 out of 34 changed files in this pull request and generated 1 comment.



@JaySon-Huang
Contributor Author

@coderabbitai full review

@coderabbitai

coderabbitai bot commented Apr 5, 2026

✅ Actions performed

Full review triggered.


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 3

🧹 Nitpick comments (3)
dbms/src/Storages/S3/S3ReadLimiter.h (1)

59-63: Clarify stale “stream dimension” wording in class docs.

This class currently exposes byte-rate limiting APIs only; mentioning a present “stream dimension” here can confuse future readers.

✍️ Suggested doc-only tweak
-    /// The stream dimension is best-effort protection against too many live response bodies, not a
-    /// replacement for byte throttling and not a safe cap on reader object count. In TiFlash a
-    /// `S3RandomAccessFile` may keep its body stream open across scheduling gaps, so a low stream
-    /// limit can block forward progress even when the node is no longer transferring many bytes.
+    /// This limiter currently enforces byte-rate only. Stream-count limiting was intentionally
+    /// removed because stream lifetime is not a reliable proxy for real network occupancy.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@dbms/src/Storages/S3/S3ReadLimiter.h` around lines 59 - 63, The class doc
comment for S3ReadLimiter currently mentions a "stream dimension" which is stale
and misleading because the class only exposes byte-rate limiting APIs; update
the comment above the S3ReadLimiter constructor (S3ReadLimiter(UInt64
max_read_bytes_per_sec_ = 0, UInt64 refill_period_ms_ = 100)) to remove or
rephrase the “stream dimension” wording and clearly state that only byte-rate
limiting is enforced (and that any mention of limiting concurrent/open streams
is best-effort or not provided), keeping the rest of the explanation about
scheduling gaps and refill behavior intact.
dbms/src/Common/TiFlashMetrics.cpp (1)

32-32: Give bg_download_stage more headroom.

ExpBuckets{0.0001, 2, 20} tops out at roughly 52s. The manual experiment attached to this PR already shows minute-scale background-download waits, so tiflash_storage_remote_cache_bg_download_stage_seconds will pin to its top bucket and flatten p99/p999 above that. Consider widening only the bg-stage histogram range.

♻️ Suggested tweak
-constexpr auto remote_cache_bg_download_stage_buckets = ExpBuckets{0.0001, 2, 20};
+constexpr auto remote_cache_bg_download_stage_buckets = ExpBuckets{0.01, 2, 20};
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@dbms/src/Common/TiFlashMetrics.cpp` at line 32, The histogram for bg download
stage currently uses remote_cache_bg_download_stage_buckets = ExpBuckets{0.0001,
2, 20} which tops out too low and will saturate for minute-scale waits; update
the buckets to widen only this histogram (e.g., change the last parameter from
20 to 30) in TiFlashMetrics.cpp by modifying
remote_cache_bg_download_stage_buckets so
tiflash_storage_remote_cache_bg_download_stage_seconds has headroom for
minute-scale durations.
dbms/src/Storages/S3/tests/gtest_filecache.cpp (1)

1335-1338: Timing assertion may be sensitive to system load.

The assertion ASSERT_GE(watch.elapsedMilliseconds(), 200) validates that the limiter throttles the download, but the exact threshold (200ms) could be sensitive to CI runner load. Consider adding a small tolerance or documenting the expected timing rationale if flakiness is observed.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@dbms/src/Storages/S3/tests/gtest_filecache.cpp` around lines 1335 - 1338, The
timing assertion using AtomicStopwatch and
ASSERT_GE(watch.elapsedMilliseconds(), 200) is brittle under variable CI load;
update the test around file_cache.get(S3FilenameView::fromKey(object_key),
object_size) and waitForBgDownload(file_cache) to allow a small tolerance (e.g.,
compare against an expected_delay minus a small delta or assert elapsed >=
expected_delay_with_tolerance) or document the rationale; adjust the assertion
to use a configurable/clearly named constant (expected_delay and tolerance) so
the limiter check remains robust across environments.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@dbms/src/IO/BaseFile/RateLimiter.h`:
- Line 223: The S3 limiter is being set to nullptr when the config is zero so
files constructed earlier (S3RandomAccessFile ctor snapshotting
client_ptr->getS3ReadLimiter()) keep nullptr; instead ensure getS3ReadLimiter()
always returns a valid (non-null) shared_ptr by publishing a no-op/unthrottled
S3::S3ReadLimiter instance when the configured rate is 0 and atomically swapping
in a real throttling S3ReadLimiter when the config becomes non-zero (or update
an existing limiter’s internal state rather than replacing with nullptr). Locate
the s3_read_limiter management in RateLimiter.cpp (and the getS3ReadLimiter
declaration) and change the zero-rate path to create/publish a non-throttling
limiter so existing snapshots in S3RandomAccessFile continue to be valid across
reloads.

In `@dbms/src/IO/BaseFile/tests/gtest_rate_limiter.cpp`:
- Around line 424-432: The test TEST(S3ReadLimiterTest,
LargeRequestDoesNotWaitForever) currently calls
S3::S3ReadLimiter::requestBytes(...) directly on the test thread which can hang
CI if the limiter regresses; instead spawn the requestBytes call in a background
task (e.g., std::async or std::thread) and use a wait_for() with a timeout to
assert the task completed (and that AtomicStopwatch::elapsedMilliseconds() is
under the expected bound), referencing S3::S3ReadLimiter, requestBytes,
S3::S3ReadSource::DirectRead and AtomicStopwatch to locate and replace the
direct blocking call with a timed async/wait_for assertion.

In `@dbms/src/Storages/S3/S3Common.h`:
- Around line 185-191: setS3ReadLimiter currently replaces the shared_ptr which
breaks existing S3RandomAccessFile snapshots; instead make the published
S3ReadLimiter object stable by updating its configuration in-place: in
setS3ReadLimiter(const std::shared_ptr<S3ReadLimiter>& limiter) if
shared_s3_read_limiter is non-null and limiter is non-null call
shared_s3_read_limiter->updateConfig(...) (or equivalent) to apply new limits,
otherwise (when there is no existing limiter) assign the pointer; also ensure
you still call shared_tiflash_client->setS3ReadLimiter(shared_s3_read_limiter)
so clients keep referencing the same long-lived S3ReadLimiter instance (refer to
setS3ReadLimiter, shared_s3_read_limiter, S3ReadLimiter::updateConfig, and
S3RandomAccessFile constructor).


ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 10de3fcb-9c5d-47f6-b1a7-8b48cbf070d8

📥 Commits

Reviewing files that changed from the base of the PR and between 9433514 and b767295.

📒 Files selected for processing (34)
  • dbms/src/Common/FailPoint.cpp
  • dbms/src/Common/TiFlashMetrics.cpp
  • dbms/src/Common/TiFlashMetrics.h
  • dbms/src/IO/BaseFile/IORateLimitConfig.cpp
  • dbms/src/IO/BaseFile/IORateLimitConfig.h
  • dbms/src/IO/BaseFile/RateLimiter.cpp
  • dbms/src/IO/BaseFile/RateLimiter.h
  • dbms/src/IO/BaseFile/tests/gtest_rate_limiter.cpp
  • dbms/src/IO/Checksum/ChecksumBuffer.cpp
  • dbms/src/Interpreters/Context.cpp
  • dbms/src/Interpreters/Context.h
  • dbms/src/Interpreters/Settings.h
  • dbms/src/Server/Server.cpp
  • dbms/src/Server/tests/gtest_storage_config.cpp
  • dbms/src/Storages/DeltaMerge/File/DMFileReader.cpp
  • dbms/src/Storages/KVStore/FFI/ProxyFFIStatusService.cpp
  • dbms/src/Storages/KVStore/FFI/ProxyFFIStatusService.h
  • dbms/src/Storages/KVStore/FFI/tests/gtest_status_server.cpp
  • dbms/src/Storages/S3/FileCache.cpp
  • dbms/src/Storages/S3/FileCache.h
  • dbms/src/Storages/S3/MockS3Client.h
  • dbms/src/Storages/S3/S3Common.cpp
  • dbms/src/Storages/S3/S3Common.h
  • dbms/src/Storages/S3/S3RandomAccessFile.cpp
  • dbms/src/Storages/S3/S3RandomAccessFile.h
  • dbms/src/Storages/S3/S3ReadLimiter.cpp
  • dbms/src/Storages/S3/S3ReadLimiter.h
  • dbms/src/Storages/S3/S3ReadLimiter_fwd.h
  • dbms/src/Storages/S3/tests/gtest_filecache.cpp
  • dbms/src/Storages/S3/tests/gtest_s3client.cpp
  • dbms/src/Storages/S3/tests/gtest_s3file.cpp
  • docs/design/2026-03-24-disagg-s3-node-level-backpressure-and-filecache-dedup.md
  • docs/tiflash_http_api.md
  • metrics/grafana/tiflash_summary.json

Comment on lines +185 to +191
void setS3ReadLimiter(const std::shared_ptr<S3ReadLimiter> & limiter)
{
std::unique_lock lock_init(mtx_init);
shared_s3_read_limiter = limiter;
if (shared_tiflash_client != nullptr)
shared_tiflash_client->setS3ReadLimiter(shared_s3_read_limiter);
}

@coderabbitai coderabbitai bot Apr 5, 2026


⚠️ Potential issue | 🟠 Major

Keep the published limiter object stable across reloads.

dbms/src/Server/Server.cpp hits this setter on config reload, but dbms/src/Storages/S3/S3RandomAccessFile.cpp snapshots the limiter once in its constructor. If a node starts with s3_max_read_bytes_per_sec = 0 and reload later enables it, already-open readers keep the old nullptr and bypass throttling for the rest of the query. Prefer updating one long-lived S3ReadLimiter via updateConfig() instead of swapping the shared_ptr.


Contributor Author


The nonzero → 0 reload path is already handled by keeping the existing S3ReadLimiter object alive and applying updateConfig(0) in place, so readers that have already snapshotted the shared limiter observe the disable without any pointer recreation.
What remains intentionally limited today is the first 0 → nonzero transition when startup published no limiter object: readers constructed earlier snapshotted nullptr and will not start throttling retroactively; only newly created readers see the newly published limiter. I added a code comment and TODO in RateLimiter.cpp to document this behavior and the possible follow-up of publishing a no-op limiter even at rate 0, if we want stronger reload semantics later.


The PR description images show:

  • Image 1 (Node exporter – Network overview): Grafana Network panels with no drops/errors, traffic bursts concentrated on bond1 up to ~1.23 GB/s, packet peaks ~818 k p/s, link speeds stable over 2026-04-05 18:37:43–20:22:54.
  • Image 2 (TiFlash Summary – PageStorage Rate Limiter): Limiter throughput shows S3 direct/filecache bursts; thresholds pegged at 200 MiB/s for s3_read and bg_write; pending queues up to 30; S3 read pending P99≈1.31 s, P999≈2.77 s in the same time window.
  • Image 3 (TiFlash Summary – Remote Cache): Remote cache ops/flow with spikes; bounded waits up to ~7 s; queue depth up to ~150–200; occasional “too_many_download” rejects.
  • Image 4 (TiFlash Summary – Node/resource): Store/available sizes oscillate with workload; uptime ~2 hours; transient CPU spikes >1000%; memory peaks ~11 GiB.

@JaySon-Huang thanks for the clarification. Your described semantics match the code paths: nonzero → 0 keeps the same S3ReadLimiter instance and calls updateConfig(0), so in-flight readers that captured the pointer observe disablement; the only non-retroactive case is the very first 0 → nonzero transition when the process started with no limiter object and older readers had snapshotted nullptr.

Given that, I’m good with the current behavior for this PR. As a low-risk follow-up to tighten reload semantics later, you could consider instantiating and publishing a no-op limiter at startup (limit=0). That would make the initial 0 → nonzero transition retroactive for already-open readers without changing hot-path costs when disabled, since:

  • existing readers would already hold a stable pointer; a reload would just updateConfig(new_rate),
  • per-read locking remains unchanged (no extra get() per read).

I’ve recorded this nuance so we don’t flag it again.


✏️ Learnings added
Learnt from: JaySon-Huang
Repo: pingcap/tiflash PR: 10771
File: dbms/src/Storages/S3/S3Common.h:185-191
Timestamp: 2026-04-05T19:04:11.469Z
Learning: TiFlash S3ReadLimiter reload semantics: nonzero→0 is handled in place via S3ReadLimiter::updateConfig(0) so existing readers observe disablement; the initial 0→nonzero after startup with no limiter object is intentionally non‑retroactive—readers that captured nullptr continue unthrottled until recreated. A possible future enhancement is to publish a no‑op limiter (rate=0) at startup to make 0→nonzero retroactive.



🧠 Learnings used
Learnt from: CR
Repo: pingcap/tiflash PR: 0
File: dbms/src/Storages/AGENTS.md:0-0
Timestamp: 2026-01-26T03:03:41.424Z
Learning: Applies to dbms/src/Storages/**/{DeltaMerge,KVStore,Page,S3}/**/*.{h,hpp,cpp} : Use `std::shared_ptr` for `StorageDeltaMerge`, `DeltaMergeStore`, `Region`, and `Context` objects


Labels

release-note Denotes a PR that will be considered when it comes time to generate release notes. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Avoid MPP queries exhaust tiflash-compute network under disagg arch

2 participants