Version
hyper: 1.8.0, 1.8.1, 1.9.0
Platform
Linux x86_64 (production Kubernetes cluster, but platform-independent bug)
Summary
The break 'capacity statement in UpgradedSendStreamTask::tick() allows send_data() to be called with zero h2 flow control capacity, breaking the backpressure chain and causing unbounded growth of the per-stream send buffer until OOM.
Description
UpgradedSendStreamTask::tick() in src/proto/h2/upgrade.rs has a logic error where break 'capacity (L98) allows send_data() to be called even when h2 flow control capacity is zero. This completely bypasses the max_send_buffer_size backpressure mechanism, causing unbounded memory growth in the h2 per-stream send buffer when the downstream TCP write is slow or blocked. In production, this manifests as rapid OOM (165MB → 8GB+ in ~40 seconds) for HTTP/2 CONNECT tunnel proxies.
Architecture Overview (v1.8.0+)
PR #3967 refactored H2Upgraded to eliminate unsafe transmute code. The new architecture introduces:
- H2Upgraded::poll_write — writes data into an mpsc::channel(1) (the "bridge")
- UpgradedSendStreamTask::tick() — a separate spawned task that reads from the channel and calls h2::SendStream::send_data()
The backpressure contract is supposed to be:
- mpsc::channel(1) provides 1 slot of buffering between poll_write and the send task
- tick() checks h2 flow control capacity via poll_capacity() before calling send_data()
- When capacity is 0, tick() should suspend (return Poll::Pending), which leaves the channel slot occupied, causing poll_write to also return Poll::Pending
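The channel-slot backpressure idea can be sketched with the standard library's bounded sync_channel, used here as a minimal stand-in for tokio's mpsc::channel(1) (the async channel behaves analogously for this purpose):

```rust
use std::sync::mpsc::sync_channel;

/// Returns the success of three try_send attempts on a 1-slot channel,
/// draining the slot between the second and third attempt.
fn bounded_channel_demo() -> (bool, bool, bool) {
    let (tx, rx) = sync_channel::<&str>(1); // one buffered slot, like mpsc::channel(1)
    let first = tx.try_send("frame 1").is_ok();  // fills the single slot
    let second = tx.try_send("frame 2").is_ok(); // slot still full: rejected (backpressure)
    rx.recv().unwrap();                          // the send task drains the slot
    let third = tx.try_send("frame 2").is_ok();  // writer can make progress again
    (first, second, third)
}

fn main() {
    assert_eq!(bounded_channel_demo(), (true, false, true));
}
```

As long as the consumer stops draining while h2 has no capacity, the single slot stays occupied and the producer is forced to wait; the bug below breaks exactly this invariant.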
The Bug
In UpgradedSendStreamTask::tick() (src/proto/h2/upgrade.rs, L68-133):
```rust
fn tick(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Result<(), crate::Error>> {
    let mut me = self.project();
    loop {
        me.h2_tx.reserve_capacity(1);
        if me.h2_tx.capacity() == 0 {
            'capacity: loop {
                match me.h2_tx.poll_capacity(cx) {
                    Poll::Ready(Some(Ok(0))) => {}
                    Poll::Ready(Some(Ok(_))) => break,
                    Poll::Ready(Some(Err(e))) => {
                        return Poll::Ready(Err(crate::Error::new_body_write(e)))
                    }
                    Poll::Ready(None) => {
                        return Poll::Ready(Err(crate::Error::new_body_write(
                            "send stream capacity unexpectedly closed",
                        )));
                    }
                    Poll::Pending => break 'capacity, // <-- BUG: should be `return Poll::Pending`
                }
            }
        }
        match me.h2_tx.poll_reset(cx) { /* ... */ }
        match me.rx.as_mut().poll_next(cx) { // <-- continues to drain data from channel
            Poll::Ready(Some(cursor)) => {
                me.h2_tx
                    .send_data(SendBuf::Cursor(cursor), false) // <-- sends with 0 capacity!
                    .map_err(crate::Error::new_body_write)?;
            }
            /* ... */
        }
    }
}
```
The critical error is at L98: Poll::Pending => break 'capacity.
break 'capacity only exits the inner 'capacity loop, but the outer loop { ... } continues execution. The code proceeds to:
1. poll_reset(cx) — typically returns Pending (no reset), registers waker
2. rx.poll_next(cx) — reads data from the mpsc channel (which is almost always Ready because the channel has capacity 1 and the writer — H2Upgraded::poll_write — eagerly fills it)
3. send_data() — sends the data with zero h2 flow control capacity
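The control-flow mistake can be reproduced in isolation. This standalone sketch (hypothetical trace strings, not hyper code) shows that break 'capacity exits only the labeled inner loop, so the outer loop body still runs and "sends":

```rust
/// Mimics the buggy tick() control flow: even when the capacity poll is
/// "Pending", `break 'capacity` lets the outer loop proceed to send.
fn buggy_tick_trace(passes: usize) -> Vec<&'static str> {
    let mut trace = Vec::new();
    for _ in 0..passes {
        'capacity: loop {
            trace.push("poll_capacity -> Pending");
            break 'capacity; // exits ONLY the inner 'capacity loop, not tick()
        }
        // Execution falls through here instead of returning Poll::Pending:
        trace.push("send_data despite 0 capacity");
    }
    trace
}

fn main() {
    assert_eq!(
        buggy_tick_trace(2),
        vec![
            "poll_capacity -> Pending",
            "send_data despite 0 capacity",
            "poll_capacity -> Pending",
            "send_data despite 0 capacity",
        ]
    );
}
```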
Why send_data() Accepts Data Without Capacity
h2::SendStream::send_data() does not check or enforce flow control capacity. Looking at h2-0.4.13/src/proto/streams/prioritize.rs L145-222:
```rust
pub fn send_data<B>(
    &mut self,
    frame: frame::Data<B>,
    buffer: &mut Buffer<Frame<B>>,
    stream: &mut store::Ptr,
    counts: &mut Counts,
    task: &mut Option<Waker>,
) -> Result<(), UserError>
where
    B: Buf,
{
    let sz = frame.payload().remaining();
    // ...
    // Unconditionally increases buffered data counter
    stream.buffered_send_data += sz as usize;
    // Implicitly requests more capacity if needed
    if (stream.requested_send_capacity as usize) < stream.buffered_send_data {
        stream.requested_send_capacity = ...;
        self.try_assign_capacity(stream);
    }
    // If no flow control window available, just buffers the frame (no error!)
    if stream.send_flow.available() > 0 || stream.buffered_send_data == 0 {
        self.queue_frame(frame.into(), buffer, stream, task);
    } else {
        stream.pending_send.push_back(buffer, frame.into()); // <-- buffered indefinitely
    }
    Ok(()) // <-- always succeeds
}
```
The capacity check is done exclusively by poll_capacity(), which returns the value from stream.capacity(max_buffer_size):
```rust
// h2-0.4.13/src/proto/streams/stream.rs L275-279
pub fn capacity(&self, max_buffer_size: usize) -> WindowSize {
    let available = self.send_flow.available().as_size() as usize;
    let buffered = self.buffered_send_data;
    available.min(max_buffer_size).saturating_sub(buffered) as WindowSize
}
```
When buffered_send_data >= max_buffer_size, capacity() returns 0, and poll_capacity() returns Pending. This is the only backpressure mechanism — and hyper's break 'capacity bypasses it entirely.
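The formula is easy to sanity-check. The following standalone re-implementation (illustrative only; the 400 KB max_buffer_size is a hypothetical value, and WindowSize is simplified to usize) shows capacity reaching 0 once buffered data catches up:

```rust
/// Standalone re-implementation of the capacity formula quoted above.
fn capacity(available: usize, max_buffer_size: usize, buffered: usize) -> usize {
    available.min(max_buffer_size).saturating_sub(buffered)
}

fn main() {
    // Large flow control window, nothing buffered: bounded by max_buffer_size.
    assert_eq!(capacity(1 << 20, 400 * 1024, 0), 400 * 1024);
    // Buffered data eats into the allowance.
    assert_eq!(capacity(1 << 20, 400 * 1024, 300 * 1024), 100 * 1024);
    // buffered >= max_buffer_size: capacity is 0, so poll_capacity() returns Pending.
    assert_eq!(capacity(1 << 20, 400 * 1024, 400 * 1024), 0);
    assert_eq!(capacity(1 << 20, 400 * 1024, 500 * 1024), 0); // saturating_sub floors at 0
}
```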
The Feedback Loop
Once the bug triggers, a destructive feedback loop forms:
TCP write to downstream blocks (slow client / network congestion)
→ h2 connection cannot flush DATA frames to TCP
→ h2 stream flow control window fills up
→ poll_capacity() returns Pending (capacity == 0)
→ tick() hits `break 'capacity` (BUG: should return Pending)
→ tick() continues to rx.poll_next() → gets data from channel
→ tick() calls send_data() → data goes into unbounded h2 per-stream buffer
→ channel slot is now empty → poll_write returns Ready
→ upstream TCP read continues (no backpressure!)
→ more data read from upstream → sent to channel → repeat
→ h2 send buffer grows WITHOUT BOUND
→ memory consumption: upstream_bandwidth × time_blocked → OOM
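The growth can be modeled numerically. This toy simulation (illustrative constants, not hyper code) contrasts the buggy always-forward behavior with the gated behavior the capacity check is supposed to enforce:

```rust
/// Simulate forwarding `writes` fixed-size chunks into the h2 send buffer
/// while downstream is fully blocked (nothing drains).
fn buffered_after(writes: u64, chunk: u64, max_buffer: u64, buggy: bool) -> u64 {
    let mut buffered: u64 = 0;
    for _ in 0..writes {
        if buggy {
            buffered += chunk; // bug: send_data() accepts the frame regardless of capacity
        } else if buffered + chunk <= max_buffer {
            buffered += chunk; // fix: poll_capacity() gates the write, else Pending
        }
    }
    buffered
}

fn main() {
    let chunk = 8 * 1024;        // 8 KB writes, as in the reproduction
    let max_buffer = 400 * 1024; // a hypothetical max_send_buffer_size
    // Fixed: growth stops at max_buffer no matter how long downstream stays blocked.
    assert_eq!(buffered_after(1_000_000, chunk, max_buffer, false), max_buffer);
    // Buggy: growth is linear in writes; 1M × 8 KB ≈ 8 GB, the observed OOM scale.
    assert_eq!(buffered_after(1_000_000, chunk, max_buffer, true), 1_000_000 * chunk);
}
```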
Comparison with Correct Implementation (hyper ≤ v1.7.0)
In hyper v1.0.0 through v1.7.0, H2Upgraded::poll_write directly interacts with h2 SendStream without any intermediate channel or spawned task:
```rust
// hyper v1.6.0, src/proto/h2/mod.rs L317-352
impl<B> Write for H2Upgraded<B>
where
    B: Buf,
{
    fn poll_write(
        mut self: Pin<&mut Self>,
        cx: &mut Context<'_>,
        buf: &[u8],
    ) -> Poll<Result<usize, std::io::Error>> {
        if buf.is_empty() {
            return Poll::Ready(Ok(0));
        }
        self.send_stream.reserve_capacity(buf.len());
        let cnt = match ready!(self.send_stream.poll_capacity(cx)) {
            // ^^^^^^ -- KEY: `ready!()` macro returns Poll::Pending if not ready
            None => Some(0),
            Some(Ok(cnt)) => self
                .send_stream
                .write(&buf[..cnt], false)
                .ok()
                .map(|()| cnt),
            Some(Err(_)) => None,
        };
        // ...
    }
}
```
Why the Old Implementation Is Correct
- ready!() macro: When poll_capacity() returns Poll::Pending, the ready!() macro immediately returns Poll::Pending from poll_write. This is the correct behavior — it tells the caller (the bidirectional copy loop) to stop reading from upstream until h2 has capacity to send.
- Direct flow control: There is no intermediate buffer between the caller and h2. The caller writes directly into h2's SendStream::write(), which only accepts exactly as many bytes as poll_capacity() reported available.
- Tight backpressure coupling: The poll_write caller, h2 flow control, and TCP write are all in the same task, with no channel indirection that could break the backpressure chain.
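The ready!() propagation can be illustrated with toy poll functions (hypothetical stand-ins, not the hyper/h2 API; None models "window full"):

```rust
use std::task::Poll;

/// Toy stand-in for poll_capacity: None models an exhausted send window.
fn poll_capacity(window: Option<usize>) -> Poll<usize> {
    match window {
        Some(n) => Poll::Ready(n),
        None => Poll::Pending,
    }
}

/// Toy stand-in for the old poll_write: ready!() makes Pending propagate.
fn poll_write(window: Option<usize>, buf_len: usize) -> Poll<usize> {
    // std::task::ready! early-returns Poll::Pending from this function
    // when the inner poll is not ready.
    let cnt = std::task::ready!(poll_capacity(window));
    Poll::Ready(cnt.min(buf_len)) // write at most the reported capacity
}

fn main() {
    // Capacity available: the write proceeds, bounded by the reported count.
    assert_eq!(poll_write(Some(16 * 1024), 64 * 1024), Poll::Ready(16 * 1024));
    // No capacity: Pending propagates to the caller, which stops reading upstream.
    assert_eq!(poll_write(None, 64 * 1024), Poll::Pending);
}
```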
Why the New Implementation Broke
The refactoring in PR #3967 decoupled the IO path into two separate tasks connected by an mpsc::channel(1):
[poll_write caller] --mpsc::channel(1)--> [UpgradedSendStreamTask] --h2 SendStream--> [TCP]
The intent was that mpsc::channel(1) would naturally provide backpressure: when the send task can't flush to h2, the channel stays full, causing poll_write to return Pending. However, the break 'capacity bug means the send task always drains the channel regardless of h2 capacity, so the channel is effectively always empty, and poll_write is effectively never blocked.
How to Reproduce
Minimal Reproduction Setup
- An HTTP/2 server that accepts CONNECT requests and upgrades them (the proxy)
- An upstream TCP target that sends data rapidly
- A downstream HTTP/2 client whose TCP connection becomes slow/blocked
Reproduction Steps
- Establish an HTTP/2 CONNECT tunnel through the proxy
- Have the upstream target send data at high rate (e.g., dd if=/dev/urandom | nc)
- Throttle or block the downstream client's TCP read (e.g., using tc netem or simply not reading from the client socket)
- Observe: the proxy's memory will grow linearly with upstream_rate × time_blocked, without any bound
Reproduction Pseudo-Code
```rust
// 1. Start a hyper HTTP/2 server with CONNECT support (using default config)
let builder = http2::Builder::new(TokioExecutor::new());
// ... serve_connection with a handler that connects upstream and does bidirectional copy

// 2. Upstream: rapidly send data
let upstream = TcpListener::bind("127.0.0.1:9000").await?;
tokio::spawn(async move {
    let (mut stream, _) = upstream.accept().await.unwrap();
    let data = vec![0u8; 8192];
    loop {
        stream.write_all(&data).await.unwrap();
    }
});

// 3. Client: establish H2 CONNECT, then stop reading
let (mut send_request, conn) = hyper::client::conn::http2::Builder::new(TokioExecutor::new())
    .handshake(io)
    .await?;
let req = Request::connect("127.0.0.1:9000").body(Empty::new())?;
let res = send_request.send_request(req).await?;
let upgraded = hyper::upgrade::on(res).await?;

// 4. Simply stop reading from `upgraded` — simulate slow/blocked downstream
tokio::time::sleep(Duration::from_secs(120)).await;
// After 120s, observe proxy memory: should have grown by upstream_rate * 120s
```
Observable Symptoms
- Memory: grows linearly over time (rate = upstream data throughput)
- CPU: elevated due to tick() hot-looping (it re-polls on every channel receive)
- Network: inbound traffic sustained, outbound traffic drops to near zero
- Tokio tasks: stable count (no new tasks spawned — the bug is within a single task's loop)
Suggested Fix
Replace break 'capacity with return Poll::Pending at L98 of src/proto/h2/upgrade.rs:
```rust
// Before (BUG):
Poll::Pending => break 'capacity,

// After (FIX):
Poll::Pending => return Poll::Pending,
```
This ensures that when h2 has no flow control capacity, the entire tick() function returns Pending, the mpsc channel slot stays occupied, and H2Upgraded::poll_write also returns Pending — restoring the backpressure chain.
Verification
After the fix, the backpressure chain becomes:
h2 flow control full → poll_capacity returns Pending
→ tick() returns Pending → channel slot stays full
→ poll_write returns Pending → copy loop stops reading upstream
→ memory stays bounded at O(buffer_size) per tunnel
Workaround
- Downgrade to hyper v1.7.0: The old implementation has correct backpressure. However, it relies on unsafe transmute with Neutered<B> (a repr(transparent) uninhabited type), which may break with future Rust compiler versions (see rust-lang/rust#147588).
- Memory limits with OOM monitoring: Ensure resource limits are set and OOM events trigger alerts and automatic restarts. This does not prevent the bug but limits blast radius.
Expected Behavior
poll_capacity() backpressure takes effect: no data is pushed into the h2 send buffer while the h2 send window capacity is 0.
Actual Behavior
Unbounded memory growth in the h2 per-stream send buffer, leading to OOM.
Additional Context
Production Incident
Environment: Kubernetes cluster, Linux x86_64
Time: 21:15:30 ~ 21:16:09
Component: HTTP/2 CONNECT tunnel proxy (server side)
Timeline
| Time | NIC Inbound (target→proxy) | NIC Outbound (proxy→client) | RSS Memory | Notes |
| --- | --- | --- | --- | --- |
| 21:15:30 | 705 Mbps | 705 Mbps | 165 MB | Inbound/outbound perfectly balanced, all normal |
| 21:15:30+ | Climbing | Slowing, falling behind inbound | Rising fast | Backpressure chain breaks: downstream h2 send rate can't keep up, delta accumulates in memory |
| 21:15:45 | 1.36 Gbps | 892 Mbps | — | Delta reaches 468 Mbps (≈58.5 MB/s accumulating into h2 send buffer) |
| 21:16:00 | 1.55 Gbps | 89 Mbps | 4.51 GB | Outbound nearly collapsed (severe downstream TCP congestion), inbound still at 1.55 Gbps, all buffered in memory |
| 21:16:09 | — | — | > 8 GB | RSS exceeds 8Gi limit, OOMKilled |
Key Observations
- From 21:15:30 (balanced) to 21:16:09 (OOM), the entire incident lasted only 39 seconds
- Memory grew from 165MB to 8GB+, delta ≈ 7.8GB, average accumulation rate ≈ 200 MB/s
- Two proxy instances OOMed sequentially (resource limit: memory: 8Gi)
Key Metrics During Incident
| Metric | Observation |
| --- | --- |
| Tunnel count | < 500 (no spike) |
| TCP connections | < 2000 (no spike) |
| Tokio task count | ~6000, no change (no new task spawns) |
| NIC inbound | Climbed from 705 Mbps to 1.55 Gbps |
| NIC outbound | Climbed slowly to 892 Mbps, then collapsed to 89 Mbps |
| CPU | Elevated (tick hot-loop consuming CPU) |
Precise Correlation with Bug Mechanism
The timeline above perfectly matches the theoretical model of break 'capacity backpressure bypass:
Phase 1: Backpressure Chain Breaks (21:15:30 ~ 21:15:45)
At 21:15:30, inbound/outbound are balanced (705 Mbps), meaning h2 flow control and TCP write capacity could fully absorb upstream data. Then upstream traffic climbs while downstream TCP begins to congest (possibly client slowdown or network jitter), and h2 flow control windows gradually fill up. Under correct backpressure, poll_capacity() returning Pending should stop upstream reads, causing inbound traffic to also drop. But break 'capacity causes tick() to keep draining the channel and calling send_data(), so upstream reads continue unblocked and inbound traffic keeps climbing.
Phase 2: Vicious Cycle Acceleration (21:15:45 ~ 21:16:00)
The inbound-outbound delta widens from 468 Mbps to 1.46 Gbps (1.55 - 0.089). Outbound plummets from 892 Mbps to 89 Mbps, indicating severe downstream TCP congestion (TCP send buffer full, window shrinking). Yet inbound traffic keeps rising — precisely because the backpressure chain is completely broken: poll_write always returns Ready, so the copy loop reads from upstream TCP at full speed. All net inbound data accumulates in h2 per-stream send buffers.
Phase 3: OOM (21:16:00 ~ 21:16:09)
In the 9 seconds from 21:16:00 to 21:16:09, inbound ≈ 1.55 Gbps (193.75 MB/s), net accumulation rate ≈ (1.55 - 0.089) Gbps ≈ 182 MB/s. 9 seconds of accumulation ≈ 1.64 GB. Adding the 4.51 GB already accumulated by 21:16:00: 4.51 + 1.64 = 6.15 GB. Accounting for jemalloc arena overhead and fragmentation amplification (typically 1.2-1.5x), actual RSS reaches 7.4 ~ 9.2 GB, precisely hitting the 8Gi OOM threshold.
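The Phase 3 arithmetic can be spot-checked directly (using 1 Gbps = 125 MB/s):

```rust
fn main() {
    let inbound_gbps = 1.55_f64;
    let outbound_gbps = 0.089_f64;
    // Net accumulation rate in MB/s.
    let net_mb_per_s = (inbound_gbps - outbound_gbps) * 125.0;
    assert!((net_mb_per_s - 182.6).abs() < 0.5); // ≈182.6 MB/s, matching the text

    // 9 seconds of accumulation on top of the 4.51 GB already buffered at 21:16:00.
    let total_gb = 4.51 + net_mb_per_s * 9.0 / 1000.0;
    assert!((total_gb - 6.15).abs() < 0.01); // ≈6.15 GB before allocator overhead

    // A 1.2-1.5x allocator overhead factor brackets the 8 GiB limit.
    assert!(total_gb * 1.2 < 8.0 && total_gb * 1.5 > 8.0);
}
```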
Why 500 Tunnels Can Cause 8GB OOM
Without the bug, each tunnel's memory is bounded: ~30KB (2×8KB buffers + h2 overhead).
With the bug, each affected tunnel's memory is unbounded: upstream_bandwidth × time_blocked.
Only a simple combination of conditions is needed: fast upstream target (e.g., CDN, large file download) + slow downstream client (e.g., GC pause, network jitter, processing delay). In this incident, peak inbound traffic reached 1.55 Gbps, and at this rate, 39 seconds was enough to fill 8GB of memory.
Even a single tunnel with a fast upstream and blocked downstream can cause OOM.