Enable GPU depth sorting for 2D sprite rendering in WebGL #1370

@obiot

Summary

Add proper GPU-based z-sorting for 2D sprite rendering in WebGL, using the hardware depth buffer instead of CPU-based array sorting. This should remain opt-in, since CPU sorting (the current default) is required for isometric, hexagonal, and other non-orthogonal rendering modes where draw order depends on spatial position rather than a simple z value.

Current State

The depthTest: "z-buffer" application setting already exists and:

  • Enables gl.DEPTH_TEST with gl.LEQUAL and gl.depthMask(true) (webgl_renderer.js:158-162)
  • Disables CPU-based autoSort on the world container (application.ts:387-389)

But it doesn't actually work — the vertex shaders hardcode z to 0.0:

// quad.vert — z is always 0.0
gl_Position = uProjectionMatrix * vec4(aVertex, 0.0, 1.0);
// primitive.vert — z is always 0.0
gl_Position = uProjectionMatrix * vec4(position, 0.0, 1.0);

The vertex attribute aVertex is a vec2 (x, y only), and addQuad() pushes (x, y, u, v, tint) with no z component. So all sprites write the same depth value and the depth buffer has no effect.
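
To illustrate why identical depth values make the depth test a no-op, here is a minimal sketch that emulates a one-pixel depth buffer with the LEQUAL comparison in plain JavaScript. The function resolvePixel and the sprite names are hypothetical and not part of the engine; a clip-space z of 0.0 lands at a window depth of 0.5 under the default depth range, which is the value used below.

```javascript
// Emulates a single-pixel depth buffer using the LEQUAL comparison.
// The stored depth starts at 1.0, the default WebGL clear depth.
function resolvePixel(fragments) {
    let storedDepth = 1.0;
    let visible = null;
    for (const frag of fragments) {
        // gl.LEQUAL: the incoming fragment passes if depth <= stored depth
        if (frag.depth <= storedDepth) {
            storedDepth = frag.depth; // depthMask(true): write the new depth
            visible = frag.name;
        }
    }
    return visible;
}

// Same depth everywhere (hardcoded z): every fragment passes, so the
// last sprite submitted wins and only draw order matters.
const sameDepth = [
    { name: "foreground", depth: 0.5 },
    { name: "background", depth: 0.5 },
];

// Distinct depths: the smallest depth wins regardless of draw order.
const distinctDepth = [
    { name: "foreground", depth: 0.25 },
    { name: "background", depth: 0.75 },
];

console.log(resolvePixel(sameDepth));
console.log(resolvePixel(distinctDepth));
```

With sameDepth, the sprite drawn last ends up visible even though it is meant to be behind, which is the symptom described above once autoSort is disabled.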

What Needs to Change

1. Pass z to the vertex shader

Expand the vertex attribute from vec2 to vec3 (or add a separate aDepth float attribute):

// quad.vert
attribute vec3 aVertex;  // was vec2
void main(void) {
    gl_Position = uProjectionMatrix * vec4(aVertex, 1.0);
    // z is now the renderable's depth value
}

2. Feed z into the batcher

addQuad() needs to receive and push a z/depth value per vertex:

// quad_batcher.js — addQuad currently pushes:
vertexData.push(vec0.x, vec0.y, u0, v0, tint);

// should become (when z-buffer mode is active):
vertexData.push(vec0.x, vec0.y, depth, u0, v0, tint);

The depth value comes from the renderable's pos.z (or a normalized version mapped to the camera's near/far range).
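
As a rough sketch of the layout change, the helper below appends one quad to a plain array with or without a depth component. pushQuad, its parameters, and the useDepth flag are hypothetical names for illustration; the real addQuad() writes into the batcher's vertex buffer and its signature may differ.

```javascript
// Hypothetical helper, not the engine's addQuad(): appends one quad
// (4 vertices) to a plain array, optionally including a depth value.
function pushQuad(vertexData, corners, uvs, tint, depth, useDepth) {
    for (let i = 0; i < 4; i++) {
        const { x, y } = corners[i];
        const { u, v } = uvs[i];
        if (useDepth) {
            // z-buffer mode: x, y, depth, u, v, tint (6 values per vertex)
            vertexData.push(x, y, depth, u, v, tint);
        } else {
            // current layout: x, y, u, v, tint (5 values per vertex)
            vertexData.push(x, y, u, v, tint);
        }
    }
    return vertexData;
}
```

All four vertices of a quad share the same depth, so a sprite occupies a single plane in the depth buffer.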

3. Normalize z values to clip space

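The projection matrix (camera2d.ts:288) already defines near/far planes (this.near = -1000, this.far = 1000). The renderable's pos.z needs to be mapped from that [near, far] interval into the depth range: [-1, 1] in normalized device coordinates, which WebGL converts to [0, 1] when writing to the depth buffer (with the default depth range):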

// normalize z to [0, 1] for the depth buffer
const depth = (pos.z - camera.near) / (camera.far - camera.near);
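
A slightly more defensive version of the same mapping might look like the sketch below. normalizeDepth is a hypothetical helper; clamping out-of-range values is an assumption, since the issue does not say how z values outside [near, far] should be treated.

```javascript
// Maps a world-space z into [0, 1] given the near/far planes,
// clamping values that fall outside the range.
function normalizeDepth(z, near, far) {
    if (far === near) {
        throw new RangeError("near and far must differ");
    }
    const t = (z - near) / (far - near);
    return Math.min(1, Math.max(0, t));
}

// Example with the camera defaults quoted above (near = -1000, far = 1000)
console.log(normalizeDepth(0, -1000, 1000));
```

Two points would need checking against the actual projection matrix. First, if z is still multiplied by uProjectionMatrix in the shader, the matrix may already perform this near/far mapping, in which case the raw pos.z should be passed instead. Second, with this mapping a larger z yields a larger depth, which gl.LEQUAL treats as farther away; if a higher pos.z should draw on top, the value would have to be inverted (1 - t) or the sign convention of the projection relied upon.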

4. Handle alpha / transparency

The depth buffer doesn't handle semi-transparent sprites correctly — a transparent pixel that writes to the depth buffer will occlude sprites behind it. Two common solutions:

  • Alpha test in fragment shader: if (gl_FragColor.a < threshold) discard; — works for fully opaque sprites with transparent edges (pixel art)
  • Hybrid approach: render opaque sprites front-to-back with depth write, then render transparent sprites back-to-front without depth write (more complex but correct for semi-transparent sprites)

For a first implementation, the alpha discard approach is simplest and covers most 2D game use cases (pixel art, clean sprite edges).
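
For reference, the hybrid approach could be sketched roughly as follows. buildRenderPasses and the alpha and z fields are illustrative assumptions; a per-sprite alpha value does not capture semi-transparent texels inside a texture, so a real implementation would need some other way to classify sprites.

```javascript
// Splits sprites into an opaque pass (front-to-back, depth write on)
// and a transparent pass (back-to-front, depth write off).
// Assumes a higher `z` means closer to the viewer.
function buildRenderPasses(sprites) {
    const opaque = sprites.filter((s) => s.alpha >= 1);
    const transparent = sprites.filter((s) => s.alpha < 1);
    opaque.sort((a, b) => b.z - a.z);      // closest first
    transparent.sort((a, b) => a.z - b.z); // farthest first
    return { opaque, transparent };
}
```

Note that the transparent pass still needs a CPU sort, so this variant only removes sorting cost for the opaque sprites.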

5. Update all batchers

  • quad.vert — expand aVertex to vec3
  • primitive.vert — expand aVertex to vec3
  • QuadBatcher.addQuad() — accept and push z component
  • PrimitiveBatcher — same treatment
  • Vertex attribute definitions: update aVertex from size: 2 to size: 3
  • Stride calculations will update automatically from the attribute sizes
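
To make the stride change concrete, here is a small sketch that derives offsets and stride from a list of attribute sizes. computeLayout, the attribute names aRegion and aColor, and the assumption that tint is packed into 4 unsigned bytes are illustrative; the batcher's actual attribute definitions may differ.

```javascript
const BYTES_PER_TYPE = { float: 4, ubyte: 1 };

// Assigns a byte offset to each attribute and returns the total stride.
function computeLayout(attributes) {
    let offset = 0;
    const laidOut = attributes.map((attr) => {
        const entry = { ...attr, offset };
        offset += attr.size * BYTES_PER_TYPE[attr.type];
        return entry;
    });
    return { attributes: laidOut, stride: offset };
}

// Assumed current layout: vec2 position, vec2 uv, packed tint
const before = computeLayout([
    { name: "aVertex", size: 2, type: "float" },
    { name: "aRegion", size: 2, type: "float" },
    { name: "aColor", size: 4, type: "ubyte" },
]);

// Same layout with aVertex expanded to vec3
const after = computeLayout([
    { name: "aVertex", size: 3, type: "float" },
    { name: "aRegion", size: 2, type: "float" },
    { name: "aColor", size: 4, type: "ubyte" },
]);

console.log(before.stride, after.stride);
```

Under these assumptions the stride grows from 20 to 24 bytes per vertex, and every attribute after aVertex shifts by 4 bytes.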

6. Update the renderer pipeline

The renderer needs to pass the current renderable's z value through to the batcher. This could flow through:

  • renderer.currentZ set during preDraw() from the renderable's pos.z
  • Or passed explicitly to drawImage() / addQuad()
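
The first option could look something like this sketch. SketchRenderer and pushVertex are hypothetical stand-ins, not the engine's classes; only preDraw(), currentZ, and pos.z come from the proposal above.

```javascript
// Minimal stand-in for a renderer that carries the current z as state.
class SketchRenderer {
    constructor() {
        this.currentZ = 0;
        this.vertexData = [];
    }

    // Called before each renderable is drawn: capture its z value.
    preDraw(renderable) {
        this.currentZ = renderable.pos.z;
    }

    // Any draw call issued afterwards picks up the captured z.
    pushVertex(x, y) {
        this.vertexData.push(x, y, this.currentZ);
    }
}
```

Keeping z as renderer state avoids changing the drawImage() signature, at the cost of an implicit dependency on preDraw() having been called for the right renderable.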

7. Automatic fallback to CPU sorting

GPU depth sorting only works when containers sort on the z axis. When a container uses a different sortOn value (e.g. "y" for isometric games, "x" for side-scrollers), draw order depends on spatial position and cannot be resolved by the depth buffer alone.

The engine should automatically fall back to CPU sorting per container when sortOn is not "z":

// in Container or Application init:
// GPU depth sorting is only valid when sortOn === "z"
if (depthTest === "z-buffer" && this.sortOn !== "z") {
    // override: force CPU sorting for this container
    this.autoSort = true;
}

This means:

  • depthTest: "z-buffer" + sortOn: "z" → GPU depth sorting, autoSort disabled
  • depthTest: "z-buffer" + sortOn: "y" or "x" → CPU sorting stays active for that container
  • depthTest: "sorting" (default) → always CPU sorting regardless of sortOn

This should also be checked dynamically — if a developer changes sortOn at runtime on a container, autoSort should be re-evaluated accordingly. Nested containers can have different sortOn values, so the per-container granularity matters.
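
One possible shape for the dynamic re-evaluation is a sortOn setter, sketched below. SketchContainer and updateAutoSort are hypothetical; how the container actually learns about the depthTest setting is left open here and passed in through the constructor for simplicity.

```javascript
// Stand-in container that re-evaluates autoSort whenever sortOn changes.
class SketchContainer {
    constructor(depthTest, sortOn = "z") {
        this.depthTest = depthTest;
        this._sortOn = sortOn;
        this.autoSort = true;
        this.updateAutoSort();
    }

    get sortOn() {
        return this._sortOn;
    }

    set sortOn(value) {
        this._sortOn = value;
        this.updateAutoSort();
    }

    // GPU depth sorting is only valid for "z-buffer" combined with "z";
    // every other combination keeps CPU sorting active.
    updateAutoSort() {
        const gpuSorted = this.depthTest === "z-buffer" && this._sortOn === "z";
        this.autoSort = !gpuSorted;
    }
}

const world = new SketchContainer("z-buffer");
world.sortOn = "y"; // switches this container back to CPU sorting
```

Because each container holds its own state, nested containers with different sortOn values resolve independently.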

ShaderEffect / GLShader Backward Compatibility

Changing aVertex from vec2 to vec3 and updating the vertex buffer layout is fully backward compatible with existing ShaderEffect and custom GLShader usage:

  1. ShaderEffect uses the built-in quad.vert (shadereffect.js:74) — updating the vertex shader automatically propagates to all ShaderEffect instances. No user code changes needed.

  2. The batcher owns the vertex layout, not the shader. useShader() in batcher.js:223 always calls shader.setVertexAttributes(gl, this.attributes, this.stride) with the batcher's own attribute definitions. So any shader (built-in or custom) is told the correct offsets, sizes, and stride by the batcher. The shader doesn't independently decide the layout.

  3. User fragment code is untouched. ShaderEffect users only write an apply(vec4 color, vec2 uv) function. They never reference aVertex or any vertex attributes. The varyings they receive (vColor, vRegion) are unchanged.

  4. Direct GLShader users: even if a custom vertex shader declares attribute vec2 aVertex, the attribute pointer set by the batcher (setVertexAttributes) determines the offset and stride used to fetch data from the buffer. With the batcher's definition at size: 3, a vec2 attribute still receives x and y correctly; the z component is present in the buffer but ignored by shaders that don't declare it. Such a shader keeps writing its own constant z, so its output would not be depth-sorted until the shader is updated, but nothing breaks.

No migration or breaking changes for any shader users.

When NOT to Use GPU Depth Sorting

CPU sorting (depthTest: "sorting", the default) must remain available and is required when:

  • Isometric games — draw order depends on Y position, not z value (sortOn: "y")
  • Hexagonal maps — complex tile overlap rules
  • Custom sort orders (sortOn: "x" or user-defined sort functions)
  • Semi-transparent overlapping sprites — require strict back-to-front order
  • Canvas renderer — no GPU depth buffer available

Configuration

The existing depthTest setting already covers this:

const app = new Application({
    depthTest: "z-buffer",  // GPU depth sorting (opt-in)
    // depthTest: "sorting", // CPU array sort (default)
});

No new settings needed — just make the existing "z-buffer" option actually work. The automatic fallback when sortOn !== "z" ensures correctness without additional configuration.

Performance Impact

When enabled, GPU depth sorting should:

  • Eliminate deferred CPU sort calls on containers (container.sort() with defer())
  • Reduce JavaScript overhead on scenes with many sprites changing z-order
  • Allow the GPU to handle draw order natively
  • Trade-off: slightly larger vertex buffer (one extra float per vertex)

References

  • depthTest setting: src/application/settings.ts:53
  • autoSort disabled for z-buffer: src/application/application.ts:386-389
  • WebGL depth state setup: src/video/webgl/webgl_renderer.js:155-166
  • Quad vertex shader: src/video/webgl/shaders/quad.vert
  • Primitive vertex shader: src/video/webgl/shaders/primitive.vert
  • QuadBatcher.addQuad(): src/video/webgl/batchers/quad_batcher.js:141
  • Batcher vertex attribute binding: src/video/webgl/batchers/batcher.js:214-233
  • ShaderEffect uses quadVertex: src/video/webgl/shadereffect.js:74
  • Container CPU sorting: src/renderable/container.js:793-813
  • Container sortOn property: src/renderable/container.js:86-91
  • Camera near/far: src/camera/camera2d.ts:99-105
