ENH: SurfaceNets memory optimization via SoA layout #1546

Open
imikejackson wants to merge 1 commit into BlueQuartzSoftware:develop from imikejackson:topic/surface_nets_optimization

@imikejackson imikejackson commented Feb 24, 2026

Summary

  • Replace dense Cell struct array (16 bytes/cell) with Structure-of-Arrays layout (12 bytes/cell), saving ~25% cell map memory
  • Replace Vertex::cellIndex[3] (12 bytes) with flat index (8 bytes), saving ~33% vertex array memory
  • Merge init and setCellVertices into a single O(N³) pass, eliminating a redundant traversal
  • Optimize interior cell label lookups with precomputed corner offsets and direct array indexing
  • Cache edge crossing count during cell setup, eliminating the counting pass in SurfaceNets.cpp
  • Use local vertex buffer during relaxation iterations for better cache locality
  • Remove dead code: MMGeometryOBJ, MMGeometryGL, and unused methods (net -566 lines)
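
To make the byte counts in the first two bullets concrete, here is a minimal sketch of the layout change. The type and member names (`CellAoS`, `CellMapSoA`, `OldVertexCellRef`, `NewVertexCellRef`) and the field widths (a 4-byte flag plus an 8-byte vertex index) are assumptions chosen to reproduce the 16 → 12 and 12 → 8 figures on a typical 64-bit ABI; they are not taken from the actual SurfaceNets sources.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical "before" layout: one struct per cell. The 4-byte flag is
// followed by 4 bytes of padding so the 8-byte index stays 8-byte aligned,
// which gives 16 bytes per cell on typical 64-bit ABIs.
struct CellAoS
{
  std::uint32_t flag;
  std::uint64_t vertexIndex;
};

// Hypothetical "after" layout: two parallel arrays indexed by the same cell
// index. Each array is tightly packed, so there is no per-cell padding.
struct CellMapSoA
{
  std::vector<std::uint32_t> flags;
  std::vector<std::uint64_t> vertexIndices;

  explicit CellMapSoA(std::size_t numCells)
  : flags(numCells, 0)
  , vertexIndices(numCells, 0)
  {
  }

  // Payload bytes stored per cell across both arrays: 4 + 8 = 12.
  static constexpr std::size_t bytesPerCell()
  {
    return sizeof(std::uint32_t) + sizeof(std::uint64_t);
  }
};

// Hypothetical vertex back-reference, before and after: three 32-bit (i, j, k)
// components versus one 64-bit flat cell index.
struct OldVertexCellRef
{
  std::int32_t cellIndex[3];
};
using NewVertexCellRef = std::uint64_t;

static_assert(sizeof(OldVertexCellRef) == 12, "three 32-bit components");
static_assert(sizeof(NewVertexCellRef) == 8, "one 64-bit flat index");
```

Under these assumptions the saving per cell is exactly the padding that the struct layout forced, which is why the SoA form helps even though the same two fields are stored.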

Test plan

  • All 4 SurfaceNets tests pass (Default, Smoothing, Winding, Winding Smoothing)
  • Output is identical to existing exemplar data (319,447 vertices, 668,786 triangles)
  • Performance comparison on larger dataset (in progress in separate session)

Performance Comparison NO SMOOTHING

561 x 381 x 528 3D EBSD Stack
Before: 9.46 GB used and 1:14 filter time
After: 9.20 GB used and 1:06 filter time

Smoothing

After: 12.80 GB used (as reported by DREAM3D-NX itself)

This may not really be worth it. The best we could do is a templated version selected by the total number of voxels: if the total is < UINT32_MAX, we could shave the memory down by a large margin. The issue is that the 64-bit-clean implementation needs size_t indices stored, which just takes a bunch of memory.
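
One possible shape for that templated version, as a rough sketch only: the cell map is parameterized on its index type and the width is picked at run time from the voxel count. `CellMap`, `fitsIn32Bit`, and `cellMapBytes` are hypothetical names, and treating the number of cells as equal to the number of voxels is a simplifying assumption.

```cpp
#include <cstddef>
#include <cstdint>
#include <limits>
#include <vector>

// Hypothetical cell map whose stored vertex index width is a template parameter.
template <typename IndexT>
struct CellMap
{
  std::vector<std::uint32_t> flags;
  std::vector<IndexT> vertexIndices;

  explicit CellMap(std::size_t numCells)
  : flags(numCells, 0)
  , vertexIndices(numCells, 0)
  {
  }

  static constexpr std::size_t bytesPerCell()
  {
    return sizeof(std::uint32_t) + sizeof(IndexT);
  }
};

// True when every index (and a reserved "invalid" sentinel at UINT32_MAX)
// fits in 32 bits.
constexpr bool fitsIn32Bit(std::uint64_t totalVoxels)
{
  return totalVoxels < std::numeric_limits<std::uint32_t>::max();
}

// Estimated cell map size for the index width that would be selected.
// Assumption: one cell per voxel.
constexpr std::uint64_t cellMapBytes(std::uint64_t totalVoxels)
{
  return fitsIn32Bit(totalVoxels) ? totalVoxels * CellMap<std::uint32_t>::bytesPerCell()
                                  : totalVoxels * CellMap<std::uint64_t>::bytesPerCell();
}
```

For the 561 x 381 x 528 stack above (112,855,248 voxels) the 32-bit branch would be taken, giving 8 bytes per cell in this sketch instead of 12. The cost is that every routine touching the indices has to be instantiated for both widths.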

Anyone have any thoughts?

…mprovements

Reduce SurfaceNets cell map memory usage by ~25% and eliminate redundant
O(N^3) traversals while producing identical output.

Key changes:
- Replace struct Cell (16 bytes/cell with padding) with separate flag and
  vertex index arrays (12 bytes/cell)
- Replace Vertex::cellIndex[3] (12 bytes) with flat index (8 bytes)
- Merge init() and setCellVertices() into a single pass
- Optimize interior cell label lookups with precomputed corner offsets
- Cache edge crossing count during cell setup
- Use local vertex buffer during relaxation for better cache locality
- Remove dead code: MMGeometryOBJ, MMGeometryGL, unused methods

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
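
The "flat index" and "precomputed corner offsets" items in the commit message can be illustrated with a short sketch. It assumes x-fastest (i, then j, then k) storage and an arbitrary corner ordering; `Dims`, `toFlat`, `fromFlat`, `cornerOffsets`, and `gatherCornerLabels` are hypothetical helpers, not names from the SurfaceNets code.

```cpp
#include <array>
#include <cstdint>

// Hypothetical volume dimensions.
struct Dims
{
  std::uint64_t nx;
  std::uint64_t ny;
  std::uint64_t nz;
};

struct Ijk
{
  std::uint64_t i;
  std::uint64_t j;
  std::uint64_t k;
};

// Collapse (i, j, k) into one index, assuming x varies fastest.
constexpr std::uint64_t toFlat(const Dims& d, std::uint64_t i, std::uint64_t j, std::uint64_t k)
{
  return i + d.nx * (j + d.ny * k);
}

// Recover (i, j, k) from a flat index under the same ordering.
constexpr Ijk fromFlat(const Dims& d, std::uint64_t flat)
{
  return Ijk{flat % d.nx, (flat / d.nx) % d.ny, flat / (d.nx * d.ny)};
}

// Offsets from a cell's base voxel to its 8 corners, computed once per volume.
constexpr std::array<std::uint64_t, 8> cornerOffsets(const Dims& d)
{
  return {0, 1, d.nx, d.nx + 1, d.nx * d.ny, d.nx * d.ny + 1, d.nx * d.ny + d.nx, d.nx * d.ny + d.nx + 1};
}

// Interior-cell lookup: eight direct array reads, no per-corner index math
// or bounds handling. The caller must guarantee the cell is interior.
inline std::array<std::int32_t, 8> gatherCornerLabels(const std::int32_t* labels, std::uint64_t base, const std::array<std::uint64_t, 8>& offsets)
{
  std::array<std::int32_t, 8> out{};
  for(std::size_t c = 0; c < 8; ++c)
  {
    out[c] = labels[base + offsets[c]];
  }
  return out;
}
```

Boundary cells would still need a slower path that checks each corner against the volume extents; only interior cells can use the fixed offsets safely.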
@imikejackson imikejackson requested review from JDuffeyBQ and joeykleingers and removed request for joeykleingers February 24, 2026 21:54