[Autoloop: tsb-perf-evolve] #303
…to reduce GC pressure
Run: https://github.com/githubnext/tsessebe/actions/runs/25735886900
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…e j*2/j*3 multiplications in partition loop
Operator: exploitation | Parent: c042 | Island: 3 (non-comparison/radix)
Replace per-element multiplications in the main partition loop with stride counters (fsi += 2, rxBase += 3). This eliminates 2 integer multiplications per non-NaN numeric element (~190k multiplications for n=100k, 5% NaN). Also removes the redundant typeof guard from the NaN check, saving one branch per element in the common all-numeric case.
Run: https://github.com/githubnext/tsessebe/actions/runs/25784510541
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
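The stride-counter idea in this commit message can be illustrated with a minimal sketch. Only the counter names fsi and rxBase come from the commit text; the buffer layouts, the function name partitionWithStrides, and the result shape are assumptions made for illustration and are not taken from the actual sortValues code.

```typescript
// Sketch only: fsi and rxBase are named in the commit message; the buffer
// layouts and every other identifier here are hypothetical.
interface PartitionResult {
  pairs: Float64Array;   // assumed 2-slot records: [value, sourceIndex]
  triples: Float64Array; // assumed 3-slot records: [value, sourceIndex, rank]
  count: number;         // number of non-NaN elements written
  nanCount: number;      // number of NaN elements skipped
}

function partitionWithStrides(values: readonly number[]): PartitionResult {
  const n = values.length;
  const pairs = new Float64Array(n * 2);
  const triples = new Float64Array(n * 3);
  let j = 0;      // count of non-NaN elements seen so far
  let fsi = 0;    // always equals j * 2, maintained by addition
  let rxBase = 0; // always equals j * 3, maintained by addition
  let nanCount = 0;

  for (let i = 0; i < n; i++) {
    const v = values[i]!;
    // NaN is the only number that is not equal to itself, so no typeof
    // guard is needed when the input is already typed as number.
    if (v !== v) {
      nanCount++;
      continue;
    }
    pairs[fsi] = v;
    pairs[fsi + 1] = i;
    triples[rxBase] = v;
    triples[rxBase + 1] = i;
    triples[rxBase + 2] = j;
    // Advance the strides instead of recomputing j * 2 and j * 3.
    fsi += 2;
    rxBase += 3;
    j++;
  }
  return { pairs, triples, count: j, nanCount };
}
```

Whether this saves measurable time depends on the JIT, which may already strength-reduce such multiplications; the sketch only shows the shape of the change.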
🌱 Evergreen: Merged
…tructors
When _permBuf / _outBuf were larger than n (from a prior larger sort), [...perm] and [...outData] in the Index/Series constructors spread all elements, including stale tail entries, producing results with the wrong length. Truncating to n via .length = n ensures only the n written elements are spread.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
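The stale-tail problem described in this commit can be reproduced in isolation. The following is a minimal sketch, assuming a module-level array that is reused across calls; permBuf stands in for the PR's _permBuf, and writePerm is a hypothetical helper, not a function from the repository.

```typescript
// Sketch only: permBuf stands in for the PR's _permBuf; writePerm is hypothetical.
const permBuf: number[] = [];

function writePerm(n: number, truncate: boolean): number[] {
  // Write only the first n slots; anything beyond n is left over from
  // an earlier, larger call.
  for (let i = 0; i < n; i++) {
    permBuf[i] = n - 1 - i;
  }
  if (truncate) {
    // The fix from the commit message: drop the stale tail before spreading.
    permBuf.length = n;
  }
  // Spreading copies every element the array currently holds.
  return [...permBuf];
}

const first = writePerm(5, false); // buffer grows to 5 elements
const stale = writePerm(3, false); // still 5 elements: 3 fresh + 2 stale
const fixed = writePerm(3, true);  // exactly 3 elements after truncation
```

Truncating with .length = n shrinks the shared buffer, so a later larger sort has to grow it again; an alternative with the same effect on the copy would be to spread a slice of the first n elements, at the cost of one extra temporary array.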
Evergreen: CI fix pushed 🔧Root cause: The Fix: Added an All 21 failing
Evergreen: merged. This branch was 2 commits behind. CI should now run on the updated branch. No code changes were needed.
Goal
Evolve Series.sortValues to match pandas performance on a 100k-element benchmark. Fitness = tsb_mean_ms / pandas_mean_ms (lower is better; < 1.0 = tsb faster than pandas).
Current best: fitness 21.048 (tsb ≈ 112ms / pandas ≈ 5.3ms)
This Iteration (c042)
Change: Add module-level _permBuf: number[] and _outBuf: number[], grown lazily. In sortValues, reuse them instead of allocating new Array<number>(n) and new Array<T>(n) on every call.
Hypothesis: The benchmark makes 55 calls (50 measured + 5 warmup) × 2 large JS-array allocations = ~110 MB of array allocations that trigger GC. Both Index and Series constructors copy their inputs via Object.freeze([...data]), so module-level buffer reuse is safe. Eliminating these per-call allocations should reduce GC pauses and lower mean_ms.
Program
Related issue: #189 | State file: tsb-perf-evolve.md
🤖 This PR is maintained by Autoloop. Each accepted iteration adds a commit to this branch.