Add SearchStrategy dispatch + in-place batched searchsorted!#65
Conversation
Reformulates the existing sorted-search infrastructure as explicit strategy
types and adds an in-place batched API:
- `SearchStrategy` abstract type with concrete subtypes
`LinearScan`, `BracketGallop`, `BinaryBracket`, and `Auto`.
- `Base.searchsortedlast(strategy, v, x[, hint])` and the matching
`searchsortedfirst` dispatch on the strategy:
* `LinearScan` — walk ±1 from the hint.
* `BracketGallop` — `bracketstrictlymontonic` + bounded binary search.
* `BinaryBracket` — plain `searchsortedlast` / `searchsortedfirst`.
* `Auto` — picks based on hint validity and `length(v)`:
no/invalid hint or `length(v) > 32` → `BracketGallop`,
hint in range and `length(v) ≤ 32` → `LinearScan`,
no hint → `BinaryBracket`.
- `searchsortedlast!(idx_out, v, queries; strategy = Auto())` and
`searchsortedfirst!(...)` — in-place batched variants. For sorted
`queries` the previous result is used as a hint for the next, so the
total cost is O(length(v) + length(queries)) under `BracketGallop`.
Unsorted `queries` fall back to per-element search with no hint.
`searchsortedfirstcorrelated` / `searchsortedlastcorrelated` and
`searchsortedfirstvec` / `searchsortedlastvec` are kept as backwards-
compatible wrappers over the new dispatched API, so no existing caller
breaks. Version bumped to 1.9.0 since this is a backward-compatible
feature addition.
The intent is that downstream callers like SciML/DataInterpolations.jl
that hand-roll a sorted-batch lockstep walk (`while idx < n - 1 && tt[i]
> t[idx + 1]; idx += 1`) can instead call `searchsortedlast!` and get
correct O(log gap) behavior at every n/m ratio, not just dense queries.
Co-Authored-By: Chris Rackauckas <accounts@chrisrackauckas.com>
…chmarks
Expands the strategy hierarchy from the previous commit and tunes the
`Auto` heuristic against a real benchmark sweep (`bench/strategies.jl`,
results checked in at `bench/results.md`).
New concrete strategies:
- `ExpFromLeft`: galloping forward from a left-bound hint
(`hint` must be ≤ the answer; otherwise falls back to a full search).
Backs `searchsortedfirstexp`-style access, suitable for batched-sorted
loops where the previous result is a natural lower bound.
- `InterpolationSearch`: extrapolates the answer index from a linear
interpolation between `v[1]` and `v[end]`, then refines with a bounded
binary search. O(1) per query on uniformly-spaced data, O(log n)
worst case. Numeric eltypes only; falls back to `BinaryBracket` for
non-numeric `v`.
Refined `Auto`:
- Per-query path is unchanged in spirit but with a smaller threshold
(16 elements) before falling back to `LinearScan`.
- Batched path now picks based on the *actual* expected gap between
consecutive results, not just `length(v) / length(queries)`. For
numeric data this uses the span ratio
`(queries[end] - queries[1]) / (v[end] - v[1])`, which correctly
detects sorted-burst queries (all clustered inside one segment of
`v`) and routes them to `LinearScan`. For non-numeric data, the
length ratio is used unchanged.
- `m == 1` and `m == 0` are handled by fast paths that skip the
span/issorted heuristics entirely (saves ~20 ns of bookkeeping that
used to dominate the wall time on single-element batches).
- `BracketGallop` is dropped from `Auto`'s batched choices because
`ExpFromLeft` strictly beats it on sorted forward queries (its
bidirectional bracketing wastes effort).
- `InterpolationSearch` is intentionally not part of the `Auto` choice:
the benchmark shows it wins big on uniformly-spaced data but loses
catastrophically on `two_scale` (clustered) data, and detecting which
regime applies cheaply enough to amortize over small `m` isn't
feasible without an O(n) `looks_linear` check. Users who know their
data is roughly linear opt in explicitly.
Benchmark sweep (`n × m × spacing × query-pattern × strategy`, 240 cells):
| | within 20% of best | > 20% slower |
|--------------------------------|--------------------|--------------|
| Auto **before this commit** | 89 / 240 (37%) | 151 / 240 |
| Auto **after this commit** | 214 / 240 (89%) | 10 / 240 |
The remaining 10 cells are all sparse queries on uniform/jittered data
where `InterpolationSearch` would dominate but isn't safe to pick
automatically — well-documented trade-off.
Tests: 26929 / 26929 passing on Julia 1.10 (was 26814 before — +115 new
assertions covering ExpFromLeft, InterpolationSearch, and the
batched-Auto heuristic on dense / sparse / dense-burst / m=1 / m=0 /
non-numeric inputs).
Also caught and fixed an `ExpFromLeft` correctness bug: when the hint
was past the actual answer (`x < v[hint]`), the function returned an
index ~`hint - 1` instead of the correct value (could be 0 if x was
below all of v). Added an explicit `x < v[hint] → fall back to full
search` check; covered by the new tests.
`bench/strategies.jl` is reproducible via `julia --project=bench
bench/strategies.jl` (30 samples per cell, ~3 minutes wall time);
`bench/results.md` is the checked-in artifact from the latest run.
Co-Authored-By: Chris Rackauckas <accounts@chrisrackauckas.com>
|
Pushed a second commit (
Test count: 26814 → 26929 (+115). Full bench output is in |
Expands the strategy lineup, the bench coverage, and tightens the Auto
heuristic again based on the broader data:
- **`GuesserHint(::Guesser)` strategy** — wraps the existing
`Guesser`-based correlated search as a dispatchable strategy so it can
be passed to the new batched/per-query API alongside the others.
- **Sampled-linearity probe** — `_sampled_looks_linear` reads 5 fixed
elements of `v` (first, n/4, n/2, 3n/4, last) and tests whether the
interior points sit within 5% of the straight line between the
endpoints. ~12 ns regardless of `length(v)`. Used by Auto to unlock
`InterpolationSearch` on uniform/jittered data, gated by:
* `gap >= 64` (sparse enough that InterpolationSearch's
per-query advantage matters)
* `length(v) >= 1024` (the linearity probe + InterpSearch overhead
is amortized)
* `length(queries) >= 2` (m == 1 has its own fast path)
- **`m == 1` fast path** — Auto's batched path now short-circuits for
a single-element batch directly into `Base.searchsortedlast`/`first`,
saving the issorted + span heuristic cost (~20 ns).
- **Threshold fix** — gap check is now `>=` `_AUTO_INTERP_MIN_GAP`
(was `>`), so the boundary case where `gap` equals the threshold
qualifies for `InterpolationSearch`.
- **`ExpFromLeft` correctness fix** — previously assumed the hint was
a valid lower bound, which silently returned wrong indices when
`x < v[hint]`. Now detects that case and falls back to a full
`Base.searchsortedlast` / `searchsortedfirst`. Covered by new tests.
- **`bench/strategies.jl` broadened** — five sub-sweep modes
(`MODE=fast` (default), `full`, `spacing`, `ratio`, `pattern`,
`extreme`), 10 knot spacings (added `power2`, `sqrt`, `plateau`,
`bimodal`, `near_linear` on top of the original 5), 8 query
patterns (added `sorted_arithmetic`, `sorted_geometric`,
`sorted_bimodal`, `sorted_repeated` on top of the original 4).
Bench verdict across all sub-sweeps (772 cells total):
| sub-sweep | within 20% of best |
|----------------------|--------------------|
| spacing variety | 10 / 10 (100%) |
| pattern variety | 8 / 8 (100%) |
| super-dense (m ≫ n) | 270 / 270 (100%) |
| fast sweep | 213 / 240 (89%) |
| super-sparse (m ≪ n) | 156 / 180 (87%) |
| n/m crossover | 40 / 64 (62%) |
| **total** | **697 / 772 (90%)** |
The remaining 6% are mostly `m ≤ 4` sparse queries on uniform data where
the absolute per-call cost is ~100 ns and `InterpolationSearch` would
still win by 20-40 ns — a documented opt-in trade-off, since lowering
the linearity-probe threshold further would risk picking InterpolationSearch
on borderline-non-linear data where it loses by 4-13×.
Tests: 26929 / 26929 still passing. New artifacts in `bench/`:
- `bench/strategies.jl` — the bench script
- `bench/results.md` — fast sweep verdict (the default mode)
- `bench/results_focused.md` — spacing/pattern/ratio/extreme breakdowns
Co-Authored-By: Chris Rackauckas <accounts@chrisrackauckas.com>
|
Pushed 1. Audit — does this maintain everything that was there before?Yes. All these existing names keep working unchanged: 2. More strategies to consider — added one
I also evaluated and rejected a couple of candidates:
3. Bench depth — vastly broadened
Final Per-sub-sweep:
What the heuristic now doesThe
Caught and fixed a real bug
Artifacts in the PR
Both are reproducible; they're the actual output from this commit run on Julia 1.10. Tests: 26929 / 26929 passing locally. CI still waiting for the same first-time-contributor approval. |
Two algorithmic refinements driven by the broader benchmark sweep:
- **Skew detection in `_estimate_avg_gap`** — the span-based gap
estimator now returns `(gap, skewed)` where `skewed` is true when
the median query sits more than 20 % of the query span away from
the linear midpoint. Gated on `m ≥ 10` so small-batch statistical
noise doesn't trip it. For sorted_near_start / sorted_geometric /
sorted_bimodal patterns (most queries clustered in part of `v`'s
range), the unmodified span heuristic massively underestimated the
effective gap and routed Auto to `LinearScan` when `ExpFromLeft`
was best (1.8-3.1× slowdown on those cells before).
- **`InterpolationSearch` gated on `!skewed`** — even on perfectly
uniform `v`, when queries are clustered, `ExpFromLeft` from
`prev_idx` beats `InterpolationSearch` because each consecutive
query's true index is close to the previous one's. The skew flag
now prevents Auto from gambling on InterpSearch in that regime.
- **Tightened sampled-linearity probe** from 5 probes / 5 % tolerance
to 9 probes (k·n/10 for k = 1..9) / 0.1 % tolerance. This reliably
rejects sorted-random data — which still looks linear-ish to a
loose probe at large n — without rejecting any genuinely uniform
or jittered grid. Probe cost rises from ~12 ns to ~25 ns and now
catches the case where Auto used to pick InterpSearch on random
data and lose 50-150 %.
- **`AbstractRange` short-circuit** — `_sampled_looks_linear(::AbstractRange)`
returns true without sampling. Ranges are definitionally uniform.
Bench expanded so the heuristic can be vetted in detail:
- Five **tuning sub-sweeps** (`MODE=tune`) that locate the actual
strategy crossover points empirically — LinearScan↔ExpFromLeft
crosses at gap ≈ 5; ExpFromLeft↔InterpSearch crosses around gap 8
*on uniform data*; minimum useful n for InterpSearch is ~2 048;
minimum useful m is ~2; InterpSearch's downside on non-linear data
bottoms out at 1.5× (random) and tops out at 14× (plateau).
- A **3 920-cell validation sweep** (`MODE=validate`) — every
combination of 10 spacings × 8 query patterns × 7 `n` × 7 `m`.
- Two more knot spacings (`near_linear`, `power2`) and pattern
variety (`sorted_arithmetic`, `sorted_geometric`, `sorted_bimodal`,
`sorted_repeated`) on top of the v9 set.
Bench verdict over **4 692 cells across all sub-sweeps**:
| sub-sweep | within 20 % | regression worst case |
|--------------------------|-------------|-----------------------|
| spacing (10) | 10 / 10 | none |
| pattern (8) | 7 / 8 | 1.51× at m=512 |
| super-dense (270) | 263 / 270 | 1.21× borderline |
| fast (240) | 221 / 240 | 1.59× at m=10 |
| super-sparse (180) | 147 / 180 | 2.00× at m=4 |
| n/m crossover (64) | 51 / 64 | 1.62× at m=4096 |
| **validate (3 920)** | **3 289 / 3 920 (84 %)** | **2.64× (1 cell)** |
| **aggregate (4 692)** | **3 988 (85 %)** | **only 1 cell exceeds 2 ×** |
Tests: 26 929 / 26 929 still passing. Artifacts:
- `bench/strategies.jl` — bench harness with 6 modes
- `bench/results.md` — fast sweep
- `bench/results_focused.md` — spacing / pattern / ratio / extreme
- `bench/results_validate.md` — the 3 920-cell sweep
- `bench/results_tune.md` — crossover measurements
Co-Authored-By: Chris Rackauckas <accounts@chrisrackauckas.com>
|
Pushed Algorithmic changes1. Skew detection. Tuning sweeps surfaced a previously-hidden regression: on
2. 3. Tightened linearity probe. From 5 probes / 5 % tolerance to 9 probes ( 4. Bench changes
Final verdictsAcross 4 692 cells in all sub-sweeps:
The single cell exceeding 2× has a 180 ns absolute difference. The residual ~15 % outside-20 % cells are mostly I tried bumping Tests: 26 929 / 26 929 passing. CI awaiting maintainer approval as usual. |
Three small CI fixes from the FFF SciML#65 run: - `bench/strategies.jl`: rename loop variable `strat` to `strategy` so the typo check stops flagging it as `start` / `strata`. - `Project.toml`: add `StableRNGs = "1"` to `[compat]` — Aqua's `test_deps_compat` requires every test-extras dep to have a compat bound. - `docs/src/index.md`: add `@docs` entries for the new public symbols (`searchsortedlast!`, `searchsortedfirst!`, plus the strategy types `SearchStrategy` / `LinearScan` / `BracketGallop` / `ExpFromLeft` / `InterpolationSearch` / `BinaryBracket` / `GuesserHint` / `Auto`) so Documenter's cross-reference resolver finds them and the `missing_docs` check passes. Tests: 26929 / 26929 still passing locally. Co-Authored-By: Chris Rackauckas <accounts@chrisrackauckas.com>
The 1.9 release shipped the new strategy-dispatched API alongside the
existing single-purpose names as thin wrappers. Cleaner long-term shape
is a single canonical API, so 2.0 removes the legacy surface:
- **Removed**: `searchsortedfirstcorrelated`, `searchsortedlastcorrelated`
(both `Integer` and `Guesser` overloads).
- **Removed**: `searchsortedfirstvec`, `searchsortedlastvec` (allocating
batched variants — callers own their output buffer and use the in-place
`searchsortedfirst!` / `searchsortedlast!`).
- **Made internal**: `searchsortedfirstexp` and `bracketstrictlymontonic`.
They're still in the module as implementation helpers for `ExpFromLeft`
and `BracketGallop` respectively, but no longer documented or exported.
- **Kept (documented public)**: `searchsortedfirst!` / `searchsortedlast!`,
`SearchStrategy` and all concrete strategies (`LinearScan`,
`BracketGallop`, `ExpFromLeft`, `InterpolationSearch`, `BinaryBracket`,
`GuesserHint`, `Auto`), `Guesser`, `looks_linear`, `findfirstequal`,
`findfirstsortedequal`. The hint provider stays because it's
orthogonal to the dispatch; the equality-search functions stay because
they're a different problem.
Migration for downstream callers:
| old | 2.0 |
|--------------------------------------------------------------------|-----------------------------------------------------------------|
| `searchsortedfirstcorrelated(v, x, guess::Integer)` | `searchsortedfirst(BracketGallop(), v, x, guess)` |
| `searchsortedlastcorrelated(v, x, guess::Integer)` | `searchsortedlast(BracketGallop(), v, x, guess)` |
| `searchsortedfirstcorrelated(v, x, ::Guesser)` | `searchsortedfirst(GuesserHint(g), v, x)` |
| `searchsortedlastcorrelated(v, x, ::Guesser)` | `searchsortedlast(GuesserHint(g), v, x)` |
| `searchsortedfirstvec(v, x)` | `searchsortedfirst!(buf, v, x)` (caller-owned `buf`) |
| `searchsortedlastvec(v, x)` | `searchsortedlast!(buf, v, x)` |
| `searchsortedfirstexp(v, x, lo, hi)` | `searchsortedfirst(ExpFromLeft(), v, x, lo)` |
Tests rewritten to exercise the strategy API directly. Docs index updated to
present the strategy API as canonical, with `Guesser` / `looks_linear` /
equality-search as supporting helpers.
`Project.toml` bumped to `2.0.0`.
Downstream callers in SciML that import the removed names will need
companion PRs — see SciML/DataInterpolations.jl#529 for the first one.
Co-Authored-By: Chris Rackauckas <accounts@chrisrackauckas.com>
Two changes bundled because they touch the same area of the codebase:
1. **`SIMDLinearScan` search strategy.** New `SearchStrategy` subtype that
lowers the forward `LinearScan` walk to 8-wide SIMD chunks via custom
LLVM IR. Specialised for `DenseVector{Int64}` and `DenseVector{Float64}`
through two ordered-compare predicates each (`sgt`/`sge` for Int64,
`ogt`/`oge` for Float64), generated from a shared template
`_simd_scan_ir(t, pred)` modelled on the existing `FFE_IR` equality
scan. Any other element type falls back to scalar `LinearScan` (dispatch
is static, no runtime branch). Non-`Forward` orderings and the no-hint
path also fall back. Backward walks stay scalar — the SIMD primitive is
forward-only.
`SIMDLinearScan` is opt-in: `Auto` does not pick it, because the regime
where it beats `LinearScan` (long forward walks) overlaps with where
`Auto` already prefers `BracketGallop` or `ExpFromLeft`. Pin it when you
have a workload that wants a long forward scan and you know the eltype.
Caveats documented in `strategies.md`:
- `Int64` / `Float64` only; everything else uses scalar `LinearScan`.
- NaN in a `Float64` vector compares false under `fcmp o*`, so NaNs are
silently skipped by the SIMD scan. Sorted vectors with NaN aren't
well-defined under any total order anyway.
- Forward ordering only; reverse falls back to scalar.
Tests (`test/runtests.jl`):
- 10000 random fuzz comparisons against `Base.searchsortedlast` /
`Base.searchsortedfirst` per eltype (Int64, Float64) under `StableRNG`.
- Edge cases: empty / single-element / duplicates / out-of-range hints
/ x outside the vector range.
- Fallback paths: Int32, Float32, String, no-hint, reverse order.
Precompile workload extended to cover the new strategy on both
specialised eltypes.
2. **Docs restructured into four topical pages** plus a slimmed landing
page:
- `interface.md`: public API surface, strategy contract, how to add a
new strategy (with correctness-check pattern), note that `Auto` is
not externally extensible.
- `strategies.md`: catalog of every shipped strategy with a chooser
table (best case / worst case / hint usage), per-strategy notes.
`SIMDLinearScan` lives here with its caveats. The equality-search
functions (`findfirstequal`, `findfirstsortedequal`) are linked
from the bottom of the page, separated because their return
semantics (`nothing` on miss) don't match the positional-search API.
- `guessers.md`: `Guesser` + `GuesserHint` API, usage patterns
(interpolation-segment lookup), and explicit list of when *not* to
use a `Guesser`.
- `auto.md`: full `Auto` decision tree for both per-query and batched
callers, every crossover constant, justification per branch, and a
self-contained benchmark script that reproduces the regime-grid
comparison.
`docs/make.jl` updated to register the new pages. `index.md` cut down
to overview + per-section links.
All 47088 tests pass (20000 new fuzz cases for `SIMDLinearScan` plus the
existing 27013).
Co-Authored-By: Chris Rackauckas <accounts@chrisrackauckas.com>
FFF 2.0 (SciML/FindFirstFunctions.jl#65) drops the legacy single-purpose names (`searchsortedfirstcorrelated`, `searchsortedlastcorrelated`, and the allocating `*vec` batched forms) in favour of a single strategy- dispatched API. This commit migrates DataInterpolations off the legacy surface. - `src/DataInterpolations.jl`: import `FindFirstFunctions` plus `Guesser` only; drop the legacy correlated-lookup name imports. - `src/interpolation_utils.jl`: split `get_idx` into two methods. The `iguess::Integer` overload routes through `searchsortedfirst(BracketGallop(), tvec, t, iguess)` / `searchsortedlast(...)`. The `iguess::Guesser` overload routes through `searchsortedfirst(GuesserHint(iguess), tvec, t)`. Both forms preserve the existing `lb`/`ub_shift`/`idx_shift`/`side` keyword contract. - `test/derivative_tests.jl`, `test/interpolation_tests.jl`: cached-index tests now use `searchsortedfirst(GuesserHint(A.iguesser), A.t, t)` and `searchsortedfirst(BracketGallop(), func.t, _t, func.iguesser(_t))`. - `src/interpolation_methods.jl`: trim explanatory/update prose comments that grew during the batch-evaluator work. Section headers and structural `else # Extension` markers retained; per-task narrative comments removed. - `Project.toml`: bump `FindFirstFunctions` compat to `2`. All 57879 DI tests pass locally on Julia 1.11: Core | 2539 pass, 5 broken (pre-existing) Methods | 42151 pass Extensions | 13178 pass Misc | 11 pass Co-Authored-By: Chris Rackauckas <accounts@chrisrackauckas.com>
Callers issuing many short batches against the same sorted vector currently
pay Auto's per-call probes — `_sampled_looks_linear(v)` reads 11 elements
each time, ~25 ns. Caching the probe once is the obvious save for
interpolation-segment lookups and similar repeated-vector workloads.
- `SearchProperties` is an isbits struct holding `(has_props, is_linear,
has_nan)`. Default constructor returns the "no information" sentinel
(`has_props = false`); `SearchProperties(v)` runs the probes and
populates the fields.
- `Auto` gains a `props::SearchProperties` field. `Auto()` keeps current
behaviour (sentinel props → probe per call); `Auto(SearchProperties(v))`
skips the probe in the batched path. Single concrete struct, not
parametric — keeps call sites type-stable without specialization
explosion.
- The batched dispatch reads `s.props.is_linear` instead of probing when
`has_props` is set. Other fields populated for forward compatibility
but not consumed yet — `has_nan` is the obvious next consumer when
`SIMDLinearScan` gets wired into Auto's decision tree.
Trust-the-caller contract: stale or lying `SearchProperties` is
correctness-preserving — `InterpolationSearch` from a bad guess falls
through to `BracketGallop` (slow but O(log n)). The caller is responsible
for re-running `SearchProperties(v)` if `v` mutates.
Tests cover sentinel construction, populated construction on linear/
non-linear/NaN-containing/integer vectors, output equivalence between
`Auto()` and `Auto(props)`, the `isbits` guarantee, and a deliberately-
lying cache that still produces correct results.
47100 tests pass (was 47088; 12 new cases).
Co-Authored-By: Chris Rackauckas <accounts@chrisrackauckas.com>
Single-file changelog covering the 1.x → 2.0 transition. Sections:
- Removed names (six legacy functions) with a one-to-one migration
table to the new strategy dispatch.
- Made-internal helpers (`searchsortedfirstexp`, `bracketstrictlymontonic`)
and how to reach the same algorithm via the strategy API.
- The new strategy-dispatched search API (seven shipped strategies +
`Auto`).
- The new batched in-place API (`searchsortedfirst!` / `searchsortedlast!`).
- The `Auto` heuristic, the crossover constants, and how they were
empirically derived from the regime-grid sweep.
- `SearchProperties` cache integration with `Auto`.
- `SIMDLinearScan` with the eltype / NaN / order caveats.
- Documentation restructure into five topical pages.
- Internal: shared `_simd_scan_ir` template now backs both the
equality scan and the positional-search compares.
- Unchanged: equality search (`findfirstequal`,
`findfirstsortedequal`) — deliberately separate API.
Co-Authored-By: Chris Rackauckas <accounts@chrisrackauckas.com>
Equality search now flows through the strategy framework via a new wrapper
`findequal(strategy, v, x[, hint])` that returns an `Int` with the
sentinel `firstindex(v) - 1` for "not found". Type-stable, composes with
every existing strategy. The new `BisectThenSIMD <: SearchStrategy`
short-circuits to `findfirstsortedequal`'s algorithm on `DenseVector{Int64}`.
For positional dispatch (`searchsortedfirst`/`searchsortedlast`),
`BisectThenSIMD` delegates to `BinaryBracket` — bisect-then-equality-scan
can't answer "where would x insert?".
The new tests exposed four pre-existing duplicate-handling bugs, all
fixed in this commit:
- `searchsortedfirst(BracketGallop(), v, x, hint)` returned the
rightmost duplicate when the hint landed in a run of equals. Cause:
`bracketstrictlymontonic` used `x < v[lo]` to choose direction,
which is correct for `searchsortedlast` but wrong for
`searchsortedfirst`. Fix: add `bracketstrictlymontonic_first` with
the inverted polarity `v[lo] < x`, used only by the
`searchsortedfirst` overload.
- `searchsortedfirst(InterpolationSearch(), v, x)` propagated the
same bug via `BracketGallop`. Fixed by the bracket-first fix.
- `searchsortedfirst(ExpFromLeft(), v, x, hint)` exponential-searched
forward when `v[hint] == x`, missing earlier duplicates. Fixed by
falling back to a full search whenever `v[hint] >= x` (the
exponential-forward path is only safe when `v[hint] < x`).
- `findfirstsortedequal(var, vars)` bisected with `vars[mid] <= var`
and advanced offset on equal hits, walking past the first
duplicate. Fixed by tightening to strict `<` and adjusting the
window-shrink rule so the "keep" branch retains the midpoint in
the next window (`len = half + 1` instead of `len -= half`). The
fast-path LLVM IR branch select is replaced by plain `ifelse`;
Julia compiles to the same branchless `select` modulo the
`!unpredictable` metadata.
Tests: 84120 pass (was 47100). The new `findequal + BisectThenSIMD`
testset runs 2500 random-trial parity checks against `Base` across all
shipped strategies on Int64 and Float64 vectors (including frequent
duplicates), the OffsetArray-style sentinel, reverse ordering, empty
and single-element edge cases, and a check that `BisectThenSIMD` in
positional dispatch correctly delegates to `BinaryBracket`.
Docs:
- `strategies.md` gains a "Equality search through the strategy
framework" section with the `findequal` and `BisectThenSIMD`
@docs entries.
- The Equality-search appendix now points callers to `findequal` as
the preferred sorted-equality API and explains why
`findfirstequal` (unsorted) stays outside the strategy framework.
- NEWS.md adds the new-functionality section and the four-way
bug-fix section with concrete before/after behaviour.
Co-Authored-By: Chris Rackauckas <accounts@chrisrackauckas.com>
…ty page
Six small cleanups grouped under one commit:
1. **Typo**: rename `bracketstrictlymontonic` to `bracketstrictlymonotonic`
(and the `_first` companion). Internal helpers — no downstream impact.
2. **FFE_IR unification**: replace the ~60-line literal `FFE_IR` string
with `_simd_scan_ir("i64", "eq")`, using the same SIMD-scan IR
template that backs the four `>`/`>=` predicates for `SIMDLinearScan`.
All five SIMD primitives now share a single IR source. The new
`_simd_scan_ir` is now defined at the top of the module so `FFE_IR`
can refer to it; the duplicate copy further down has been removed.
3. **Docstring refresh**:
- `findfirstequal` now documents the unsorted-input contract, the
Int64 SIMD specialization vs. the generic `findfirst(==(x), v)`
fallback, and points at `findequal` / `findfirstsortedequal` for
the sorted-equality alternatives.
- `findfirstsortedequal` now points at `findequal(BisectThenSIMD(), …)`
as the preferred sentinel-returning strategy-framework variant and
explains its role as the dedicated `Union{Int64, Nothing}`-returning
name.
4. **OffsetArray sentinel note**: `findequal` docstring spells out that
the "not found" sentinel is `firstindex(v) - 1` (= `0` for 1-based
vectors, adjusts for OffsetArrays), and instructs callers to test
against `firstindex(v)` rather than the literal `0`.
5. **`SearchProperties(v; linear_tolerance = 1e-3)`** exposes the
sampled-linearity probe's tolerance as a kwarg, symmetric to
`Guesser`'s `looks_linear_threshold`. The internal
`_sampled_looks_linear` is now `(v, tol::Float64)` with `tol`
defaulting to `_AUTO_LINEAR_REL_TOLERANCE` (= 1e-3) for callers that
don't pass one. Loosening to `1e-2` widens the regime where `Auto`
picks `InterpolationSearch`; tightening to `1e-4` is more
conservative.
6. **Exports**: 2.0 now exports the public API surface
(`SearchStrategy`, every concrete strategy, `SearchProperties`,
`Guesser`, `looks_linear`, `searchsortedfirst!`, `searchsortedlast!`,
`findequal`, `findfirstequal`, `findfirstsortedequal`). Previously
the package exported nothing, so callers had to fully qualify every
reference (`FindFirstFunctions.LinearScan()`). `using FindFirstFunctions`
is now sufficient to access the full API; previously qualified call
sites continue to work.
**New `equality.md` doc page**: the `findfirstequal` /
`findfirstsortedequal` docs lived as an appendix on `strategies.md`,
which was wrong because these functions don't match the strategy-dispatch
contract (different return type, different question). Moved to their own
page with a chooser table, the sentinel-vs.-Nothing discussion, and the
shared SIMD-primitive notes. `strategies.md` keeps a one-paragraph stub
pointing at the new page; `index.md` and `make.jl` updated to register
and surface the new page in the nav.
84120 tests pass (unchanged — these are all internal renames or doc
changes).
Co-Authored-By: Chris Rackauckas <accounts@chrisrackauckas.com>
Comprehensive rebenchmark of `Auto`'s batched decision tree against the
shipped strategies (`LinearScan`, `SIMDLinearScan`, `BracketGallop`,
`ExpFromLeft`, `InterpolationSearch`) on a 1080-cell sweep covering
5 `v` patterns × 4 query patterns × 5 `n` sizes × 6 `m` sizes × 2 element
types (`Int64`, `Float64`).
Baseline (`Auto` before this commit): median 1.18x, p95 2.09x, max 2.78x
This commit: median 1.04x, p95 1.38x, max 2.18x
With `Auto(SearchProperties(v))`: median 1.03x, p95 1.38x, max 2.09x
New decision-tree branches:
- **`SIMDLinearScan` in the medium-gap regime.** When `v` is
`DenseVector{Int64}` or `DenseVector{Float64}` (NaN-free for Float64),
`Auto` now dispatches to `SIMDLinearScan` for `gap ∈ (4, _auto_simd_gap_max(v)]`.
`_auto_simd_gap_max` is 64 for both eltypes — the eltype-specific
accessor leaves room to lower the Int64 ceiling later without changing
the Float64 path. For Float64 the NaN check consults
`SearchProperties.has_nan` when a populated cache is attached;
otherwise no-NaN is assumed (same trust contract Base uses for
sortedness). The bench shows `SIMDLinearScan` winning 25% of cells
overall, with median 1.94× speedup over `LinearScan` in the cells it
wins.
- **`BracketGallop` preferred over `ExpFromLeft` at large gaps.** New
constant `_AUTO_GALLOP_GAP_MIN = 16`. At gap ≥ 16, `ExpFromLeft`'s 5
up-front linear probes are guaranteed to miss (the answer is much
more than 5 elements past the previous-result hint), and
`BracketGallop`'s doubling-from-`hint` walk is strictly faster.
- **Tiered linearity probe for `InterpolationSearch`.** The strict
`_AUTO_LINEAR_REL_TOLERANCE = 1e-3` gates the
`_AUTO_INTERP_MIN_GAP ≤ gap < _AUTO_INTERP_LOOSE_GAP` range (8 to 256).
For `gap ≥ _AUTO_INTERP_LOOSE_GAP` (256), the loose
`_AUTO_LINEAR_LOOSE_TOLERANCE = 5e-2` applies — `InterpolationSearch`'s
cache benefit compensates for a worse guess at large gaps, and the
bounded binary-search refinement is still O(log n). This unlocks
`InterpolationSearch` on approximately linear data like sorted random
or jittered vectors at large `n`, while still rejecting genuinely
nonlinear data (log-spaced, two-scale) where `InterpolationSearch`
loses 2–3×.
Bug fix in `_estimate_avg_gap`: the previous code fell back to `n ÷ m`
whenever the skew flag was set, which caused `SIMDLinearScan` to be
picked for tightly-clustered queries (`span_q ≈ 0`) where `LinearScan`'s
scalar walk is 5× faster. The skew flag now serves its intended purpose
as a binary `InterpolationSearch`-unsuitability signal, while the gap
value is always the span-based estimate.
Reproducibility:
- `bench/auto_sweep.jl`: the full sweep. `julia --project=bench
bench/auto_sweep.jl` writes `bench/results.csv` and prints a summary
plus the worst 30 Auto-slack cells.
- `bench/analyze.jl`: reads `results.csv` and prints `SIMDLinearScan`
win-by-regime tables, the n/m gap-bucket distribution, and `Auto`'s
misprediction histogram. Useful for tuning future constants.
- `bench/.gitignore`: keeps `results.csv` and `Manifest.toml` out of
version control.
Docs:
- `auto.md` rewrites the decision-tree section, replaces the
crossover-constants table (now includes `_auto_simd_gap_max`,
`_AUTO_GALLOP_GAP_MIN`, `_AUTO_INTERP_LOOSE_GAP`,
`_AUTO_LINEAR_LOOSE_TOLERANCE`), updates the headline-results table
with the 1080-cell sweep numbers, and links to the actual bench
script. The inline script is kept as a minimum-viable standalone
reproducer.
- `NEWS.md` adds the retuning section with before/after numbers, the
new branches, and the bug-fix description.
All 84120 tests pass (no behavioural test changes — the retuning is
internal to Auto's branch selection).
Co-Authored-By: Chris Rackauckas <accounts@chrisrackauckas.com>
Investigated a `BitInterpolationSearch` strategy that would reinterpret
positive `DenseVector{Float64}` as `DenseVector{UInt64}` and run linear
extrapolation on the IEEE bit pattern. The hypothesis was that this would
beat `InterpolationSearch` on log-spaced data, since IEEE bit patterns are
approximately linear in array index for geometric sequences.
Implemented, correctness-verified, and benchmarked across `m ∈ {4..4096}`
on `n = 65536` log-spaced Float64 with sparse queries. Result: the
strategy loses to every existing alternative at every `m` (65–88 ns/q vs
19–41 ns/q for `ExpFromLeft`, 16–199 ns/q for `SIMDLinearScan`). The
per-query division and Float64-from-UInt64 conversion cost more than the
saved bracket refinement; on non-log-spaced data the bit-pattern guess is
also strictly worse than the float-domain guess because the
denormal-to-normalized transition introduces a discontinuity.
`Auto` already serves log-spaced data well via `SIMDLinearScan` at medium
gaps and `BracketGallop` / `ExpFromLeft` at large gaps. No new dispatch
path was justified.
NEWS.md adds an "Investigated but rejected" section documenting the
exploration so future contributors don't re-derive the same negative
result.
84120 tests pass; no behavioural change.
Co-Authored-By: Chris Rackauckas <accounts@chrisrackauckas.com>
Reinstating BitInterpolationSearch after a more comprehensive bench than
my earlier sweep — that earlier conclusion ("never wins") was based on a
too-narrow regime grid (single v pattern, single q pattern). The new
sweep at `bench/bitinterp_sweep.jl` covers 1404 cells across 9 v patterns
(uniform, logspaced, logspaced_wide spanning 10⁻³ to 10¹⁵,
geometric_dense, geometric_sparse, power2, sqrt, two_decade, jittered_log)
× 4 q patterns (linear_uniform, log_uniform, dense_grid, log_grid) × 6 n
sizes up to 2²⁰ × 7 m sizes from 4 to 16384.
Results:
- BitInterp wins outright in 59/1404 cells (4.2%)
- Within 10% of per-cell best in 75 cells (5.3%)
- Within 20% of per-cell best in 105 cells (7.5%)
Wins concentrate in the regime BitInterp is designed for:
log-spaced/geometric Float64 data at small m (= 4) and large n (≥ 16384).
Margins range from 1.0× tie to 1.43× on
`logspaced_wide log_grid n=2²⁰ m=4`. Other wide-spread wins:
logspaced_wide log_grid n=2¹² m=4: 35.0 vs ExpFromLeft 47.5 (1.36×)
logspaced_wide log_grid n=2¹⁸ m=4: 47.5 vs ExpFromLeft 62.5 (1.32×)
logspaced_wide dense_grid n=2²⁰ m=4: 50.0 vs ExpFromLeft 65.0 (1.30×)
`Auto` does NOT pick BitInterp. The wins are narrow (4% of cells in a
bench specifically designed to probe BitInterp's regime), and adding the
eligibility check + sampled-log-linearity probe to Auto's hot path would
penalize the much larger set of non-log-spaced workloads. Users with
known log-spaced workloads can pin
`searchsortedlast!(out, v, queries; strategy = BitInterpolationSearch())`
once and get the win at zero heuristic cost.
Implementation:
- `_bit_interp_guess_f64(v, x, lo, hi)` does per-element bitcast rather
than reinterpret(UInt64, v) on the whole vector, avoiding the
ReinterpretArray wrapper overhead.
- Strict guard on positive + finite v[lo] and x; non-positive or
non-finite Float64 bit patterns are not monotonic with float value
under raw reinterpret, so the strategy falls back to BinaryBracket.
- Non-Float64 dense eltypes fall back to plain InterpolationSearch
(where the bit pattern equals the value).
Tests pass (84120). Docs updated:
- `strategies.md` chooser table adds BitInterp with eltype/regime caveats.
- `strategies.md` per-strategy section documents the win regime, the
sample wins table, and why Auto doesn't pick it.
- `NEWS.md` documents the bench findings and the "opt-in" decision.
Co-Authored-By: Chris Rackauckas <accounts@chrisrackauckas.com>
…elocity CI
Four 2.0 followup additions:
1. **`queries_sorted` kwarg** on `searchsortedfirst!` / `searchsortedlast!`.
Caller-supplied override for the runtime `issorted(queries)` check.
`nothing` (default) keeps current behaviour; `true` skips the O(m)
probe and takes the sorted-loop path unconditionally; `false` skips
the probe and takes the unsorted-loop path. Measured savings:
~6.6 μs on m=4096 batched call on the bench machine (~1.6 ns/q,
~6% headroom on top of Auto's hot path).
Wrong-answer risk for `queries_sorted = true` on unsorted input is
documented in the docstring; the inner sorted loop uses
previous-result as a hint, which becomes invalid when queries jump
backward. Tests cover sorted-true on every shipped strategy plus
unsorted-false on unsorted data.
2. **`searchsortedrange(strategy, v, lo, hi[, hint]; order)`** returning
a `UnitRange{Int}` of all indices `i` with `lo ≤ v[i] ≤ hi`.
Implemented as `searchsortedfirst(strategy, v, lo) : searchsortedlast(strategy, v, hi)`,
with the hinted form passing the left-bound result as the hint for
the right-bound search (shared bracket discovery). Exported.
Tests cover all shipped strategies on several `(lo, hi)` pairs
including out-of-range cases, plus 200-trial random fuzz on Int64
vectors vs the Base composition.
3. **`is_log_linear::Bool` field on `SearchProperties`** + backing
`_sampled_looks_log_linear(v, tol)` probe. The probe is the same
9-point sampled check as `_sampled_looks_linear` but applied to
`log.(v)` — accepts geometric / log-spaced data, rejects linear /
two-scale. Requires strictly positive finite values throughout the
sample.
`Auto` does NOT consume `is_log_linear` (the wins from
`BitInterpolationSearch` are narrow per the bitinterp_sweep bench).
The field is populated for callers who want to manually pin
`BitInterpolationSearch` based on data shape; the cost is a single
sampled-log probe at `SearchProperties(v)` construction time
(~30 ns).
4. **AirspeedVelocity benchmark CI**. Adds:
- `.github/workflows/Benchmark.yml` — runs on PRs touching `src/`,
`bench/`, `Project.toml`, or the workflow itself. Uses
`benchpkg` / `benchpkgtable` from AirspeedVelocity 0.6 to time
both base and PR commits, post a markdown diff as a PR comment,
and upload raw results as an artifact.
- `benchmark/benchmarks.jl` — `SUITE = BenchmarkGroup()` with 40
entries across four groups: per-query micro-benchmarks across
every shipped strategy, batched in-place benchmarks on a (3 × 3 × 3)
grid of `(v_kind, q_kind, n, m)` including the Auto / Auto+sorted
/ Auto+props variants, `SearchProperties` construction cost on
three eltypes, and equality-search comparison
(`findequal(BinaryBracket)` vs `findequal(BisectThenSIMD)` vs
`findfirstsortedequal`).
84626 tests pass (+506 from the new test blocks).
Co-Authored-By: Chris Rackauckas <accounts@chrisrackauckas.com>
The package source had grown to a single 1900-line file mixing every layer:
LLVM IR templates, equality-search primitives, strategy type definitions,
the per-strategy dispatch methods, Auto's heuristic decision tree, the
batched in-place API, the Guesser correlated-lookup helper, the findequal
strategy-framework wrapper, and the PrecompileTools workload. Splitting
into one file per logical layer makes the source navigable without changing
any behaviour.
src/
├── FindFirstFunctions.jl 39 lines module + exports + includes
├── simd_ir.jl 135 IR template + IR constants + SIMD
│ primitives (eq, sgt, sge, ogt, oge)
├── equality.jl 73 findfirstequal + findfirstsortedequal
├── strategies.jl 268 SearchStrategy abstract + concrete
│ strategy types + SearchProperties +
│ Auto (struct + docstrings)
├── search_properties.jl 120 _sampled_looks_linear /
│ _sampled_looks_log_linear / _has_nan
│ + populated SearchProperties ctor
├── dispatch.jl 583 Per-strategy searchsortedfirst/last
│ methods + bracketstrictlymonotonic*
│ + searchsortedfirstexp + _interp_guess
│ + _bit_interp_guess_f64 +
│ _simdscan_*_specialized
├── auto.jl 168 Auto crossover constants + per-query
│ Auto + _estimate_avg_gap +
│ _auto_simd_eligible /
│ _auto_simd_gap_max /
│ _auto_interp_eligible + _auto_pick
├── batched.jl 321 searchsortedfirst!/last! +
│ sorted/unsorted inner loops +
│ _take_sorted_path + generic
│ _batched! + Auto-specialized
│ _batched! + searchsortedrange
├── guesser.jl 109 looks_linear + Guesser + GuesserHint
│ dispatch (relies on BracketGallop)
├── findequal.jl 91 findequal generic + BisectThenSIMD
│ shortcut for DenseVector{Int64}
└── precompile.jl 58 PrecompileTools @setup_workload
Include order is significant — each file may depend on names from earlier
files. The order is documented in `FindFirstFunctions.jl` and is:
simd_ir → equality → strategies → search_properties → dispatch → auto →
batched → guesser → findequal → precompile. The two non-obvious
dependencies are:
- `auto.jl` references `_sampled_looks_linear` from `search_properties.jl`
(the `_auto_interp_eligible` loose tier).
- `batched.jl` references `_estimate_avg_gap`, `_auto_simd_eligible`,
`_auto_simd_gap_max`, `_auto_interp_eligible`, `_AUTO_*` constants from
`auto.jl` (the Auto-specialized `_batched!` lives in `batched.jl`
because it shares the sorted/unsorted-loop infrastructure with the
generic `_batched!`).
84626 tests pass; no behavioural change.
Co-Authored-By: Chris Rackauckas <accounts@chrisrackauckas.com>
Refactors the hand-rolled sorted-batch lockstep walk in the merged Akima batched evaluator and adds new batch evaluators for `LinearInterpolation` and `CubicSpline` (formerly draft PR #527). The new helpers route per-query index lookups through `FindFirstFunctions.searchsortedlast!` so the algorithm adapts to the n/m ratio automatically. Why this matters: the previous merged Akima batched eval walked the knot index linearly via `while idx < n - 1 && tt[i] > t[idx + 1]; idx += 1`, which is O(gap) per query. For sparse queries on long `t` (e.g. `n = 65 536, m = 10`) that's O(n) of wasted walking — 36 µs per batch in practice on this CPU, dwarfing the 30 ns of useful work. The FFF strategy dispatch ([SciML/FindFirstFunctions.jl#65]) picks `LinearScan` for dense queries and `ExpFromLeft` / `InterpolationSearch` for sparse, so the per-query cost stays appropriate for every regime. Implementation: - `_sorted_batch_extrapolation_ok(A)` — shared predicate (None / Constant / Linear / Extension on each side). - `_find_last_interior_index(tt, …)` — O(log m) binary search to split `tt` into left-ext / interior / right-ext. - `_eval_interior_adaptive!(out, t, tt, i_first, i_last, n, eval_at)` — dispatches based on `32 * n_interior < n`: - dense → inline lockstep walk fused with the eval lambda - sparse → allocate an Int buffer, batched `FindFirstFunctions.searchsortedlast!`, then run the eval on the indexed positions. The 1/32 crossover was determined by the FFF bench sweep on uniform data; below that ratio the FFF batched call cleanly beats the linear walk; above it, the inline walk's tight 1-2-op-per-query inner loop wins. The Linear and CubicSpline call sites hand-inline the adaptive dispatch (rather than passing a callable through `_eval_interior_adaptive!`) because `get_parameters` inside the per-segment callable wasn't inlining through the closure boundary — costing ~70× on small-m sparse cases. Akima passes a `_AkimaEvaluator` struct (its evaluator is pure array indexing + `@evalpoly`) which inlines cleanly. Benchmarks (Julia 1.10, BenchmarkTools `minimum`): | type | n=65k,m=10 | n=65k,m=4096 | n=4k,m=4k | n=1M,m=10 | |--------|------------|--------------|-----------|-----------| | Akima | 36 µs → 706 ns (51×) | 106 µs → 104 µs | 34 µs → 33 µs | (unmeasured) → 822 ns | | Linear | n/a → 737 ns | n/a → 112 µs | n/a → 42 µs | n/a → 851 ns | | Cubic | n/a → 809 ns | n/a → 163 µs | n/a → 77 µs | n/a → 960 ns | Where master had no batched evaluator (Linear, Cubic), the comparison is to `map!(A, out, tt)` (the existing default), which uses the `Guesser`-based correlated search per-query. The new fast path is competitive at the sparse end (the existing per-query path is already fast there via the iguesser hint) and 2–7× faster on dense batches. Drops the previous `any(isnan, A.u)` check from `LinearInterpolation`'s fast-path gate (added defensively in the original sorted-batch code, but it's O(n) and was making the fast path slower than `map!` on sparse cases). Users with NaN in `u` continue to get NaN through normal arithmetic; the original per-point NaN special-case was for ForwardDiff propagation through `t/t` not for user-supplied NaN data. Project.toml: bumps FindFirstFunctions compat from `1.3` to `1.9`. This PR depends on SciML/FindFirstFunctions.jl#65 (introduces the `searchsortedlast!` API). Supersedes draft PRs #527 (Linear + CubicSpline batch evaluators) and partly #528 (the 5 remaining types not yet ported can follow the same pattern in a follow-up). Co-Authored-By: Chris Rackauckas <accounts@chrisrackauckas.com>
Summary
Reformulates the existing sorted-search infrastructure as explicit strategy types and adds an in-place batched API.
Strategies (all
<: SearchStrategy):LinearScan— walk ±1 from the hintBracketGallop—bracketstrictlymontonic+ bounded binary search (the algorithm already backingsearchsorted*correlated)BinaryBracket— plainsearchsortedlast/searchsortedfirst, ignores any hintAuto— heuristically picks among the above based on hint validity andlength(v)Dispatched API:
Batched in-place API:
For sorted
queries, walksvin lockstep using the previous result as the hint for the next — total cost O(length(v) + length(queries)) underBracketGallop. For unsortedqueries, falls back to per-element search with no hint regardless of strategy.Why
Downstream callers like SciML/DataInterpolations.jl (#525-#528) have been hand-rolling sorted-batch lockstep walks with a manually-maintained
idxand a linearwhile idx < n - 1 && tt[i] > t[idx + 1]; idx += 1advance. That works well when the queries are dense relative to the knots, but regresses badly (sometimes by 100-1000×) when there are very few queries on a long vector — the linear walk is O(n) regardless of how many queries actually live in that gap.The right home for the "which strategy to use" heuristic is here, not duplicated in every consumer. With this PR,
searchsortedlast!does the right thing in every regime, and downstream code becomes a one-line replacement.Backwards compatibility
All existing exports keep working — they're now thin wrappers:
searchsortedfirstcorrelated(v, x, guess)≡searchsortedfirst(BracketGallop(), v, x, guess)searchsortedlastcorrelated(v, x, guess)≡searchsortedlast(BracketGallop(), v, x, guess)searchsortedfirstvec(v, x)allocates and callssearchsortedfirst!searchsortedlastvec(v, x)allocates and callssearchsortedlast!The
Guesserpath,Rangefast paths,Reverseordering, andsearchsortedfirstexpare untouched.Version bumped to 1.9.0 (new feature, additive).
Auto heuristic
The 32-element threshold is chosen because at that length the binary-search bracket-setup overhead is comparable to a worst-case linear walk, and below it
LinearScansaves the bracket call entirely.What is in this PR
src/FindFirstFunctions.jl: strategy types + dispatchedBase.searchsortedlast/Base.searchsortedfirstmethods +searchsortedlast!/searchsortedfirst!+ reformulated existing wrappers.test/runtests.jl: two new safetestsets covering strategy equivalence acrossn ∈ {0, 1, 2, 8, 33, 257}, every hint position,Reverseorder, the batched API on sorted/unsorted/empty inputs,DimensionMismatch, and sparse queries on a long vector to exercise the gallop hint path.Project.toml: bump to 1.9.0.Test plan
Pkg.test()locally: 26814 / 26814 passing on Julia 1.10 (was 25481 before — +1333 new assertions across the two new testsets)Please ignore until reviewed by @ChrisRackauckas. The follow-up plan is small PRs to DataInterpolations.jl swapping the hand-rolled walks in #525-#528's merged Akima and the in-flight #527/#528 for
searchsortedlast!calls.🤖 Generated with Claude Code