
feat(sql): close out Epic 14 — analytical-SQL parser regression audit (Story 14.6)#105

Merged: fupelaqu merged 6 commits into main from release-r1 on Jun 8, 2026

fupelaqu commented Jun 8, 2026

Gating verification story for Epic 14 (Analytical SQL, R1). Stories 14.1–14.5
added 18 new SQL keywords to the parser; this story verifies they introduce
no regression and reconciles the docs/help indexes.

  • Cross-compile green on Scala 2.12.20 + 2.13.16
  • ParserSpec, ElasticConversionSpec, es8 + es6 bridge SQLQuerySpec all green
  • Keyword-collision audit clean: no fixture rename required (rank/row_number
      bare-identifier uses exist only in dead block-comment code; live fixtures
      alias to rnum/rk/drk)
  • Case-insensitivity confirmed via (?i) TokenRegex + new lowercased-keyword
      ParserSpec test covering all 18 Epic-14 keywords
  • WindowFunctionSpec live-green on all 4 ES majors (es6/es7/es8 default JDK,
      es9 under JDK 17)
  • Docs reconciled in both repos: backfilled functions_conditional.md
      (GREATEST/LEAST) and functions_aggregate.md (ROW_NUMBER/RANK/DENSE_RANK)
  • Help index reconciled: indexed real SAFE_CAST help file + added the
      REGEXP_LIKE help file that resolves a dangling string/_index.json ref
  • scalafmtAll clean
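
The case-insensitivity check above can be pictured with a small standalone sketch. It does not use the project's TokenRegex; `KeywordCaseSketch`, `keywordRegex`, and `isKeyword` are hypothetical names, and the keyword list is an illustrative subset of the 18, not the full set.

```scala
import scala.util.matching.Regex

object KeywordCaseSketch {
  // Illustrative subset of the keywords named in this PR (assumption: not the full list of 18).
  val keywords: Seq[String] =
    Seq("GREATEST", "LEAST", "ROW_NUMBER", "RANK", "DENSE_RANK", "STDDEV", "VARIANCE")

  // The (?i) prefix turns on case-insensitive matching for the whole pattern.
  def keywordRegex(keyword: String): Regex = ("(?i)" + keyword).r

  // A token counts as a keyword only if the whole token matches, so an alias
  // such as "rk" or "rnum" is never mistaken for RANK or ROW_NUMBER.
  def isKeyword(token: String): Boolean =
    keywords.exists(kw => keywordRegex(kw).pattern.matcher(token).matches())
}
```

Whole-token matching is what keeps the live fixture aliases (rnum/rk/drk) clear of the new reserved words in this sketch.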

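Epic 14 stories:

  • #99  Story 14.1: ORDER BY ... NULLS FIRST / NULLS LAST
  • #100 Story 14.2: GREATEST / LEAST conditional functions
  • #101 Story 14.3: ROW_NUMBER / RANK / DENSE_RANK ranking windows
  • #102 Story 14.4: STDDEV / VARIANCE statistical aggregates
  • #103 Story 14.5: PERCENTILE_CONT / PERCENTILE_DISC percentiles
  • #104 Story 14.6: parser regression audit (this PR)

Closed Issue #104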

fupelaqu and others added 6 commits June 5, 2026 10:39
Translate the ANSI NULLS FIRST / NULLS LAST clause to Elasticsearch's
sort.missing parameter (_first / _last). Add NullOrdering AST + parser
support (case-insensitive), and apply .missing(...) on the FieldSort
builder across the ES7/8/9 bridge template and the hand-maintained ES6
bridge.

Reject NULLS ordering on aggregation / GROUP BY ORDER BY (ES terms
aggregations have no missing parameter) rather than silently dropping it.

Covered by ParserSpec round-trip/reject cases, SQLQuerySpec JSON-emission
assertions, and per-version integration specs (es6/jest, es6/rest,
es7/rest, es8/java, es9/java). Docs updated in dql_statements.md.
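
A minimal sketch of the NULLS FIRST / NULLS LAST translation, assuming a two-variant ADT and a hand-built JSON string. The names `NullOrderingSketch`, `NullsFirst`, `NullsLast`, `missing`, and `sortClause` are hypothetical; the real change goes through the FieldSort builder, which is not reproduced here.

```scala
object NullOrderingSketch {
  // Hypothetical AST node for the ANSI NULLS clause.
  sealed trait NullOrdering
  case object NullsFirst extends NullOrdering
  case object NullsLast  extends NullOrdering

  // Elasticsearch accepts "_first" / "_last" as values of a sort's "missing" parameter.
  def missing(ordering: NullOrdering): String = ordering match {
    case NullsFirst => "_first"
    case NullsLast  => "_last"
  }

  private def q(s: String): String = "\"" + s + "\""

  // Builds one element of the "sort" array; "missing" is emitted only when
  // the SQL carried an explicit NULLS clause.
  def sortClause(field: String, descending: Boolean, nulls: Option[NullOrdering]): String = {
    val order = q("order") + ":" + q(if (descending) "desc" else "asc")
    val miss  = nulls.map(n => "," + q("missing") + ":" + q(missing(n))).getOrElse("")
    "{" + q(field) + ":{" + order + miss + "}}"
  }
}
```

For example, `ORDER BY price ASC NULLS LAST` would correspond to `sortClause("price", descending = false, Some(NullsLast))` in this sketch.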

Closed Issue #99

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
N-ary GREATEST(e1, e2, …) / LEAST(e1, e2, …) with ANSI null handling:
NULL args are ignored; result is NULL only when every arg is NULL. Numeric
scope. Emitted as a right-folded nested-ternary Painless script field over
Math.max / Math.min.

- Nullability-aware guard: non-nullable args (literals) are not wrapped in
  `== null` (Painless rejects `<primitive> == null`).
- validate() accepts numeric and unresolved (SQLAny) args, rejecting only
  definitively non-numeric types.
- Help JSON, docs (dql_statements.md), parser/bridge/integration tests.
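
The null handling and the right fold can be sketched as follows. This is not the script the commit emits: `GreatestLeastSketch`, `Expr`, and `script` are hypothetical, and the generated strings are only meant to show where the `== null` guard is skipped for non-nullable arguments; they have not been run through a Painless compiler.

```scala
object GreatestLeastSketch {
  // Reference semantics: NULL (None) arguments are ignored; the result is
  // None only when every argument is None.
  private def fold(op: (Double, Double) => Double)(args: Seq[Option[Double]]): Option[Double] =
    args.foldRight(Option.empty[Double]) {
      case (None, acc)        => acc
      case (some, None)       => some
      case (Some(x), Some(y)) => Some(op(x, y))
    }

  def greatest(args: Option[Double]*): Option[Double] = fold((a, b) => math.max(a, b))(args)
  def least(args: Option[Double]*): Option[Double]    = fold((a, b) => math.min(a, b))(args)

  // A script fragment plus whether it can evaluate to null.
  final case class Expr(code: String, nullable: Boolean)

  // Right-folded nested ternary; fn is "Math.max" or "Math.min".
  // A null check is generated only for operands flagged nullable.
  def script(fn: String, args: List[Expr]): Expr = args match {
    case Nil           => Expr("null", nullable = true)
    case single :: Nil => single
    case head :: tail =>
      val rest = script(fn, tail)
      val both = s"$fn(${head.code}, ${rest.code})"
      val restGuarded =
        if (rest.nullable) s"(${rest.code} == null ? ${head.code} : $both)" else both
      val code =
        if (head.nullable) s"(${head.code} == null ? ${rest.code} : $restGuarded)"
        else restGuarded
      Expr(code, head.nullable && rest.nullable)
  }
}
```

With one nullable field `a` and the literal `5`, only `a` is guarded, and the combined expression is marked non-nullable because the literal can never be null.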

Closed Issue #100

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
ANSI ranking windows over the existing top_hits enrichment path:
- ROW_NUMBER / RANK / DENSE_RANK with required ORDER BY inside OVER,
  optional PARTITION BY
- rank ordinals computed Scala-side in searchWithWindowEnrichment and
  injected per base-query row via (partitionKey, _id) lookup
- top-N push-down via inline LIMIT N inside OVER -> top_hits.size = N
  (default cap = index.max_inner_result_window, 100)

Review fixes: outer LIMIT no longer shrinks the window, LIMIT 0/negative
guarded, dotted ORDER BY field fallback, null enrichment under SELECT alias.
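
A rough sketch of how the three ordinals could be computed client-side and keyed by (partitionKey, _id). `RankingSketch`, `Row`, `Ordinals`, and `enrich` are hypothetical and stand in for whatever searchWithWindowEnrichment does internally; the example hardcodes a single descending numeric ORDER BY.

```scala
object RankingSketch {
  final case class Ordinals(rowNumber: Int, rank: Int, denseRank: Int)
  final case class Row(id: String, partition: String, score: Double)

  // Ordinals for keys that are already in ORDER BY order within one partition.
  // Ties share rank and denseRank; rank leaves gaps after ties, denseRank does not.
  def ordinals[K](sortedKeys: Seq[K]): Vector[Ordinals] = {
    val out = Vector.newBuilder[Ordinals]
    var prev: Option[K] = None
    var rank  = 0
    var dense = 0
    sortedKeys.zipWithIndex.foreach { case (key, i) =>
      val rowNumber = i + 1
      if (!prev.contains(key)) { rank = rowNumber; dense += 1 }
      out += Ordinals(rowNumber, rank, dense)
      prev = Some(key)
    }
    out.result()
  }

  // Lookup table keyed by (partitionKey, _id), here for ORDER BY score DESC.
  def enrich(rows: Seq[Row]): Map[(String, String), Ordinals] =
    rows.groupBy(_.partition).toSeq.flatMap { case (partition, rs) =>
      val sorted = rs.sortWith(_.score > _.score) // stable sort, highest score first
      sorted.zip(ordinals(sorted.map(_.score))).map { case (r, o) => (partition, r.id) -> o }
    }.toMap
}
```

Each base-query row can then pick up its ordinals with a single map lookup, which is the shape of the injection step described above.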

Closed Issue #101

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Accept and translate the six ANSI statistical aggregates (STDDEV,
STDDEV_POP, STDDEV_SAMP, VARIANCE, VAR_POP, VAR_SAMP), all mapping to
Elasticsearch's extended_stats aggregation via one ExtendedStatsAgg case
class parameterised by a 6-variant ExtendedStatsKind ADT.

- STDDEV = STDDEV_SAMP, VARIANCE = VAR_SAMP (ANSI default = sample).
- Sample variants project the _sampling keys (ES 7.7+); population variants
  project the un-suffixed keys (ES 6+). On ES < 7.7 sample variants log a
  warning and return null. Gated by ElasticsearchVersion.supportsStdDevVariance.
- Result key carried on ClientAggregation.aggResultField, set at conversion
  time and projected in extractMetrics; Stats branch gated on isEmpty to
  avoid fallthrough.
- OVER (PARTITION BY ...) supported via the aggregation window pipeline.
- Help JSON (6 entries + _index), docs (dql_statements, functions_aggregate),
  bridge + es6/bridge SQLQuerySpec JSON validation, testkit integration test.
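
The six-variant mapping might look roughly like this. The `Kind` hierarchy, `resultKey`, `supportsSampling`, and `project` are hypothetical stand-ins for ExtendedStatsKind, aggResultField, and ElasticsearchVersion.supportsStdDevVariance, whose real definitions are not shown in this PR text; the response keys follow the description above (sample variants read the `_sampling` keys, population variants the un-suffixed ones).

```scala
object ExtendedStatsSketch {
  // Hypothetical 6-variant ADT: SQL name, sample vs population, std-dev vs variance.
  sealed abstract class Kind(val sqlName: String, val sample: Boolean, val stdDev: Boolean)
  case object StdDev     extends Kind("STDDEV", sample = true, stdDev = true) // same as STDDEV_SAMP
  case object StdDevSamp extends Kind("STDDEV_SAMP", sample = true, stdDev = true)
  case object StdDevPop  extends Kind("STDDEV_POP", sample = false, stdDev = true)
  case object Variance   extends Kind("VARIANCE", sample = true, stdDev = false) // same as VAR_SAMP
  case object VarSamp    extends Kind("VAR_SAMP", sample = true, stdDev = false)
  case object VarPop     extends Kind("VAR_POP", sample = false, stdDev = false)

  // Key to read from the extended_stats response.
  def resultKey(kind: Kind): String = {
    val base = if (kind.stdDev) "std_deviation" else "variance"
    if (kind.sample) base + "_sampling" else base
  }

  // Sample keys are only available from Elasticsearch 7.7 onwards.
  def supportsSampling(major: Int, minor: Int): Boolean =
    major > 7 || (major == 7 && minor >= 7)

  // Returns None (SQL NULL) for sample variants on clusters older than 7.7.
  def project(kind: Kind, major: Int, minor: Int, stats: Map[String, Double]): Option[Double] =
    if (kind.sample && !supportsSampling(major, minor)) None
    else stats.get(resultKey(kind))
}
```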

Closed Issue #102

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Maps both ANSI ordered-set percentile aggregates to the Elasticsearch
`percentiles` aggregation (TDigest). One `PercentileAgg` case class; five
syntax forms (WITHIN GROUP, OVER PARTITION BY/ORDER BY, top-level GROUP BY,
(col, p) shorthand, bare whole-result-set), case-insensitive, p in [0,1]
enforced at compile time.

Multiple percentile calls on the same value column/partition coalesce into a
single ES agg (sourceAgg delegate, split back per SQL alias at extraction).
Nested `values.<key>` projection with numeric-proximity fallback for drifted
fractional keys; locale-independent percent label. PERCENTILE_DISC is
continuous-backed (documented). es6/bridge mirrored by hand.
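
The coalescing and the numeric-proximity fallback can be sketched under a few assumptions: `PercentileSketch`, `PercentileCall`, `coalesce`, and `lookup` are hypothetical names, calls are grouped by value column only (partitions are left out), and the tolerance `eps` is an arbitrary illustrative value.

```scala
import scala.util.Try

object PercentileSketch {
  final case class PercentileCall(alias: String, column: String, p: Double) {
    require(p >= 0.0 && p <= 1.0, s"p must be in [0,1], got $p")
  }

  // One percentiles aggregation per value column, listing each requested percent once.
  def coalesce(calls: Seq[PercentileCall]): Map[String, Seq[Double]] =
    calls.groupBy(_.column).map { case (column, cs) =>
      column -> cs.map(_.p * 100.0).distinct
    }

  // Reads one percentile from the response's "values" object. The exact key
  // is tried first; if p * 100 drifted in floating point, the numerically
  // closest key within eps is used instead.
  def lookup(values: Map[String, Double], p: Double, eps: Double = 1e-6): Option[Double] = {
    val target = p * 100.0
    values.get(target.toString).orElse {
      values.toSeq
        .flatMap { case (key, value) =>
          Try(key.toDouble).toOption.map(parsed => (math.abs(parsed - target), value))
        }
        .filter(_._1 <= eps)
        .reduceOption((a, b) => if (a._1 <= b._1) a else b)
        .map(_._2)
    }
  }
}
```

Splitting back per SQL alias then amounts to calling `lookup` once per original call against the shared aggregation result.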

Closed Issue #103

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
… (Story 14.6)

Gating verification story for Epic 14 (Analytical SQL, R1). Stories 14.1–14.5
added 18 new SQL keywords to the parser; this story verifies they introduce
no regression and reconciles the docs/help indexes.

- Cross-compile green on Scala 2.12.20 + 2.13.16
- ParserSpec, ElasticConversionSpec, es8 + es6 bridge SQLQuerySpec all green
- Keyword-collision audit clean: no fixture rename required (rank/row_number
  bare-identifier uses exist only in dead block-comment code; live fixtures
  alias to rnum/rk/drk)
- Case-insensitivity confirmed via (?i) TokenRegex + new lowercased-keyword
  ParserSpec test covering all 18 Epic-14 keywords
- WindowFunctionSpec live-green on all 4 ES majors (es6/es7/es8 default JDK,
  es9 under JDK 17)
- Docs reconciled in both repos: backfilled functions_conditional.md
  (GREATEST/LEAST) and functions_aggregate.md (ROW_NUMBER/RANK/DENSE_RANK)
- Help index reconciled: indexed real SAFE_CAST help file + added the
  REGEXP_LIKE help file that resolves a dangling string/_index.json ref
- scalafmtAll clean

Epic 14 stories:
- #99  Story 14.1 — ORDER BY ... NULLS FIRST / NULLS LAST
- #100 Story 14.2 — GREATEST / LEAST conditional functions
- #101 Story 14.3 — ROW_NUMBER / RANK / DENSE_RANK ranking windows
- #102 Story 14.4 — STDDEV / VARIANCE statistical aggregates
- #103 Story 14.5 — PERCENTILE_CONT / PERCENTILE_DISC percentiles
- #104 Story 14.6 — parser regression audit (this commit)

Closed Issue #104

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@fupelaqu fupelaqu marked this pull request as ready for review June 8, 2026 05:54
@fupelaqu fupelaqu merged commit 64afda9 into main Jun 8, 2026
2 checks passed