Add same-metrics dataset for time series UI test scenarios#7
Add same-metrics dataset for time series UI test scenarios#7kpatticha wants to merge 8 commits intosimianhacker:mainfrom
Conversation
- Introduced a new dataset for generating time series test scenarios with the same metric in two streams, supporting different units and dimensions. - Implemented the SameMetricsSimulator, SameMetricsMetricsGenerator, and SameMetricsFormatter to handle the new dataset. - Added configuration and template generation for same metrics data streams. - Updated README.md to include details about the new dataset and usage examples.
📝 WalkthroughWalkthroughAdds two new datasets ("same-metrics" and "edge-cases") with types, config/metrics generators, formatters, template builders, and simulators; integrates dataset support into the main CLI/initialization and updates README usage/docs. Changes
Sequence DiagramsequenceDiagram
participant CLI as CLI/User
participant Simulator as Simulator (SameMetrics / EdgeCases)
participant ConfigGen as ConfigGenerator
participant MetricsGen as MetricsGenerator
participant Formatter as Formatter
participant ES as Elasticsearch
CLI->>Simulator: start(dataset, options)
Simulator->>ConfigGen: generateConfig(entityId)
ConfigGen-->>Simulator: Config
loop per-timestamp
Simulator->>MetricsGen: generateMetrics(config, timestamp)
MetricsGen-->>Simulator: Metrics
Simulator->>Formatter: formatMetrics(metrics)
Formatter-->>Simulator: FormatterResult[docs]
Simulator->>ES: setupElasticsearchTemplates() / putIndexTemplate
ES-->>Simulator: ack
Simulator->>ES: bulk index documents
ES-->>Simulator: ack
end
Simulator-->>CLI: stop/complete
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Possibly related PRs
Poem
🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 2
🧹 Nitpick comments (2)
src/types/same-metrics-types.ts (1)
11-15: Consider making dimension fields optional for flexibility.As noted in the config generator,
SameMetricsDimensionsis used with partial data (e.g., onlyhost.namefor some streams). Making these fields optional would align with actual usage:export interface SameMetricsDimensions { - 'host.name': string; - region: string; - environment: string; + 'host.name'?: string; + region?: string; + environment?: string; }This mirrors the optional fields already defined in
SameMetricsDocument.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/types/same-metrics-types.ts` around lines 11 - 15, Make the three properties on the SameMetricsDimensions interface optional to match actual usage and the existing SameMetricsDocument shape: update the SameMetricsDimensions declaration so 'host.name', region, and environment are optional (e.g., mark them as optional properties) to allow partial dimension objects when only some fields (like host.name) are present.src/formatters/same-metrics-formatter.ts (1)
18-22: Type-safe dimension assignment possible.Since
SameMetricsDimensionshas fixed keys (host.name,region,environment) that match the optional fields inSameMetricsDocument, you could avoid the double cast by spreading or directly assigning:- if (metrics.dimensions) { - for (const [key, value] of Object.entries(metrics.dimensions)) { - (doc as unknown as Record<string, unknown>)[key] = value; - } - } + if (metrics.dimensions) { + Object.assign(doc, metrics.dimensions); + }The current implementation works correctly, so this is purely a readability refinement.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/formatters/same-metrics-formatter.ts` around lines 18 - 22, The loop that assigns metrics.dimensions into the document uses a double-cast and can be simplified for type-safety and readability: replace the manual for-loop that iterates over metrics.dimensions (in same-metrics-formatter.ts) with a direct, type-safe assignment or spread so SameMetricsDimensions (keys: "host.name", "region", "environment") are merged into the SameMetricsDocument without casting; specifically, merge metrics.dimensions into the existing doc object (e.g., Object.assign(doc, metrics.dimensions) or doc = { ...doc, ...metrics.dimensions }) ensuring types align with SameMetricsDocument/SameMetricsDimensions.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/simulators/same-metrics-config-generator.ts`:
- Around line 25-31: The function buildDimensionsFromNames constructs a partial
dimensions object but unsafely casts it to SameMetricsDimensions; change the
types or the casting so callers can't assume all fields exist: either update
SameMetricsDimensions to make fields optional (host.name, region, environment)
and adjust SameMetricsConfig.dimensions accordingly, or keep
SameMetricsDimensions strict and change buildDimensionsFromNames to return
Partial<SameMetricsDimensions> (and update any consumer types/usages). Locate
buildDimensionsFromNames, DEFAULT_DIMENSION_VALUES, and
SameMetricsConfig/SameMetricsDimensions in the types/diff and ensure the
returned type and downstream type annotations are updated consistently to avoid
the unsafe cast.
In `@src/simulators/same-metrics-simulator.ts`:
- Around line 104-107: Replace the invalid "body" parameter passed to the
Elasticsearch client's indices.putIndexTemplate calls with the flattened
"template" parameter expected by `@elastic/elasticsearch` v9.0.2; locate the
putIndexTemplate invocations in same-metrics-simulator (the call using
dataStream and template), and make the argument { name: dataStream, template:
template } instead of { name: dataStream, body: template }, and apply the same
change to the identical putIndexTemplate calls in weather-simulator.ts and
histograms-simulator.ts so all simulators use the "template" parameter.
---
Nitpick comments:
In `@src/formatters/same-metrics-formatter.ts`:
- Around line 18-22: The loop that assigns metrics.dimensions into the document
uses a double-cast and can be simplified for type-safety and readability:
replace the manual for-loop that iterates over metrics.dimensions (in
same-metrics-formatter.ts) with a direct, type-safe assignment or spread so
SameMetricsDimensions (keys: "host.name", "region", "environment") are merged
into the SameMetricsDocument without casting; specifically, merge
metrics.dimensions into the existing doc object (e.g., Object.assign(doc,
metrics.dimensions) or doc = { ...doc, ...metrics.dimensions }) ensuring types
align with SameMetricsDocument/SameMetricsDimensions.
In `@src/types/same-metrics-types.ts`:
- Around line 11-15: Make the three properties on the SameMetricsDimensions
interface optional to match actual usage and the existing SameMetricsDocument
shape: update the SameMetricsDimensions declaration so 'host.name', region, and
environment are optional (e.g., mark them as optional properties) to allow
partial dimension objects when only some fields (like host.name) are present.
ℹ️ Review info
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (8)
README.mdsrc/formatters/same-metrics-formatter.tssrc/index.tssrc/simulators/same-metrics-config-generator.tssrc/simulators/same-metrics-metrics-generator.tssrc/simulators/same-metrics-simulator.tssrc/simulators/same-metrics-template-builder.tssrc/types/same-metrics-types.ts
| await this.elasticsearchClient.indices.putIndexTemplate({ | ||
| name: dataStream, | ||
| body: template | ||
| }); |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
cat package.json | jq '.dependencies["@elastic/elasticsearch"], .devDependencies["@elastic/elasticsearch"]'Repository: simianhacker/simian-forge
Length of output: 82
🏁 Script executed:
rg -n "putIndexTemplate" --type ts -C 3Repository: simianhacker/simian-forge
Length of output: 1642
🌐 Web query:
@elastic/elasticsearch 9.0.2 putIndexTemplate API documentation
💡 Result:
For @elastic/elasticsearch v9.0.2, the Node.js client method is:
client.indices.putIndexTemplate({ name, index_patterns, composed_of, template, data_stream, priority, version, _meta, ... })(creates/updates a composable index template). [1]
Official docs:
- JavaScript client API reference →
client.indices.putIndexTemplate(arguments + behavior). [1] - Elasticsearch v9 REST API → “Create or update an index template” (
PUT /_index_template/{name}) with request/response examples. [2]
Example (Node.js):
await client.indices.putIndexTemplate({
name: 'template_1',
index_patterns: ['template*'],
priority: 1,
template: { settings: { number_of_shards: 2 } }
})[1][2]
Use template parameter instead of body for v9.0.2 client compatibility.
The @elastic/elasticsearch v9.0.2 client uses flattened parameters. The body parameter is not valid; use template instead:
await this.elasticsearchClient.indices.putIndexTemplate({
name: dataStream,
template: template
});This same issue appears in weather-simulator.ts (lines 82-85) and histograms-simulator.ts (lines 104-107) and should be fixed consistently across all simulators.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/simulators/same-metrics-simulator.ts` around lines 104 - 107, Replace the
invalid "body" parameter passed to the Elasticsearch client's
indices.putIndexTemplate calls with the flattened "template" parameter expected
by `@elastic/elasticsearch` v9.0.2; locate the putIndexTemplate invocations in
same-metrics-simulator (the call using dataStream and template), and make the
argument { name: dataStream, template: template } instead of { name: dataStream,
body: template }, and apply the same change to the identical putIndexTemplate
calls in weather-simulator.ts and histograms-simulator.ts so all simulators use
the "template" parameter.
|
Moving it back to draft so I can refine the data and create a more realistic environment (ex now I just generate total_cost for all indices :D ) |
- Introduced a new dataset for generating time series data with edge cases, simulating real-world scenarios where metric types, field types, or units change mid-backfill. - Implemented EdgeCasesSimulator, EdgeCasesMetricsGenerator, and EdgeCasesFormatter to manage the new dataset. - Added configuration and template generation for edge case data streams. - Updated README.md to include details about the new edge cases dataset and usage examples.
There was a problem hiding this comment.
Actionable comments posted: 4
🧹 Nitpick comments (4)
src/formatters/same-metrics-formatter.ts (1)
15-16: Inconsistent error handling: formatter silently defaults while template builder throws.When
metrics.dataStreamis not found inSTREAM_TEMPLATE_CONFIGS, this code silently falls back to"double". However,buildIndexTemplateinsame-metrics-template-builder.ts(lines 145-147) throws an error for unknown data streams. This asymmetry could mask bugs if a new stream is added to the config but not properly exported.Consider making the formatter fail explicitly for unknown streams:
♻️ Proposed fix for consistency
formatMetrics( metrics: SameMetricsMetrics, ): FormatterResult<SameMetricsDocument>[] { const streamConfig = STREAM_TEMPLATE_CONFIGS[metrics.dataStream]; - const fieldType = streamConfig?.metricFieldType ?? "double"; + if (!streamConfig) { + throw new Error(`Unknown same-metrics data stream: ${metrics.dataStream}`); + } + const fieldType = streamConfig.metricFieldType;🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/formatters/same-metrics-formatter.ts` around lines 15 - 16, The formatter currently defaults to "double" when STREAM_TEMPLATE_CONFIGS[metrics.dataStream] is missing, creating an asymmetry with buildIndexTemplate which throws for unknown streams; update the formatter to check streamConfig (from STREAM_TEMPLATE_CONFIGS and metrics.dataStream) and throw a clear Error if undefined (including the metrics.dataStream value) instead of silently using the fallback, then derive fieldType from streamConfig.metricFieldType only after the existence check; reference STREAM_TEMPLATE_CONFIGS, metrics.dataStream, streamConfig, fieldType and the buildIndexTemplate behavior to keep the handling consistent.src/simulators/edge-cases-config-generator.ts (1)
5-15: Silent fallback to empty string may cause issues if dimensions expand.Currently
SHARED_DIMENSIONSonly contains"host.name", which is covered byDEFAULT_DIMENSION_VALUES. However, ifSHARED_DIMENSIONSis later extended with additional dimension names not present inDEFAULT_DIMENSION_VALUES, they will silently receive empty string values, which could produce invalid or confusing test data.Consider logging a warning or throwing an error for unmapped dimensions:
♻️ Proposed defensive check
function buildDimensionsFromNames(dimNames: string[]): EdgeCasesDimensions { const result: Record<string, string> = {}; for (const name of dimNames) { + if (!(name in DEFAULT_DIMENSION_VALUES)) { + console.warn(`No default value for dimension "${name}", using empty string`); + } result[name] = DEFAULT_DIMENSION_VALUES[name] ?? ""; } return result; }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/simulators/edge-cases-config-generator.ts` around lines 5 - 15, The buildDimensionsFromNames function silently uses "" for any dimNames not in DEFAULT_DIMENSION_VALUES; update it to detect unmapped names (using DEFAULT_DIMENSION_VALUES) and either throw an Error or log a clear warning so missing defaults don't produce invalid test data; modify buildDimensionsFromNames to iterate dimNames, and for each name if DEFAULT_DIMENSION_VALUES[name] is undefined then call console.warn or throw (choose consistent behavior for the module) with a message referencing the missing dimension name and SHARED_DIMENSIONS context, otherwise assign the default value from DEFAULT_DIMENSION_VALUES.src/simulators/edge-cases-template-builder.ts (1)
131-186: Consider whethertest.scenariofield is needed for parity with same-metrics.The
same-metricsformatter includes bothtest.scenarioandtest.data_streamfields (as documented in README Line 339), but this template only definestest.data_stream. While the README (Line 368) indicatesmetric.nameandtest.data_streamare sufficient for filtering edge cases, you may want to confirm this asymmetry is intentional.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/simulators/edge-cases-template-builder.ts` around lines 131 - 186, The index template built by buildIndexTemplate currently defines "test.data_stream" in the mappings but omits "test.scenario", causing asymmetry with the same-metrics formatter; add a "test.scenario" property to the mappings (properties object) with the same mapping as "test.data_stream" (e.g., { type: "keyword" }) so edge-case documents have parity with same-metrics, and ensure any consumers filtering on test.scenario will work consistently with the existing metric.name and test.data_stream fields.src/simulators/edge-cases-simulator.ts (1)
32-37: Unsafe double cast throughunknownbypasses type checking.The cast
this.metricsGenerator as unknown as EdgeCasesMetricsGeneratorloses compile-time safety. If the base class orcreateMetricsGenerator()changes, this would silently break.Consider storing the generator instance before calling
super()or adding a type guard.Proposed refactor
export class EdgeCasesSimulator extends BaseSimulator< EdgeCasesConfig, EdgeCasesMetrics, EdgeCasesDocument > { private formatter: EdgeCasesFormatter; - private edgeCasesMetricsGenerator: EdgeCasesMetricsGenerator; + private readonly edgeCasesMetricsGenerator: EdgeCasesMetricsGenerator; constructor(options: BaseSimulatorOptions) { + const metricsGen = new EdgeCasesMetricsGenerator(); super(options); this.formatter = new EdgeCasesFormatter(); - this.edgeCasesMetricsGenerator = - this.metricsGenerator as unknown as EdgeCasesMetricsGenerator; + this.edgeCasesMetricsGenerator = metricsGen; } + protected createMetricsGenerator(): MetricsGenerator< + EdgeCasesConfig, + EdgeCasesMetrics + > { + // Return the instance we already created, or create a new one if called before constructor completes + return this.edgeCasesMetricsGenerator ?? new EdgeCasesMetricsGenerator(); + }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/simulators/edge-cases-simulator.ts` around lines 32 - 37, The constructor is using an unsafe double-cast this.metricsGenerator as unknown as EdgeCasesMetricsGenerator which bypasses type checking; fix by ensuring a properly typed metrics generator is created or validated before assignment: either accept a typed generator in options or call/create the generator (via createMetricsGenerator or similar) and store it in a local variable, perform a runtime type guard/assertion that it implements EdgeCasesMetricsGenerator, then assign that validated instance to this.edgeCasesMetricsGenerator; update the constructor (and any factory that produces metricsGenerator) to avoid the unknown cast and use a safe typed conversion or guard.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@README.md`:
- Line 195: Update the --dataset help text to include the missing "edge-cases"
option so the list matches the rest of the README and the code; specifically
edit the help string that currently ends with "same-metrics (default:
\"hosts\")" to read something like "same-metrics, edge-cases (default:
\"hosts\")" so the flag documentation includes the edge-cases dataset.
In `@src/formatters/histogram-helpers.ts`:
- Around line 27-46: The exponential histogram builder
buildExponentialHistogramValue computes base as Math.abs(value % 1000) which can
be 0 for value === 0 or multiples of 1000; change the base calculation to
include the same fallback used in buildHistogramValue/buildTdigestValue (e.g.,
use Math.abs(value % 1000) || 1) so sum, min, and max are never zero and the
histogram isn't degenerate.
In `@src/simulators/edge-cases-metrics-generator.ts`:
- Around line 9-13: counterValueForStream currently always reads
getStreamConfig(dataStream).phase1.baseValue so phase2 metrics will be wrong if
base values differ; change counterValueForStream to accept a phase parameter
(e.g., phase: 'phase1' | 'phase2') and obtain the base value via
getPhaseConfig(dataStream, phase).baseValue (or similar helper) instead of
hardcoding phase1, and then update the call site(s) in generateMetrics to pass
the current phase into counterValueForStream so the correct phase-specific
baseValue is used.
In `@src/types/same-metrics-types.ts`:
- Around line 34-41: SameMetricsDocument's request_duration is typed too
narrowly as number; update its type to allow the histogram/tdigest shapes
returned by formatValueForFieldType used by SameMetricsFormatter (e.g. change
request_duration to a union like number | Histogram | TDigest or a more
permissive unknown/object union) so callers of SameMetricsFormatter and any
consumers of SameMetricsDocument accept histogram objects returned by
formatValueForFieldType.
---
Nitpick comments:
In `@src/formatters/same-metrics-formatter.ts`:
- Around line 15-16: The formatter currently defaults to "double" when
STREAM_TEMPLATE_CONFIGS[metrics.dataStream] is missing, creating an asymmetry
with buildIndexTemplate which throws for unknown streams; update the formatter
to check streamConfig (from STREAM_TEMPLATE_CONFIGS and metrics.dataStream) and
throw a clear Error if undefined (including the metrics.dataStream value)
instead of silently using the fallback, then derive fieldType from
streamConfig.metricFieldType only after the existence check; reference
STREAM_TEMPLATE_CONFIGS, metrics.dataStream, streamConfig, fieldType and the
buildIndexTemplate behavior to keep the handling consistent.
In `@src/simulators/edge-cases-config-generator.ts`:
- Around line 5-15: The buildDimensionsFromNames function silently uses "" for
any dimNames not in DEFAULT_DIMENSION_VALUES; update it to detect unmapped names
(using DEFAULT_DIMENSION_VALUES) and either throw an Error or log a clear
warning so missing defaults don't produce invalid test data; modify
buildDimensionsFromNames to iterate dimNames, and for each name if
DEFAULT_DIMENSION_VALUES[name] is undefined then call console.warn or throw
(choose consistent behavior for the module) with a message referencing the
missing dimension name and SHARED_DIMENSIONS context, otherwise assign the
default value from DEFAULT_DIMENSION_VALUES.
In `@src/simulators/edge-cases-simulator.ts`:
- Around line 32-37: The constructor is using an unsafe double-cast
this.metricsGenerator as unknown as EdgeCasesMetricsGenerator which bypasses
type checking; fix by ensuring a properly typed metrics generator is created or
validated before assignment: either accept a typed generator in options or
call/create the generator (via createMetricsGenerator or similar) and store it
in a local variable, perform a runtime type guard/assertion that it implements
EdgeCasesMetricsGenerator, then assign that validated instance to
this.edgeCasesMetricsGenerator; update the constructor (and any factory that
produces metricsGenerator) to avoid the unknown cast and use a safe typed
conversion or guard.
In `@src/simulators/edge-cases-template-builder.ts`:
- Around line 131-186: The index template built by buildIndexTemplate currently
defines "test.data_stream" in the mappings but omits "test.scenario", causing
asymmetry with the same-metrics formatter; add a "test.scenario" property to the
mappings (properties object) with the same mapping as "test.data_stream" (e.g.,
{ type: "keyword" }) so edge-case documents have parity with same-metrics, and
ensure any consumers filtering on test.scenario will work consistently with the
existing metric.name and test.data_stream fields.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 06cd9dd4-dbdf-45dc-8925-1e1dceff9c07
📒 Files selected for processing (16)
README.mdsrc/formatters/edge-cases-formatter.tssrc/formatters/histogram-helpers.tssrc/formatters/same-metrics-formatter.tssrc/index.tssrc/simulators/base-simulator.tssrc/simulators/edge-cases-config-generator.tssrc/simulators/edge-cases-metrics-generator.tssrc/simulators/edge-cases-simulator.tssrc/simulators/edge-cases-template-builder.tssrc/simulators/same-metrics-config-generator.tssrc/simulators/same-metrics-metrics-generator.tssrc/simulators/same-metrics-simulator.tssrc/simulators/same-metrics-template-builder.tssrc/types/edge-cases-types.tssrc/types/same-metrics-types.ts
✅ Files skipped from review due to trivial changes (1)
- src/types/edge-cases-types.ts
🚧 Files skipped from review as they are similar to previous changes (2)
- src/simulators/same-metrics-config-generator.ts
- src/simulators/same-metrics-simulator.ts
| --backfill <value> How far back to backfill data (e.g., now-1h) (default: "now-5m") | ||
| --count <number> Number of entities to generate (default: "10") | ||
| --dataset <name> Name of the dataset: hosts, weather, unique-metrics, histograms (default: "hosts") | ||
| --dataset <name> Name of the dataset: hosts, weather, unique-metrics, histograms, same-metrics (default: "hosts") |
There was a problem hiding this comment.
The --dataset help text is missing edge-cases.
Line 195 shows same-metrics but omits edge-cases from the dataset list, though it's documented elsewhere in the README and supported by the code.
📝 Proposed fix
- --dataset <name> Name of the dataset: hosts, weather, unique-metrics, histograms, same-metrics (default: "hosts")
+ --dataset <name> Name of the dataset: hosts, weather, unique-metrics, histograms, same-metrics, edge-cases (default: "hosts")📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| --dataset <name> Name of the dataset: hosts, weather, unique-metrics, histograms, same-metrics (default: "hosts") | |
| --dataset <name> Name of the dataset: hosts, weather, unique-metrics, histograms, same-metrics, edge-cases (default: "hosts") |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@README.md` at line 195, Update the --dataset help text to include the missing
"edge-cases" option so the list matches the rest of the README and the code;
specifically edit the help string that currently ends with "same-metrics
(default: \"hosts\")" to read something like "same-metrics, edge-cases (default:
\"hosts\")" so the flag documentation includes the edge-cases dataset.
| export function buildExponentialHistogramValue( | ||
| value: number, | ||
| ): Record<string, unknown> { | ||
| const base = Math.abs(value % 1000); | ||
| return { | ||
| scale: 3, | ||
| sum: base * 10, | ||
| min: base * 0.5, | ||
| max: base * 1.5, | ||
| zero: { threshold: 0, count: 0 }, | ||
| positive: { | ||
| indices: [0, 1, 2, 3, 4], | ||
| counts: [1, 2, 4, 2, 1], | ||
| }, | ||
| negative: { | ||
| indices: [], | ||
| counts: [], | ||
| }, | ||
| }; | ||
| } |
There was a problem hiding this comment.
Missing || 1 fallback causes degenerate histogram when value is 0 or multiple of 1000.
Line 30 uses Math.abs(value % 1000) without the || 1 fallback present in buildHistogramValue (line 9) and buildTdigestValue (line 20). When value is 0 or any multiple of 1000, base becomes 0, resulting in min: 0, max: 0, sum: 0.
Proposed fix
export function buildExponentialHistogramValue(
value: number,
): Record<string, unknown> {
- const base = Math.abs(value % 1000);
+ const base = Math.abs(value % 1000) || 1;
return {📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| export function buildExponentialHistogramValue( | |
| value: number, | |
| ): Record<string, unknown> { | |
| const base = Math.abs(value % 1000); | |
| return { | |
| scale: 3, | |
| sum: base * 10, | |
| min: base * 0.5, | |
| max: base * 1.5, | |
| zero: { threshold: 0, count: 0 }, | |
| positive: { | |
| indices: [0, 1, 2, 3, 4], | |
| counts: [1, 2, 4, 2, 1], | |
| }, | |
| negative: { | |
| indices: [], | |
| counts: [], | |
| }, | |
| }; | |
| } | |
| export function buildExponentialHistogramValue( | |
| value: number, | |
| ): Record<string, unknown> { | |
| const base = Math.abs(value % 1000) || 1; | |
| return { | |
| scale: 3, | |
| sum: base * 10, | |
| min: base * 0.5, | |
| max: base * 1.5, | |
| zero: { threshold: 0, count: 0 }, | |
| positive: { | |
| indices: [0, 1, 2, 3, 4], | |
| counts: [1, 2, 4, 2, 1], | |
| }, | |
| negative: { | |
| indices: [], | |
| counts: [], | |
| }, | |
| }; | |
| } |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/formatters/histogram-helpers.ts` around lines 27 - 46, The exponential
histogram builder buildExponentialHistogramValue computes base as Math.abs(value
% 1000) which can be 0 for value === 0 or multiples of 1000; change the base
calculation to include the same fallback used in
buildHistogramValue/buildTdigestValue (e.g., use Math.abs(value % 1000) || 1) so
sum, min, and max are never zero and the histogram isn't degenerate.
| function counterValueForStream(dataStream: string, timestamp: Date): number { | ||
| const base = getStreamConfig(dataStream).phase1.baseValue; | ||
| const minutes = Math.floor(timestamp.getTime() / (60 * 1000)); | ||
| return base + minutes * 5; | ||
| } |
There was a problem hiding this comment.
counterValueForStream always uses phase1.baseValue, ignoring the current phase.
The function reads phase1.baseValue unconditionally, but the configuration defines separate baseValue entries for both phases. While all current configurations have identical phase1/phase2 base values, this creates an inconsistency: if base values ever diverge, phase 2 metrics will incorrectly use phase 1's base value.
Consider accepting the phase as a parameter and using getPhaseConfig:
🐛 Proposed fix to use phase-specific baseValue
-function counterValueForStream(dataStream: string, timestamp: Date): number {
- const base = getStreamConfig(dataStream).phase1.baseValue;
+import { getPhaseConfig } from "./edge-cases-template-builder";
+
+function counterValueForStream(dataStream: string, timestamp: Date, phase: EdgeCasesPhase): number {
+ const base = getPhaseConfig(dataStream, phase).baseValue;
const minutes = Math.floor(timestamp.getTime() / (60 * 1000));
return base + minutes * 5;
}Then update the call site in generateMetrics:
- counterValue: counterValueForStream(config.dataStream, timestamp),
+ counterValue: counterValueForStream(config.dataStream, timestamp, phase),🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/simulators/edge-cases-metrics-generator.ts` around lines 9 - 13,
counterValueForStream currently always reads
getStreamConfig(dataStream).phase1.baseValue so phase2 metrics will be wrong if
base values differ; change counterValueForStream to accept a phase parameter
(e.g., phase: 'phase1' | 'phase2') and obtain the base value via
getPhaseConfig(dataStream, phase).baseValue (or similar helper) instead of
hardcoding phase1, and then update the call site(s) in generateMetrics to pass
the current phase into counterValueForStream so the correct phase-specific
baseValue is used.
| export interface SameMetricsDocument { | ||
| "@timestamp": string; | ||
| "metric.name": string; | ||
| request_duration: number; | ||
| "test.scenario": string; | ||
| "test.data_stream": string; | ||
| [key: string]: unknown; | ||
| } |
There was a problem hiding this comment.
Type mismatch: request_duration can be a histogram object, not just a number.
The SameMetricsFormatter uses formatValueForFieldType() which returns histogram/tdigest objects for non-numeric field types. The current type request_duration: number is too narrow.
Proposed fix
export interface SameMetricsDocument {
"@timestamp": string;
"metric.name": string;
- request_duration: number;
+ request_duration: unknown;
"test.scenario": string;
"test.data_stream": string;
[key: string]: unknown;
}Alternatively, define a union type for the histogram shapes if stricter typing is desired.
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| export interface SameMetricsDocument { | |
| "@timestamp": string; | |
| "metric.name": string; | |
| request_duration: number; | |
| "test.scenario": string; | |
| "test.data_stream": string; | |
| [key: string]: unknown; | |
| } | |
| export interface SameMetricsDocument { | |
| "@timestamp": string; | |
| "metric.name": string; | |
| request_duration: unknown; | |
| "test.scenario": string; | |
| "test.data_stream": string; | |
| [key: string]: unknown; | |
| } |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/types/same-metrics-types.ts` around lines 34 - 41, SameMetricsDocument's
request_duration is typed too narrowly as number; update its type to allow the
histogram/tdigest shapes returned by formatValueForFieldType used by
SameMetricsFormatter (e.g. change request_duration to a union like number |
Histogram | TDigest or a more permissive unknown/object union) so callers of
SameMetricsFormatter and any consumers of SameMetricsDocument accept histogram
objects returned by formatValueForFieldType.
Add same-metrics and edge-cases datasets for time series UI test scenarios
Summary
Introduces two new datasets for testing the Metrics experience in Discover:
same-metrics-- 9 scenario pairs (15 data streams) where each pair shares the metric namerequest_durationbut differs along one axis (dimensions, ES metric type, field type, histogram variant, or unit).edge-cases-- 6 data streams that each undergo a mid-backfill rollover to change their mapping, simulating real-world metric type, field type, or unit updates over time.Build first
npm run buildDataset:
same-metrics./forge --dataset same-metrics [--interval 1m] [--backfill now-1h] [--purge]--countis ignored; all 15 streams are always populated.same-metric-identical-asame-metric-identical-bsame-metric-different-dims-host-regionsame-metric-different-dims-host-envsame-metric-different-estype-gaugesame-metric-different-estype-countersame-metric-different-estype-histogramsame-metric-different-estype-gauge(shared)same-metric-different-fieldtype-longsame-metric-different-fieldtype-doublesame-metric-different-histogram-tdigestsame-metric-different-histogram-exponentialsame-metric-different-histogram-tdigest(shared)same-metric-different-histogram-legacysame-metric-different-unit-nullsame-metric-different-unit-mssame-metric-different-unit-ms(shared)same-metric-different-unit-secondsShared metric:
metric.name=request_duration, fieldrequest_duration. Each document includestest.scenarioandtest.data_streamfor filtering in Discover.Dataset:
edge-cases./forge --dataset edge-cases [--interval 1m] [--purge]--countand--backfillare ignored; the dataset generates a fixed 10-minute window (5 min per phase).Each stream starts with one mapping (phase 1), then the index template is updated, the data stream is rolled over, and data continues with the new mapping (phase 2). Each stream uses its stream name as both
metric.nameand the metric field name.edge-case-gauge-to-counteredge-case-histogram-to-gaugeedge-case-histogram-to-tdigestedge-case-tdigest-to-exponentialedge-case-unit-null-to-msedge-case-unit-ms-to-sOther changes
buildHistogramValue,buildTdigestValue,buildExponentialHistogramValue) intosrc/formatters/histogram-helpers.ts, used by both datasets.runBackfillStreamandrunRealTimeStreaminBaseSimulatorprotected with optional start/end params so subclasses can implement multi-phase backfill.This PR was prepared with AI assistance.
Summary by CodeRabbit
Release Notes
New Features
same-metricsandedge-casesas new--datasetCLI options for data generationDocumentation