bundle/schema/jsonschema.json: duplicate enum in SseEncryptionDetailsAlgorithm breaks strict JSON Schema validators #5713

Description

@shahsmit14

Describe the issue

The JSON Schema committed in the repo (bundle/schema/jsonschema.json) and the jsonschema.json release asset both contain duplicate values in the enum for catalog.SseEncryptionDetailsAlgorithm:

"enum": ["AWS_SSE_S3", "AWS_SSE_KMS", "AWS_SSE_KMS", "AWS_SSE_S3"]

JSON Schema expects enum values to be unique (draft-04 says they MUST be; later drafts say they SHOULD be). Validators that enforce this (e.g. Open Policy Agent v1.5.1) fail to compile the schema with:

rego_type_error: unable to compile the schema: enum items must be unique

This is not a deploy/runtime bug — it affects schema consumers (CI type-checking, OPA, strict JSON Schema tooling). The underlying Go enum in databricks-sdk-go is correct (only AWS_SSE_S3 and AWS_SSE_KMS); the bug appears to be in schema generation of the committed JSON artifact.

This is the only duplicate-enum location in the full schema.

Configuration

No bundle deploy configuration is required. This reproduces from the published schema alone.

Minimal consumer setup (what we use in CI):

  1. schemas/adapter/databricks-asset-bundle.json
{
  "$ref": "https://json.schemastore.org/databricks-asset-bundles.json"
}
  2. policy/adapter/databricks-asset-bundle.rego (schema metadata)
# METADATA
# scope: rule
# schemas:
# - input: schema.adapter["databricks-asset-bundle"]

Note: https://json.schemastore.org/databricks-asset-bundles.json currently points (via $ref) at the CLI release asset releases/latest/download/jsonschema.json.

Steps to reproduce the behavior

Option A — inspect the schema directly

  1. Download the schema:
curl -sL -o jsonschema.json \
  https://github.com/databricks/cli/releases/latest/download/jsonschema.json
  2. Find the duplicate enum:
python3 -c "
import json
from collections import Counter
s = json.load(open('jsonschema.json'))
def find(obj, path=''):
    if isinstance(obj, dict):
        if 'enum' in obj and any(c>1 for c in Counter(obj['enum']).values()):
            print('PATH:', path)
            print('ENUM:', obj['enum'])
        for k,v in obj.items(): find(v, path+'/'+k if path else k)
    elif isinstance(obj, list):
        for i,v in enumerate(obj): find(v, path+f'[{i}]')
find(s)
"

Expected output:

PATH: $defs/github.com/databricks/databricks-sdk-go/service/catalog.SseEncryptionDetailsAlgorithm/oneOf[0]
ENUM: ['AWS_SSE_S3', 'AWS_SSE_KMS', 'AWS_SSE_KMS', 'AWS_SSE_S3']

Option B — OPA (strict JSON Schema compiler)

  1. Install OPA v1.5.1 (or v0.70.0 — same failure).
  2. Point a Rego rule at the live schemastore URL (or downloaded jsonschema.json) and run:
opa test policy --schema schemas -v
  3. See error:
policy/adapter/databricks-asset-bundle.rego:13: rego_type_error: unable to compile the schema: enum items must be unique

Schema location in repo: bundle/schema/jsonschema.json (~lines 4702–4712), embedded via bundle/schema/embed.go and served by databricks bundle schema.

Expected Behavior

  • enum for SseEncryptionDetailsAlgorithm should contain unique values only, e.g.:
"enum": ["AWS_SSE_S3", "AWS_SSE_KMS"]

Actual Behavior

  • The schema contains duplicated enum entries: AWS_SSE_S3 and AWS_SSE_KMS each appear twice.
  • OPA (and similar tools) reject the entire schema at compile time with:
enum items must be unique
  • Downstream CI pipelines that type-check bundle input against the published schema fail before any policy or deploy logic runs.

Affected snippet in bundle/schema/jsonschema.json:

"catalog.SseEncryptionDetailsAlgorithm": {
  "oneOf": [
    {
      "type": "string",
      "description": "SSE algorithm to use for encrypting S3 objects",
      "enum": [
        "AWS_SSE_S3",
        "AWS_SSE_KMS",
        "AWS_SSE_KMS",
        "AWS_SSE_S3"
      ]
    }
  ]
}

OS and CLI version

  • OS: N/A for this issue (schema artifact bug; repro does not require databricks bundle deploy)
  • CLI versions checked (via release jsonschema.json asset): v0.299.1, v0.299.2, v1.0.0, v1.1.0, v1.2.1, v1.3.0, v1.4.0, v1.5.0 — all contain the duplicate enum
  • Validator: Open Policy Agent v1.5.1 (opa test --schema)

Is this a regression?

Yes, for consumers of the published schema.

  • Before ~Apr 2026: JSON Schema Store served a self-contained inline copy of the DAB schema with no duplicate enums — OPA CI passed against that URL.
  • Apr 24, 2026: SchemaStore #5612 updated the DAB schema chain; the new schema included this duplicate enum.
  • May 7, 2026: SchemaStore #5624 changed databricks-asset-bundles.json to $ref CLI releases/latest/download/jsonschema.json directly.
  • CLI v0.299.1 (May 7, 2026) appears to be among the first releases shipping jsonschema.json as a release asset; verified broken from v0.299.1 onward.

We have not verified an older CLI release with a clean jsonschema.json asset (earlier tags return 404 for that file).

Consumer workaround: we pinned the $ref to SchemaStore commit 5ecefe98 (2025-09-15), the last known duplicate-free inline revision.
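
Another consumer-side option, which we have not adopted, would be to post-process a downloaded copy of the schema and point the local $ref at the cleaned file. A minimal sketch, assuming the only problem is repeated values inside "enum" arrays; the names dedupe_enums and write_deduped_copy and any file paths passed in are hypothetical, not part of the CLI or SchemaStore:

```python
import json


def dedupe_enums(node):
    """Recursively drop repeated values from every "enum" array, in place.

    The first occurrence of each value is kept, so the original order survives.
    Values are compared by their JSON encoding so that unhashable values
    (objects, arrays) work and JSON true is not confused with 1.
    """
    if isinstance(node, dict):
        enum = node.get("enum")
        if isinstance(enum, list):
            seen = set()
            unique = []
            for value in enum:
                key = json.dumps(value, sort_keys=True)
                if key not in seen:
                    seen.add(key)
                    unique.append(value)
            node["enum"] = unique
        for child in node.values():
            dedupe_enums(child)
    elif isinstance(node, list):
        for child in node:
            dedupe_enums(child)
    return node


def write_deduped_copy(src, dst):
    """Read the schema at src and write a cleaned copy to dst (hypothetical paths)."""
    with open(src, encoding="utf-8") as handle:
        schema = json.load(handle)
    with open(dst, "w", encoding="utf-8") as handle:
        json.dump(dedupe_enums(schema), handle, indent=2)
```

Applied to the affected snippet, dedupe_enums leaves "enum": ["AWS_SSE_S3", "AWS_SSE_KMS"]. Note that it treats any list stored under an "enum" key as a schema enum, including one nested inside a "default" or "examples" value, which is acceptable for a workaround but worth knowing.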

Debug Logs

Not applicable — no CLI command failure. Repro is schema inspection / OPA compile only.

Suggested fix: deduplicate enum values when generating bundle/schema/jsonschema.json, and add a CI check that no enum array contains duplicates.
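
The CI check could look roughly like the following. This is a sketch only: the actual schema generator is Go code and would likely do this in a Go test, and the function names here (find_duplicate_enums, main) are hypothetical. It is a file-based version of the one-liner from Option A that returns a non-zero status when a duplicate is found:

```python
import json
import sys


def find_duplicate_enums(node, path="$"):
    """Yield (path, enum) for every "enum" array that repeats a value."""
    if isinstance(node, dict):
        enum = node.get("enum")
        if isinstance(enum, list):
            # Compare by JSON encoding so objects and arrays are handled too.
            keys = [json.dumps(value, sort_keys=True) for value in enum]
            if len(set(keys)) != len(keys):
                yield path, enum
        for key, child in node.items():
            yield from find_duplicate_enums(child, f"{path}/{key}")
    elif isinstance(node, list):
        for index, child in enumerate(node):
            yield from find_duplicate_enums(child, f"{path}[{index}]")


def main(schema_path):
    """Return 1 if the schema file has any duplicate enum values, else 0."""
    with open(schema_path, encoding="utf-8") as handle:
        schema = json.load(handle)
    duplicates = list(find_duplicate_enums(schema))
    for path, enum in duplicates:
        print(f"duplicate enum at {path}: {enum}", file=sys.stderr)
    return 1 if duplicates else 0
```

A CI step would then call something like sys.exit(main("bundle/schema/jsonschema.json")) so the job fails whenever a duplicate is reintroduced.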
