generate variables.json from variables.yaml to avoid yaml dependency #3932

Closed
oharboe wants to merge 1 commit into The-OpenROAD-Project:master from Pinata-Consulting:variables-json
Conversation

@oharboe
Collaborator

@oharboe oharboe commented Feb 27, 2026

generate-variables-docs.py now also outputs variables.json alongside the FlowVariables.md documentation. This allows defaults.py, non_stage_variables.py, and AutoTuner/utils.py to use the built-in json module instead of depending on the external yaml module.
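
A minimal sketch of what the consumer side could look like, assuming variables.json is a top-level mapping from variable name to a dict of fields and that some entries carry a "default" key. The schema is not shown in this thread, and the helper names load_variables and defaults_from are hypothetical, not taken from defaults.py or the other scripts.

```python
import json


def load_variables(json_path):
    """Read the generated variables.json using only the standard library."""
    with open(json_path, encoding="utf-8") as f:
        return json.load(f)


def defaults_from(variables):
    """Collect {name: default} for every variable that declares a default.

    Assumption: each entry is a dict that may carry a "default" key.
    """
    return {
        name: entry["default"]
        for name, entry in variables.items()
        if isinstance(entry, dict) and "default" in entry
    }
```

The point of the sketch is only that nothing beyond the json module is imported, so the script runs on a stock Python install.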

generate-variables-docs.py now also outputs variables.json alongside
the FlowVariables.md documentation. This allows defaults.py,
non_stage_variables.py, and AutoTuner/utils.py to use the built-in
json module instead of depending on the external yaml module.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Signed-off-by: Øyvind Harboe <oyvind.harboe@zylin.com>
@oharboe
Collaborator Author

oharboe commented Feb 27, 2026

@maliberty @vvbandeira Unrelated CI error

@hzeller

hzeller commented Feb 27, 2026

this is the precondition for The-OpenROAD-Project/OpenROAD#9558 to work

@maliberty
Member

If you really want to do this you should remove variables.yaml. It seems easier to just add yaml to

local pkgs="pandas numpy firebase_admin click pyyaml yamlfix"

@hzeller

hzeller commented Feb 28, 2026

If you really want to do this you should remove variables.yaml. It seems easier to just add yaml to

local pkgs="pandas numpy firebase_admin click pyyaml yamlfix"

But then you also need to have the package compiled in bazel so it can be used there. You can use pip in bazel, but having fewer dependencies (and less Python in general, personal opinion) is always good.

@maliberty
Member

We don't use bazel in ORFS so I'm not sure why it is relevant here?

@maliberty
Member

(You are going to lose the Python battle)

@hzeller

hzeller commented Feb 28, 2026

We don't use bazel in ORFS so I'm not sure why it is relevant here?

but why did the bazel build fail when updating the Python library in the bazel dependencies ?

@maliberty
Member

There is no bazel build of ORFS. Are you talking about some other OR PR?

@hzeller

hzeller commented Feb 28, 2026

When I updated the rules_python in bazel: The-OpenROAD-Project/OpenROAD#9558
The CI failed. It did not fail locally for me, but apparently in some orfs part that used yaml.

@maliberty
Member

I see, this is OR -> bazel-ORFS -> ORFS for testing. In that case the place to add it would seem to be https://github.com/The-OpenROAD-Project/bazel-orfs/blob/main/requirements.in.

@oharboe
Collaborator Author

oharboe commented Feb 28, 2026

I see, this is OR -> bazel-ORFS -> ORFS for testing. In that case the place to add it would seem to be https://github.com/The-OpenROAD-Project/bazel-orfs/blob/main/requirements.in.

The problem is that variables.yaml needs to be read by Bazel repository rules, which happens before the bazel build really begins and before we can use requirements.in.

Hence it is nice to have both variables.yaml (human readable and editable) and variables.json (machine ingestible, automatically generated, and easy to read from repository rules with stock Python before any dependencies are installed).

I have been considering fixing this for some time, but until @hzeller ran into it and I had Claude do it, I never got around to it.

@oharboe
Collaborator Author

oharboe commented Feb 28, 2026

In fact, it is not Python, but Starlark that needs to ingest the variables.json file.

.json is more computer friendly than .yaml.

@maliberty
Member

Fine, then you need to remove the yaml file here

@oharboe
Collaborator Author

oharboe commented Feb 28, 2026

Fine, then you need to remove the yaml file here

I want to keep the yaml file; it is human friendly and I view it as source code.

@maliberty
Member

I don't see anything that connects the two and they will drift apart. There should be one source of truth. If you want yaml you'll have to make it usable.


json_path = os.path.join(dir_path, "variables.json")
with open(json_path, "w") as file:
    json.dump(data, file, indent=2)
Collaborator Author

@maliberty We already check that the docs are regenerated, and this script also updates the .json file.

@oharboe
Collaborator Author

oharboe commented Feb 28, 2026

I don't see anything that connects the two and they will drift apart. There should be one source of truth. If you want yaml you'll have to make it usable.

It is automatically regenerated and enforced in CI via the doc generation script.
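
A rough sketch of how such a drift check might look, assuming the JSON is always written with indent=2 as in the snippet under review. The function names render_json and json_is_stale are hypothetical, and the actual CI check in ORFS may work differently (for example by regenerating the files and diffing the working tree).

```python
import json


def render_json(data):
    """Serialize the same way the reviewed snippet does (indent=2)."""
    return json.dumps(data, indent=2)


def json_is_stale(data, json_path):
    """Return True when the committed JSON differs from a fresh render of data.

    `data` is the already-parsed content of variables.yaml; parsing it
    (for example with yaml.safe_load from PyYAML) is left to the caller.
    """
    try:
        with open(json_path, encoding="utf-8") as f:
            committed = f.read()
    except FileNotFoundError:
        # A missing file counts as out of date.
        return True
    return committed != render_json(data)
```

A CI job could parse variables.yaml, call json_is_stale(data, "variables.json"), and fail when it returns True. Note that the check itself still needs a YAML parser to produce data, so it does not remove the yaml requirement for contributors who edit variables.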

@maliberty
Member

So in order to add/modify a variable or run the check you have to have yaml installed. This feels like an odd wart on the system. Why does variables.yaml have to be read so early that you can't install yaml first?

@oharboe
Collaborator Author

oharboe commented Feb 28, 2026

So in order to add/modify a variable or run the check you have to have yaml installed. This feels like an odd wart on the system. Why does variables.yaml have to be read so early that you can't install yaml first?

Bazel starlark repository rules use system dependencies, not dependencies in MODULE.bazel.

@oharboe
Collaborator Author

oharboe commented Feb 28, 2026

Gemini explains... A bit verbose, but I think it reads well.

Why .yaml is a Pain for Bazel Repository Rules

The move to include a .json version of our variables isn't about preference—it’s about Bazel's bootstrapping architecture. Here is why using .yaml directly in repository rules causes friction:

1. The "Chicken-and-Egg" Dependency Loop

Bazel Repository Rules (the logic that fetches dependencies) run during the Loading Phase.

  • To parse .yaml, you generally need an external library like pyyaml.
  • However, Bazel hasn't fetched your Python dependencies yet because it's still running the rules that define those dependencies.
  • This forces a reliance on the host system's Python, requiring every developer and CI runner to have pip install pyyaml pre-installed globally.

2. Starlark is YAML-Blind

The language used for Bazel configuration (Starlark) is a subset of Python, but:

  • JSON is Native: Starlark has a built-in json module. Reading a file is a one-liner: json.decode(repository_ctx.read(path)).
  • YAML is Foreign: There is no native YAML parser in Starlark. Parsing it requires either a complex custom implementation or calling an external system process, which breaks hermeticity.

3. Hermeticity and Portability

By using a .json file that is automatically generated from our human-friendly .yaml source:

  • Zero Host Requirements: We no longer care if the user has yaml installed on their local machine.
  • Deterministic Builds: Bazel can ingest the variables immediately and natively during the fetch phase.

Summary: We keep variables.yaml for human readability (comments, structure) and treat variables.json as a machine-readable artifact. Our CI ensures they never drift by enforcing that the doc-generation script (which produces the JSON) is run.

Would you like me to help draft a specific CI check to ensure the .json and .yaml files stay in sync?

@oharboe
Collaborator Author

oharboe commented Mar 1, 2026

@hzeller @maliberty Asked Claude to ingest the .yaml without requiring anything locally The-OpenROAD-Project/bazel-orfs#508

@hzeller

hzeller commented Mar 1, 2026

(A bit of outside-the-box idea here)

JSON is not easy for a human to read and write (no comments, no trailing commas, obnoxiously annoying quoting of field names).

YAML can be slightly more readable, but it is also very hard for a human to edit (very error-prone, finicky whitespace handling with no error detection).

If there is something that should be easy for humans to edit but also easy to process by Python-like languages, why not have something that looks like a bunch of Python-style tuples and can probably be ingested somewhat easily in Python and Starlark?

declare_variable(
    name = "GENERATE_SOME_THING",
    description = '''
some long winded description.
    ''',
    # comments work as well
    stages = [
        "foo",
        "bar",
        "baz",
    ],
)

@oharboe
Collaborator Author

oharboe commented Mar 1, 2026

I think yaml is easy to edit and we have the CI flow.

Closing this as I think I have the fix merged in bazel-orfs.

@oharboe oharboe closed this Mar 1, 2026
@oharboe oharboe reopened this Mar 2, 2026
@oharboe oharboe closed this Mar 2, 2026
@oharboe oharboe reopened this Mar 2, 2026
@oharboe
Collaborator Author

oharboe commented Mar 2, 2026

hmmm... this PR is probably an improvement, but I will tweak bazel-orfs to handle yaml dependency.

@oharboe oharboe closed this Mar 2, 2026

3 participants