This repository contains a layered MLIR tooling stack:
- src/TableGen: Parses and evaluates TableGen syntax.
- src/MLIR.ODS: Imports interpreted TableGen records into an internal ODS model.
- src/MLIR.Generators: Roslyn incremental source generator that turns ODS models into C#.
- src/MLIR: Runtime library for MLIR CST, parsing, printing, semantics, dialect registration, and transforms.
- tools/TableGenDebug: Debugging utility that loads a TableGen file, evaluates it with the embedded MLIR.Generators prelude, and prints matching records.
- tools/TdToCSharp: Debugging and inspection utility that compiles standalone .td files into generated C# dialect sources.
The intended flow is:
- TableGen source -> TableGen
- Interpreted records -> MLIR.ODS
- ODS model -> MLIR.Generators
- Generated C# -> MLIR
Prefer implementing features at the earliest correct layer.
- If a .td construct cannot be parsed or evaluated correctly, fix TableGen first.
- Do not add importer hacks in MLIR.ODS to compensate for missing TableGen support.
- Do not add generator hacks in MLIR.Generators to compensate for missing ODS model support.
Prefer actual MLIR ODS/TableGen shapes over repo-local simplifications.
When ODS or TableGen behavior is unclear:
- first check whether the answer is already obvious from local tests or repo conventions
- otherwise consult the mainline MLIR/LLVM ODS/TableGen definitions and documentation instead of inventing a local approximation
- treat upstream MLIR as the semantic reference for syntax and modeling intent unless this repo has an explicit, documented divergence
- if this repo intentionally diverges, document that in tests and code comments
Supported direction of travel:
- def X : Dialect
- class Y_Op<string mnemonic, list<Trait> traits = []> : Op<DialectDef, mnemonic, traits>;
- let arguments = (ins ...)
- let results = (outs ...)
- let assemblyFormat = "..."
When extending ODS support:
- preserve real inherited base-class structure
- preserve dialect references as record references when appropriate
- keep cppNamespace, summary, description, traits, and declarative assembly format available in the ODS model when they are present
- prefer matching upstream record structure and field names over adding compatibility shims
- add tests that mirror real upstream-style examples when possible
Generated C# namespaces come from cppNamespace.
- Split C++ namespaces on ::
- Pascal-case each segment for C#
- If the first segment is mlir, map it to MLIR
Example:
- ::mlir::arith -> MLIR.Arith
- ::mlir::foo_bar -> MLIR.FooBar
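The mapping rules can be sketched as a small shell function. This is an illustration only, not the generator's actual implementation: the function name cpp_namespace_to_csharp is hypothetical, and treating underscores as word boundaries for Pascal-casing is an assumption inferred from the foo_bar example.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the repo): maps a C++ namespace string to
# the C# namespace shape described by the rules above.
cpp_namespace_to_csharp() {
  printf '%s\n' "$1" | awk '
    {
      n = split($0, segments, "::")
      out = ""
      first = 1
      for (i = 1; i <= n; i++) {
        seg = segments[i]
        if (seg == "") continue            # empty piece before a leading "::"
        if (first && seg == "mlir") {
          name = "MLIR"                    # special case for the first segment
        } else {
          # Assumption: Pascal-casing splits on "_" and upper-cases each word start.
          m = split(seg, words, "_")
          name = ""
          for (j = 1; j <= m; j++) {
            name = name toupper(substr(words[j], 1, 1)) substr(words[j], 2)
          }
        }
        if (out == "") out = name
        else out = out "." name
        first = 0
      }
      print out
    }'
}

cpp_namespace_to_csharp "::mlir::arith"     # prints MLIR.Arith
cpp_namespace_to_csharp "::mlir::foo_bar"   # prints MLIR.FooBar
```

The two calls at the end reproduce the examples given above; any other casing behavior (for instance, segments that already contain upper-case letters) is left as-is by this sketch and may differ from what the generator does.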
The runtime is intentionally layered:
- CST remains the source of truth for parsing and printing
- semantic operations are typed operation classes, not a separate generic AST wrapper
- custom assembly should be represented as CST transforms, not printer-only behavior
Prefer these boundaries:
- Parser parses text into CST
- Printer prints CST
- Binder binds CST into typed semantic nodes
- ConcreteSyntaxBuilder rewrites semantic modules to CST and can be configured to prefer custom assembly or the generic format while deciding whether existing CST nodes should be reused or rebuilt
- GenericSyntaxBuilder rewrites custom CST back to generic CST
Keep Printer syntax-focused. If a change sounds like "print known ops differently," it probably belongs in a CST transform or dialect assembly hook instead.
There are four important test layers:
- tests/TableGen.Tests: Language-level TableGen parsing and evaluation tests.
- tests/MLIR.Generators.Tests: ODS importer and source-generation tests.
- tests/DialectTests: Analyzer-backed integration tests using real .td files and generated types.
- tests/MLIR.Tests: Runtime tests for CST, parser, printer, binder, and semantic/runtime behavior.
When changing behavior:
- add TableGen.Tests for new language constructs
- add MLIR.Generators.Tests for importer/model/emission changes
- add DialectTests when generated code should work in a normal consumer build
- add MLIR.Tests when runtime behavior changes
Preferred validation commands:
dotnet test tests/TableGen.Tests/TableGen.Tests.csproj
dotnet test tests/MLIR.Generators.Tests/MLIR.Generators.Tests.csproj
dotnet test tests/DialectTests/DialectTests.csproj
dotnet test tests/MLIR.Tests/MLIR.Tests.csproj
dotnet run --project tools/TableGen.Benchmarks/TableGen.Benchmarks.csproj -c Release -- run --output artifacts/benchmarks/local.json
dotnet build samples/GeneratedDialectConsumer/GeneratedDialectConsumer.csproj
dotnet test MLIR.slnx -m:1
Important:
- Prefer sequential test/build runs when touching TableGen or MLIR.Generators.
- Parallel dotnet runs can cause DLL lock failures in obj/.
- If a parallel run fails with "cannot open ... for writing," rerun sequentially before assuming the code is broken.
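One way to keep runs sequential is a small wrapper that invokes the test projects one at a time and stops at the first failure. This is a sketch, not a script that exists in the repo: the function name run_tests_sequentially and the DOTNET override variable are hypothetical, and passing -m:1 to each project run mirrors the solution-level command above rather than a documented requirement.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper (not part of the repo): runs each given test project
# in order, one at a time, and stops at the first failing project.
# DOTNET is an assumed override so the sketch can be dry-run with e.g. DOTNET=echo.
DOTNET="${DOTNET:-dotnet}"

run_tests_sequentially() {
  local project
  for project in "$@"; do
    echo "==> $project"
    "$DOTNET" test "$project" -m:1 || return 1
  done
}

# Example invocation, using the project paths listed above:
# run_tests_sequentially \
#   tests/TableGen.Tests/TableGen.Tests.csproj \
#   tests/MLIR.Generators.Tests/MLIR.Generators.Tests.csproj \
#   tests/DialectTests/DialectTests.csproj \
#   tests/MLIR.Tests/MLIR.Tests.csproj
```

Because the loop returns on the first non-zero exit code, a lock failure in an early project is not buried under later output.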
The MLIR runtime includes TableGen-backed dialect APIs generated by the Roslyn
source generator from src/MLIR/Dialects/**/*.td. Treat the generated C# as an
inspection artifact, not as source to edit.
For quick one-off inspection of standalone .td inputs, prefer tools/TdToCSharp
over creating a temporary consumer project. The tool uses the same ODS import,
dialect merge, symbol resolution, and emission pipeline as the source generator,
but exposes it directly from the command line.
To generate emitted C# directly from one or more .td files:
dotnet run --project tools/TdToCSharp/TdToCSharp.csproj -- path/to/file.td --stdout
dotnet run --project tools/TdToCSharp/TdToCSharp.csproj -- path/to/file.td -o artifacts/generated/td2cs
dotnet run --project tools/TdToCSharp/TdToCSharp.csproj -- a.td b.td --dialect mydialect --include-prelude
Use tools/TdToCSharp when:
- you want the final generated .g.cs for a specific .td file
- you want to inspect how multiple .td fragments merge into one dialect
- you want to debug generator output without building a full consumer project
Use tools/TableGenDebug instead when:
- you only need to inspect evaluated TableGen records
- you are debugging parsing, evaluation, inheritance, or field values before ODS import
To inspect the generated code for the runtime project:
dotnet build src/MLIR/MLIR.csproj -m:1 -p:EmitCompilerGeneratedFiles=true -p:CompilerGeneratedFilesOutputPath=obj/Generated
find src/MLIR/obj/Generated/MLIR.Generators/MLIR.Generators.DialectGenerator -maxdepth 1 -type f -name '*.g.cs' -print
The generator currently emits one file per discovered dialect, for example:
src/MLIR/obj/Generated/MLIR.Generators/MLIR.Generators.DialectGenerator/ArithDialectRegistration.g.cs
src/MLIR/obj/Generated/MLIR.Generators/MLIR.Generators.DialectGenerator/BuiltinDialectRegistration.g.cs
src/MLIR/obj/Generated/MLIR.Generators/MLIR.Generators.DialectGenerator/FuncDialectRegistration.g.cs
src/MLIR/obj/Generated/MLIR.Generators/MLIR.Generators.DialectGenerator/PreludeDialectRegistration.g.cs
Use rg to jump to the generated type, registration class, assembly-format
helper, or namespace you care about:
rg -n "namespace MLIR.Arith|public sealed class Arith_AddIOp|ArithDialectRegistration" src/MLIR/obj/Generated/MLIR.Generators/MLIR.Generators.DialectGenerator/ArithDialectRegistration.g.cs
If you need a smaller end-to-end generator example, build the sample consumer and inspect its generated dialect output:
dotnet build samples/GeneratedDialectConsumer/GeneratedDialectConsumer.csproj -m:1
find samples/GeneratedDialectConsumer/obj/Generated/MLIR.Generators/MLIR.Generators.DialectGenerator -maxdepth 1 -type f -name '*.g.cs' -print
When correlating generated C# back to TableGen, start from the input .td file under src/MLIR/Dialects/, then follow the pipeline in order:
- src/TableGen if parsing or evaluation looks wrong.
- src/MLIR.ODS if the evaluated records do not become the right ODS model.
- src/MLIR.Generators if the model is correct but the emitted C# is wrong.
Do not edit obj/Generated files. Make changes in the earliest correct layer,
rebuild with the commands above, and re-open the generated .g.cs output to
confirm the effect.
Interpreter-focused benchmarks live in tools/TableGen.Benchmarks.
- Use dotnet run --project tools/TableGen.Benchmarks/TableGen.Benchmarks.csproj -c Release -- run --output artifacts/benchmarks/local.json to generate a local benchmark report.
- On pull requests, CI runs the benchmark tool on both the PR head and the PR base, then publishes a relative comparison in the GitHub Actions step summary.
- On same-repository pull requests, CI also publishes the comparison as a sticky PR comment so the latest benchmark table stays visible on the conversation thread.
- The benchmark JSON is intended to be machine-readable; if you are making TableGen interpreter changes, prefer checking the benchmark summary instead of inferring performance from dotnet test wall-clock time.
- Treat changes within roughly 5% as noise unless the benchmark scenario is especially stable or repeated measurements show a consistent shift.
- If a change is meant to improve interpreter performance, mention which benchmark cases should move and verify them explicitly before concluding the work helped.
- Benchmark scenarios are directory-backed, not baked into the runner. Add new cases under tools/TableGen.Benchmarks/Cases/ with their .td inputs under tools/TableGen.Benchmarks/Inputs/.
- Fork-based pull requests may not have permission to write PR comments with GITHUB_TOKEN; in those cases rely on the workflow summary and uploaded artifacts instead of assuming the sticky comment will appear.
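The 5% noise guideline can be applied mechanically once you have a base and a head measurement for the same benchmark case. The sketch below takes the two numbers as arguments rather than reading the benchmark JSON, because the report schema is not described here; the function name classify_change and the "lower is better" interpretation of the measurements are assumptions.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the repo).
# Usage: classify_change BASE HEAD [THRESHOLD_PERCENT]
# Assumes BASE and HEAD are durations for one benchmark case (lower is better).
classify_change() {
  awk -v base="$1" -v head="$2" -v threshold="${3:-5}" 'BEGIN {
    if (base <= 0) { print "invalid base measurement"; exit 1 }
    change = (head - base) / base * 100          # relative change in percent
    magnitude = (change < 0) ? -change : change
    if (magnitude <= threshold) verdict = "noise"
    else if (change < 0)        verdict = "faster"
    else                        verdict = "slower"
    printf "%+.1f%% %s\n", change, verdict
  }'
}

classify_change 100 103   # prints +3.0% noise
classify_change 100 90    # prints -10.0% faster
classify_change 200 230   # prints +15.0% slower
```

A single "faster" verdict is still only one sample; the guidance above about repeated measurements applies before concluding that a change helped.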
- Do not edit bin/, obj/, TestResults/, or generated outputs under obj/Generated.
- Keep generated source logic centralized in src/MLIR.Generators.
- Prefer adding model richness in MLIR.ODS over embedding ad hoc parsing rules in the emitter.
- Prefer explicit tests for new TableGen constructs such as dags, code blocks, record references, traits, or assembly formats.
Treat documentation as part of the implementation, not polish to add only for public APIs.
- Add XML doc comments for non-public types and members when they carry behavior, invariants, caching rules, evaluation order, layering boundaries, or other logic a reader would need to understand the code confidently.
- Optimize comments for reader understanding rather than API formality. Explain responsibilities, data flow, captured assumptions, and why an algorithm is structured the way it is.
- Add inline comments for non-obvious control flow, subtle semantic choices, memoization, scope capture, inheritance merging, parser quirks, or behavior chosen to match upstream MLIR/TableGen semantics.
- Do not add comments that merely restate the code line-by-line. Prefer comments that help a future maintainer build the right mental model.
- When touching older code with weak documentation, improve it as you go, especially around internal helpers and private state that would otherwise require reverse engineering.
- If the repo intentionally diverges from upstream MLIR/TableGen behavior, document that near the code and in tests.
samples/GeneratedDialectConsumer is the canary for real analyzer usage.
If generator behavior changes, make sure this sample still:
- builds in a normal project
- consumes generated types directly
- uses .td files through AdditionalFiles
Ask:
- Is this a TableGen language issue?
- Is this an ODS interpretation/modeling issue?
- Is this a generator emission issue?
- Is this a runtime/CST/semantic issue?
Put the change in the earliest layer that can express it correctly.