feature: project metadata #30

Merged
cscheid merged 30 commits into main from feature/project-metadata
Mar 12, 2026

@gordonwoodhull
Collaborator

_quarto.yml and _metadata.yml contribute to metadata across runtimes.

Includes smoke-all tests for the WASM runtime. quarto-hub uses the common render function render_qmd.

The browser runtime is a little different. It may be worth running the smoke tests there too, either in addition to or instead of the current WASM smoke-all tests. Getting this running in the browser took a few more steps, and it was tested manually.

Filing as draft for CI until we discuss, but I think this is ready to merge.

Metadata Merge Layers

During the AstTransformsStage, metadata from multiple sources is merged
into a single flat config for the target format. Each layer is first
flattened via resolve_format_config: top-level keys are kept, the
format key is removed, and format.{target}.* keys are merged on top
(overriding same-named top-level keys). The flattened layers are then
merged in order — later layers override earlier ones.
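
As an illustration of the flattening step described above, here is a minimal sketch. It assumes a tiny stand-in `Value` enum and a hypothetical `flatten_for_format` helper; the real `resolve_format_config` operates on `ConfigValue` and may merge nested values differently (this sketch overlays shallowly).

```rust
use std::collections::BTreeMap;

/// Hypothetical stand-in for a config value; not the real ConfigValue type.
#[derive(Debug, Clone, PartialEq)]
pub enum Value {
    Str(String),
    Bool(bool),
    Map(BTreeMap<String, Value>),
}

/// Flatten one metadata layer for `target`: keep top-level keys, drop the
/// `format` key, then overlay `format.{target}.*` on top (shallow overlay,
/// which is an assumption of this sketch).
pub fn flatten_for_format(
    layer: &BTreeMap<String, Value>,
    target: &str,
) -> BTreeMap<String, Value> {
    // Start from every top-level key except `format`.
    let mut flat: BTreeMap<String, Value> = layer
        .iter()
        .filter(|(k, _)| k.as_str() != "format")
        .map(|(k, v)| (k.clone(), v.clone()))
        .collect();

    // Overlay the target format's keys; they win over same-named top-level keys.
    if let Some(Value::Map(formats)) = layer.get("format") {
        if let Some(Value::Map(target_cfg)) = formats.get(target) {
            for (k, v) in target_cfg {
                flat.insert(k.clone(), v.clone());
            }
        }
    }
    flat
}
```

With a layer containing `theme: cosmo` and `format.html.theme: darkly`, flattening for `html` would leave a single top-level `theme: darkly` and no `format` key.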

Merge order (lowest → highest precedence)

  1. Project: _quarto.yml metadata
  2. Directory: _metadata.yml files between project root and document
    directory, walked root-to-leaf (deeper directories override shallower).
    The project root directory itself is excluded (the walk starts from the
    first subdirectory toward the document).
  3. Document: YAML frontmatter of the .qmd file
  4. Runtime: injected by the host environment (e.g., hub-client sets
    format.html.source-location: full for scroll sync)

Layers 1–2 require a project config (_quarto.yml) to exist.
Layer 4 is optional and comes from SystemRuntime::runtime_metadata().
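
The precedence rules above can be sketched roughly as follows, assuming each layer has already been flattened to string key/value pairs. The `Layer` alias and the `merge_layers` and `resolve` functions are hypothetical names for illustration, and the shallow key-by-key override is an assumption rather than a description of the actual merge.

```rust
use std::collections::BTreeMap;

/// Hypothetical flattened layer: key -> value, both strings for simplicity.
pub type Layer = BTreeMap<String, String>;

/// Merge already-flattened layers in order; later layers override earlier ones.
pub fn merge_layers(layers: &[Layer]) -> Layer {
    let mut merged = Layer::new();
    for layer in layers {
        for (k, v) in layer {
            merged.insert(k.clone(), v.clone());
        }
    }
    merged
}

/// Assemble the layers from lowest to highest precedence:
/// project, directories (root-to-leaf), document, runtime.
pub fn resolve(
    project: Option<Layer>,
    directories: Vec<Layer>,
    document: Layer,
    runtime: Option<Layer>,
) -> Layer {
    let mut layers: Vec<Layer> = Vec::new();
    layers.extend(project); // absent when there is no project config
    layers.extend(directories);
    layers.push(document);
    layers.extend(runtime); // optional highest-precedence layer
    merge_layers(&layers)
}
```

Keeping the ordering in one place (`resolve`) means a new layer only has to be inserted at the right position in the vector.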

Compatibility with TS Quarto

This matches the TS Quarto merge order in directoryMetadataForInputFile
(src/project/project-shared.ts), which walks from project root toward
the input directory, plus project and CLI metadata layers. TS Quarto's
CLI --metadata / --metadata-file flags correspond to our runtime
metadata layer.

Note: TS Quarto's directory walk uses relative(projectDir, inputDir)
split by path separator, which means root _metadata.yml is only checked
when the relative path is non-empty (i.e., the document is in a
subdirectory). Our implementation matches this behavior.
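
A rough sketch of the directory walk, following the merge-order description (project root excluded, walk starts at the first subdirectory). The function name `metadata_dirs` is hypothetical, and the sketch only computes which directories would be consulted; it does not read or parse any `_metadata.yml` file.

```rust
use std::path::{Path, PathBuf};

/// Directories whose `_metadata.yml` would be consulted, in root-to-leaf order.
/// The project root itself is not included; an empty result means either the
/// document sits directly in the project root or it lies outside the project.
pub fn metadata_dirs(project_dir: &Path, document_dir: &Path) -> Vec<PathBuf> {
    // Relative path from the project root to the document directory.
    let Ok(relative) = document_dir.strip_prefix(project_dir) else {
        return Vec::new();
    };

    let mut current = project_dir.to_path_buf();
    let mut dirs = Vec::new();
    for component in relative.components() {
        // Descend one level at a time, recording each directory on the way.
        current.push(component);
        dirs.push(current.clone());
    }
    dirs
}
```

For a project at `/proj` and a document directory `/proj/chapters/part1`, this yields `/proj/chapters` followed by `/proj/chapters/part1`, so the deeper directory's metadata is merged last.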

Implementation

crates/quarto-core/src/stage/stages/ast_transforms.rs (merge logic)
crates/quarto-core/src/project.rs: directory_metadata_for_document
crates/quarto-config/src/format.rs: resolve_format_config

@gordonwoodhull force-pushed the feature/project-metadata branch from 36c23a1 to d80cc8c on March 12, 2026 10:22
@gordonwoodhull marked this pull request as ready for review on March 12, 2026 10:22
@gordonwoodhull
Collaborator Author

Progress since PR was filed

Theme CSS in the pipeline. Theme CSS compilation (SCSS → CSS via Bootstrap/Bootswatch) now runs inside a CompileThemeCssStage in the render pipeline, after metadata merge. This means theme: set in _quarto.yml or _metadata.yml actually affects CSS output — previously only document frontmatter was consulted. Includes:

  • A general SystemRuntime caching interface (filesystem on native, IndexedDB on WASM) so SCSS compilation results are cached across renders
  • Removal of the pre-pipeline CSS extraction in render_to_file.rs and JS-side compileAndInjectThemeCss in hub-client

MetadataMergeStage extracted. Metadata merging (project → directory → document → runtime layers, with format flattening) is now its own pipeline stage, separate from AstTransformsStage. This gave CompileThemeCssStage a clean place to slot in after merge but before transforms.

All pipeline consumers migrated off Format.metadata. Transforms like AppendixTransform, FootnoteTransform, TitleBlockTransform, and template selection now read from doc.ast.meta (ConfigValue) instead of the old Format.metadata (serde_json::Value). The Format.metadata field, extract_format_metadata(), and all supporting machinery have been removed — Format is now just identifier + extension + pipeline flag.

Smoke-all Playwright E2E tests. 34 smoke-all fixtures now run as browser E2E tests through the full Automerge pipeline (hub server → sync → WASM render → preview). Uses document readiness polling instead of fixed sleeps.

Bug fixes. Project-level !path resolution for SCSS themes, ProjectContext.config made non-optional to fix single-file format flattening, and appendix-style metadata inheritance from project config.

Add resolve_format_config() to extract format-specific settings from metadata.
This enables proper merging of _quarto.yml settings with document frontmatter,
where format.html.* settings override top-level settings when rendering to HTML.

Changes:
- New quarto-config/src/format.rs with resolve_format_config() function
- ProjectConfig stores full metadata as ConfigValue (replaces raw + format_config)
- AstTransformsStage flattens both project and document metadata for target format
- TocGenerateTransform reads from ast.meta instead of format metadata
- WASM updated to use ProjectConfig::with_metadata()

Merge precedence (lowest to highest):
1. Project top-level settings
2. Project format-specific settings (format.{target}.*)
3. Document top-level settings
4. Document format-specific settings (format.{target}.*)

Includes unit tests for format resolution and integration tests for
metadata merging in AstTransformsStage.

Add support for _metadata.yml files in directory hierarchies, matching
TS Quarto behavior. Directory metadata is discovered by walking from
project root to document's parent directory.

Core implementation:
- Add directory_metadata_for_document() in project.rs
- Walk directory hierarchy, parse _metadata.yml files
- Support both .yml and .yaml extensions
- Return layers in root-to-leaf order for merging
- Resolve relative paths in _metadata.yml against their source directory

Merge integration:
- Update ast_transforms.rs to include directory metadata
- Merge order: project -> dir[0] -> dir[1] -> ... -> document
- Each layer flattened for target format before merging

Smoke-all tests for basic inheritance, multi-level hierarchy merging,
document overrides, and path resolution.

Also adds default noErrorsOrWarnings assertion to smoke-all tests
(matching TS Quarto conventions) and supportPath file existence checks.

Replace the hardcoded single-file ProjectContext in render_qmd() with
ProjectContext::discover(), enabling _quarto.yml and _metadata.yml
support in hub-client WASM rendering.

Key changes:
- Unify the two WasmRuntime instances: the global VFS singleton is now
  stored as Arc<WasmRuntime> and shared with the rendering pipeline
  (previously each render created a fresh empty WasmRuntime)
- Port directory_metadata_for_document() from std::fs to SystemRuntime
  trait, enabling it to work with both native filesystem and WASM VFS
- render_qmd() now calls ProjectContext::discover() to find _quarto.yml
  in VFS parent directories (render_qmd_content* variants remain
  single-file since they receive inline content)

Includes 6 end-to-end WASM tests verifying project title inheritance,
document override precedence, parent directory discovery, and directory
metadata merging.

Vitest-based test runner that exercises all 15 smoke-all fixtures through
the WASM rendering pipeline, verifying feature parity with the native
Rust test runner. Implements all assertion types: ensureFileRegexMatches,
ensureHtmlElements (via jsdom), noErrors, noErrorsOrWarnings, shouldError,
and printsMessage. Filesystem assertions (fileExists, folderExists,
pathDoesNotExist) are parsed but treated as no-ops in WASM context.

One skip: expected-error.qmd's printsMessage check is bypassed because
WASM formats the error message differently than native render_to_file.
The shouldError assertion still runs for that fixture.

Adds `yaml` devDependency for frontmatter parsing.

Introduce a general-purpose mechanism for runtimes to inject metadata into the
configuration merge pipeline at the highest precedence, matching how quarto-cli
handles --metadata flags. Any runtime (WASM, native, sandboxed) can now provide
arbitrary metadata without per-feature API additions.

Changes:
- Add runtime_metadata() to SystemRuntime trait (returns Option<serde_json::Value>)
- Implement runtime metadata storage in WasmRuntime with set/get methods
- Add vfs_set_runtime_metadata/vfs_get_runtime_metadata WASM entry points
- Update AstTransformsStage to merge runtime metadata as highest-precedence layer
- Relax merge gate to also trigger when runtime metadata present (no project needed)
- Fix pre-existing bug: pampa HTML writer now reads top-level source-location key
  (consistent with Pandoc, which receives flattened metadata after format resolution)
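
One way to picture the runtime metadata hook is a trait method returning an optional layer. The sketch below uses hypothetical names (`RuntimeLike`, `PlainRuntime`, `HostRuntime`, `apply_runtime_layer`) and string maps in place of `serde_json::Value`; the default method body returning `None` is an assumption, not something the commit states.

```rust
use std::collections::BTreeMap;

/// Hypothetical flattened metadata map used only in this sketch.
pub type Metadata = BTreeMap<String, String>;

/// Stand-in for the runtime trait; real runtimes return Option<serde_json::Value>.
pub trait RuntimeLike {
    /// Assumed default: a runtime contributes no metadata unless it opts in.
    fn runtime_metadata(&self) -> Option<Metadata> {
        None
    }
}

/// A runtime that relies on the default (no runtime layer).
pub struct PlainRuntime;
impl RuntimeLike for PlainRuntime {}

/// A runtime that stores metadata set by its host, e.g. a browser client.
#[derive(Default)]
pub struct HostRuntime {
    metadata: Option<Metadata>,
}

impl HostRuntime {
    pub fn set_runtime_metadata(&mut self, metadata: Metadata) {
        self.metadata = Some(metadata);
    }
}

impl RuntimeLike for HostRuntime {
    fn runtime_metadata(&self) -> Option<Metadata> {
        self.metadata.clone()
    }
}

/// Apply the runtime layer last so it overrides project, directory and document.
pub fn apply_runtime_layer(mut merged: Metadata, runtime: &dyn RuntimeLike) -> Metadata {
    if let Some(layer) = runtime.runtime_metadata() {
        merged.extend(layer);
    }
    merged
}
```

Because the hook returns arbitrary metadata, a host can switch on something like `source-location: full` without a dedicated per-feature API.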

Merge precedence (lowest to highest):
  Project → Directory → Document → Runtime

Tests: 6 Rust unit tests + 12 WASM integration tests, all passing.
Full suite: 6550 Rust tests + 47 WASM tests green.

Replace render_qmd_content (content string, no project context) with
render_qmd (VFS path, full project context) for the Preview component.
This gives live preview access to _quarto.yml, _metadata.yml, themes,
and all metadata layers.

Key changes:
- renderToHtml() now takes {documentPath} instead of (content, opts)
- Add renderContentToHtml() for standalone rendering (AboutTab)
- Add setScrollSyncEnabled() via runtime metadata (not per-render)
- Add setRuntimeMetadata/getRuntimeMetadata wrappers
- Remove renderQmdContentWithOptions and WasmRenderOptions

Known issue: compileAndInjectThemeCss causes "too much recursion" crash
in dart-sass when called after renderQmd. Needs investigation — theme
CSS compilation works but triggers infinite recursion in a Vite chunk.

The hub server's file discovery only recognized _quarto.yml/yaml,
.qmd files, and binary resources. Directory metadata files
(_metadata.yml/_metadata.yaml) were silently ignored, so they were
never added to the Automerge index, never synced to clients, and
never appeared in the VFS or file browser.

Add _metadata.yml and _metadata.yaml to the config file discovery
filter alongside _quarto.yml/yaml.

directory_metadata_for_document uses strip_prefix to compute the
relative path from project root to document directory. This requires
both paths to be in the same canonical form. ProjectContext::discover
always canonicalizes project.dir, but callers could pass relative or
non-canonical document paths (e.g., WASM render_qmd with VFS paths
like "chapters/chapter1.qmd"), causing strip_prefix to fail silently
and return no directory metadata layers.

Canonicalize the document_path inside the function using the runtime,
matching the established pattern (ProjectContext::discover,
ThemeContext, compile_document_css all canonicalize internally).
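
The failure mode can be reproduced with `std::path` alone. In this sketch, `relative_dir` mirrors the silent failure, and `relative_dir_normalized` is a simplified stand-in for canonicalization: it only joins relative paths onto the project directory (an assumption made here so the example does not touch the filesystem), whereas the actual fix canonicalizes through the runtime.

```rust
use std::path::{Path, PathBuf};

/// Naive version: returns None whenever the two paths are not in the same form.
pub fn relative_dir<'a>(project_dir: &Path, document_dir: &'a Path) -> Option<&'a Path> {
    document_dir.strip_prefix(project_dir).ok()
}

/// Normalizing version (hypothetical): make the document directory absolute
/// before stripping the prefix. Real code would canonicalize instead.
pub fn relative_dir_normalized(project_dir: &Path, document_dir: &Path) -> Option<PathBuf> {
    let absolute = if document_dir.is_absolute() {
        document_dir.to_path_buf()
    } else {
        // Assumption: relative paths are relative to the project directory.
        project_dir.join(document_dir)
    };
    absolute
        .strip_prefix(project_dir)
        .ok()
        .map(Path::to_path_buf)
}
```

With a project at `/project`, the naive version returns `None` for the relative directory `chapters`, which is the "no directory metadata layers" symptom described above.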

Also fix the test helper to canonicalize project.dir, matching the
invariant that discover enforces in production. On macOS, TempDir
paths go through /tmp -> /private/tmp symlinks, so without
canonicalization the test helper created ProjectContexts that didn't
match real-world behavior.

This function is superseded by the runtime metadata layer (ddcc1236).
Source location and other render options are now configured via
vfs_set_runtime_metadata() instead of a separate render function.

Removes WasmRenderOptions struct, the function itself, and all
references from TS interfaces and type definitions. Updates ad-hoc
WASM test scripts to use runtime metadata pattern instead.

The metadata merge (project + directory + document + runtime → single
flattened config) was inside AstTransformsStage::run. This is
architecturally wrong — metadata merge is a configuration resolution
step, not an AST transform. It also blocks future work: any pipeline
stage that needs merged metadata (e.g., CSS compilation) had to run
after AstTransformsStage.

This extracts the merge into its own MetadataMergeStage that runs
immediately before AstTransformsStage in all pipeline builders:

  ParseDocument → EngineExecution → MetadataMerge → AstTransforms → ...

Pure refactor — no behavior change. The same merge logic runs at the
same point in the pipeline, just in a dedicated stage.
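
To illustrate why a dedicated stage helps, here is a toy pipeline in which a merge stage runs before a transform stage that reads the merged metadata. All types here (`Doc`, `Stage`, `MetadataMerge`, `AstTransforms`, `run_pipeline`) are simplified, synchronous stand-ins invented for this sketch; the real `PipelineStage` trait is async and has a different shape.

```rust
use std::collections::BTreeMap;

/// Toy document state shared between stages.
#[derive(Default)]
pub struct Doc {
    pub meta: BTreeMap<String, String>,
    pub trace: Vec<&'static str>,
    pub toc_enabled: bool,
}

/// Simplified, synchronous stand-in for a pipeline stage.
pub trait Stage {
    fn name(&self) -> &'static str;
    fn run(&self, doc: &mut Doc);
}

/// Fills in project-level keys the document did not set itself.
pub struct MetadataMerge {
    pub project: BTreeMap<String, String>,
}

impl Stage for MetadataMerge {
    fn name(&self) -> &'static str {
        "MetadataMerge"
    }
    fn run(&self, doc: &mut Doc) {
        for (k, v) in &self.project {
            doc.meta.entry(k.clone()).or_insert_with(|| v.clone());
        }
    }
}

/// Reads merged metadata; it sees project values only if the merge ran first.
pub struct AstTransforms;

impl Stage for AstTransforms {
    fn name(&self) -> &'static str {
        "AstTransforms"
    }
    fn run(&self, doc: &mut Doc) {
        doc.toc_enabled = doc.meta.get("toc").map(String::as_str) == Some("true");
    }
}

/// Run stages in order, recording the order for inspection.
pub fn run_pipeline(stages: &[Box<dyn Stage>], doc: &mut Doc) {
    for stage in stages {
        doc.trace.push(stage.name());
        stage.run(doc);
    }
}
```

Any later stage that needs merged metadata can be placed after `MetadataMerge` in the stage list without depending on the transform stage.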

Files changed:
- New: stages/metadata_merge.rs (stage + 12 moved tests)
- stages/ast_transforms.rs: removed merge logic and merge tests
- stages/mod.rs, stage/mod.rs: re-export MetadataMergeStage
- pipeline.rs: insert MetadataMergeStage in all 4 pipeline builders,
  update stage count assertions and doc comments

Add a platform-abstracted caching API to SystemRuntime with four async
methods: cache_get, cache_set, cache_delete, and cache_clear_namespace.
Default implementations are no-ops, allowing runtimes without caching
support (or with no cache dir configured) to silently skip caching.

NativeRuntime implements filesystem-backed caching at a configurable
cache directory (typically {project_dir}/.quarto/cache/{namespace}/{key}).
Writes are atomic via tempfile + rename. Namespace and key validation
prevents path traversal (alphanumeric + hyphen + underscore only).
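
A minimal sketch of the two safety properties mentioned above, using only the standard library. The helper names (`is_valid_cache_component`, `cache_path`, `cache_set_sketch`) are hypothetical, the ASCII-only and non-empty checks are assumptions, and the fixed `.tmp` suffix is a simplification of a proper temp file (concurrent writers of the same key could collide here).

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Accept only ASCII alphanumerics, '-' and '_'. Rejecting the empty string
/// is an extra assumption of this sketch.
pub fn is_valid_cache_component(s: &str) -> bool {
    !s.is_empty()
        && s.chars()
            .all(|c| c.is_ascii_alphanumeric() || c == '-' || c == '_')
}

/// Build `{cache_dir}/{namespace}/{key}` only when both parts are valid,
/// so inputs such as "../etc" can never escape the cache directory.
pub fn cache_path(cache_dir: &Path, namespace: &str, key: &str) -> Option<PathBuf> {
    if is_valid_cache_component(namespace) && is_valid_cache_component(key) {
        Some(cache_dir.join(namespace).join(key))
    } else {
        None
    }
}

/// Write to a temporary sibling file, then rename it into place so readers
/// never observe a partially written entry.
pub fn cache_set_sketch(
    cache_dir: &Path,
    namespace: &str,
    key: &str,
    data: &[u8],
) -> io::Result<()> {
    let path = cache_path(cache_dir, namespace, key).ok_or_else(|| {
        io::Error::new(io::ErrorKind::InvalidInput, "invalid cache namespace or key")
    })?;
    let dir = cache_dir.join(namespace);
    fs::create_dir_all(&dir)?;
    // '.' is not a valid key character, so this name cannot clash with an entry.
    let tmp = dir.join(format!("{key}.tmp"));
    fs::write(&tmp, data)?;
    fs::rename(&tmp, &path)
}
```

The rename step is what makes the write appear atomic to readers, provided the temporary file lives on the same filesystem as the final path.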

- Add CacheError variant to RuntimeError
- Add validate_cache_key/validate_cache_namespace public helpers
- Add NativeRuntime::with_cache_dir() constructor and cache_dir() accessor
- SandboxedRuntime uses trait defaults (caching disabled), matching
  the existing pattern for SASS methods
- 16 new cache tests plus validation and default-impl tests (97 total
  in quarto-system-runtime, 6582 workspace-wide, all passing)

Implement Phase 3 of the runtime cache plan: JS IndexedDB bridge
functions and WasmRuntime cache method implementations.

- cache.js: IndexedDB bridge with lazy db init, composite keys
- cache.d.ts: TypeScript declarations for the bridge
- cache.test.ts: 5 vitest tests (roundtrip, miss, isolation, clear, delete)
- wasm.rs: wasm_bindgen extern block + cache_get/set/delete/clear_namespace
  implementations with validation and Uint8Array marshalling

Phase 1: ThemeConfig reads flattened metadata (top-level `theme` instead
of `format.html.theme`), matching MetadataMergeStage output. Old nested
test helpers replaced with flattened versions.

Phase 2: New CompileThemeCssStage runs after MetadataMergeStage in all
HTML pipelines. It assembles SCSS via new `assemble_theme_scss` public
API, compiles with platform-specific backends (grass native, dart-sass
WASM), and caches results via SystemRuntime cache interface. Falls back
to DEFAULT_CSS on errors. ApplyTemplateStage now skips default CSS if
the artifact already exists.

WASM compatibility: PipelineStage trait and all impls now use conditional
`#[cfg_attr]` for async_trait — Send on native, ?Send on WASM — matching
SystemRuntime's existing pattern. This allows stages to await non-Send
WASM futures from cache_get/cache_set/compile_sass.

Also fixes compute_theme_content_hash WASM entry point to call
resolve_format_config before ThemeConfig extraction, since it receives
raw frontmatter (not flattened config).

Includes parent plan and all three sub-plans (A: core, B: migration,
C: tests).

Parses rendered HTML for <link rel="stylesheet"> tags, reads the linked
CSS files (from disk on native, from VFS on WASM), concatenates content,
and runs regex match/no-match patterns. Same two-array YAML format as
ensureFileRegexMatches.

… CSS

Phase 3 (native): Replace write_themed_resources with prepare_html_resources
+ post-pipeline artifact write. CSS is written once from the pipeline's
css:default artifact. CLI sets NativeRuntime cache dir for SASS caching.

Phase 4 (WASM): Remove JS-side compileAndInjectThemeCss, compileDocumentCss,
computeThemeContentHash and all supporting Rust WASM entry points. CSS
version now computed by hashing the VFS CSS artifact after render.

Fix WASM SASS compilation: smoke-all test was missing setVfsCallbacks() for
the dart-sass VFS importer, causing all @use/@import resolution to fail
silently. Added VFS callback setup in test beforeAll().

Theme inheritance smoke-all tests (6 fixtures) verify the full metadata
hierarchy (project → directory → document) produces correct theme CSS on
both native and WASM.

Phase 5 of CSS-in-pipeline plan: fill test gaps for theme CSS compilation
through the full render pipeline.

- New WASM test (themeCss.wasm.test.ts): runtime metadata theme override
  correctly produces darkly CSS instead of document's flatly theme
- 3 native pipeline tests (pipeline.rs): project theme, document-overrides-
  project theme, and no-theme-uses-DEFAULT_CSS through render_qmd_to_html()
- Updated plan with corrected pipeline stage ordering and review notes

The move from SassCacheManager to CompileThemeCssStage (f9bf8782) was
the right architectural change, but it regressed the caching strategy:
SHA-256 was replaced with DefaultHasher (64-bit, unstable across Rust
versions), the hash input became the full assembled SCSS instead of
individual theme identities, and LRU eviction was dropped.

This commit restores and refines those features:

- Move SCSS_RESOURCES_HASH from wasm-quarto-hub-client/build.rs to
  quarto-sass/build.rs so both native and WASM share one build-time hash
- Replace DefaultHasher with SHA-256 in cache_key()
- Compute cache key from theme specs + custom file contents before
  assembly, so cache hits skip assemble_theme_scss entirely
- Add LRU eviction (200 entries / 50MB) with touch-on-read to
  IndexedDB cache.js
- Remove dead SassCacheManager code: sassCache.ts, sassCache.test.ts,
  SassCacheEntry type, and unused WASM exports (compileScss,
  get_scss_resources_version, etc.)
- Add smoke-all test fixture for theme array (vapor + custom.scss)

Run smoke-all test fixtures through the real Quarto Hub pipeline
(Automerge sync → VFS → WASM render → Preview iframe) instead of
bypassing Automerge like the WASM Vitest smoke-all tests do.

Infrastructure:
- globalSetup/Teardown starts real hub server via cargo run --bin hub
- projectFactory creates Automerge projects via quarto-sync-client
  and seeds IndexedDB via page.evaluate for browser-side loading
- previewExtraction extracts HTML, CSS, and diagnostics from the
  live preview iframe (handles data: URIs and source-tracking spans)

Test suites (34 tests total):
- smoke-all.spec.ts: 23 fixtures from crates/quarto/tests/smoke-all/
- theme-subdir-e2e.spec.ts: 7 theme+SCSS subdirectory variations
- project-loading.spec.ts: Automerge project loading pipeline
- preview-extraction.spec.ts: HTML/diagnostics extraction
- smoke.spec.ts: basic app + server connectivity (updated)

Add scraper crate (html5ever + selectors) to parse rendered HTML and
verify element presence/absence via CSS selectors. This replaces the
previous no-op that silently skipped ensureHtmlElements assertions in
the Rust test runner.

The implementation supports the full CSS Selectors Level 3 spec
including combinators, pseudo-classes (:nth-child, :not, :only-child),
and attribute selectors with operators — matching the selector
complexity used across the ~392 smoke-all fixtures in quarto-cli.

Instead of waiting a fixed 2 seconds for the hub server to persist
Automerge documents after project creation, poll the server's HTTP API
(/api/documents/{id}) for all documents (index + every file) in
parallel. Typically resolves in <200ms.
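
The polling idea is language-independent; the sketch below expresses it in Rust with a hypothetical `poll_until` helper. The `ready` closure stands in for the HTTP check against the server, and the interval and timeout values are left to the caller; none of this reflects the actual Playwright helper.

```rust
use std::thread::sleep;
use std::time::{Duration, Instant};

/// Call `ready` repeatedly until it returns true or `timeout` elapses.
/// Returns true if the condition was met, false on timeout.
pub fn poll_until(
    mut ready: impl FnMut() -> bool,
    interval: Duration,
    timeout: Duration,
) -> bool {
    let deadline = Instant::now() + timeout;
    loop {
        if ready() {
            return true;
        }
        if Instant::now() >= deadline {
            return false;
        }
        // Wait briefly before the next check instead of sleeping a fixed total.
        sleep(interval);
    }
}
```

Compared with a fixed sleep, the wait ends as soon as the condition holds, and the timeout still bounds the worst case.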

Playwright E2E suite: 19.6s → 12.1s (38% faster).

Three issues fixed:

1. Project _quarto.yml !path values are now rebased to document directory
   in MetadataMergeStage, matching the existing behavior for _metadata.yml.
   This fixes subdirectory documents not inheriting project-level custom
   SCSS themes.

2. Added !path tags to custom SCSS references in _quarto.yml and
   _metadata.yml test fixtures so the path adjustment system can
   correctly rebase them when inherited across directories.

3. Fixed CSS color minification in test assertions: hex colors like
   #cc5500 get shortened to #c50 by minifiers, breaking regex matches.
   Replaced with non-condensable values (#cc5501, #aabb12, #112234).
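
For reference, the shortening rule behind issue 3 can be checked mechanically: a six-digit hex color condenses only when each channel is a doubled digit. The `shorten_hex` helper below is a hypothetical illustration, not part of the test runner or of any minifier.

```rust
/// Return the 3-digit form of a 6-digit hex color when every channel is a
/// doubled digit (e.g. "cc", "55", "00"); otherwise return None.
pub fn shorten_hex(color: &str) -> Option<String> {
    let hex = color.strip_prefix('#')?;
    let b = hex.as_bytes();
    if b.len() != 6 || !b.iter().all(|c| c.is_ascii_hexdigit()) {
        return None;
    }
    if b[0] == b[1] && b[2] == b[3] && b[4] == b[5] {
        Some(format!("#{}{}{}", b[0] as char, b[2] as char, b[4] as char))
    } else {
        None
    }
}
```

Under this rule `#cc5500` shortens to `#c50`, while `#cc5501`, `#aabb12`, and `#112234` each have one channel that is not a doubled digit, so they survive minification unchanged.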

New smoke-all test fixtures for SCSS theme inheritance:
- theme-project-scss: project-level custom SCSS (root + subdirectory)
- theme-project-scss-relpath: project-level SCSS with relative paths
- theme-metadata-scss: directory _metadata.yml custom SCSS
- theme-metadata-scss-relpath: directory metadata SCSS with relative paths
- theme-crossdir-scss: cross-directory SCSS references

All 32 smoke-all tests pass across all three runners (Rust, WASM, E2E).

…ttening

Single-file renders (no _quarto.yml) had config: None, which caused
MetadataMergeStage to skip format flattening entirely. Keys like
format.html.toc stayed nested instead of being flattened to top-level
toc, making them invisible to downstream stages (theme compilation,
AST transforms, etc.).

Fix: change ProjectContext.config from Option<ProjectConfig> to
ProjectConfig. Every context now always has a config — real projects
get their parsed _quarto.yml, single-file renders get
ProjectConfig::default(). The merge stage gate is removed so format
flattening always runs. The type system prevents reintroduction.
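
The shape of this fix can be sketched with simplified stand-in types. `ConfigSketch`, `ContextSketch`, and `merged_metadata` are hypothetical names, and the string map is a placeholder for the real metadata type; the point is only that a non-optional field with a `Default` leaves no branch in which the merge can be skipped.

```rust
use std::collections::BTreeMap;

/// Hypothetical flattened metadata map used only in this sketch.
pub type Metadata = BTreeMap<String, String>;

/// Stand-in for the project config; Default gives an empty config.
#[derive(Debug, Default, Clone)]
pub struct ConfigSketch {
    pub metadata: Metadata,
}

/// Stand-in for the project context: `config` is not an Option, so every
/// context carries a config and callers cannot forget the "no project" case.
pub struct ContextSketch {
    pub config: ConfigSketch,
}

impl ContextSketch {
    /// Single-file render: no _quarto.yml, so use the default (empty) config.
    pub fn single_file() -> Self {
        Self { config: ConfigSketch::default() }
    }

    /// Real project: use the parsed project metadata.
    pub fn project(metadata: Metadata) -> Self {
        Self { config: ConfigSketch { metadata } }
    }
}

/// The merge always runs; there is no `if let Some(config)` gate to skip it.
pub fn merged_metadata(ctx: &ContextSketch, document: &Metadata) -> Metadata {
    let mut merged = ctx.config.metadata.clone();
    merged.extend(document.iter().map(|(k, v)| (k.clone(), v.clone())));
    merged
}
```

A single-file render then goes through exactly the same code path as a project render, just starting from an empty project layer.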

Also restores the weakened theme test assertion in render_to_file.rs
(cosmo theme now correctly produces compiled Bootstrap CSS).

All transforms and template selection now read metadata from the fully
merged ConfigValue (doc.ast.meta) instead of Format.metadata
(serde_json::Value). This ensures project-level and directory-level
metadata reaches pipeline consumers, not just document frontmatter.

Changes:
- Add is_minimal_html(meta: &ConfigValue) free function in format.rs
- select_template() now takes bool instead of &Format
- render_with_format() computes minimal from meta, not format
- TitleBlockTransform::should_add_h1() reads from ast.meta
- AppendixStructureTransform helpers read from ast.meta
- FootnotesTransform::get_reference_location() reads from ast.meta
- Eliminated as_object() calls in favor of ConfigValue::get()
- Updated all tests to put metadata in ast.meta ConfigValue

Format.metadata is still populated but no longer read by pipeline code,
ready for removal in a subsequent plan.

Three fixtures test that appendix-style flows through metadata layers:
- project-appendix-plain: project _quarto.yml sets appendix-style: plain
- dir-appendix-none: directory _metadata.yml overrides to appendix-style: none
- doc-override-default: document frontmatter overrides back to default

All pipeline consumers now read from doc.ast.meta (ConfigValue) instead
of Format.metadata (serde_json::Value). Theme CSS compilation moved into
CompileThemeCssStage. The Format.metadata field was populated but never
read — this removes it and all supporting machinery.

Removed:
- extract_format_metadata() function and serde_yaml dependency
- Format.metadata field, with_metadata(), get_metadata*(), use_minimal_html()
- RenderContext::format_metadata() and StageContext::format_metadata()
- All associated tests (9 extract_format_metadata, 8 use_minimal_html,
  3 get_metadata, 1 with_metadata, 2 format_metadata context tests)
- Call sites in render_to_file.rs, wasm-quarto-hub-client, and quarto CLI

Format is now a simple struct: identifier + output_extension + native_pipeline.

Add entries for theme CSS pipeline, WASM theme tests, SASS cache
improvements, and Playwright E2E tests.

@gordonwoodhull force-pushed the feature/project-metadata branch from d80cc8c to 49640a3 on March 12, 2026 14:32
@cscheid merged commit 7b56f5d into main on Mar 12, 2026
4 checks passed
2 participants