feat: add structural tree parser for conditional-aware spec editing#215
feat: add structural tree parser for conditional-aware spec editing#215trungams wants to merge 9 commits into
Conversation
Adds the design RFC documenting the problem (straddling conditionals break naive line-range edits), the two evaluated approaches (enriched structure vs structural tree), and the chosen direction (tree model). Also documents known limitations in the overlays user guide: section-scoped operations cannot reach content past a wrapper %endif when the section header is inside the wrapper (workaround: spec-search-replace with anchored regex), and macro-generated sections are invisible to the static parser.
Two-pass parser that builds a block tree from raw spec lines: - Pass 1 (collectConditionalPairs in edit.go): collects conditional pairs, skipping %if/%endif inside %define/%global macro continuation bodies. Shell-level continuations still process %if as structural. - Pass 2 (buildBlockChildren): builds nested block tree, classifying conditionals as wrappers (contain section headers) vs content blocks. Block kinds: rootBlock, sectionBlock, conditionalBlock, textBlock, macroDefBlock. Preamble content is wrapped in an implicit sectionBlock with empty Name for uniform access via findSectionBlock. All tree types and helpers are kept unexported so the parsed representation remains an internal implementation detail. Includes serializeTree for lossless round-tripping, query helpers (findSectionBlock, findAllSectionBlocks, findAllSectionBlocksByPackage), mutation helpers (removeBlockFromParent), and validateSectionRemoval for detecting unsafe removal patterns (orphaned content). Validation checks: - Preceding-section: wrapper has text AND preceding section is removed. - Empty-wrapper + adjacent content: all sections removed from wrapper AND adjacent content conditional would be orphaned. Cross-branch asymmetry is intentionally allowed (removing from %if while %else retains sections is a valid overlay operation).
…tch of edit operations Add a tree interface in tree_api.go so edit.go callers no longer reach into the parse tree directly. specTree wraps the parsed *block root and exposes query helpers (Section, Sections, SectionsByPackage, RemoveSections); sectionHandle wraps a section block with append/prepend operations. Spec.mutateTree handles parse -> mutate -> serialize so call sites become 3-line orchestrations. Three edit.go operations now go through the interface as a starting point: AppendLinesToSection, RemoveSection, and RemoveSubpackage. Remaining Visit-based operations will migrate in follow-up commits.
Move HasSection off the Visit-based scan and onto the tree interface. Adds Spec.inspectTree (the read-only sibling of mutateTree) and specTree.HasSection, which walks the parsed tree (including sections inside conditional wrappers, matching previous Visit semantics).
Add line-level iteration to the tree interface (specTree.VisitAllLines / sectionHandle.VisitLines + lineHandle) plus a tag-aware insertion helper (sectionHandle.InsertTag), and use them to rewrite the tag-mutating operations on Spec. No behavior change for callers. Removes findInsertTagPosition, skipPastConditional, and insertTagScanResult.
c4e7ac6 to
46a9ae0
Compare
Convert the last group of Visit-based mutators to mutateTree + section or line handles, then remove the Visit infrastructure entirely: Migrated operations: - PrependLinesToSection -> sectionHandle.PrependLines. - AddChangelogEntry -> sectionHandle.PrependLines on %changelog. - SearchAndReplace -> searchReplaceBlock, a recursive tree walker that visits ALL line types (text, macros, conditional headers, %endif) so patterns targeting %define/%global or %if lines are found correctly. Wrapper else-branch section filter is relaxed for loose content. - removePatchlistEntriesMatching -> sectionHandle.VisitLines on %patchlist. New API: - GetTag(packageName, tag) for read-only tag lookup via inspectTree. Removed (Visit infrastructure): - Visit, VisitTags, VisitTagsPackage from edit.go. - Context, VisitTarget, VisitTargetType, Visitor, InsertLinesBefore, InsertLinesAfter, RemoveLine, ReplaceLine from spec.go. - parseState, inContinuation, newParseState, parseSpecLine, newParsedLine, parseLogicalLine from spec.go. Dead code from line-range approach also removed: - sectionLineRange, collectSectionRanges, balanceRange, validateNoContentAfter, isBlankOrComment, removeRanges, etc. Migrated external caller: release.go GetReleaseValue now uses GetTag instead of VisitTagsPackage.
Stand up 13 hand-crafted minimized .spec fixtures under internal/rpm/spec/testdata/specs/ distilling real-world patterns. testdata_test.go drives fixtures through two tiers: - Tier 1: lossless round-trip (byte-for-byte identity). - Tier 3: per-pattern edit-operation regression tests. Synthetic generator composes primitives into random spec inputs (64 round-trip + 32 AddTag seeds). Also includes SearchAndReplace tests for macro definitions, conditional headers, and wrapper else-branch section filter.
When a %define/%global directive lives inside a %package subpackage block but the macro it declares is referenced by surviving content, RemoveSections now automatically hoists it to the root level. Two regression tests validate both the positive case (macro hoisted) and the negative case (unreferenced macro dropped cleanly).
…overlay Add a new PrependLinesToAllSections method that prepends lines to every section matching a given name and package, unlike PrependLinesToSection which targets only the first match. This is exposed as the new spec-prepend-all-lines overlay type. Update the check-disablement synthesizer to use the new overlay type so that specs with multiple %check sections (e.g., python-azure-mgmt-media, python-zmq) get exit 0 prepended to all of them.
46a9ae0 to
56a4c63
Compare
|
The gdb render failure in e2e is expected as we'll need to update the overlay |
| | `%{lua:...}` blocks | Pervasive (via `%autorelease`) | Inline macro expansion; parser sees the containing `%define` line | | ||
| | Comments (`#` lines) inside conditionals | Pervasive | Parser treats as empty lines; `conditionalDepthChange` ignores them | | ||
|
|
||
| ### Summary |
There was a problem hiding this comment.
Could we add some mechanism to control which branch we go down? For example, many specs have something like
%if <condition>
Release: 1
%else
Release: %autorelease
%endifIt would be nice to be able to target one.
Maybe a mode enum? Something like First, All, SpecificN, RequireUnique maybe?
This could apply generally to both read and write operations.
There was a problem hiding this comment.
It's definitely possible since each branch will be its own block. My question is whether we want to allow users to select branches based on condition (requires evaluation) or just ordering of the branches. Also how should we target these conditional blocks, since there may be multiple with similar conditions in the same spec file.
There was a problem hiding this comment.
Is there a way to map the blocks back to the line numbers? Maybe have each block have a line offset inside its parent? That would provide a very basic way to find First and SpecificN (if we even WANT that one) based on sorting. All just applies everywhere (I think this is sort of the implicit default right now), and RequireUnique just errors out if there is any ambiguity.
|
|
||
| 1. **`spec-append-lines` boundary bug**: `AppendLinesToSection` uses `SectionEndTarget` to find where to insert. When a `%if` sits between two sections (wrapping the next one), Visit considers it part of the current section, so appended lines land inside the conditional. `spec-insert-tag` has a workaround (`skipPastConditional`); `spec-append-lines` does not. | ||
|
|
||
| 2. **`%define`/`%global` dropped during removal** (#203): `spec-remove-subpackage` deletes the entire block without checking whether macro definitions inside it are referenced from surviving sections. The macro silently vanishes, causing build failures downstream. |
There was a problem hiding this comment.
Probably out of scope for this doc, but curious how we might handle recursive definitions
%define a %{b}
%define b some_valueIf I understand the algorithm right we would move %define a %{b} to the global scope, but might now have a dangling link to %{b}.
| 1. **`spec-append-lines` boundary bug**: `AppendLinesToSection` uses `SectionEndTarget` to find where to insert. When a `%if` sits between two sections (wrapping the next one), Visit considers it part of the current section, so appended lines land inside the conditional. `spec-insert-tag` has a workaround (`skipPastConditional`); `spec-append-lines` does not. | ||
|
|
||
| 2. **`%define`/`%global` dropped during removal** (#203): `spec-remove-subpackage` deletes the entire block without checking whether macro definitions inside it are referenced from surviving sections. The macro silently vanishes, causing build failures downstream. | ||
|
|
There was a problem hiding this comment.
Also, should hoisting be an opt-in flag, or on by default? Direction I've been getting is "no magic" because it gets confusing to reason about. Randomly moving variables might be surprising. If the flag is instead a list, would also fix the recursive definition thing above.
| return set | ||
| } | ||
|
|
||
| // collectMacrosInSections gathers every [macroDefBlock] reachable from any of |
There was a problem hiding this comment.
This pattern seems to repeat a bunch, thoughts on making a walkBlocks() function, or at least an interface.
func collectMacrosInSections(sections []*block) []*block {
var macros []*block
for _, sec := range sections {
walk(sec, func(b *block) bool {
if b.Kind == macroDefBlock {
macros = append(macros, b)
}
return true
})
}
return macros
}Mutating funcs (removeBlocKFromParent etc.) feel harder, maybe leave them as-is?
Summary
Replaces the flat Visit-based spec line walker with a structural tree parser for all spec editing operations. This makes conditional-boundary bugs structurally impossible and fixes several long-standing issues (#144, #193, #203).
Closes #214
Changes
New: tree parser (
tree.go)Two-pass parser that builds a
blocktree from raw spec lines:%if/%endifpairs, skipping pairs inside%define/%globalmacro continuation bodiesBlock kinds:
rootBlock,sectionBlock,conditionalBlock,textBlock,macroDefBlock. Lossless round-trip viaserializeTree.New: tree API (
tree_api.go)specTree/sectionHandletypes wrap the parsed tree.Spec.mutateTreehandles parse → mutate → serialize;Spec.inspectTreefor read-only access. The publicSpec.*API is unchanged for callers.New:
SearchAndReplacerewrite (searchReplaceBlock)Dedicated recursive walker that visits ALL line types (text, macros, conditional headers,
%endif). The oldVisitAllLines-based implementation skipped%define/%globaland conditional directive lines. Also relaxes the section filter for wrapper else-branch content.New: macro hoisting (
tree_hoist.go)RemoveSectionsnow automatically hoists%define/%globalmacros from removed sections when they're referenced by surviving content (#203). Scans for%{name},%{?name},%{!?name}, bare%name, and conditional headers like%if 0%{?with_name}.New:
GetTagandPrependLinesToAllSectionsGetTag(packageName, tag)— read-only tag lookup viainspectTree, replacesVisitTagsPackageusagePrependLinesToAllSections+spec-prepend-all-linesoverlay — prepends to every matching section (used by check-skip to disable all%checksections)New: testdata fixtures + synthetic stress
13 curated
.specfixtures distilling real-world patterns. Two test tiers (lossless round-trip + per-pattern edit-operation regression). Synthetic generator with 64+32 deterministic seeds.Removed: Visit infrastructure
Visit,VisitTags,VisitTagsPackage,Context,VisitTarget,VisitTargetType,Visitor,parseState,parseSpecLine,newParsedLine,parseLogicalLine— all deleted. The sole external caller (release.go) migrated toGetTag.Also removed:
sectionLineRange,collectSectionRanges,balanceRange,validateNoContentAfter,skipPastConditional,findInsertTagPosition,isBlankOrComment,removeRanges, and related helpers.Docs
docs/developer/rfc/docs/user/reference/config/overlays.mdValidation
Full render against Azure Linux 4.0 (~7,400 components):
Rendered output:
tvuong/4.0/feat/azldev-parsetreeKnown limitations
%ifwrapper with content past%endifis invisible to section-scoped overlays. Workaround:spec-search-replace. Affects 1 spec (gdb).%ghc_lib_subpackage,%fontpkg, etc. are invisible to the static parser.