T-200: Decouple R-loop search from MorphyLib; add concavity parameter#216
Open
ms609 wants to merge 16 commits intocpp-searchfrom
Open
T-200: Decouple R-loop search from MorphyLib; add concavity parameter#216ms609 wants to merge 16 commits intocpp-searchfrom
ms609 wants to merge 16 commits intocpp-searchfrom
Conversation
- New R/NativeSearch.R: PrepareNativeData, CleanNativeData, NativeLength, NativeBootstrap (4 exported functions) - TreeSearch(), Ratchet(), Jackknife() now default to native scorer - New 'concavity' parameter on all three functions - Jackknife() rewritten to support both native and morphy paths - EdgeListSearch() default scorer changed to NativeLength - Deprecated: PhyDat2Morphy, UnloadMorphy, MorphyLength, MorphyTreeLength, MorphyBootstrap - Vignette custom.Rmd IW section rewritten (was ~150 lines of MorphyLib scaffolding, now 30 lines using concavity parameter) - test-NativeSearch.R: 7 tests covering all new functions - test-CustomSearch.R: updated to use native defaults
Adds NativeSearch.Rd and updates Jackknife.Rd, Ratchet.Rd, TreeSearch.Rd with new concavity parameter and default changes.
…ilder Build random binary tree that satisfies topological constraints by: 1. Ordering constraint splits smallest-to-largest 2. Assigning each tip to its tightest enclosing split 3. Bottom-up: randomly resolve each split's children into binary subtree 4. Wire root-level items (unconstrained tips + top-level splits) Replaces Wagner fallback (T-214 workaround) for RANDOM_TREE strategy, restoring uniform random topology sampling diversity under constraints. Tests: 916 constraint + 24 random-constrained + 373 simplify/driven pass.
Three issues in impose_one_pass / impose_constraint: 1. Bail-out threshold n_tip/4 was too aggressive. For n_tip=5 the threshold was 1, so any split requiring 2+ moves caused an immediate bail-out before making any repairs. Raised to n_tip. 2. impose_one_pass returned 0 both for 'no violations' and 'bailed out', so impose_constraint couldn't distinguish the two. Now returns -1 on bail-out, allowing the caller to know the repair may be incomplete. 3. Documented the root-child limitation: spr_clip() doesn't fully handle root children, so impose_one_pass skips them. The fuse path's post-repair verification guard (T-214) catches these cases. Tests: 940 constraint + 308 simplify/driven/sector pass.
spr_clip() can't detach root children (root is its own parent, so the bypass logic fails). All search callers skip root children, but impose_constraint needs them for constraint repair. New topology_spr() helper in ts_constraint.cpp handles the root-child case by absorbing the sibling into root and repurposing it as the insertion node. No changes to spr_clip or any search caller. Also removes the build_postorder call that was missing between individual moves within impose_one_pass — each topology_spr is now followed by a postorder rebuild so edge enumeration stays valid. Tests: 942 constraint (90 impose-constraint, incl 2 new root-child tests) + 308 simplify/driven/sector pass.
- T-196: cherry-picked NA+IW fix from feature/parallel-temper - T-198-201: closed (PT ruled out; PCSA landed via PR #227) - Removed PT section from to-do.md - Deleted feature/parallel-temper and feature/pt-eval (remote+local) - PT findings preserved in .positai/expertise/pt-evaluation.md
Add perturbStopFactor parameter to SearchControl(): stop after nTip * K consecutive replicates that fail to improve the best score. Default 0 (disabled). Small values (1-3) are typical. Inspired by IQ-TREE's unsuccessful-perturbation stopping rule (Nguyen et al. 2015), adapted from per-perturbation to per-replicate granularity. Implemented in both serial (ts_driven.cpp) and parallel (ts_parallel.cpp) paths. The serial path tracks consecutive unsuccessful replicates directly; the parallel path monitors score improvement in the main thread's polling loop. Two new tests: verify the rule stops search early when enabled, and that it's a no-op when disabled (factor=0).
Benchmarked perturbStopFactor across 10 morphobank/inapplicable datasets (23-213 tips). Key findings: - PSF=2 gives 2.4-6.9x speedup on converged searches with zero score loss - Complementary to targetHits: on hard landscapes where few replicates hit the best score, PSF fires first; on easy landscapes targetHits fires first (PSF is irrelevant) - PSF=5 provides smaller speedups and is too conservative for large trees Changed default from 0 (disabled) to 2 in SearchControl() and compat wrapper. Updated docs and fixed timeout test (needs PSF=0 to test the timeout path, not convergence).
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Agent B
Summary
Replaces MorphyLib scoring in
TreeSearch(),Ratchet(),Jackknife()withnative C++ engine (
ts_fitch_score()), while retaining backward compatibilityfor users passing MorphyLib scorers explicitly.
New exports
PrepareNativeData()— replacesPhyDat2Morphy()CleanNativeData()— replacesUnloadMorphy()NativeLength()— replacesMorphyLength()NativeBootstrap()— replacesMorphyBootstrap()Changed defaults
TreeSearch,Ratchet,Jackknife, andEdgeListSearchnow default tonative scorers. New
concavityparameter (defaultInf= equal weights)makes implied-weights searches trivial:
Deprecations
PhyDat2Morphy,UnloadMorphy,MorphyLength,MorphyTreeLength,MorphyBootstrapemit.Deprecated()warnings.Vignette
custom.RmdIW section rewritten from ~150 lines of MorphyLib scaffoldingto ~30 lines.
Tests
7 new tests in
test-NativeSearch.R. Existing tests updated to use nativedefaults. 50 pass / 0 fail.
Note
Native scorer uses simple IW (
fit = e / (e + k)), matchingTreeLength(..., extended_iw = FALSE). XPIWE (Extension 3) remains in theC++ search path.