Cleaning up the init pipeline #1299

Closed

charlesknipp wants to merge 2 commits into main from ck/init

Conversation

@charlesknipp
Member

Following some comments I had in #1298, I decided to clean up the internal interface to make creating new strategies quite a bit easier.

Besides running the unit tests, I haven't benchmarked anything; however, since I rely on dispatch as opposed to conditional evaluation, this should theoretically be preferable.

Feel free to drop comments and lmk what you think.

@github-actions
Contributor

github-actions Bot commented Feb 26, 2026

Benchmark Report

  • this PR's head: d3a58fed07231171a95df8b58eb9a1b7651c6196
  • base branch: 0c1a6292c60902e03aa05a23d4cdca8e707dc5be

Computer Information

Julia Version 1.11.9
Commit 53a02c0720c (2026-02-06 00:27 UTC)
Build Info:
  Official https://julialang.org/ release
Platform Info:
  OS: Linux (x86_64-linux-gnu)
  CPU: 4 × AMD EPYC 7763 64-Core Processor
  WORD_SIZE: 64
  LLVM: libLLVM-16.0.6 (ORCJIT, znver3)
Threads: 1 default, 0 interactive, 1 GC (on 4 virtual cores)

Benchmark Results

┌───────────────────────┬───────┬─────────────┬────────┬───────────────────────────────┬───────────────────────────┬─────────────────────────────────┐
│                       │       │             │        │       t(eval) / t(ref)        │     t(grad) / t(eval)     │        t(grad) / t(ref)         │
│                       │       │             │        │ ─────────┬──────────┬──────── │ ──────┬─────────┬──────── │ ──────────┬───────────┬──────── │
│                 Model │   Dim │  AD Backend │ Linked │     base │  this PR │ speedup │  base │ this PR │ speedup │      base │   this PR │ speedup │
├───────────────────────┼───────┼─────────────┼────────┼──────────┼──────────┼─────────┼───────┼─────────┼─────────┼───────────┼───────────┼─────────┤
│               Dynamic │    10 │    mooncake │   true │   400.11 │   448.93 │    0.89 │ 11.89 │    7.87 │    1.51 │   4758.32 │   3535.07 │    1.35 │
│                   LDA │    12 │ reversediff │   true │  2797.17 │  2772.44 │    1.01 │  2.11 │    2.76 │    0.77 │   5899.24 │   7638.76 │    0.77 │
│   Loop univariate 10k │ 10000 │    mooncake │   true │ 64645.39 │ 64316.62 │    1.01 │  5.35 │    6.68 │    0.80 │ 346174.27 │ 429724.57 │    0.81 │
├───────────────────────┼───────┼─────────────┼────────┼──────────┼──────────┼─────────┼───────┼─────────┼─────────┼───────────┼───────────┼─────────┤
│    Loop univariate 1k │  1000 │    mooncake │   true │  6458.89 │  6434.48 │    1.00 │  5.32 │    5.51 │    0.97 │  34347.50 │  35458.48 │    0.97 │
│      Multivariate 10k │ 10000 │    mooncake │   true │ 34405.15 │ 44826.40 │    0.77 │ 15.53 │    7.73 │    2.01 │ 534249.94 │ 346553.81 │    1.54 │
│       Multivariate 1k │  1000 │    mooncake │   true │  3599.89 │  4548.40 │    0.79 │  9.24 │    7.29 │    1.27 │  33262.66 │  33165.81 │    1.00 │
├───────────────────────┼───────┼─────────────┼────────┼──────────┼──────────┼─────────┼───────┼─────────┼─────────┼───────────┼───────────┼─────────┤
│ Simple assume observe │     1 │ forwarddiff │  false │     2.87 │     2.30 │    1.25 │  3.58 │    4.68 │    0.76 │     10.27 │     10.75 │    0.96 │
│           Smorgasbord │   201 │ forwarddiff │  false │  1110.55 │  1151.23 │    0.96 │ 65.56 │   75.02 │    0.87 │  72803.91 │  86362.46 │    0.84 │
│           Smorgasbord │   201 │      enzyme │   true │  1543.94 │  1952.62 │    0.79 │  6.17 │    3.28 │    1.88 │   9524.53 │   6407.25 │    1.49 │
├───────────────────────┼───────┼─────────────┼────────┼──────────┼──────────┼─────────┼───────┼─────────┼─────────┼───────────┼───────────┼─────────┤
│           Smorgasbord │   201 │ forwarddiff │   true │  1536.63 │  1957.05 │    0.79 │ 65.04 │   93.91 │    0.69 │  99942.87 │ 183792.74 │    0.54 │
│           Smorgasbord │   201 │    mooncake │   true │  1569.26 │  1952.03 │    0.80 │  5.30 │    4.39 │    1.21 │   8323.67 │   8576.44 │    0.97 │
│           Smorgasbord │   201 │ reversediff │   true │  1529.74 │  1969.98 │    0.78 │ 99.93 │   83.54 │    1.20 │ 152865.34 │ 164567.08 │    0.93 │
├───────────────────────┼───────┼─────────────┼────────┼──────────┼──────────┼─────────┼───────┼─────────┼─────────┼───────────┼───────────┼─────────┤
│              Submodel │     1 │    mooncake │   true │     3.34 │     3.18 │    1.05 │ 54.59 │   64.27 │    0.85 │    182.20 │    204.36 │    0.89 │
└───────────────────────┴───────┴─────────────┴────────┴──────────┴──────────┴─────────┴───────┴─────────┴─────────┴───────────┴───────────┴─────────┘

@penelopeysm penelopeysm (Member) left a comment

Thanks!

In general I don't expect that there should be any performance difference between the conditionals vs dispatch -- for conditionals like if v isa V the Julia compiler will be able to statically infer which branch will be taken. Looking at the CI benchmarks, there's no noticeable difference (they're noisy and ±20% is normal), but it would be good to confirm these with proper benchmarks.

If my expectations are borne out, then this boils down to a stylistic choice. Personally, I prefer the conditionals (and wrote it that way) because it keeps the relevant logic all in one place, rather than having to jump between functions in different parts of the file. The extra indirection won't cause any issues in terms of Julia performance, but it means that a reader has to expand their context stack by two layers (init_transform + set_internal_transform).
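For illustration, a minimal self-contained sketch (hypothetical types, not DynamicPPL's actual strategies) of why the two styles should infer identically when the argument type is concrete:

```julia
# Hypothetical types standing in for the PR's strategy types.
abstract type AbstractStrategy end
struct StrategyA <: AbstractStrategy end
struct StrategyB <: AbstractStrategy end

# Conditional style: one function that branches on the concrete type.
function describe_conditional(s::AbstractStrategy)
    return if s isa StrategyA
        "A"
    elseif s isa StrategyB
        "B"
    else
        error("unknown strategy $s")
    end
end

# Dispatch style: one method per type.
describe_dispatch(::StrategyA) = "A"
describe_dispatch(::StrategyB) = "B"

# When the argument's concrete type is known to inference, the `isa`
# branches fold away at compile time, so both versions lower to the same
# constant return; `@code_typed describe_conditional(StrategyA())`
# typically shows the dead branches eliminated.
```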

Comment thread src/contexts/init.jl
# In this case, we can't trust the transform stored in x because the _value_
# in x may have been changed via unflatten!! without the transform being
# updated. Therefore, we always recompute the transform here.
VectorValue(x.val, from_vec_transform(dist))
Member

Could we retain this comment somewhere, since it is warning against potentially incorrect behaviour?

Member Author

Of course. Since I implemented a new outer constructor for VectorValue, I thought this might be redundant. On second thought, I was clearly wrong lol.

Comment thread src/transformed_values.jl
end
function LinkedVectorValue(val::AbstractVector, dist::Distribution)
    return LinkedVectorValue(val, from_linked_vec_transform(dist))
end
@penelopeysm penelopeysm (Member) commented Feb 26, 2026

Regardless of the implementation of the DynamicPPL.init methods, which I'm open to discussing, I would like to request that we not add extra methods for this. I recognise this is a common pattern in Julia, but this is one of my most strongly held opinions and I have consistently argued against this sort of convenience methods (e.g. #780 (comment)). The reason is because these extra methods pretty much exist just to save characters, and I don't think that the saved characters outweigh the negatives:

  1. An extra level of indirection makes it harder to keep everything in one's brain. Maybe in the future when coding is all done with LLMs this won't matter, but right now I still read the code.

  2. Defining more methods for LinkedVectorValue makes it harder for tooling, e.g. LSP, to determine where the definition of a method is. LanguageServer.jl right now at least cannot differentiate between methods based on types, so when you use goto definition it just gives you a list of methods, and you have to figure out yourself which one is being called. It's basically a glorified grep, and it's easier to grep for something that exists once rather than twice.

@charlesknipp charlesknipp (Member, Author) commented Feb 27, 2026

1. An extra level of indirection makes it harder to keep everything in one's brain.

While this is valid, I think there's some value in identifying patterns within the AbstractTransformedValue stack: it forces the user to understand the transformations being performed. For example (and I may be woefully unaware here), I don't see a case where we would call VectorValue(val, from_linked_vec_transform(dist)). More than just Julian-style programming, this deters the (albeit advanced) user from doing something they perhaps shouldn't.

2. Defining more methods for `LinkedVectorValue` makes it harder for tooling, e.g. LSP, to determine where the definition of a method is

That makes sense, I figured there was likely a reason for this.

Member

Sure, I do see the point. At first I was going to say that I'd be happier if we made it an inner constructor rather than an outer constructor (the latter is merely a suggestion, the former actually statically prevents a user from making a wrong thing); then I realised that this bit of horrendous code precludes that.

function set_transformed!!(vi::VarInfo, linked::Bool, vn::VarName)
    old_tv = getindex(vi.values, vn)
    new_tv = if linked
        LinkedVectorValue(old_tv.val, old_tv.transform)
    else
        VectorValue(old_tv.val, old_tv.transform)
    end
    new_values = setindex!!(vi.values, new_tv, vn)
    new_transform_strategy = update_transform_strategy(
        vi.transform_strategy, isempty(vi), vn, linked
    )
    return VarInfo(new_transform_strategy, new_values, vi.accs)
end

Anyway, happy to go with the outer constructor. I'm mildly hopeful that in the future the transform field will just be completely removed too, which would eliminate the possibility of writing something incorrect (#1249 (comment)).
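For context, a minimal sketch (hypothetical type, not the PR's actual definitions) of the inner-constructor idea mentioned above: because an inner constructor is the only way to build the value, no caller can bypass whatever invariant it enforces.

```julia
# Hypothetical type for illustration; the check is a stand-in for
# "always recompute the transform"-style invariants.
struct Checked{T}
    val::Vector{Float64}
    transform::T
    # Inner constructor: the sole way to construct a Checked, so the
    # invariant below cannot be skipped by any outside code.
    function Checked(val::Vector{Float64}, transform::T) where {T}
        isempty(val) && error("val must be non-empty")
        return new{T}(val, transform)
    end
end
```

An outer constructor, by contrast, is only a convenience: callers can always reach the default constructor directly, which is why it merely suggests rather than enforces correct usage.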

Comment thread src/transformed_values.jl
Comment on lines -361 to -371
# Need to return an unlinked value. We _could_ generate a VectorValue here, with the
# vectorisation transform. However, sometimes that's not needed (e.g. when
# evaluating with an OnlyAccsVarInfo). So we just return an UntransformedValue. If a
# downstream function requires a VectorValue, it's on them to generate it.
(raw_value, UntransformedValue(raw_value), zero(LogProbType))
else
error("unknown target transform $target")
end
raw_value = get_raw_value(tv, dist)
return apply_transform(raw_value, target, tv, dist)
end

function apply_transform_strategy(
Member

Same with these comments -- I'd like to retain them there because I'll have forgotten why I wrote these things 3 months down the road.

Member

One option instead of the comments could be to put a docstring on the internal apply_transform function.
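A hedged sketch of that suggestion, with a stub body standing in for the real internal function; the docstring text is lifted from the deleted comment block:

```julia
# `apply_transform` here is a stub for illustration only; the docstring
# carries the rationale from the removed comments.
"""
    apply_transform(raw_value, target, tv, dist)

Apply `target`'s transform to `raw_value`. When an unlinked value is
needed, an `UntransformedValue` is returned rather than a `VectorValue`:
the vectorisation transform is sometimes unnecessary (e.g. when
evaluating with an `OnlyAccsVarInfo`), so if a downstream function
requires a `VectorValue`, it's on them to generate it.
"""
function apply_transform(raw_value, target, tv, dist)
    # Stub body: simply pass the value through.
    return raw_value
end
```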

@codecov

codecov Bot commented Feb 27, 2026

Codecov Report

❌ Patch coverage is 93.87755% with 3 lines in your changes missing coverage. Please review.
✅ Project coverage is 79.13%. Comparing base (69d87df) to head (d3a58fe).
⚠️ Report is 1 commit behind head on main.

Files with missing lines    Patch %   Lines
src/transformed_values.jl   90.62%    3 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1299      +/-   ##
==========================================
+ Coverage   79.03%   79.13%   +0.10%     
==========================================
  Files          48       48              
  Lines        3596     3585      -11     
==========================================
- Hits         2842     2837       -5     
+ Misses        754      748       -6     


@charlesknipp
Member Author

@penelopeysm I have no problem throwing this one away in favor of conditionals. I just think some of the transformations encourage potentially undesirable behavior, as I mentioned here. This is actually a mistake I made while writing my own InitContext; granted, it was a typo, but the error was not indicative of an issue with the transformation.

I would be more than happy to close if this is too much of an issue.

guess who forgot to run the formatter again...

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
@github-actions
Contributor

DynamicPPL.jl documentation for PR #1299 is available at:
https://TuringLang.github.io/DynamicPPL.jl/previews/PR1299/

Comment thread src/contexts/init.jl
Comment on lines 337 to 338
# This block here is why we store transform_strategy inside the InitFromVector, as it
# allows for type stability.
Member

Removing the conditional block means that this comment is dangling.

@penelopeysm
Member

Sorry to be obstinate, but I still prefer conditionals! Happy to get a third opinion though -- @sunxd3, do you have a preference between the old

function f(v)
    return if v isa V1
        result1
    elseif v isa V2
        result2
    end
end

vs

get_result(v::V1) = result1
get_result(v::V2) = result2

[...]

function f(v)
    return get_result(v)
end

in this PR?

@sunxd3
Member

sunxd3 commented Feb 28, 2026

I like conditionals better (i.e. the former)

@charlesknipp
Member Author

I like conditionals better (i.e. the former)

the tribe has spoken
