Cleaning up the init pipeline #1299

Closed

charlesknipp wants to merge 2 commits into main from ck/init

Conversation

@charlesknipp
Member

Following some comments I had in #1298, I decided to clean up the internal interface to make creating new strategies quite a bit easier.

Besides running the unit tests, I haven't benchmarked anything; however, since I rely on dispatch as opposed to conditional evaluation, this should theoretically be preferable.

Feel free to drop comments and lmk what you think.

@github-actions
Contributor

github-actions Bot commented Feb 26, 2026

Benchmark Report

  • this PR's head: d3a58fed07231171a95df8b58eb9a1b7651c6196
  • base branch: 0c1a6292c60902e03aa05a23d4cdca8e707dc5be

Computer Information

Julia Version 1.11.9
Commit 53a02c0720c (2026-02-06 00:27 UTC)
Build Info:
  Official https://julialang.org/ release
Platform Info:
  OS: Linux (x86_64-linux-gnu)
  CPU: 4 × AMD EPYC 7763 64-Core Processor
  WORD_SIZE: 64
  LLVM: libLLVM-16.0.6 (ORCJIT, znver3)
Threads: 1 default, 0 interactive, 1 GC (on 4 virtual cores)

Benchmark Results

┌───────────────────────┬───────┬─────────────┬────────┬───────────────────────────────┬───────────────────────────┬─────────────────────────────────┐
│                       │       │             │        │       t(eval) / t(ref)        │     t(grad) / t(eval)     │        t(grad) / t(ref)         │
│                       │       │             │        │ ─────────┬──────────┬──────── │ ──────┬─────────┬──────── │ ──────────┬───────────┬──────── │
│                 Model │   Dim │  AD Backend │ Linked │     base │  this PR │ speedup │  base │ this PR │ speedup │      base │   this PR │ speedup │
├───────────────────────┼───────┼─────────────┼────────┼──────────┼──────────┼─────────┼───────┼─────────┼─────────┼───────────┼───────────┼─────────┤
│               Dynamic │    10 │    mooncake │   true │   400.11 │   448.93 │    0.89 │ 11.89 │    7.87 │    1.51 │   4758.32 │   3535.07 │    1.35 │
│                   LDA │    12 │ reversediff │   true │  2797.17 │  2772.44 │    1.01 │  2.11 │    2.76 │    0.77 │   5899.24 │   7638.76 │    0.77 │
│   Loop univariate 10k │ 10000 │    mooncake │   true │ 64645.39 │ 64316.62 │    1.01 │  5.35 │    6.68 │    0.80 │ 346174.27 │ 429724.57 │    0.81 │
├───────────────────────┼───────┼─────────────┼────────┼──────────┼──────────┼─────────┼───────┼─────────┼─────────┼───────────┼───────────┼─────────┤
│    Loop univariate 1k │  1000 │    mooncake │   true │  6458.89 │  6434.48 │    1.00 │  5.32 │    5.51 │    0.97 │  34347.50 │  35458.48 │    0.97 │
│      Multivariate 10k │ 10000 │    mooncake │   true │ 34405.15 │ 44826.40 │    0.77 │ 15.53 │    7.73 │    2.01 │ 534249.94 │ 346553.81 │    1.54 │
│       Multivariate 1k │  1000 │    mooncake │   true │  3599.89 │  4548.40 │    0.79 │  9.24 │    7.29 │    1.27 │  33262.66 │  33165.81 │    1.00 │
├───────────────────────┼───────┼─────────────┼────────┼──────────┼──────────┼─────────┼───────┼─────────┼─────────┼───────────┼───────────┼─────────┤
│ Simple assume observe │     1 │ forwarddiff │  false │     2.87 │     2.30 │    1.25 │  3.58 │    4.68 │    0.76 │     10.27 │     10.75 │    0.96 │
│           Smorgasbord │   201 │ forwarddiff │  false │  1110.55 │  1151.23 │    0.96 │ 65.56 │   75.02 │    0.87 │  72803.91 │  86362.46 │    0.84 │
│           Smorgasbord │   201 │      enzyme │   true │  1543.94 │  1952.62 │    0.79 │  6.17 │    3.28 │    1.88 │   9524.53 │   6407.25 │    1.49 │
├───────────────────────┼───────┼─────────────┼────────┼──────────┼──────────┼─────────┼───────┼─────────┼─────────┼───────────┼───────────┼─────────┤
│           Smorgasbord │   201 │ forwarddiff │   true │  1536.63 │  1957.05 │    0.79 │ 65.04 │   93.91 │    0.69 │  99942.87 │ 183792.74 │    0.54 │
│           Smorgasbord │   201 │    mooncake │   true │  1569.26 │  1952.03 │    0.80 │  5.30 │    4.39 │    1.21 │   8323.67 │   8576.44 │    0.97 │
│           Smorgasbord │   201 │ reversediff │   true │  1529.74 │  1969.98 │    0.78 │ 99.93 │   83.54 │    1.20 │ 152865.34 │ 164567.08 │    0.93 │
├───────────────────────┼───────┼─────────────┼────────┼──────────┼──────────┼─────────┼───────┼─────────┼─────────┼───────────┼───────────┼─────────┤
│              Submodel │     1 │    mooncake │   true │     3.34 │     3.18 │    1.05 │ 54.59 │   64.27 │    0.85 │    182.20 │    204.36 │    0.89 │
└───────────────────────┴───────┴─────────────┴────────┴──────────┴──────────┴─────────┴───────┴─────────┴─────────┴───────────┴───────────┴─────────┘

@penelopeysm penelopeysm (Member) left a comment

Thanks!

In general I don't expect that there should be any performance difference between the conditionals vs dispatch -- for conditionals like if v isa V the Julia compiler will be able to statically infer which branch will be taken. Looking at the CI benchmarks, there's no noticeable difference (they're noisy and ±20% is normal), but it would be good to confirm these with proper benchmarks.

If my expectations are borne out, then this boils down to a stylistic choice. Personally, I prefer the conditionals (and wrote it that way) because it keeps the relevant logic all in one place, rather than having to jump between functions in different parts of the file. The extra indirection won't cause any issues in terms of Julia performance, but it means that a reader has to expand their context stack by two layers (init_transform + set_internal_transform).
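For illustration, a minimal self-contained sketch (hypothetical types, not DynamicPPL's actual strategies) of why the two styles should infer identically when the argument type is concrete:

```julia
# Hypothetical types standing in for the PR's strategy types.
abstract type AbstractStrategy end
struct StrategyA <: AbstractStrategy end
struct StrategyB <: AbstractStrategy end

# Conditional style: one function that branches on the concrete type.
function describe_conditional(s::AbstractStrategy)
    return if s isa StrategyA
        "A"
    elseif s isa StrategyB
        "B"
    else
        error("unknown strategy $s")
    end
end

# Dispatch style: one method per type.
describe_dispatch(::StrategyA) = "A"
describe_dispatch(::StrategyB) = "B"

# When the argument's concrete type is known to inference, the `isa`
# branches fold away at compile time, so both versions lower to the same
# constant return; `@code_typed describe_conditional(StrategyA())`
# typically shows the dead branches eliminated.
```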

Comment thread src/contexts/init.jl
# In this case, we can't trust the transform stored in x because the _value_
# in x may have been changed via unflatten!! without the transform being
# updated. Therefore, we always recompute the transform here.
VectorValue(x.val, from_vec_transform(dist))
Member

Could we retain this comment somewhere, since it is warning against potentially incorrect behaviour?

Member Author

Of course. Since I implemented a new outer constructor for VectorValue, I thought this might be redundant. On second thought, I was clearly wrong lol.

Comment thread src/transformed_values.jl
end
function LinkedVectorValue(val::AbstractVector, dist::Distribution)
    return LinkedVectorValue(val, from_linked_vec_transform(dist))
end
@penelopeysm penelopeysm (Member) commented Feb 26, 2026

Regardless of the implementation of the DynamicPPL.init methods, which I'm open to discussing, I would like to request that we not add extra methods for this. I recognise this is a common pattern in Julia, but this is one of my most strongly held opinions and I have consistently argued against this sort of convenience methods (e.g. #780 (comment)). The reason is because these extra methods pretty much exist just to save characters, and I don't think that the saved characters outweigh the negatives:

  1. An extra level of indirection makes it harder to keep everything in one's brain. Maybe in the future when coding is all done with LLMs this won't matter, but right now I still read the code.

  2. Defining more methods for LinkedVectorValue makes it harder for tooling, e.g. LSP, to determine where the definition of a method is. LanguageServer.jl right now at least cannot differentiate between methods based on types, so when you use goto definition it just gives you a list of methods, and you have to figure out yourself which one is being called. It's basically a glorified grep, and it's easier to grep for something that exists once rather than twice.

@charlesknipp charlesknipp (Member, Author) commented Feb 27, 2026

1. An extra level of indirection makes it harder to keep everything in one's brain.

While this is valid, I think there's some value in identifying patterns within the AbstractTransformedValue stack: it forces the user to understand the transformations being performed. For example (and I may be woefully unaware here), I don't see a case where we would call VectorValue(val, from_linked_vec_transform(dist)). More than just Julian-style programming, this deters the (albeit advanced) user from doing something they perhaps shouldn't.

2. Defining more methods for `LinkedVectorValue` makes it harder for tooling, e.g. LSP, to determine where the definition of a method is

That makes sense, I figured there was likely a reason for this.

Member

Sure, I do see the point. At first I was going to say that I'd be happier if we made it an inner constructor rather than an outer constructor (the latter is merely a suggestion, the former actually statically prevents a user from making a wrong thing); then I realised that this bit of horrendous code precludes that.

function set_transformed!!(vi::VarInfo, linked::Bool, vn::VarName)
    old_tv = getindex(vi.values, vn)
    new_tv = if linked
        LinkedVectorValue(old_tv.val, old_tv.transform)
    else
        VectorValue(old_tv.val, old_tv.transform)
    end
    new_values = setindex!!(vi.values, new_tv, vn)
    new_transform_strategy = update_transform_strategy(
        vi.transform_strategy, isempty(vi), vn, linked
    )
    return VarInfo(new_transform_strategy, new_values, vi.accs)
end

Anyway, happy to go with the outer constructor. I'm mildly hopeful that in the future the transform field will just be completely removed too, which would eliminate the possibility of writing something incorrect (#1249 (comment)).
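For context, a minimal sketch (hypothetical type, not the PR's actual definitions) of the inner-constructor idea mentioned above: because an inner constructor is the only way to build the value, no caller can bypass whatever invariant it enforces.

```julia
# Hypothetical type for illustration; the check is a stand-in for
# "always recompute the transform"-style invariants.
struct Checked{T}
    val::Vector{Float64}
    transform::T
    # Inner constructor: the sole way to construct a Checked, so the
    # invariant below cannot be skipped by any outside code.
    function Checked(val::Vector{Float64}, transform::T) where {T}
        isempty(val) && error("val must be non-empty")
        return new{T}(val, transform)
    end
end
```

An outer constructor, by contrast, is only a convenience: callers can always reach the default constructor directly, which is why it merely suggests rather than enforces correct usage.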

Comment thread src/transformed_values.jl
Comment on lines -361 to -371
# Need to return an unlinked value. We _could_ generate a VectorValue here, with the
# vectorisation transform. However, sometimes that's not needed (e.g. when
# evaluating with an OnlyAccsVarInfo). So we just return an UntransformedValue. If a
# downstream function requires a VectorValue, it's on them to generate it.
(raw_value, UntransformedValue(raw_value), zero(LogProbType))
else
error("unknown target transform $target")
end
raw_value = get_raw_value(tv, dist)
return apply_transform(raw_value, target, tv, dist)
end

function apply_transform_strategy(
Member

Same with these comments -- I'd like to retain them there because I'll have forgotten why I wrote these things 3 months down the road.

Member

One option instead of the comments could be to put a docstring on the internal apply_transform function.
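A hedged sketch of that suggestion, with a stub body standing in for the real internal function; the docstring text is lifted from the deleted comment block:

```julia
# `apply_transform` here is a stub for illustration only; the docstring
# carries the rationale from the removed comments.
"""
    apply_transform(raw_value, target, tv, dist)

Apply `target`'s transform to `raw_value`. When an unlinked value is
needed, an `UntransformedValue` is returned rather than a `VectorValue`:
the vectorisation transform is sometimes unnecessary (e.g. when
evaluating with an `OnlyAccsVarInfo`), so if a downstream function
requires a `VectorValue`, it's on them to generate it.
"""
function apply_transform(raw_value, target, tv, dist)
    # Stub body: simply pass the value through.
    return raw_value
end
```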

@codecov

codecov Bot commented Feb 27, 2026

Codecov Report

❌ Patch coverage is 93.87755% with 3 lines in your changes missing coverage. Please review.
✅ Project coverage is 79.13%. Comparing base (69d87df) to head (d3a58fe).
⚠️ Report is 1 commit behind head on main.

Files with missing lines    Patch %   Lines
src/transformed_values.jl   90.62%    3 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1299      +/-   ##
==========================================
+ Coverage   79.03%   79.13%   +0.10%     
==========================================
  Files          48       48              
  Lines        3596     3585      -11     
==========================================
- Hits         2842     2837       -5     
+ Misses        754      748       -6     


@charlesknipp
Member Author

@penelopeysm I have no problem throwing this one away in favor of conditionals. I just think some of the transformations encourage potentially undesirable behavior, as I mentioned here. This is actually a mistake I made while writing my own InitContext; granted, it was a typo, but the error was not indicative of an issue with the transformation.

I would be more than happy to close if this is too much of an issue.

guess who forgot to run the formatter again...

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
@github-actions
Contributor

DynamicPPL.jl documentation for PR #1299 is available at:
https://TuringLang.github.io/DynamicPPL.jl/previews/PR1299/

Comment thread src/contexts/init.jl
Comment on lines 337 to 338
# This block here is why we store transform_strategy inside the InitFromVector, as it
# allows for type stability.
Member

Removing the conditional block means that this comment is dangling.

@penelopeysm
Member

Sorry to be obstinate, but I still prefer conditionals! Happy to get a third opinion though -- @sunxd3, do you have a preference between the old

function f(v)
    return if v isa V1
        result1
    elseif v isa V2
        result2
    end
end

vs

get_result(v::V1) = result1
get_result(v::V2) = result2

[...]

function f(v)
    return get_result(v)
end

in this PR?

@sunxd3
Member

sunxd3 commented Feb 28, 2026

I like conditionals better (i.e. the former)

@charlesknipp
Member Author

I like conditionals better (i.e. the former)

the tribe has spoken
