feat(Query): add query complexity framework by kim-em · Pull Request #376 · leanprover/cslib

kim-em · 2026-02-27T01:47:04Z

This PR adds a framework for proving both upper and lower bounds on query complexity
(comparisons, oracle calls, etc.) of monad-generic algorithms.

Upper bounds (monad parametricity)

TimeT m monad transformer with tick counting, Costs predicate, and combinators
UpperBound/UpperBoundT predicates packaging the parametricity argument:
a monad-generic algorithm specialized to TimeM cannot observe tick instrumentation,
so bounds proved via TimeM hold in all monads
Demo: insertion sort with O(n²) query bound, merge sort with O(n log n) query bound

Lower bounds (decision trees)

QueryTree Q R α: free monad reifying query patterns as explicit decision trees
OracleQueryTree: type alias with WPMonad instance connecting to the Hoare-triple framework
LowerBound f size bound: for every input size n, there exists an input with size ≤ n
and an oracle making the algorithm perform at least bound n queries
QueryTree.exists_queriesOn_ge_clog: if n oracles produce n distinct results from a
query tree with Fintype responses, some oracle path has depth ≥ ⌈log_{|R|} n⌉
IsMonadicSort: specification structure for comparison sorts (permutation + sortedness)
IsMonadicSort.lowerBound_infinite: any correct comparison sort on an infinite type
has query complexity at least ⌈log₂(n!)⌉

🤖 Prepared with Claude Code

kim-em · 2026-02-27T01:57:42Z

I have intentionally kept this at the level of "single tick" counting of costs. It is a fairly straightforward change to make this more flexible, which I have not done here to ease reviewing of the essential idea of using monad parametricity.

kim-em · 2026-02-27T01:58:30Z

Cslib/Algorithms/Lean/Query/Basic.lean

+namespace Cslib.Query
+
+structure TickT.State where
+  count : Nat


I would like to make this private, but this breaks a proof below in a way I haven't yet understood.

I think you reported this issue to me in Dec (I hope it's the same one). I handed over the following reproducer to @Kha:

module public structure State where private count : Nat private example {P : State → Prop} (h : P ⟨State.count s⟩) : P ⟨State.count s⟩ := by set_option trace.Meta.Tactic.simp true in set_option trace.Debug.Meta.Tactic.simp true in set_option trace.Debug.Meta.Tactic.simp.congr true in set_option trace.Meta.realizeConst true in simp [*] private example {P : State → Prop} (h : P ⟨State.count s⟩) : P ⟨State.count s⟩ := by omega private example {P : State → Prop} (h : P ⟨State.count s⟩) : P ⟨State.count s⟩ := by grind

we traced the crash down to mkCongrSimp?, IIRC it was due to getFunInfo. But I think we never fixed it; should have kept better track of this.

Actually no, it appears this issue has been fixed! At least the reproducer works, and the following works as well (but proofs are broken on latest Mathlib due to DefEq changes...)

module public import Std.Do.Triple.Basic import Std.Tactic.Do open Std.Do public section structure TickM.State where private count : Nat @[expose] def TickM (α : Type) := StateM TickM.State α namespace TickM instance : Monad TickM := inferInstanceAs (Monad (StateM TickM.State)) instance : LawfulMonad TickM := inferInstanceAs (LawfulMonad (StateM TickM.State)) instance : Std.Do.WP TickM (.arg TickM.State .pure) := inferInstanceAs (Std.Do.WP (StateM TickM.State) _) def tick : TickM Unit := fun s => ⟨(), ⟨s.count + 1⟩⟩ @[spec] private theorem tick_spec {Q : PostCond Unit (.arg TickM.State .pure)} : Triple tick (fun s => Q.1 () ⟨s.count+1⟩) Q := by simp [tick, Triple, wp, Id.run]

So maybe try again to make the field private?

I've tried, but I couldn't get it working with private.

Cslib/Algorithms/Lean/Query/Basic.lean

sgraf812

Nice! It would be great if we could prove specs about Costs using mvcgen, though. Can take a look once I'm back from PTO

sgraf812 · 2026-02-27T02:30:26Z

Cslib/Algorithms/Lean/Query/Basic.lean

+namespace Cslib.Query
+
+structure TickT.State where
+  count : Nat


I think you reported this issue to me in Dec (I hope it's the same one). I handed over the following reproducer to @Kha:

module public structure State where private count : Nat private example {P : State → Prop} (h : P ⟨State.count s⟩) : P ⟨State.count s⟩ := by set_option trace.Meta.Tactic.simp true in set_option trace.Debug.Meta.Tactic.simp true in set_option trace.Debug.Meta.Tactic.simp.congr true in set_option trace.Meta.realizeConst true in simp [*] private example {P : State → Prop} (h : P ⟨State.count s⟩) : P ⟨State.count s⟩ := by omega private example {P : State → Prop} (h : P ⟨State.count s⟩) : P ⟨State.count s⟩ := by grind

we traced the crash down to mkCongrSimp?, IIRC it was due to getFunInfo. But I think we never fixed it; should have kept better track of this.

sgraf812 · 2026-02-27T02:33:12Z

Cslib/Algorithms/Lean/Query/Basic.lean

+  count : Nat
+
+/-- A monad transformer that adds tick-counting to any monad `m`. -/
+@[expose] def TickT (m : Type → Type) (α : Type) := StateT TickT.State m α


From my past (limited) experience with defeq abuse, it may make sense to make TickT @[irreducible] and add an explicit injection/projection pair.

I couldn't get it to work as @[irreducible], but I've nevertheless added the injection/projection pair to make it easier to try again!

Cslib/Algorithms/Lean/Query/Basic.lean

sgraf812 · 2026-02-27T03:11:55Z

Cslib/Algorithms/Lean/Query/RunsIn.lean

+/-- `RunsInT n f bound` asserts that when the monad-generic function `f`
+    is specialized to `TickT n`, with any query that calls `tick` at most once per invocation,
+    the total number of ticks is bounded by `bound x`.
+
+    The function `f` is generic over monads that extend `n` via `MonadLift`,
+    ensuring it cannot observe the tick instrumentation. -/
+@[expose] def RunsInT {n : Type → Type} {ps : PostShape} [Monad n] [WP n ps]
+    (f : ∀ {m : Type → Type} [Monad m] [MonadLiftT n m], (α → m β) → γ → m δ)
+    (bound : γ → Nat) : Prop :=
+  ∀ (query : α → TickT n β), (∀ a, TickT.Costs (query a) 1) →
+    ∀ x, TickT.Costs (f query x) (bound x)


Neat!!

I think you need to add somewhere (maybe to the top-level comment) that relying on parametricity is only credible when everything is computable. Otherwise you can have if h : α = Nat then ... else ... by Classical.choice.

I wonder if callers can provide an f that uses higher-order constructs such as for loops in do notation. Can't quite play it out in my head right now, but I think it's one of the reasons that the IteratorLoop class is so complicated.

Done in 3737e39 — added a "Computability caveat" section to the RunsIn module doc-string, noting that a noncomputable algorithm could use Classical.choice to subvert instrumentation, and that this framework targets upper bounds, not lower bounds.

sgraf812 · 2026-02-27T03:18:11Z

Cslib/Algorithms/Lean/Query/MonadicExample.lean

+  induction xs with
+  | nil => exact TickT.Costs.pure ()
+  | cons x xs ih =>
+    simp only [List.length]; rw [Nat.add_comm]
+    have ih : TickT.Costs (mapSum query xs) xs.length := ih
+    exact TickT.Costs.bind (hquery x) (fun y => by
+      have := TickT.Costs.bind
+        (TickT.Costs.monadLift (modify (· + y) : StateM Int Unit) (fun P => by mvcgen))
+        (fun _ => ih)
+      rwa [Nat.zero_add] at this)


I guess it would be nice to use mvcgen for this kind of proof. Can take a look when I'm back from PTO! You shouldn't need to replicate your own reasoning framework with Costs.pure/Costs.bind etc.

Again, I tried to make something work here, but couldn't come up with anything that reduced the number of lines. :-)

sgraf812 · 2026-02-27T03:20:22Z

Cslib/Algorithms/Lean/Query/MonadicExample.lean

+    The predicate family `pre c` captures "the Int state is c" within the
+    abstract postcondition shape `ps`. The hypotheses `hf` and `h_modify`
+    assert that `f` preserves this predicate and the lifted `modify` transitions it. -/


My gut says it yields simpler VCs if you write specs that are schematic in the post (like tick_spec) rather than the precondition

I couldn't get anything useful out of this, but if you have time to investigate I'd be happy to hear.

sgraf812 · 2026-02-27T03:23:00Z

Cslib/Algorithms/Lean/Query/MonadicExample.lean

+    (h_modify : ∀ v c, ⦃pre c⦄
+      (MonadLiftT.monadLift (modify (· + v) : StateM Int Unit) : m Unit)
+      ⦃⇓ _ => pre (c + v)⦄)
+    (xs : List Int) :
+    ∀ c, ⦃pre c⦄ mapSum f xs ⦃⇓ _ => pre (c + (xs.map g).sum)⦄ := by


This sounds to me like it could be expressed as a loop invariant lemma, similar to Spec.forIn_list etc.

Shreyas4991 · 2026-02-27T08:06:07Z

@kim-em : This is duplicating #372 (refinement of #275) is it not? Also there I have a specific meaning of queries which are pre-declared inductive types there. I think the reuse of the word query might cause confusion here.

@sgraf812 : Related question, could we set up mvcgen for the framework in #372 ? Essentially it runs code in Id monad and measures complexity in TimeM (which is just an additive writer monad).

Cslib/Algorithms/Lean/Query/Sort/Insertion/Defs.lean

kim-em · 2026-02-27T11:28:19Z

This is duplicating #372 (refinement of #275) is it not?

No, I am proposing that this is a better alternative.

Shreyas4991 · 2026-02-27T12:16:12Z

This is duplicating #372 (refinement of #275) is it not?

No, I am proposing that this is a better alternative.

I strongly disagree. You don't state your queries up front in your model. So you can't state reductions, lower bounds etc. all this makes use of explicit queries and custom cost functions. Instead you use a lean function as a parameter. This is a huge limitation in algorithms theory and complexity.

In the model of #372, one can formally state multiple RAM models and even DSLs for Turing machines and circuits from circuit complexity. Therefore we can already define uniform circuit classes in complexity which involves writing programs in two different models with two different cost models (TMs and circuits). The fact that I can write circuits tells me that I can also encode parallel algorithms and the equivalence between circuits and parallel algorithms (there are several standard equivalences between these models). I have kept my pr limited to simple examples to keep it focused (the maintainers explicitly asked this). In fact it can even talk about lower bounds.

In summary : the other PR allows for a comprehensive top down treatment of the multitude of models in algorithms and the relationships between them(one of the standard models like RAM and TM could be sink nodes in this relationship network).

What this PR is doing is equivalent to compressing the process of defining a query and an evaluation function (and a cost function) in my PR, by directly providing the interpreted version as a parameter (cmp). But it is losing a lot in the process. So, you get a slight generalisation from evaluating in Id as #372 does, to evaluating in a parametric monad m and adding mvcgen machinery. While this might be a positive, it could just as easily have been done on top of #372, with all the benefits of that approach. This suggests (per project contribution guidelines), that at the very least this PR should have been:

Discussed in Zulip first

However, for any major development, it is strongly recommended to discuss first on Zulip (or via a GitHub issue) so that the scope, dependencies, and placement in the library are aligned.

Build on top of the existing PR feat: query complexity model for algorithms theory #372 mentioned above.

New definitions should instantiate existing abstractions whenever appropriate

kim-em · 2026-03-01T22:56:40Z

So, you get a slight generalisation from evaluating in Id as #372 does, to evaluating in a parametric monad m

@Shreyas4991, you are misunderstanding the purpose of this PR, and how we achieve security against cost manipulation via parametricity.

New definitions should instantiate existing abstractions whenever appropriate

#372 doesn't "exist" for the purposes of that suggestion, as it hasn't been merged. I suggest comparison between #376 and #372 stay on Zulip, rather than in the PR discussions.

Shreyas4991 · 2026-03-01T23:02:47Z

So, you get a slight generalisation from evaluating in Id as #372 does, to evaluating in a parametric monad m

@Shreyas4991, you are misunderstanding the purpose of this PR, and how we achieve security against cost manipulation via parametricity.

In this instance I am certain I do. I am the domain expert here as far as algorithms is concerned. This is merely short-circuiting my approach and shares the exact same limitation mine does, with write-queries, and with a less than optimal design for the purposes my PR serves. The "generalization" is merely trying to hide the evaluation behind an arbitrary monad instead of a free monad with an uninterpreted query.

Shreyas4991 · 2026-03-01T23:19:32Z

New definitions should instantiate existing abstractions whenever appropriate

#372 doesn't "exist" for the purposes of that suggestion, as it hasn't been merged. I suggest comparison between

There is no suggestion in that guidelines that a contributing PRs work can be overriden without discussion if it isn't merged already. It is also a core principle of open source etiquette to discuss an idea first, especially if someone is already working on it before making one's own contribution. This ensures that the work and effort of contributors is not devalued or outright erased. I have adhered to this principle with my PR from the beginning via zulip discussions. This PR has violated it.

Open source guides:

Before doing anything, do a quick check to make sure your idea hasn’t been discussed elsewhere. Skim the project’s README, issues (open and closed), mailing list, and Stack Overflow. You don’t have to spend hours going through everything, but a quick search for a few key terms goes a long way.

kim-em · 2026-03-02T00:42:14Z

@Shreyas4991, you mentioned that "our models have the same security from misuse of return." I don't think that's the case. Here's a fully computable algorithm in the framework of #372 that "sorts in zero queries":

/-- Sorting with zero queries. -/
def freeSort {α : Type} (le : α → α → Prop) [DecidableRel le]
    (l : List α) : Prog (SortOpsInsertHead α) (List α) :=
  .pure (l.insertionSort le)

theorem freeSort_correct {α : Type} (le : α → α → Prop) [DecidableRel le] (l : List α) :
    (freeSort le l).eval (sortModel le) = l.insertionSort le := by
  simp [freeSort]

theorem freeSort_free {α : Type} (le : α → α → Prop) [DecidableRel le] (l : List α) :
    (freeSort le l).time (sortModel le) = 0 := by
  simp [freeSort]

These theorems together assert that sorting can be done correctly with zero comparisons. The root cause: sortModel le requires [DecidableRel le] to interpret queries, and that same instance gives algorithms direct access to comparisons via pure, bypassing the query mechanism entirely.

In the monad-parametric approach of #376, the comparison is an opaque parameter (α × α → m Bool). The cost theorem insertionSort_runsIn : RunsIn insertionSort (fun xs => xs.length * xs.length) has no DecidableRel, Ord, or BEq in its statement — the algorithm has no way to compare elements except through the provided function, and every such call is counted.

To be clear about the limitations of #376: because Lean has classical logic, a noncomputable algorithm could in principle use Classical.choice to distinguish TickM from other monads and cheat. This means we can't use RunsIn to prove lower bounds (a noncomputable counterexample would always exist). But that's not the goal here — the goal is proving upper bounds for specific algorithms. For computable (monad-polymorphic) algorithms, parametricity holds: they can't observe which monad they're running in, so every comparison genuinely goes through cmp, and the RunsIn bound is valid.

The exploit above is not about classical logic — freeSort is fully computable. The issue is structural: the free monad approach necessarily puts [DecidableRel le] in scope (because the model needs it), and nothing prevents algorithms from using it directly.

Shreyas4991 · 2026-03-02T00:45:42Z

@Shreyas4991, you mentioned that "our models have the same security from misuse of return." I don't think that's the case. Here's a fully computable algorithm in the framework of #372 that "sorts in zero queries":
/-- Sorting with zero queries. -/
def freeSort {α : Type} (le : α → α → Prop) [DecidableRel le]
    (l : List α) : Prog (SortOpsInsertHead α) (List α) :=
  .pure (l.insertionSort le)

theorem freeSort_correct {α : Type} (le : α → α → Prop) [DecidableRel le] (l : List α) :
    (freeSort le l).eval (sortModel le) = l.insertionSort le := by
  simp [freeSort]

theorem freeSort_free {α : Type} (le : α → α → Prop) [DecidableRel le] (l : List α) :
    (freeSort le l).time (sortModel le) = 0 := by
  simp [freeSort]
These theorems together assert that sorting can be done correctly with zero comparisons. The root cause: sortModel le requires [DecidableRel le] to interpret queries, and that same instance gives algorithms direct access to comparisons via pure, bypassing the query mechanism entirely.

In the monad-parametric approach of #376, the comparison is an opaque parameter (α × α → m Bool). The cost theorem insertionSort_runsIn : RunsIn insertionSort (fun xs => xs.length * xs.length) has no DecidableRel, Ord, or BEq in its statement — the algorithm has no way to compare elements except through the provided function, and every such call is counted.

To be clear about the limitations of #376: because Lean has classical logic, a noncomputable algorithm could in principle use Classical.choice to distinguish TickM from other monads and cheat. This means we can't use RunsIn to prove lower bounds (a noncomputable counterexample would always exist). But that's not the goal here — the goal is proving upper bounds for specific algorithms. For computable (monad-polymorphic) algorithms, parametricity holds: they can't observe which monad they're running in, so every comparison genuinely goes through cmp, and the RunsIn bound is valid.

The exploit above is not about classical logic — freeSort is fully computable. The issue is structural: the free monad approach necessarily puts [DecidableRel le] in scope (because the model needs it), and nothing prevents algorithms from using it directly.

It's not an exploit because you are including an instance that the model explicitly does not require . You can do this in your model as well. Concretely take your decidable LE instance and lift, choose a specific monad with a liftM function and you can also lift "let cmp := liftM fun x y => decide <| LE.le x y) into your monad. I use these instances because I find the trade off between the simplicity of these algorithms and detection of cheating trivial.

You are inserting an extra typeclass instance into my def. I can conversely insert a special monad into yours and erase the parametricity. Neither of these are "exploits"

If providing extra instances is fair game then so is specializing the monad parameter to yours. And to be clear, based on my discussion with type theorists, adding proper foolproof parametric polymorhism to a dependently typed system is an open problem (concretely something like ML style modules). Rocq's module system has been mentioned to me as an example (and they just uncovered a soundness bug there yesterday). So if you solve parametricity in lean, please submit it to POPL.

Shreyas4991 · 2026-03-02T01:00:02Z

@kim-em to be clear, a Prog should never have typeclass instances. That's the easiest red flag to catch.

please see the actual linearSearch example. It explicitly avoids giving DecidableEq instances and uses a Bool predicate. We can absolutely make this change for the sorting models as well. Nothing about this example is specific to my model and not yours. It is about the specific choice of type for the cmp operation. Your correctness theorem doesn't prove sorting yet. I found it convenient to use List.Pairwise to prove sortedness. In your PR you don't prove this yet. So this is not even a point that we can meaningfully compare on. If someone wrote a function with le : \a -> \a -> Prop in your model, they will face the exact same issue.

Secondly you made a claim about your goals. They are not my goals. The query model is a model that I am building based on what I know will work for algorithms theory. I am working on something that I can use for algorithms theory. The code verification stuff can be handled by Boole and is not my issue. It's arguably better suited for the problem.

Thirdly this PR is just reinventing WriterT monad transformer

Shreyas4991 · 2026-03-02T01:30:20Z

Lastly parametric polymorphism requires that I should not be able to supply an explicit m. In Lean you can always do this. So this is not even "parametric polymorphism". That requires something like ML modules. I should have caught this before. Typeclasses implement ad-hoc polymorphism viz the exact opposite.

Shreyas4991 · 2026-03-02T02:03:32Z

@kim-em : Update, I just changed the Prop model to a Bool le model for my sorting query and algorithms in cslib#372 . So no [Decidable le] instances are needed anymore. This has absolutely nothing to do with Free Monads and everything to do with me picking a convenient approach.

But as I have mentioned before, nothing stops someone from writing
def insertionSort ... (le : a -> a -> Prop) [DecidableRel le] in your model either (after all now you need to insert the Prop and DecidableRel version as extra parameters in mine now that I have completely switched all my examples to Bool).

So this "exploit" you have found is equivalently applicable in both models (or more clearly, not), since any monadic DSL will allow pure operations and bad instantiation of monads and other parameters (that's a typeclass vs module system problem). Human reviewing is the only guard. What both models do is substantially reduce the surface area for error.

Also (and this is exactly why I think this PR is misguided from a contribution guidelines standpoint), you could have just made a PR to change my model to a ProgT, which would put my queries behind a monad parameter m, if you felt that my method lacked enough safety.

Concretely, in a proper approach, you would have to make a PR on top of #372 :

Write a monad transformer version of TimeM (that's your TickT)
Write a monad transformer version of FreeM.
generalize Prog to ProgT.
change queries and models to evaluate to this monadic type.

Instead your PR lays claim to the idea of query complexity in the title, an idea that I have been promoting since even before the debate formalization, and then squanders the power of my model by losing its explicit inductive query types (while also depriving all of us authors the authorship credit that is due), which I have crafted to handle a huge chunk of algorithms theory literature in a top down fashion.

kim-em · 2026-03-02T03:38:19Z

So this "exploit" you have found is equivalently applicable in both models (or more clearly, not), since any monadic DSL will allow pure operations and bad instantiation of monads and other parameters (that's a typeclass vs module system problem). Human reviewing is the only guard. What both models do is substantially reduce the surface area for error.

You continue to misunderstand this PR. I'd prefer to leave this to reviewers at this point. I appreciate you don't like my implicit criticism of #372 here.

Shreyas4991 · 2026-03-02T07:46:41Z

So this "exploit" you have found is equivalently applicable in both models (or more clearly, not), since any monadic DSL will allow pure operations and bad instantiation of monads and other parameters (that's a typeclass vs module system problem). Human reviewing is the only guard. What both models do is substantially reduce the surface area for error.

You continue to misunderstand this PR. I'd prefer to leave this to reviewers at this point. I appreciate you don't like my implicit criticism of #372 here.

I would prefer criticism that was accurate and constructive and that actually improved my PR. Seeing as this was neither, and also making false claims about the problem being inherent to a free monad approach, but so blatantly supplying an instance to a Prog, yes I object. I also object to claims that this PR is better than 372 when 372's objectives are misunderstood.

Shreyas4991 · 2026-03-02T08:57:28Z

Cslib/Algorithms/Lean/MergeSort/MergeSort.lean

This PR should be deleting other models and their examples. These can actually be used.

Shreyas4991 · 2026-03-02T08:58:49Z

Cslib/Algorithms/Lean/Query/Basic.lean

+  count : Nat
+
+/-- A monad transformer that adds tick-counting to any monad `m`. -/
+@[expose] def TimeT (m : Type → Type) (α : Type) := StateT TimeT.State m α


This should be a writer monad. A time counting monad is an additive log.

Shreyas4991 · 2026-03-02T09:00:05Z

Cslib/Algorithms/Lean/TimeM.lean

Again. This PR is deleting a file based on the presumption that it is no longer needed. I don't believe we have settled on deleting all other models in favour of this PR. My model builds on top of this PR. So this deletion rests on undecided assumptions and breaks other models.

Yes, clearly the PRs are incompatible.

They are separate models. They can exist in separate folders. The code itself can be reused elsewhere. Deleting it is breaking other PRs including mine without fixes. The current Zulip discussion has focussed on retaining more than one model. This PR ignores that by deleting an existing model before any decision has been reached.

Shreyas4991 · 2026-03-02T09:22:01Z

@kim-em I guided GPT to find two hacks in this "parametricity" approach, with the same level of tricks as the aforementioned "hack" in mine. It took less than 5 minutes to type up the explanation to GPT and have it generate working code while I was simultaneously at a workshop understanding a different proof. I hope this clarifies that I actually understand parametricity better and we can do away these silly accusations that I misunderstand this PR. Especially since they were made when I pointed out how I'd reproduce your hacks in this PR and actually produced them. Parametric polymorphism in lean for algorithms was actually discussed in the lean discord in mid 2024.

Link to Zulip

EDIT: The docstring claims are also too strong. You clearly don't need to use the provided cmp operation 😄

kim-em · 2026-03-02T12:14:04Z

New definitions should instantiate existing abstractions whenever appropriate

#372 doesn't "exist" for the purposes of that suggestion, as it hasn't been merged. I suggest comparison between

There is no suggestion in that guidelines that a contributing PRs work can be overriden without discussion if it isn't merged already. It is also a core principle of open source etiquette to discuss an idea first, especially if someone is already working on it before making one's own contribution. This ensures that the work and effort of contributors is not devalued or outright erased. I have adhered to this principle with my PR from the beginning via zulip discussions. This PR has violated it.

Open source guides:

Before doing anything, do a quick check to make sure your idea hasn’t been discussed elsewhere. Skim the project’s README, issues (open and closed), mailing list, and Stack Overflow. You don’t have to spend hours going through everything, but a quick search for a few key terms goes a long way.

Shreyas, obviously proposing to do something in one way does not give you a veto on attempts to do that thing other ways. Equally obviously, I am aware of your work, you have talked about it extensively. I am not particularly interested in having a long discussion with you about this: words are cheaper for you than they are for me!

Shreyas4991 · 2026-03-02T12:34:16Z

Please note that I have broken your parametericity based security claims with two hacks very similar to yours and established my earlier claim that the models are equivalent in security. I have provided the link to the examples.

Not only have I disproved your claim that I misunderstand your pr. I have also established that you misunderstand parametric polymorphism, given

Your strong claims in the docstring.
Your claimed "hack" against my pr.
Your claim that this is hack is somehow inherent to free monads (reality check : it's inherent to any monad).
The fact that I produced two hacks of a very similar nature on your pr contradicting your claims so far in the docstring. In fact both hacks use the proposals I already made above which you dismissed as me misunderstanding your PR.
I have already switched to a Boolean le predicate, invalidating your currently claimed hack.

You say you value your words more. all I am seeing in this PR is false or no justifications for design decisions and a tonne of code. Code is cheap. Anybody can code. LLMs can code. Design is important. Design >>> code. Hence I talk before making big PRs which is the established OSS etiquette.

I rest my case. I have established that this PR offers no more "security against hacks" than mine. I fully recognize that a lightweight framework will invoke tradeoffs between automated guards and reviewers checking definitions.

I find this discussion increasingly toxic and I am forced to continue this to defend my work against incorrect and also not particularly relevant claims, among other things.

Shreyas4991 · 2026-03-02T12:38:36Z

Shreyas, obviously proposing to do something in one way does not give you a veto on attempts to do that thing other ways. Equally obviously, I am aware of your work, you have talked about it extensively. I am not particularly interested in having a long discussion with you about this: words are cheaper for you than they are for me!

This is not about vetoing. This about a lack of prior discussion about an overlapping PR similar to what I did before starting my PR. This is about exploring the possibility of building on top of existing approaches. This is basic OSS etiquette. Code is cheap. Design is expensive. This etiquette is very clear that words (discussions) about new and overlapping contributions are more valuable than just coding up a PR.

This PR is a clear violation of basic OSS etiquette. If someone is doing something one way and you are late to the party, then you do have to wait and work with whoever got the idea working first. What you've done is the OSS equivalent of learning somebody else's research idea at a tea party, and then running with it before they publish their paper. That's grossly unethical.

Add a monad-generic framework for proving upper and lower bounds on query complexity of algorithms, demonstrated on comparison sorting. Key components: - `Query.Basic`: WPMonad-based framework with `UpperBound`/`LowerBound` predicates - `QueryTree`: decision tree reification for lower bound proofs - `IsMonadicSort`: correctness specification for comparison sorts - Verified insertion sort (O(n²)) and merge sort (O(n log n)) implementations - `lowerBound_infinite`: any correct comparison sort on an infinite type requires ≥ ⌈log₂(n!)⌉ comparisons for every input size n Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Replace the `Fin n` + `Fintype.equivFin` re-indexing proof of `exists_queriesOn_ge_clog` with a Finset-based auxiliary (`exists_mem_queriesOn_ge_clog`) that works over an arbitrary `Finset ι`. This eliminates the `exists_large_fiber` helper and all `.val`/`.property`/`Subtype.val_injective` bookkeeping, since oracles never change between recursive calls. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

The `_oracle` parameter is structurally unused in the abbrev body but is essential for the WP/WPMonad typeclass instances that depend on it. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Generalize `exists_queriesOn_ge_clog` from `Bool` responses to arbitrary `Fintype R`, giving a ⌈log_{|R|} n⌉ lower bound. Uses Mathlib's pigeonhole principle instead of a manual Bool-specific partition. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

This makes the predicate easier to satisfy: the witness input only needs size at most n, not exactly n. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

sgraf812 · 2026-03-03T10:21:40Z

Cslib/Algorithms/Lean/Query/LowerBound.lean

+The predicate `LowerBound f size bound` states: for every input size `n`, there exists
+an input `x` with `size x ≤ n` and an oracle such that the algorithm makes at least
+`bound n` queries. This is the worst-case formulation — for each size we exhibit a hard
+input, and choose the oracle after seeing the algorithm's strategy (the tree).


Why does this prevent noncomputable-related parametricity escape hatches? The answer to this question took me a while, but for me an essential insight was that the LowerBound proposition passes the oracle after the monadic algorithm has run. So the monadic algorithm cannot simply statically Classical.choose the right oracle and omit queries (that would yield an incorrect algorithm). Not sure if this explanation makes sense to you, but something like it would have helped me.

Couldn't we use this approach for upper bounds as well, for uniformity? Something like

@[expose] def UpperBound (f : ∀ {m : Type → Type} [Monad m], (α → m β) → γ → m δ) (size : γ → Nat) (bound : Nat → Nat) : Prop := ∀ n, ∀ x (_ : size x ≤ n) (oracle : α → β), (f QueryTree.ask x : QueryTree α β δ).queriesOn oracle <= bound n

(Hope I have gotten the quantifiers right.)

If so, the next question (devil's advocate) would be: Why force f to be monad-polymorphic? Why not focus on QueryTree if it's impossible to guess the right oracle anyway?

It can. This is how I do it in my lower bound proof in #372 which I uploaded 14 hours ago
It would bring your model one step closer to time.

focus on QueryTree if it's impossible to guess the right oracle anyway?

One answer that comes to mind is "you can instantiate with m := IO and get an executable program that the compiler has more power to optimize". Of course you have no proof that the specialization satisfies the complexity bounds you're proved for a different specialization, but the compiler isn't trusted anyway.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

eric-wieser · 2026-03-03T12:33:12Z

Cslib/Algorithms/Lean/Query/QueryTree.lean

+
+This is the free monad specialized to a single fixed-type operation, used to reify
+monad-generic algorithms as explicit trees for query complexity lower bounds. -/
+inductive QueryTree (Q : Type) (R : Type) (α : Type) where


Why not abbrev QueryTree Q R a := FreeM (fun _ : Q => R) a or similar?

Ah nevermind, I somehow missed this in the comment above. Though I'm not sure I understand how this avoids universe issues, when it is constrained to type 0.

eric-wieser · 2026-03-03T12:49:05Z

Cslib/Algorithms/Lean/Query/QueryTree.lean

+
+`QueryTree Q R` is isomorphic to `FreeM (fun _ => Q)` restricted to operations returning `R`,
+but is defined as a dedicated inductive to avoid universe issues with `FreeM`'s existential
+`ι` parameter (which would require producing values of arbitrary types during evaluation).


Could you elaborate on this?

eric-wieser · 2026-03-03T12:51:32Z

Cslib/Algorithms/Lean/Query/QueryTree.lean

+
+variable {Q R : Type} {oracle : Q → R}
+
+instance instWP : WP (OracleQueryTree Q R oracle) .pure where


Is this safe, given that OracleQueryTree is reducible? Doesn't this instance get tried for QueryTree itself?

Shreyas4991 · 2026-03-03T06:12:31Z

Cslib/Algorithms/Lean/Query/UpperBound.lean

+
+## Computability caveat
+
+The parametricity argument is only valid for **computable** algorithms. A `noncomputable`


My example 1 showed that all you need are Decidable instances. Saying that you need classical or noncomputable to break this model is a stretch. You need there to be no additional instances (same as Prog). The "parametricity" argument is not even valid for computable functions in general.

Parametricity is about restricting the set of possible implementations by demanding they have a certain type. Conversely, for a given type you get strong guarantees for the (intuitionistic) implementations that inhabit them. In that light, isMonadicSort from https://github.com/leanprover/cslib/pull/376/changes#diff-ef6ad783c16d76a2b9261bd88850b42a7136bc0fff2bb58b6e7722bc06d5482cR44 makes great sense to me. Just from looking at the types involved, I'm convinced that no computable instance can cheat. Your example rests on the assumption that you may freely change the type. Which you may not, because the type is fundamentally part of the spec. The more (instance) parameters you add, the more implementations inhabit the type, the weaker the spec. I do think there are valid concerns worth discussing in PR, but I don't think that the "jailbreak" type of exploit is one. (Personally, I'm not sure if it's even worth debating it in your approach when we end up needing the discussed trusted meta function anyway.)

My example was basically created to show that the same hack that Kim showed against my PR works in this one, and their explanation is false. In their case they supplied a [Decidable le] instance to a Prog which is also not supposed to be there. I encourage you to see my lower bound. I haven't had nearly as much time to polish it. beyond permutations, but I think it shows the hidden permutation approach (and it was published hours before this proof).

And fwiw, I agree that insertion of extra typeclasses weakens the claim. This is true for both #372 and #376. The docstring should say this.

Parametricity is about restricting the set of possible implementations by demanding they have a certain type. Conversely, for a given type you get strong guarantees for the (intuitionistic) implementations that inhabit them. In that light, isMonadicSort from https://github.com/leanprover/cslib/pull/376/changes#diff-ef6ad783c16d76a2b9261bd88850b42a7136bc0fff2bb58b6e7722bc06d5482cR44 makes great sense to me. Just from looking at the types involved, I'm convinced that no computable instance can cheat. Your example rests on the assumption that you may freely change the type. Which you may not, because the type is fundamentally part of the spec. The more (instance) parameters you add, the more implementations inhabit the type, the weaker the spec. I do think there are valid concerns worth discussing in PR, but I don't think that the "jailbreak" type of exploit is one. (Personally, I'm not sure if it's even worth debating it in your approach when we end up needing the discussed trusted meta function anyway.)

what trusted meta function? There is no trusted meta function in my PR. This seems to be a big misunderstanding of what I am doing. There is an explicit model, an object we can actively reason about (independent of even algorithms). This allows explicitly talking about reductions and composition in a way this model makes hideously complicated. And now this PR is reinventing custom free monads to talk about query trees, which reinvents my approach. There is no trusted meta function. I'll take further discussion to chat.

Well, instead of fighting this PR, how about you use your energy to draw inspiration from it just as it drew inspiration from yours? Or you try and find ways to improve it. To me it seems that both PRs begin to converge on a similar design. Personally, I'm still not 100% convinced that the FreeM stuff to eliminate a trust issue is worth it when that trust issue would also be solved by a meta function + plain TimeM. A meta function that is needed anyway to verify complexity of ~~bonafide~~ vanilla Lean code. At least I don't see how there is another solution to the latter problem.

FreeM is not merely about protecting against cheating as I have explained countless times on Zulip. It allows me to encode the zoo of models found in the algorithms theory literature, both implicit and explict, and reason about models independently of algorithms AND combine them while keeping their complexities separate. If you misunderstand this, then you misunderstand my PR. So far every claim made about the superiority of this PR to mine has proven to be false, and yet proclaimed strongly.

As for the rest of your comment, I would appreciate it if you corrected the record per discussion in DM. Fighting unethical behaviour that affects me is non-optional for me. These additional suggestions are not relevant to the review.

The relevant part is : Docstrings which are incorrect can be corrected. This docstring can be corrected.

@sgraf812
Further, I don't see any reason to contribute to an approach I consider too broken for uses in algorithms theory. It may be awesome for verifying pure lean functions, but that is of incidental interest (which my model can satisfy). I do consider it in my interest to do my best to explain why this approach is too broken for all the rest of it. My interest is in enabling the kind of top-down modular approach that can enable the kind of abstract essential reasoning about algorithms that prove useful for formalizing everything from undergrad textbooks to research papers without waiting for the library to start from undergrad algorithms and get there some day. The kind of modular reasoning good courses teach

Shreyas4991

This model is becoming a copy of mine for lower bounds.

Shreyas4991 · 2026-03-03T13:22:21Z

Cslib/Algorithms/Lean/Query/QueryTree.lean

+This is the free monad specialized to a single fixed-type operation, used to reify
+monad-generic algorithms as explicit trees for query complexity lower bounds. -/
+inductive QueryTree (Q : Type) (R : Type) (α : Type) where
+  /-- A completed computation returning value `a`. -/
+  | pure (a : α) : QueryTree Q R α
+  /-- A query node: asks query `q`, then continues based on the response. -/
+  | query (q : Q) (cont : R → QueryTree Q R α) : QueryTree Q R α
+


Now this is beginning to reinvent Prog. I proved both my upper and lower bound in one framework. Apparently you need to reinvent my model for lower bounds where i can just compute the trace of queries made. Because a prog is literally just an expression tree.

Replace @ notation with (m := M) named arguments for readability. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Add `triple_mono` helper to weaken Hoare triple postconditions via a monotone function, replacing verbose `Triple.entails_wp_of_post` + `PostCond.entails_noThrow` + `SPred.pure_mono` boilerplate at 4 sites. Also move `let M` before `hcmp` in `queryTree_correct` to avoid repeating the OracleQueryTree monad type. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

eric-wieser · 2026-03-04T06:19:38Z

Cslib/Algorithms/Lean/Query/QueryTree.lean

+`QueryTree Q R α` is a dedicated two-constructor inductive with all parameters in `Type`,
+so algorithms and their decision trees need no universe annotations and proofs proceed by
+straightforward structural recursion. A general free monad over an operation family
+`(ι : Type) → (ι → Type)` would introduce an existential bumping the universe; since query


Does FreeM introduce such a universe bump?

Indeed it does not. I've removed mention of universes here. (I think not having any universes in this part of the code is probably the right choice, but happy to be convinced otherwise.)

In this case, can we make QueryTree an abbrev for a specialization of FreeM?

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

eric-wieser · 2026-03-04T07:41:45Z

Cslib/Algorithms/Lean/Query/QueryTree.lean

+## Design
+
+`QueryTree Q R α` is a dedicated two-constructor inductive with all parameters in `Type`,
+so proofs proceed by straightforward structural recursion.


This is also true of FreeM

kim-em force-pushed the feat/tickm-complexity-demo branch from a889930 to 5299afa Compare February 27, 2026 01:51

kim-em commented Feb 27, 2026

View reviewed changes

kim-em force-pushed the feat/tickm-complexity-demo branch from 61d9a60 to 6b6a308 Compare February 27, 2026 02:01

eric-wieser reviewed Feb 27, 2026

View reviewed changes

Cslib/Algorithms/Lean/Query/Basic.lean Outdated Show resolved Hide resolved

sgraf812 reviewed Feb 27, 2026

View reviewed changes

Shreyas4991 reviewed Feb 27, 2026

View reviewed changes

Cslib/Algorithms/Lean/Query/Sort/Insertion/Defs.lean Show resolved Hide resolved

Shreyas4991 reviewed Mar 2, 2026

View reviewed changes

kim-em force-pushed the feat/tickm-complexity-demo branch from 285f4ae to 7a39a7c Compare March 2, 2026 12:16

kim-em force-pushed the feat/tickm-complexity-demo branch from eddc66a to f250194 Compare March 3, 2026 05:21

kim-em marked this pull request as ready for review March 3, 2026 06:03

kim-em requested review from chenson2018 and fmontesi as code owners March 3, 2026 06:03

kim-em and others added 5 commits March 3, 2026 06:04

fix(Query): suppress unusedArguments lint for OracleQueryTree

871e515

The `_oracle` parameter is structurally unused in the abbrev body but is essential for the WP/WPMonad typeclass instances that depend on it. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

fix(Query): correct author name spelling

97b082f

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

feat(Query): relax LowerBound from size x = n to size x ≤ n

dc45a60

This makes the predicate easier to satisfy: the witness input only needs size at most n, not exactly n. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

proof cleanup

c30426b

sgraf812 reviewed Mar 3, 2026

View reviewed changes

kim-em and others added 2 commits March 3, 2026 11:13

proof cleanup: deduplicate queryTree_correct, inline hsplit

1b12960

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

proof cleanup: use grind in id_run proofs, inline set σ

7964db0

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

eric-wieser reviewed Mar 3, 2026

View reviewed changes

Shreyas4991 reviewed Mar 3, 2026

View reviewed changes

kim-em and others added 4 commits March 3, 2026 22:19

refactor: use named arguments in queryTree_correct

6e0df9e

Replace @ notation with (m := M) named arguments for readability. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

doc: explain QueryTree's Type-level design rationale

8121baa

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

chore: update Cslib.lean module imports

8c2fd78

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

eric-wieser reviewed Mar 4, 2026

View reviewed changes

kim-em and others added 2 commits March 4, 2026 06:21

chore: add Shreyas Srinivas to LowerBound authors

6b81400

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

doc: simplify QueryTree design rationale

e66846e

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

eric-wieser reviewed Mar 4, 2026

View reviewed changes


		variable {Q R : Type} {oracle : Q → R}

		instance instWP : WP (OracleQueryTree Q R oracle) .pure where


		## Computability caveat

		The parametricity argument is only valid for computable algorithms. A `noncomputable`

Conversation

kim-em commented Feb 27, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Upper bounds (monad parametricity)

Lower bounds (decision trees)

Uh oh!

kim-em commented Feb 27, 2026

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

sgraf812 Feb 27, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

sgraf812 left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

sgraf812 Feb 27, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Shreyas4991 commented Feb 27, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Uh oh!

kim-em commented Feb 27, 2026

Uh oh!

Shreyas4991 commented Feb 27, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

kim-em commented Mar 1, 2026

Uh oh!

Shreyas4991 commented Mar 1, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Shreyas4991 commented Mar 1, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

kim-em commented Mar 2, 2026

Uh oh!

Shreyas4991 commented Mar 2, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Shreyas4991 commented Mar 2, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Shreyas4991 commented Mar 2, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Shreyas4991 commented Mar 2, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

kim-em commented Mar 2, 2026

kim-em commented Feb 27, 2026 •

edited

Loading

sgraf812 Feb 27, 2026 •

edited

Loading

sgraf812 Feb 27, 2026 •

edited

Loading

Shreyas4991 commented Feb 27, 2026 •

edited

Loading

Shreyas4991 commented Feb 27, 2026 •

edited

Loading

Shreyas4991 commented Mar 1, 2026 •

edited

Loading

Shreyas4991 commented Mar 1, 2026 •

edited

Loading

Shreyas4991 commented Mar 2, 2026 •

edited

Loading

Shreyas4991 commented Mar 2, 2026 •

edited

Loading

Shreyas4991 commented Mar 2, 2026 •

edited

Loading

Shreyas4991 commented Mar 2, 2026 •

edited

Loading

Shreyas4991 commented Mar 2, 2026 •

edited

Loading

Shreyas4991 Mar 2, 2026 •

edited

Loading

Shreyas4991 Mar 2, 2026 •

edited

Loading

Shreyas4991 commented Mar 2, 2026 •

edited

Loading

Shreyas4991 commented Mar 2, 2026 •

edited

Loading

Shreyas4991 commented Mar 2, 2026 •

edited

Loading

Shreyas4991 Mar 3, 2026 •

edited

Loading

eric-wieser Mar 3, 2026 •

edited

Loading

eric-wieser Mar 3, 2026 •

edited

Loading