I'm opening this RFC to address a fundamental conflict in our dependency management, which leads to a poor user experience and maintenance challenges. There was some internal discussion in June but I forgot to follow up then.
## The Conflict: "Live at the Head" vs. User Reproducibility
We have two core, conflicting goals:
- "Live at the Head" (Maintainer Goal): The cookbook should act as CI for the lab's packages, ensuring recipes work with the latest versions of `chemiscope`, `metatensor`, etc.
- Reproducibility (User Goal): A user who downloads a recipe `.zip` must get a working environment. They should not get a broken environment just because a new, unrelated package broke a specific recipe.
Our current `noxfile.py` setup fails at both:
- It fails reproducibility: The `example` session installs dependencies (`sphinx-gallery`, `chemiscope`, etc.) after creating the environment from `environment.yml`. This "clobbers" any pins in the `environment.yml` (as Guillaume noted, we are resolving twice). The `environment.yml` in the user-facing `.zip` is therefore incomplete and misleading.
- It fails "at the head" testing: By silently clobbering pins, we don't get a clear signal when a recipe becomes truly incompatible with new packages; we just force an update. This leads to recipe rot (like "Update or remove the periodic-hamiltonian example" #180, which is pinned to old packages) and doesn't actually guarantee the recipe logic works with the newest versions.
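To make the double-resolution problem concrete, here is a toy model of the clobbering (illustrative only: the package names and versions are made up, and this is not our actual `noxfile.py` logic). Whatever is installed second silently wins:

```python
def effective_versions(env_yml_pins: dict, session_installs: dict) -> dict:
    """Model two sequential resolutions: the second install step
    silently overrides any pins from the first."""
    resolved = dict(env_yml_pins)      # first solve: environment.yml
    resolved.update(session_installs)  # second solve: nox session installs
    return resolved

# Hypothetical recipe pins vs. what the gallery install pulls in:
recipe_pins = {"chemiscope": "0.5.0", "numpy": "1.24"}
gallery_deps = {"chemiscope": "0.8.1", "sphinx-gallery": "0.15"}

print(effective_versions(recipe_pins, gallery_deps))
# The user-facing environment.yml still says chemiscope 0.5.0,
# but CI actually ran against 0.8.1 -- the pin was clobbered.
```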
## The Proposal: A Dual Workflow with conda-lock
I propose we use lockfiles to separate these two goals, committing a `conda-lock.yml` file for each recipe.
This creates two distinct workflows: one for the user (stability) and one for CI (liveness).
### 1. The User Workflow (Guaranteed Reproducibility)
- The `conda-lock.yml` file for each recipe will be added to the downloadable `.zip` file (via `post_process_gallery` in `noxfile.py`).
- The `INSTALLING.rst` file (also in the zip) will be updated to instruct users to create their environment from the lockfile: `conda env create -f conda-lock.yml`
- The `nox` `example` session itself will also be modified to install directly from the `conda-lock.yml`.
Benefit: The user always gets a 100% reproducible, last-known-good environment. No more "clobbering," no more "it works on CI but not for me."
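As a sketch of the zip change, the post-processing step only needs to append the lockfile to the existing archive. The helper name below is hypothetical; in practice this would live inside `post_process_gallery` in `noxfile.py`:

```python
import zipfile
from pathlib import Path


def add_lockfile_to_zip(zip_path: Path, lockfile: Path) -> None:
    """Append the recipe's conda-lock.yml to an existing example .zip,
    so the downloadable archive ships the exact pinned environment."""
    with zipfile.ZipFile(zip_path, "a") as archive:
        archive.write(lockfile, arcname="conda-lock.yml")
```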
### 2. The Maintainer Workflow (Automated "Live at the Head" Testing)
We should create a new, separate CI job (e.g., run weekly and on-demand) to handle "living at the head." This job will:
- Attempt to resolve "at head": For each recipe, it will delete the existing `conda-lock.yml`.
- Solve: It will then try to solve a complete environment from scratch, using the recipe's `environment.yml` plus the "gallery" dependencies (`sphinx-gallery`, `chemiscope`, etc.). This tests the recipe against the latest available packages.
- On success: If it solves successfully, it generates a new `conda-lock.yml` and opens a PR (or auto-commits) to update the lockfile. This "blesses" the new package versions as the "last-known-good" set.
- On failure: If the solve fails (as `periodic-hamiltonian` would; see "Update or remove the periodic-hamiltonian example" #180), the job fails explicitly. It does not update the lockfile.
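The job's core decision logic can be sketched independently of the solver. In this sketch, `solve` is a stand-in for invoking conda-lock on `environment.yml` plus the gallery dependencies, and the function name is hypothetical:

```python
from pathlib import Path
from typing import Callable


def refresh_lockfile(recipe_dir: Path, solve: Callable[[Path], str]) -> bool:
    """Try to re-solve the recipe 'at the head'.

    On success, write the new conda-lock.yml (to be committed via PR).
    On failure, leave the last-known-good lockfile untouched and
    report the failure so CI can flag the recipe explicitly.
    """
    lockfile = recipe_dir / "conda-lock.yml"
    try:
        # `solve` stands in for running the real solver; here it is
        # assumed to raise RuntimeError when no environment resolves.
        new_lock = solve(recipe_dir)
    except RuntimeError:
        return False          # solve failed: keep the old lockfile
    lockfile.write_text(new_lock)
    return True               # lockfile updated ("blessed")
```

The key property is that a failed solve never touches the committed lockfile, so users on `main` are unaffected by breakage at the head.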
## How This Solves Our Problems
This dual system gives us the best of both worlds:
- Users are protected: They always get the last-known-good lockfile from the `main` branch. A "live at the head" failure in our CI does not break the recipe for the user.
- Maintenance is explicit: When the "at head" CI job fails, we get an immediate, actionable signal. We can then:
  - Fix the recipe to make it compatible with the new packages.
  - Pin a dependency in the recipe's `environment.yml` if a new package is truly broken.
  - Formally mark the recipe as "deprecated" on the website if it's no longer maintainable (like "Update or remove the periodic-hamiltonian example" #180).
This stops recipe rot, makes maintenance a clear and explicit process, and delivers a reproducible, working example for our users.