
[Example] Isaac Lab RNN PPO with compact memory + knowledge-base notes#3763

Merged: vmoens merged 13 commits into gh/vmoens/279/base from gh/vmoens/279/head on May 19, 2026

@vmoens (Collaborator) commented May 15, 2026

Stack from ghstack (oldest at bottom):

End-to-end recurrent PPO example targeting Isaac Lab. The script
demonstrates the collector / GAE features added below in the stack:

- :class:`~torchrl.collectors.MultiCollector` (sync or async) with
  ``policy_factory`` and a :class:`~torchrl.weight_update.MultiProcessWeightSyncScheme`
  for parent->worker weight updates.
- ``compact_obs=True`` + ``final_obs=True`` to drop ``("next", obs)`` from
  every step and carry the boundary obs under ``("final", obs)``.
- :class:`~torchrl.objectives.value.GAE` with ``shifted=True``: bootstraps
  from ``("final", obs)`` at window boundaries.
- :class:`~torchrl.modules.LSTMModule`'s configurable ``recurrent_backend``:
  during collection (``set_recurrent_mode=False``) the LSTM auto-uses
  cuDNN regardless of the backend; the configured backend applies during
  training (``set_recurrent_mode=True``).
- Optional ``torch.compile`` + :class:`~tensordict.nn.CudaGraphModule`
  around the update step.
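
To make the ``compact_obs`` + shifted-GAE pairing concrete, here is a minimal pure-Python sketch of the idea. It is not the torchrl implementation: ``next_obs_from_compact`` and ``gae`` are hypothetical helpers, observations are plain floats, and a ``done`` flag is assumed to mean a true termination (truncation handling is left out).

```python
from typing import List, Sequence


def next_obs_from_compact(obs: Sequence[float], final_obs: float) -> List[float]:
    """Rebuild the per-step next observation from a compact window.

    The next observation of step t is the observation of step t + 1, so only
    the last step needs the separately stored boundary observation.
    """
    return list(obs[1:]) + [final_obs]


def gae(
    rewards: Sequence[float],
    values: Sequence[float],
    final_value: float,
    dones: Sequence[bool],
    gamma: float = 0.99,
    lmbda: float = 0.95,
) -> List[float]:
    """Generalized advantage estimation over one window of T steps.

    values holds V(s_0) .. V(s_{T-1}); final_value is V(s_T), computed from
    the boundary observation. dones[t] is True when step t terminated its
    episode, in which case nothing is bootstrapped past it.
    """
    # Shift the value sequence by one step instead of storing next values.
    next_values = list(values[1:]) + [final_value]
    advantages = [0.0] * len(rewards)
    running = 0.0
    for t in reversed(range(len(rewards))):
        not_done = 0.0 if dones[t] else 1.0
        delta = rewards[t] + gamma * next_values[t] * not_done - values[t]
        running = delta + gamma * lmbda * not_done * running
        advantages[t] = running
    return advantages


# Toy window of three steps with a stand-in value function V(o) = 0.5 * o.
obs = [0.0, 1.0, 2.0]
final_obs = 3.0
values = [0.5 * o for o in obs]
final_value = 0.5 * final_obs
advantages = gae([1.0, 1.0, 1.0], values, final_value, [False] * 3, gamma=1.0, lmbda=1.0)
```

With ``gamma = lmbda = 1`` each advantage reduces to the reward-to-go plus the boundary value minus ``V(s_t)``, which makes the boundary bootstrap easy to check by hand.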

Isaac Lab is only imported inside the worker subprocess (via the lazy
import in ``make_env``); the main process never imports ``isaaclab``,
so it stays light and owns the trainer. ``storing_device="cpu"`` keeps
the rollout off the collector GPU on the way back to the trainer.
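
The lazy-import pattern can be sketched as follows. This is an illustration only: ``make_env_factory`` is a hypothetical helper, and the small stdlib module ``colorsys`` stands in for the heavy simulator package so the snippet runs anywhere.

```python
import importlib
import sys


def make_env_factory(module_name: str):
    """Return a zero-argument factory that imports module_name when called."""

    def make_env():
        # The import runs in whichever process calls make_env(), so a parent
        # that only hands the factory to its workers never loads the module.
        return importlib.import_module(module_name)

    return make_env


# Creating the factory does not import anything yet.
factory = make_env_factory("colorsys")

# In the real setup this call would happen inside the worker subprocess.
env_module = factory()
loaded_after_call = "colorsys" in sys.modules
```

Passing the factory rather than a constructed environment is what keeps the import on the worker side.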

``knowledge_base/ISAACLAB.md`` adds the pip-based container flow we use
on the cluster (uv venv when ensurepip is missing, IsaacLab pip wheels,
the EULA / privacy-consent env vars to make Kit non-interactive) and a
recurrent-PPO-on-Isaac recipe pinning the collector-owned ``policy_factory``,
the weight-sync scheme, and the ``compact_obs`` + shifted-GAE pairing.
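
A rough sketch of preparing a non-interactive environment for the workers is shown below. The variable names are assumptions based on the Isaac Sim install instructions and are not quoted from this PR; check ``knowledge_base/ISAACLAB.md`` for the exact set. ``non_interactive_env`` is a hypothetical helper.

```python
import os
from typing import Dict, Mapping, Optional

# Assumed variable names, not confirmed by this PR description.
ASSUMED_CONSENT_VARS = {
    "OMNI_KIT_ACCEPT_EULA": "YES",
    "ACCEPT_EULA": "Y",
    "PRIVACY_CONSENT": "Y",
}


def non_interactive_env(
    base: Optional[Mapping[str, str]] = None,
    consent_vars: Optional[Mapping[str, str]] = None,
) -> Dict[str, str]:
    """Copy an environment mapping and fill in consent variables if unset."""
    env = dict(os.environ if base is None else base)
    for name, value in (consent_vars or ASSUMED_CONSENT_VARS).items():
        # setdefault keeps any value the caller has already exported.
        env.setdefault(name, value)
    return env


worker_env = non_interactive_env({"PATH": "/usr/bin"})
```

The returned mapping could be passed as ``env=`` to ``subprocess.run`` or applied to ``os.environ`` before the workers start.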

@pytorch-bot (Bot) commented May 15, 2026

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/rl/3763

Note: Links to docs will display an error until the docs builds have been completed.

❗ 1 Active SEVs

There are 1 currently active SEVs. If your PR is affected, please view them below:

❌ 2 New Failures, 2 Unrelated Failures

As of commit 97f297b with merge base 5d11fa3 (image):

NEW FAILURES - The following jobs have failed:

BROKEN TRUNK - The following jobs failed but were present on the merge base:

👉 Rebase onto the `viable/strict` branch to avoid these failures

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@github-actions (Contributor)

⚠️ PR Title Label Error

Unknown or invalid prefix [Example].

Current title: [Example] Isaac Lab RNN PPO with compact memory + knowledge-base notes

Supported Prefixes (case-sensitive)

Your PR title must start with exactly one of these prefixes:

| Prefix | Label Applied | Example |
| --- | --- | --- |
| [BugFix] | BugFix | [BugFix] Fix memory leak in collector |
| [Feature] | Feature | [Feature] Add new optimizer |
| [Doc] or [Docs] | Documentation | [Doc] Update installation guide |
| [Refactor] | Refactoring | [Refactor] Clean up module imports |
| [CI] | CI | [CI] Fix workflow permissions |
| [Test] or [Tests] | Tests | [Tests] Add unit tests for buffer |
| [Environment] or [Environments] | Environments | [Environments] Add Gymnasium support |
| [Data] | Data | [Data] Fix replay buffer sampling |
| [Performance] or [Perf] | Performance | [Performance] Optimize tensor ops |
| [BC-Breaking] | bc breaking | [BC-Breaking] Remove deprecated API |
| [Deprecation] | Deprecation | [Deprecation] Mark old function |
| [Quality] | Quality | [Quality] Fix typos and add codespell |

Note: Common variations like singular/plural are supported (e.g., [Doc] or [Docs]).
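
For reference, the prefix rule described above could be checked locally with something like the sketch below. This is not the actual workflow code: ``title_prefix_error`` and the "Missing prefix." message are hypothetical, and the prefix list is copied from the table in this comment.

```python
import re
from typing import Optional

# Prefixes from the table above; matching is case-sensitive.
SUPPORTED_PREFIXES = (
    "BugFix", "Feature", "Doc", "Docs", "Refactor", "CI", "Test", "Tests",
    "Environment", "Environments", "Data", "Performance", "Perf",
    "BC-Breaking", "Deprecation", "Quality",
)

# A title must start with a single bracketed tag such as "[Feature]".
_PREFIX_RE = re.compile(r"^\[([^\]]+)\]")


def title_prefix_error(title: str) -> Optional[str]:
    """Return an error message for an invalid PR title, or None if it is valid."""
    match = _PREFIX_RE.match(title)
    if match is None:
        return "Missing prefix."
    prefix = match.group(1)
    if prefix not in SUPPORTED_PREFIXES:
        return f"Unknown or invalid prefix [{prefix}]."
    return None


error = title_prefix_error(
    "[Example] Isaac Lab RNN PPO with compact memory + knowledge-base notes"
)
```

Running it on this PR's title reproduces the error reported by the bot, while a title such as ``[Feature] Add new optimizer`` passes.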

vmoens added a commit that referenced this pull request May 15, 2026
ghstack-source-id: 464cf38
Pull-Request: #3763
vmoens added a commit that referenced this pull request May 15, 2026
ghstack-source-id: a9a0154
Pull-Request: #3763
vmoens added a commit that referenced this pull request May 15, 2026
ghstack-source-id: a47ad52
Pull-Request: #3763
vmoens added a commit that referenced this pull request May 19, 2026
ghstack-source-id: d512eb5
Pull-Request: #3763
vmoens merged commit 97f297b into gh/vmoens/279/base on May 19, 2026 (106 of 113 checks passed)
vmoens deleted the gh/vmoens/279/head branch on May 19, 2026 07:45
Labels: CLA Signed, Examples
