Wave E batch 3: Dreamer-pixels and CQL-offline reproduction labs by ChatGPU · Pull Request #17 · ChatGPU/Autonomous-Driving-Learning-Atlas

ChatGPU · 2026-05-27T17:26:22Z

Two more long-program reproduction labs in the new paradigm-grouped layout.

labs/world_models/lab_dreamer_cartpole_pixels/ — Dreamer-style world model on CartPole-v1 from pixels:

CNN encoder + RSSM (deterministic h_t + stochastic Gaussian z_t) + transposed-conv decoder + reward head.
Latent-imagination actor-critic with GAE λ-returns.
README, paper.md (links paper_world_models / paper_dreamer_v2 / paper_dreamer_v3), notebook narrative, full src/ module split.
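As a rough illustration of the λ-return targets mentioned above, here is a minimal pure-Python sketch. It is not taken from the lab's src/ modules: the function name `lambda_returns`, its argument layout, and the default `gamma` and `lam` values are assumptions chosen for the example. It uses the standard recursion G_t = r_t + γ[(1 − λ) V(s_{t+1}) + λ G_{t+1}], bootstrapped from the critic's value at the end of the imagined rollout.

```python
def lambda_returns(rewards, values, bootstrap, gamma=0.99, lam=0.95):
    """Compute lambda-returns for one imagined rollout of length H.

    rewards[t] : predicted reward at step t, for t = 0..H-1
    values[t]  : critic estimate V(s_t), for t = 0..H-1
    bootstrap  : critic estimate V(s_H) for the state after the last step
    (Hypothetical signature; the lab's own trainer may differ.)
    """
    # V(s_{t+1}) for every step t, ending with the bootstrap value.
    next_values = list(values[1:]) + [bootstrap]
    returns = [0.0] * len(rewards)
    next_return = bootstrap  # G_H is defined as V(s_H)
    # Walk backwards so that G_{t+1} is available when computing G_t.
    for t in reversed(range(len(rewards))):
        next_return = rewards[t] + gamma * (
            (1.0 - lam) * next_values[t] + lam * next_return
        )
        returns[t] = next_return
    return returns


def advantages(returns, values):
    """GAE-style advantages: lambda-return minus the critic baseline."""
    return [g - v for g, v in zip(returns, values)]
```

With `lam=1.0` the recursion reduces to a discounted Monte Carlo return with a bootstrap at the horizon; with `lam=0.0` it reduces to the one-step TD target r_t + γ V(s_{t+1}). Subtracting V(s_t) from the λ-return gives the GAE(λ) advantage, which is presumably what "GAE λ-returns" refers to here.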

labs/rl_decision/lab_cql_offline_minigrid/ — CQL vs BC vs DQN on an 8×8 sparse-reward gridworld:

Three trainers + unified eval pipeline + auto-tuned α ablation.
assets/: q_overestimation, q_overestimation_dqn_only, ood_action_density, action_histogram, eval_returns, ablation_alpha.
data/: bc.pt, dqn.pt, cql.pt, offline_dataset.pt.
paper.md links paper_cql, paper_bear, paper_iql.
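For readers unfamiliar with how CQL differs from the DQN baseline, the sketch below shows the per-transition loss in the discrete-action CQL(H) form: a standard TD term plus α times the gap between the log-sum-exp of all Q-values and the Q-value of the dataset action. It is a pure-Python illustration under stated assumptions, not the lab's trainer; `cql_loss`, its arguments, and the defaults are hypothetical names chosen for the example.

```python
import math


def logsumexp(xs):
    """Numerically stable log(sum(exp(x))) over a list of floats."""
    m = max(xs)
    return m + math.log(sum(math.exp(x - m) for x in xs))


def cql_loss(q_values, action, reward, next_q_target, done,
             alpha=1.0, gamma=0.99):
    """Per-transition CQL(H) loss for a discrete action space.

    q_values      : Q(s, a) for every action a under the online network
    action        : index of the action stored in the offline dataset
    reward        : reward stored in the offline dataset
    next_q_target : Q_target(s', a') for every action a'
    done          : True if s' is terminal (no bootstrapping)
    (Hypothetical signature; alpha=0 recovers a plain DQN-style TD loss.)
    """
    # Standard one-step TD target, as in DQN.
    target = reward + (0.0 if done else gamma * max(next_q_target))
    td_error = q_values[action] - target
    # Conservative term: pushes down Q over all actions (via logsumexp)
    # and pushes up Q on the action actually seen in the dataset.
    conservative_gap = logsumexp(q_values) - q_values[action]
    return 0.5 * td_error ** 2 + alpha * conservative_gap
```

Setting `alpha=0.0` leaves only the TD term, which is why DQN trained on the same offline data is a natural baseline for the Q-overestimation plots. An auto-tuned α would typically adjust `alpha` so that the conservative gap tracks a chosen threshold, though the exact scheme used in this lab is not stated here.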

Both labs follow the per-lab directory contract documented in labs/RESTRUCTURE_PROPOSAL.md.

https://claude.ai/code/session_017Ez7KNKDCGRRLjEnJi9TW7

Generated by Claude Code

labs/world_models/lab_dreamer_cartpole_pixels/
- Dreamer-style world model on CartPole-v1 from pixels.
- CNN encoder + RSSM (deterministic h_t + stochastic Gaussian z_t) + transposed-conv decoder + reward head; latent-imagination actor-critic with GAE lambda-returns.
- README, paper.md (linking paper_world_models / paper_dreamer_v2 / paper_dreamer_v3), notebook narrative, full src module split (env, world_model, trainer, policy, viz, seeds).
Training/asset PNGs are deferred to a follow-up; the code is end-to-end runnable.

labs/rl_decision/lab_cql_offline_minigrid/
- CQL on an 8x8 sparse-reward gridworld, with BC + DQN baselines and an alpha-tuning ablation.
- Three trainers (BC, DQN, CQL) plus a unified eval pipeline.
- assets/ contains the six story PNGs: q_overestimation, q_overestimation_dqn_only, ood_action_density, action_histogram, eval_returns, ablation_alpha.
- data/ checkpoints: bc.pt, dqn.pt, cql.pt, offline_dataset.pt.
- paper.md links paper_cql / paper_bear / paper_iql.

Both labs follow the per-lab directory contract documented in labs/RESTRUCTURE_PROPOSAL.md.

https://claude.ai/code/session_017Ez7KNKDCGRRLjEnJi9TW7

ChatGPU merged commit a1a15f4 into main May 27, 2026
1 check passed

ChatGPU merged 1 commit into main from claude/epic-ritchie-A7YtN

2 participants
