Clarifies Factory vs FORGE action-space control semantics#6158
Clarifies Factory vs FORGE action-space control semantics#61580xadvait wants to merge 2 commits into
Conversation
The Factory and FORGE environments share the CtrlCfg fields pos/rot_action_bounds and pos/rot_action_threshold, but use them in opposite roles: Factory actions are displacements relative to the current end-effector pose (threshold scales the per-step action, bounds clip the target relative to the fixed asset), while FORGE actions are absolute targets relative to the fixed asset (bounds map the action onto the operational volume, the randomized threshold clips the per-step motion). This mirrors the action scale (lambda) defined in Sec. III-B, Eq. 6 of the FORGE paper and its Appendix A randomization ranges, but has repeatedly been read as an accidental swap when comparing the two environments side by side. Document both semantics on CtrlCfg, ForgeCtrlCfg and ForgeEnv._apply_action with references to the paper. Related to isaac-sim#5424 Signed-off-by: Advait Jayant <advait@vannalabs.ai>
Greptile SummaryThis is a documentation-only PR that adds class-level docstrings to
Confidence Score: 4/5Safe to merge — purely documentation additions with no runtime impact. All four changed files are documentation-only. The docstrings accurately reflect the implementation verified by reading the actual _apply_action code. The one minor point is that the _apply_action docstring references pos_threshold/rot_threshold (the runtime tensor names) while ForgeCtrlCfg and the rest of the codebase use pos_action_threshold/rot_action_threshold (the config field names), which could puzzle a reader cross-referencing the config class. The _apply_action docstring in forge_env.py uses pos_threshold/rot_threshold where all other references use the config-field names pos_action_threshold/rot_action_threshold; worth aligning for consistency. Important Files Changed
Flowchart%%{init: {'theme': 'neutral'}}%%
flowchart TD
subgraph Factory["Factory (_apply_action)"]
FA["Normalized action [-1, 1]"]
FA -->|"× pos_action_threshold\n(scales to per-step displacement)"| FD["Per-step delta pose\n(relative to current EE)"]
FD -->|"target = current_EE + delta"| FT["Absolute target pose"]
FT -->|"clip with pos_action_bounds\n(workspace bound rel. fixed asset)"| FC["Clipped target → controller"]
end
subgraph FORGE["FORGE (_apply_action)"]
GA["Normalized action [-1, 1]"]
GA -->|"× pos_action_bounds\n(scales onto operational volume)"| GT["Absolute target pose\n(relative to fixed asset)"]
GT -->|"delta = target − current_EE"| GD["Per-step delta pose"]
GD -->|"clip with pos_action_threshold (λ)\n(limits per-step motion)"| GC["Clipped target → controller"]
end
Reviews (1): Last reviewed commit: "Clarify Factory vs FORGE action-space co..." | Re-trigger Greptile |
| * ``pos_threshold`` and ``rot_threshold`` clip the per-step motion of the target relative to the | ||
| current end-effector pose. They correspond to the action scale (lambda) in the FORGE paper, | ||
| which is randomized per episode as part of the dynamics randomization scheme and exposed to | ||
| the critic as privileged state. |
There was a problem hiding this comment.
The docstring uses the runtime attribute names
`pos_threshold` and `rot_threshold` (i.e., self.pos_threshold/self.rot_threshold) where the ForgeCtrlCfg docstring and all other cross-references consistently use the config field names `pos_action_threshold` and `rot_action_threshold`. A reader searching the config class for `pos_threshold` won't find it, making the cross-reference harder to follow.
| * ``pos_threshold`` and ``rot_threshold`` clip the per-step motion of the target relative to the | |
| current end-effector pose. They correspond to the action scale (lambda) in the FORGE paper, | |
| which is randomized per episode as part of the dynamics randomization scheme and exposed to | |
| the critic as privileged state. | |
| * ``pos_action_threshold`` and ``rot_action_threshold`` clip the per-step motion of the target relative to the | |
| current end-effector pose. They correspond to the action scale (lambda) in the FORGE paper, | |
| which is randomized per episode as part of the dynamics randomization scheme and exposed to | |
| the critic as privileged state. |
Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!
There was a problem hiding this comment.
Agreed -- updated to use the config field names consistently, keeping a note on the runtime pos_threshold/rot_threshold tensors they are applied through (those are what the code below actually reads, post-randomization).
Reference pos_action_threshold/rot_action_threshold consistently with the ForgeCtrlCfg docstring, and note the runtime tensors they are applied through. Signed-off-by: Advait Jayant <advait@vannalabs.ai>
Description
The Factory and FORGE environments share the
CtrlCfgfieldspos_action_bounds/rot_action_boundsandpos_action_threshold/rot_action_threshold, but apply them in opposite roles, because the two families use different action spaces:factory_env.py)forge_env.py)pos_action_thresholdpos_action_boundsRead side by side, this looks like the two variables were accidentally swapped in
ForgeEnv(see #5424). The FORGE implementation is, however, faithful to the FORGE paper:pos_action_bounds = 0.05).x * morx / mwithm = 1 + U(0,1) · noise_level),pos_action_threshold = 0.02andpos_threshold_noise_level = 0.25give exactly0.02 × [1/1.25, 1.25] = [1.6, 2.5] cm. The same check works for the controller gains:565 × [1/1.41, 1.41] ≈ [400, 800], matching the paper's "Controller Gains [400, 800]".Because this keeps being read as a bug, this PR documents both semantics where a reader would look for them: a class docstring on
CtrlCfg(Factory semantics), a class docstring onForgeCtrlCfg(FORGE semantics, the λ correspondence and the randomization ranges), and an expandedForgeEnv._apply_actiondocstring with the paper reference.Documentation-only change; no behavior is modified.
Related to #5424 (a detailed analysis is posted on the issue).
Type of change
Checklist
pre-commitchecks with./isaaclab.sh --formatconfig/extension.tomlfileCONTRIBUTORS.mdor my name already exists there