OMPE-92490: Fix singular rotation matrix and non-rotation quaternion#5609
Conversation
…bugs - Make create_rotation_matrix_from_view emit a valid orthonormal frame when forward is parallel to up (via alt-up substitution), and fill NaN for rows with undefined forward (eyes == targets or non-finite input). - Add unit-norm safeguard to quat_from_matrix: non-rotation input (singular, reflection, scale-error) now returns NaN instead of silent garbage. - Update Camera and RayCasterCamera set_world_poses_from_view to skip degenerate rows with a warning and raise ValueError on full-batch failure. - Filter zero-acceleration bodies in Pva acceleration visualizers (Newton + PhysX) to avoid producing degenerate inputs. - Add 11 regression tests covering singular rotation, non-finite input, partial-batch failure, reflection matrices, and scale-error matrices. - Remove the 0.1 x-nudge workaround in test_ray_caster_camera and test_multi_mesh_ray_caster_camera now that the bare (0, 0, 5) input produces a valid rotation deterministically.
There was a problem hiding this comment.
🤖 Isaac Lab Review Bot
Summary
This PR correctly addresses a mathematical edge case in rotation matrix construction when the look-at direction is parallel to the up axis, which previously produced singular matrices and subsequently garbage quaternions. The fix uses an alternate reference vector (world X-axis) for the cross product when the standard up-vector fails, and adds NaN-propagation as a fail-safe for truly undefined inputs (eye == target). The implementation is mathematically sound and well-tested.
Findings
🔵 Suggestion (math.py:1671): The alternate up vector substitution using world X-axis is correct for Y-up and Z-up stages, but the comment could be more explicit about why X is guaranteed non-parallel: "World X is non-parallel to z whenever the symptom fires" — this holds because if forward ∥ Z (or forward ∥ Y), then z_axis = -normalize(forward) ∥ Z (or Y), and X is orthogonal to both Y and Z by definition. Worth documenting for future maintainers.
🔵 Suggestion (math.py:377): The 2e-5 threshold for detecting non-unit quaternions is reasonable (2× worst-case float32 error), but consider making this a named constant (e.g., _QUAT_UNIT_NORM_TOL = 2e-5) for clarity and easier tuning if needed.
🔵 Suggestion (camera.py:336, ray_caster_camera.py:280): Both callers have nearly identical NaN-handling logic (check valid indices, warn on partial failure, raise on total failure). Consider extracting this pattern into a helper function to reduce duplication and ensure consistency.
✅ Good: The test coverage is comprehensive — 11 new tests covering:
- Parallel-to-up-axis cases (Z-up and Y-up)
- Negative up-axis alignment
- Zero forward (eye == target)
- Batched partial failure
- Non-finite input (NaN/Inf)
- Singular matrices
- Reflection matrices (det = -1)
- Scale-error matrices (non-orthonormal)
✅ Good: The removal of the 0.1 x-nudge workaround from integration tests demonstrates the fix working in practice — the camera can now be placed directly on the up-axis without needing artificial offsets.
✅ Good: The PVA visualizer fixes correctly filter zero-acceleration bodies before constructing look-at rotations, preventing the same singularity issue in the debug visualization path.
Verdict
Ship it
The mathematical fix is correct, the NaN-propagation provides a robust fail-safe, and test coverage is thorough. The suggestions above are minor improvements for code clarity — none block merging.
Greptile SummaryThis PR fixes two interrelated bugs:
Confidence Score: 4/5The core math changes and camera callers are correct and well-tested; the only gaps are in the debug-visualization path of both pva implementations. The Both pva.py files (newton and physx) would benefit from a second look: the valid_indices filter should also exclude non-finite acceleration to be consistent with the NaN-propagation protection introduced elsewhere in this PR. Important Files Changed
Reviews (1): Last reviewed commit: "OMPE-92490: Fix singular rotation matrix..." | Re-trigger Greptile |
| valid_indices = (torch.linalg.norm(accel_w, dim=-1) > 1e-5).nonzero(as_tuple=True)[0] | ||
| if valid_indices.numel() == 0: | ||
| return | ||
| pos_filtered = pos_w_torch.index_select(0, valid_indices) | ||
| accel_filtered = accel_w.index_select(0, valid_indices) | ||
| rotation_matrix = math_utils.create_rotation_matrix_from_view( | ||
| pos_filtered, | ||
| pos_filtered + accel_filtered, | ||
| up_axis=up_axis, | ||
| device=self._device, | ||
| ) | ||
| quat_opengl = math_utils.quat_from_matrix(rotation_matrix) |
There was a problem hiding this comment.
Non-finite acceleration bypasses the degenerate-input guard
torch.linalg.norm(inf_accel) evaluates to inf, which satisfies > 1e-5 and slips through the filter. When accel_w has an inf component, forward is non-finite, create_rotation_matrix_from_view returns a NaN row (via its ~torch.isfinite(forward) check), and quat_from_matrix then receives that NaN matrix. Unlike the camera callers, this code has no post-hoc isnan(rotation_matrix) guard, so the partial-NaN quaternion reaches convert_camera_frame_orientation_convention and ultimately visualize() unchecked. The same gap exists in the physx Pva file.
The fix is to add a finiteness condition to the existing filter: (torch.linalg.norm(accel_w, dim=-1) > 1e-5) & torch.isfinite(accel_w).all(dim=-1).
| logger.warning( | ||
| "set_world_poses_from_view: skipping %d pose(s) where eye equals target", | ||
| n_total - n_valid, | ||
| ) |
There was a problem hiding this comment.
The warning message only names
eye == target as the reason for skipping a row, but the updated contract of create_rotation_matrix_from_view also returns NaN for non-finite inputs (eyes or targets containing inf/NaN). A caller seeing this warning while debugging a non-finite-coordinate issue would be misled.
| logger.warning( | |
| "set_world_poses_from_view: skipping %d pose(s) where eye equals target", | |
| n_total - n_valid, | |
| ) | |
| logger.warning( | |
| "set_world_poses_from_view: skipping %d pose(s) with undefined look-at direction" | |
| " (eye equals target or non-finite coordinates)", | |
| n_total - n_valid, | |
| ) |
| logger.warning( | ||
| "set_world_poses_from_view: skipping %d pose(s) where eye equals target", | ||
| n_total - n_valid, | ||
| ) |
There was a problem hiding this comment.
Same incomplete warning message as in
camera.py: the log only mentions eye == target but NaN rotation matrices also arise from non-finite input coordinates.
| logger.warning( | |
| "set_world_poses_from_view: skipping %d pose(s) where eye equals target", | |
| n_total - n_valid, | |
| ) | |
| logger.warning( | |
| "set_world_poses_from_view: skipping %d pose(s) with undefined look-at direction" | |
| " (eye equals target or non-finite coordinates)", | |
| n_total - n_valid, | |
| ) |
…_ik (#5644) ## Summary One-character fix in `source/isaaclab/test/controllers/test_pink_ik.py:309`: ```diff - quat_from_matrix(matrix_from_quat(target_rot_tensor) * matrix_from_quat(quat_inv(current_rot))) + quat_from_matrix(matrix_from_quat(target_rot_tensor) @ matrix_from_quat(quat_inv(current_rot))) ``` `calculate_rotation_error` was composing two rotation matrices with PyTorch's element-wise multiplication (`*`) where matrix multiplication (`@`) was intended. The Hadamard product of two rotation matrices is not generally a rotation matrix. ## Why this surfaced as test failures now The bug has been latent since [#3149](#3149) (2025-08-26) because the Hadamard product of two near-identity matrices is also near-identity — `quat_from_matrix` could still recover a near-unit quaternion and the assertion `rot_error ≈ 0` would pass for completely wrong mathematical reasons. It became visible when [#5609 (jmart)](#5609) (2026-05-14) added the unit-norm guard to `isaaclab/utils/math.py:quat_from_matrix`: ```python invalid = (quat.norm(p=2, dim=-1, keepdim=True) - 1.0).abs() > 2e-5 return torch.where(invalid, torch.full_like(quat, float("nan")), quat) ``` After that PR, any non-rotation input (the Hadamard mess) returns NaN, which `axis_angle_from_quat` propagates → `torch.max(NaN) = NaN` → `AssertionError: Left hand IK rotation error (nan) exceeds tolerance`. Both hands always went to NaN; left hand is just asserted first. ## Verification Local repro on the Horde VM against current `develop` (`isaaclab_physx` backend, `newton[sim]@v1.2.0rc2`): | Configuration | Result | |---|---| | Unfixed, `Isaac-PickPlace-GR1T2-Abs-v0-horizontal_movement` | FAILED — `Left hand IK rotation error (nan)` | | Fixed, same parameterization | PASSED — rotation errors `1e-4` to `1e-7` (well within 0.02 rad tolerance) | | Fixed, all 12 GR1T2 cases, run 1 | 11 passed, 1 skipped | | Fixed, all 12 GR1T2 cases, run 2 | 11 passed, 1 skipped (deterministic) | ## Scope This addresses the consistent `Left hand IK rotation error (nan)` failures seen across recent develop PRs (e.g. [#5633 `test-curobo` log](https://github.com/isaac-sim/IsaacLab/actions/runs/25926139790/job/76211194676), [#5609 `test-curobo` log](https://github.com/isaac-sim/IsaacLab/actions/runs/25831490295/job/75897258188), [#5616 `test-curobo` log](https://github.com/isaac-sim/IsaacLab/actions/runs/25930392313/job/76222556444)). Remaining failures on G1 envs (finite ~0.03-0.05 rad rotation errors against the 0.030 rad tolerance) are a **separate** issue — IK convergence quality rather than the NaN math bug. Out of scope for this PR; needs its own ticket. ## Test plan - [x] Pre-commit clean. - [x] Unfixed branch reproduces NaN on `Isaac-PickPlace-GR1T2-Abs-v0-horizontal_movement` locally. - [x] Fixed branch passes the same parameterization locally with finite rotation errors. - [x] Fixed branch passes all 12 GR1T2 parameterizations across two consecutive runs (deterministic).
Description
When calculating the "look-at" quaternion for a camera, an orthonormal rotation matrix is first calculated using the camera's eye position, look-at target, and world up vectors:
forward = target - eye("camera forward")camera_z = -normalize(forward)("camera backward")camera_x = world_up × camera_z("camera right")camera_y = camera_z × camera_x("camera up")R = [camera_x, camera_y, camera_z ](OpenGL convention)However, if
forwardis parallel toworld_upthen the cross productcamera_xis zero, leading to a singular non-invertible matrix returned fromcreate_rotation_matrix_from_view(). Thenquat_from_matrix()would silently convert this to a non-unit quaternion and return this garbage back to the caller.This change fixes both issues as follows:
create_rotation_matrix_from_view:world_upvector and re-calculates the matrix.camera_y × camera_zwas used as the fallback, which was already zero due to the problem described above.world_upsinceworld_upis restricted to Y or Z.eye == targetor non-finite values) it now returns per-rowNaNthat the caller can detect and handle.quat_from_matrix:NaNwhen the input is not a valid rotation matrix (singular, reflection, or non-orthonormal).All callers in IsaacLab have been updated to detect
NaNwhere appropriate and fail gracefully, or avoid passing degenerate input altogether where possible.Added 11 new unit tests and removed the 0.1 x-nudge workaround from the integration tests (PR #5470 and #5380)
Type of change
Checklist
pre-commitchecks with./isaaclab.sh --formatconfig/extension.tomlfileCONTRIBUTORS.mdor my name already exists there