diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md
new file mode 100644
index 00000000..662a1d8a
--- /dev/null
+++ b/CONTRIBUTING.md
@@ -0,0 +1,47 @@
+# Contributing to EmbodiChain
+
+Thank you for your interest in contributing to EmbodiChain! We welcome contributions from the community to help make this project better.
+
+## Bug reports and feature requests
+
+### Bug Reports
+
+If you encounter a bug, please use the **Bug Report** template to submit an issue.
+* Check if the issue has already been reported.
+* Use the [Bug Report Template](.github/ISSUE_TEMPLATE/bug.md) when creating a new issue.
+* Provide a clear and concise description of the bug.
+* Include steps to reproduce the bug, along with error messages and stack traces if applicable.
+
+### Feature Requests
+
+If you have an idea for a new feature or improvement, please use the **Proposal** template.
+* Use the [Proposal Template](.github/ISSUE_TEMPLATE/proposal.md).
+* Describe the feature and its core capabilities.
+* Explain the motivation behind the proposal and the problem it solves.
+
+## Pull requests
+
+We welcome pull requests for bug fixes, new features, and documentation improvements.
+
+1. **Fork the repository** and create a new branch for your changes.
+2. **Make your changes**. Please ensure your code is clean and readable.
+3. **Run formatters**. We use `black` for code formatting. Please run it before submitting your PR.
+   ```bash
+   black .
+   ```
+   > Currently, we use black==24.3.0 for formatting. Make sure to use the same version to avoid inconsistencies.
+4. **Submit a Pull Request**.
+   * Use the [Pull Request Template](.github/PULL_REQUEST_TEMPLATE.md).
+   * Keep PRs small and focused.
+   * Include a summary of the changes and link to any relevant issues (e.g., `Fixes #123`).
+   * Ensure all checks pass.
+
+## Contribute specific robots
+
+TODO: Add instructions for contributing new robot models and their configurations.
+
+## Contribute specific environments
+
+To contribute a new environment:
+
+TODO: Add instructions for contributing new environments.
\ No newline at end of file
diff --git a/CONTRIBUTORS.md b/CONTRIBUTORS.md
new file mode 100644
index 00000000..e21edf9c
--- /dev/null
+++ b/CONTRIBUTORS.md
@@ -0,0 +1,55 @@
+# EmbodiChain Developers and Contributors
+
+This is the official list of EmbodiChain developers and contributors. We sincerely thank everyone who has contributed to the project!
+
+
+## Organizations
+
+* DexForce Technology Co., Ltd.
+* Chinese University of Hong Kong, Shenzhen
+
+---
+
+## Developers
+
+**Simulation & Infrastructure**
+* Yueci Deng
+* Jian Chen
+* Jietao Chen
+* Haonan Yuan
+
+**Data Pipeline**
+* Ziyan He
+* Sheng Xu
+* YanJun Wei
+* Yunxin Tai
+
+**Model & Training**
+* Ziyang Zhang
+* Ruixiang Wang
+* Yunxin Tai
+
+**Model Deployment**
+* Jiahao Jiang
+* Yunxin Tai
+
+
+## Contributors
+
+* Runyi Zhao
+* Lihao Zheng
+
+
+
+## Acknowledgements
+
+**Project Coordination**
+* Guiliang Liu
+
+**Support Team**
+* Mingyue Yang
+* Wenyi Huang
+* Sen Lin
+* Zheng Lu
+* Rongmao Li
+* Danwei Huang
diff --git a/README.md b/README.md
index 95314178..b4e6bc58 100644
--- a/README.md
+++ b/README.md
@@ -3,17 +3,17 @@
 ![teaser](assets/imgs/teaser.jpg)
 
 [![License](https://img.shields.io/github/license/DexForce/EmbodiChain)](LICENSE)
+[![Website](https://img.shields.io/badge/website-dexforce.com-green?logo=google-chrome&logoColor=white)](https://dexforce.com/embodichain/index.html#/)
 [![GitHub Pages](https://img.shields.io/badge/GitHub%20Pages-docs-blue?logo=github&logoColor=white)](https://dexforce.github.io/EmbodiChain/introduction.html)
 [![Python](https://img.shields.io/badge/python-3.10%20|%203.11-blue.svg)](https://docs.python.org/3/whatsnew/3.10.html)
 [![Version](https://img.shields.io/badge/version-0.0.1-blue.svg)](https://github.com/DexForce/EmbodiChain/releases)
-
 ---
 
 EmbodiChain is an end-to-end, GPU-accelerated framework for Embodied AI. It streamlines research and development by unifying high-performance simulation, real-to-sim data pipelines, modular model architectures, and efficient training workflows. This integration enables rapid experimentation, seamless deployment of intelligent agents, and effective Sim2Real transfer for real-world robotic systems.
 
 > [!NOTE]
 > EmbodiChain is in Alpha and under active development:
-> * More features will be continually added in the coming months.
+> * More features will be continually added in the coming months. You can find more details in the [roadmap](https://dexforce.github.io/EmbodiChain/resources/roadmap.html).
 > * Since this is an early release, we welcome feedback (bug reports, feature requests, etc.) via GitHub Issues.
 
 
@@ -25,6 +25,13 @@
 - ⚡ **Efficient Training & Evaluation**: Online data streaming, parallel environment rollouts, and modern training paradigms.
 - 🧩 **Modular & Extensible**: Easily integrate new robots, environments, and learning algorithms.
 
+The figure below illustrates the overall architecture of EmbodiChain:
+
+<div align="center">
+  <img src="assets/imgs/frameworks.jpg" alt="architecture">
+</div>
+
+
 
 ## Getting Started
@@ -37,7 +44,7 @@ To get started with EmbodiChain, follow these steps:
 
 ## Citation
 
-If you use EmbodiChain in your research, please cite:
+If you find EmbodiChain helpful for your research, please consider citing our work:
 
 ```bibtex
 @misc{EmbodiChain,
@@ -47,4 +54,15 @@
     year = {2025},
     url = {https://github.com/DexForce/EmbodiChain}
 }
+```
+
+```bibtex
+@misc{GS-World,
+  author = {Liu, G. and Deng, Y. and Liu, Z. and Jia, K.},
+  title = {GS-World: An Efficient, Engine-driven Learning Paradigm for Pursuing Embodied Intelligence using World
+           Models of Generative Simulation},
+  month = {October},
+  year = {2025},
+  journal = {TechRxiv}
+}
 ```
\ No newline at end of file
diff --git a/assets/imgs/frameworks.jpg b/assets/imgs/frameworks.jpg
new file mode 100644
index 00000000..7ab1667d
Binary files /dev/null and b/assets/imgs/frameworks.jpg differ
diff --git a/assets/imgs/teaser.jpg b/assets/imgs/teaser.jpg
index 59f12c3d..c5db34b0 100644
Binary files a/assets/imgs/teaser.jpg and b/assets/imgs/teaser.jpg differ
diff --git a/configs/agents/rl/push_cube/gym_config.json b/configs/agents/rl/push_cube/gym_config.json
index b78e7f37..766d12c4 100644
--- a/configs/agents/rl/push_cube/gym_config.json
+++ b/configs/agents/rl/push_cube/gym_config.json
@@ -1,129 +1,193 @@
-{
-  "id": "PushCubeRL",
-  "max_episodes": 5,
-  "env": {
-    "num_envs": 128,
-    "sim_steps_per_control": 4,
-    "events": {
-      "randomize_cube": {
-        "func": "randomize_rigid_object_pose",
-        "mode": "reset",
-        "params": {
-          "entity_cfg": {"uid": "cube"},
-          "position_range": [[-0.2, -0.2, 0.0], [0.2, 0.2, 0.0]],
-          "relative_position": true
+{
+  "id": "PushCubeRL",
+  "max_episodes": 5,
+  "env": {
+    "num_envs": 128,
+    "sim_steps_per_control": 4,
+    "events": {
+      "randomize_cube": {
+        "func": "randomize_rigid_object_pose",
+        "mode": "reset",
+        "params": {
+          "entity_cfg": {
+            "uid": "cube"
+          },
+          "position_range": [
+            [-0.2, -0.2, 0.0],
+            [0.2, 0.2, 0.0]
+          ],
+          "relative_position": true
+        }
+      },
+      "randomize_goal": {
+        "func": "randomize_target_pose",
+        "mode": "reset",
+        "params": {
+          "position_range": [
+            [-0.3, -0.3, 0.05],
+            [0.3, 0.3, 0.05]
+          ],
+          "relative_position": false,
+          "store_key": "goal_pose"
+        }
+      }
+    },
+    "observations": {
+      "robot_qpos": {
+        "func": "normalize_robot_joint_data",
+        "mode": "modify",
+        "name": "robot/qpos",
+        "params": {
+          "joint_ids": [0, 1, 2, 3, 4, 5]
+        }
+      },
+      "robot_ee_pos": {
+        "func": "get_robot_eef_pose",
+        "mode": "add",
+        "name": "robot/ee_pos",
+        "params": {
+          "part_name": "arm"
+        }
+      },
+      "cube_pos": {
+        "func": "get_rigid_object_pose",
+        "mode": "add",
+        "name": "object/cube_pos",
+        "params": {
+          "entity_cfg": {
+            "uid": "cube"
+          }
+        }
+      },
+      "goal_pos": {
+        "func": "target_position",
+        "mode": "add",
+        "name": "object/goal_pos",
+        "params": {
+          "target_pose_key": "goal_pose"
+        }
+      }
+    },
+    "rewards": {
+      "reaching_reward": {
+        "func": "reaching_behind_object",
+        "mode": "add",
+        "weight": 0.1,
+        "params": {
+          "object_cfg": {
+            "uid": "cube"
+          },
+          "target_pose_key": "goal_pose",
+          "behind_offset": 0.015,
+          "height_offset": 0.015,
+          "distance_scale": 5.0,
+          "part_name": "arm"
+        }
+      },
+      "place_reward": {
+        "func": "incremental_distance_to_target",
+        "mode": "add",
+        "weight": 1.0,
+        "params": {
+          "source_entity_cfg": {
+            "uid": "cube"
+          },
+          "target_pose_key": "goal_pose",
+          "tanh_scale": 10.0,
+          "positive_weight": 2.0,
+          "negative_weight": 0.5,
+          "use_xy_only": true
+        }
+      },
+      "action_penalty": {
+        "func": "action_smoothness_penalty",
+ "mode": "add", + "weight": 0.01, + "params": {} + }, + "success_bonus": { + "func": "success_reward", + "mode": "add", + "weight": 10.0, + "params": {} + } + }, + "extensions": { + "obs_mode": "state", + "episode_length": 100, + "joint_limits": 0.5, + "action_scale": 0.1, + "success_threshold": 0.1 } - } }, - "observations": {}, - "extensions": { - "obs_mode": "state", - "episode_length": 100, - "joint_limits": 0.5, - "action_scale": 0.1, - "success_threshold": 0.1, - "reaching_reward_weight": 0.1, - "place_reward_weight": 2.0, - "place_penalty_weight": 0.5, - "action_penalty_weight": 0.01, - "success_bonus_weight": 10.0 - } - }, - "robot": { - "uid": "Manipulator", - "urdf_cfg": { - "components": [ - { - "component_type": "arm", - "urdf_path": "UniversalRobots/UR10/UR10.urdf", - "transform": [ - [1.0, 0.0, 0.0, 0.0], - [0.0, 1.0, 0.0, 0.0], - [0.0, 0.0, 1.0, 0.0], - [0.0, 0.0, 0.0, 1.0] - ] + "robot": { + "uid": "Manipulator", + "urdf_cfg": { + "components": [ + { + "component_type": "arm", + "urdf_path": "UniversalRobots/UR10/UR10.urdf" + }, + { + "component_type": "hand", + "urdf_path": "DH_PGI_140_80/DH_PGI_140_80.urdf" + } + ] }, - { - "component_type": "hand", - "urdf_path": "DH_PGI_140_80/DH_PGI_140_80.urdf", - "transform": [ - [1.0, 0.0, 0.0, 0.0], - [0.0, 1.0, 0.0, 0.0], - [0.0, 0.0, 1.0, 0.0], - [0.0, 0.0, 0.0, 1.0] - ] + "init_pos": [0.0, 0.0, 0.0], + "init_rot": [0.0, 0.0, 0.0], + "init_qpos": [0.0, -1.57, 1.57, -1.57, -1.57, 0.0, 0.04, 0.04], + "drive_pros": { + "drive_type": "force", + "stiffness": 100000.0, + "damping": 1000.0, + "max_velocity": 2.0, + "max_effort": 500.0 + }, + "solver_cfg": { + "arm": { + "class_type": "PytorchSolver", + "end_link_name": "ee_link", + "root_link_name": "base_link", + "tcp": [ + [1.0, 0.0, 0.0, 0.0], + [0.0, 1.0, 0.0, 0.0], + [0.0, 0.0, 1.0, 0.16], + [0.0, 0.0, 0.0, 1.0] + ] + } + }, + "control_parts": { + "arm": ["JOINT[1-6]"] } - ] - }, - "init_pos": [0.0, 0.0, 0.0], - "init_rot": [0.0, 0.0, 0.0], - "init_qpos": [0.0, -1.57, 1.57, -1.57, -1.57, 0.0, 0.04, 0.04], - "drive_pros": { - "drive_type": "force", - "stiffness": 100000.0, - "damping": 1000.0, - "max_velocity": 2.0, - "max_effort": 500.0 - }, - "solver_cfg": { - "arm": { - "class_type": "PytorchSolver", - "end_link_name": "ee_link", - "root_link_name": "base_link", - "tcp": [ - [1.0, 0.0, 0.0, 0.0], - [0.0, 1.0, 0.0, 0.0], - [0.0, 0.0, 1.0, 0.16], - [0.0, 0.0, 0.0, 1.0] - ] - } }, - "control_parts": { - "arm": ["JOINT[1-6]"] - } - }, - "sensor": [], - "light": { - }, - "background": [ - { - "uid": "goal_sphere", - "shape": { - "shape_type": "Sphere", - "radius": 0.02 - }, - "body_type": "kinematic", - "init_pos": [-0.9, -0.6, 0.05], - "attrs": { - "enable_collision": false, - "mass": 0.0 - } - } - ], - "rigid_object": [ - { - "uid": "cube", - "shape": { - "shape_type": "Cube", - "size": [0.1, 0.1, 0.1] - }, - "body_type": "dynamic", - "init_pos": [-0.6, -0.4, 0.05], - "attrs": { - "mass": 10.0, - "static_friction": 3.0, - "dynamic_friction": 2.0, - "linear_damping": 2.0, - "angular_damping": 2.0, - "contact_offset": 0.003, - "rest_offset": 0.001, - "restitution": 0.1, - "max_depenetration_velocity": 10.0, - "max_linear_velocity": 1.0, - "max_angular_velocity": 1.0 - } - } - ], - "rigid_object_group": [], - "articulation": [] + "sensor": [], + "light": {}, + "background": [], + "rigid_object": [ + { + "uid": "cube", + "shape": { + "shape_type": "Cube", + "size": [0.1, 0.1, 0.1] + }, + "body_type": "dynamic", + "init_pos": [-0.6, -0.4, 0.05], + "attrs": { + "mass": 10.0, + 
"static_friction": 3.0, + "dynamic_friction": 2.0, + "linear_damping": 2.0, + "angular_damping": 2.0, + "contact_offset": 0.003, + "rest_offset": 0.001, + "restitution": 0.1, + "max_depenetration_velocity": 10.0, + "max_linear_velocity": 1.0, + "max_angular_velocity": 1.0 + } + } + ], + "rigid_object_group": [], + "articulation": [] } diff --git a/configs/agents/rl/push_cube/train_config.json b/configs/agents/rl/push_cube/train_config.json index f1558fda..3c3bda05 100644 --- a/configs/agents/rl/push_cube/train_config.json +++ b/configs/agents/rl/push_cube/train_config.json @@ -1,64 +1,67 @@ -{ - "trainer": { - "exp_name": "push_cube_ppo", - "gym_config": "configs/agents/rl/push_cube/gym_config.json", - "seed": 42, - "device": "cuda:0", - "headless": true, - "iterations": 1000, - "rollout_steps": 1024, - "eval_freq": 2, - "save_freq": 200, - "use_wandb": false, - "wandb_project_name": "embodychain-push_cube", - "events": { - "eval": { - "record_camera": { - "func": "record_camera_data_async", - "mode": "interval", - "interval_step": 1, - "params": { - "name": "main_cam", - "resolution": [640, 480], - "eye": [-1.4, 1.4, 2.0], - "target": [0, 0, 0], - "up": [0, 0, 1], - "intrinsics": [600, 600, 320, 240], - "save_path": "./outputs/videos/eval" - } +{ + "trainer": { + "exp_name": "push_cube_ppo", + "gym_config": "configs/agents/rl/push_cube/gym_config.json", + "seed": 42, + "device": "cuda:0", + "headless": false, + "enable_rt": false, + "gpu_id": 0, + "num_envs": 8, + "iterations": 1000, + "rollout_steps": 1024, + "eval_freq": 200, + "save_freq": 200, + "use_wandb": true, + "wandb_project_name": "embodychain-push_cube", + "events": { + "eval": { + "record_camera": { + "func": "record_camera_data_async", + "mode": "interval", + "interval_step": 1, + "params": { + "name": "main_cam", + "resolution": [640, 480], + "eye": [-1.4, 1.4, 2.0], + "target": [0, 0, 0], + "up": [0, 0, 1], + "intrinsics": [600, 600, 320, 240], + "save_path": "./outputs/videos/eval" + } + } + } } - } - } - }, - "policy": { - "name": "actor_critic", - "actor": { - "type": "mlp", - "network_cfg": { - "hidden_sizes": [256, 256], - "activation": "relu" - } }, - "critic": { - "type": "mlp", - "network_cfg": { - "hidden_sizes": [256, 256], - "activation": "relu" - } - } - }, - "algorithm": { - "name": "ppo", - "cfg": { - "learning_rate": 0.0001, - "n_epochs": 10, - "batch_size": 8192, - "gamma": 0.99, - "gae_lambda": 0.95, - "clip_coef": 0.2, - "ent_coef": 0.01, - "vf_coef": 0.5, - "max_grad_norm": 0.5 + "policy": { + "name": "actor_critic", + "actor": { + "type": "mlp", + "network_cfg": { + "hidden_sizes": [256, 256], + "activation": "relu" + } + }, + "critic": { + "type": "mlp", + "network_cfg": { + "hidden_sizes": [256, 256], + "activation": "relu" + } + } + }, + "algorithm": { + "name": "ppo", + "cfg": { + "learning_rate": 0.0001, + "n_epochs": 10, + "batch_size": 8192, + "gamma": 0.99, + "gae_lambda": 0.95, + "clip_coef": 0.2, + "ent_coef": 0.01, + "vf_coef": 0.5, + "max_grad_norm": 0.5 + } } - } -} \ No newline at end of file +} diff --git a/configs/agents/rl/push_t/gym_config.json b/configs/agents/rl/push_t/gym_config.json new file mode 100644 index 00000000..7bb1216d --- /dev/null +++ b/configs/agents/rl/push_t/gym_config.json @@ -0,0 +1,193 @@ +{ + "id": "PushTRL", + "max_episodes": 5, + "env": { + "num_envs": 128, + "sim_steps_per_control": 4, + "events": { + "randomize_t": { + "func": "randomize_rigid_object_pose", + "mode": "reset", + "params": { + "entity_cfg": { + "uid": "t" + }, + "position_range": [ + [-0.2, 
-0.2, 0.0], + [0.2, 0.2, 0.0] + ], + "relative_position": true + } + }, + "randomize_goal": { + "func": "randomize_target_pose", + "mode": "reset", + "params": { + "position_range": [ + [-0.3, -0.3, 0.05], + [0.3, 0.3, 0.05] + ], + "relative_position": false, + "store_key": "goal_pose" + } + } + }, + "observations": { + "robot_qpos": { + "func": "normalize_robot_joint_data", + "mode": "modify", + "name": "robot/qpos", + "params": { + "joint_ids": [0, 1, 2, 3, 4, 5] + } + }, + "robot_ee_pos": { + "func": "get_robot_eef_pose", + "mode": "add", + "name": "robot/ee_pos", + "params": { + "part_name": "arm" + } + }, + "t_pos": { + "func": "get_rigid_object_pose", + "mode": "add", + "name": "object/t_pos", + "params": { + "entity_cfg": { + "uid": "t" + } + } + }, + "goal_pos": { + "func": "target_position", + "mode": "add", + "name": "object/goal_pos", + "params": { + "target_pose_key": "goal_pose" + } + } + }, + "rewards": { + "reaching_reward": { + "func": "reaching_behind_object", + "mode": "add", + "weight": 0.1, + "params": { + "object_cfg": { + "uid": "t" + }, + "target_pose_key": "goal_pose", + "behind_offset": 0.015, + "height_offset": 0.015, + "distance_scale": 5.0, + "part_name": "arm" + } + }, + "place_reward": { + "func": "incremental_distance_to_target", + "mode": "add", + "weight": 1.0, + "params": { + "source_entity_cfg": { + "uid": "t" + }, + "target_pose_key": "goal_pose", + "tanh_scale": 10.0, + "positive_weight": 2.0, + "negative_weight": 0.5, + "use_xy_only": true + } + }, + "action_penalty": { + "func": "action_smoothness_penalty", + "mode": "add", + "weight": 0.01, + "params": {} + }, + "success_bonus": { + "func": "success_reward", + "mode": "add", + "weight": 10.0, + "params": {} + } + }, + "extensions": { + "obs_mode": "state", + "episode_length": 100, + "joint_limits": 0.5, + "action_scale": 0.1, + "success_threshold": 0.1 + } + }, + "robot": { + "uid": "Manipulator", + "urdf_cfg": { + "components": [ + { + "component_type": "arm", + "urdf_path": "UniversalRobots/UR10/UR10.urdf" + }, + { + "component_type": "hand", + "urdf_path": "DH_PGI_140_80/DH_PGI_140_80.urdf" + } + ] + }, + "init_pos": [0.0, 0.0, 0.0], + "init_rot": [0.0, 0.0, 0.0], + "init_qpos": [0.0, -1.57, 1.57, -1.57, -1.57, 0.0, 0.04, 0.04], + "drive_pros": { + "drive_type": "force", + "stiffness": 100000.0, + "damping": 1000.0, + "max_velocity": 2.0, + "max_effort": 500.0 + }, + "solver_cfg": { + "arm": { + "class_type": "PytorchSolver", + "end_link_name": "ee_link", + "root_link_name": "base_link", + "tcp": [ + [1.0, 0.0, 0.0, 0.0], + [0.0, 1.0, 0.0, 0.0], + [0.0, 0.0, 1.0, 0.16], + [0.0, 0.0, 0.0, 1.0] + ] + } + }, + "control_parts": { + "arm": ["JOINT[1-6]"] + } + }, + "sensor": [], + "light": {}, + "background": [], + "rigid_object": [ + { + "uid": "t", + "shape": { + "shape_type": "Cube", + "size": [0.1, 0.1, 0.1] + }, + "body_type": "dynamic", + "init_pos": [-0.6, -0.4, 0.05], + "attrs": { + "mass": 10.0, + "static_friction": 3.0, + "dynamic_friction": 2.0, + "linear_damping": 2.0, + "angular_damping": 2.0, + "contact_offset": 0.003, + "rest_offset": 0.001, + "restitution": 0.1, + "max_depenetration_velocity": 10.0, + "max_linear_velocity": 1.0, + "max_angular_velocity": 1.0 + } + } + ], + "rigid_object_group": [], + "articulation": [] +} diff --git a/configs/agents/rl/push_t/train_config.json b/configs/agents/rl/push_t/train_config.json new file mode 100644 index 00000000..98d4aac2 --- /dev/null +++ b/configs/agents/rl/push_t/train_config.json @@ -0,0 +1,67 @@ +{ + "trainer": { + "exp_name": 
"push_t_ppo", + "gym_config": "configs/agents/rl/push_t/gym_config.json", + "seed": 42, + "device": "cuda:0", + "headless": true, + "enable_rt": false, + "gpu_id": 0, + "num_envs": 8, + "iterations": 1000, + "rollout_steps": 1024, + "eval_freq": 200, + "save_freq": 200, + "use_wandb": true, + "wandb_project_name": "embodychain-push_t", + "events": { + "eval": { + "record_camera": { + "func": "record_camera_data_async", + "mode": "interval", + "interval_step": 1, + "params": { + "name": "main_cam", + "resolution": [640, 480], + "eye": [-1.4, 1.4, 2.0], + "target": [0, 0, 0], + "up": [0, 0, 1], + "intrinsics": [600, 600, 320, 240], + "save_path": "./outputs/videos/eval" + } + } + } + } + }, + "policy": { + "name": "actor_critic", + "actor": { + "type": "mlp", + "network_cfg": { + "hidden_sizes": [256, 256], + "activation": "relu" + } + }, + "critic": { + "type": "mlp", + "network_cfg": { + "hidden_sizes": [256, 256], + "activation": "relu" + } + } + }, + "algorithm": { + "name": "ppo", + "cfg": { + "learning_rate": 0.0001, + "n_epochs": 10, + "batch_size": 8192, + "gamma": 0.99, + "gae_lambda": 0.95, + "clip_coef": 0.2, + "ent_coef": 0.01, + "vf_coef": 0.5, + "max_grad_norm": 0.5 + } + } +} diff --git a/docs/source/features/interaction/index.rst b/docs/source/features/interaction/index.rst new file mode 100644 index 00000000..7640c1f1 --- /dev/null +++ b/docs/source/features/interaction/index.rst @@ -0,0 +1,10 @@ +Interactive Simulation +====================== + +The Interactive Simulation module provides tools and interfaces for interacting with the simulation environment, including window management, input handling, and real-time control of simulated assets. + +.. toctree:: + :maxdepth: 2 + + Window interaction + \ No newline at end of file diff --git a/docs/source/features/interaction/window.md b/docs/source/features/interaction/window.md new file mode 100644 index 00000000..6c512186 --- /dev/null +++ b/docs/source/features/interaction/window.md @@ -0,0 +1,47 @@ +# Window interaction + +This section describes the default window interaction controls available in the simulation. These controls allow users to interact with the simulation environment using keyboard, mouse, and customizable input events. + +## Default Window Events + +The simulation window comes with a set of default controls that enable users to perform various actions, such as selecting objects, manipulating the camera view, and triggering specific events. These controls are implemented using the `ObjectManipulator` class (provided by `dexsim`). + +| Events | Description | +|---------------------------------|--------------------------------------------------------------------------------------------------------------------------------------------------| +| **Raycast Information Display** | Press the right mouse button to select a point and the 'C' key to print the raycast distance and hit position of a surface (world coordinates) to the console. Useful for debugging and checking the position of objects in the simulation. | + +> **Note:** We will add more interaction features in future releases. Stay tuned for updates! + +## Customizing Window Events + +Users can create their own custom window interaction controls by subclassing the `ObjectManipulator` class. This allows for the implementation of specific behaviors and responses to user inputs. 
+
+Here's an example of how to create a custom window event that responds to key presses:
+
+```python
+from dexsim.engine import ObjectManipulator
+from dexsim.types import InputKey
+
+class CustomWindowEvent(ObjectManipulator):
+    def on_key_down(self, key):
+        if key == InputKey.SPACE.value:
+            print("Space key pressed!")
+
+
+# Assuming you already have a SimulationManager instance called `sim_manager`
+# (for example, created elsewhere in your code):
+# sim_manager = SimulationManager(...)
+
+# Register the custom window event handler with the simulation:
+sim_manager.add_custom_window_control(CustomWindowEvent())
+```
+
+The table below summarizes the key methods available in the `ObjectManipulator` class for customizing window events:
+
+| Method | Description |
+|----------------------|---------------------------------------------------------------------------------------------------|
+| `on_key_down(key)` | Triggered when a key is pressed down. The `key` parameter indicates which key was pressed. |
+| `on_key_up(key)` | Triggered when a key is released. The `key` parameter indicates which key was released. |
+| `on_mouse_moved(x, y)`| Triggered when the mouse is moved. The `x` and `y` parameters indicate the new mouse position. |
+| `on_mouse_down(button, x, y)` | Triggered when a mouse button is pressed. The `button` parameter indicates which button was pressed, and `x`, `y` indicate the mouse position. |
+| `on_mouse_up(button, x, y)` | Triggered when a mouse button is released. The `button` parameter indicates which button was released, and `x`, `y` indicate the mouse position. |
+| `on_mouse_wheel(delta)` | Triggered when the mouse wheel is scrolled. The `delta` parameter indicates the amount of scroll. |
diff --git a/docs/source/index.rst b/docs/source/index.rst
index 45814c42..c3ef7343 100644
--- a/docs/source/index.rst
+++ b/docs/source/index.rst
@@ -29,7 +29,6 @@ Table of Contents
 
    overview/sim/index
    overview/gym/index
-   overview/vla/index
    overview/rl/index
 
 .. toctree::
@@ -38,6 +37,7 @@ Table of Contents
    :glob:
 
    features/workspace_analyzer/index*
+   features/interaction/index*
 
 .. toctree::
    :maxdepth: 1
diff --git a/docs/source/introduction.rst b/docs/source/introduction.rst
index 07807094..8631a708 100644
--- a/docs/source/introduction.rst
+++ b/docs/source/introduction.rst
@@ -9,8 +9,6 @@ EmbodiChain
 .. image:: ../../assets/imgs/teaser.jpg
    :alt: teaser
 
-📘 `Documentation <https://dexforce.github.io/EmbodiChain/introduction.html>`_
-
 ---
 
 EmbodiChain is an end-to-end, GPU-accelerated framework for Embodied AI. It streamlines research and development by unifying high-performance simulation, real-to-sim data pipelines, modular model architectures, and efficient training workflows. This integration enables rapid experimentation, seamless deployment of intelligent agents, and effective Sim2Real transfer for real-world robotic systems.
@@ -18,7 +16,7 @@ EmbodiChain is an end-to-end, GPU-accelerated framework for Embodied AI. It stre
 .. NOTE::
    EmbodiChain is in Alpha and under active development:
 
-   * More features will be continually added in the coming months.
+   * More features will be continually added in the coming months. You can find more details in the `roadmap <https://dexforce.github.io/EmbodiChain/resources/roadmap.html>`_.
    * Since this is an early release, we welcome feedback (bug reports, feature requests, etc.) via GitHub Issues.
 
 
@@ -31,6 +29,11 @@ Key Features
 * ⚡ **Efficient Training & Evaluation**: Online data streaming, parallel environment rollouts, and modern training paradigms.
 * 🧩 **Modular & Extensible**: Easily integrate new robots, environments, and learning algorithms.
+The figure below illustrates the overall architecture of EmbodiChain:
+
+.. image:: ../../assets/imgs/frameworks.jpg
+   :alt: frameworks
+
 
 Getting Started
 ---------------
@@ -44,7 +47,7 @@ To get started with EmbodiChain, follow these steps:
 
 Citation
 --------
 
-If you use EmbodiChain in your research, please cite:
+If you find EmbodiChain helpful for your research, please consider citing our work:
 
 .. code-block:: bibtex
 
@@ -55,3 +58,15 @@ If you use EmbodiChain in your research, please cite:
        year = {2025},
        url = {https://github.com/DexForce/EmbodiChain}
    }
+
+.. code-block:: bibtex
+
+   @misc{GS-World,
+     author = {Liu, G. and Deng, Y. and Liu, Z. and Jia, K.},
+     title = {GS-World: An Efficient, Engine-driven Learning Paradigm for Pursuing Embodied Intelligence using World
+              Models of Generative Simulation},
+     month = {October},
+     year = {2025},
+     journal = {TechRxiv}
+   }
+
diff --git a/docs/source/overview/gym/env.md b/docs/source/overview/gym/env.md
new file mode 100644
index 00000000..a15d26f9
--- /dev/null
+++ b/docs/source/overview/gym/env.md
@@ -0,0 +1,222 @@
+# Embodied Environments
+
+```{currentmodule} embodichain.lab.gym
+```
+
+The {class}`~envs.EmbodiedEnv` is the core environment class in EmbodiChain designed for complex Embodied AI tasks. It adopts a **configuration-driven** architecture, allowing users to define robots, sensors, objects, lighting, and automated behaviors (events) purely through configuration classes, minimizing the need for boilerplate code.
+
+## Core Architecture
+
+Unlike the standard {class}`~envs.BaseEnv`, the {class}`~envs.EmbodiedEnv` integrates several manager systems to handle the complexity of simulation:
+
+* **Scene Management**: Automatically loads and manages robots, sensors, and scene objects defined in the configuration.
+* **Event Manager**: Handles automated behaviors such as domain randomization, scene setup, and dynamic asset swapping.
+* **Observation Manager**: Allows flexible extension of observation spaces without modifying the environment code.
+* **Dataset Manager**: Built-in support for collecting demonstration data during simulation steps.
+
+## Configuration System
+
+The environment is defined by inheriting from {class}`~envs.EmbodiedEnvCfg`. This configuration class serves as the single source of truth for the scene description.
+
+{class}`~envs.EmbodiedEnvCfg` inherits from {class}`~envs.EnvCfg` (the base environment configuration class, sometimes referred to as `BaseEnvCfg`), which provides fundamental environment parameters. The following sections describe both the base class parameters and the additional parameters specific to {class}`~envs.EmbodiedEnvCfg`.
+
+### BaseEnvCfg Parameters
+
+Since {class}`~envs.EmbodiedEnvCfg` inherits from {class}`~envs.EnvCfg`, it includes the following base parameters:
+
+* **num_envs** (int):
+  The number of sub-environments (arenas) to be simulated in parallel. Defaults to ``1``.
+
+* **sim_cfg** ({class}`~embodichain.lab.sim.SimulationManagerCfg`):
+  Simulation configuration for the environment, including physics settings, device selection, and rendering options. Defaults to a basic configuration with headless mode enabled.
+
+* **seed** (int | None):
+  The seed for the random number generator. Defaults to ``None``, in which case the seed is not set. The seed is set at the beginning of the environment initialization to ensure deterministic behavior across different runs.
+
+* **sim_steps_per_control** (int):
+  Number of simulation steps per control (environment) step.
This parameter determines the relationship between the simulation timestep and the control timestep. For instance, if the simulation dt is 0.01s and the control dt is 0.1s, then ``sim_steps_per_control`` should be 10. This means that the control action is updated every 10 simulation steps. Defaults to ``4``. + +* **ignore_terminations** (bool): + Whether to ignore terminations when deciding when to auto reset. Terminations can be caused by the task reaching a success or fail state as defined in a task's evaluation function. If set to ``False``, episodes will stop early when termination conditions are met. If set to ``True``, episodes will only stop due to the timelimit, which is useful for modeling tasks as infinite horizon. Defaults to ``False``. + +### EmbodiedEnvCfg Parameters + +The {class}`~envs.EmbodiedEnvCfg` class exposes the following additional parameters: + +* **robot** ({class}`~embodichain.lab.sim.cfg.RobotCfg`): + Defines the agent in the scene. Supports loading robots from URDF/MJCF with specified initial state and control mode. This is a required field. + +* **sensor** (List[{class}`~embodichain.lab.sim.sensor.SensorCfg`]): + A list of sensors attached to the scene or robot. Common sensors include {class}`~embodichain.lab.sim.sensors.StereoCamera` for RGB-D and segmentation data. Defaults to an empty list. + +* **light** ({class}`~envs.EmbodiedEnvCfg.EnvLightCfg`): + Configures the lighting environment. The {class}`EnvLightCfg` class contains: + + * ``direct``: List of direct light sources (Point, Spot, Directional) affecting local illumination. Defaults to an empty list. + * ``indirect``: Global illumination settings (Ambient, IBL) - *planned for future release*. + +* **rigid_object** (List[{class}`~embodichain.lab.sim.cfg.RigidObjectCfg`]): + List of dynamic or kinematic simple bodies. Defaults to an empty list. + +* **rigid_object_group** (List[{class}`~embodichain.lab.sim.cfg.RigidObjectGroupCfg`]): + Collections of rigid objects that can be managed together. Efficient for many similar objects. Defaults to an empty list. + +* **articulation** (List[{class}`~embodichain.lab.sim.cfg.ArticulationCfg`]): + List of complex mechanisms with joints (doors, drawers). Defaults to an empty list. + +* **background** (List[{class}`~embodichain.lab.sim.cfg.RigidObjectCfg`]): + Static or kinematic objects serving as obstacles or landmarks in the scene. Defaults to an empty list. + +* **events** (Union[object, None]): + Event settings for domain randomization and automated behaviors. Defaults to None, in which case no events are applied through the event manager. Please refer to the {class}`~envs.managers.EventManager` class for more details. + +* **observations** (Union[object, None]): + Custom observation specifications. Defaults to None, in which case no additional observations are applied through the observation manager. Please refer to the {class}`~envs.managers.ObservationManager` class for more details. + +* **dataset** (Union[object, None]): + Dataset collection settings. Defaults to None, in which case no dataset collection is performed. Please refer to the {class}`~envs.managers.DatasetManager` class for more details. + +* **extensions** (Union[Dict[str, Any], None]): + Task-specific extension parameters that are automatically bound to the environment instance. This allows passing custom parameters (e.g., ``episode_length``, ``obs_mode``, ``action_scale``) without modifying the base configuration class. 
These parameters are accessible as instance attributes after environment initialization. For example, if ``extensions = {"episode_length": 500}``, you can access it via ``self.episode_length``. Defaults to None. + +* **filter_visual_rand** (bool): + Whether to filter out visual randomization functors. Useful for debugging motion and physics issues when visual randomization interferes with the debugging process. Defaults to ``False``. + +### Example Configuration + +```python +from embodichain.lab.gym.envs import EmbodiedEnv, EmbodiedEnvCfg +from embodichain.utils import configclass + +@configclass +class MyTaskEnvCfg(EmbodiedEnvCfg): + # 1. Define Scene Components + robot = ... # Robot configuration + sensor = [...] # List of sensors (e.g., Cameras) + light = ... # Lighting configuration + + # 2. Define Objects + rigid_object = [...] # Dynamic objects (e.g., tools, debris) + rigid_object_group = [...] # Object groups (efficient for many similar objects) + articulation = [...] # Articulated objects (e.g., cabinets) + + # 3. Define Managers + events = ... # Event settings (Randomization, etc.) + observations = ... # Custom observation spec + dataset = ... # Data collection settings + + # 4. Task Extensions + extensions = { # Task-specific parameters + "episode_length": 500, + "obs_mode": "state", + } +``` + +## Manager Systems + +The manager systems in {class}`~envs.EmbodiedEnv` provide modular, configuration-driven functionality for handling complex simulation behaviors. Each manager uses a **functor-based** architecture, allowing you to compose behaviors through configuration without modifying environment code. Functors are reusable functions or classes (inheriting from {class}`~envs.managers.Functor`) that operate on the environment state, configured through {class}`~envs.managers.cfg.FunctorCfg`. + +### Event Manager + +The Event Manager automates changes in the environment through event functors. Events can be triggered at different stages: + +* **startup**: Executed once when the environment initializes. Useful for setting up initial scene properties that don't change during episodes. +* **reset**: Executed every time {meth}`~envs.Env.reset()` is called. Applied to specific environments that need resetting (via ``env_ids`` parameter). This is the most common mode for domain randomization. +* **interval**: Executed periodically every N steps (specified by ``interval_step``, defaults to 10). Can be configured per-environment (``is_global=False``) or globally synchronized (``is_global=True``). + +Event functors are configured using {class}`~envs.managers.cfg.EventCfg`. For a complete list of available event functors, please refer to {doc}`event_functors`. + +### Observation Manager + +While {class}`~envs.EmbodiedEnv` provides default observations organized into two groups: + +* **robot**: Contains ``qpos`` (joint positions), ``qvel`` (joint velocities), and ``qf`` (joint forces). +* **sensor**: Contains raw sensor outputs (images, depth, segmentation masks, etc.). + +The Observation Manager allows you to extend the observation space with task-specific information. Observations are configured using {class}`~envs.managers.cfg.ObservationCfg` with two operation modes: + +* **modify**: Update existing observations in-place. The observation must already exist in the observation dictionary. Useful for normalization, transformation, or filtering existing data. Example: Normalize joint positions to [0, 1] range based on joint limits. 
+* **add**: Compute and add new observations to the observation space. The observation name can use hierarchical keys separated by ``/`` (e.g., ``"object/fork/pose"``). + +For a complete list of available observation functors, please refer to {doc}`observation_functors`. + +### Dataset Manager + +For Imitation Learning (IL) tasks, the Dataset Manager automates data collection through dataset functors. It currently supports: + +* **LeRobot Format** (via {class}`~envs.managers.datasets.LeRobotRecorder`): + Standard format for LeRobot training pipelines. Includes support for task instructions, robot metadata, success flags, and optional video recording. + +```{note} +Additional dataset formats (HDF5, Zarr) are planned for future releases. +``` + +The manager operates in a single mode ``"save"`` which handles both recording and auto-saving: + +* **Recording**: On each environment step, observation-action pairs are buffered in memory. +* **Auto-saving**: When ``dones=True`` (episode completion), completed episodes are automatically saved to disk with proper formatting. + +**Configuration options include:** + * ``save_path``: Root directory for saving datasets. + * ``robot_meta``: Robot metadata dictionary (required for LeRobot format). + * ``instruction``: Task instruction dictionary. + * ``use_videos``: Whether to save video recordings of episodes. + * ``export_success_only``: Filter to save only successful episodes (based on ``info["success"]``). + +The dataset manager is called automatically during {meth}`~envs.Env.step()`, ensuring all observation-action pairs are recorded without additional user code. + +## Creating a Custom Task + +To create a new task, inherit from {class}`~envs.EmbodiedEnv` and implement the task-specific logic. + +```python +from embodichain.lab.gym.envs import EmbodiedEnv, EmbodiedEnvCfg +from embodichain.lab.gym.utils.registration import register_env + +@register_env("MyTask-v0", max_episode_steps=500) +class MyTaskEnv(EmbodiedEnv): + def __init__(self, cfg: MyTaskEnvCfg, **kwargs): + super().__init__(cfg, **kwargs) + + def create_demo_action_list(self, *args, **kwargs): + # Optional: Implement for expert demonstration data generation (for Imitation Learning) + # This method is used to generate scripted demonstrations for IL data collection. + # Must set self.action_length = len(action_list) if returning actions + pass + + def is_task_success(self, **kwargs): + # Optional: Define success criteria (mainly for IL data collection) + # Returns: torch.Tensor of shape (num_envs,) with boolean values + return success_tensor + + def get_reward(self, obs, action, info): + # Optional: Override for RL tasks + # Returns: torch.Tensor of shape (num_envs,) + return super().get_reward(obs, action, info) + + def get_info(self, **kwargs): + # Optional: Override to add custom info fields + # Should include "success" and "fail" keys for termination + info = super().get_info(**kwargs) + info["custom_metric"] = ... + return info +``` + +```{note} +The {meth}`~envs.EmbodiedEnv.create_demo_action_list` method is specifically designed for expert demonstration data generation in Imitation Learning scenarios. For Reinforcement Learning tasks, you should override the {meth}`~envs.EmbodiedEnv.get_reward` method instead. +``` + +For a complete example of a modular environment setup, please refer to the {ref}`tutorial_modular_env` tutorial. 
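+
+As a quick usage sketch, the registered task above could then be created through Gymnasium. This is hypothetical: it assumes the `register_env` decorator exposes the task through the Gymnasium registry and that keyword arguments such as ``cfg`` are forwarded to the environment constructor (`"MyTask-v0"` and `MyTaskEnvCfg` are the example names defined above):
+
+```python
+import gymnasium as gym
+
+# Hypothetical: assumes register_env("MyTask-v0", ...) registered the task
+# with Gymnasium and that `cfg` is forwarded to MyTaskEnv.__init__.
+env = gym.make("MyTask-v0", cfg=MyTaskEnvCfg())
+
+obs, info = env.reset(seed=42)
+for _ in range(10):
+    action = env.action_space.sample()  # replace with a trained policy
+    obs, reward, terminated, truncated, info = env.step(action)
+env.close()
+```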
+ +## See Also + +- {ref}`tutorial_create_basic_env` - Creating basic environments +- {ref}`tutorial_modular_env` - Advanced modular environment setup +- {doc}`/api_reference/embodichain/embodichain.lab.gym.envs` - Complete API reference for EmbodiedEnv and EmbodiedEnvCfg + +```{toctree} +:maxdepth: 1 + +event_functors.md +observation_functors.md +``` diff --git a/docs/source/overview/gym/event_functors.md b/docs/source/overview/gym/event_functors.md new file mode 100644 index 00000000..efd8eb3d --- /dev/null +++ b/docs/source/overview/gym/event_functors.md @@ -0,0 +1,85 @@ +# Event Functors + +```{currentmodule} embodichain.lab.gym.envs.managers +``` + +This page lists all available event functors that can be used with the Event Manager. Event functors are configured using {class}`~envs.managers.cfg.EventCfg` and can be triggered at different stages: ``startup``, ``reset``, or ``interval``. + +## Physics Randomization + +```{list-table} Physics Randomization Functors +:header-rows: 1 +:widths: 30 70 + +* - Functor Name + - Description +* - ``randomize_rigid_object_mass`` + - Randomize object masses within a specified range. Supports both absolute and relative mass randomization. +``` + +## Visual Randomization + +```{list-table} Visual Randomization Functors +:header-rows: 1 +:widths: 30 70 + +* - Functor Name + - Description +* - ``randomize_visual_material`` + - Randomize textures, base colors, and material properties (metallic, roughness, IOR). Implemented as a Functor class. Supports both RigidObject and Articulation assets. +* - ``randomize_light`` + - Vary light position, color, and intensity within specified ranges. +* - ``randomize_camera_extrinsics`` + - Randomize camera poses for viewpoint diversity. Supports both attach mode (pos/euler perturbation) and look_at mode (eye/target/up perturbation). +* - ``randomize_camera_intrinsics`` + - Vary focal length (fx, fy) and principal point (cx, cy) within specified ranges. +``` + +## Spatial Randomization + +```{list-table} Spatial Randomization Functors +:header-rows: 1 +:widths: 30 70 + +* - Functor Name + - Description +* - ``randomize_rigid_object_pose`` + - Randomize object positions and orientations. Supports both relative and absolute pose randomization. +* - ``randomize_robot_eef_pose`` + - Vary end-effector initial poses by solving inverse kinematics. The randomization is performed relative to the current end-effector pose. +* - ``randomize_robot_qpos`` + - Randomize robot joint configurations. Supports both relative and absolute joint position randomization, and can target specific joints. +``` + +## Asset Management + +```{list-table} Asset Management Functors +:header-rows: 1 +:widths: 30 70 + +* - Functor Name + - Description +* - ``replace_assets_from_group`` + - Swap object models from a folder on reset for visual diversity. Currently supports RigidObject assets with mesh-based shapes. +* - ``prepare_extra_attr`` + - Set up additional object attributes dynamically. Supports both static values and callable functions. Useful for setting up affordance data and other custom attributes. 
+``` + +## Usage Example + +```python +from embodichain.lab.gym.envs.managers.cfg import EventCfg, SceneEntityCfg + +# Example: Randomize object mass on reset +events = { + "randomize_mass": EventCfg( + func="randomize_rigid_object_mass", + mode="reset", + params={ + "entity_cfg": SceneEntityCfg(uid="cube"), + "mass_range": (0.1, 2.0), + "relative": False, + }, + ), +} +``` diff --git a/docs/source/overview/gym/index.rst b/docs/source/overview/gym/index.rst index b0fe003d..e72a20fb 100644 --- a/docs/source/overview/gym/index.rst +++ b/docs/source/overview/gym/index.rst @@ -1,6 +1,26 @@ -Embodied Environments -================== +Gym +=================== -*To be completed by adding a detailed description of the Embodied Environments.* +.. currentmodule:: embodichain.lab.gym +The ``gym`` module provides a comprehensive framework for creating robot learning environments. It extends the Gymnasium interface to support multi-environment parallel execution, custom observations, and robotic-specific functionality. +Environment Classes +------------------- + +Base Environments +~~~~~~~~~~~~~~~~~ + +- :class:`envs.BaseEnv` - Foundational environment class that provides core functionality for all EmbodiChain RL environments +- :class:`envs.EnvCfg` - Configuration class for basic environment settings + +Embodied Environments +~~~~~~~~~~~~~~~~~~~~~ + +- :class:`envs.EmbodiedEnv` - Advanced environment class for complex Embodied AI tasks with configuration-driven architecture +- :class:`envs.EmbodiedEnvCfg` - Configuration class for Embodied Environments + +.. toctree:: + :maxdepth: 1 + + env.md \ No newline at end of file diff --git a/docs/source/overview/gym/observation_functors.md b/docs/source/overview/gym/observation_functors.md new file mode 100644 index 00000000..f55252f8 --- /dev/null +++ b/docs/source/overview/gym/observation_functors.md @@ -0,0 +1,93 @@ +# Observation Functors + +```{currentmodule} embodichain.lab.gym.envs.managers +``` + +This page lists all available observation functors that can be used with the Observation Manager. Observation functors are configured using {class}`~cfg.ObservationCfg` and can operate in two modes: ``modify`` (update existing observations) or ``add`` (add new observations). + +## Pose Computations + +```{list-table} Pose Computation Functors +:header-rows: 1 +:widths: 30 70 + +* - Functor Name + - Description +* - ``get_rigid_object_pose`` + - Get world poses of rigid objects. Returns 4x4 transformation matrices of shape (num_envs, 4, 4). If the object doesn't exist, returns a zero tensor. +* - ``get_sensor_pose_in_robot_frame`` + - Transform sensor poses to robot coordinate frame. Returns pose as [x, y, z, qw, qx, qy, qz] of shape (num_envs, 7). +``` + +## Sensor Information + +```{list-table} Sensor Information Functors +:header-rows: 1 +:widths: 30 70 + +* - Functor Name + - Description +* - ``get_sensor_intrinsics`` + - Get the intrinsic matrix of a camera sensor. Returns 3x3 intrinsic matrices of shape (num_envs, 3, 3). For stereo cameras, supports selecting left or right camera intrinsics. +* - ``compute_semantic_mask`` + - Compute semantic masks from camera segmentation masks. Returns masks of shape (num_envs, height, width, 3) with channels for robot, background, and foreground objects. +``` + +## Keypoint Projections + +```{list-table} Keypoint Projection Functors +:header-rows: 1 +:widths: 30 70 + +* - Functor Name + - Description +* - ``compute_exteroception`` + - Project 3D keypoints (affordance poses, robot parts) onto camera image planes. 
Supports multiple sources: affordance poses from objects (e.g., grasp poses, place poses) and robot control part poses (e.g., end-effector positions). Returns normalized 2D coordinates. Implemented as a Functor class. +``` + +## Normalization + +```{list-table} Normalization Functors +:header-rows: 1 +:widths: 30 70 + +* - Functor Name + - Description +* - ``normalize_robot_joint_data`` + - Normalize joint positions or velocities to [0, 1] range based on joint limits. Supports both ``qpos_limits`` and ``qvel_limits``. Operates in ``modify`` mode. +``` + +```{currentmodule} embodichain.lab.sim.objects +``` + +```{note} +To get robot end-effector poses, you can use the robot's {meth}`~Robot.compute_fk()` method directly in your observation functors or task code. +``` + +## Usage Example + +```python +from embodichain.lab.gym.envs.managers.cfg import ObservationCfg, SceneEntityCfg + +# Example: Add object pose to observations +observations = { + "object_pose": ObservationCfg( + func="get_rigid_object_pose", + mode="add", + name="object/cube/pose", + params={ + "entity_cfg": SceneEntityCfg(uid="cube"), + }, + ), + # Example: Normalize joint positions + "normalized_qpos": ObservationCfg( + func="normalize_robot_joint_data", + mode="modify", + name="robot/qpos", + params={ + "joint_ids": list(range(7)), # First 7 joints + "limit": "qpos_limits", + }, + ), +} +``` diff --git a/docs/source/overview/sim/index.rst b/docs/source/overview/sim/index.rst index 48e1a3ca..56f98ef2 100644 --- a/docs/source/overview/sim/index.rst +++ b/docs/source/overview/sim/index.rst @@ -13,13 +13,12 @@ Overview of the Simulation Framework: - Kinematics Solver - Motion Generation -Table of Contents -================= - .. toctree:: :maxdepth: 1 :glob: sim_manager.md + sim_assets.md + sim_sensor.md solvers/index planners/index diff --git a/docs/source/overview/sim/sim_articulation.md b/docs/source/overview/sim/sim_articulation.md new file mode 100644 index 00000000..1cde346f --- /dev/null +++ b/docs/source/overview/sim/sim_articulation.md @@ -0,0 +1,114 @@ +# Articulation + +```{currentmodule} embodichain.lab.sim +``` + +The {class}`~objects.Articulation` class represents the fundamental physics entity for articulated objects (e.g., robots, grippers, cabinets, doors) in EmbodiChain. + +## Configuration + +Articulations are configured using the {class}`~cfg.ArticulationCfg` dataclass. +| Parameter | Type | Default | Description | +| :--- | :--- | :--- | :--- | +| `fpath` | `str` | `None` | Path to the asset file (URDF/MJCF). | +| `init_pos` | `tuple` | `(0,0,0)` | Initial root position `(x, y, z)`. | +| `init_rot` | `tuple` | `(0,0,0)` | Initial root rotation `(r, p, y)` in degrees. | +| `fix_base` | `bool` | `True` | Whether to fix the base of the articulation. | +| `drive_props` | `JointDrivePropertiesCfg` | `...` | Default drive properties. | + +### Drive Configuration + +The `drive_props` parameter controls the joint physics behavior. It is defined using the `JointDrivePropertiesCfg` class. + +| Parameter | Type | Default | Description | +| :--- | :--- | :--- | :--- | +| `stiffness` | `float` / `Dict` | `1.0e4` | Stiffness (P-gain) of the joint drive. Unit: $N/m$ or $Nm/rad$. | +| `damping` | `float` / `Dict` | `1.0e3` | Damping (D-gain) of the joint drive. Unit: $Ns/m$ or $Nms/rad$. | +| `max_effort` | `float` / `Dict` | `1.0e10` | Maximum effort (force/torque) the joint can exert. | +| `max_velocity` | `float` / `Dict` | `1.0e10` | Maximum velocity allowed for the joint ($m/s$ or $rad/s$). 
|
+| `friction` | `float` / `Dict` | `0.0` | Joint friction coefficient. |
+| `drive_type` | `str` | `"force"` | Drive mode: `"force"` or `"acceleration"`. |
+
+### Setup & Initialization
+
+```python
+import torch
+from embodichain.lab.sim import SimulationManager, SimulationManagerCfg
+from embodichain.lab.sim.objects import Articulation, ArticulationCfg
+
+# 1. Initialize Simulation
+device = "cuda" if torch.cuda.is_available() else "cpu"
+sim_cfg = SimulationManagerCfg(sim_device=device)
+sim = SimulationManager(sim_config=sim_cfg)
+
+# 2. Configure Articulation
+art_cfg = ArticulationCfg(
+    fpath="assets/robots/franka/franka.urdf",
+    init_pos=(0, 0, 0.5),
+    fix_base=True
+)
+
+# 3. Spawn Articulation
+# Note: The method is 'add_articulation'
+articulation: Articulation = sim.add_articulation(cfg=art_cfg)
+
+# 4. Initialize Physics
+sim.reset_objects_state()
+```
+
+## Articulation Class
+
+### State Data (Observation)
+
+State data is accessed via getter methods that return batched tensors.
+
+| Method | Shape | Description |
+| :--- | :--- | :--- |
+| `get_local_pose` | `(N, 7)` | Root link pose `[x, y, z, qw, qx, qy, qz]`. |
+| `get_qpos` | `(N, dof)` | Joint positions. |
+| `get_qvel` | `(N, dof)` | Joint velocities. |
+
+```python
+# Example: Accessing state
+# Note: Use methods (with brackets) instead of properties
+print(f"Current Joint Positions: {articulation.get_qpos()}")
+print(f"Root Pose: {articulation.get_local_pose()}")
+```
+
+### Control & Dynamics
+
+You can control the articulation by setting joint targets.
+
+#### Joint Control
+
+```python
+# Set joint position targets (PD Control)
+# Get current qpos to create a target tensor of correct shape
+current_qpos = articulation.get_qpos()
+target_qpos = torch.zeros_like(current_qpos)
+
+# Set target position
+# target=True: Sets the drive target. The physics engine applies forces to reach this position.
+# target=False: Instantly resets/teleports joints to this position (ignoring physics).
+articulation.set_qpos(target_qpos, target=True)
+
+# Important: Step simulation to apply control
+sim.update()
+```
+
+#### Adjusting Drive Properties
+
+Dynamically adjust drive properties.
+
+```python
+# Set stiffness for all joints
+articulation.set_drive(
+    stiffness=torch.tensor([100.0], device=device),
+    damping=torch.tensor([10.0], device=device)
+)
+```
+
+### Kinematics
+
+Supports differentiable Forward Kinematics (FK) and Jacobian computation.
+
+```python
+# Compute Forward Kinematics
+# Note: Ensure 'build_pk_chain=True' in cfg
+if getattr(art_cfg, 'build_pk_chain', False):
+    ee_pose = articulation.compute_fk(
+        qpos=articulation.get_qpos(),  # Use method call
+        end_link_name="ee_link"  # Replace with actual link name
+    )
+```
diff --git a/docs/source/overview/sim/sim_assets.md b/docs/source/overview/sim/sim_assets.md
index dfa07edf..61463b21 100644
--- a/docs/source/overview/sim/sim_assets.md
+++ b/docs/source/overview/sim/sim_assets.md
@@ -1,12 +1,15 @@
 # Simulation Assets
 
-Simulation assets in EmbodiChain are configured using Python dataclasses. This approach provides a structured and type-safe way to define properties for physics, materials, objects and sensors in the simulation environment.
+```{currentmodule} embodichain.lab.sim
+```
+
+Simulation assets in EmbodiChain are configured using Python dataclasses. This approach provides a structured and type-safe way to define properties for physics, materials and objects in the simulation environment.
 ## Visual Materials
 
 ### Configuration
 
-The `VisualMaterialCfg` class defines the visual appearance of objects using Physically Based Rendering (PBR) properties.
+The {class}`~material.VisualMaterialCfg` class defines the visual appearance of objects using Physically Based Rendering (PBR) properties.
 
 | Parameter | Type | Default | Description |
 | :--- | :--- | :--- | :--- |
@@ -25,11 +28,11 @@
 
 ### Visual Material and Visual Material Instance
 
-A visual material is defined using the `VisualMaterialCfg` class. It is actually a material template that can be used to create multiple instances with different parameters.
+A visual material is defined using the {class}`~material.VisualMaterialCfg` class. It is actually a material template that can be used to create multiple instances with different parameters.
 
-A visual material instance is created from a visual material using the method `create_instance()`. User can set different properties for each instance. For details API usage, please refer to the [VisualMaterialInst](https://dexforce.github.io/EmbodiChain/api_reference/embodichain/embodichain.lab.sim.html#embodichain.lab.sim.material.VisualMaterialInst) documentation.
+A visual material instance is created from a visual material using the method {meth}`~material.VisualMaterial.create_instance()`. Users can set different properties for each instance. For detailed API usage, please refer to the [VisualMaterialInst](https://dexforce.github.io/EmbodiChain/api_reference/embodichain/embodichain.lab.sim.html#embodichain.lab.sim.material.VisualMaterialInst) documentation.
 
-For batch simualtion scenarios, when user set a material to a object (eg, a rigid object), the material instance will be created for each simulation instance automatically.
+For batch simulation scenarios, when a user sets a material on an object (e.g., a rigid object with `num_envs` instances), a material instance will be created for each simulation instance automatically.
 
 ### Code
@@ -56,7 +59,7 @@ mat_inst[0].set_base_color([1.0, 0.0, 0.0, 1.0])
 
 ## Objects
 
-All objects inherit from `ObjectBaseCfg`, which provides common properties.
+All objects inherit from {class}`~cfg.ObjectBaseCfg`, which provides common properties.
 
 **Base Properties**
@@ -69,7 +72,7 @@ All objects inherit from `ObjectBaseCfg`, which provides common properties.
 
 ## Rigid Object
 
-Configured via `RigidObjectCfg`.
+Configured via {class}`~cfg.RigidObjectCfg`.
 
 | Parameter | Type | Default | Description |
 | :--- | :--- | :--- | :--- |
@@ -81,7 +84,7 @@ Configured via `RigidObjectCfg`.
 
 ### Rigid Body Attributes
 
-The `RigidBodyAttributesCfg` class defines physical properties for rigid bodies.
+The {class}`~cfg.RigidBodyAttributesCfg` class defines physical properties for rigid bodies.
 
 | Parameter | Type | Default | Description |
 | :--- | :--- | :--- | :--- |
@@ -99,59 +102,12 @@ The `RigidBodyAttributesCfg` class defines physical properties for rigid bodies.
 | `dynamic_friction` | `float` | `0.5` | Dynamic friction coefficient. |
 | `static_friction` | `float` | `0.5` | Static friction coefficient. |
 
-For Rigid Object tutorial, please refer to the [Create Scene](https://dexforce.github.io/EmbodiChain/tutorial/create_scene.html).
+For a tutorial on rigid objects, please refer to the [Create Scene](https://dexforce.github.io/EmbodiChain/tutorial/create_scene.html) tutorial.
 
 
 ## Rigid Object Groups
 
-`RigidObjectGroupCfg` allows initializing multiple rigid objects, potentially from a folder.
- - -## Soft Object - -Configured via `SoftObjectCfg`. - -| Parameter | Type | Default | Description | -| :--- | :--- | :--- | :--- | -| `voxel_attr` | `SoftbodyVoxelAttributesCfg` | `...` | Voxelization attributes. | -| `physical_attr` | `SoftbodyPhysicalAttributesCfg` | `...` | Physical attributes. | -| `shape` | `MeshCfg` | `MeshCfg()` | Mesh configuration. | - -### Soft Body Attributes - -Soft bodies require both voxelization and physical attributes. - -**Voxel Attributes (`SoftbodyVoxelAttributesCfg`)** - -| Parameter | Type | Default | Description | -| :--- | :--- | :--- | :--- | -| `triangle_remesh_resolution` | `int` | `8` | Resolution to remesh the softbody mesh before building physx collision mesh. | -| `triangle_simplify_target` | `int` | `0` | Simplify mesh faces to target value. | -| `simulation_mesh_resolution` | `int` | `8` | Resolution to build simulation voxelize textra mesh. | -| `simulation_mesh_output_obj` | `bool` | `False` | Whether to output the simulation mesh as an obj file for debugging. | +{class}`~cfg.RigidObjectGroupCfg` allows initializing multiple rigid objects, potentially from a folder. -**Physical Attributes (`SoftbodyPhysicalAttributesCfg`)** - -| Parameter | Type | Default | Description | -| :--- | :--- | :--- | :--- | -| `youngs` | `float` | `1e6` | Young's modulus (higher = stiffer). | -| `poissons` | `float` | `0.45` | Poisson's ratio (higher = closer to incompressible). | -| `dynamic_friction` | `float` | `0.0` | Dynamic friction coefficient. | -| `elasticity_damping` | `float` | `0.0` | Elasticity damping factor. | -| `material_model` | `SoftBodyMaterialModel` | `CO_ROTATIONAL` | Material constitutive model. | -| `enable_kinematic` | `bool` | `False` | If True, (partially) kinematic behavior is enabled. | -| `enable_ccd` | `bool` | `False` | Enable continuous collision detection. | -| `enable_self_collision` | `bool` | `False` | Enable self-collision handling. | -| `mass` | `float` | `-1.0` | Total mass. If negative, density is used. | -| `density` | `float` | `1000.0` | Material density in kg/m^3. | - -For Soft Object tutorial, please refer to the [Soft Body Simulation](https://dexforce.github.io/EmbodiChain/tutorial/create_softbody.html). - - -### Articulations & Robots - -Configured via `ArticulationCfg` and `RobotCfg` (which inherits from `ArticulationCfg`). - -These configurations are typically loaded from URDF or MJCF files. ### Lights @@ -164,15 +120,11 @@ Configured via `LightCfg`. | `intensity` | `float` | `50.0` | Intensity in watts/m^2. | | `radius` | `float` | `1e2` | Falloff radius. | -### Markers - -Configured via `MarkerCfg` for debugging and visualization. - -| Parameter | Type | Default | Description | -| :--- | :--- | :--- | :--- | -| `marker_type` | `Literal` | `"axis"` | "axis", "line", or "point". | -| `axis_size` | `float` | `0.002` | Thickness of axis lines. | -| `axis_len` | `float` | `0.005` | Length of axis arms. | -| `line_color` | `list` | `[1, 1, 0, 1.0]` | RGBA color for lines. | +```{toctree} +:maxdepth: 1 +sim_soft_object.md +sim_articulation.md +sim_robot.md +``` \ No newline at end of file diff --git a/docs/source/overview/sim/sim_manager.md b/docs/source/overview/sim/sim_manager.md index 823786b8..2eca587c 100644 --- a/docs/source/overview/sim/sim_manager.md +++ b/docs/source/overview/sim/sim_manager.md @@ -1,14 +1,17 @@ # Simulation Manager -The `SimulationManager` is the central class in EmbodiChain's simulation framework for managing the simulation lifecycle. 
It handles:
-- **Asset Management**: Loading and managing robots, rigid objects, soft objects, articulations, sensors, and lights.
+```{currentmodule} embodichain.lab.sim
+```
+
+The {class}`SimulationManager` is the central class in EmbodiChain's simulation framework for managing the simulation lifecycle. It handles:
+- **Asset Management**: Loading and managing robots, rigid objects, soft objects, articulations, and lights.
 - **Simulation Loop**: Controlling the physics stepping and rendering updates.
 - **Rendering**: Managing the simulation window, camera rendering, material settings and ray-tracing configuration.
 - **Interaction**: Providing gizmo controls for interactive manipulation of objects.

 ## Configuration

-The simulation is configured using the `SimulationManagerCfg` class.
+The simulation is configured using the {class}`SimulationManagerCfg` class.

 ```python
 from embodichain.lab.sim import SimulationManagerCfg
@@ -45,7 +48,7 @@ sim_config = SimulationManagerCfg(

 ### Physics Configuration

-The `PhysicsCfg` class controls the global physics simulation parameters.
+The {class}`~cfg.PhysicsCfg` class controls the global physics simulation parameters.

 | Parameter | Type | Default | Description |
 | :--- | :--- | :--- | :--- |
@@ -77,7 +80,6 @@ The manager provides methods to add, retrieve and remove various simulation asse
 - Soft Objects
 - Articulations
 - Robots
-- Sensors
 - Lights
 - Materials

@@ -87,7 +89,7 @@ For more details on simulation assets, please refer to their respective document

 ### Manual Update mode

-In this mode, the physics simulation should be explicitly stepped by calling `update()` method, which provides precise control over the simulation timing.
+In this mode, the physics simulation should be explicitly stepped by calling the {meth}`~SimulationManager.update()` method, which provides precise control over the simulation timing.

 The use case for manual update mode includes:
 - Data generation with openai gym environments, in which the observation and action must be synchronized with the physics simulation.
@@ -110,11 +112,24 @@ In this mode, the physics simulation stepping is automatically handling by the p

 > When in automatic update mode, user are recommanded to use CPU `sim_device` for simulation.

-### Methods
+## Commonly used methods
+
+- **`SimulationManager.update(physics_dt=None, step=1)`**: Steps the physics simulation with an optional custom time step and number of steps. If `physics_dt` is `None`, uses the configured physics time step.
+- **`SimulationManager.enable_physics(enable: bool)`**: Enable or disable physics simulation.
+- **`SimulationManager.set_manual_update(enable: bool)`**: Set manual update mode for physics.
+
+
+## Multiple instances
+
+`SimulationManager` supports multiple instances to run separate simulation worlds independently. Each instance maintains its own simulation state, assets, and configurations.
+
+- To get the current number of `SimulationManager` instances: `SimulationManager.get_instance_num()`.
+- To get a specific instance: `SimulationManager.get_instance(instance_id)`.
+
+> Currently, multiple instances are not supported by the ray tracing rendering backend. We are working on adding this feature in a future release.
+

-- **`update(physics_dt=None, step=1)`**: Steps the physics simulation with optional custom time step and number of steps. If `physics_dt` is None, uses the configured physics time step.
-- **`enable_physics(enable: bool)`**: Enable or disable physics simulation.
-- **`set_manual_update(enable: bool)`**: Set manual update mode for physics.
+For more methods and details, refer to the [SimulationManager](https://dexforce.github.io/EmbodiChain/api_reference/embodichain/embodichain.lab.sim.html#embodichain.lab.sim.SimulationManager) documentation.

 ### Related Tutorials

diff --git a/docs/source/overview/sim/sim_robot.md b/docs/source/overview/sim/sim_robot.md
new file mode 100644
index 00000000..e0ab5992
--- /dev/null
+++ b/docs/source/overview/sim/sim_robot.md
@@ -0,0 +1,108 @@
+# Robot
+
+```{currentmodule} embodichain.lab.sim
+```
+
+The {class}`~objects.Robot` class extends {class}`~objects.Articulation` to support advanced robotics features such as kinematic solvers (IK/FK), motion planners, and part-based control (e.g., controlling "arm" and "gripper" separately).
+
+## Configuration
+
+Robots are configured using {class}`~cfg.RobotCfg`.
+
+| Parameter | Type | Default | Description |
+| :--- | :--- | :--- | :--- |
+| `control_parts` | `Dict[str, List[str]]` | `None` | Defines groups of joints (e.g., `{"arm": ["joint1", ...], "hand": ["finger1", ...]}`). |
+| `solver_cfg` | `SolverCfg` | `None` | Configuration for kinematic solvers (IK/FK). |
+| `urdf_cfg` | `URDFCfg` | `None` | Advanced configuration for assembling a robot from multiple URDF components. |
+
+### Setup & Initialization
+
+A `Robot` must be spawned within a `SimulationManager`.
+
+```python
+import torch
+from embodichain.lab.sim import SimulationManager, SimulationManagerCfg
+from embodichain.lab.sim.objects import Robot, RobotCfg
+from embodichain.lab.sim.solvers import SolverCfg
+
+# 1. Initialize Simulation Environment
+# Note: Use 'sim_device' to specify device (e.g., "cuda:0" or "cpu")
+device = "cuda" if torch.cuda.is_available() else "cpu"
+sim_cfg = SimulationManagerCfg(sim_device=device, physics_dt=0.01)
+sim = SimulationManager(sim_config=sim_cfg)
+
+# 2. Configure Robot
+robot_cfg = RobotCfg(
+    fpath="assets/robots/franka/franka.urdf",
+    control_parts={
+        "arm": ["panda_joint[1-7]"],
+        "gripper": ["panda_finger_joint[1-2]"]
+    },
+    solver_cfg=SolverCfg()
+)
+
+# 3. Spawn Robot
+# Note: The method is 'add_robot', and it takes 'cfg' as argument
+robot: Robot = sim.add_robot(cfg=robot_cfg)
+
+# 4. Reset Simulation
+# This performs a global reset of the simulation state
+sim.reset_objects_state()
+```
+
+## Robot Class
+
+### Control Parts
+
+A unique feature of the `Robot` class is **Control Parts**. Instead of controlling the entire DoF vector at once, you can target specific body parts by name.
+
+```python
+# Get joint IDs for a specific part
+arm_ids = robot.get_joint_ids(name="arm")
+
+# Control only the arm
+# Note: Ensure 'sim.update()' is called in your loop to apply these actions
+target_qpos = torch.zeros((robot.num_instances, len(arm_ids)), device=device)
+robot.set_qpos(target_qpos, name="arm", target=True)
+```
+
+### Kinematics (Solvers)
+
+The `Robot` class integrates solvers to perform differentiable Forward Kinematics (FK) and Inverse Kinematics (IK).
+
+#### Forward Kinematics (FK)
+
+Compute the pose of a link (e.g., end-effector) given joint positions.
+
+```python
+# Compute FK for a specific part (uses the part's configured solver)
+current_qpos = robot.get_qpos()
+ee_pose = robot.compute_fk(qpos=current_qpos, name="arm")
+print(f"EE Pose: {ee_pose}")
+```
+
+#### Inverse Kinematics (IK)
+
+Compute the required joint positions to reach a target pose. The solver returns a success mask together with the solved joint positions, as shown below.
+```python
+# Compute IK
+# pose: Target pose (N, 7) or (N, 4, 4)
+target_pose = ee_pose.clone()  # Example target
+target_pose[:, 2] += 0.1  # Move up 10cm
+
+success, solved_qpos = robot.compute_ik(
+    pose=target_pose,
+    name="arm",
+    joint_seed=current_qpos
+)
+```
+
+### Proprioception
+
+Get standard proprioceptive observation data for learning agents.
+
+```python
+# Returns a dict containing 'qpos', 'qvel', and 'qf'
+obs = robot.get_proprioception()
+```
+
+### Advanced API
+
+The `Robot` class overrides the standard `Articulation` methods to support the `name` argument for part-specific operations.
+
+| Method | Description |
+| :--- | :--- |
+| `set_qpos(..., name="part")` | Set joint positions for a specific part. |
+| `set_qvel(..., name="part")` | Set joint velocities for a specific part. |
+| `set_qf(..., name="part")` | Set joint efforts for a specific part. |
+| `get_qpos(name="part")` | Get joint positions of a specific part. |
+| `get_qvel(name="part")` | Get joint velocities of a specific part. |
+
+For more API details, refer to the {class}`~objects.Robot` documentation.
diff --git a/docs/source/overview/sim/sim_sensor.md b/docs/source/overview/sim/sim_sensor.md
new file mode 100644
index 00000000..b64a1665
--- /dev/null
+++ b/docs/source/overview/sim/sim_sensor.md
@@ -0,0 +1,214 @@
+# Sensors
+
+```{currentmodule} embodichain.lab.sim.sensors
+```
+
+The simulation framework provides sensor interfaces for agents to perceive the environment. Currently, the primary supported sensor type is the **Camera**.
+
+## Camera
+
+### Configuration
+
+The {class}`CameraCfg` class defines the configuration for camera sensors. It inherits from {class}`~SensorCfg` and controls resolution, clipping planes, intrinsics, and active data modalities.
+
+| Parameter | Type | Default | Description |
+| :--- | :--- | :--- | :--- |
+| `width` | `int` | `640` | Width of the captured image. |
+| `height` | `int` | `480` | Height of the captured image. |
+| `intrinsics` | `tuple` | `(600, 600, 320.0, 240.0)` | Camera intrinsics `(fx, fy, cx, cy)`. |
+| `extrinsics` | `ExtrinsicsCfg` | `ExtrinsicsCfg()` | Pose configuration (see below). |
+| `near` | `float` | `0.005` | Near clipping plane distance. |
+| `far` | `float` | `100.0` | Far clipping plane distance. |
+| `enable_color` | `bool` | `True` | Enable RGBA image capture. |
+| `enable_depth` | `bool` | `False` | Enable depth map capture. |
+| `enable_mask` | `bool` | `False` | Enable segmentation mask capture. |
+| `enable_normal` | `bool` | `False` | Enable surface normal capture. |
+| `enable_position` | `bool` | `False` | Enable 3D position map capture. |
+
+### Camera Extrinsics
+
+The `ExtrinsicsCfg` class defines the position and orientation of the camera.
+
+| Parameter | Type | Default | Description |
+| :--- | :--- | :--- | :--- |
+| `parent` | `str` | `None` | Name of the link to attach to (e.g., `"ee_link"`). If `None`, camera is fixed in world. |
+| `pos` | `list` | `[0.0, 0.0, 0.0]` | Position offset `[x, y, z]`. |
+| `quat` | `list` | `[1.0, 0.0, 0.0, 0.0]` | Orientation quaternion `[w, x, y, z]`. |
+| `eye` | `tuple` | `None` | (Optional) Camera eye position for look-at mode. |
+| `target` | `tuple` | `None` | (Optional) Target position for look-at mode. |
+| `up` | `tuple` | `None` | (Optional) Up vector for look-at mode. |
+
+### Usage
+
+You can create a camera sensor using `sim.add_sensor()` with a `CameraCfg` object.
+
+#### Code Example
+
+```python
+from embodichain.lab.sim.sensors import Camera, CameraCfg
+
+# 1.
Define Configuration
+camera_cfg = CameraCfg(
+    width=640,
+    height=480,
+    intrinsics=(600, 600, 320.0, 240.0),
+    extrinsics=CameraCfg.ExtrinsicsCfg(
+        parent="ee_link",  # Attach to robot end-effector
+        pos=[0.09, 0.05, 0.04],  # Relative position
+        quat=[0, 1, 0, 0],  # Relative rotation [w, x, y, z]
+    ),
+    enable_color=True,
+    enable_depth=True,
+)
+
+# 2. Add Sensor to Simulation
+camera: Camera = sim.add_sensor(sensor_cfg=camera_cfg)
+```
+
+### Observation Data
+
+Retrieve sensor data using `camera.get_data()`. The data is returned as a dictionary of tensors on the specified device.
+
+| Key | Data Type | Shape | Description |
+| :--- | :--- | :--- | :--- |
+| `color` | `torch.uint8` | `(B, H, W, 4)` | RGBA image data. |
+| `depth` | `torch.float32` | `(B, H, W)` | Depth map in meters. |
+| `mask` | `torch.int32` | `(B, H, W)` | Segmentation mask / Instance IDs. |
+| `normal` | `torch.float32` | `(B, H, W, 3)` | Surface normal vectors. |
+| `position` | `torch.float32` | `(B, H, W, 3)` | 3D Position map (OpenGL coords). |
+
+*Note: `B` represents the number of environments (batch size).*
+
+## Stereo Camera
+
+### Configuration
+
+The {class}`StereoCameraCfg` class defines the configuration for stereo camera sensors. It inherits from {class}`CameraCfg` and includes additional settings for the right camera and stereo-specific features like disparity computation.
+
+In addition to the standard {class}`CameraCfg` parameters, it supports the following:
+
+| Parameter | Type | Default | Description |
+| :--- | :--- | :--- | :--- |
+| `intrinsics_right` | `tuple` | `(600, 600, 320.0, 240.0)` | The intrinsics for the right camera `(fx, fy, cx, cy)`. |
+| `left_to_right_pos` | `tuple` | `(0.05, 0.0, 0.0)` | Position offset `[x, y, z]` from the left camera to the right camera. |
+| `left_to_right_rot` | `tuple` | `(0.0, 0.0, 0.0)` | Rotation offset `[x, y, z]` (Euler angles in degrees) from the left camera to the right camera. |
+| `enable_disparity` | `bool` | `False` | Enable disparity map computation. *Note: Requires `enable_depth` to be `True`.* |
+
+### Usage
+
+You can create a stereo camera sensor using `sim.add_sensor()` with a `StereoCameraCfg` object.
+
+#### Code Example
+
+```python
+from embodichain.lab.sim.sensors import StereoCamera, StereoCameraCfg
+
+# 1. Define Configuration
+stereo_cfg = StereoCameraCfg(
+    width=640,
+    height=480,
+    # Intrinsics for Left (inherited) and Right cameras
+    intrinsics=(600, 600, 320.0, 240.0),
+    intrinsics_right=(600, 600, 320.0, 240.0),
+    # Baseline configuration (e.g., 5cm baseline)
+    left_to_right_pos=(0.05, 0.0, 0.0),
+    extrinsics=StereoCameraCfg.ExtrinsicsCfg(
+        parent="head_link",
+        pos=[0.1, 0.0, 0.0],
+    ),
+    # Data modalities
+    enable_color=True,
+    enable_depth=True,
+    enable_disparity=True,
+)
+
+# 2. Add Sensor to Simulation
+stereo_camera: StereoCamera = sim.add_sensor(sensor_cfg=stereo_cfg)
+```
+
+## Contact Sensor
+
+### Configuration
+
+The {class}`ContactSensorCfg` class defines the configuration for contact sensors. It inherits from {class}`~SensorCfg` and enables filtering and monitoring of contact events between specific rigid bodies and articulation links in the simulation.
+
+| Parameter | Type | Default | Description |
+| :--- | :--- | :--- | :--- |
+| `rigid_uid_list` | `List[str]` | `[]` | List of rigid body UIDs to monitor for contacts. |
+| `articulation_cfg_list` | `List[ArticulationContactFilterCfg]` | `[]` | List of articulation link contact filter configurations.
|
+| `filter_need_both_actor` | `bool` | `True` | Whether to filter contact only when both actors are in the filter list. If `False`, contact is reported if either actor is in the filter. |
+
+### Articulation Contact Filter Configuration
+
+The `ArticulationContactFilterCfg` class specifies which articulation links to monitor for contacts.
+
+| Parameter | Type | Default | Description |
+| :--- | :--- | :--- | :--- |
+| `articulation_uid` | `str` | `""` | Unique identifier of the articulation (robot or articulated object). |
+| `link_name_list` | `List[str]` | `[]` | List of link names in the articulation to monitor. If empty, all links are monitored. |
+
+### Usage
+
+You can create a contact sensor using `sim.add_sensor()` with a `ContactSensorCfg` object.
+
+#### Code Example
+
+```python
+from embodichain.lab.sim.sensors import ContactSensor, ContactSensorCfg, ArticulationContactFilterCfg
+import torch
+
+# 1. Define Contact Filter Configuration
+contact_filter_cfg = ContactSensorCfg()
+
+# Monitor contacts for specific rigid bodies
+contact_filter_cfg.rigid_uid_list = ["cube0", "cube1", "cube2"]
+
+# Monitor contacts for specific articulation links
+contact_filter_art_cfg = ArticulationContactFilterCfg()
+contact_filter_art_cfg.articulation_uid = "UR10_PGI"
+contact_filter_art_cfg.link_name_list = ["finger1_link", "finger2_link"]
+contact_filter_cfg.articulation_cfg_list = [contact_filter_art_cfg]
+
+# Only report contacts when both actors are in the filter list
+contact_filter_cfg.filter_need_both_actor = True
+
+# 2. Add Sensor to Simulation
+contact_sensor: ContactSensor = sim.add_sensor(sensor_cfg=contact_filter_cfg)
+
+# 3. Update and Retrieve Contact Data
+sim.update(step=1)
+contact_sensor.update()
+contact_report = contact_sensor.get_data()
+
+# 4. Filter contacts by specific user IDs
+cube2_user_ids = sim.get_rigid_object("cube2").get_user_ids()
+finger1_user_ids = sim.get_robot("UR10_PGI").get_user_ids("finger1_link").reshape(-1)
+filter_user_ids = torch.cat([cube2_user_ids, finger1_user_ids])
+filter_contact_report = contact_sensor.filter_by_user_ids(filter_user_ids)
+
+# 5. Visualize Contact Points
+contact_sensor.set_contact_point_visibility(
+    visible=True,
+    rgba=(0.0, 0.0, 1.0, 1.0),  # Blue color
+    point_size=6.0
+)
+```
+
+### Observation Data
+
+Retrieve contact data using `contact_sensor.get_data()`. The data is returned as a dictionary of tensors on the specified device.
+
+| Key | Data Type | Shape | Description |
+| :--- | :--- | :--- | :--- |
+| `position` | `torch.float32` | `(n_contact, 3)` | Contact positions in arena frame (world coordinates minus arena offset). |
+| `normal` | `torch.float32` | `(n_contact, 3)` | Contact normal vectors. |
+| `friction` | `torch.float32` | `(n_contact, 3)` | Contact friction forces. *Note: Currently this value may not be accurate.* |
+| `impulse` | `torch.float32` | `(n_contact,)` | Contact impulse magnitudes. |
+| `distance` | `torch.float32` | `(n_contact,)` | Contact penetration distances. |
+| `user_ids` | `torch.int32` | `(n_contact, 2)` | Pair of user IDs for the two actors in contact. Use with `rigid_object.get_user_ids()` to identify objects. |
+| `env_ids` | `torch.int32` | `(n_contact,)` | Environment IDs indicating which parallel environment each contact belongs to. |
+
+*Note: `n_contact` represents the number of contacts detected.*
+
+### Additional Methods
+
+- **`filter_by_user_ids(item_user_ids)`**: Filter the contact report to include only contacts involving specific user IDs.
+- **`set_contact_point_visibility(visible, rgba, point_size)`**: Enable/disable visualization of contact points with customizable color and size.
\ No newline at end of file
diff --git a/docs/source/overview/sim/sim_soft_object.md b/docs/source/overview/sim/sim_soft_object.md
new file mode 100644
index 00000000..4f8b0101
--- /dev/null
+++ b/docs/source/overview/sim/sim_soft_object.md
@@ -0,0 +1,113 @@
+# Soft Object
+
+```{currentmodule} embodichain.lab.sim
+```
+
+The {class}`~objects.SoftObject` class represents deformable entities (e.g., cloth, sponges, soft robots) in EmbodiChain. Unlike rigid bodies, soft objects are defined by vertices and meshes rather than a single rigid pose.
+
+## Configuration
+
+Configured via {class}`~cfg.SoftObjectCfg`.
+
+| Parameter | Type | Default | Description |
+| :--- | :--- | :--- | :--- |
+| `voxel_attr` | `SoftbodyVoxelAttributesCfg` | `...` | Voxelization attributes. |
+| `physical_attr` | `SoftbodyPhysicalAttributesCfg` | `...` | Physical attributes. |
+| `shape` | `MeshCfg` | `MeshCfg()` | Mesh configuration. |
+
+### Soft Body Attributes
+
+Soft bodies require both voxelization and physical attributes.
+
+**Voxel Attributes ({class}`~cfg.SoftbodyVoxelAttributesCfg`)**
+
+| Parameter | Type | Default | Description |
+| :--- | :--- | :--- | :--- |
+| `triangle_remesh_resolution` | `int` | `8` | Resolution used to remesh the softbody mesh before building the physics collision mesh. |
+| `triangle_simplify_target` | `int` | `0` | Simplify mesh faces to the target value. |
+| `simulation_mesh_resolution` | `int` | `8` | Resolution of the voxelized tetrahedral mesh built for simulation. |
+| `simulation_mesh_output_obj` | `bool` | `False` | Whether to output the simulation mesh as an obj file for debugging. |
+
+**Physical Attributes ({class}`~cfg.SoftbodyPhysicalAttributesCfg`)**
+
+| Parameter | Type | Default | Description |
+| :--- | :--- | :--- | :--- |
+| `youngs` | `float` | `1e6` | Young's modulus (higher = stiffer). |
+| `poissons` | `float` | `0.45` | Poisson's ratio (higher = closer to incompressible). |
+| `dynamic_friction` | `float` | `0.0` | Dynamic friction coefficient. |
+| `elasticity_damping` | `float` | `0.0` | Elasticity damping factor. |
+| `material_model` | `SoftBodyMaterialModel` | `CO_ROTATIONAL` | Material constitutive model. |
+| `enable_kinematic` | `bool` | `False` | If True, (partially) kinematic behavior is enabled. |
+| `enable_ccd` | `bool` | `False` | Enable continuous collision detection. |
+| `enable_self_collision` | `bool` | `False` | Enable self-collision handling. |
+| `mass` | `float` | `-1.0` | Total mass. If negative, density is used. |
+| `density` | `float` | `1000.0` | Material density in kg/m^3. |
+
+For a Soft Object tutorial, please refer to the [Soft Body Simulation](https://dexforce.github.io/EmbodiChain/tutorial/create_softbody.html) tutorial.
+
+### Setup & Initialization
+
+```python
+import torch
+from embodichain.lab.sim import SimulationManager, SimulationManagerCfg
+from embodichain.lab.sim.objects import SoftObject, SoftObjectCfg
+
+# 1. Initialize Simulation
+device = "cuda" if torch.cuda.is_available() else "cpu"
+sim_cfg = SimulationManagerCfg(sim_device=device)
+sim = SimulationManager(sim_config=sim_cfg)
+
+# 2. Configure Soft Object
+soft_cfg = SoftObjectCfg(
+    fpath="assets/objects/sponge.msh",  # Example asset path
+    init_pos=(0, 0, 0.5),
+    init_rot=(0, 0, 0)
+)
+
+# 3.
Spawn Soft Object
+# Note: Assuming the method in SimulationManager is 'add_soft_object'
+soft_object: SoftObject = sim.add_soft_object(cfg=soft_cfg)
+
+# 4. Initialize Physics
+sim.reset_objects_state()
+```
+
+## Soft Object Class
+
+### Vertex Data (Observation)
+
+For soft objects, the state is represented by the positions and velocities of its vertices, rather than a single root pose.
+
+| Method | Return Shape | Description |
+| :--- | :--- | :--- |
+| `get_current_collision_vertices()` | `(N, V_col, 3)` | Current positions of collision mesh vertices. |
+| `get_current_sim_vertices()` | `(N, V_sim, 3)` | Current positions of simulation mesh vertices (nodes). |
+| `get_current_sim_vertex_velocities()` | `(N, V_sim, 3)` | Current velocities of simulation vertices. |
+| `get_rest_collision_vertices()` | `(N, V_col, 3)` | Rest (initial) positions of collision vertices. |
+| `get_rest_sim_vertices()` | `(N, V_sim, 3)` | Rest (initial) positions of simulation vertices. |
+
+> Note: `N` is the number of environments/instances, `V_col` is the number of collision vertices, and `V_sim` is the number of simulation vertices.
+
+```python
+# Example: Accessing vertex data
+sim_verts = soft_object.get_current_sim_vertices()
+print(f"Simulation Vertices Shape: {sim_verts.shape}")
+
+velocities = soft_object.get_current_sim_vertex_velocities()
+print(f"Vertex Velocities: {velocities}")
+```
+
+### Pose Management
+
+You can set the global pose of a soft object (which transforms all its vertices), but getting a single "pose" from a deformed object is not supported.
+
+| Method | Description |
+| :--- | :--- |
+| `set_local_pose(pose)` | Sets the pose of the object by transforming all vertices. |
+| `get_local_pose()` | **Not Supported**. Raises `NotImplementedError` because a deformed object does not have a single rigid pose. |
+
+```python
+# Reset or Move the Soft Object
+target_pose = torch.tensor([[0, 0, 1.0, 1, 0, 0, 0]], device=device)  # (x, y, z, qw, qx, qy, qz)
+soft_object.set_local_pose(target_pose)
+
+# Important: Step simulation to apply changes
+sim.update()
+```
\ No newline at end of file
diff --git a/docs/source/overview/vla/index.rst b/docs/source/overview/vla/index.rst
deleted file mode 100644
index dcb4c123..00000000
--- a/docs/source/overview/vla/index.rst
+++ /dev/null
@@ -1,4 +0,0 @@
-Vision-Language-Action Models
-==================
-
-*To be completed by adding an example of how to build a Vision-Language-Action (VLA) pipeline in EmbodiChain.*
diff --git a/docs/source/resources/roadmap.md b/docs/source/resources/roadmap.md
index ba160ea2..899a9c56 100644
--- a/docs/source/resources/roadmap.md
+++ b/docs/source/resources/roadmap.md
@@ -1,6 +1,6 @@
 # Roadmap

-Currently, EmbodiChain is under active development. Our roadmap is as follows:
+Currently, EmbodiChain is under active development. Our roadmap includes the following planned features and enhancements:

 - Simulation:
   - Rendering:
@@ -8,27 +8,29 @@ Currently, EmbodiChain is under active development. Our roadmap is as follows:
     - Add a high performance Hybrid rendering backend for better visual quality and speed trade-off.
     - Support a more efficient real-time denoiser.
     - Add a new rasterization backend for basic rendering tasks.
-    - Support 3DGS rendering mode (If we have enough bandwidth).
   - Physics:
     - Improve GPU physics throughput.
-    - Improve soft body simulation stability and add more examples and tasks.
-    - We are also exploring how to integrate [newton physics](https://github.com/newton-physics/newton) into EmbodiChain as an alternative physics backend.
+    - We are researching and developing a next-generation physics backend that supports high-accuracy simulation, differentiable dynamics, and neural physics models for end-to-end AI integration.
   - Sensors:
-    - Add contact and force sensors with examples.
+    - Add more physical sensors (e.g., force sensors) with examples.
   - Motion Generation:
-    - Add more advanced motion generation methods and examples.
+    - Add more advanced motion generation methods with examples.
   - Useful Tools:
     - We are working on USD support for EmbodiChain to enable better asset management and interoperability.
-    - We will release a simple Real2Sim pipeline, which enables automatic task generation from real-world data.

 - Robots Integration:
   - Add support for more robot models (eg: LeRobot, Unitree H1/G1, etc).

-- Agents:
-  - Add more Reinforcement Learning examples and environments.
-  - We will release a Modular VLA framework for fast prototyping and training of embodied agents.
-  - We will release a simple online data streaming pipeline for Imitation Learning.
+- Data Pipeline (Coming Soon):
+  - We will release a Real2Sim pipeline, which enables automatic data generation and scaling from real-world seed priors.
+  - We will release an agentic skill generation framework for automated expert trajectory generation.
+  - Add an asset and scene generator, and integrate it with the data pipeline.

-- Tasks:
-  - We will release a set of Real2Sim tasks as examples for EmbodiChain.
-  - We will release a set of tableware manipulation tasks for demonstration of data generation pipeline.
+- Models & Training Infrastructure (Coming Soon):
+  - We will release a modular VLA framework for fast prototyping and training of embodied agents.
+  - Add an online data streaming pipeline for model training.
+
+- Embodied Tasks (Coming Soon):
+  - Add more benchmark tasks for EmbodiChain.
+  - Add more tasks with reinforcement learning support.
+  - Add a set of manipulation tasks demonstrating the data generation pipeline.
\ No newline at end of file diff --git a/embodichain/agents/rl/algo/ppo.py b/embodichain/agents/rl/algo/ppo.py index afd636e7..5253e897 100644 --- a/embodichain/agents/rl/algo/ppo.py +++ b/embodichain/agents/rl/algo/ppo.py @@ -17,7 +17,7 @@ import torch from typing import Dict, Any, Tuple, Callable -from embodichain.agents.rl.utils import AlgorithmCfg +from embodichain.agents.rl.utils import AlgorithmCfg, flatten_dict_observation from embodichain.agents.rl.buffer import RolloutBuffer from embodichain.utils import configclass from .base import BaseAlgorithm @@ -102,6 +102,10 @@ def collect_rollout( reward = reward.float() done = done.bool() + # Flatten dict observation from ObservationManager if needed + if isinstance(next_obs, dict): + next_obs = flatten_dict_observation(next_obs) + # Add to buffer self.buffer.add(current_obs, actions, reward, done, value, log_prob) diff --git a/embodichain/agents/rl/train.py b/embodichain/agents/rl/train.py index e87d4629..ca3c2996 100644 --- a/embodichain/agents/rl/train.py +++ b/embodichain/agents/rl/train.py @@ -32,7 +32,7 @@ from embodichain.agents.rl.utils.trainer import Trainer from embodichain.utils import logger from embodichain.lab.gym.envs.tasks.rl import build_env -from embodichain.lab.gym.utils.gym_utils import config_to_rl_cfg +from embodichain.lab.gym.utils.gym_utils import config_to_cfg from embodichain.utils.utility import load_json from embodichain.utils.module_utils import find_function_from_modules from embodichain.lab.sim import SimulationManagerCfg @@ -60,6 +60,9 @@ def main(): eval_freq = int(trainer_cfg.get("eval_freq", 10000)) save_freq = int(trainer_cfg.get("save_freq", 50000)) headless = bool(trainer_cfg.get("headless", True)) + enable_rt = bool(trainer_cfg.get("enable_rt", False)) + gpu_id = int(trainer_cfg.get("gpu_id", 0)) + num_envs = trainer_cfg.get("num_envs", None) wandb_project_name = trainer_cfg.get("wandb_project_name", "embodychain-generic") # Device @@ -120,7 +123,11 @@ def main(): logger.log_info(f"Current working directory: {Path.cwd()}") gym_config_data = load_json(str(gym_config_path)) - gym_env_cfg = config_to_rl_cfg(gym_config_data) + gym_env_cfg = config_to_cfg(gym_config_data) + + # Override num_envs from train config if provided + if num_envs is not None: + gym_env_cfg.num_envs = num_envs # Ensure sim configuration mirrors runtime overrides if gym_env_cfg.sim_cfg is None: @@ -135,24 +142,28 @@ def main(): else: gym_env_cfg.sim_cfg.sim_device = torch.device("cpu") gym_env_cfg.sim_cfg.headless = headless + gym_env_cfg.sim_cfg.enable_rt = enable_rt + gym_env_cfg.sim_cfg.gpu_id = gpu_id logger.log_info( - f"Loaded gym_config from {gym_config_path} (env_id={gym_env_cfg.env_id}, headless={gym_env_cfg.sim_cfg.headless}, sim_device={gym_env_cfg.sim_cfg.sim_device})" + f"Loaded gym_config from {gym_config_path} (env_id={gym_config_data['id']}, num_envs={gym_env_cfg.num_envs}, headless={gym_env_cfg.sim_cfg.headless}, enable_rt={gym_env_cfg.sim_cfg.enable_rt}, sim_device={gym_env_cfg.sim_cfg.sim_device})" ) - env = build_env(gym_env_cfg.env_id, base_env_cfg=gym_env_cfg) + env = build_env(gym_config_data["id"], base_env_cfg=gym_env_cfg) eval_gym_env_cfg = deepcopy(gym_env_cfg) eval_gym_env_cfg.num_envs = 4 eval_gym_env_cfg.sim_cfg.headless = True - eval_env = build_env(eval_gym_env_cfg.env_id, base_env_cfg=eval_gym_env_cfg) + eval_env = build_env(gym_config_data["id"], base_env_cfg=eval_gym_env_cfg) # Build Policy via registry policy_name = policy_block["name"] # Build Policy via registry (actor/critic must be explicitly 
defined in JSON when using actor_critic) if policy_name.lower() == "actor_critic": - obs_dim = env.observation_space.shape[-1] + # Get observation dimension from flattened observation space + # flattened_observation_space returns Box space for RL training + obs_dim = env.flattened_observation_space.shape[-1] action_dim = env.action_space.shape[-1] actor_cfg = policy_block.get("actor") @@ -167,7 +178,7 @@ def main(): policy = build_policy( policy_block, - env.observation_space, + env.flattened_observation_space, env.action_space, device, actor=actor, @@ -175,7 +186,7 @@ def main(): ) else: policy = build_policy( - policy_block, env.observation_space, env.action_space, device + policy_block, env.flattened_observation_space, env.action_space, device ) # Build Algorithm via factory diff --git a/embodichain/agents/rl/utils/__init__.py b/embodichain/agents/rl/utils/__init__.py index f6f9f4f9..e6f9e57a 100644 --- a/embodichain/agents/rl/utils/__init__.py +++ b/embodichain/agents/rl/utils/__init__.py @@ -15,7 +15,9 @@ # ---------------------------------------------------------------------------- from .config import AlgorithmCfg +from .helper import flatten_dict_observation __all__ = [ "AlgorithmCfg", + "flatten_dict_observation", ] diff --git a/embodichain/agents/rl/utils/helper.py b/embodichain/agents/rl/utils/helper.py new file mode 100644 index 00000000..3021a31f --- /dev/null +++ b/embodichain/agents/rl/utils/helper.py @@ -0,0 +1,52 @@ +# ---------------------------------------------------------------------------- +# Copyright (c) 2021-2025 DexForce Technology Co., Ltd. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. +# ---------------------------------------------------------------------------- + +import torch + + +def flatten_dict_observation(input_dict: dict) -> torch.Tensor: + """ + Flatten hierarchical dict observations from ObservationManager. + + Recursively traverse nested dicts, collect all tensor values, + flatten each to (num_envs, -1), and concatenate in sorted key order. + + Args: + input_dict: Nested dict structure, e.g. 
{"robot": {"qpos": tensor, "ee_pos": tensor}, "object": {...}} + + Returns: + Concatenated flat tensor of shape (num_envs, total_dim) + """ + obs_list = [] + + def _collect_tensors(d, prefix=""): + """Recursively collect tensors from nested dicts in sorted order.""" + for key in sorted(d.keys()): + full_key = f"{prefix}/{key}" if prefix else key + value = d[key] + if isinstance(value, dict): + _collect_tensors(value, full_key) + elif isinstance(value, torch.Tensor): + # Flatten tensor to (num_envs, -1) shape + obs_list.append(value.flatten(start_dim=1)) + + _collect_tensors(input_dict) + + if not obs_list: + raise ValueError("No tensors found in observation dict") + + result = torch.cat(obs_list, dim=-1) + return result diff --git a/embodichain/agents/rl/utils/trainer.py b/embodichain/agents/rl/utils/trainer.py index 88c04908..34dc1919 100644 --- a/embodichain/agents/rl/utils/trainer.py +++ b/embodichain/agents/rl/utils/trainer.py @@ -25,6 +25,7 @@ import wandb from embodichain.lab.gym.envs.managers.event_manager import EventManager +from .helper import flatten_dict_observation class Trainer: @@ -74,22 +75,22 @@ def __init__( # initial obs (assume env returns torch tensors already on target device) obs, _ = self.env.reset() - self.obs = obs # Initialize algorithm's buffer - self.observation_space = getattr(self.env, "observation_space", None) - self.action_space = getattr(self.env, "action_space", None) - obs_dim = ( - self.observation_space.shape[-1] - if self.observation_space - else self.obs.shape[-1] - ) - action_dim = self.action_space.shape[-1] if self.action_space else None + # Flatten dict observations from ObservationManager to tensor for RL algorithms + if isinstance(obs, dict): + obs_tensor = flatten_dict_observation(obs) + obs_dim = obs_tensor.shape[-1] + num_envs = obs_tensor.shape[0] + # Store flattened observation for RL training + self.obs = obs_tensor + + action_space = getattr(self.env, "action_space", None) + action_dim = action_space.shape[-1] if action_space else None if action_dim is None: raise RuntimeError( "Env must expose action_space with shape for buffer initialization." 
) - num_envs = self.obs.shape[0] if self.obs.ndim == 2 else 1 # Algorithm manages its own buffer self.algorithm.initialize_buffer(num_steps, num_envs, obs_dim, action_dim) @@ -160,8 +161,9 @@ def on_step(obs, actions, reward, done, info, next_obs): self.curr_len[done_idx] = 0 # Update global step and observation + # next_obs is already flattened in algorithm's collect_rollout self.obs = next_obs - self.global_step += next_obs.shape[0] if next_obs.ndim == 2 else 1 + self.global_step += next_obs.shape[0] if isinstance(info, dict): rewards_dict = info.get("rewards") @@ -226,6 +228,8 @@ def _eval_once(self, num_episodes: int = 5): returns = [] for _ in range(num_episodes): obs, _ = self.eval_env.reset() + obs = flatten_dict_observation(obs) + done_any = torch.zeros( obs.shape[0] if obs.ndim == 2 else 1, dtype=torch.bool, @@ -235,8 +239,12 @@ def _eval_once(self, num_episodes: int = 5): ep_ret = torch.zeros(num_envs_eval, dtype=torch.float32, device=self.device) while not done_any.any(): actions, _, _ = self.policy.get_action(obs, deterministic=True) - result = self.eval_env.step(actions) - obs, reward, terminated, truncated, info = result + obs, reward, terminated, truncated, info = self.eval_env.step(actions) + + # Flatten dict observation from step + if isinstance(obs, dict): + obs = flatten_dict_observation(obs) + done = terminated | truncated reward = reward.float() done_any = done diff --git a/embodichain/lab/gym/envs/base_env.py b/embodichain/lab/gym/envs/base_env.py index 15f5d3be..99743607 100644 --- a/embodichain/lab/gym/envs/base_env.py +++ b/embodichain/lab/gym/envs/base_env.py @@ -15,6 +15,7 @@ # ---------------------------------------------------------------------------- import torch +import numpy as np import gymnasium as gym from typing import Dict, List, Union, Tuple, Any, Sequence @@ -173,6 +174,21 @@ def observation_space(self) -> gym.spaces.Space: self.single_observation_space, n=self.num_envs ) + @cached_property + def flattened_observation_space(self) -> gym.spaces.Box: + """Flattened observation space for RL training. + + Returns a Box space by computing total dimensions from nested dict observations. + This is needed because RL algorithms (PPO, SAC, etc.) require flat vector inputs. + """ + from embodichain.agents.rl.utils.helper import flatten_dict_observation + + flattened_obs = flatten_dict_observation(self._init_raw_obs) + total_dim = flattened_obs.shape[-1] + return gym.spaces.Box( + low=-np.inf, high=np.inf, shape=(total_dim,), dtype=np.float32 + ) + @cached_property def action_space(self) -> gym.spaces.Space: if self.num_envs == 1: @@ -276,6 +292,27 @@ def _update_sim_state(self, **kwargs): # TODO: Add randomization event here. pass + def _hook_after_sim_step( + self, + obs: EnvObs, + action: EnvAction, + dones: torch.Tensor, + terminateds: torch.Tensor, + info: Dict, + **kwargs, + ) -> None: + """Hook function called after each simulation step. + + Args: + obs: The observation dictionary. + action: The action taken by the agent. + dones: A tensor indicating which environments are done. + terminateds: A tensor indicating which environments are terminated. + info: A dictionary containing additional information. + **kwargs: Additional keyword arguments to be passed to the :meth:`_hook_after_sim_step` function. + """ + pass + def _initialize_episode(self, env_ids: Sequence[int] | None = None, **kwargs): """Initialize the simulation assets before each episode. Randomization can be performed at this stage. 
@@ -397,6 +434,30 @@ def check_truncated(self, obs: EnvObs, info: Dict[str, Any]) -> torch.Tensor: """ return torch.zeros(self.num_envs, dtype=torch.bool, device=self.device) + def _extend_reward( + self, + rewards: torch.Tensor, + obs: EnvObs, + action: EnvAction, + info: Dict[str, Any], + **kwargs, + ) -> torch.Tensor: + """Extend the reward computation. + + Overwrite this function to extend or modify the reward computation. + + Args: + rewards: The base reward tensor. + obs: The observation from the environment. + action: The action applied to the robot agent. + info: The info dictionary. + **kwargs: Additional keyword arguments. + + Returns: + The extended reward tensor. + """ + return rewards + def get_reward( self, obs: EnvObs, @@ -417,7 +478,9 @@ def get_reward( The reward for the current step. """ - return torch.zeros(self.num_envs, dtype=torch.float32, device=self.device) + rewards = torch.zeros(self.num_envs, dtype=torch.float32, device=self.device) + + return rewards def _step_action(self, action: EnvAction) -> EnvAction: """Set action control command into simulation. @@ -481,6 +544,9 @@ def step( obs = self.get_obs(**kwargs) info = self.get_info(**kwargs) rewards = self.get_reward(obs=obs, action=action, info=info) + rewards = self._extend_reward( + rewards=rewards, obs=obs, action=action, info=info + ) terminateds = torch.logical_or( info.get( @@ -496,12 +562,20 @@ def step( terminateds[:] = False dones = torch.logical_or(terminateds, truncateds) + + self._hook_after_sim_step( + obs=obs, + action=action, + dones=dones, + terminateds=terminateds, + info=info, + **kwargs, + ) + reset_env_ids = dones.nonzero(as_tuple=False).squeeze(-1) if len(reset_env_ids) > 0: obs, _ = self.reset(options={"reset_ids": reset_env_ids}) - # TODO: may be add hook for observation postprocessing. - return obs, rewards, terminateds, truncateds, info def close(self) -> None: diff --git a/embodichain/lab/gym/envs/embodied_env.py b/embodichain/lab/gym/envs/embodied_env.py index 8ce31346..4e6b1b98 100644 --- a/embodichain/lab/gym/envs/embodied_env.py +++ b/embodichain/lab/gym/envs/embodied_env.py @@ -42,6 +42,7 @@ from embodichain.lab.gym.envs.managers import ( EventManager, ObservationManager, + RewardManager, DatasetManager, ) from embodichain.lab.gym.utils.registration import register_env @@ -91,6 +92,13 @@ class EnvLightCfg: Please refer to the :class:`embodichain.lab.gym.managers.ObservationManager` class for more details. """ + rewards: Union[object, None] = None + """Reward settings. Defaults to None, in which case no reward computation is performed through + the reward manager. + + Please refer to the :class:`embodichain.lab.gym.managers.RewardManager` class for more details. + """ + dataset: Union[object, None] = None """Dataset settings. Defaults to None, in which case no dataset collection is performed. @@ -149,7 +157,6 @@ def __init__(self, cfg: EmbodiedEnvCfg, **kwargs): # TODO: Change to array like data structure to handle different demo action list length for across different arena. 
self.action_length: int = 0 # Set by create_demo_action_list - self._action_step_counter: int = 0 # Track steps within current action sequence extensions = getattr(cfg, "extensions", {}) or {} @@ -157,6 +164,11 @@ def __init__(self, cfg: EmbodiedEnvCfg, **kwargs): setattr(cfg, name, value) setattr(self, name, value) + self.event_manager: EventManager | None = None + self.observation_manager: ObservationManager | None = None + self.reward_manager: RewardManager | None = None + self.dataset_manager: DatasetManager | None = None + super().__init__(cfg, **kwargs) def _init_sim_state(self, **kwargs): @@ -176,6 +188,9 @@ def _init_sim_state(self, **kwargs): if self.cfg.observations: self.observation_manager = ObservationManager(self.cfg.observations, self) + if self.cfg.rewards: + self.reward_manager = RewardManager(self.cfg.rewards, self) + if self.cfg.dataset: self.dataset_manager = DatasetManager(self.cfg.dataset, self) @@ -257,54 +272,15 @@ def get_affordance(self, key: str, default: Any = None): """ return self.affordance_datas.get(key, default) - def reset( - self, seed: int | None = None, options: dict | None = None - ) -> Tuple[EnvObs, Dict]: - obs, info = super().reset(seed=seed, options=options) - self._action_step_counter = 0 # Reset action step counter - return obs, info - - def step( - self, action: EnvAction, **kwargs - ) -> Tuple[EnvObs, torch.Tensor, torch.Tensor, torch.Tensor, Dict[str, Any]]: - """Step the environment with the given action. - - Extends BaseEnv.step() to integrate with DatasetManager for automatic - data collection and saving. The key is to: - 1. Record obs-action pairs as they happen - 2. Detect episode completion - 3. Auto-save episodes BEFORE reset - 4. Then perform the actual reset - """ - self._elapsed_steps += 1 - self._action_step_counter += 1 # Increment action sequence counter - - action = self._step_action(action=action) - self.sim.update(self.sim_cfg.physics_dt, self.cfg.sim_steps_per_control) - self._update_sim_state(**kwargs) - - obs = self.get_obs(**kwargs) - info = self.get_info(**kwargs) - rewards = self.get_reward(obs=obs, action=action, info=info) - - # Check termination conditions - terminateds = torch.logical_or( - info.get( - "success", - torch.zeros(self.num_envs, dtype=torch.bool, device=self.device), - ), - info.get( - "fail", torch.zeros(self.num_envs, dtype=torch.bool, device=self.device) - ), - ) - truncateds = self.check_truncated(obs=obs, info=info) - if self.cfg.ignore_terminations: - terminateds[:] = False - - # Detect which environments need reset - dones = torch.logical_or(terminateds, truncateds) - reset_env_ids = dones.nonzero(as_tuple=False).squeeze(-1) - + def _hook_after_sim_step( + self, + obs: EnvObs, + action: EnvAction, + dones: torch.Tensor, + terminateds: torch.Tensor, + info: Dict, + **kwargs, + ): # Call dataset manager with mode="save": it will record and auto-save if dones=True if self.cfg.dataset: if "save" in self.dataset_manager.available_modes: @@ -318,17 +294,26 @@ def step( info=info, ) - # Now perform reset for completed environments - if len(reset_env_ids) > 0: - obs, _ = self.reset(options={"reset_ids": reset_env_ids}) - - return obs, rewards, terminateds, truncateds, info - def _extend_obs(self, obs: EnvObs, **kwargs) -> EnvObs: if self.observation_manager: obs = self.observation_manager.compute(obs) return obs + def _extend_reward( + self, + rewards: torch.Tensor, + obs: EnvObs, + action: EnvAction, + info: Dict[str, Any], + **kwargs, + ) -> torch.Tensor: + if self.reward_manager: + rewards, reward_info 
= self.reward_manager.compute( + obs=obs, action=action, info=info + ) + info["rewards"] = reward_info + return rewards + def _prepare_scene(self, **kwargs) -> None: self._setup_lights() self._setup_background() @@ -352,6 +337,10 @@ def _initialize_episode( if "reset" in self.event_manager.available_modes: self.event_manager.apply(mode="reset", env_ids=env_ids) + # reset reward manager for environments that need a reset + if self.cfg.rewards: + self.reward_manager.reset(env_ids=env_ids) + def _step_action(self, action: EnvAction) -> EnvAction: """Set action control command into simulation. @@ -533,8 +522,8 @@ def check_truncated(self, obs: EnvObs, info: Dict[str, Any]) -> torch.Tensor: A boolean tensor indicating truncation for each environment in the batch. """ # Check if action sequence has reached its end - if self.action_length > 0 and self._action_step_counter >= self.action_length: - return torch.ones(self.num_envs, dtype=torch.bool, device=self.device) + if self.action_length > 0: + return self._elapsed_steps >= self.action_length return super().check_truncated(obs, info) diff --git a/embodichain/lab/gym/envs/managers/__init__.py b/embodichain/lab/gym/envs/managers/__init__.py index e38f4f22..88c96ef0 100644 --- a/embodichain/lab/gym/envs/managers/__init__.py +++ b/embodichain/lab/gym/envs/managers/__init__.py @@ -19,10 +19,12 @@ SceneEntityCfg, EventCfg, ObservationCfg, + RewardCfg, DatasetFunctorCfg, ) from .manager_base import Functor, ManagerBase from .event_manager import EventManager from .observation_manager import ObservationManager +from .reward_manager import RewardManager from .dataset_manager import DatasetManager from .datasets import * diff --git a/embodichain/lab/gym/envs/managers/cfg.py b/embodichain/lab/gym/envs/managers/cfg.py index 07888c9f..6aeb2804 100644 --- a/embodichain/lab/gym/envs/managers/cfg.py +++ b/embodichain/lab/gym/envs/managers/cfg.py @@ -231,7 +231,7 @@ def resolve(self, scene: SimulationManager): def _resolve_joint_names(self, scene: SimulationManager): # convert joint names to indices based on regex if self.joint_names is not None or self.joint_ids != slice(None): - entity: Articulation = scene[self.uid] + entity: Articulation = scene.get_articulation(self.uid) # -- if both are not their default values, check if they are valid if self.joint_names is not None and self.joint_ids != slice(None): if isinstance(self.joint_names, str): @@ -272,7 +272,7 @@ def _resolve_joint_names(self, scene: SimulationManager): def _resolve_body_names(self, scene: SimulationManager): # convert body names to indices based on regex if self.body_names is not None or self.body_ids != slice(None): - entity: RigidObject = scene[self.uid] + entity: RigidObject = scene.get_rigid_object(self.uid) # -- if both are not their default values, check if they are valid if self.body_names is not None and self.body_ids != slice(None): if isinstance(self.body_names, str): @@ -311,6 +311,29 @@ def _resolve_body_names(self, scene: SimulationManager): self.body_names = [entity.body_names[i] for i in self.body_ids] +@configclass +class RewardCfg(FunctorCfg): + """Configuration for a reward functor. + + The reward functor is used to compute rewards for the environment. The `mode` attribute + determines how the reward is combined with existing rewards. + """ + + mode: Literal["add", "replace"] = "add" + """The mode for the reward computation. + + - `add`: The reward is added to the existing total reward. + - `replace`: The reward replaces the total reward (useful for single reward functions). 
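+
+    A hypothetical usage sketch (``distance_to_target`` stands in for any
+    user-defined reward functor; ``func`` comes from the base functor config)::
+
+        reach_reward = RewardCfg(func=distance_to_target, mode="add", weight=0.1)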
+ """ + + weight: float = 1.0 + """The weight multiplier for this reward term. + + This value is used to scale the reward before adding it to the total reward. + Default is 1.0 (no scaling). + """ + + @configclass class DatasetFunctorCfg(FunctorCfg): """Configuration for dataset collection functors. diff --git a/embodichain/lab/gym/envs/managers/dataset_manager.py b/embodichain/lab/gym/envs/managers/dataset_manager.py index 0a8e9d49..a0ca1688 100644 --- a/embodichain/lab/gym/envs/managers/dataset_manager.py +++ b/embodichain/lab/gym/envs/managers/dataset_manager.py @@ -246,9 +246,45 @@ def finalize(self) -> Optional[str]: return None - """ - Operations - Functor settings. - """ + def get_cached_data(self) -> list[Dict[str, Any]]: + """Get cached data from all dataset functors (for online training). + + Iterates through all functors and collects cached data from those + that support online training mode (have get_cached_data method). + + Returns: + List of cached data dictionaries from all functors. + """ + all_cached_data = [] + + # Iterate through all modes and functors + for mode_cfgs in self._mode_functor_cfgs.values(): + for functor_cfg in mode_cfgs: + if hasattr(functor_cfg.func, "get_cached_data"): + cached_data = functor_cfg.func.get_cached_data() + all_cached_data.extend(cached_data) + + return all_cached_data + + def clear_cache(self) -> int: + """Clear cached data from all dataset functors (for online training). + + Iterates through all functors and clears their cache if they + support online training mode (have clear_cache method). + + Returns: + Total number of cached items cleared across all functors. + """ + total_cleared = 0 + + # Iterate through all modes and functors + for mode_cfgs in self._mode_functor_cfgs.values(): + for functor_cfg in mode_cfgs: + if hasattr(functor_cfg.func, "clear_cache"): + cleared = functor_cfg.func.clear_cache() + total_cleared += cleared + + return total_cleared def get_functor_cfg(self, functor_name: str) -> DatasetFunctorCfg: """Gets the configuration for the specified functor. @@ -267,10 +303,6 @@ def get_functor_cfg(self, functor_name: str) -> DatasetFunctorCfg: return self._mode_functor_cfgs[mode][functors.index(functor_name)] logger.log_error(f"Dataset functor '{functor_name}' not found.") - """ - Helper functions. - """ - def _prepare_functors(self): """Prepare dataset functors from configuration. diff --git a/embodichain/lab/gym/envs/managers/datasets.py b/embodichain/lab/gym/envs/managers/datasets.py index 09c0c1e2..625ce8e7 100644 --- a/embodichain/lab/gym/envs/managers/datasets.py +++ b/embodichain/lab/gym/envs/managers/datasets.py @@ -40,11 +40,9 @@ from lerobot.datasets.lerobot_dataset import LeRobotDataset, HF_LEROBOT_HOME LEROBOT_AVAILABLE = True - __all__ = ["LeRobotRecorder"] except ImportError: LEROBOT_AVAILABLE = False - __all__ = [] @@ -72,6 +70,11 @@ def __init__(self, cfg: DatasetFunctorCfg, env: EmbodiedEnv): - export_success_only: Whether to export only successful episodes env: The environment instance """ + if not LEROBOT_AVAILABLE: + logger.log_error( + "LeRobot is not installed. 
Please install it with: pip install lerobot" + ) + super().__init__(cfg, env) # Extract parameters from cfg.params @@ -104,8 +107,6 @@ def __init__(self, cfg: DatasetFunctorCfg, env: EmbodiedEnv): # Initialize dataset self._initialize_dataset() - logger.log_info(f"LeRobotRecorder initialized at: {self.dataset_path}") - @property def dataset_path(self) -> str: """Path to the dataset directory.""" @@ -215,7 +216,6 @@ def _save_episodes( elif terminateds is not None: is_success = terminateds[env_id].item() - logger.log_info(f"Episode {env_id} success: {is_success}") if self.export_success_only and not is_success: logger.log_info(f"Skipping failed episode for env {env_id}") continue @@ -295,11 +295,6 @@ def _initialize_dataset(self) -> None: fps = self.robot_meta.get("control_freq", 30) features = self._build_features() - logger.log_info("------------------------------------------") - logger.log_info(f"Building dataset: {dataset_name}") - logger.log_info(f"Parent directory: {lerobot_data_root}") - logger.log_info(f"Full path: {self.dataset_full_path}") - self.dataset = LeRobotDataset.create( repo_id=dataset_name, fps=fps, diff --git a/embodichain/lab/gym/envs/managers/observations.py b/embodichain/lab/gym/envs/managers/observations.py index d7ae016d..7e500af2 100644 --- a/embodichain/lab/gym/envs/managers/observations.py +++ b/embodichain/lab/gym/envs/managers/observations.py @@ -252,6 +252,51 @@ def compute_semantic_mask( return torch.stack(masks, dim=-1) +def get_robot_eef_pose( + env: "EmbodiedEnv", + obs: EnvObs, + part_name: str | None = None, + position_only: bool = False, +) -> torch.Tensor: + """Get robot end-effector pose using forward kinematics. + + Args: + env: The environment instance. + obs: The observation dictionary. + part_name: The name of the control part. If None, uses default part. + position_only: If True, returns only position (3D). If False, returns full pose (4x4 matrix). + + Returns: + A tensor of shape (num_envs, 3) if position_only=True, or (num_envs, 4, 4) otherwise. + """ + robot = env.robot + + if part_name is not None: + joint_ids = robot.get_joint_ids(part_name) + qpos = robot.body_data.qpos[:, joint_ids] + ee_pose = robot.compute_fk(name=part_name, qpos=qpos, to_matrix=True) + else: + qpos = robot.get_qpos() + ee_pose = robot.compute_fk(qpos=qpos, to_matrix=True) + + if position_only: + return ee_pose[:, :3, 3] + return ee_pose + + +def target_position( + env: "EmbodiedEnv", + obs: EnvObs, + target_pose_key: str = "goal_pose", +) -> torch.Tensor: + """Get virtual target position from env state.""" + state_attr = f"_{target_pose_key}s" + if hasattr(env, state_attr): + target_poses = getattr(env, state_attr) + return target_poses[:, :3, 3] + return torch.zeros(env.num_envs, 3, device=env.device) + + class compute_exteroception(Functor): """Compute the exteroception for the observation space. 
diff --git a/embodichain/lab/gym/envs/managers/randomization/spatial.py b/embodichain/lab/gym/envs/managers/randomization/spatial.py
index 2437e883..c3e32d36 100644
--- a/embodichain/lab/gym/envs/managers/randomization/spatial.py
+++ b/embodichain/lab/gym/envs/managers/randomization/spatial.py
@@ -266,3 +266,86 @@ def randomize_robot_qpos(
     robot.set_qpos(qpos=current_qpos, env_ids=env_ids, joint_ids=joint_ids)
 
     env.sim.update(step=100)
+
+
+def randomize_target_pose(
+    env: EmbodiedEnv,
+    env_ids: Union[torch.Tensor, None],
+    position_range: tuple[list[float], list[float]],
+    rotation_range: tuple[list[float], list[float]] | None = None,
+    relative_position: bool = False,
+    relative_rotation: bool = False,
+    reference_entity_cfg: SceneEntityCfg | None = None,
+    store_key: str = "target_pose",
+) -> None:
+    """Randomize a virtual target pose and store it in env state for use in get_info().
+
+    This function generates random target poses without requiring a physical object in the scene.
+    The generated poses are stored on the env and should be exposed in get_info() for reward functors.
+
+    Args:
+        env (EmbodiedEnv): The environment instance.
+        env_ids (Union[torch.Tensor, None]): The environment IDs to apply the randomization.
+        position_range (tuple[list[float], list[float]]): The range for the position randomization.
+        rotation_range (tuple[list[float], list[float]] | None): The range for the rotation randomization.
+            The rotation is represented as Euler angles (roll, pitch, yaw) in degrees.
+        relative_position (bool): Whether to randomize the position relative to a reference entity. Default is False.
+        relative_rotation (bool): Whether to randomize the rotation relative to a reference entity. Default is False.
+        reference_entity_cfg (SceneEntityCfg | None): The reference entity for relative randomization.
+            If None and relative mode is True, uses world origin.
+        store_key (str): The key to store the target pose in env state. Default is "target_pose".
+            The pose will be stored in env._{store_key} and should be exposed in info[store_key].
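+
+    Example:
+        A minimal sketch of an ``events`` entry (illustrative; the ``mode`` value
+        and surrounding layout follow the dict-config conventions used for other
+        functors, not a verified schema):
+
+        ```python
+        {
+            "func": "randomize_target_pose",
+            "mode": "reset",
+            "params": {
+                "position_range": [[-0.1, -0.1, 0.0], [0.1, 0.1, 0.0]],
+                "store_key": "goal_pose",
+            },
+        }
+        ```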
+ """ + num_instance = len(env_ids) + + # Get reference pose if needed + if relative_position or relative_rotation: + if reference_entity_cfg is not None: + # Get reference entity pose + ref_obj = env.sim.get_rigid_object(reference_entity_cfg.uid) + if ref_obj is not None: + ref_pose = ref_obj.get_local_pose(to_matrix=True)[env_ids] + init_pos = ref_pose[:, :3, 3] + init_rot = ref_pose[:, :3, :3] + else: + # Fallback to world origin + init_pos = torch.zeros(num_instance, 3, device=env.device) + init_rot = ( + torch.eye(3, device=env.device) + .unsqueeze(0) + .repeat(num_instance, 1, 1) + ) + else: + # Use world origin as reference + init_pos = torch.zeros(num_instance, 3, device=env.device) + init_rot = ( + torch.eye(3, device=env.device).unsqueeze(0).repeat(num_instance, 1, 1) + ) + else: + # Absolute randomization, init values won't be used + init_pos = torch.zeros(num_instance, 3, device=env.device) + init_rot = ( + torch.eye(3, device=env.device).unsqueeze(0).repeat(num_instance, 1, 1) + ) + + # Generate random pose + pose = get_random_pose( + init_pos=init_pos, + init_rot=init_rot, + position_range=position_range, + rotation_range=rotation_range, + relative_position=relative_position, + relative_rotation=relative_rotation, + ) + + # Store in env state (to be exposed via get_info) + state_attr = f"_{store_key}" + if not hasattr(env, state_attr): + setattr( + env, + state_attr, + torch.zeros(env.num_envs, 4, 4, device=env.device, dtype=torch.float32), + ) + + target_poses = getattr(env, state_attr) + target_poses[env_ids] = pose diff --git a/embodichain/lab/gym/envs/managers/reward_manager.py b/embodichain/lab/gym/envs/managers/reward_manager.py new file mode 100644 index 00000000..b3e46de7 --- /dev/null +++ b/embodichain/lab/gym/envs/managers/reward_manager.py @@ -0,0 +1,229 @@ +# ---------------------------------------------------------------------------- +# Copyright (c) 2021-2025 DexForce Technology Co., Ltd. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. +# ---------------------------------------------------------------------------- + +"""Reward manager for orchestrating reward computation in reinforcement learning tasks.""" + +from __future__ import annotations + +import inspect +import torch +from collections.abc import Sequence +from prettytable import PrettyTable +from typing import TYPE_CHECKING, Union + +from embodichain.utils import logger +from .manager_base import ManagerBase +from .cfg import RewardCfg + +if TYPE_CHECKING: + from embodichain.lab.gym.envs import EmbodiedEnv + + +class RewardManager(ManagerBase): + """Manager for orchestrating reward computation in reinforcement learning tasks. + + The reward manager computes rewards based on the current state of the environment and actions. + It supports multiple reward terms that can be combined through weighted summation. + + The reward manager offers two modes of operation: + - `add`: This mode computes a reward term and adds it to the total reward (weighted by the term's weight). 
+ - `replace`: This mode replaces the total reward with the computed value (useful for single reward functions). + + Note: The config key is used as the unique identifier and display name for each reward functor. + """ + + _env: EmbodiedEnv + """The environment instance.""" + + def __init__(self, cfg: object, env: EmbodiedEnv): + """Initialize the reward manager. + + Args: + cfg: A configuration object or dictionary (``dict[str, RewardCfg]``). + env: An environment object. + """ + + self._mode_functor_names: dict[str, list[str]] = dict() + self._mode_functor_cfgs: dict[str, list[RewardCfg]] = dict() + self._mode_class_functor_cfgs: dict[str, list[RewardCfg]] = dict() + + # call the base class (this will parse the functors config) + super().__init__(cfg, env) + + def __str__(self) -> str: + """Returns: A string representation for reward manager.""" + functor_num = sum(len(v) for v in self._mode_functor_names.values()) + msg = f" contains {functor_num} active reward terms.\n" + + # add info on each mode + for mode in self._mode_functor_names: + # create table for functor information + table = PrettyTable() + table.title = f"Active Reward Terms in Mode: '{mode}'" + + table.field_names = ["Index", "Name", "Weight"] + table.align["Name"] = "l" + for index, name in enumerate(self._mode_functor_names[mode]): + functor_cfg = self._mode_functor_cfgs[mode][index] + weight = getattr(functor_cfg, "weight", 1.0) + table.add_row([index, name, f"{weight:.3f}"]) + + # convert table to string + msg += table.get_string() + msg += "\n" + return msg + + @property + def active_functors(self) -> dict[str, list[str]]: + """Name of active reward functors. + + The keys are the modes of reward computation and the values are the names of the reward functors. + """ + return self._mode_functor_names + + def reset(self, env_ids: Union[Sequence[int], None] = None) -> dict[str, float]: + """Reset reward terms that are stateful (implemented as classes). + + Args: + env_ids: The environment indices to reset. If None, all environments are reset. + + Returns: + An empty dictionary (no logging needed for reset). + """ + # call all functors that are classes + for mode_cfg in self._mode_class_functor_cfgs.values(): + for functor_cfg in mode_cfg: + functor_cfg.func.reset(env_ids=env_ids) + + # nothing to log here + return {} + + def compute( + self, + obs: "EnvObs", + action: "EnvAction", + info: dict, + ) -> tuple[torch.Tensor, dict[str, torch.Tensor]]: + """Compute the total reward by calling each reward functor. + + This function iterates over all the reward functors and calls them to compute individual + reward terms. The terms are then combined according to their mode and weight. + + Args: + obs: The observation from the environment. + action: The action applied to the robot. + info: Additional information dictionary. + + Returns: + A tuple containing: + - total_reward: The total reward for each environment (shape: [num_envs]). + - reward_info: A dictionary mapping reward term names to their values for logging. + + Raises: + ValueError: If the mode is not supported. 
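+
+        Note:
+            Terms are evaluated grouped by mode (in first-seen order) and, within
+            a mode, in declaration order; a ``replace`` term overwrites the total
+            accumulated so far. ``reward_info`` always logs the unweighted
+            per-term values.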
+ """ + # initialize total reward + total_reward = torch.zeros(self._env.num_envs, device=self._env.device) + reward_info = {} + + # iterate over all the reward functors + for mode, functor_cfgs in self._mode_functor_cfgs.items(): + for functor_name, functor_cfg in zip( + self._mode_functor_names[mode], functor_cfgs + ): + functor_cfg: RewardCfg + + # compute reward term + reward_term = functor_cfg.func( + self._env, obs=obs, action=action, info=info, **functor_cfg.params + ) + + # ensure reward is a tensor + if not isinstance(reward_term, torch.Tensor): + reward_term = torch.tensor( + reward_term, device=self._env.device, dtype=torch.float32 + ) + + # apply weight from config + weighted_reward = reward_term * functor_cfg.weight + + # combine reward based on mode + if mode == "add": + total_reward += weighted_reward + elif mode == "replace": + total_reward = weighted_reward + else: + logger.log_error(f"Unsupported reward mode '{mode}'.") + + # store for logging (use unweighted value for clarity) + reward_info[functor_name] = reward_term + + return total_reward, reward_info + + def get_functor_cfg(self, functor_name: str) -> RewardCfg: + """Gets the configuration for the specified functor. + + The method finds the functor by name by searching through all the modes. + It then returns the configuration of the functor with the first matching name. + + Args: + functor_name: The name of the reward functor. + + Returns: + The configuration of the reward functor. + + Raises: + ValueError: If the functor name is not found. + """ + for mode, functors in self._mode_functor_names.items(): + if functor_name in functors: + return self._mode_functor_cfgs[mode][functors.index(functor_name)] + logger.log_error(f"Reward functor '{functor_name}' not found.") + + def _prepare_functors(self): + # check if config is dict already + if isinstance(self.cfg, dict): + cfg_items = self.cfg.items() + else: + cfg_items = self.cfg.__dict__.items() + # iterate over all the functors + for functor_name, functor_cfg in cfg_items: + # check for non config + if functor_cfg is None: + continue + # check for valid config type + if not isinstance(functor_cfg, RewardCfg): + raise TypeError( + f"Configuration for the functor '{functor_name}' is not of type RewardCfg." + f" Received: '{type(functor_cfg)}'." + ) + + # resolve common parameters + self._resolve_common_functor_cfg(functor_name, functor_cfg, min_argc=4) + + # check if mode is a new mode + if functor_cfg.mode not in self._mode_functor_names: + # add new mode + self._mode_functor_names[functor_cfg.mode] = list() + self._mode_functor_cfgs[functor_cfg.mode] = list() + self._mode_class_functor_cfgs[functor_cfg.mode] = list() + # add functor name and parameters + self._mode_functor_names[functor_cfg.mode].append(functor_name) + self._mode_functor_cfgs[functor_cfg.mode].append(functor_cfg) + + # check if the functor is a class + if inspect.isclass(functor_cfg.func): + self._mode_class_functor_cfgs[functor_cfg.mode].append(functor_cfg) diff --git a/embodichain/lab/gym/envs/managers/rewards.py b/embodichain/lab/gym/envs/managers/rewards.py new file mode 100644 index 00000000..5cfbef55 --- /dev/null +++ b/embodichain/lab/gym/envs/managers/rewards.py @@ -0,0 +1,632 @@ +# ---------------------------------------------------------------------------- +# Copyright (c) 2021-2025 DexForce Technology Co., Ltd. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. 
+# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. +# ---------------------------------------------------------------------------- + +"""Common reward functors for reinforcement learning tasks.""" + +from __future__ import annotations + +import torch +from typing import TYPE_CHECKING + +from embodichain.lab.gym.envs.managers.cfg import SceneEntityCfg + +if TYPE_CHECKING: + from embodichain.lab.gym.envs import EmbodiedEnv + + +def distance_between_objects( + env: EmbodiedEnv, + obs: dict, + action: torch.Tensor, + info: dict, + source_entity_cfg: SceneEntityCfg = None, + target_entity_cfg: SceneEntityCfg = None, + exponential: bool = False, + sigma: float = 1.0, +) -> torch.Tensor: + """Reward based on distance between two rigid objects. + + Encourages the source object to get closer to the target object. Can use either + linear negative distance or exponential Gaussian-shaped reward. + + Args: + source_entity_cfg: Configuration for the source object (e.g., {"uid": "cube"}) + target_entity_cfg: Configuration for the target object (e.g., {"uid": "goal_sphere"}) + exponential: If True, use exponential reward exp(-d²/2σ²), else use -distance + sigma: Standard deviation for exponential reward (controls reward spread) + + Returns: + Reward tensor of shape (num_envs,). Higher when objects are closer. + - Linear mode: ranges from -inf to 0 (0 when objects touch) + - Exponential mode: ranges from 0 to 1 (1 when objects touch) + + Example: + ```json + { + "func": "distance_between_objects", + "weight": 0.5, + "params": { + "source_entity_cfg": {"uid": "cube"}, + "target_entity_cfg": {"uid": "target"}, + "exponential": true, + "sigma": 0.2 + } + } + ``` + """ + # get source entity position + source_obj = env.sim.get_rigid_object(source_entity_cfg.uid) + source_pos = source_obj.get_local_pose(to_matrix=True)[:, :3, 3] + + # get target entity position + target_obj = env.sim.get_rigid_object(target_entity_cfg.uid) + target_pos = target_obj.get_local_pose(to_matrix=True)[:, :3, 3] + + # compute distance + distance = torch.norm(source_pos - target_pos, dim=-1) + + # compute reward + if exponential: + # exponential reward: exp(-distance^2 / (2 * sigma^2)) + reward = torch.exp(-(distance**2) / (2 * sigma**2)) + else: + # negative distance reward + reward = -distance + + return reward + + +def joint_velocity_penalty( + env: EmbodiedEnv, + obs: dict, + action: torch.Tensor, + info: dict, + robot_uid: str = "robot", + joint_ids: slice | list[int] | None = None, + part_name: str | None = None, +) -> torch.Tensor: + """Penalize high joint velocities to encourage smooth motion. + + Computes the L2 norm of joint velocities and returns negative value as penalty. + Useful for preventing jerky or unstable robot movements. + + Args: + robot_uid: Robot entity UID in simulation (default: "robot") + joint_ids: Specific joint indices to penalize. Takes priority over part_name. + Example: [0, 1, 2] or slice(0, 6) + part_name: Control part name (e.g., "arm"). Used only if joint_ids is None. + Will penalize all joints in the specified part. + + Returns: + Penalty tensor of shape (num_envs,). Always negative or zero. 
+ Magnitude increases with joint velocity (larger velocity = more negative). + + Example: + ```json + { + "func": "joint_velocity_penalty", + "weight": 0.001, + "params": { + "robot_uid": "robot", + "part_name": "arm" + } + } + ``` + """ + robot = env.sim.get_robot(robot_uid) + + # get joint velocities + if joint_ids is not None: + qvel = robot.get_qvel()[:, joint_ids] + elif part_name is not None: + qvel = robot.get_qvel(name=part_name) + else: + qvel = robot.get_qvel() + + # compute L2 norm of joint velocities + velocity_norm = torch.norm(qvel, dim=-1) + + # negative penalty (higher velocity = more negative reward) + return -velocity_norm + + +def action_smoothness_penalty( + env: EmbodiedEnv, + obs: dict, + action: torch.Tensor, + info: dict, +) -> torch.Tensor: + """Penalize large action changes between consecutive timesteps. + + Encourages smooth control commands by penalizing sudden changes in actions. + Stores previous action in env._reward_states for comparison. + + Returns: + Penalty tensor of shape (num_envs,). Zero on first call (no previous action), + negative on subsequent calls (larger change = more negative). + + Note: + This function maintains state across calls using env._reward_states['prev_actions']. + State is automatically reset when the environment resets. + + Example: + ```json + { + "func": "action_smoothness_penalty", + "weight": 0.01, + "params": {} + } + ``` + """ + # Use dictionary-based state management + if not hasattr(env, "_reward_states"): + env._reward_states = {} + + # compute difference between current and previous action + if "prev_actions" in env._reward_states: + action_diff = action - env._reward_states["prev_actions"] + penalty = -torch.norm(action_diff, dim=-1) + else: + # no previous action, no penalty + penalty = torch.zeros(env.num_envs, device=env.device) + + # store current action for next step + env._reward_states["prev_actions"] = action.clone() + + return penalty + + +def joint_limit_penalty( + env: EmbodiedEnv, + obs: dict, + action: torch.Tensor, + info: dict, + robot_uid: str = "robot", + joint_ids: slice | list[int] = slice(None), + margin: float = 0.1, +) -> torch.Tensor: + """Penalize robot joints that are close to their position limits. + + Prevents joints from reaching their physical limits, which can cause instability + or singularities. Penalty increases as joints approach limits within the margin. + + Args: + robot_uid: Robot entity UID in simulation (default: "robot") + joint_ids: Joint indices to monitor (default: all joints) + margin: Normalized distance threshold (0 to 1). Penalty applied when joint + is within this fraction of its range from either limit. + Example: 0.1 means penalty when within 10% of limits. + + Returns: + Penalty tensor of shape (num_envs,). Always negative or zero. + Sum of penalties across all monitored joints. 
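+        For instance, with ``margin=0.1`` a joint resting exactly on a limit
+        contributes ``-0.1``, while a joint at 95% of its normalized range
+        contributes ``-(0.1 - 0.05) = -0.05``.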
+ + Example: + ```json + { + "func": "joint_limit_penalty", + "weight": 0.01, + "params": { + "robot_uid": "robot", + "joint_ids": [0, 1, 2, 3, 4, 5], + "margin": 0.1 + } + } + ``` + """ + robot = env.sim.get_robot(robot_uid) + + # get joint positions and limits + qpos = robot.get_qpos()[:, joint_ids] + qpos_limits = robot.get_qpos_limits()[:, joint_ids, :] + + # compute normalized position in range [0, 1] + qpos_normalized = (qpos - qpos_limits[:, :, 0]) / ( + qpos_limits[:, :, 1] - qpos_limits[:, :, 0] + ) + + # compute distance to limits (minimum of distance to lower and upper limit) + dist_to_lower = qpos_normalized + dist_to_upper = 1.0 - qpos_normalized + dist_to_limit = torch.min(dist_to_lower, dist_to_upper) + + # penalize joints within margin of limits + penalty_mask = dist_to_limit < margin + penalty = torch.where( + penalty_mask, + -(margin - dist_to_limit), # negative penalty + torch.zeros_like(dist_to_limit), + ) + + # sum over all joints + return penalty.sum(dim=-1) + + +def orientation_alignment( + env: EmbodiedEnv, + obs: dict, + action: torch.Tensor, + info: dict, + source_entity_cfg: SceneEntityCfg = None, + target_entity_cfg: SceneEntityCfg = None, +) -> torch.Tensor: + """Reward rotational alignment between two rigid objects. + + Encourages the source object's orientation to match the target object's orientation. + Uses rotation matrix trace to measure alignment. + + Args: + source_entity_cfg: Configuration for the source object (e.g., {"uid": "cube"}) + target_entity_cfg: Configuration for the target object (e.g., {"uid": "reference"}) + + Returns: + Reward tensor of shape (num_envs,). Ranges from -1 to 1. + - 1.0: Perfect alignment (same orientation) + - 0.0: 90° rotation difference + - -1.0: 180° rotation difference (opposite orientation) + + Example: + ```json + { + "func": "orientation_alignment", + "weight": 0.5, + "params": { + "source_entity_cfg": {"uid": "object"}, + "target_entity_cfg": {"uid": "goal_object"} + } + } + ``` + """ + # get source entity rotation matrix + source_obj = env.sim.get_rigid_object(source_entity_cfg.uid) + source_rot = source_obj.get_local_pose(to_matrix=True)[:, :3, :3] + + # get target entity rotation matrix + target_obj = env.sim.get_rigid_object(target_entity_cfg.uid) + target_rot = target_obj.get_local_pose(to_matrix=True)[:, :3, :3] + + # compute rotation difference + rot_diff = torch.bmm(source_rot, target_rot.transpose(-1, -2)) + + # trace of rotation matrix difference (measure of alignment) + # trace = 1 + 2*cos(theta) for rotation by angle theta + # normalized to range [0, 1] where 1 is perfect alignment + trace = rot_diff.diagonal(dim1=-2, dim2=-1).sum(-1) + alignment = (trace - 1.0) / 2.0 # normalize to [-1, 1] + + return alignment + + +def success_reward( + env: EmbodiedEnv, + obs: dict, + action: torch.Tensor, + info: dict, +) -> torch.Tensor: + """Sparse bonus reward when task succeeds. + + Provides a fixed reward when the task success condition is met. + Reads success status from info['success'] which should be set by the environment. + + Returns: + Reward tensor of shape (num_envs,). + - 1.0 when successful + - 0.0 when not successful or if 'success' key missing + + Note: + The environment's get_info() must populate info['success'] with a boolean + tensor indicating success status for each environment. 
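+
+        A minimal environment-side sketch of populating the flag follows;
+        ``distance`` and ``threshold`` are illustrative names, not part of the API:
+
+        ```python
+        def get_info(self, **kwargs):
+            # "success" must be a boolean tensor of shape (num_envs,)
+            return {"success": distance < threshold}
+        ```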
+ + Example: + ```json + { + "func": "success_reward", + "weight": 10.0, + "params": {} + } + ``` + """ + # Check if success info is available in info dict + if "success" in info: + success = info["success"] + if isinstance(success, bool): + success = torch.tensor([success], device=env.device, dtype=torch.bool) + elif not isinstance(success, torch.Tensor): + success = torch.tensor(success, device=env.device, dtype=torch.bool) + else: + # No success info available + return torch.zeros(env.num_envs, device=env.device) + + # return reward + return torch.where( + success, + torch.ones(env.num_envs, device=env.device), + torch.zeros(env.num_envs, device=env.device), + ) + + +def reaching_behind_object( + env: EmbodiedEnv, + obs: dict, + action: torch.Tensor, + info: dict, + object_cfg: SceneEntityCfg = None, + target_pose_key: str = "goal_pose", + behind_offset: float = 0.015, + height_offset: float = 0.015, + distance_scale: float = 5.0, + part_name: str = None, +) -> torch.Tensor: + """Reward for positioning end-effector behind object for pushing. + + Encourages the robot's end-effector to reach a position behind the object along + the object-to-goal direction. Useful for push manipulation tasks. + + Args: + object_cfg: Configuration for the object to push (e.g., {"uid": "cube"}) + target_pose_key: Key in info dict for goal pose (default: "goal_pose") + Can be (num_envs, 3) position or (num_envs, 4, 4) transform + behind_offset: Distance behind object to reach (in meters, default: 0.015) + height_offset: Additional height above object (in meters, default: 0.015) + distance_scale: Scaling factor for tanh function (higher = steeper, default: 5.0) + part_name: Robot part name for FK computation (e.g., "arm") + + Returns: + Reward tensor of shape (num_envs,). Ranges from 0 to 1. + - 1.0: End-effector at ideal pushing position + - 0.0: End-effector far from ideal position + + Example: + ```json + { + "func": "reaching_behind_object", + "weight": 0.1, + "params": { + "object_cfg": {"uid": "cube"}, + "target_pose_key": "goal_pose", + "behind_offset": 0.015, + "height_offset": 0.015, + "distance_scale": 5.0, + "part_name": "arm" + } + } + ``` + """ + # get end effector position from robot FK + robot = env.robot + joint_ids = robot.get_joint_ids(part_name) + qpos = robot.get_qpos()[:, joint_ids] + ee_pose = robot.compute_fk(name=part_name, qpos=qpos, to_matrix=True) + ee_pos = ee_pose[:, :3, 3] + + # get object position + obj = env.sim.get_rigid_object(object_cfg.uid) + obj_pos = obj.get_local_pose(to_matrix=True)[:, :3, 3] + + # get goal position from info + if target_pose_key not in info: + raise ValueError( + f"Target pose '{target_pose_key}' not found in info dict. " + f"Make sure to provide it in get_info()." 
+ ) + + target_poses = info[target_pose_key] + if target_poses.dim() == 2: # (num_envs, 3) + goal_pos = target_poses + else: # (num_envs, 4, 4) + goal_pos = target_poses[:, :3, 3] + + # compute push direction (from object to goal) + push_direction = goal_pos - obj_pos + push_dir_norm = torch.norm(push_direction, dim=-1, keepdim=True) + 1e-6 + push_dir_normalized = push_direction / push_dir_norm + + # compute target "behind" position + height_vec = torch.tensor( + [0, 0, height_offset], device=env.device, dtype=torch.float32 + ) + target_pos = obj_pos - behind_offset * push_dir_normalized + height_vec + + # distance to target position + ee_to_target_dist = torch.norm(ee_pos - target_pos, dim=-1) + + # tanh-shaped reward (1.0 when at target, 0.0 when far) + reward = 1.0 - torch.tanh(distance_scale * ee_to_target_dist) + + return reward + + +def distance_to_target( + env: "EmbodiedEnv", + obs: dict, + action: torch.Tensor, + info: dict, + source_entity_cfg: SceneEntityCfg = None, + target_pose_key: str = "target_pose", + exponential: bool = False, + sigma: float = 1.0, + use_xy_only: bool = False, +) -> torch.Tensor: + """Reward based on absolute distance to a virtual target pose. + + Encourages an object to get closer to a target pose specified in the info dict. + Unlike incremental_distance_to_target, this provides direct distance-based reward. + + Args: + source_entity_cfg: Configuration for the object (e.g., {"uid": "cube"}) + target_pose_key: Key in info dict for target pose (default: "target_pose") + Can be (num_envs, 3) position or (num_envs, 4, 4) transform + exponential: If True, use exponential reward exp(-d²/2σ²), else use -distance + sigma: Standard deviation for exponential reward (default: 1.0) + use_xy_only: If True, ignore z-axis and only consider horizontal distance + + Returns: + Reward tensor of shape (num_envs,). + - Linear mode: -distance (negative, approaches 0 when close) + - Exponential mode: exp(-d²/2σ²) (0 to 1, approaches 1 when close) + + Example: + ```json + { + "func": "distance_to_target", + "weight": 0.5, + "params": { + "source_entity_cfg": {"uid": "cube"}, + "target_pose_key": "goal_pose", + "exponential": false, + "use_xy_only": true + } + } + ``` + """ + # get source entity position + source_obj = env.sim.get_rigid_object(source_entity_cfg.uid) + source_pos = source_obj.get_local_pose(to_matrix=True)[:, :3, 3] + + # get target position from info + if target_pose_key not in info: + raise ValueError( + f"Target pose '{target_pose_key}' not found in info dict. " + f"Make sure to provide it in get_info()." 
+ ) + + target_poses = info[target_pose_key] + if target_poses.dim() == 2: # (num_envs, 3) + target_pos = target_poses + else: # (num_envs, 4, 4) + target_pos = target_poses[:, :3, 3] + + # compute distance + if use_xy_only: + distance = torch.norm(source_pos[:, :2] - target_pos[:, :2], dim=-1) + else: + distance = torch.norm(source_pos - target_pos, dim=-1) + + # compute reward + if exponential: + # exponential reward: exp(-distance^2 / (2 * sigma^2)) + reward = torch.exp(-(distance**2) / (2 * sigma**2)) + else: + # negative distance reward + reward = -distance + + return reward + + +def incremental_distance_to_target( + env: "EmbodiedEnv", + obs: dict, + action: torch.Tensor, + info: dict, + source_entity_cfg: SceneEntityCfg = None, + target_pose_key: str = "target_pose", + tanh_scale: float = 10.0, + positive_weight: float = 1.0, + negative_weight: float = 1.0, + use_xy_only: bool = False, +) -> torch.Tensor: + """Incremental reward for progress toward a virtual target pose. + + Rewards the robot for getting closer to the target compared to previous timestep. + Stores previous distance in env._reward_states for comparison. Uses tanh shaping + to normalize rewards and supports asymmetric weighting for approach vs. retreat. + + Args: + source_entity_cfg: Configuration for the object (e.g., {"uid": "cube"}) + target_pose_key: Key in info dict for target pose (default: "target_pose") + Can be (num_envs, 3) position or (num_envs, 4, 4) transform + tanh_scale: Scaling for tanh normalization (higher = more sensitive, default: 10.0) + positive_weight: Multiplier for reward when getting closer (default: 1.0) + negative_weight: Multiplier for penalty when moving away (default: 1.0) + use_xy_only: If True, ignore z-axis and only consider horizontal distance + + Returns: + Reward tensor of shape (num_envs,). Zero on first call, then: + - Positive when getting closer (scaled by positive_weight) + - Negative when moving away (scaled by negative_weight) + - Magnitude bounded by tanh function + + Note: + This function maintains state using env._reward_states[f"prev_dist_{uid}_{key}"]. + State is automatically reset when the environment resets. + + Example: + ```json + { + "func": "incremental_distance_to_target", + "weight": 1.0, + "params": { + "source_entity_cfg": {"uid": "cube"}, + "target_pose_key": "goal_pose", + "tanh_scale": 10.0, + "positive_weight": 2.0, + "negative_weight": 0.5, + "use_xy_only": true + } + } + ``` + """ + # get source entity position + source_obj = env.sim.get_rigid_object(source_entity_cfg.uid) + source_pos = source_obj.get_local_pose(to_matrix=True)[:, :3, 3] + + # get target position from info + if target_pose_key not in info: + raise ValueError( + f"Target pose '{target_pose_key}' not found in info dict. " + f"Make sure to provide it in get_info()." 
+ ) + + target_poses = info[target_pose_key] + if target_poses.dim() == 2: # (num_envs, 3) + target_pos = target_poses + else: # (num_envs, 4, 4) + target_pos = target_poses[:, :3, 3] + + # compute current distance + if use_xy_only: + current_dist = torch.norm(source_pos[:, :2] - target_pos[:, :2], dim=-1) + else: + current_dist = torch.norm(source_pos - target_pos, dim=-1) + + # initialize previous distance on first call + # Use dictionary-based state management for better organization + if not hasattr(env, "_reward_states"): + env._reward_states = {} + + state_key = f"prev_dist_{source_entity_cfg.uid}_{target_pose_key}" + if state_key not in env._reward_states: + env._reward_states[state_key] = current_dist.clone() + return torch.zeros(env.num_envs, device=env.device) + + # compute distance delta (positive = getting closer) + prev_dist = env._reward_states[state_key] + distance_delta = prev_dist - current_dist + + # apply tanh shaping + distance_delta_normalized = torch.tanh(tanh_scale * distance_delta) + + # asymmetric weighting + reward = torch.where( + distance_delta_normalized >= 0, + positive_weight * distance_delta_normalized, + negative_weight * distance_delta_normalized, + ) + + # update previous distance + env._reward_states[state_key] = current_dist.clone() + + return reward diff --git a/embodichain/lab/gym/envs/rl_env_cfg.py b/embodichain/lab/gym/envs/rl_env_cfg.py deleted file mode 100644 index 34448747..00000000 --- a/embodichain/lab/gym/envs/rl_env_cfg.py +++ /dev/null @@ -1,33 +0,0 @@ -# ---------------------------------------------------------------------------- -# Copyright (c) 2021-2025 DexForce Technology Co., Ltd. -# -# Licensed under the Apache License, Version 2.0 (the "License"); -# you may not use this file except in compliance with the License. -# You may obtain a copy of the License at -# -# http://www.apache.org/licenses/LICENSE-2.0 -# -# Unless required by applicable law or agreed to in writing, software -# distributed under the License is distributed on an "AS IS" BASIS, -# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. -# See the License for the specific language governing permissions and -# limitations under the License. 
-# ---------------------------------------------------------------------------- - -from typing import Any, Dict - -from embodichain.lab.gym.envs.embodied_env import EmbodiedEnvCfg -from embodichain.utils import configclass - - -@configclass -class RLEnvCfg(EmbodiedEnvCfg): - """Extended configuration for RL environments built from gym-style specs.""" - - env_id: str = "" - extensions: Dict[str, Any] = {} - - @classmethod - def from_dict(cls, d): - """Create an instance from a dictionary.""" - return cls(**d) diff --git a/embodichain/lab/gym/envs/tasks/rl/__init__.py b/embodichain/lab/gym/envs/tasks/rl/__init__.py index f8cf3031..be52afc3 100644 --- a/embodichain/lab/gym/envs/tasks/rl/__init__.py +++ b/embodichain/lab/gym/envs/tasks/rl/__init__.py @@ -18,10 +18,10 @@ from copy import deepcopy from embodichain.lab.gym.utils import registration as env_registry -from embodichain.lab.gym.envs.rl_env_cfg import RLEnvCfg +from embodichain.lab.gym.envs.embodied_env import EmbodiedEnvCfg -def build_env(env_id: str, base_env_cfg: RLEnvCfg): +def build_env(env_id: str, base_env_cfg: EmbodiedEnvCfg): """Create env from registry id, auto-inferring cfg class (EnvName -> EnvNameCfg).""" env = env_registry.make(env_id, cfg=deepcopy(base_env_cfg)) return env diff --git a/embodichain/lab/gym/envs/tasks/rl/push_cube.py b/embodichain/lab/gym/envs/tasks/rl/push_cube.py index 4aef16c8..072574fc 100644 --- a/embodichain/lab/gym/envs/tasks/rl/push_cube.py +++ b/embodichain/lab/gym/envs/tasks/rl/push_cube.py @@ -38,27 +38,13 @@ def __init__(self, cfg=None, **kwargs): if cfg is None: cfg = EmbodiedEnvCfg() - extensions = getattr(cfg, "extensions", {}) or {} - - # cfg.sim_cfg.enable_rt = True - - defaults = { - "success_threshold": 0.1, - "reaching_reward_weight": 0.1, - "place_reward_weight": 2.0, - "place_penalty_weight": 0.5, - "action_penalty_weight": 0.01, - "success_bonus_weight": 10.0, - } - for name, default in defaults.items(): - value = extensions.get(name, getattr(cfg, name, default)) - setattr(cfg, name, value) - setattr(self, name, getattr(cfg, name)) - - self.last_cube_goal_dist = None - super().__init__(cfg, **kwargs) + @property + def goal_pose(self) -> torch.Tensor: + """Get current goal poses (4x4 matrices) for all environments.""" + return self._goal_pose + def _draw_goal_marker(self): """Draw axis marker at goal position for visualization.""" goal_sphere = self.sim.get_rigid_object("goal_sphere") @@ -87,32 +73,10 @@ def _draw_goal_marker(self): ) self.sim.draw_marker(cfg=marker_cfg) - def _init_sim_state(self, **kwargs): - super()._init_sim_state(**kwargs) - self.single_action_space = spaces.Box( - low=-self.joint_limits, - high=self.joint_limits, - shape=(6,), - dtype=np.float32, - ) - if self.obs_mode == "state": - self.single_observation_space = spaces.Box( - low=-np.inf, high=np.inf, shape=(15,), dtype=np.float32 - ) - def _initialize_episode( self, env_ids: Sequence[int] | None = None, **kwargs ) -> None: super()._initialize_episode(env_ids=env_ids, **kwargs) - cube = self.sim.get_rigid_object("cube") - - # Calculate previous distance (for incremental reward) based on current (possibly randomized) pose - cube_pos = cube.body_data.pose[:, :3] - goal_sphere = self.sim.get_rigid_object("goal_sphere") - goal_pos = goal_sphere.body_data.pose[ - :, :3 - ] # Get actual goal positions for each environment - self.last_cube_goal_dist = torch.norm(cube_pos[:, :2] - goal_pos[:, :2], dim=1) # Draw marker at goal position # self._draw_goal_marker() @@ -128,84 +92,29 @@ def _step_action(self, action: 
EnvAction) -> EnvAction: self.robot.set_qpos(qpos=target_qpos) return scaled_action - def get_obs(self, **kwargs) -> EnvObs: - qpos_all = self.robot.body_data.qpos[:, :6] - ee_pose_matrix = self.robot.compute_fk( - name="arm", qpos=qpos_all, to_matrix=True - ) - ee_pos_all = ee_pose_matrix[:, :3, 3] + def get_info(self, **kwargs) -> Dict[str, Any]: cube = self.sim.get_rigid_object("cube") - cube_pos_all = cube.body_data.pose[:, :3] - # Get actual goal positions for each environment - goal_sphere = self.sim.get_rigid_object("goal_sphere") - goal_pos_all = goal_sphere.body_data.pose[:, :3] - if self.obs_mode == "state": - return torch.cat([qpos_all, ee_pos_all, cube_pos_all, goal_pos_all], dim=1) - return { - "robot": {"qpos": qpos_all, "ee_pos": ee_pos_all}, - "object": {"cube_pos": cube_pos_all, "goal_pos": goal_pos_all}, - } + cube_pos = cube.body_data.pose[:, :3] - def get_reward( - self, obs: EnvObs, action: EnvAction, info: Dict[str, Any] - ) -> torch.Tensor: - if self.obs_mode == "state": - ee_pos = obs[:, 6:9] - cube_pos = obs[:, 9:12] - goal_pos = obs[:, 12:15] + # Get goal position from event-managed goal pose + if self.goal_pose is not None: + goal_pos = self.goal_pose[:, :3, 3] + xy_distance = torch.norm(cube_pos[:, :2] - goal_pos[:, :2], dim=1) + is_success = xy_distance < self.success_threshold else: - ee_pos = obs["robot"]["ee_pos"] - cube_pos = obs["object"]["cube_pos"] - goal_pos = obs["object"]["goal_pos"] - push_direction = goal_pos - cube_pos - push_dir_norm = torch.norm(push_direction, dim=1, keepdim=True) + 1e-6 - push_dir_normalized = push_direction / push_dir_norm - push_pose = ( - cube_pos - - 0.015 * push_dir_normalized - + torch.tensor([0, 0, 0.015], device=self.device, dtype=torch.float32) - ) - ee_to_push_dist = torch.norm(ee_pos - push_pose, dim=1) - reaching_reward_raw = 1.0 - torch.tanh(5.0 * ee_to_push_dist) - reaching_reward = self.reaching_reward_weight * reaching_reward_raw - cube_to_goal_dist = torch.norm(cube_pos[:, :2] - goal_pos[:, :2], dim=1) - distance_delta = 10.0 * (self.last_cube_goal_dist - cube_to_goal_dist) - distance_delta_normalized = torch.tanh(distance_delta) - place_reward = torch.where( - distance_delta_normalized >= 0, - self.place_reward_weight * distance_delta_normalized, - self.place_penalty_weight * distance_delta_normalized, - ) - self.last_cube_goal_dist = cube_to_goal_dist - action_magnitude = torch.norm(action, dim=1) - action_penalty = -self.action_penalty_weight * action_magnitude - success_bonus_raw = info["success"].float() - success_bonus = self.success_bonus_weight * success_bonus_raw - reward = reaching_reward + place_reward + action_penalty + success_bonus - # Organize reward components in a dedicated "rewards" dict - # This allows trainer to easily identify and log reward components - if "rewards" not in info: - info["rewards"] = {} - info["rewards"]["reaching_reward"] = reaching_reward - info["rewards"]["place_reward"] = place_reward - info["rewards"]["action_penalty"] = action_penalty - info["rewards"]["success_bonus"] = success_bonus - return reward + # Goal not yet set by randomize_target_pose event (e.g., before first reset) + xy_distance = torch.zeros(self.cfg.num_envs, device=self.device) + is_success = torch.zeros( + self.cfg.num_envs, device=self.device, dtype=torch.bool + ) - def get_info(self, **kwargs) -> Dict[str, Any]: - cube = self.sim.get_rigid_object("cube") - cube_pos = cube.body_data.pose[:, :3] - # Get actual goal positions for each environment - goal_sphere = self.sim.get_rigid_object("goal_sphere") 
- goal_pos = goal_sphere.body_data.pose[:, :3] - xy_distance = torch.norm(cube_pos[:, :2] - goal_pos[:, :2], dim=1) - is_success = xy_distance < self.success_threshold info = { "success": is_success, "fail": torch.zeros( self.cfg.num_envs, device=self.device, dtype=torch.bool ), "elapsed_steps": self._elapsed_steps, + "goal_pose": self.goal_pose, } info["metrics"] = { "distance_to_goal": xy_distance, diff --git a/embodichain/lab/gym/envs/tasks/rl/push_t.py b/embodichain/lab/gym/envs/tasks/rl/push_t.py new file mode 100644 index 00000000..26768715 --- /dev/null +++ b/embodichain/lab/gym/envs/tasks/rl/push_t.py @@ -0,0 +1,149 @@ +# ---------------------------------------------------------------------------- +# Copyright (c) 2021-2025 DexForce Technology Co., Ltd. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. +# ---------------------------------------------------------------------------- + +from typing import Dict, Any, Sequence + +import torch + +from embodichain.lab.gym.utils.registration import register_env +from embodichain.lab.gym.envs import EmbodiedEnv, EmbodiedEnvCfg +from embodichain.lab.sim.cfg import MarkerCfg +from embodichain.lab.sim.types import EnvObs, EnvAction + + +@register_env("PushTRL", max_episode_steps=50, override=True) +class PushTEnv(EmbodiedEnv): + """Push-T task. + + The task requires pushing a T-shaped block to a goal position on the tabletop. + Success is defined by the object being within a planar distance threshold. 
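+
+    The goal pose is written into ``self._goal_pose`` by the ``randomize_target_pose``
+    event and exposed to reward functors via ``info["goal_pose"]`` in ``get_info()``.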
+ """ + + def __init__(self, cfg=None, **kwargs): + if cfg is None: + cfg = EmbodiedEnvCfg() + + super().__init__(cfg, **kwargs) + + self.control_part_name = getattr(self, "control_part_name", "arm") + self.require_on_table = getattr(self, "require_on_table", True) + self.table_height = getattr(self, "table_height", 0.0) + + @property + def goal_pose(self) -> torch.Tensor | None: + """Get current goal poses (4x4 matrices) for all environments.""" + return getattr(self, "_goal_pose", None) + + def _draw_goal_marker(self) -> None: + """Draw axis marker at goal position for visualization.""" + if self.goal_pose is None: + return + + goal_poses = self.goal_pose.detach().cpu().numpy() + for arena_idx in range(self.cfg.num_envs): + marker_name = f"goal_marker_{arena_idx}" + self.sim.remove_marker(marker_name) + + marker_cfg = MarkerCfg( + name=marker_name, + marker_type="axis", + axis_xpos=[goal_poses[arena_idx]], + axis_size=0.003, + axis_len=0.02, + arena_index=arena_idx, + ) + self.sim.draw_marker(cfg=marker_cfg) + + def _initialize_episode( + self, env_ids: Sequence[int] | None = None, **kwargs + ) -> None: + super()._initialize_episode(env_ids=env_ids, **kwargs) + # self._draw_goal_marker() + + def _step_action(self, action: EnvAction) -> EnvAction: + scaled_action = action * self.action_scale + scaled_action = torch.clamp( + scaled_action, -self.joint_limits, self.joint_limits + ) + current_qpos = self.robot.body_data.qpos + target_qpos = current_qpos.clone() + target_qpos[:, :6] += scaled_action[:, :6] + self.robot.set_qpos(qpos=target_qpos) + return scaled_action + + def _get_eef_pos(self) -> torch.Tensor: + """Get end-effector position using FK.""" + if self.control_part_name: + try: + joint_ids = self.robot.get_joint_ids(self.control_part_name) + qpos = self.robot.get_qpos()[:, joint_ids] + ee_pose = self.robot.compute_fk( + name=self.control_part_name, qpos=qpos, to_matrix=True + ) + except (ValueError, KeyError, AttributeError): + qpos = self.robot.get_qpos() + ee_pose = self.robot.compute_fk(qpos=qpos, to_matrix=True) + else: + qpos = self.robot.get_qpos() + ee_pose = self.robot.compute_fk(qpos=qpos, to_matrix=True) + return ee_pose[:, :3, 3] + + def get_info(self, **kwargs) -> Dict[str, Any]: + t_obj = self.sim.get_rigid_object("t") + t_pos = t_obj.body_data.pose[:, :3] + ee_pos = self._get_eef_pos() + + if self.goal_pose is not None: + goal_pos = self.goal_pose[:, :3, 3] + xy_distance = torch.norm(t_pos[:, :2] - goal_pos[:, :2], dim=1) + is_success = xy_distance < self.success_threshold + if self.require_on_table: + is_success = is_success & (t_pos[:, 2] >= self.table_height - 1e-3) + else: + xy_distance = torch.zeros(self.cfg.num_envs, device=self.device) + is_success = torch.zeros( + self.cfg.num_envs, device=self.device, dtype=torch.bool + ) + + ee_to_t = torch.norm(ee_pos - t_pos, dim=1) + info = { + "success": is_success, + "fail": torch.zeros( + self.cfg.num_envs, device=self.device, dtype=torch.bool + ), + "elapsed_steps": self._elapsed_steps, + "goal_pose": self.goal_pose, + } + info["metrics"] = { + "distance_to_goal": xy_distance, + "eef_to_t": ee_to_t, + "t_height": t_pos[:, 2], + } + return info + + def check_truncated(self, obs: EnvObs, info: Dict[str, Any]) -> torch.Tensor: + is_timeout = self._elapsed_steps >= self.episode_length + t_obj = self.sim.get_rigid_object("t") + t_pos = t_obj.body_data.pose[:, :3] + is_fallen = t_pos[:, 2] < -0.1 + return is_timeout | is_fallen + + def evaluate(self, **kwargs) -> Dict[str, Any]: + info = self.get_info(**kwargs) + return { 
+ "success": info["success"][0].item(), + "distance_to_goal": info["metrics"]["distance_to_goal"], + } diff --git a/embodichain/lab/gym/envs/tasks/tableware/scoop_ice.py b/embodichain/lab/gym/envs/tasks/tableware/scoop_ice.py index 6a3c4c50..8eb37ed4 100644 --- a/embodichain/lab/gym/envs/tasks/tableware/scoop_ice.py +++ b/embodichain/lab/gym/envs/tasks/tableware/scoop_ice.py @@ -99,7 +99,6 @@ def add_xpos_offset(self, arm_qpos: np.ndarray, offset: np.ndarray, is_left: boo return arm_qpos_offset_batch[0].to("cpu").numpy() def pack_qpos(self): - self.num_envs = self.sim.num_envs left_arm_qpos = self.trajectory["left_arm"] # [waypoint_num, dof] logger.log_info("Adding x and z offset to left arm trajectory...") left_arm_qpos = self.add_xpos_offset( diff --git a/embodichain/lab/gym/utils/gym_utils.py b/embodichain/lab/gym/utils/gym_utils.py index ebe06d6e..40ced87f 100644 --- a/embodichain/lab/gym/utils/gym_utils.py +++ b/embodichain/lab/gym/utils/gym_utils.py @@ -126,12 +126,12 @@ def batch(*args: Tuple[Union[np.ndarray, Dict]]): def to_tensor(array: Array, device: Device | None = None): """ - Maps any given sequence to a torch tensor on the CPU/GPU. If physx gpu is not enabled then we use CPU, otherwise GPU, unless specified + Maps any given sequence to a torch tensor on the CPU/GPU. If physics gpu is not enabled then we use CPU, otherwise GPU, unless specified by the device argument Args: array: The data to map to a tensor - device: The device to put the tensor on. By default this is None and to_tensor will put the device on the GPU if physx is enabled + device: The device to put the tensor on. By default this is None and to_tensor will put the device on the GPU if physics is enabled and CPU otherwise """ @@ -323,24 +323,6 @@ def cat_tensor_with_ids( return out -def config_to_rl_cfg(config: dict) -> "RLEnvCfg": - """Parse gym-style configuration dict into an RL-ready config object.""" - - from embodichain.lab.gym.envs.rl_env_cfg import RLEnvCfg - - # Use config_to_cfg to parse shared fields - env_cfg = config_to_cfg(config) - # Convert to RLEnvCfg if needed - if not isinstance(env_cfg, RLEnvCfg): - env_cfg = RLEnvCfg.from_dict(env_cfg.__dict__) - # RL-specific fields - env_cfg.env_id = config.get("id") - env_cfg.num_envs = config["env"].get("num_envs", env_cfg.num_envs) - env_cfg.extensions = deepcopy(config.get("env", {}).get("extensions", {})) - # Add any RL-specific parsing here - return env_cfg - - def config_to_cfg(config: dict) -> "EmbodiedEnvCfg": """Parser configuration file into cfgs for env initialization. @@ -364,6 +346,7 @@ def config_to_cfg(config: dict) -> "EmbodiedEnvCfg": SceneEntityCfg, EventCfg, ObservationCfg, + RewardCfg, DatasetFunctorCfg, ) from embodichain.utils import configclass @@ -452,8 +435,10 @@ class ComponentCfg: env_cfg.articulation.append(cfg) env_cfg.sim_steps_per_control = config["env"].get("sim_steps_per_control", 4) + env_cfg.extensions = deepcopy(config.get("env", {}).get("extensions", {})) + + # TODO: support more env events, eg, grasp pose generation, mesh preprocessing, etc. 
- # load dataset config env_cfg.dataset = ComponentCfg() if "dataset" in config["env"]: # Define modules to search for dataset functions @@ -471,8 +456,6 @@ class ComponentCfg: raise_if_not_found=True, ) - from embodichain.lab.gym.envs.managers import DatasetFunctorCfg - dataset = DatasetFunctorCfg( func=dataset_func, mode=dataset_params_modified["mode"], @@ -481,8 +464,6 @@ class ComponentCfg: setattr(env_cfg.dataset, dataset_name, dataset) - # TODO: support more env events, eg, grasp pose generation, mesh preprocessing, etc. - env_cfg.events = ComponentCfg() if "events" in config["env"]: # Define modules to search for event functions @@ -547,6 +528,47 @@ class ComponentCfg: setattr(env_cfg.observations, obs_name, observation) + env_cfg.rewards = ComponentCfg() + if "rewards" in config["env"]: + # Define modules to search for reward functions + reward_modules = [ + "embodichain.lab.gym.envs.managers.rewards", + ] + + for reward_name, reward_params in config["env"]["rewards"].items(): + reward_params_modified = deepcopy(reward_params) + + # Handle entity_cfg parameters + for param_key in [ + "entity_cfg", + "source_entity_cfg", + "target_entity_cfg", + "end_effector_cfg", + "object_cfg", + "goal_cfg", + "reference_entity_cfg", + ]: + if param_key in reward_params["params"]: + entity_cfg = SceneEntityCfg( + **reward_params_modified["params"][param_key] + ) + reward_params_modified["params"][param_key] = entity_cfg + + # Find the function from multiple modules using the utility function + reward_func = find_function_from_modules( + reward_params["func"], + reward_modules, + raise_if_not_found=True, + ) + + reward = RewardCfg( + func=reward_func, + mode=reward_params_modified["mode"], + params=reward_params_modified["params"], + ) + + setattr(env_cfg.rewards, reward_name, reward) + return env_cfg diff --git a/embodichain/lab/sim/cfg.py b/embodichain/lab/sim/cfg.py index cedb59ed..68dc6d41 100644 --- a/embodichain/lab/sim/cfg.py +++ b/embodichain/lab/sim/cfg.py @@ -19,7 +19,7 @@ import numpy as np import torch -from typing import Sequence, Union, Dict, Literal, List, Any +from typing import Sequence, Union, Dict, Literal, List, Any, Optional from dataclasses import field, MISSING from dexsim.types import ( @@ -253,7 +253,7 @@ def from_dict( class SoftbodyVoxelAttributesCfg: # voxel config triangle_remesh_resolution: int = 8 - """Resolution to remesh the softbody mesh before building physx collision mesh.""" + """Resolution to remesh the softbody mesh before building physics collision mesh.""" triangle_simplify_target: int = 0 """Simplify mesh faces to target value. 
Do nothing if this value is zero.""" diff --git a/embodichain/lab/sim/common.py b/embodichain/lab/sim/common.py index 4e5b9b33..8c11ab37 100644 --- a/embodichain/lab/sim/common.py +++ b/embodichain/lab/sim/common.py @@ -24,6 +24,7 @@ from embodichain.lab.sim.cfg import ObjectBaseCfg from embodichain.utils import logger +from copy import deepcopy T = TypeVar("T") @@ -57,7 +58,7 @@ def __init__( if entities is None or len(entities) == 0: logger.log_error("Invalid entities list: must not be empty.") - self.cfg = cfg.copy() + self.cfg = deepcopy(cfg) self.uid = self.cfg.uid if self.uid is None: logger.log_error("UID must be set in the configuration.") diff --git a/embodichain/lab/sim/objects/articulation.py b/embodichain/lab/sim/objects/articulation.py index e18eb79a..c23e187b 100644 --- a/embodichain/lab/sim/objects/articulation.py +++ b/embodichain/lab/sim/objects/articulation.py @@ -20,10 +20,11 @@ from dataclasses import dataclass from functools import cached_property -from typing import List, Sequence, Dict, Union, Tuple +from typing import List, Sequence, Dict, Union, Tuple, Optional from dexsim.engine import Articulation as _Articulation from dexsim.types import ( + ArticulationFlag, ArticulationGPUAPIWriteType, ArticulationGPUAPIReadType, ) @@ -1132,15 +1133,18 @@ def set_drive( drive_args["joint_friction"] = friction[i].cpu().numpy() self._entities[env_idx].set_drive(**drive_args) - def get_user_ids(self) -> torch.Tensor: + def get_user_ids(self, link_name: str | None = None) -> torch.Tensor: """Get the user ids of the articulation. + Args: + link_name: (str | None): The name of the link. If None, returns user ids for all links. + Returns: - torch.Tensor: The user ids of the articulation with shape (N, num_link). + torch.Tensor: The user ids of the articulation with shape (N, 1) for given link_name or (N, num_links) if link_name is None. """ return torch.as_tensor( np.array( - [entity.get_user_ids() for entity in self._entities], + [entity.get_user_ids(link_name) for entity in self._entities], ), dtype=torch.int32, device=self.device, @@ -1587,6 +1591,40 @@ def set_physical_visible( for link_name in link_names: self._entities[env_idx].set_physical_visible(visible, link_name) + def set_fix_base( + self, + fix: bool = True, + env_ids: Sequence[int] | None = None, + ) -> None: + """Set whether the base of the articulation is fixed. + + Args: + fix (bool, optional): Whether to fix the base. Defaults to True. + env_ids (Sequence[int] | None, optional): Environment indices. If None, then all indices are used. + """ + local_env_ids = self._all_indices if env_ids is None else env_ids + for i, env_idx in enumerate(local_env_ids): + self._entities[env_idx].set_articulation_flag( + ArticulationFlag.FIX_BASE, fix + ) + + def set_self_collision( + self, + enable: bool = False, + env_ids: Sequence[int] | None = None, + ) -> None: + """Set whether self-collision is enabled for the articulation. + + Args: + enable (bool, optional): Whether to enable self-collision. Defaults to True. + env_ids (Sequence[int] | None, optional): Environment indices. If None, then all indices are used. 
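+
+        Note:
+            This maps to ``ArticulationFlag.DISABLE_SELF_COLLISION`` with the
+            value ``not enable``, so passing ``enable=True`` turns self-collision on.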
+        """
+        local_env_ids = self._all_indices if env_ids is None else env_ids
+        for i, env_idx in enumerate(local_env_ids):
+            self._entities[env_idx].set_articulation_flag(
+                ArticulationFlag.DISABLE_SELF_COLLISION, not enable
+            )
+
     def destroy(self) -> None:
         env = self._world.get_env()
         arenas = env.get_all_arenas()
diff --git a/embodichain/lab/sim/objects/soft_object.py b/embodichain/lab/sim/objects/soft_object.py
index be9bd57d..116925e7 100644
--- a/embodichain/lab/sim/objects/soft_object.py
+++ b/embodichain/lab/sim/objects/soft_object.py
@@ -340,7 +340,7 @@ def reset(self, env_ids: Sequence[int] | None = None) -> None:
         local_env_ids = self._all_indices if env_ids is None else env_ids
         num_instances = len(local_env_ids)
 
-        # TODO: set attr for soft body after loading in physx scene
+        # TODO: set attr for soft body after loading in physics scene.
 
         # rest soft body to init_pos
         pos = torch.as_tensor(
diff --git a/embodichain/lab/sim/sensors/__init__.py b/embodichain/lab/sim/sensors/__init__.py
index 36c4d0e6..2e57ca42 100644
--- a/embodichain/lab/sim/sensors/__init__.py
+++ b/embodichain/lab/sim/sensors/__init__.py
@@ -17,3 +17,8 @@
 from .base_sensor import BaseSensor, SensorCfg
 from .camera import Camera, CameraCfg
 from .stereo import StereoCamera, StereoCameraCfg
+from .contact_sensor import (
+    ContactSensor,
+    ContactSensorCfg,
+    ArticulationContactFilterCfg,
+)
diff --git a/embodichain/lab/sim/sensors/contact_sensor.py b/embodichain/lab/sim/sensors/contact_sensor.py
new file mode 100644
index 00000000..18efadc5
--- /dev/null
+++ b/embodichain/lab/sim/sensors/contact_sensor.py
@@ -0,0 +1,368 @@
+# ----------------------------------------------------------------------------
+# Copyright (c) 2021-2025 DexForce Technology Co., Ltd.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+# ----------------------------------------------------------------------------
+
+from __future__ import annotations
+
+import dexsim
+import math
+import torch
+
+from typing import Union, Tuple, Sequence, List, Optional, Dict
+
+from embodichain.lab.sim.sensors import BaseSensor, SensorCfg
+from embodichain.utils import logger, configclass
+import uuid
+import numpy as np
+
+
+@configclass
+class ContactSensorCfg(SensorCfg):
+    """Configuration for the contact sensor.
+
+    Defines which rigid bodies and articulation links have their contacts reported.
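+
+    Example:
+        A minimal sketch (assuming a rigid object ``"cube"`` and a robot
+        ``"robot"`` exist in the scene; the link name is illustrative):
+
+        ```python
+        cfg = ContactSensorCfg(
+            rigid_uid_list=["cube"],
+            articulation_cfg_list=[
+                ArticulationContactFilterCfg(
+                    articulation_uid="robot",
+                    link_name_list=["link6"],
+                )
+            ],
+        )
+        ```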
+ """ + + rigid_uid_list: List[str] = [] + """rigid body contact filter configs""" + + articulation_cfg_list: List[ArticulationContactFilterCfg] = [] + """articulation link contact filter configs""" + + filter_need_both_actor: bool = True + """Whether to filter contact only when both actors are in the filter list.""" + + sensor_type: str = "ContactSensor" + + +@configclass +class ArticulationContactFilterCfg: + articulation_uid: str = "" + """Articulation unique identifier.""" + + link_name_list: List[str] = [] + """link names in the articulation whose contacts need to be filtered.""" + + +class ContactSensor(BaseSensor): + """Sensor to get contacts from rigid body and articulation links.""" + + SUPPORTED_DATA_TYPES = [ + "position", + "normal", + "friction", + "impulse", + "distance", + "user_ids", + "env_ids", + ] + + def __init__( + self, config: ContactSensorCfg, device: torch.device = torch.device("cpu") + ) -> None: + from embodichain.lab.sim import SimulationManager + + self._sim = SimulationManager.get_instance() + """simulation manager reference""" + + self.item_user_ids: Optional[torch.Tensor] = None + """Dexsim userid of the contact filter items.""" + + self.item_env_ids: Optional[torch.Tensor] = None + """Environment ids of the contact filter items.""" + + self.item_user_env_ids_map: Optional[torch.Tensor] = None + """Map from dexsim userid to environment id.""" + + self._data_buffer = { + "position": torch.empty((0, 3), device=device), + "normal": torch.empty((0, 3), device=device), + "friction": torch.empty((0, 3), device=device), + "impulse": torch.empty((0,), device=device), + "distance": torch.empty((0,), device=device), + "user_ids": torch.empty((0, 2), dtype=torch.int32, device=device), + "env_ids": torch.empty((0,), dtype=torch.int32, device=device), + } + """ + position: [num_contacts, 3] tensor, contact position in arena frame + normal: [num_contacts, 3] tensor, contact normal + friction: [num_contacts, 3] tensor, contact friction. Currently this value is not accurate. + impulse: [num_contacts, ] tensor, contact impulse + distance: [num_contacts, ] tensor, contact distance + user_ids: [num_contacts, 2] of int, contact user ids + , use rigid_object.get_user_id() and find which object it belongs to. + env_ids: [num_contacts, ] of int, which arena the contact belongs to. + """ + + self._visualizer: Optional[dexsim.models.PointCloud] = None + """contact point visualizer. Default to None""" + self.device = device + super().__init__(config, device) + + def _precompute_filter_ids(self, config: ContactSensorCfg): + self.item_user_ids = torch.tensor([], dtype=torch.int32, device=self.device) + self.item_env_ids = torch.tensor([], dtype=torch.int32, device=self.device) + self.item_user_env_ids_map = torch.tensor( + [], dtype=torch.int32, device=self.device + ) + for rigid_uid in config.rigid_uid_list: + rigid_object = self._sim.get_rigid_object(rigid_uid) + if rigid_object is None: + logger.log_warning( + f"Rigid body with uid '{rigid_uid}' not found in simulation." 
+                )
+                continue
+            self.item_user_ids = torch.cat(
+                (self.item_user_ids, rigid_object.get_user_ids())
+            )
+            env_ids = torch.tensor(
+                rigid_object._all_indices, dtype=torch.int32, device=self.device
+            )
+            self.item_env_ids = torch.cat((self.item_env_ids, env_ids))
+
+        for articulation_cfg in config.articulation_cfg_list:
+            articulation = self._sim.get_articulation(articulation_cfg.articulation_uid)
+            if articulation is None:
+                articulation = self._sim.get_robot(articulation_cfg.articulation_uid)
+            if articulation is None:
+                logger.log_warning(
+                    f"Articulation with uid '{articulation_cfg.articulation_uid}' not found in simulation."
+                )
+                continue
+            all_link_names = articulation.link_names
+            link_names = (
+                all_link_names
+                if len(articulation_cfg.link_name_list) == 0
+                else articulation_cfg.link_name_list
+            )
+            for link_name in link_names:
+                if link_name not in all_link_names:
+                    logger.log_warning(
+                        f"Link {link_name} not found in articulation {articulation_cfg.articulation_uid}."
+                    )
+                    continue
+                link_user_ids = articulation.get_user_ids(link_name).reshape(-1)
+                self.item_user_ids = torch.cat((self.item_user_ids, link_user_ids))
+                env_ids = torch.tensor(
+                    articulation._all_indices, dtype=torch.int32, device=self.device
+                )
+                self.item_env_ids = torch.cat((self.item_env_ids, env_ids))
+        # build user_id to env_id map; skip if the filter list ended up empty
+        if self.item_user_ids.numel() == 0:
+            logger.log_warning(
+                "Contact sensor filter list is empty; no contacts will be reported."
+            )
+            return
+        max_user_id = int(self.item_user_ids.max().item())
+        self.item_user_env_ids_map = torch.full(
+            size=(max_user_id + 1,),
+            fill_value=-1,
+            dtype=self.item_user_ids.dtype,
+            device=self.device,
+        )
+        self.item_user_env_ids_map[self.item_user_ids] = self.item_env_ids
+
+    def _build_sensor_from_config(self, config: ContactSensorCfg, device: torch.device):
+        self._precompute_filter_ids(config)
+        self._world: dexsim.World = dexsim.default_world()
+        self._ps = self._world.get_physics_scene()
+        world_config = dexsim.get_world_config()
+        self.is_use_gpu_physics = device.type == "cuda" and world_config.enable_gpu_sim
+        if self.is_use_gpu_physics:
+            MAX_CONTACT = 65536
+            self.contact_data_buffer = torch.zeros(
+                MAX_CONTACT, 11, dtype=torch.float32, device=device
+            )
+            self.contact_user_ids_buffer = torch.zeros(
+                MAX_CONTACT, 2, dtype=torch.int32, device=device
+            )
+        else:
+            self._ps.enable_contact_data_update_on_cpu(True)
+
+    def update(self, **kwargs) -> None:
+        """Update the sensor state based on the current simulation state.
+
+        This method should be called after each simulation step so that the contact
+        buffers reflect the latest physics state.
+
+        Args:
+            **kwargs: Additional keyword arguments for sensor update.
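+
+        Example (a minimal sketch; assumes a configured sensor and a stepped simulation)::
+
+            sim.update(step=1)
+            sensor.update()
+            report = sensor.get_data()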
+        """
+
+        if not self.is_use_gpu_physics:
+            contact_data_np, body_user_indices_np = self._ps.get_cpu_contact_buffer()
+            n_contact = contact_data_np.shape[0]
+            contact_data = torch.tensor(
+                contact_data_np, dtype=torch.float32, device=self.device
+            )
+            body_user_indices = torch.tensor(
+                body_user_indices_np, dtype=torch.int32, device=self.device
+            )
+        else:
+            n_contact = self._ps.gpu_fetch_contact_data(
+                self.contact_data_buffer, self.contact_user_ids_buffer
+            )
+            contact_data = self.contact_data_buffer[:n_contact]
+            body_user_indices = self.contact_user_ids_buffer[:n_contact]
+        if n_contact == 0:
+            self._data_buffer = {
+                "position": torch.empty((0, 3), device=self.device),
+                "normal": torch.empty((0, 3), device=self.device),
+                "friction": torch.empty((0, 3), device=self.device),
+                "impulse": torch.empty((0,), device=self.device),
+                "distance": torch.empty((0,), device=self.device),
+                "user_ids": torch.empty((0, 2), dtype=torch.int32, device=self.device),
+                "env_ids": torch.empty((0,), dtype=torch.int32, device=self.device),
+            }
+            return
+
+        filter0_mask = torch.isin(body_user_indices[:, 0], self.item_user_ids)
+        filter1_mask = torch.isin(body_user_indices[:, 1], self.item_user_ids)
+        if self.cfg.filter_need_both_actor:
+            filter_mask = torch.logical_and(filter0_mask, filter1_mask)
+        else:
+            filter_mask = torch.logical_or(filter0_mask, filter1_mask)
+
+        filtered_contact_data = contact_data[filter_mask]
+        filtered_user_ids = body_user_indices[filter_mask]
+        filtered_env_ids = self.item_user_env_ids_map[filtered_user_ids[:, 0]]
+        # generate contact report
+        contact_offsets = self._sim.arena_offsets[filtered_env_ids]
+        filtered_contact_data[:, 0:3] = (
+            filtered_contact_data[:, 0:3] - contact_offsets
+        )  # subtract arena offsets to express positions in the arena frame
+        self._data_buffer["position"] = filtered_contact_data[:, 0:3]
+        self._data_buffer["normal"] = filtered_contact_data[:, 3:6]
+        self._data_buffer["friction"] = filtered_contact_data[:, 6:9]
+        self._data_buffer["impulse"] = filtered_contact_data[:, 9]
+        self._data_buffer["distance"] = filtered_contact_data[:, 10]
+        self._data_buffer["user_ids"] = filtered_user_ids
+        self._data_buffer["env_ids"] = filtered_env_ids
+
+    def get_arena_pose(self, to_matrix: bool = False) -> torch.Tensor:
+        """Not implemented for contact sensors.
+
+        Args:
+            to_matrix: If True, return the pose as a 4x4 transformation matrix.
+
+        Returns:
+            A tensor representing the pose of the sensor in the arena frame.
+        """
+        logger.log_error("`get_arena_pose` for contact sensor is not implemented yet.")
+        return None
+
+    def get_local_pose(self, to_matrix: bool = False) -> torch.Tensor:
+        """Get the local pose of the contact sensor. Not implemented yet.
+
+        Args:
+            to_matrix (bool): If True, return the pose as a 4x4 matrix. If False, return as a quaternion.
+
+        Returns:
+            torch.Tensor: The local pose of the contact sensor.
+        """
+        logger.log_error("`get_local_pose` for contact sensor is not implemented yet.")
+        return None
+
+    def set_local_pose(
+        self, pose: torch.Tensor, env_ids: Sequence[int] | None = None
+    ) -> None:
+        """Set the local pose of the contact sensor. Not implemented yet.
+
+        Args:
+            pose (torch.Tensor): The local pose to set, should be a 4x4 transformation matrix.
+            env_ids (Sequence[int] | None): The environment IDs to set the pose for. If None, set for all environments.
+        """
+        logger.log_error("`set_local_pose` for contact sensor is not implemented yet.")
+        return None
+
+    def get_data(self, copy: bool = True) -> Dict[str, torch.Tensor]:
+        """Retrieve data from the sensor.
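+
+        The returned report contains the contacts collected by the most recent call to `update`.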
+
+        Args:
+            copy: If True, return a copy of the data buffer. Defaults to True.
+
+        Returns:
+            Dict:{
+                "position": Tensor of float32 (num_contact, 3) representing the contact positions,
+                "normal": Tensor of float32 (num_contact, 3) representing the contact normals,
+                "friction": Tensor of float32 (num_contact, 3) representing the contact friction,
+                "impulse": Tensor of float32 (num_contact, ) representing the contact impulses,
+                "distance": Tensor of float32 (num_contact, ) representing the contact distances,
+                "user_ids": Tensor of int32 (num_contact, 2) with the user ids of the two contacting
+                    bodies; use rigid_object.get_user_ids() to find which object an id belongs to,
+                "env_ids": Tensor of int32 (num_contact, ) indicating which arena each contact belongs to.
+            }
+        """
+        if copy:
+            return {key: value.clone() for key, value in self._data_buffer.items()}
+        return self._data_buffer
+
+    def filter_by_user_ids(self, item_user_ids: torch.Tensor) -> Dict[str, torch.Tensor]:
+        """Filter the contact report by specific user IDs.
+
+        Args:
+            item_user_ids (torch.Tensor): Tensor of user IDs to filter by.
+
+        Returns:
+            Dict[str, torch.Tensor]: A contact report with the same keys as `get_data`,
+                containing only the filtered contacts.
+        """
+        filter0_mask = torch.isin(self._data_buffer["user_ids"][:, 0], item_user_ids)
+        filter1_mask = torch.isin(self._data_buffer["user_ids"][:, 1], item_user_ids)
+        if self.cfg.filter_need_both_actor:
+            filter_mask = torch.logical_and(filter0_mask, filter1_mask)
+        else:
+            filter_mask = torch.logical_or(filter0_mask, filter1_mask)
+        return {
+            "position": self._data_buffer["position"][filter_mask],
+            "normal": self._data_buffer["normal"][filter_mask],
+            "friction": self._data_buffer["friction"][filter_mask],
+            "impulse": self._data_buffer["impulse"][filter_mask],
+            "distance": self._data_buffer["distance"][filter_mask],
+            "user_ids": self._data_buffer["user_ids"][filter_mask],
+            "env_ids": self._data_buffer["env_ids"][filter_mask],
+        }
+
+    def set_contact_point_visibility(
+        self,
+        visible: bool = True,
+        rgba: Optional[Sequence[float]] = None,
+        point_size: float = 3.0,
+    ):
+        """Show or hide the current contact points as a point cloud.
+
+        Args:
+            visible (bool): If True, draw the contact points; if False, clear them.
+            rgba (Sequence[float] | None): Point color. Defaults to (0.8, 0.2, 0.2, 1.0).
+            point_size (float): Point size. Currently not applied.
+        """
+        if visible:
+            contact_position_arena = self._data_buffer["position"]
+            contact_offsets = self._sim.arena_offsets[self._data_buffer["env_ids"]]
+            contact_position_world = contact_position_arena + contact_offsets
+            if self._visualizer is None:
+                # create new visualizer
+                temp_str = uuid.uuid4().hex
+                self._visualizer = self._sim.get_env().create_point_cloud(name=temp_str)
+            else:
+                # update existing visualizer points
+                self._visualizer.clear()
+            rgba = rgba if rgba is not None else (0.8, 0.2, 0.2, 1.0)
+            if len(rgba) != 4:
+                logger.log_error(
+                    f"Invalid rgba {rgba}, should be a sequence of 4 floats."
+                )
+            rgba = np.asarray(rgba)
+            self._visualizer.add_points(
+                points=contact_position_world.to("cpu").numpy(), color=rgba
+            )
+            # self._visualizer.set_point_size(point_size)
+        else:
+            if isinstance(self._visualizer, dexsim.models.PointCloud):
+                self._visualizer.clear()
diff --git a/embodichain/lab/sim/sim_manager.py b/embodichain/lab/sim/sim_manager.py
index ddb2dabb..c2f078b0 100644
--- a/embodichain/lab/sim/sim_manager.py
+++ b/embodichain/lab/sim/sim_manager.py
@@ -14,6 +14,8 @@
 # limitations under the License.
 # ----------------------------------------------------------------------------
 
+from __future__ import annotations
+
 import os
 import sys
 import dexsim
@@ -45,8 +47,8 @@
 )
 from dexsim.engine import CudaArray, Material
 from dexsim.models import MeshObject
-from dexsim.render import Light as _Light, LightType
-from dexsim.engine import GizmoController
+from dexsim.render import Light as _Light, LightType, Windows
+from dexsim.engine import GizmoController, ObjectManipulator
 
 from embodichain.lab.sim.objects import (
     RigidObject,
@@ -62,6 +64,7 @@
     BaseSensor,
     Camera,
     StereoCamera,
+    ContactSensor,
 )
 from embodichain.lab.sim.cfg import (
     PhysicsCfg,
@@ -159,11 +162,36 @@ class SimulationManager:
         sim_config (SimulationManagerCfg, optional): simulation configuration. Defaults to SimulationManagerCfg().
     """
 
-    SUPPORTED_SENSOR_TYPES = {"Camera": Camera, "StereoCamera": StereoCamera}
+    _instances = {}
+
+    SUPPORTED_SENSOR_TYPES = {
+        "Camera": Camera,
+        "StereoCamera": StereoCamera,
+        "ContactSensor": ContactSensor,
+    }
+
+    def __new__(cls, sim_config: SimulationManagerCfg = SimulationManagerCfg()):
+        """Create a new instance and register it under the next sequential instance id."""
+        n_instance = len(cls._instances)
+        instance = super(SimulationManager, cls).__new__(cls)
+        # Store sim_config in the instance for use in __init__ or elsewhere
+        instance.sim_config = sim_config
+        cls._instances[n_instance] = instance
+        return instance
 
     def __init__(
        self, sim_config: SimulationManagerCfg = SimulationManagerCfg()
    ) -> None:
+        instance_id = SimulationManager.get_instance_num() - 1
+
+        # The id assigned to this instance when it was registered in __new__
+        self.instance_id = instance_id
+
+        if sim_config.enable_rt and instance_id > 0:
+            logger.log_error(
+                "Ray tracing rendering backend is only supported for a single instance (instance_id=0)."
+            )
+
         # Cache paths
         self._sim_cache_dir = SIM_CACHE_DIR
         self._material_cache_dir = MATERIAL_CACHE_DIR
@@ -186,7 +214,13 @@ def __init__(
 
         # Initialize warp runtime context before creating the world.
         wp.init()
-        self._world = dexsim.World(world_config)
+        self._world: dexsim.World = dexsim.World(world_config)
+
+        self._window: Windows | None = None
+        self._is_registered_window_control = False
+        if sim_config.headless is False:
+            self._window = self._world.get_windows()
+            self._register_default_window_control()
 
         fps = int(1.0 / sim_config.physics_dt)
         self._world.set_physics_fps(fps)
@@ -248,6 +282,54 @@ def __init__(
 
         self._build_multiple_arenas(sim_config.num_envs)
 
+    @classmethod
+    def get_instance(cls, instance_id: int = 0) -> SimulationManager:
+        """Get the instance of SimulationManager by id.
+
+        Args:
+            instance_id (int): The instance id. Defaults to 0.
+
+        Returns:
+            SimulationManager: The instance.
+
+        Raises:
+            RuntimeError: If the instance has not been created yet.
+        """
+        if instance_id not in cls._instances:
+            logger.log_error(
+                f"SimulationManager (id={instance_id}) has not been instantiated yet. "
+                "Create an instance first using SimulationManager(sim_config)."
+            )
+        return cls._instances[instance_id]
+
+    @classmethod
+    def get_instance_num(cls) -> int:
+        """Get the number of instantiated SimulationManager instances.
+
+        Returns:
+            int: The number of instances.
+        """
+        return len(cls._instances)
+
+    @classmethod
+    def reset(cls, instance_id: int = 0) -> None:
+        """Reset (unregister) the instance with the given id.
+
+        This allows creating a new instance with a different configuration.
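+
+        Example (illustrative only)::
+
+            SimulationManager.reset(0)
+            sim = SimulationManager(SimulationManagerCfg())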
+ """ + if instance_id in cls._instances: + logger.log_info(f"Resetting SimulationManager instance {instance_id}.") + del cls._instances[instance_id] + + @classmethod + def is_instantiated(cls, instance_id: int = 0) -> bool: + """Check if the instance has been created. + + Returns: + bool: True if the instance exists, False otherwise. + """ + return instance_id in cls._instances + @property def num_envs(self) -> int: """Get the number of arenas in the simulation. @@ -498,6 +580,8 @@ def get_world(self) -> dexsim.World: def open_window(self) -> None: """Open the simulation window.""" self._world.open_window() + self._window = self._world.get_windows() + self._register_default_window_control() self.is_window_opened = True def close_window(self) -> None: @@ -555,9 +639,9 @@ def set_emission_light( color (Sequence[float] | None): color of the light. intensity (float | None): intensity of the light. """ - if color is None: + if color is not None: self._env.set_env_light_emission(color) - if intensity is None: + if intensity is not None: self._env.set_env_light_intensity(intensity) def _create_default_plane(self): @@ -594,7 +678,7 @@ def set_default_background(self) -> None: ) if self.sim_config.enable_rt: - self.set_emission_light([0.1, 0.1, 0.1], 10.0) + self.set_emission_light([1.0, 1.0, 1.0], 80.0) else: self.set_indirect_lighting("lab_day") @@ -884,6 +968,24 @@ def get_rigid_object_group(self, uid: str) -> RigidObjectGroup | None: return None return self._rigid_object_groups[uid] + @cached_property + def arena_offsets(self) -> torch.Tensor: + """Get the arena offsets for all arenas. + + Returns: + torch.Tensor: The arena offsets of shape (num_arenas, 3). + """ + env_list = [self._env] if len(self._arenas) == 0 else self._arenas + arena_offsets = torch.zeros( + (len(env_list), 3), dtype=torch.float32, device=self.device + ) + for i, env in enumerate(env_list): + arena_position = env.get_root_node().get_world_pose()[:3, 3] + arena_offsets[i] = torch.tensor( + arena_position, dtype=torch.float32, device=self.device + ) + return arena_offsets + def _get_non_static_rigid_obj_num(self) -> int: """Get the number of non-static rigid objects in the scene. @@ -1408,6 +1510,47 @@ def remove_marker(self, name: str) -> bool: logger.log_warning(f"Failed to remove marker {name}: {str(e)}") return False + def _register_default_window_control(self) -> None: + """Register default window controls for better simulation interaction.""" + from dexsim.types import InputKey + + if self._is_registered_window_control: + return + + class WindowDefaultEvent(ObjectManipulator): + + def on_key_down(self, key): + if key == InputKey.SCANCODE_C.value: + print(f"Raycast distance: {self.selected_distance}") + print(f"Hit position: {self.selected_position}") + + manipulator = WindowDefaultEvent() + manipulator.enable_selection_cache(True) + self._window.add_input_control(manipulator) + + self._is_registered_window_control = True + + def add_custom_window_control(self, controls: list[ObjectManipulator]) -> None: + """Add one or more custom window input controls. + + This method registers additional :class:`ObjectManipulator` instances + with the simulation window so they can handle input events alongside + any default controls. + + Args: + controls (list[ObjectManipulator]): A list of initialized + ObjectManipulator instances to add to the current window. + Each control will be registered via ``window.add_input_control``. + If no window is available, the controls are not added and a + warning is logged. 
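+
+        Example (a minimal sketch; the control class is illustrative)::
+
+            class PrintKeyControl(ObjectManipulator):
+                def on_key_down(self, key):
+                    print(f"key pressed: {key}")
+
+            sim.add_custom_window_control([PrintKeyControl()])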
+        """
+        if self._window is None:
+            logger.log_warning("No window available to add custom controls.")
+            return
+
+        for control in controls:
+            self._window.add_input_control(control)
+
     def create_visual_material(self, cfg: VisualMaterialCfg) -> VisualMaterial:
         """Create a visual material with given configuration.
 
@@ -1475,3 +1618,5 @@ def destroy(self) -> None:
 
         self._env.clean()
         self._world.quit()
+
+        SimulationManager.reset(self.instance_id)
diff --git a/embodichain/lab/sim/solvers/srs_solver.py b/embodichain/lab/sim/solvers/srs_solver.py
index 8558ea22..fe284959 100644
--- a/embodichain/lab/sim/solvers/srs_solver.py
+++ b/embodichain/lab/sim/solvers/srs_solver.py
@@ -511,6 +511,14 @@ def _get_each_ik(
         joints_output[5] = (angle6 - dh_params[5, 3]) * rotation_directions[5]
         joints_output[6] = (angle7 - dh_params[6, 3]) * rotation_directions[6]
 
+        # Check if the calculated joint angles are within the limits
+        in_range = (joints_output >= self.qpos_limits_np[:, 0]) & (
+            joints_output <= self.qpos_limits_np[:, 1]
+        )
+
+        if not np.all(in_range):
+            return False, None
+
         return True, joints_output
 
     def get_ik(
@@ -568,7 +576,7 @@ def get_ik(
         return (
             torch.zeros(num_targets, dtype=torch.bool, device=self.device),
             torch.zeros(
-                (num_targets, num_targets, 7),
+                (num_targets, 7),
                 dtype=qpos_seed.dtype,
                 device=self.device,
             ),
diff --git a/examples/sim/sensors/create_contact_sensor.py b/examples/sim/sensors/create_contact_sensor.py
new file mode 100644
index 00000000..81fe967f
--- /dev/null
+++ b/examples/sim/sensors/create_contact_sensor.py
@@ -0,0 +1,298 @@
+# ----------------------------------------------------------------------------
+# Copyright (c) 2021-2025 DexForce Technology Co., Ltd.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+#     http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+# ----------------------------------------------------------------------------
+
+"""
+This script demonstrates how to create and use a contact sensor with SimulationManager.
+It covers scene setup, contact filtering for rigid bodies and articulation links,
+fetching contact reports, and visualizing contact points.
+"""
+
+import argparse
+import time
+import torch
+
+from embodichain.lab.sim import SimulationManager, SimulationManagerCfg
+from embodichain.lab.sim.cfg import (
+    RigidBodyAttributesCfg,
+)
+from embodichain.lab.sim.sensors import (
+    ContactSensorCfg,
+    ArticulationContactFilterCfg,
+)
+from embodichain.lab.sim.shapes import CubeCfg
+from embodichain.lab.sim.objects import RigidObject, RigidObjectCfg, Robot, RobotCfg
+from embodichain.data import get_data_path
+
+
+def create_cube(
+    sim: SimulationManager, uid: str, position: list = (0.0, 0.0, 0)
+) -> RigidObject:
+    """Create a dynamic cube rigid object.
+
+    Args:
+        sim (SimulationManager): simulation manager
+        uid (str): uid of the rigid object
+        position (list, optional): init position. Defaults to (0., 0., 0).
+
+    Returns:
+        RigidObject: rigid object
+    """
+    cube_size = (0.025, 0.025, 0.025)
+    cube: RigidObject = sim.add_rigid_object(
+        cfg=RigidObjectCfg(
+            uid=uid,
+            shape=CubeCfg(size=cube_size),
+            body_type="dynamic",
+            attrs=RigidBodyAttributesCfg(
+                mass=0.1,
+                dynamic_friction=0.9,
+                static_friction=0.95,
+                restitution=0.01,
+                sleep_threshold=0.0,
+            ),
+            init_pos=position,
+        )
+    )
+    return cube
+
+
+def robot_grasp_pose(robot: Robot, cube: RigidObject, sim: SimulationManager):
+    """Drive the robot to approach the cube and close the gripper on it."""
+    sim.update(step=100)
+    arm_ids = robot.get_joint_ids("arm")
+    gripper_ids = robot.get_joint_ids("hand")
+    rest_arm_qpos = robot.get_qpos()[:, arm_ids]
+    ee_xpos = robot.compute_fk(qpos=rest_arm_qpos, name="arm", to_matrix=True)
+    target_xpos = ee_xpos.clone()
+    cube_xpos = cube.get_local_pose(to_matrix=True)
+    cube_position = cube_xpos[:, :3, 3]
+
+    target_xpos[:, :3, 3] = cube_position
+
+    approach_xpos = target_xpos.clone()
+    approach_xpos[:, 2, 3] += 0.1
+
+    is_success, approach_qpos = robot.compute_ik(
+        pose=approach_xpos, joint_seed=rest_arm_qpos, name="arm"
+    )
+    is_success, target_qpos = robot.compute_ik(
+        pose=target_xpos, joint_seed=approach_qpos, name="arm"
+    )
+    robot.set_qpos(approach_qpos, joint_ids=arm_ids)
+    sim.update(step=40)
+
+    robot.set_qpos(target_qpos, joint_ids=arm_ids)
+    sim.update(step=40)
+    hand_close_qpos = (
+        torch.tensor([0.025, 0.025], device=sim.device)
+        .unsqueeze(0)
+        .repeat(sim.num_envs, 1)
+    )
+    robot.set_qpos(hand_close_qpos, joint_ids=gripper_ids)
+    sim.update(step=20)
+
+
+def create_robot(
+    sim: SimulationManager, uid: str, position: list = (0.0, 0.0, 0)
+) -> Robot:
+    """Create a UR10 arm combined with a PGC gripper.
+
+    Args:
+        sim (SimulationManager): simulation manager
+        uid (str): uid of the robot
+        position (list, optional): init position. Defaults to (0., 0., 0).
+
+    Returns:
+        Robot: robot
+    """
+    ur10_urdf_path = get_data_path("UniversalRobots/UR10/UR10.urdf")
+    pgi_urdf_path = get_data_path("DH_PGC_140_50/DH_PGC_140_50.urdf")
+    robot_cfg_dict = {
+        "uid": uid,
+        "urdf_cfg": {
+            "components": [
+                {
+                    "component_type": "arm",
+                    "urdf_path": ur10_urdf_path,
+                    "transform": [
+                        [1.0, 0.0, 0.0, 0.0],
+                        [0.0, 1.0, 0.0, 0.0],
+                        [0.0, 0.0, 1.0, 0.0],
+                        [0.0, 0.0, 0.0, 1.0],
+                    ],
+                },
+                {
+                    "component_type": "hand",
+                    "urdf_path": pgi_urdf_path,
+                    "transform": [
+                        [1.0, 0.0, 0.0, 0.0],
+                        [0.0, 1.0, 0.0, 0.0],
+                        [0.0, 0.0, 1.0, 0.0],
+                        [0.0, 0.0, 0.0, 1.0],
+                    ],
+                },
+            ],
+        },
+        "init_pos": position,
+        "init_qpos": [0.0, -1.57, 1.57, -1.57, -1.57, 0.0, 0.0, 0.0],
+        "drive_pros": {
+            "stiffness": {"JOINT[1-6]": 1e4, "FINGER[1-2]_JOINT": 1e2},
+            "damping": {"JOINT[1-6]": 1e3, "FINGER[1-2]_JOINT": 1e1},
+            "max_effort": {"JOINT[1-6]": 1e5, "FINGER[1-2]_JOINT": 1e3},
+        },
+        "solver_cfg": {
+            "arm": {
+                "class_type": "PytorchSolver",
+                "end_link_name": "ee_link",
+                "root_link_name": "base_link",
+                "tcp": [
+                    [1.0, 0.0, 0.0, 0.0],
+                    [0.0, 1.0, 0.0, 0.0],
+                    [0.0, 0.0, 1.0, 0.13],
+                    [0.0, 0.0, 0.0, 1.0],
+                ],
+            }
+        },
+        "control_parts": {"arm": ["JOINT[1-6]"], "hand": ["FINGER[1-2]_JOINT"]},
+    }
+    robot: Robot = sim.add_robot(cfg=RobotCfg.from_dict(robot_cfg_dict))
+    return robot
+
+
+def main():
+    """Main function to create and run the simulation scene."""
+
+    # Parse command line arguments
+    parser = argparse.ArgumentParser(
+        description="Create a contact sensor example scene with SimulationManager"
+    )
+    parser.add_argument(
+        "--headless",
+        action="store_true",
+        default=False,
+        help="Run simulation in headless mode",
+    )
+    parser.add_argument(
+        "--num_envs", type=int, default=64, help="Number of parallel environments"
+    )
+    parser.add_argument(
+        "--device", type=str, default="cpu", help="Simulation device (cuda or cpu)"
+    )
+    parser.add_argument(
+        "--enable_rt",
+        action="store_true",
+        default=False,
+        help="Enable ray tracing for better visuals",
+    )
+    args = parser.parse_args()
+
+    # Configure the simulation
+    sim_cfg = SimulationManagerCfg(
+        width=1920,
+        height=1080,
+        num_envs=args.num_envs,
+        headless=True,  # start headless; a window is opened after setup if requested
+        physics_dt=1.0 / 100.0,  # Physics timestep (100 Hz)
+        sim_device=args.device,
+        enable_rt=args.enable_rt,  # Enable ray tracing for better visuals
+    )
+
+    # Create the simulation instance
+    sim = SimulationManager(sim_cfg)
+
+    # Add objects to the scene
+    cube0 = create_cube(sim, "cube0", position=[0.0, 0.0, 0.03])
+    cube1 = create_cube(sim, "cube1", position=[0.0, 0.0, 0.06])
+    cube2 = create_cube(sim, "cube2", position=[0.0, 0.0, 0.09])
+    robot = create_robot(sim, "UR10_PGI", position=[0.5, 0.0, 0.0])
+
+    print("[INFO]: Scene setup complete!")
+    print(f"[INFO]: Running simulation with {args.num_envs} environment(s)")
+    print("[INFO]: Press Ctrl+C to stop the simulation")
+
+    # Open window when the scene has been set up
+    if not args.headless:
+        sim.open_window()
+
+    robot_grasp_pose(robot, cube2, sim)
+    # Run the simulation
+    run_simulation(sim)
+
+
+def run_simulation(sim: SimulationManager):
+    """Run the simulation loop.
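+
+    The loop creates a contact sensor, fetches a contact report after every
+    physics step, and periodically filters and visualizes the contacts.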
+
+    Args:
+        sim: The SimulationManager instance to run
+    """
+
+    # Initialize GPU physics if using CUDA
+    if sim.is_use_gpu_physics:
+        sim.init_gpu_physics()
+
+    step_count = 0
+    # contact filter config
+    contact_filter_cfg = ContactSensorCfg()
+    contact_filter_cfg.rigid_uid_list = ["cube0", "cube1", "cube2"]
+    contact_filter_art_cfg = ArticulationContactFilterCfg()
+    contact_filter_art_cfg.articulation_uid = "UR10_PGI"
+    contact_filter_art_cfg.link_name_list = ["finger1_link", "finger2_link"]
+    contact_filter_cfg.articulation_cfg_list = [contact_filter_art_cfg]
+    contact_filter_cfg.filter_need_both_actor = True
+
+    contact_sensor = sim.add_sensor(sensor_cfg=contact_filter_cfg)
+
+    try:
+        accumulated_cost_time = 0.0
+        while True:
+            # Update physics simulation
+            sim.update(step=1)
+            start_time = time.time()
+            contact_sensor.update()
+            contact_report = contact_sensor.get_data()
+            accumulated_cost_time += time.time() - start_time
+            step_count += 1
+
+            # Report the average contact-fetch time every 100 steps
+            if step_count % 100 == 0:
+                average_cost_time = accumulated_cost_time / 100.0
+                print(
+                    f"[INFO]: Fetch contact cost time: {average_cost_time * 1000:.2f} ms, num_envs: {sim.num_envs}"
+                )
+                # Filter the contact report for contacts between a rigid object and an articulation link
+                cube2_user_ids = sim.get_rigid_object("cube2").get_user_ids()
+                finger1_user_ids = (
+                    sim.get_robot("UR10_PGI").get_user_ids("finger1_link").reshape(-1)
+                )
+                filter_user_ids = torch.cat([cube2_user_ids, finger1_user_ids])
+                filter_contact_report = contact_sensor.filter_by_user_ids(
+                    filter_user_ids
+                )
+                # print("filter_contact_report", filter_contact_report)
+                # visualize contact points
+                contact_sensor.set_contact_point_visibility(
+                    visible=True, rgba=(0.0, 0.0, 1.0, 1.0), point_size=6.0
+                )
+                accumulated_cost_time = 0.0
+
+    except KeyboardInterrupt:
+        print("\n[INFO]: Stopping simulation...")
+    finally:
+        # Clean up resources
+        sim.destroy()
+        print("[INFO]: Simulation terminated successfully")
+
+
+if __name__ == "__main__":
+    main()
diff --git a/pyproject.toml b/pyproject.toml
index f1381090..50fef254 100644
--- a/pyproject.toml
+++ b/pyproject.toml
@@ -26,14 +26,14 @@ dynamic = ["version"]
 # Core install dependencies (kept from requirements.txt). Some VCS links are
 # specified using PEP 508 direct references where present.
dependencies = [ - "dexsim_engine==0.3.8", + "dexsim_engine==0.3.9", "setuptools>=78.1.1", "gymnasium>=0.29.1", - "casadi==3.7.1", - "pin==2.7.0", "toppra==0.6.3", + "pin", + "pin-pink", + "casadi", "qpsolvers==4.8.1", - "pin-pink==3.4.0", "py_opw_kinematics==0.1.6", "pytorch_kinematics==0.7.6", "polars==1.31.0", diff --git a/scripts/tutorials/sim/create_sensor.py b/scripts/tutorials/sim/create_sensor.py index 1ae8f5b6..128ca677 100644 --- a/scripts/tutorials/sim/create_sensor.py +++ b/scripts/tutorials/sim/create_sensor.py @@ -20,9 +20,7 @@ """ import argparse -import cv2 import numpy as np -import time import torch torch.set_printoptions(precision=4, sci_mode=False) diff --git a/tests/sim/objects/test_articulation.py b/tests/sim/objects/test_articulation.py index 072987ff..458733dc 100644 --- a/tests/sim/objects/test_articulation.py +++ b/tests/sim/objects/test_articulation.py @@ -184,6 +184,15 @@ def test_set_physical_visible(self): all_link_names = self.art.link_names self.art.set_physical_visible(visible=True, link_names=all_link_names[:3]) + def test_setter_methods(self): + """Test setter methods for articulation properties.""" + # Test setting fix_base + self.art.set_fix_base(True) + self.art.set_fix_base(False) + + self.art.set_self_collision(False) + self.art.set_self_collision(True) + def teardown_method(self): """Clean up resources after each test method.""" self.sim.destroy() diff --git a/tests/sim/sensors/test_contact.py b/tests/sim/sensors/test_contact.py new file mode 100644 index 00000000..1b0e0709 --- /dev/null +++ b/tests/sim/sensors/test_contact.py @@ -0,0 +1,240 @@ +# ---------------------------------------------------------------------------- +# Copyright (c) 2021-2025 DexForce Technology Co., Ltd. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+# ----------------------------------------------------------------------------
+
+import torch
+
+from embodichain.lab.sim import SimulationManager, SimulationManagerCfg
+from embodichain.lab.sim.cfg import (
+    RigidBodyAttributesCfg,
+)
+from embodichain.lab.sim.sensors import (
+    ContactSensorCfg,
+    ArticulationContactFilterCfg,
+)
+from embodichain.lab.sim.shapes import CubeCfg
+from embodichain.lab.sim.objects import RigidObject, RigidObjectCfg, Robot, RobotCfg
+from embodichain.data import get_data_path
+
+
+class ContactTest:
+    def setup_simulation(self, sim_device, enable_rt):
+        sim_cfg = SimulationManagerCfg(
+            width=1920,
+            height=1080,
+            num_envs=2,
+            headless=True,
+            physics_dt=1.0 / 100.0,  # Physics timestep (100 Hz)
+            sim_device=sim_device,
+            enable_rt=enable_rt,  # Enable ray tracing for better visuals
+        )
+
+        # Create the simulation instance
+        self.sim = SimulationManager(sim_cfg)
+
+        # Add objects to the scene
+        cube2 = self.create_cube("cube2", position=[0.0, 0.0, 0.09])
+        self.robot = self.create_robot("UR10_PGI", position=[0.5, 0.0, 0.0])
+
+        contact_filter_cfg = ContactSensorCfg()
+        contact_filter_cfg.rigid_uid_list = ["cube2"]
+        contact_filter_art_cfg = ArticulationContactFilterCfg()
+        contact_filter_art_cfg.articulation_uid = "UR10_PGI"
+        contact_filter_art_cfg.link_name_list = ["finger1_link", "finger2_link"]
+        contact_filter_cfg.articulation_cfg_list = [contact_filter_art_cfg]
+        contact_filter_cfg.filter_need_both_actor = True
+        self.contact_sensor = self.sim.add_sensor(sensor_cfg=contact_filter_cfg)
+
+        self.to_grasp_pose(cube2)
+
+    def create_cube(self, uid: str, position: list = (0.0, 0.0, 0)) -> RigidObject:
+        """Create a dynamic cube rigid object.
+
+        Args:
+            uid (str): uid of the rigid object
+            position (list, optional): init position. Defaults to (0., 0., 0).
+
+        Returns:
+            RigidObject: rigid object
+        """
+        cube_size = (0.025, 0.025, 0.025)
+        cube: RigidObject = self.sim.add_rigid_object(
+            cfg=RigidObjectCfg(
+                uid=uid,
+                shape=CubeCfg(size=cube_size),
+                body_type="dynamic",
+                attrs=RigidBodyAttributesCfg(
+                    mass=0.1,
+                    dynamic_friction=0.9,
+                    static_friction=0.95,
+                    restitution=0.01,
+                    sleep_threshold=0.0,
+                ),
+                init_pos=position,
+            )
+        )
+        return cube
+
+    def create_robot(self, uid: str, position: list = (0.0, 0.0, 0)) -> Robot:
+        """Create a UR10 arm combined with a PGC gripper.
+
+        Args:
+            uid (str): uid of the robot
+            position (list, optional): init position. Defaults to (0., 0., 0).
+
+        Returns:
+            Robot: robot
+        """
+        ur10_urdf_path = get_data_path("UniversalRobots/UR10/UR10.urdf")
+        pgi_urdf_path = get_data_path("DH_PGC_140_50/DH_PGC_140_50.urdf")
+        robot_cfg_dict = {
+            "uid": uid,
+            "urdf_cfg": {
+                "components": [
+                    {
+                        "component_type": "arm",
+                        "urdf_path": ur10_urdf_path,
+                        "transform": [
+                            [1.0, 0.0, 0.0, 0.0],
+                            [0.0, 1.0, 0.0, 0.0],
+                            [0.0, 0.0, 1.0, 0.0],
+                            [0.0, 0.0, 0.0, 1.0],
+                        ],
+                    },
+                    {
+                        "component_type": "hand",
+                        "urdf_path": pgi_urdf_path,
+                        "transform": [
+                            [1.0, 0.0, 0.0, 0.0],
+                            [0.0, 1.0, 0.0, 0.0],
+                            [0.0, 0.0, 1.0, 0.0],
+                            [0.0, 0.0, 0.0, 1.0],
+                        ],
+                    },
+                ],
+            },
+            "init_pos": position,
+            "init_qpos": [0.0, -1.57, 1.57, -1.57, -1.57, 0.0, 0.0, 0.0],
+            "drive_pros": {
+                "stiffness": {"JOINT[1-6]": 1e4, "FINGER[1-2]_JOINT": 1e2},
+                "damping": {"JOINT[1-6]": 1e3, "FINGER[1-2]_JOINT": 1e1},
+                "max_effort": {"JOINT[1-6]": 1e5, "FINGER[1-2]_JOINT": 1e3},
+            },
+            "solver_cfg": {
+                "arm": {
+                    "class_type": "PytorchSolver",
+                    "end_link_name": "ee_link",
+                    "root_link_name": "base_link",
+                    "tcp": [
+                        [1.0, 0.0, 0.0, 0.0],
+                        [0.0, 1.0, 0.0, 0.0],
+                        [0.0, 0.0, 1.0, 0.13],
+                        [0.0, 0.0, 0.0, 1.0],
+                    ],
+                }
+            },
+            "control_parts": {"arm": ["JOINT[1-6]"], "hand": ["FINGER[1-2]_JOINT"]},
+        }
+        robot: Robot = self.sim.add_robot(cfg=RobotCfg.from_dict(robot_cfg_dict))
+        return robot
+
+    def to_grasp_pose(self, cube: RigidObject):
+        """Drive the robot to approach the cube and close the gripper on it."""
+        self.sim.update(step=100)
+        arm_ids = self.robot.get_joint_ids("arm")
+        gripper_ids = self.robot.get_joint_ids("hand")
+        rest_arm_qpos = self.robot.get_qpos()[:, arm_ids]
+        ee_xpos = self.robot.compute_fk(qpos=rest_arm_qpos, name="arm", to_matrix=True)
+        target_xpos = ee_xpos.clone()
+        cube_xpos = cube.get_local_pose(to_matrix=True)
+        cube_position = cube_xpos[:, :3, 3]
+
+        target_xpos[:, :3, 3] = cube_position
+
+        approach_xpos = target_xpos.clone()
+        approach_xpos[:, 2, 3] += 0.1
+
+        is_success, approach_qpos = self.robot.compute_ik(
+            pose=approach_xpos, joint_seed=rest_arm_qpos, name="arm"
+        )
+        is_success, target_qpos = self.robot.compute_ik(
+            pose=target_xpos, joint_seed=approach_qpos, name="arm"
+        )
+        self.robot.set_qpos(approach_qpos, joint_ids=arm_ids)
+        self.sim.update(step=40)
+
+        self.robot.set_qpos(target_qpos, joint_ids=arm_ids)
+        self.sim.update(step=40)
+        hand_close_qpos = (
+            torch.tensor([0.025, 0.025], device=self.sim.device)
+            .unsqueeze(0)
+            .repeat(self.sim.num_envs, 1)
+        )
+        self.robot.set_qpos(hand_close_qpos, joint_ids=gripper_ids)
+        self.sim.update(step=20)
+
+    def test_fetch_contact(self):
+        self.sim.update(step=1)
+        self.contact_sensor.update()
+        contact_report = self.contact_sensor.get_data()
+        n_contacts = contact_report["position"].shape[0]
+        assert n_contacts > 0, "No contact detected."
+
+        cube2_user_ids = self.sim.get_rigid_object("cube2").get_user_ids()
+        finger1_user_ids = (
+            self.sim.get_robot("UR10_PGI").get_user_ids("finger1_link").reshape(-1)
+        )
+        filter_user_ids = torch.cat([cube2_user_ids, finger1_user_ids])
+        filter_contact_report = self.contact_sensor.filter_by_user_ids(filter_user_ids)
+        n_filtered_contact = filter_contact_report["position"].shape[0]
+        assert n_filtered_contact > 0, "No contact detected between gripper and cube."
+ + def teardown_method(self): + """Clean up resources after each test method.""" + self.sim.destroy() + + +class TestContactRaster(ContactTest): + def setup_method(self): + self.setup_simulation("cpu", enable_rt=False) + + +class TestContactRasterCuda(ContactTest): + def setup_method(self): + self.setup_simulation("cuda", enable_rt=False) + + +class TestContactFastRT(ContactTest): + def setup_method(self): + self.setup_simulation("cpu", enable_rt=True) + + +class TestContactFastRTCuda(ContactTest): + def setup_method(self): + self.setup_simulation("cuda", enable_rt=True) + + +if __name__ == "__main__": + test = ContactTest() + test.setup_simulation("cuda", enable_rt=True) + test.test_fetch_contact() diff --git a/tests/sim/solvers/test_differential_solver.py b/tests/sim/solvers/test_differential_solver.py index 3ce4b734..fc962206 100644 --- a/tests/sim/solvers/test_differential_solver.py +++ b/tests/sim/solvers/test_differential_solver.py @@ -119,14 +119,9 @@ def test_differential_solver(self, arm_name: str): assert res[0] == False assert ik_qpos.shape == (1, dof) - @classmethod - def teardown_class(cls): - if cls.sim is not None: - try: - cls.sim.destroy() - print("sim destroyed successfully") - except Exception as e: - print(f"Error during sim.destroy(): {e}") + def teardown_method(self): + """Clean up resources after each test method.""" + self.sim.destroy() class TestDifferentialSolver(BaseSolverTest): diff --git a/tests/sim/solvers/test_opw_solver.py b/tests/sim/solvers/test_opw_solver.py index bf6a7c4a..2174389f 100644 --- a/tests/sim/solvers/test_opw_solver.py +++ b/tests/sim/solvers/test_opw_solver.py @@ -117,14 +117,9 @@ def test_ik(self, arm_name: str): assert res[0] == False assert ik_qpos.shape == (1, dof) - @classmethod - def teardown_class(cls): - if cls.sim is not None: - try: - cls.sim.destroy() - print("sim destroyed successfully") - except Exception as e: - print(f"Error during sim.destroy(): {e}") + def teardown_method(self): + """Clean up resources after each test method.""" + self.sim.destroy() class TestOPWSolver(BaseSolverTest): diff --git a/tests/sim/solvers/test_pink_solver.py b/tests/sim/solvers/test_pink_solver.py index 537374a3..43af9fed 100644 --- a/tests/sim/solvers/test_pink_solver.py +++ b/tests/sim/solvers/test_pink_solver.py @@ -123,14 +123,9 @@ def test_differential_solver(self): assert res[0] == False assert ik_qpos.shape == (1, dof) - @classmethod - def teardown_class(cls): - if cls.sim is not None: - try: - cls.sim.destroy() - print("sim destroyed successfully") - except Exception as e: - print(f"Error during sim.destroy(): {e}") + def teardown_method(self): + """Clean up resources after each test method.""" + self.sim.destroy() @pytest.mark.skip(reason="Skipping Pink tests temporarily") diff --git a/tests/sim/solvers/test_pinocchio_solver.py b/tests/sim/solvers/test_pinocchio_solver.py index 3650a853..fdd33d54 100644 --- a/tests/sim/solvers/test_pinocchio_solver.py +++ b/tests/sim/solvers/test_pinocchio_solver.py @@ -104,14 +104,9 @@ def test_ik(self, arm_name: str): assert res[0] == False assert ik_qpos.shape == (1, dof) - @classmethod - def teardown_class(cls): - if cls.sim is not None: - try: - cls.sim.destroy() - print("sim destroyed successfully") - except Exception as e: - print(f"Error during sim.destroy(): {e}") + def teardown_method(self): + """Clean up resources after each test method.""" + self.sim.destroy() class TestPinocchioSolver(BaseSolverTest): diff --git a/tests/sim/solvers/test_pytorch_solver.py 
b/tests/sim/solvers/test_pytorch_solver.py index 75517700..50382f64 100644 --- a/tests/sim/solvers/test_pytorch_solver.py +++ b/tests/sim/solvers/test_pytorch_solver.py @@ -107,14 +107,9 @@ def test_ik(self, arm_name: str): assert res[0] == False assert ik_qpos.shape == (1, dof) - @classmethod - def teardown_class(cls): - if cls.sim is not None: - try: - cls.sim.destroy() - print("sim destroyed successfully") - except Exception as e: - print(f"Error during sim.destroy(): {e}") + def teardown_method(self): + """Clean up resources after each test method.""" + self.sim.destroy() class TestPytorchSolver(BaseSolverTest): diff --git a/tests/sim/solvers/test_srs_solver.py b/tests/sim/solvers/test_srs_solver.py index 96a644b2..b4925c85 100644 --- a/tests/sim/solvers/test_srs_solver.py +++ b/tests/sim/solvers/test_srs_solver.py @@ -269,14 +269,9 @@ def test_robot_ik(self, arm_name: str): assert res[0] == False assert ik_qpos.shape == (1, dof) - @classmethod - def teardown_class(cls): - if cls.sim is not None: - try: - cls.sim.destroy() - print("sim destroyed successfully") - except Exception as e: - print(f"Error during sim.destroy(): {e}") + def teardown_method(self): + """Clean up resources after each test method.""" + self.sim.destroy() class TestSRSCPUSolver(BaseSolverTest): diff --git a/tests/sim/utility/test_workspace_analyze.py b/tests/sim/utility/test_workspace_analyze.py index 2226057e..72b839be 100644 --- a/tests/sim/utility/test_workspace_analyze.py +++ b/tests/sim/utility/test_workspace_analyze.py @@ -76,14 +76,9 @@ def setup_simulation(self): self.robot: Robot = self.sim.add_robot(cfg=CobotMagicCfg.from_dict(cfg_dict)) - @classmethod - def teardown_class(cls): - if cls.sim is not None: - try: - cls.sim.destroy() - print("sim destroyed successfully") - except Exception as e: - print(f"Error during sim.destroy(): {e}") + def teardown_method(self): + """Clean up resources after each test method.""" + self.sim.destroy() class TestJointWorkspaceAnalyzeTest(BaseWorkspaceAnalyzeTest):