Teaching robots to grasp via 3D octrees, not pixels
A ROS 2/Gazebo training stack that swaps camera images for sparse octree structures to learn manipulation policies, then transfers them to real arms and to mobile manipulators in a lunar-analogue facility.

What it does
This repo is a full RL training pipeline for robotic grasping built on ROS 2, Gazebo Fortress, and Stable-Baselines3. It provides Gym-compatible environments where a robot arm learns to reach and grasp objects using continuous Cartesian actions. The twist: observations are 3D octrees, not the usual RGB or depth images, fed through an O-CNN-based feature extractor. Trained policies can deploy to real hardware via ros2_control, and the stack includes a planetary-grasping task for mobile manipulators.
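To make the interface concrete, here is a minimal sketch of the Gym-style loop such an environment exposes, using the classic 4-tuple `step` signature. `ToyReachEnv`, `run_episode`, and `greedy_policy` are hypothetical names written for this example, not classes from the repo. The observation is just the gripper-to-target offset rather than an octree, the step size and tolerance are assumed values, and a hand-written policy stands in for a learned one.

```python
import random


class ToyReachEnv:
    """Hypothetical stand-in for a Gym-style reach task (not the repo's class).

    Action: (dx, dy, dz), a continuous Cartesian displacement, each in [-1, 1].
    Observation: vector from gripper to target (a real env would return an octree).
    """

    MAX_STEP = 0.05   # metres moved per unit of action (assumed value)
    TOLERANCE = 0.05  # distance at which the reach counts as done (assumed value)

    def __init__(self, seed=0, max_steps=200):
        self.rng = random.Random(seed)
        self.max_steps = max_steps

    def reset(self):
        self.steps = 0
        self.gripper = [0.0, 0.0, 0.5]
        self.target = [self.rng.uniform(-0.3, 0.3) for _ in range(3)]
        return self._observe()

    def step(self, action):
        self.steps += 1
        for i, a in enumerate(action):
            a = max(-1.0, min(1.0, a))  # clip to the action bounds
            self.gripper[i] += a * self.MAX_STEP
        obs = self._observe()
        distance = sum(d * d for d in obs) ** 0.5
        reward = -distance  # dense shaping: closer is better
        done = distance < self.TOLERANCE or self.steps >= self.max_steps
        return obs, reward, done, {"distance": distance}

    def _observe(self):
        return [t - g for t, g in zip(self.target, self.gripper)]


def run_episode(env, policy):
    """Roll out one episode and return the final distance to the target."""
    obs, done, info = env.reset(), False, {}
    while not done:
        obs, _, done, info = env.step(policy(obs))
    return info["distance"]


def greedy_policy(obs):
    """Head straight for the target; the env clips the action to its bounds."""
    return [20.0 * d for d in obs]
```

Calling `run_episode(ToyReachEnv(seed=1), greedy_policy)` drives the toy gripper to within the tolerance in a handful of steps. In the real stack, an RL algorithm would take the place of `greedy_policy` and the simulator would sit behind `step`.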
The interesting bit
The project treats octrees as the native observation format end-to-end, using a custom PyTorch 3D CNN feature extractor that plugs directly into SB3’s actor-critic networks. That’s unusual: most sim-to-real work leans on 2D vision. The authors also went to the trouble of transferring a Moon-rock-grasping policy zero-shot to a lunar-analogue facility, which is either admirably ambitious or a sign they had access to very patient colleagues in Luxembourg.
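For readers who have not worked with octrees, the following is a minimal sketch of why the representation is sparse: a cube is split into eight octants recursively, and only octants that contain points are kept. `build_octree` and `count_leaves` are hypothetical helpers written for this illustration. They do not reproduce O-CNN's packed data layout or the features the repo attaches to each cell.

```python
def build_octree(points, center, half_size, depth):
    """Recursively subdivide a cube, keeping only octants that contain points.

    Returns None for empty space, a point count at the finest level,
    and a dict {octant_index: child} for interior nodes.
    """
    if not points:
        return None
    if depth == 0:
        return len(points)
    buckets = {}
    for p in points:
        # 3-bit octant index: one bit per axis, set when the point is on the upper side
        index = sum((p[axis] >= center[axis]) << axis for axis in range(3))
        buckets.setdefault(index, []).append(p)
    children = {}
    quarter = half_size / 2.0
    for index, bucket in buckets.items():
        child_center = tuple(
            center[axis] + (quarter if (index >> axis) & 1 else -quarter)
            for axis in range(3)
        )
        children[index] = build_octree(bucket, child_center, quarter, depth - 1)
    return children


def count_leaves(node):
    """Number of occupied cells at the finest level."""
    if node is None:
        return 0
    if isinstance(node, int):
        return 1
    return sum(count_leaves(child) for child in node.values())
```

For the three points `(0.1, 0.1, 0.1)`, `(0.12, 0.11, 0.1)`, and `(0.9, 0.9, 0.9)` in a unit cube at depth 4, the first two share a cell, so the tree holds 2 occupied leaves where a dense grid at the same resolution would hold 16³ = 4096 voxels.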
Key highlights
- Multiple observation variants per task (state, RGB, depth, octree, octree+color/intensity) for direct ablation
- Built-in domain randomization via ManipulationGazeboEnvRandomizer: textures, lighting, object properties, even procedural rock generation
- Supports TD3, SAC, TQC out of the box; DreamerV2 setup included but RGB-only
- Docker-based setup recommended; real-robot evaluation supported through ROS 2 control stack
- Curriculum learning (GraspCurriculum) shapes reward and difficulty automatically
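The curriculum idea in the last bullet can be sketched roughly as follows. This is a generic success-rate curriculum, not the repo's GraspCurriculum: the class `StageCurriculum`, the stage names, the promotion threshold, and the window size are all assumptions made for the example.

```python
from collections import deque


class StageCurriculum:
    """Hypothetical success-rate curriculum (illustrative; not the repo's GraspCurriculum)."""

    STAGES = ("reach", "touch", "grasp", "lift")  # assumed stage names

    def __init__(self, promote_at=0.8, window=20):
        self.promote_at = promote_at  # rolling success rate needed to advance
        self.window = window          # number of recent episodes considered
        self.stage = 0
        self.recent = deque(maxlen=window)

    @property
    def stage_name(self):
        return self.STAGES[self.stage]

    def reward(self, stage_reached):
        """Pay out only for reaching the current stage or beyond."""
        return 1.0 if stage_reached >= self.stage else 0.0

    def record_episode(self, stage_reached):
        """Log one episode outcome; promote when the rolling success rate is high enough."""
        self.recent.append(stage_reached >= self.stage)
        window_full = len(self.recent) == self.window
        success_rate = sum(self.recent) / self.window
        is_last_stage = self.stage == len(self.STAGES) - 1
        if window_full and success_rate >= self.promote_at and not is_last_stage:
            self.stage += 1
            self.recent.clear()  # the new stage starts with a fresh history
```

The design choice here is that the agent is first rewarded for merely reaching the object, and the bar rises only once it clears the current stage reliably, so early exploration still sees a learning signal.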
Caveats
- No parallel env support, so training is slow; the authors acknowledge physics, rendering, and low-level control as the bottlenecks
- Default hyperparameters may be suboptimal; the authors note hyperparameter search is “prolonged” and brittle here
- Experiments are nondeterministic even with fixed seeds, due to network communication and GPU nondeterminism
Verdict
Worth a look if you’re doing sim-to-real manipulation research and want a drop-in ROS 2/Gazebo stack with an unusual 3D representation. Skip it if you need fast iteration or production-grade reproducibility: the training speed and nondeterminism are real constraints.