AndrejOrsula/drl_grasping

Teaching robots to grasp via 3D octrees, not pixels

A ROS 2/Gazebo training stack that swaps camera images for sparse octree structures to learn manipulation policies, then transfers them to real arms and a lunar-analogue test facility.

513 stars · Python · +0.2 ★/day over 7 days (trend: steady)

What it does

This repo is a full RL training pipeline for robotic grasping built on ROS 2, Gazebo Fortress, and Stable-Baselines3. It provides Gym-compatible environments where a robot arm learns to reach and grasp objects using continuous Cartesian actions. The twist: observations are 3D octrees, not the usual RGB or depth images, fed through an O-CNN-based feature extractor. Trained policies can deploy to real hardware via ros2_control, and the stack includes a planetary-grasping task for mobile manipulators.

The interesting bit

The project treats octrees as the native observation format end-to-end, using a custom PyTorch 3D CNN feature extractor that plugs directly into SB3’s actor-critic networks. That’s unusual: most sim-to-real work leans on 2D vision. The authors also went to the trouble of zero-shot transferring a Moon-rock-grasping policy to a lunar-analogue facility, which is either admirably ambitious or a sign they had access to very patient colleagues in Luxembourg.
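To make the octree idea concrete, here is a minimal pure-Python sketch of a sparse octree built from a point cloud. It is an illustration only, not the repo's code: the repo relies on an O-CNN-based extractor, and the names `OctreeNode`, `build_octree`, and `count_leaves`, along with the bounds and depth used, are hypothetical.

```python
from dataclasses import dataclass, field

Point = tuple[float, float, float]


@dataclass
class OctreeNode:
    """One cubic cell; only occupied children are ever allocated."""
    center: Point
    half_size: float
    depth: int
    children: dict[int, "OctreeNode"] = field(default_factory=dict)


def child_index(center: Point, p: Point) -> int:
    """Pack the three 'is p on the high side of the cell?' tests into 0..7."""
    return (
        int(p[0] >= center[0])
        | (int(p[1] >= center[1]) << 1)
        | (int(p[2] >= center[2]) << 2)
    )


def build_octree(points: list[Point], center: Point, half_size: float,
                 max_depth: int, depth: int = 0) -> OctreeNode:
    """Recursively subdivide, creating children only where points fall."""
    node = OctreeNode(center, half_size, depth)
    if depth == max_depth or not points:
        return node

    # Group points by the octant they land in.
    buckets: dict[int, list[Point]] = {}
    for p in points:
        buckets.setdefault(child_index(center, p), []).append(p)

    quarter = half_size / 2
    for idx, pts in buckets.items():
        # Bit `axis` of idx says whether the child sits on the high side.
        child_center = tuple(
            c + (quarter if (idx >> axis) & 1 else -quarter)
            for axis, c in enumerate(center)
        )
        node.children[idx] = build_octree(pts, child_center, quarter,
                                          max_depth, depth + 1)
    return node


def count_leaves(node: OctreeNode) -> int:
    """Count occupied cells at the finest allocated level."""
    if not node.children:
        return 1
    return sum(count_leaves(child) for child in node.children.values())


if __name__ == "__main__":
    # Hypothetical cloud: two nearby points and one far away, in a unit cube.
    cloud = [(0.1, 0.1, 0.1), (0.12, 0.11, 0.1), (-0.4, 0.3, 0.2)]
    root = build_octree(cloud, center=(0.0, 0.0, 0.0), half_size=0.5, max_depth=3)
    print(count_leaves(root))
```

At depth 3 a dense grid over the same cube would hold 8 × 8 × 8 = 512 cells, while this tree allocates only two leaves for the three points. That sparsity is the general appeal of octrees over dense voxel grids as a 3D input.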

Key highlights

  • Multiple observation variants per task (state, RGB, depth, octree, octree+color/intensity) for direct ablation
  • Built-in domain randomization via ManipulationGazeboEnvRandomizer: textures, lighting, object properties, even procedural rock generation
  • Supports TD3, SAC, TQC out of the box; DreamerV2 setup included but RGB-only
  • Docker-based setup recommended; real-robot evaluation supported through ROS 2 control stack
  • Curriculum learning (GraspCurriculum) shapes reward and difficulty automatically
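The summary does not say how GraspCurriculum decides when to raise difficulty. As a rough illustration of the general idea, the sketch below advances through stages once the recent success rate crosses a threshold. The class `SuccessRateCurriculum`, the stage names, the threshold, and the window size are all assumptions for this example, not the repo's API.

```python
from collections import deque


class SuccessRateCurriculum:
    """Advance to the next stage once recent episodes succeed often enough."""

    def __init__(self, stages, promote_at=0.8, window=20):
        self.stages = list(stages)
        self.promote_at = promote_at      # required success rate (assumed value)
        self.window = window              # episodes averaged over (assumed value)
        self.stage_index = 0
        self._recent = deque(maxlen=window)

    @property
    def stage(self):
        return self.stages[self.stage_index]

    def record(self, success: bool):
        """Log one episode outcome and return the stage for the next episode."""
        self._recent.append(bool(success))
        window_full = len(self._recent) == self.window
        is_last_stage = self.stage_index == len(self.stages) - 1
        if window_full and not is_last_stage:
            if sum(self._recent) / self.window >= self.promote_at:
                self.stage_index += 1
                self._recent.clear()      # start fresh statistics for the new stage
        return self.stage


if __name__ == "__main__":
    # Hypothetical stage names, ordered from easiest to hardest.
    curriculum = SuccessRateCurriculum(["reach", "touch", "grasp", "lift"],
                                       promote_at=0.8, window=5)
    for outcome in [True, True, False, True, True, True]:
        print(curriculum.record(outcome))
```

In a training loop, the current stage would typically select which reward terms are active or how hard the sampled scene is. Clearing the window on promotion avoids carrying easy-stage successes into the harder stage's statistics.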

Caveats

  • No parallel environment support, so training is slow; the authors acknowledge bottlenecks in physics, rendering, and low-level control
  • Default hyperparameters may be suboptimal; the authors note hyperparameter search is “prolonged” and brittle here
  • Experiments are nondeterministic even with fixed seeds, due to network comms and GPU non-determinism

Verdict

Worth a look if you’re doing sim-to-real manipulation research and want a drop-in ROS 2/Gazebo stack with an unusual 3D representation. Skip it if you need fast iteration or production-grade reproducibility: the training speed and nondeterminism are real constraints.
