astra-vision/MonoScene

Single-image 3D scene completion without the LiDAR bill

MonoScene predicts dense 3D semantic occupancy from one RGB image, filling in occluded geometry that the camera never actually saw.

★815 stars Python Computer Vision Image · Video · Audio Domain Apps


What it does

MonoScene takes a single RGB image and outputs a 3D semantic occupancy grid, predicting both visible and occluded geometry along with per-voxel class labels. It targets outdoor driving data (SemanticKITTI, KITTI-360) and indoor scenes (NYUv2), and offers pretrained checkpoints for immediate evaluation.
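To make "3D semantic occupancy grid" concrete, here is a minimal, dependency-free sketch of such a grid as a data structure. It is not MonoScene's own representation or API: the grid size, the `EMPTY` label, the class names, and the helpers `make_grid` and `summarize` are all assumptions chosen for illustration.

```python
from collections import Counter

EMPTY = 0  # assumed label for free space (hypothetical convention)
CLASS_NAMES = {1: "road", 2: "car"}  # made-up class IDs, for illustration only


def make_grid(nx, ny, nz, fill=EMPTY):
    """Build a dense voxel grid as nested lists, indexed grid[x][y][z]."""
    return [[[fill] * nz for _ in range(ny)] for _ in range(nx)]


def summarize(grid):
    """Count voxels per class and report how many voxels are occupied."""
    counts = Counter(
        label for plane in grid for column in plane for label in column
    )
    total = sum(counts.values())
    occupied = total - counts[EMPTY]
    per_class = {CLASS_NAMES[k]: v for k, v in counts.items() if k != EMPTY}
    return {"total": total, "occupied": occupied, "per_class": per_class}


# Toy 4 x 4 x 2 scene: a flat "road" layer with a two-voxel "car" on top.
grid = make_grid(4, 4, 2)
for x in range(4):
    for y in range(4):
        grid[x][y][0] = 1
grid[1][1][1] = 2
grid[1][2][1] = 2

summary = summarize(grid)
print(summary)
```

Each voxel carries exactly one integer label, so a model's output can be thought of as a label volume like `grid`, where the voxels hidden from the camera are filled in by prediction rather than observation.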

The interesting bit

The repository treats monocular scene completion as a practical prediction problem rather than a theoretical exercise, shipping with a Hugging Face demo and multi-GPU training scripts. That said, the README is almost entirely a setup manual; the actual network architecture and loss design are left to the paper.

Key highlights

Single-image inference on SemanticKITTI, NYUv2, and KITTI-360
Pretrained models available for direct download and evaluation
Live demo hosted on Hugging Face for quick browser testing
Visualization pipeline via mayavi, with an Open3D workaround noted in the issues
Multi-GPU PyTorch training scripts included for SemanticKITTI and NYUv2

Caveats

The README is a setup guide with almost no architectural detail; you will need the paper to understand how the model actually works.
Dependency stack is locked to older releases (Python 3.7, PyTorch 1.7.1, CUDA 10.2).
KITTI-360 support is inference-only; training is limited to SemanticKITTI and NYUv2.
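Because the stack is pinned to older releases, it can help to check an environment before trying to run anything. The sketch below compares the interpreter and, if installed, PyTorch against the versions listed above. The helper names `detect_versions` and `mismatches` and the `PINNED` table are hypothetical and not part of the repository; only `torch.__version__` and `torch.version.cuda` are real PyTorch attributes.

```python
import importlib.util
import sys

# Versions named in the caveat above.
PINNED = {"python": "3.7", "torch": "1.7.1", "cuda": "10.2"}


def detect_versions():
    """Report the local Python, PyTorch, and CUDA build versions (None if absent)."""
    found = {
        "python": "{}.{}".format(*sys.version_info[:2]),
        "torch": None,
        "cuda": None,
    }
    # Import torch only if it is installed, so the check runs anywhere.
    if importlib.util.find_spec("torch") is not None:
        import torch

        # Drop local build suffixes such as "+cu102".
        found["torch"] = torch.__version__.split("+")[0]
        found["cuda"] = torch.version.cuda  # None on CPU-only builds
    return found


def mismatches(found, pinned=PINNED):
    """Return {name: (found, wanted)} for every component that differs."""
    return {
        name: (found.get(name), wanted)
        for name, wanted in pinned.items()
        if found.get(name) != wanted
    }


if __name__ == "__main__":
    for name, (have, want) in mismatches(detect_versions()).items():
        print(f"{name}: found {have}, expected {want}")
```

An exact-match comparison is deliberately strict here; whether newer versions happen to work is not something this check can tell you.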

Verdict

A solid CVPR 2022 baseline if you are benchmarking camera-only 3D occupancy methods. Look elsewhere if you need a modern, well-documented training framework or turnkey dependency management.
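For benchmarking, semantic scene completion results are commonly summarized with a class-agnostic occupancy IoU and a per-class mean IoU. The following is a minimal sketch of those two metrics over flattened voxel labels, not the repository's evaluation code: the function names, the `empty=0` convention, and the toy labels are assumptions, and it omits the masking of unlabeled voxels that real benchmarks typically apply.

```python
def completion_iou(pred, target, empty=0):
    """Class-agnostic IoU: treats every non-empty voxel as 'occupied'."""
    inter = union = 0
    for p, t in zip(pred, target):
        p_occ, t_occ = p != empty, t != empty
        inter += p_occ and t_occ
        union += p_occ or t_occ
    return inter / union if union else float("nan")


def mean_iou(pred, target, classes):
    """Average of per-class IoU, skipping classes absent from both inputs."""
    ious = []
    for c in classes:
        inter = sum(1 for p, t in zip(pred, target) if p == c and t == c)
        union = sum(1 for p, t in zip(pred, target) if p == c or t == c)
        if union:
            ious.append(inter / union)
    return sum(ious) / len(ious) if ious else float("nan")


# Toy flattened grids: 0 = empty, 1 and 2 = hypothetical semantic classes.
pred = [0, 1, 1, 2, 0, 2]
target = [0, 1, 2, 2, 1, 0]

print(completion_iou(pred, target))    # occupancy overlap: 3 of 5 voxels
print(mean_iou(pred, target, [1, 2]))  # each class overlaps on 1 of 3 voxels
```

The two numbers answer different questions: the first measures how well geometry is recovered regardless of class, while the second penalizes assigning the wrong label to correctly occupied voxels.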

Frequently asked

What is astra-vision/MonoScene? MonoScene predicts dense 3D semantic occupancy from one RGB image, filling in occluded geometry that the camera never actually saw.
Is MonoScene open source? Yes, astra-vision/MonoScene is open source, released under the Apache-2.0 license.
What language is MonoScene written in? astra-vision/MonoScene is primarily written in Python.
How popular is MonoScene? astra-vision/MonoScene has 815 stars on GitHub.
Where can I find MonoScene? astra-vision/MonoScene is on GitHub at https://github.com/astra-vision/MonoScene.