Is SpatialLM open source?

Yes — manycore-research/SpatialLM is an open-source project tracked on heatdrop.

What language is SpatialLM written in?

manycore-research/SpatialLM is primarily written in Python.

How popular is SpatialLM?

manycore-research/SpatialLM has 4.6k stars on GitHub.

Where can I find SpatialLM?

manycore-research/SpatialLM is on GitHub at https://github.com/manycore-research/SpatialLM.

← all repositories

manycore-research/SpatialLM

A 0.5B-parameter model that reads messy room scans like a floor plan

SpatialLM turns noisy point clouds from any source—phone video, RGBD, LiDAR—into structured 3D layouts with walls, doors, and furniture boxes.

★4.6k stars Python Language Models Computer Vision

View on GitHub ↗ Homepage ↗

Not currently ranked — collecting fresh signals.

star history

What it does

SpatialLM ingests 3D point clouds and outputs structured indoor geometry: walls, doors, windows, and oriented bounding boxes for objects with semantic labels. It accepts data from monocular video, RGBD images, or LiDAR—no specialized capture rig required. The repo includes inference scripts, evaluation tools, and fine-tuning instructions via FINETUNE.md.

The interesting bit

The twist is treating this as a language modeling problem. SpatialLM1.1 uses an LLM backbone (Llama-1B or Qwen-0.5B) and a point encoder called Sonata, letting users prompt with specific furniture categories—“bed nightstand”—and get filtered detections back. The model was trained on a synthetic dataset of 12,328 scenes (54,778 rooms) designed by professional 3D artists, then fine-tuned on real benchmarks like Structured3D.

Key highlights

Four model sizes on Hugging Face, down to 0.5B parameters
Three task modes: full structured reconstruction, layout-only (walls/doors/windows), or object detection
59 furniture categories available for user-specified filtering in v1.1
Test set of 107 point clouds reconstructed from monocular RGB video via MASt3R-SLAM, intentionally noisy to stress-test robustness
Benchmarks include corrected metrics after a community bugfix in the eval script

Caveats

Input point clouds must be axis-aligned with z-up; the README calls this “crucial for consistency”
Installation involves building wheels for torchsparse or flash-attn, which “will take a while”
The README notes v1.1 doubles point cloud resolution over v1.0, but doesn’t quantify the memory or speed tradeoff

Verdict

Worth a look if you’re doing embodied robotics, indoor navigation, or any pipeline that needs to go from raw 3D capture to structured scene graphs. Skip it if you need outdoor scenes or real-time on-edge inference—the smallest model is 0.5B parameters and the setup assumes CUDA 12.4.

Frequently asked

What is manycore-research/SpatialLM?: SpatialLM turns noisy point clouds from any source—phone video, RGBD, LiDAR—into structured 3D layouts with walls, doors, and furniture boxes.
Is SpatialLM open source?: Yes — manycore-research/SpatialLM is an open-source project tracked on heatdrop.
What language is SpatialLM written in?: manycore-research/SpatialLM is primarily written in Python.
How popular is SpatialLM?: manycore-research/SpatialLM has 4.6k stars on GitHub.
Where can I find SpatialLM?: manycore-research/SpatialLM is on GitHub at https://github.com/manycore-research/SpatialLM.