Is lingbot-map open source?

Yes — Robbyant/lingbot-map is open source, released under the Apache-2.0 license.

What language is lingbot-map written in?

Robbyant/lingbot-map is primarily written in Python.

How popular is lingbot-map?

Robbyant/lingbot-map has 14.7k stars on GitHub, and its star growth is currently accelerating.

Where can I find lingbot-map?

Robbyant/lingbot-map is on GitHub at https://github.com/Robbyant/lingbot-map.

Robbyant/lingbot-map

Real-time 3D mapping without the per-scene training ritual

It reconstructs 3D scenes from streaming video in real time without per-scene optimization, using a feed-forward transformer that remembers trajectory and corrects drift as it goes.

★ 14.7k stars · Python · Computer Vision

Velocity (7d): +1071 ★ / day · Trend: accelerating

What it does

LingBot-Map ingests a stream of RGB frames and emits a 3D reconstruction on the fly. It is a feed-forward foundation model, so it skips the hours-long per-scene training that NeRFs and Gaussian splatting usually require. For short sequences it launches an interactive browser viewer; for very long walkthroughs it falls back to an offline batch renderer.

The interesting bit

The architecture treats streaming reconstruction like a language model: it uses paged KV-cache attention, anchor contexts, and a pose-reference window to maintain trajectory memory and correct long-range drift. This lets it process over 10,000 frames at roughly 20 FPS on 518×378 video without iterative refinement. The README notes a practical limit: performance degrades if the KV cache exceeds 320 views, unless you apply keyframe subsampling or switch to windowed mode.
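To make the 320-view limit concrete, here is a minimal sketch of the arithmetic behind keyframe subsampling. It assumes the simplest possible scheme (keep every k-th frame); the helper names `keyframe_interval` and `keyframe_indices` are hypothetical and are not taken from the lingbot-map codebase, which may choose keyframes differently.

```python
import math

# Assumption: 320 is the KV-cache view limit quoted from the README above.
MAX_CACHED_VIEWS = 320


def keyframe_interval(num_frames: int, max_views: int = MAX_CACHED_VIEWS) -> int:
    """Smallest stride k such that keeping every k-th frame
    yields at most max_views keyframes (hypothetical helper)."""
    if num_frames <= 0 or max_views <= 0:
        raise ValueError("num_frames and max_views must be positive")
    return max(1, math.ceil(num_frames / max_views))


def keyframe_indices(num_frames: int, interval: int) -> list[int]:
    """Indices of the frames kept as keyframes for a given stride."""
    return list(range(0, num_frames, interval))


if __name__ == "__main__":
    k = keyframe_interval(10_000)
    kept = keyframe_indices(10_000, k)
    print(f"interval={k}, keyframes={len(kept)}")
```

Under this scheme a 10,000-frame sequence needs an interval of 32, which leaves 313 keyframes in the cache. A sequence of 320 frames or fewer needs no subsampling at all.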

Key highlights

Feed-forward inference: no scene-specific training, just a pretrained checkpoint and a folder of images.
Long-sequence support: handles 10,000+ frames via keyframe intervals and a sliding-window mode for sequences beyond 3,000 frames.
Built-in drift correction: unifies coordinate grounding, geometric cues, and trajectory memory in a single transformer pass.
Hardware-aware attention: uses FlashInfer for paged KV-cache attention, falling back to PyTorch SDPA if unavailable.
Released benchmarks: evaluation scripts are up for KITTI and Oxford Spires, with more datasets on the way.
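The FlashInfer-or-SDPA fallback mentioned above follows a common pattern: probe for the optional dependency and pick a backend accordingly. The sketch below shows that pattern only; the function name `pick_attention_backend` and the returned labels are hypothetical, and the project's actual selection logic may differ. It assumes FlashInfer is importable as `flashinfer`.

```python
import importlib.util


def pick_attention_backend() -> str:
    """Return "flashinfer" if the optional package is installed,
    otherwise "sdpa" (hypothetical labels for illustration)."""
    # find_spec returns None when a top-level module cannot be found,
    # so this check does not import the package or touch the GPU.
    if importlib.util.find_spec("flashinfer") is not None:
        return "flashinfer"
    # Fallback: PyTorch's built-in
    # torch.nn.functional.scaled_dot_product_attention.
    return "sdpa"


if __name__ == "__main__":
    print(f"attention backend: {pick_attention_backend()}")
```

Probing with `find_spec` rather than a bare `import` keeps the check cheap and avoids import-time side effects on machines where the package is missing.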

Caveats

The model’s effective range is bounded by the longest distance seen during training; beyond that, poses can collapse unless you enable windowed inference.
The interactive viewer does not scale to very long sequences: the 25,000-frame demo requires the offline rendering pipeline instead.
Several benchmarks (TUM-D, 7-scenes, ETH3D, Tanks and Temples, NRGBD) and outdoor aerial demos are still listed as pending in the project TODO.
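For intuition about the windowed inference mentioned in the first caveat, here is a generic sketch of splitting a long frame sequence into overlapping windows. It is an assumption-laden illustration: the summary does not say how lingbot-map implements its windowed mode, and the function name `sliding_windows` and the window and overlap values are hypothetical.

```python
def sliding_windows(num_frames: int, window: int, overlap: int) -> list[tuple[int, int]]:
    """Split frame indices [0, num_frames) into overlapping
    half-open ranges (start, end). Hypothetical helper."""
    if window <= 0 or not 0 <= overlap < window:
        raise ValueError("need window > 0 and 0 <= overlap < window")
    if num_frames <= 0:
        return []
    step = window - overlap  # how far each window advances
    windows = []
    start = 0
    while True:
        end = min(start + window, num_frames)
        windows.append((start, end))
        if end == num_frames:  # last window reaches the final frame
            break
        start += step
    return windows


if __name__ == "__main__":
    # Illustrative values only: 320-frame windows sharing 64 frames.
    for start, end in sliding_windows(1000, window=320, overlap=64):
        print(start, end)
```

The overlap gives consecutive windows shared frames, which is one common way to keep their coordinate frames aligned when each window is processed with bounded memory.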

Verdict

Worth a look if you need fast, pretrained 3D reconstruction for robotics, AR, or video analysis without tuning a radiance field. Skip it if you need guaranteed metric accuracy beyond the training distribution or a fully polished evaluation against every classic SLAM benchmark.
