Is visualnav-transformer open source?

Yes — robodhruv/visualnav-transformer is open source, released under the MIT license.

What language is visualnav-transformer written in?

robodhruv/visualnav-transformer is primarily written in Python.

How popular is visualnav-transformer?

robodhruv/visualnav-transformer has 1.3k stars on GitHub.

Where can I find visualnav-transformer?

robodhruv/visualnav-transformer is on GitHub at https://github.com/robodhruv/visualnav-transformer.

robodhruv/visualnav-transformer

Foundation models for robots that drive anything with a camera

Official release of BAIR's family of visual navigation models, letting you fine-tune a pre-trained policy instead of teaching your robot to see from scratch.

★1.3k stars Python Domain Apps Agents ML Frameworks

What it does

The repository holds the training pipeline, model checkpoints, and deployment scripts for GNM, ViNT, and NoMaD, a trio of goal-conditioned visual navigation models from Berkeley. Trained on pooled datasets collected on multiple robot platforms (RECON, TartanDrive, SCAND, and others), the policies learn to navigate toward a target image rather than to a set of coordinates. The release includes tools to process trajectory data into training sets, fine-tune from published checkpoints, and deploy on a physical robot using a topological graph built from demo trajectories.
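To make "goal-conditioned" concrete, here is a minimal sketch of how a recorded trajectory might be turned into training tuples: pick a frame as the current observation, pick a later frame as the goal image, and label the pair with the temporal distance and the next few positions. The function name `sample_goal_pairs`, the `Sample` fields, and the parameters `max_goal_distance` and `horizon` are hypothetical, chosen for illustration; this is not the repository's data-processing code.

```python
import random
from typing import List, NamedTuple, Sequence, Tuple

Position = Tuple[float, float]


class Sample(NamedTuple):
    obs_index: int             # frame the robot currently sees
    goal_index: int            # later frame used as the goal image
    distance: int              # temporal distance in frames
    waypoints: List[Position]  # next positions, offset from the current one


def sample_goal_pairs(
    positions: Sequence[Position],
    num_samples: int,
    max_goal_distance: int = 20,  # hypothetical cap on how far ahead a goal may be
    horizon: int = 5,             # hypothetical number of future waypoints to label
    seed: int = 0,
) -> List[Sample]:
    """Draw (observation, goal) index pairs from one trajectory."""
    rng = random.Random(seed)
    last_start = len(positions) - horizon - 1
    if horizon < 1 or last_start < 0:
        raise ValueError("trajectory is too short for the requested horizon")

    samples = []
    for _ in range(num_samples):
        t = rng.randint(0, last_start)
        # The goal must lie strictly in the future and inside the trajectory.
        max_d = min(max_goal_distance, len(positions) - 1 - t)
        d = rng.randint(1, max_d)
        x0, y0 = positions[t]
        # Translation-only offsets; a real pipeline would presumably also
        # rotate them into the robot's heading frame.
        waypoints = [
            (positions[t + k][0] - x0, positions[t + k][1] - y0)
            for k in range(1, horizon + 1)
        ]
        samples.append(Sample(t, t + d, d, waypoints))
    return samples
```

Each tuple gives a model an image pair as input and a distance plus short waypoint sequence as targets, which is why a target image can stand in for a coordinate goal.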

The interesting bit

The core pitch is cross-embodiment generalization: because the training data mixes many morphologies, the same policy is meant to drive a LoCoBot, a DJI Tello, or a Unitree A1 zero-shot. NoMaD extends this with a diffusion-based policy that can mask the goal, letting it switch between goal-directed navigation and undirected exploration.
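The goal-masking idea can be illustrated with a toy context builder. The sketch below assumes the policy is conditioned on a single context vector and uses mean pooling as a stand-in for the model's actual attention mechanism; `build_context`, `sample_mask`, and `mask_probability` are hypothetical names, and the diffusion policy that would consume the context is not modelled.

```python
import random
from typing import List, Sequence


def build_context(
    obs_tokens: Sequence[Sequence[float]],
    goal_token: Sequence[float],
    mask_goal: bool,
) -> List[float]:
    """Pool observation tokens, including the goal token only when unmasked."""
    tokens = [list(t) for t in obs_tokens]
    if not mask_goal:
        tokens.append(list(goal_token))
    dim = len(tokens[0])
    # Mean pooling is an assumption standing in for attention.
    return [sum(t[i] for t in tokens) / len(tokens) for i in range(dim)]


def sample_mask(rng: random.Random, mask_probability: float) -> bool:
    """During training, hide the goal at random so one model learns both modes."""
    return rng.random() < mask_probability


obs = [[1.0, 1.0], [3.0, 3.0]]
goal = [8.0, 8.0]
explore_context = build_context(obs, goal, mask_goal=True)    # goal ignored
navigate_context = build_context(obs, goal, mask_goal=False)  # goal included
```

With the goal masked, the context depends only on what the robot sees, so the same policy can propose exploratory actions; unmasked, the goal shifts the context and steers the actions toward the target image.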

Key highlights

Pre-trained checkpoints available for GNM, ViNT, and NoMaD
Trained on pooled public datasets including RECON, TartanDrive, SCAND, GoStanford2, and SACSoN/HuRoN
Independently deployed by researchers on LoCoBot, Clearpath Jackal, DJI Tello, Unitree A1, TurtleBot2, Vizbot, and in CARLA simulation
NoMaD uses goal masking diffusion policies to support both navigation and autonomous exploration
Topological graph navigation generated from subsampled demo trajectories
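The topological-graph highlight can be sketched as two steps: subsample a demo trajectory into an ordered list of node images, then at run time localize within a small window of nodes and pick the next subgoal. This is an illustrative outline only; `build_topomap`, `choose_subgoal`, `radius`, `close_threshold`, and the `predict_distance` callable (standing in for a learned distance model) are assumptions, not the repository's API.

```python
from typing import Callable, List, Sequence, Tuple, TypeVar

Frame = TypeVar("Frame")


def build_topomap(frames: Sequence[Frame], stride: int) -> List[Frame]:
    """Keep every `stride`-th frame as a node, always including the last frame."""
    if stride < 1:
        raise ValueError("stride must be at least 1")
    if not frames:
        return []
    nodes = list(frames[::stride])
    if (len(frames) - 1) % stride != 0:
        nodes.append(frames[-1])  # the final frame is the overall goal
    return nodes


def choose_subgoal(
    obs: Frame,
    topomap: Sequence[Frame],
    closest_node: int,
    predict_distance: Callable[[Frame, Frame], float],
    radius: int = 2,               # hypothetical search window around the last estimate
    close_threshold: float = 3.0,  # hypothetical "node reached" distance
) -> Tuple[int, int]:
    """Return (updated closest node, index of the node to steer toward)."""
    start = max(closest_node - radius, 0)
    end = min(closest_node + radius + 1, len(topomap))
    distances = [predict_distance(obs, topomap[i]) for i in range(start, end)]
    best = min(range(len(distances)), key=distances.__getitem__)
    closest = start + best
    if distances[best] < close_threshold:
        # Close enough to this node: aim for the next one, if any.
        subgoal = min(closest + 1, len(topomap) - 1)
    else:
        subgoal = closest
    return closest, subgoal
```

Searching only a window around the previous estimate keeps localization cheap and avoids jumping to visually similar nodes far along the route.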

Caveats

The stack assumes a specific legacy setup: Ubuntu 18.04/20.04, Python 3.7+, CUDA 10+, and ROS Noetic for deployment
Several training datasets used in the papers are unreleased; the README directs you to contact authors for access
Deployment documentation is heavily LoCoBot-centric; adapting to other platforms is noted as possible but left to the researcher

Verdict

Worth a look if you run a ROS-based robot with a wide-angle camera and want a pre-trained navigation backbone to fine-tune. Skip it if you are hoping for a drop-in, platform-agnostic navigation app without hardware wrangling.
