facebookresearch/tribev2

Predicting fMRI responses from video, audio, and text

It predicts fMRI brain responses to video, audio, and text so researchers can experiment on an average virtual cortex instead of a live subject.

★ 3k stars | Jupyter Notebook | Domain Apps | Image · Video · Audio | Language Models

What it does

TRIBE v2 is a multimodal brain encoding model that predicts fMRI responses to naturalistic stimuli (video, audio, or text) by fusing state-of-the-art vision, audio, and language models into a single Transformer. It maps the resulting representation onto the standard fsaverage5 cortical surface mesh, outputting predictions for an average subject across roughly 20,000 vertices. To account for the sluggish hemodynamic response, the model offsets its predictions five seconds into the past, so that a predicted response is paired with the stimulus that preceded it rather than with whatever is playing when the signal peaks.
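The five-second offset can be illustrated with a minimal sketch. It assumes a hypothetical repetition time (TR) of 2 seconds and a helper named `lagged_pairs`, neither of which comes from the repository; it only shows the bookkeeping of pairing each fMRI volume with the stimulus moment five seconds earlier.

```python
def lagged_pairs(n_volumes, tr_s, lag_s=5.0):
    """Pair each fMRI volume time with the stimulus time lag_s seconds earlier.

    Volumes whose lagged stimulus time would fall before stimulus onset
    (t < 0) have nothing to be paired with and are dropped.
    """
    pairs = []
    for i in range(n_volumes):
        bold_t = i * tr_s          # acquisition time of volume i
        stim_t = bold_t - lag_s    # stimulus moment assumed to drive it
        if stim_t >= 0:
            pairs.append((bold_t, stim_t))
    return pairs


# Hypothetical run: 6 volumes, one every 2 seconds (assumed TR).
pairs = lagged_pairs(n_volumes=6, tr_s=2.0)
for bold_t, stim_t in pairs:
    print(f"volume at {bold_t:.1f}s <- stimulus at {stim_t:.1f}s")
```

The first few volumes have no valid stimulus counterpart, which is why lag-compensated pipelines typically lose a few samples at the start of each run.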

The interesting bit

The project treats brain activity as just another target modality, but the clever part is the plumbing: it auto-converts text to speech and transcribes it back to extract word-level timings, forcing alignment between fundamentally different input streams. That, plus the built-in hemodynamic lag compensation, suggests the authors actually understand the measurement they are modeling rather than just throwing a Transformer at a brain-shaped target.
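To see why word-level timings matter, here is a rough sketch of how timestamped words could be grouped by the fMRI volume they fall into. The function name `words_per_volume`, the example timings, and the 2-second TR are all illustrative assumptions, not the project's actual code or data.

```python
from collections import defaultdict


def words_per_volume(word_onsets, tr_s):
    """Group (word, onset_seconds) pairs by the index of the fMRI volume
    whose acquisition window contains the word onset."""
    bins = defaultdict(list)
    for word, onset in word_onsets:
        bins[int(onset // tr_s)].append(word)
    return dict(bins)


# Hypothetical timings, shaped like what a speech-to-text step might return.
timings = [("the", 0.10), ("cat", 0.45), ("sat", 2.30), ("down", 4.05)]
grouped = words_per_volume(timings, tr_s=2.0)
print(grouped)
```

Plain text has no time axis at all, so some step like the speech round trip is needed before words can be placed on the same clock as video frames and audio samples.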

Key highlights

Pretrained weights are on HuggingFace and loadable via TribeModel for immediate inference.
Predictions live on the standardized fsaverage5 surface, so results are directly comparable across labs and studies.
Accepts raw video, audio, or text; text inputs are automatically converted to speech and transcribed to derive temporal alignment cues.
Training pipeline is built on PyTorch Lightning and includes Slurm grid-search configurations for cortical and subcortical runs.
Ships with brain-visualization backends (PyVista and Nilearn) for plotting predictions directly onto cortical surfaces.
Licensed under CC-BY-NC-4.0, which keeps it firmly in the research domain.
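As a small illustration of what living on fsaverage5 means in practice: that mesh has 10,242 vertices per hemisphere, 20,484 in total. The sketch below splits one time point of predictions into hemispheres. The left-then-right ordering and the helper name `split_hemispheres` are assumptions made for the example; check the repository for the ordering it actually uses.

```python
N_VERTICES_PER_HEMI = 10242  # fsaverage5 resolution, per hemisphere


def split_hemispheres(values):
    """Split a flat per-vertex sequence into (left, right) halves.

    Assumes left-hemisphere vertices come first; this ordering is an
    assumption of the sketch, not something confirmed by the source.
    """
    if len(values) != 2 * N_VERTICES_PER_HEMI:
        raise ValueError(
            f"expected {2 * N_VERTICES_PER_HEMI} values, got {len(values)}"
        )
    return values[:N_VERTICES_PER_HEMI], values[N_VERTICES_PER_HEMI:]


# Placeholder values standing in for one time point of model output.
fake_prediction = [0.0] * (2 * N_VERTICES_PER_HEMI)
left, right = split_hemispheres(fake_prediction)
print(len(left), len(right))
```

Because every lab using fsaverage5 shares the same vertex count and layout, per-vertex arrays like these can be compared without resampling.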

Caveats

Predictions are explicitly for an “average” subject, so individual-level or clinical applications would require retraining on specific datasets.
The non-commercial license blocks any product or commercial deployment.
Training from scratch requires external fMRI study data; the repo provides dataset definitions (e.g., Algonauts2025, Lahner2024) but not the raw scans themselves.

Verdict

Neuroscientists and ML researchers probing the alignment between deep multimodal models and human sensory cortex will find this a solid starting point. If you are looking for a general-purpose video-audio-text embedding model, this is the wrong tree: its entire architecture is bent toward predicting brain-shaped outputs.

Frequently asked

What is facebookresearch/tribev2? It predicts fMRI brain responses to video, audio, and text so researchers can experiment on an average virtual cortex instead of a live subject.
Is tribev2 open source? The source is public and tracked on heatdrop, but it is licensed under CC-BY-NC-4.0, which rules out commercial use.
What language is tribev2 written in? GitHub lists Jupyter Notebook as the primary language of facebookresearch/tribev2; the training pipeline is built on PyTorch Lightning, so the underlying code is Python.
How popular is tribev2? facebookresearch/tribev2 has 3k stars on GitHub.
Where can I find tribev2? facebookresearch/tribev2 is on GitHub at https://github.com/facebookresearch/tribev2.