Is LTX-2 open source?

Yes — Lightricks/LTX-2 is an open-source project tracked on heatdrop.

What language is LTX-2 written in?

Lightricks/LTX-2 is primarily written in Python.

How popular is LTX-2?

Lightricks/LTX-2 has 8.4k stars on GitHub and is currently cooling off.

Where can I find LTX-2?

Lightricks/LTX-2 is on GitHub at https://github.com/Lightricks/LTX-2.

Lightricks/LTX-2

The first open DiT model that generates video and its own audio

LTX-2 gives developers open-weight access to a DiT foundation model that generates synchronized audio and video from text, images, or sound.

★ 8.4k stars | Python | Image · Video · Audio | Inference · Serving | ML Frameworks

Velocity (7d): +14 ★ / day · Trend: ↘ cooling

What it does

LTX-2 is a diffusion transformer (DiT) foundation model that generates video with synchronized audio. The repository provides official Python inference pipelines and LoRA trainers. It supports text-to-video, image-to-video, video-to-video, audio-to-video, keyframe interpolation, retakes of specific time regions, HDR output, and lip dubbing. For production quality, the model uses a two-stage pipeline that generates at a lower resolution and then upscales spatially. A distilled variant runs in 8 steps for the first stage and 4 for the second.
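
To make the two-stage idea concrete, here is a minimal sketch of how the stage resolutions and step counts relate. It does not call the LTX-2 code: `StagePlan` and `plan_two_stage` are hypothetical names, and the 2x upscale factor is an assumption chosen for illustration. Only the 8 and 4 step counts come from the description above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StagePlan:
    """One stage of a two-stage generation plan (hypothetical helper type)."""
    name: str
    width: int
    height: int
    steps: int


def plan_two_stage(target_width, target_height, upscale_factor=2,
                   first_steps=8, second_steps=4):
    """Return the low-resolution stage followed by the full-resolution stage.

    upscale_factor=2 is an assumption for illustration; the defaults for
    first_steps and second_steps mirror the distilled variant (8 and 4).
    """
    if target_width % upscale_factor or target_height % upscale_factor:
        raise ValueError("target size must be divisible by the upscale factor")
    low = StagePlan("generate", target_width // upscale_factor,
                    target_height // upscale_factor, first_steps)
    high = StagePlan("upscale", target_width, target_height, second_steps)
    return [low, high]


if __name__ == "__main__":
    for stage in plan_two_stage(1920, 1088):
        print(stage)
```

The point of the split is that most denoising steps run on a frame with a quarter of the pixels, and only a few steps run at the target size.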

The interesting bit

Most generative video models treat audio as an afterthought; LTX-2 bakes synchronized sound directly into the same DiT architecture. It also ships with a small army of task-specific LoRAs (camera dollies, motion tracking, pose control, HDR tonemapping, and lip dubbing), turning one base model into a Swiss Army knife for video production.

Key highlights

Synchronized audio-video generation from a single DiT foundation model.
Ten distinct inference pipelines, from fast prototyping to production two-stage upsampling and distilled generation.
Native FP8 quantization and optional memory cleanup for VRAM management, plus support for datacenter Blackwell and Hopper GPUs.
Automatic prompt enhancement and a separate ComfyUI integration repository.
HDR output via LogC3 inverse decode for linear float frames suitable for EXR export.
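
The HDR highlight mentions a LogC3 inverse decode. As a rough sketch of what that step means, the code below implements the publicly documented ARRI LogC3 curve and its inverse for scalar values. The EI 800 constants are an assumption (the listing does not say which exposure index LTX-2 uses), and the function names are illustrative rather than taken from the repository.

```python
import math

# ARRI LogC3 constants for EI 800 (assumption: the EI is not stated above).
CUT = 0.010591
A = 5.555556
B = 0.052272
C = 0.247190
D = 0.385537
E = 5.367655
F = 0.092809


def logc3_encode(x):
    """Map a linear scene value to a LogC3 code value."""
    if x > CUT:
        return C * math.log10(A * x + B) + D
    return E * x + F


def logc3_decode(t):
    """Inverse of logc3_encode: map a LogC3 code value back to linear."""
    if t > E * CUT + F:
        return (10.0 ** ((t - D) / C) - B) / A
    return (t - F) / E


if __name__ == "__main__":
    # Linear values above 1.0 survive the round trip, which is why the
    # decoded frames suit a float format such as EXR.
    for linear in (0.0, 0.18, 1.0, 4.0):
        code = logc3_encode(linear)
        print(linear, round(code, 4), round(logc3_decode(code), 4))
```

A real pipeline would apply the decode per channel to whole frames, for example with array operations, before writing float output.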

Caveats

Production-quality generation relies on a two-stage pipeline that requires several discrete model files—base checkpoint, spatial upscaler, distilled LoRA, and the Gemma 3 text encoder—so disk footprint and dependency management are non-trivial.
Temporal upscaler weights are published but not yet integrated into current pipelines; the README states they will be required for future implementations.
FlashAttention 4 support is locked to a specific beta revision (4.0.0b9) verified against PyTorch 2.9.1+cu128, with newer betas reportedly causing issues on consumer Blackwell GPUs.
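
Given the number of model files and the version pin, a small preflight check can save a failed run. The sketch below is one way to do it under stated assumptions: the file names in `REQUIRED_FILES` are placeholders (the listing names the roles, not the actual file names), and the keys in `PINNED` are plain labels rather than verified package names. The version strings are the ones quoted in the caveats.

```python
from pathlib import Path

# Placeholder file names: the roles come from the caveats above, but the
# names themselves are hypothetical and must be replaced with real ones.
REQUIRED_FILES = {
    "base checkpoint": "base_checkpoint.safetensors",
    "spatial upscaler": "spatial_upscaler.safetensors",
    "distilled LoRA": "distilled_lora.safetensors",
    "Gemma 3 text encoder": "gemma3_text_encoder",
}

# Versions quoted in the caveats; the keys are labels, not package names.
PINNED = {
    "flash-attention": "4.0.0b9",
    "torch": "2.9.1+cu128",
}


def missing_files(model_dir, required=REQUIRED_FILES):
    """Return the roles whose file or directory is absent from model_dir."""
    root = Path(model_dir)
    return [role for role, name in required.items()
            if not (root / name).exists()]


def pin_mismatches(installed, pinned=PINNED):
    """Compare installed version strings against the pinned ones."""
    problems = []
    for label, want in pinned.items():
        found = installed.get(label)
        if found != want:
            problems.append(
                f"{label}: expected {want}, found {found or 'not installed'}")
    return problems


if __name__ == "__main__":
    print(missing_files("./models"))
    print(pin_mismatches({"torch": "2.9.1+cu128"}))
```

The caller supplies the `installed` mapping, for example from `torch.__version__`, so the check itself stays free of heavy imports.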

Verdict

Worth a look if you are building video generation tools, VFX pipelines, or research prototypes that need synchronized audio and fine-grained camera or motion control. Skip it if you were hoping for a single-file, run-anywhere model; this is a professional-grade toolbox that expects serious GPU hardware and patience.
