Is FlagGems open source?

Yes, flagos-ai/FlagGems is open source, released under the Apache-2.0 license.

What language is FlagGems written in?

flagos-ai/FlagGems is primarily written in Python.

How popular is FlagGems?

flagos-ai/FlagGems has 1.1k stars on GitHub.

Where can I find FlagGems?

flagos-ai/FlagGems is on GitHub at https://github.com/flagos-ai/FlagGems.

flagos-ai/FlagGems

Replacing PyTorch's native operators with portable Triton kernels

FlagGems reimplements PyTorch ATen operators in Triton so LLMs can run on new AI hardware without touching model code.

★ 1.1k stars | Python | Inference · Serving | ML Frameworks

What it does

FlagGems is a library of large-language-model operators written in OpenAI’s Triton language. It registers with PyTorch’s ATen backend, letting existing models call familiar torch APIs while executing backend-neutral Triton kernels rather than vendor-specific ones. The project targets both training and inference across what the README describes as “over 10 supported backends.”
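
The registration idea can be pictured with a toy dispatcher in plain Python. This is only a sketch of the general pattern, not FlagGems's or PyTorch's actual mechanism: `OpRegistry`, `register`, `call`, and the backend keys "native" and "portable" are all hypothetical names chosen for illustration. It shows why model code can stay unchanged when the implementation behind an operator is swapped.

```python
from typing import Callable, Dict, Tuple


class OpRegistry:
    """Toy operator table mapping (op name, backend key) to an implementation."""

    def __init__(self) -> None:
        self._impls: Dict[Tuple[str, str], Callable] = {}
        self.active_backend = "native"
        self.last_impl = None  # name of the implementation used by the last call

    def register(self, op: str, backend: str) -> Callable:
        """Decorator that records `fn` as the implementation of `op` on `backend`."""
        def decorator(fn: Callable) -> Callable:
            self._impls[(op, backend)] = fn
            return fn
        return decorator

    def call(self, op: str, *args):
        # Prefer the active backend; fall back to the native implementation.
        impl = self._impls.get((op, self.active_backend))
        if impl is None:
            impl = self._impls[(op, "native")]
        self.last_impl = impl.__name__
        return impl(*args)


registry = OpRegistry()


@registry.register("add", "native")
def add_native(a, b):
    return [x + y for x, y in zip(a, b)]


@registry.register("add", "portable")
def add_portable(a, b):
    # Stand-in for a backend-neutral kernel: same result, different code path.
    out = [0] * len(a)
    for i in range(len(a)):
        out[i] = a[i] + b[i]
    return out


def model(a, b):
    # "Model code" names the operator only; it never names a backend.
    return registry.call("add", a, b)


if __name__ == "__main__":
    print(model([1, 2], [3, 4]), registry.last_impl)
    registry.active_backend = "portable"
    print(model([1, 2], [3, 4]), registry.last_impl)
```

In this toy, switching `active_backend` changes which function runs while `model` is untouched, which is the property the summary attributes to FlagGems at the ATen level.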

The interesting bit

Instead of rewriting CUDA kernels for every new accelerator, FlagGems bets that Triton is readable enough for developers and portable enough for hardware vendors. It also works in eager mode, so it does not depend on torch.compile to function.

Key highlights

PyTorch-compatible operator collection tested on BERT, Llama-2, and Llava
Hand-optimized kernels for selected operators, plus automatic code generation for pointwise ops
Fast per-function runtime kernel dispatching without waiting on a compiler
Multi-backend interface meant to support diverse hardware platforms
C++ Triton function dispatcher listed as work in progress
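
The "automatic code generation for pointwise ops" highlight can be pictured with a small generator. This is a sketch under loud assumptions: `make_pointwise` is a hypothetical helper, and it emits a plain Python loop rather than a Triton kernel, so it illustrates the idea of generating an element-wise wrapper from a scalar expression, not how FlagGems does it.

```python
def make_pointwise(name: str, arity: int, expr: str):
    """Generate an element-wise function from a scalar expression.

    `expr` refers to its scalar inputs as x0, x1, ... (hypothetical convention).
    Returns the compiled function and the generated source text.
    """
    scalars = [f"x{i}" for i in range(arity)]
    arrays = ", ".join(f"{s}s" for s in scalars)
    # The trailing comma keeps tuple unpacking valid for arity == 1.
    unpack = ", ".join(scalars) + ","
    src = (
        f"def {name}({arrays}):\n"
        f"    return [{expr} for ({unpack}) in zip({arrays})]\n"
    )
    namespace = {}
    exec(src, namespace)  # compile the generated source into a callable
    return namespace[name], src


if __name__ == "__main__":
    relu, relu_src = make_pointwise("relu", 1, "max(x0, 0.0)")
    print(relu_src)
    print(relu([-1.0, 2.0]))

    mul_add, _ = make_pointwise("mul_add", 2, "x0 * x1 + 1")
    print(mul_add([1, 2], [3, 4]))
```

The appeal of this approach is that only the scalar expression has to be written per operator; the surrounding loop (or, in a real system, the kernel boilerplate) is produced mechanically.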

Caveats

The README claims “over 10 supported backends” but names none of them, so actual hardware coverage is vague
The C++ dispatcher is explicitly unfinished

Verdict

Worth exploring if you are deploying LLMs on non-NVIDIA accelerators and want to keep the PyTorch API surface intact. If your stack is already optimized for CUDA, FlagGems is unlikely to solve a problem you have.
