chiennv2000/orthrus
Dual-architecture framework combining autoregressive LLM fidelity with parallel diffusion token generation for fast, lossless inference.

Velocity (7d): +16 ★/day · Trend: steady
Orthrus implements dual-view diffusion decoding, which pairs the output fidelity of autoregressive generation with the parallel token generation of diffusion models. The framework extends Qwen3 with specialized components and reports 4-5x faster inference while keeping output strictly lossless, i.e. identical to what the base model would generate. It is available in 1.7B, 4B, and 8B parameter variants.
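To make the "lossless" claim concrete, here is a minimal toy sketch of one generic way parallel proposals can be combined with an autoregressive model without changing the output: a drafter proposes a block of tokens, and the autoregressive model keeps only what it would have produced itself. This is an assumption for illustration, not Orthrus's confirmed algorithm; `toy_next_token`, `toy_drafter`, and `draft_verify_decode` are hypothetical names, and the "models" are simple arithmetic functions rather than Qwen3.

```python
from typing import Callable, List, Tuple

Token = int
NextToken = Callable[[List[Token]], Token]            # greedy autoregressive step
Drafter = Callable[[List[Token], int], List[Token]]   # proposes up to k tokens at once


def ar_decode(next_token: NextToken, prompt: List[Token], n_new: int) -> List[Token]:
    """Reference decoding: one token per step, strictly sequential."""
    seq = list(prompt)
    for _ in range(n_new):
        seq.append(next_token(seq))
    return seq


def draft_verify_decode(
    next_token: NextToken,
    drafter: Drafter,
    prompt: List[Token],
    n_new: int,
    k: int = 4,
) -> Tuple[List[Token], int]:
    """Draft a block, then keep only tokens the autoregressive model agrees with.

    Every appended token comes from `next_token`, so the result equals
    `ar_decode` by construction. In a real system the verification of a whole
    block would be one batched forward pass; here it is simulated with
    sequential calls. Returns the sequence and the number of draft rounds.
    """
    seq = list(prompt)
    target_len = len(prompt) + n_new
    rounds = 0
    while len(seq) < target_len:
        rounds += 1
        block = min(k, target_len - len(seq))
        draft = drafter(seq, block)[:block]
        if not draft:
            # Hypothetical fallback: no proposal, take one plain autoregressive step.
            seq.append(next_token(seq))
            continue
        for tok in draft:
            expected = next_token(seq)
            seq.append(expected)      # always the autoregressive model's token
            if tok != expected:
                break                 # first disagreement ends this round
    return seq, rounds


def toy_next_token(ctx: List[Token]) -> Token:
    """Stand-in for a greedy language model (assumes a non-empty context)."""
    return (3 * ctx[-1] + len(ctx)) % 11


def toy_drafter(ctx: List[Token], k: int) -> List[Token]:
    """Stand-in for a parallel drafter that is deliberately wrong now and then."""
    out, tmp = [], list(ctx)
    for _ in range(k):
        tok = toy_next_token(tmp)
        if len(tmp) % 5 == 0:
            tok = (tok + 1) % 11      # injected wrong guess
        out.append(tok)
        tmp.append(tok)
    return out


if __name__ == "__main__":
    prompt = [1, 2]
    reference = ar_decode(toy_next_token, prompt, 12)
    fast, rounds = draft_verify_decode(toy_next_token, toy_drafter, prompt, 12, k=4)
    print("identical to autoregressive output:", fast == reference)
    print("draft rounds used:", rounds)
```

In this sketch, any speedup would come from needing fewer rounds than tokens when the drafter is usually right, while correctness never depends on the drafter's quality. How Orthrus actually produces and verifies its parallel proposals is not described in this summary.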