chiennv2000/orthrus

Dual-architecture framework combining autoregressive LLM fidelity with parallel diffusion token generation for fast, lossless inference.

Velocity (7d): +16 ★/day · Trend: steady

Orthrus implements a dual-view diffusion decoding approach that combines the accuracy of autoregressive generation with the parallelism of diffusion models. The framework adds specialized components to Qwen3, enabling 4-5x inference speedups while keeping output strictly lossless relative to the base model. It is available in 1.7B, 4B, and 8B parameter variants.
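The summary does not say how the lossless guarantee is obtained. As a rough illustration only, here is a minimal sketch of one common way to get lossless parallel decoding: a fast drafter proposes a block of tokens, and the autoregressive model keeps the longest prefix that matches its own greedy choices. This is an assumption about the general technique, not Orthrus's actual implementation; `toy_target`, `toy_drafter`, and `draft_and_verify` are hypothetical names and toy stand-ins for real models.

```python
from typing import Callable, List, Sequence, Tuple

Token = int
NextToken = Callable[[Sequence[Token]], Token]           # greedy AR model: prefix -> next token
Drafter = Callable[[Sequence[Token], int], List[Token]]  # parallel drafter: prefix, k -> k tokens


def greedy_decode(target: NextToken, prompt: Sequence[Token], n_new: int) -> List[Token]:
    """Reference decoding: one target call per generated token."""
    out = list(prompt)
    for _ in range(n_new):
        out.append(target(out))
    return out


def draft_and_verify(
    target: NextToken, drafter: Drafter, prompt: Sequence[Token], n_new: int, k: int = 4
) -> Tuple[List[Token], int]:
    """Draft k tokens at a time and keep only what the target agrees with.

    Returns the generated sequence and the number of verification passes.
    In a real transformer, each pass would score the whole block in a single
    batched forward call; this toy version calls `target` per position.
    """
    out = list(prompt)
    goal = len(prompt) + n_new
    passes = 0
    while len(out) < goal:
        block = drafter(out, min(k, goal - len(out)))
        if not block:                      # guard: always make progress
            block = [target(out)]
        passes += 1
        for tok in block:
            expected = target(out)
            out.append(expected)           # always emit the target's own token
            if tok != expected:            # first mismatch: discard the rest of the block
                break
    return out, passes


def toy_target(prefix: Sequence[Token]) -> Token:
    """Hypothetical deterministic 'base model' (assumes a non-empty prefix)."""
    return (prefix[-1] * 3 + len(prefix)) % 17


def toy_drafter(prefix: Sequence[Token], k: int) -> List[Token]:
    """Hypothetical drafter: usually right, deliberately wrong at every 5th position."""
    ctx, block = list(prefix), []
    for _ in range(k):
        tok = toy_target(ctx)
        if len(ctx) % 5 == 0:
            tok = (tok + 1) % 17
        block.append(tok)
        ctx.append(tok)
    return block


if __name__ == "__main__":
    reference = greedy_decode(toy_target, [1], 20)
    fast, passes = draft_and_verify(toy_target, toy_drafter, [1], 20, k=4)
    print(fast == reference, passes)
```

Because every emitted token is the target's own greedy choice, the result matches plain greedy decoding by construction; the drafter only affects how many verification passes are needed, which is where any speedup would come from.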
