An N-gram memory module that outperforms LoRA
TinyEngram open-sources experiments showing that DeepSeek's Engram architecture can inject domain knowledge into Qwen and Stable Diffusion more efficiently than LoRA, with less catastrophic forgetting.

What it does
TinyEngram is a research toolkit that implements DeepSeek’s Engram architecture, an N-gram memory module with gated retrieval inserted into transformer layers, to test whether memory injection can replace or surpass standard parameter-efficient fine-tuning. Built on Qwen3 and extended to Stable Diffusion, it provides training scripts, pre-processed datasets, and reproduction logs for both language and vision experiments. For the vision experiments, the project treats visual concepts as retrievable memories injected into the text encoder, leaving the diffusion backbone untouched.
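To make "N-gram memory with gated retrieval" concrete, here is a minimal pure-Python sketch. It is not TinyEngram's code: the class name `ToyEngram`, the polynomial hash, and the scalar sigmoid gate are all assumptions chosen for illustration, and the real module may hash, gate, and combine memories differently.

```python
import math
import random


def ngram_slot(ngram, table_size):
    """Map an N-gram (tuple of token ids) to a slot index.

    Illustrative polynomial hash; the hash used by the real project is not
    specified here and is an assumption.
    """
    h = 0
    for token_id in ngram:
        h = (h * 1000003 + token_id) % (2**61 - 1)
    return h % table_size


class ToyEngram:
    """Hypothetical toy memory: one learned vector per hashed N-gram slot."""

    def __init__(self, table_size, dim, n=2, seed=0):
        rng = random.Random(seed)
        self.n = n
        self.table_size = table_size
        # Memory table, small random init (stands in for trained parameters).
        self.table = [
            [rng.gauss(0.0, 0.02) for _ in range(dim)] for _ in range(table_size)
        ]

    def forward(self, token_ids, hidden):
        """Add a gated memory vector to each position's hidden state."""
        out = []
        for pos, h in enumerate(hidden):
            if pos + 1 < self.n:
                # Not enough left context to form an N-gram: pass through.
                out.append(list(h))
                continue
            ngram = tuple(token_ids[pos - self.n + 1 : pos + 1])
            mem = self.table[ngram_slot(ngram, self.table_size)]
            # Assumed gate: sigmoid of the hidden/memory dot product.
            score = sum(a * b for a, b in zip(h, mem))
            gate = 1.0 / (1.0 + math.exp(-score))
            out.append([a + gate * b for a, b in zip(h, mem)])
        return out


engram = ToyEngram(table_size=64, dim=4, n=2)
tokens = [5, 9, 5, 9]
hidden = [[0.1, 0.2, 0.3, 0.4] for _ in tokens]
updated = engram.forward(tokens, hidden)
```

In this sketch the lookup depends only on the token ids, so the same bigram always retrieves the same memory vector, while the gate lets the current hidden state decide how much of it to add.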
The interesting bit
The unusual angle is that Engram retrieves memories by exact N-gram matching through a hashed lookup, so an injected memory is retrieved only when its N-gram appears in the input. In principle this keeps injected memories from interfering with each other or with the base model’s general capabilities, although distinct N-grams that hash to the same slot can still collide. In the project’s biomedical fine-tuning runs, the model not only avoided catastrophic forgetting but also scored higher on general MMLU than the untouched baseline (0.4034 to 0.4500), suggesting the memory module may stabilize rather than disrupt underlying knowledge.
Key highlights
- Outperforms LoRA on biomedical tasks using only added Engram parameters, per the project’s reported benchmarks on Qwen3-0.6B
- General MMLU scores rose after domain-specific fine-tuning (0.4034 to 0.4500), indicating resistance to catastrophic forgetting
- Extends to Stable Diffusion 1.5 and 3.5 for visual concept injection without fine-tuning the U-Net or DiT weights
- Vocabulary scaling experiments show larger Engram memory banks do not automatically yield better performance, revealing a collision-vs-utilization trade-off
- Full technical report available on arXiv (2605.20309) with open training logs and reproduction scripts
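The collision-versus-utilization trade-off mentioned in the highlights can be illustrated with a small, self-contained experiment. This is a sketch under stated assumptions, not the project's measurement code: the hash function, the synthetic token stream, and the helper name `table_stats` are all hypothetical.

```python
import random


def ngram_slot(ngram, table_size):
    """Illustrative polynomial hash from an N-gram to a table slot (assumed)."""
    h = 0
    for token_id in ngram:
        h = (h * 1000003 + token_id) % (2**61 - 1)
    return h % table_size


def table_stats(ngrams, table_size):
    """Return slot utilization and the share of distinct N-grams that collide."""
    distinct = set(ngrams)
    buckets = {}
    for gram in distinct:
        buckets.setdefault(ngram_slot(gram, table_size), []).append(gram)
    # An N-gram "collides" if it shares its slot with at least one other N-gram.
    collided = sum(len(b) for b in buckets.values() if len(b) > 1)
    return {
        "utilization": len(buckets) / table_size,
        "collision_rate": collided / len(distinct),
    }


# Synthetic token stream standing in for a tokenized corpus.
rng = random.Random(0)
tokens = [rng.randrange(50) for _ in range(2000)]
bigrams = [tuple(tokens[i : i + 2]) for i in range(len(tokens) - 1)]

for size in (256, 4096, 65536):
    print(size, table_stats(bigrams, size))
```

A small table is typically densely used but forces many N-grams to share slots, whereas a large table sees fewer collisions and leaves most slots empty (and, in a trained model, rarely updated). That tension is one plausible reading of why larger memory banks do not automatically help.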
Verdict
Worth a look if you’re researching memory-augmented transformers or need a lightweight, composable alternative to LoRA for concept injection. Skip it if you want a battle-tested production framework rather than an open research notebook.
Frequently asked
- What is AutoArk/TinyEngram?
- TinyEngram open-sources experiments showing that DeepSeek's Engram architecture can inject domain knowledge into Qwen and Stable Diffusion more efficiently than LoRA, with less catastrophic forgetting.
- Is TinyEngram open source?
- Yes — AutoArk/TinyEngram is an open-source project tracked on heatdrop.
- What language is TinyEngram written in?
- AutoArk/TinyEngram is primarily written in Python.
- How popular is TinyEngram?
- AutoArk/TinyEngram has 523 stars on GitHub.
- Where can I find TinyEngram?
- AutoArk/TinyEngram is on GitHub at https://github.com/AutoArk/TinyEngram.