A proofreader for your image prompts
Tencent's Hunyuan team trained a 7B–32B LLM to rewrite vague text-to-image prompts into structured, detail-rich instructions that actually produce what you meant.

What it does
PromptEnhancer is a prompt-rewriting utility that takes your sloppy, half-baked image generation prompts and restructures them into clearer, more detailed versions. It supports both text-to-image enhancement and image-to-image editing instructions, with a multi-level fallback parser to keep outputs reliable.
The interesting bit
The project treats prompt engineering as a translation problem: your vague intent goes in, a chain-of-thought rewrite comes out. It ships quantized GGUF variants (down to 20GB) so you can run the 32B model on a consumer RTX 3090 instead of renting an H100. There’s also a vision-language variant that looks at your source image when refining editing instructions.
Key highlights
- Three model sizes: 7B (13GB, “most users”), 32B full precision (64GB), and GGUF quantized variants (Q4_K_M through Q8_0)
- Dual-mode: text-only T2I enhancement and image-aware Img2Img editing via Qwen2.5-VL
- GGUF support via llama.cpp with claimed 50–75% VRAM reduction
- Chinese and English input supported
- Gradio demo available for the 32B model
Caveats
- The README is upfront that this is specifically tuned for image prompt rewriting; override the system prompt if you want other tasks
- Memory requirements are substantial even for the “entry-level” 7B model (8GB+ VRAM)
- No quantitative benchmarks or comparison data shown in the README itself; evaluation claims reference an arXiv paper and HuggingFace dataset not reproduced here
Verdict
Worth a look if you’re burning GPU hours iterating on prompts for Stable Diffusion, Hunyuan, or similar models. Skip it if you already have a prompt workflow you’re happy with, or if your GPU can’t spare 8GB+ for a sidecar model.