UCSC-VLAA/story-iter
A training-free, iterative diffusion framework for generating coherent image sequences from long narrative text.

Story-Iter is a research implementation for long story visualization using diffusion models. It introduces an external iterative paradigm that refines each generated image by incorporating reference images from previous rounds, addressing semantic consistency and fine-grained interaction challenges in multi-frame visual storytelling. The framework proposes a plug-and-play global reference cross-attention mechanism that operates without additional training.
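
The external iterative paradigm described above can be pictured as an outer loop around an image generator. The following is a minimal sketch, not the repository's actual code: `iterate_story`, `generate_frame`, and `num_rounds` are hypothetical names introduced for illustration, and the sketch assumes that each round conditions every frame on the full set of images from the immediately preceding round (the first round has no references and uses text only).

```python
from typing import Callable, List, Optional, Sequence, TypeVar

Image = TypeVar("Image")

# Hypothetical generator signature: (prompt, reference images or None) -> image.
# In a real pipeline this would wrap a diffusion model call; it is an
# assumption made for this sketch, not an API from the repository.
GenerateFn = Callable[[str, Optional[Sequence[Image]]], Image]


def iterate_story(
    prompts: Sequence[str],
    generate_frame: GenerateFn,
    num_rounds: int = 3,
) -> List[Image]:
    """Refine a whole image sequence over several rounds.

    Round 1 generates each frame from its prompt alone. Every later round
    regenerates each frame while passing in all frames from the previous
    round as references.
    """
    if num_rounds < 1:
        raise ValueError("num_rounds must be at least 1")

    references: Optional[List[Image]] = None
    for _ in range(num_rounds):
        # The comprehension is fully evaluated before reassignment, so every
        # frame in this round sees the same reference set (the previous round).
        references = [generate_frame(prompt, references) for prompt in prompts]
    return references
```

Because the generator is passed in as a callable, the loop can be exercised with a stub before a diffusion model is plugged in, for example `iterate_story(["a cat", "a dog"], lambda p, refs: (p, refs is not None), num_rounds=2)`.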