shikiw/OPERA
OPERA is a decoding method that reduces hallucinations in multimodal large language models (MLLMs) by applying an over-trust penalty and a retrospection-allocation strategy at inference time.

OPERA addresses hallucination in multimodal LLMs with two decoding-time mechanisms. The over-trust penalty is a term subtracted from the model logits during beam search; it discourages over-reliance on a few "summary" tokens that dominate self-attention, a pattern the authors associate with hallucinated content. The retrospection-allocation strategy rolls decoding back to such a summary token when the pattern persists and re-selects the next token from the remaining candidates. The method is training-free and can be applied during inference to existing MLLMs without additional fine-tuning. The authors report reduced hallucination on several hallucination benchmarks.
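The two mechanisms can be illustrated with a minimal sketch. This is a simplified illustration under stated assumptions, not the repository's implementation: the function names, the `scale` and `alpha` parameters, and the plain-list attention window are all hypothetical, and a real decoder would take the attention weights from the model's self-attention over recently generated tokens inside beam search.

```python
import math
from collections import Counter


def over_trust_penalty(attn_window, scale=1.0):
    """Score how strongly recent tokens all attend to one earlier token.

    attn_window is a k x k lower-triangular list of lists: row i holds the
    attention weights of the i-th recent token over recent tokens 0..i.
    For each column, multiply the (scaled) weights from the diagonal down;
    a large product means later tokens keep attending to that column.
    Returns (penalty, column_index) for the strongest column.
    Assumption: `scale` stands in for a hypothetical scaling factor.
    """
    k = len(attn_window)
    best, best_col = 0.0, 0
    for col in range(k):
        prod = 1.0
        for row in range(col, k):
            prod *= scale * attn_window[row][col]
        if prod > best:
            best, best_col = prod, col
    return best, best_col


def penalized_log_probs(logits, penalties, alpha=1.0):
    """Subtract alpha * penalty from each candidate logit, then log-softmax.

    Assumption: one penalty per candidate, computed by the caller.
    """
    adjusted = [l - alpha * p for l, p in zip(logits, penalties)]
    m = max(adjusted)
    log_z = m + math.log(sum(math.exp(a - m) for a in adjusted))
    return [a - log_z for a in adjusted]


def rollback_position(peak_columns, threshold):
    """Decide whether to roll back, given the strongest column per recent step.

    If the same column was the peak at least `threshold` times, return it as
    the position to roll back to; otherwise return None (keep decoding).
    """
    if not peak_columns:
        return None
    position, count = Counter(peak_columns).most_common(1)[0]
    return position if count >= threshold else None
```

In this sketch, a candidate whose attention window concentrates on one earlier token receives a larger penalty and therefore a lower adjusted score than an otherwise equal candidate. When the same column keeps appearing as the peak, `rollback_position` returns it so that the caller can resume decoding from that token with a different choice.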