FADE Method Reduces Hallucinations in Vision-Language Models

Yichen Guo, Kai Tang, Fenglai Lin, Yiding Sun, Dongshuo Zhang, Wenya Wang, Lin William Cong, Shanghang Zhang · June 30, 2026

Summary

Researchers propose FADE (FFN Attenuation for DEcoding), a training-free method to mitigate hallucinations in Large Vision-Language Models (LVLMs) by reducing the dominance of language priors. The method attenuates Feed-Forward Network (FFN) outputs, which are identified as a source of language priors overriding visual evidence.

Large Vision-Language Models (LVLMs) often generate content that does not match the input image, a failure known as hallucination. Recent studies attribute this to language priors dominating visual inputs. The paper traces the mechanism further: transformer attention modules consistently gather visual evidence, but Feed-Forward Network (FFN) modules in critical layers introduce language priors that can override that evidence and produce incorrect outputs.

Based on this finding, the authors introduce FADE (FFN Attenuation for DEcoding), a training-free method that attenuates the outputs of the FFN modules and so reduces the influence of language priors. The goal is to rebalance the model's reliance on visual versus linguistic information during decoding. Evaluations on the POPE, CHAIR, and MME benchmarks with LLaVA-1.5, mPLUG-Owl2, and InstructBLIP show that FADE reduces hallucinations. It does so without additional training and while maintaining inference efficiency.
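To make the idea of attenuating FFN outputs concrete, here is a minimal toy sketch in plain Python. It is not the authors' implementation: the summary does not state how FADE chooses its critical layers or its attenuation strength, so the scaling factor `alpha`, the `attenuated_layers` set, and the two-dimensional "residual stream" are all illustrative assumptions.

```python
from typing import Callable, Sequence

Vector = list[float]
SubLayer = Callable[[Vector], Vector]


def add(a: Vector, b: Vector) -> Vector:
    """Element-wise sum, standing in for a residual connection."""
    return [x + y for x, y in zip(a, b)]


def forward(hidden: Vector,
            layers: Sequence[tuple[SubLayer, SubLayer]],
            alpha: float = 1.0,
            attenuated_layers: frozenset[int] = frozenset()) -> Vector:
    """Run a toy residual stack; scale FFN outputs by alpha in chosen layers.

    alpha and attenuated_layers are hypothetical knobs, not values from the paper.
    """
    for index, (attention, ffn) in enumerate(layers):
        hidden = add(hidden, attention(hidden))       # attention update, left untouched
        ffn_out = ffn(hidden)
        if index in attenuated_layers:
            ffn_out = [alpha * x for x in ffn_out]    # attenuate only the FFN contribution
        hidden = add(hidden, ffn_out)
    return hidden


# Toy setup (assumption): index 0 scores the visually grounded token,
# index 1 scores the token favored by the language prior.
def toy_attention(hidden: Vector) -> Vector:
    return [1.0, 0.0]   # attention contributes visual evidence


def toy_ffn(hidden: Vector) -> Vector:
    return [0.0, 2.0]   # FFN contributes a stronger language prior


layers = [(toy_attention, toy_ffn)]
baseline = forward([0.0, 0.0], layers)
attenuated = forward([0.0, 0.0], layers, alpha=0.25,
                     attenuated_layers=frozenset({0}))

print(baseline)     # [1.0, 2.0]: the prior-driven token wins
print(attenuated)   # [1.0, 0.5]: the grounded token wins
```

In a real LVLM the same scaling would be applied to the FFN (MLP) sub-module output inside selected transformer blocks at inference time, which is why no retraining is involved.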

Why it matters

Hallucinations are a major barrier to reliable deployment of LVLMs. This training-free method offers a practical and efficient way to improve the factual consistency of AI-generated content, making LVLMs more trustworthy for real-world applications.

How to implement this in your domain

  1. Integrate FADE into existing LVLM inference pipelines to reduce hallucinations.
  2. Benchmark the impact of FADE on specific application-critical hallucination metrics.
  3. Experiment with FFN attenuation levels to optimize for specific use cases.
  4. Educate development teams on the mechanistic origins of LVLM hallucinations and FADE's solution.
  5. Prioritize LVLM models that can easily incorporate post-hoc decoding adjustments like FADE.
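The benchmarking and tuning steps above can be sketched as a small sweep over candidate attenuation levels. This is a hedged illustration only: `sweep_attenuation`, the `evaluate` callback, and the candidate values are hypothetical names and numbers, and `placeholder_metric` is a made-up stand-in for a real evaluation run, not a measured result.

```python
from typing import Callable, Iterable


def sweep_attenuation(alphas: Iterable[float],
                      evaluate: Callable[[float], float]) -> tuple[float, dict[float, float]]:
    """Score each candidate attenuation level and return the best one.

    evaluate(alpha) is assumed to return a lower-is-better score, for example
    a hallucination rate you measure on your own validation set.
    """
    scores = {alpha: evaluate(alpha) for alpha in alphas}
    best_alpha = min(scores, key=scores.get)
    return best_alpha, scores


def placeholder_metric(alpha: float) -> float:
    """Synthetic stand-in for a real benchmark run (NOT real data)."""
    return (alpha - 0.5) ** 2 + 0.1


best_alpha, scores = sweep_attenuation([0.0, 0.25, 0.5, 0.75, 1.0], placeholder_metric)
print(best_alpha)   # 0.5 for this synthetic metric
```

In practice `evaluate` would run the attenuated model on an application-specific benchmark, and it is worth tracking a general-capability metric alongside it so that stronger attenuation does not silently degrade useful language ability.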

Who benefits

AI Development, Content Creation, E-commerce, Healthcare, Robotics

Key takeaways

  • FFN modules in critical layers of LVLMs are identified as a source of language priors that override visual evidence and cause hallucinations.
  • FADE is a training-free method that attenuates FFN outputs to mitigate hallucinations.
  • The method reduces hallucinations on POPE, CHAIR, and MME across LLaVA-1.5, mPLUG-Owl2, and InstructBLIP.
  • FADE maintains inference efficiency, which makes it practical to add to existing pipelines.

Original post by Yichen Guo, Kai Tang, Fenglai Lin, Yiding Sun, Dongshuo Zhang, Wenya Wang, Lin William Cong, Shanghang Zhang

"arXiv:2606.29431v1 Announce Type: new Abstract: Despite the impressive capabilities of Large Vision-Language Models (LVLMs), they remain susceptible to hallucination, generating content inconsistent with the input image. Recent studies attribute this to the dominance of language…"
