MLLMs Offer Low-Cost, Training-Free Concept Explanations

Darian Fernández-Gutiérrez, Rafael Bello, Marilyn Bello, Natalia Díaz-Rodríguez · June 30, 2026

Summary

Researchers evaluated mid-scale Multimodal Large Language Models (MLLMs) for localized concept naming without specific training, achieving high accuracy in assigning semantic labels to image regions. This highlights the potential for cost-effective, concept-based Explainable AI (C-XAI) using existing MLLMs.

Explainable AI (XAI) often struggles with validating concept-based explanations due to a lack of detailed annotations. This paper explores whether existing Multimodal Large Language Models (MLLMs) can provide localized, concept-based explanations without requiring additional training. The study introduced a zero-shot evaluation protocol, called Concept Naming (CoNa), to assess MLLMs' ability to label bounding-box regions at both object and part levels. Experiments with various MLLMs demonstrated strong performance, achieving up to 88% object-level accuracy. This suggests that MLLMs can be a powerful, low-cost solution for generating human-understandable concept annotations directly from localized image regions.
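As a rough illustration of how an object-level accuracy figure for concept naming might be scored, the sketch below compares predicted region labels against reference labels after light normalization. The function names `normalize` and `naming_accuracy`, the exact-match rule, and the sample labels are all assumptions made for illustration; this is not the paper's CoNa protocol or its data.

```python
from typing import Sequence


def normalize(label: str) -> str:
    """Lowercase and collapse whitespace so 'Bird ' matches 'bird'."""
    return " ".join(label.lower().split())


def naming_accuracy(predicted: Sequence[str], expected: Sequence[str]) -> float:
    """Fraction of regions whose predicted label exactly matches the reference."""
    if len(predicted) != len(expected):
        raise ValueError("predicted and expected must have the same length")
    if not expected:
        return 0.0
    hits = sum(normalize(p) == normalize(e) for p, e in zip(predicted, expected))
    return hits / len(expected)


# Hypothetical labels for four bounding-box regions (not from the paper).
predicted = ["Bird", "wing", "beak ", "car"]
expected = ["bird", "wing", "beak", "truck"]
print(naming_accuracy(predicted, expected))  # 3 of 4 regions match
```

Exact matching after normalization is the simplest possible choice; a real evaluation would likely need to handle synonyms and free-form model answers.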

Why it matters

Professionals can leverage off-the-shelf MLLMs for generating concept-based explanations, reducing the need for expensive, fine-grained concept annotations and accelerating the development of more transparent AI systems.

How to implement this in your domain

1. Integrate mid-scale MLLMs into existing computer vision pipelines to automatically generate localized concept explanations for model predictions.
2. Utilize zero-shot concept naming protocols to quickly prototype and evaluate concept-based XAI features without extensive data labeling.
3. Explore MLLM capabilities for data annotation, using them to generate initial concept labels for new datasets, reducing manual effort.
4. Apply this approach in domains requiring high transparency, such as medical imaging or autonomous driving, to better understand model decisions.
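The steps above could be prototyped roughly as follows: crop each bounding box, prompt a model with a closed vocabulary, and keep only answers that fall inside that vocabulary. Everything here is a hypothetical sketch. The `Box` type, `crop`, `build_prompt`, `name_regions`, the prompt wording, and the toy list-of-lists image are assumptions, and `fake_mllm` is a stand-in for whatever MLLM client a pipeline actually uses, not a real API.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence

Image = List[List[int]]  # toy grayscale image: rows of pixel values


@dataclass
class Box:
    x0: int
    y0: int
    x1: int  # exclusive
    y1: int  # exclusive


def crop(image: Image, box: Box) -> Image:
    """Return the pixels inside the bounding box."""
    return [row[box.x0:box.x1] for row in image[box.y0:box.y1]]


def build_prompt(vocabulary: Sequence[str]) -> str:
    """Assumed prompt wording; restricts the answer to a closed label set."""
    options = ", ".join(vocabulary)
    return f"Name the concept shown in this image region. Answer with one of: {options}."


def name_regions(
    image: Image,
    boxes: Sequence[Box],
    vocabulary: Sequence[str],  # expected to be lowercase labels
    mllm: Callable[[Image, str], str],
) -> List[str]:
    """Ask the model to label each region; map out-of-vocabulary answers to 'unknown'."""
    prompt = build_prompt(vocabulary)
    labels = []
    for box in boxes:
        answer = mllm(crop(image, box), prompt).strip().lower()
        labels.append(answer if answer in vocabulary else "unknown")
    return labels


def fake_mllm(region: Image, prompt: str) -> str:
    """Stand-in for a real model: bright regions are 'Sky', dark ones 'road'."""
    pixels = [p for row in region for p in row]
    return "Sky" if sum(pixels) / len(pixels) > 127 else "road"


image = [[200, 200, 50, 50], [200, 200, 50, 50]]
boxes = [Box(0, 0, 2, 2), Box(2, 0, 4, 2)]
print(name_regions(image, boxes, ["sky", "road"], fake_mllm))
```

Passing the model in as a plain callable keeps the pipeline independent of any particular vendor or checkpoint, so a real MLLM client can replace `fake_mllm` without touching the cropping or validation logic.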

Who benefits

Healthcare, Automotive, Manufacturing, Retail, AI/Tech

Key takeaways

Mid-scale MLLMs can perform localized concept naming in a zero-shot manner.
Training-free approaches offer a low-cost solution for concept-based XAI.
MLLMs can achieve high accuracy in assigning semantic labels to image regions.
This method reduces the need for extensive, fine-grained concept annotations.

Original post by Darian Fernández-Gutiérrez, Rafael Bello, Marilyn Bello, Natalia Díaz-Rodríguez

"arXiv:2606.29069v1 Announce Type: new Abstract: Concept-based Explainable AI (C-XAI) seeks human-understandable explanations grounded in semantic concepts, yet validation is limited by the scarcity of fine-grained concept annotations. We evaluate whether mid-scale Multimodal Larg…"
