MMIR-TCM Enhances Traditional Chinese Medicine Diagnosis with AI

Lihui Luo, Joongwon Chae, Ziyan Chen, Yang Liu, Siyi Cheng, Weihan Gao, Zelin Zeng, Xiaoming Yin, Samaneh Beheshti Kashi, Dongmei Yu, Lian Zhang, Jing Sui, Zeming Liang, Jiansong Ji, Peter E. Lobie, Peiwu Qin · July 3, 2026

Summary

MMIR-TCM is a novel AI framework that integrates multimodal large language models with memory-augmented segmentation and retrieval-augmented generation to improve Traditional Chinese Medicine (TCM) diagnosis, particularly tongue inspection. It addresses challenges like subjectivity and data scarcity by emulating expert diagnostic processes and utilizing a new large-scale multimodal dataset, MedTCM.

Traditional Chinese Medicine (TCM) diagnosis, especially visual tongue inspection, suffers from subjectivity and poor reproducibility. Applying multimodal AI to TCM tasks such as syndrome differentiation and prescription generation is further complicated by the semantic gap between visual features and textual reasoning, and by a scarcity of large, standardized datasets. To overcome these hurdles, researchers have developed MMIR-TCM, a framework designed to mimic the diagnostic workflow of TCM experts.

MMIR-TCM employs a three-stage architecture. It begins with a training-free Memory-SAM module for robust tongue image extraction. A fine-tuned Qwen3-VL model then generates structured tongue diagnoses. Finally, a Qwen3-based Retrieval-Augmented Generation (RAG) component provides evidence-grounded clinical decision support.

The framework was developed and validated using MedTCM, a newly introduced large-scale multimodal dataset. To assess clinical accuracy, the researchers also created a domain-specific metric, TDEU. They report that comprehensive experiments show MMIR-TCM significantly outperforming leading models such as GPT-4o and Gemini 2.5 Flash, demonstrating its potential to enhance TCM clinical decision support.
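The three-stage flow described above (segment the tongue, produce a structured diagnosis, then retrieve evidence and generate advice) can be illustrated with a minimal Python sketch. This is not the authors' implementation: every name here (`TongueDiagnosis`, `run_pipeline`, the `stub_*` functions, the `KNOWLEDGE` list, and the field names) is hypothetical, and the stubs only stand in for Memory-SAM, Qwen3-VL, and the Qwen3-based RAG component so that the data flow between stages is visible.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence

Image = Sequence[Sequence[int]]  # toy stand-in for pixel data


@dataclass(frozen=True)
class TongueDiagnosis:
    """Hypothetical structured output of stage 2 (field names are assumptions)."""
    body_color: str
    coating: str


@dataclass(frozen=True)
class DecisionSupport:
    diagnosis: TongueDiagnosis
    evidence: List[str]
    answer: str


def run_pipeline(
    image: Image,
    segment: Callable[[Image], Image],
    describe: Callable[[Image], TongueDiagnosis],
    retrieve: Callable[[TongueDiagnosis], List[str]],
    generate: Callable[[TongueDiagnosis, List[str]], str],
) -> DecisionSupport:
    """Chain the three stages; each stage is injected so it can be swapped out."""
    region = segment(image)                 # stage 1: isolate the tongue region
    diagnosis = describe(region)            # stage 2: structured tongue diagnosis
    evidence = retrieve(diagnosis)          # stage 3a: fetch supporting passages
    answer = generate(diagnosis, evidence)  # stage 3b: evidence-grounded output
    return DecisionSupport(diagnosis, evidence, answer)


# Placeholder stages below; real systems would call actual models here.
def stub_segment(image: Image) -> Image:
    # Keep only rows containing at least one non-zero pixel.
    return [row for row in image if any(row)]


def stub_describe(region: Image) -> TongueDiagnosis:
    # Fixed output; a vision-language model would infer this from the region.
    return TongueDiagnosis(body_color="pale", coating="thin white")


KNOWLEDGE = [
    "pale tongue body: example reference passage A",
    "thin white coating: example reference passage B",
    "red tongue body: example reference passage C",
]


def stub_retrieve(diagnosis: TongueDiagnosis) -> List[str]:
    # Naive substring match; a real retriever would use embeddings or an index.
    terms = (diagnosis.body_color, diagnosis.coating)
    return [p for p in KNOWLEDGE if any(t in p for t in terms)]


def stub_generate(diagnosis: TongueDiagnosis, evidence: List[str]) -> str:
    return (
        f"Findings: {diagnosis.body_color} body, {diagnosis.coating} coating; "
        f"supported by {len(evidence)} passage(s)."
    )
```

Passing the stages in as callables keeps the orchestration independent of any particular model, so a segmentation or language model can be replaced without touching `run_pipeline`.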

Why it matters

For healthcare professionals and AI developers, MMIR-TCM offers a promising pathway to standardize and improve the accuracy of TCM diagnosis, potentially leading to more consistent and effective patient care through AI-driven insights.

How to implement this in your domain

  1. Explore integrating multimodal AI frameworks like MMIR-TCM into clinical decision support systems for specialized diagnostics.
  2. Invest in developing or acquiring large-scale, standardized multimodal datasets for niche medical domains.
  3. Collaborate with domain experts to fine-tune and validate AI models for specific diagnostic tasks.
  4. Develop domain-specific evaluation metrics to accurately assess the clinical utility and safety of AI systems.
  5. Pilot AI-assisted diagnostic tools in a controlled clinical setting to gather real-world performance data.
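As a starting point for the validation and metric steps above, one simple option is to measure how often a model's structured output matches expert labels, field by field. The sketch below is an assumption for illustration only: it is not TDEU (whose definition is not given here), and the function name `field_agreement` and the example field names are hypothetical.

```python
from typing import Dict, List


def field_agreement(
    predictions: List[Dict[str, str]],
    references: List[Dict[str, str]],
) -> Dict[str, float]:
    """Return, per field, the fraction of cases where prediction equals the expert label."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        return {}
    fields = references[0].keys()  # assumes every reference has the same fields
    total = len(references)
    return {
        field: sum(p.get(field) == r[field] for p, r in zip(predictions, references)) / total
        for field in fields
    }
```

Reporting agreement per field rather than as one pooled number makes it easier for domain experts to see which attributes a model gets wrong.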

Who benefits

Healthcare, AI Development, Medical Diagnostics, Pharmaceuticals

Key takeaways

  • MMIR-TCM improves TCM diagnosis, especially tongue inspection, using multimodal AI.
  • The framework addresses subjectivity and data scarcity in TCM through expert emulation.
  • A new large-scale multimodal dataset, MedTCM, was introduced for TCM research.
  • MMIR-TCM outperforms leading general-purpose MLLMs in TCM diagnostic accuracy.

Original post by Lihui Luo, Joongwon Chae, Ziyan Chen, Yang Liu, Siyi Cheng, Weihan Gao, Zelin Zeng, Xiaoming Yin, Samaneh Beheshti Kashi, Dongmei Yu, Lian Zhang, Jing Sui, Zeming Liang, Jiansong Ji, Peter E. Lobie, Peiwu Qin

"arXiv:2607.01814v1. Abstract: Traditional Chinese Medicine (TCM) diagnosis, particularly through tongue inspection, faces persistent challenges in subjectivity and reproducibility. The application of multimodal artificial intelligence to TCM clinical tasks, such…"
