FactoryLLM: Safe Open-Source AI Playground for Smart Factories

Yash Pulse, Yong-Bin Kang, Abhik Banerjee, Abdur Forkan, Prem Prakash Jayaraman · June 15, 2026

Summary

FactoryLLM is a new open-source AI playground designed for safely evaluating LLM-based retrieval-augmented generation (RAG) models in smart factories. It enables cross-machine document reasoning for fault diagnostics and recovery without sharing sensitive industrial data.

Diagnosing and recovering from faults in smart factories presents a significant challenge due to the vast amount of critical information dispersed across numerous machine manuals and interconnected manufacturing processes. Large Language Models (LLMs) offer a promising solution for navigating this complexity. This paper introduces FactoryLLM, an open-source and secure AI testing environment specifically developed for evaluating different LLM-based RAG models. It allows users to analyze documentation from multiple machines across an entire manufacturing process. A key feature is its safety, as it supports running local or open-source LLMs, ensuring that sensitive industrial data remains within the user's control and is not shared externally. FactoryLLM provides a dual evaluation setup, utilizing both RAGAS and NVIDIA's LLM-as-a-Judge metrics, to assess model performance in multi-document reasoning. A case study involving an Autonomous Intelligent Vehicle demonstrated its effectiveness, with models achieving high groundedness scores in cross-machine document reasoning. The full code and documentation are publicly available for community use.
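To make the idea of cross-machine document reasoning concrete, here is a minimal sketch of the retrieval step in a RAG pipeline, using only the Python standard library so that nothing leaves the local machine. It is not FactoryLLM's implementation: the manual excerpts, the `MANUALS` dictionary, and the `retrieve` and `build_prompt` helpers are all hypothetical, and a real system would likely use embedding-based retrieval rather than the token-overlap scoring assumed here.

```python
import re
from collections import Counter

# Hypothetical manual excerpts keyed by machine name (illustrative only,
# not taken from the paper or from any real manual).
MANUALS = {
    "conveyor": [
        "If the conveyor belt stops, check the emergency stop and reset the motor drive.",
        "A blocked conveyor sensor halts the upstream robot arm until cleared.",
    ],
    "robot_arm": [
        "The robot arm enters safe mode when the downstream conveyor reports a blocked sensor.",
        "Recalibrate the robot arm gripper after any collision alarm.",
    ],
}


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def retrieve(query, manuals, top_k=2):
    """Rank chunks from every machine's manual by token overlap with the query."""
    query_tokens = Counter(tokenize(query))
    scored = []
    for machine, chunks in manuals.items():
        for chunk in chunks:
            # Multiset intersection counts the tokens shared by query and chunk.
            overlap = sum((query_tokens & Counter(tokenize(chunk))).values())
            scored.append((overlap, machine, chunk))
    scored.sort(key=lambda item: item[0], reverse=True)
    return [(machine, chunk) for overlap, machine, chunk in scored[:top_k] if overlap > 0]


def build_prompt(query, hits):
    """Assemble a prompt that labels each retrieved chunk with its source machine."""
    context = "\n".join(f"[{machine}] {chunk}" for machine, chunk in hits)
    return f"Answer using only the context below.\n{context}\nQuestion: {query}"


query = "Why did the robot arm stop after the conveyor sensor was blocked?"
hits = retrieve(query, MANUALS)
prompt = build_prompt(query, hits)
print(prompt)
```

The resulting prompt would then be passed to whichever local or open-source LLM is being evaluated. Labelling each chunk with its machine name is one simple way to let the model connect evidence from different manuals.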

Why it matters

Professionals in manufacturing and industrial automation can leverage FactoryLLM to safely experiment with and deploy LLMs for critical tasks like fault diagnostics, improving operational efficiency and reducing downtime without compromising data security.

How to implement this in your domain

  1. Download and deploy FactoryLLM in a secure, local environment to begin evaluating LLM performance.
  2. Configure various open-source or local LLMs within FactoryLLM to analyze manufacturing-specific documentation.
  3. Utilize the dual evaluation metrics (RAGAS and LLM-as-a-Judge) to rigorously assess the reasoning capabilities of RAG models.
  4. Develop custom maintenance queries and scenarios based on specific factory operations to test LLM efficacy.
  5. Integrate successful LLM configurations into existing fault diagnostics and recovery workflows to enhance decision-making.
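The evaluation steps above can be sketched as a small harness that runs several model configurations over custom maintenance queries and averages two scores per model. Everything here is an assumption made for illustration: the two scorers are crude token-overlap stand-ins, not the RAGAS or LLM-as-a-Judge metrics, the two "models" are stub functions rather than real LLMs, and the test case is invented.

```python
import re


def tokens(text):
    """Return the set of lowercase alphabetic tokens in the text."""
    return set(re.findall(r"[a-z]+", text.lower()))


def groundedness_proxy(answer, context):
    """Fraction of answer tokens that also appear in the context (stand-in metric)."""
    answer_tokens = tokens(answer)
    if not answer_tokens:
        return 0.0
    return len(answer_tokens & tokens(context)) / len(answer_tokens)


def coverage_proxy(answer, context):
    """Fraction of context tokens that the answer reuses (second stand-in metric)."""
    context_tokens = tokens(context)
    if not context_tokens:
        return 0.0
    return len(tokens(answer) & context_tokens) / len(context_tokens)


def evaluate(models, scorers, cases):
    """Average every scorer over all cases, separately for each model."""
    report = {}
    for model_name, generate in models.items():
        totals = {name: 0.0 for name in scorers}
        for case in cases:
            answer = generate(case["query"], case["context"])
            for name, score in scorers.items():
                totals[name] += score(answer, case["context"])
        report[model_name] = {name: total / len(cases) for name, total in totals.items()}
    return report


# Hypothetical maintenance scenario; replace with queries from your own factory.
CASES = [
    {
        "query": "What should I do when the conveyor sensor is blocked?",
        "context": "Clear the blocked conveyor sensor. Then reset the robot arm from safe mode.",
    }
]

# Stub "models": one answers from the context, the other ignores it.
MODELS = {
    "grounded": lambda query, context: context.split(".")[0] + ".",
    "ungrounded": lambda query, context: "Replace the entire machine immediately.",
}

SCORERS = {"groundedness": groundedness_proxy, "coverage": coverage_proxy}

report = evaluate(MODELS, SCORERS, CASES)
for model_name, scores in report.items():
    print(model_name, scores)
```

Keeping the scorers as plain callables means a real metric could be swapped in later without changing the loop, and reporting two scores side by side mirrors the dual-evaluation idea: a configuration that looks good on only one metric deserves a closer look.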

Who benefits

Manufacturing, Industrial Automation, Logistics, Aerospace, Automotive

Key takeaways

  • FactoryLLM provides a safe, open-source platform for evaluating LLMs in smart factory environments.
  • It enables effective cross-machine document reasoning for fault diagnostics and recovery.
  • The platform supports dual evaluation metrics for comprehensive performance assessment.
  • Users can experiment with LLMs without compromising sensitive industrial data.

Original post by Yash Pulse, Yong-Bin Kang, Abhik Banerjee, Abdur Forkan, Prem Prakash Jayaraman

"arXiv:2606.14119v1 Announce Type: new Abstract: Fault diagnostics and recovery in smart factories is challenging because critical information is dispersed across manuals of multiple machines which are interconnected through the manufacturing process. Large Language Models (LLMs)…"

