Insulin4RL Dataset Enables Realistic Offline RL for ICU Insulin Management

Thomas Frost, Steve Harris · June 19, 2026

Summary

This paper introduces Insulin4RL, a new healthcare offline reinforcement learning (ORL) dataset derived from MIMIC-IV, featuring naturally irregular inputs and actions for real-time insulin infusion titration in the ICU. It addresses the limitations of current ORL practices that rely on temporally discretized EHR data, providing a more realistic resource for research into clinical decision-making.

Offline reinforcement learning (ORL) could improve clinical decision-making by learning from historical electronic health record (EHR) data. However, current ORL training and evaluation rely heavily on EHR datasets that have been artificially discretized into fixed, regular time intervals. This discretization misrepresents complex clinical scenarios and compromises the generalizability of retrospective model evaluations.

To address this, the paper introduces Insulin4RL, a healthcare ORL dataset unique in featuring naturally irregular inputs and actions that reflect real clinical trajectories. It is derived from the MIMIC-IV database and comprises over 375,000 labeled decisions from 12,209 patients who required insulin infusion titration in the intensive care unit (ICU). The dataset is designed to support research into ORL model performance under realistic clinical sampling assumptions. The paper describes the dataset's structure and characteristics, reports baseline performance using model-free offline reinforcement learning, and includes a standardized evaluation protocol based on fitted Q-evaluation.
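To make the contrast between fixed-interval discretization and naturally irregular sampling concrete, here is a minimal Python sketch. The `Event` fields, the glucose and rate values, and the 60-minute bin width are all hypothetical illustrations. They are not the Insulin4RL schema or the paper's preprocessing.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Event:
    # Hypothetical fields for illustration; not the Insulin4RL schema.
    t_min: float    # minutes since the start of the trajectory
    glucose: float  # blood glucose reading (illustrative values)
    rate: float     # insulin infusion rate chosen at this time


def to_irregular_transitions(events: List[Event]) -> List[dict]:
    """Pair consecutive events and keep the true elapsed time between them."""
    transitions = []
    for cur, nxt in zip(events, events[1:]):
        transitions.append({
            "state": cur.glucose,
            "action": cur.rate,
            "next_state": nxt.glucose,
            "dt_min": nxt.t_min - cur.t_min,
        })
    return transitions


def discretize(events: List[Event], bin_min: float,
               horizon_min: float) -> List[Optional[Event]]:
    """Fixed-interval binning: keep the last event per bin, None if a bin is empty."""
    n_bins = int(horizon_min // bin_min)
    bins: List[Optional[Event]] = [None] * n_bins
    for e in events:
        idx = int(e.t_min // bin_min)
        if idx < n_bins:
            bins[idx] = e  # later events in the same bin overwrite earlier ones
    return bins


events = [
    Event(0.0, 250.0, 4.0),
    Event(35.0, 230.0, 5.0),
    Event(50.0, 190.0, 3.0),
    Event(200.0, 140.0, 2.0),
]

transitions = to_irregular_transitions(events)
binned = discretize(events, bin_min=60.0, horizon_min=240.0)
```

In this toy trajectory, hourly binning collapses three decisions made in the first hour into one and leaves two hours with no observation at all, while the irregular representation keeps every decision together with its actual time gap.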

Why it matters

For healthcare AI developers and clinicians, Insulin4RL provides a realistic dataset for developing and evaluating offline reinforcement learning models for critical care. Models evaluated on naturally irregular data may generalize better to clinical practice, which could support more reliable AI-driven decision support for complex interventions such as insulin management.

How to implement this in your domain

1. Utilize the Insulin4RL dataset to develop and test offline reinforcement learning models for real-time clinical decision support in critical care.
2. Focus on developing ORL algorithms that can effectively handle naturally irregular time series data, moving beyond fixed-interval discretizations.
3. Apply the provided standardized evaluation protocol to ensure robust and comparable assessment of new ORL models.
4. Collaborate with clinicians to integrate and validate ORL-driven insulin management strategies in simulated or real-world ICU settings.
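For readers unfamiliar with fitted Q-evaluation (FQE), the technique behind the evaluation protocol, the following is a minimal tabular sketch of the general idea. It repeatedly regresses Q(s, a) onto the reward plus the discounted Q-value of the action the evaluated policy would take in the next state. The toy transitions, the state and action encoding, and the discount factor are assumptions made for illustration. This is not the paper's protocol, which may differ in function approximation, reward design, and handling of irregular time steps.

```python
def fitted_q_evaluation(transitions, policy, n_states, n_actions,
                        gamma=0.9, n_iters=50):
    """Tabular FQE sketch.

    transitions: list of (state, action, reward, next_state, done) tuples
    policy: policy[s] is the action the evaluated policy takes in state s
    In the tabular case, the least-squares regression step reduces to
    averaging the bootstrapped targets for each (state, action) pair.
    """
    q = [[0.0] * n_actions for _ in range(n_states)]
    for _ in range(n_iters):
        sums = [[0.0] * n_actions for _ in range(n_states)]
        counts = [[0] * n_actions for _ in range(n_states)]
        for s, a, r, s_next, done in transitions:
            # Bootstrap from the evaluated policy's action, not the logged one.
            bootstrap = 0.0 if done else gamma * q[s_next][policy[s_next]]
            sums[s][a] += r + bootstrap
            counts[s][a] += 1
        # Pairs never seen in the data keep their previous estimate.
        q = [
            [sums[s][a] / counts[s][a] if counts[s][a] else q[s][a]
             for a in range(n_actions)]
            for s in range(n_states)
        ]
    return q


# Toy logged data (hypothetical): state 0 leads to state 1, which terminates
# with reward 1.0 when action 0 is taken.
toy_transitions = [
    (0, 1, 0.0, 1, False),
    (1, 0, 1.0, 1, True),
]
toy_policy = [1, 0]  # action chosen by the evaluated policy in each state

q_values = fitted_q_evaluation(toy_transitions, toy_policy,
                               n_states=2, n_actions=2, gamma=0.9)
```

The estimated value of the policy from a start state `s` is then `q_values[s][toy_policy[s]]`. A practical implementation on a dataset of this size would replace the table with a learned regressor.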

Who benefits

Healthcare, Pharmaceuticals, Medical Devices, AI Research

Key takeaways

Insulin4RL is a new, realistic ORL dataset for real-time insulin management in the ICU.
It features naturally irregular inputs and actions, addressing limitations of discretized EHR data.
The dataset comprises over 375,000 labeled decisions from 12,209 ICU patients.
It provides a standardized evaluation protocol for robust ORL model assessment in healthcare.

Original post by Thomas Frost, Steve Harris

"arXiv:2606.19481v1 Announce Type: new Abstract: Offline reinforcement learning (ORL) offers the potential to improve the quality of clinical decision-making using historical electronic health record (EHR) data. Current training and evaluative practices in this field rely heavily…"

