Accelerometry Predicts Cardiometabolic Risk in New Benchmark

Federico Felizzi · July 1, 2026

Summary

Researchers introduced the NHANES Accelerometry Cardiometabolic Benchmark, a population-representative tabular dataset for evaluating how well machine learning models predict cardiometabolic risk biomarkers from accelerometry and lifestyle data. TabPFN v2 performed best at predicting HbA1c and CRP. The study also found that prediction intervals can undercover specific demographic subgroups even when overall coverage targets are met.

A new benchmark dataset, the NHANES Accelerometry Cardiometabolic Benchmark, has been developed to advance the use of digital biomarkers in healthcare. Derived from the NHANES 2003-2006 study, it combines accelerometry data from hip-worn devices with fasting laboratory biomarkers, dietary intake, and anthropometrics for over a thousand adults. Its design targets properties of real-world clinical data that existing benchmarks often overlook, such as complex survey sampling and demographic oversampling.

The study evaluated three tabular learning methods (ridge regression, XGBoost, and TabPFN v2) on their ability to predict key cardiometabolic indicators: glycated haemoglobin (HbA1c), fasting triglycerides, and C-reactive protein (CRP). TabPFN v2 was the top performer for HbA1c and CRP. Triglycerides remained largely unpredictable, consistent with the view that triglyceride levels are driven largely by genetics.

The research also applied split conformal prediction to quantify uncertainty and to assess fairness across demographic subgroups. Overall, prediction intervals met their coverage targets for CRP and HbA1c, but coverage fell short for specific subgroups, such as Mexican American participants for HbA1c. This exposes a gap between the marginal (population-average) guarantee that conformal prediction provides and the conditional (per-subgroup) coverage needed for equitable clinical use.
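For readers who have not met split conformal prediction before, here is a minimal sketch of the general technique. It is not the study's code: the data are synthetic, the ridge model is a plain NumPy closed-form fit, and the name `split_conformal_interval`, the split sizes, and `alpha=0.1` are assumptions chosen purely for illustration.

```python
import numpy as np

def split_conformal_interval(cal_residuals, test_pred, alpha=0.1):
    """Symmetric prediction intervals from absolute calibration residuals."""
    residuals = np.sort(np.asarray(cal_residuals, dtype=float))
    n = len(residuals)
    # Finite-sample corrected rank: ceil((n + 1) * (1 - alpha)).
    k = int(np.ceil((n + 1) * (1 - alpha)))
    # With too few calibration points the interval has to be unbounded.
    q = np.inf if k > n else residuals[k - 1]
    return test_pred - q, test_pred + q

# Synthetic regression data (illustrative only, not NHANES).
rng = np.random.default_rng(0)
n, d = 3000, 5
X = rng.normal(size=(n, d))
beta = rng.normal(size=d)
y = X @ beta + rng.normal(scale=1.0, size=n)

# Three disjoint splits: fit the model, calibrate the intervals, evaluate.
X_tr, y_tr = X[:1000], y[:1000]
X_cal, y_cal = X[1000:2000], y[1000:2000]
X_te, y_te = X[2000:], y[2000:]

# Ridge regression in closed form: w = (X'X + lam * I)^-1 X'y.
lam = 1.0
w = np.linalg.solve(X_tr.T @ X_tr + lam * np.eye(d), X_tr.T @ y_tr)

# Calibrate on data the model never saw during fitting.
cal_residuals = np.abs(y_cal - X_cal @ w)
lower, upper = split_conformal_interval(cal_residuals, X_te @ w, alpha=0.1)

# Empirical marginal coverage on the held-out test split.
coverage = np.mean((y_te >= lower) & (y_te <= upper))
print(f"empirical coverage: {coverage:.3f}")
```

Under these assumptions the empirical coverage should land near the 90% target. The key design choice is that the calibration residuals come from a split the model was not trained on, which is what makes the quantile a valid interval half-width.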

Why it matters

This benchmark and research demonstrate the potential of wearable device data (accelerometry) for predicting cardiometabolic risk, while also emphasizing the critical need for robust uncertainty quantification and fairness evaluation in AI models used in healthcare.

How to implement this in your domain

  1. Explore integrating accelerometry data from wearables into predictive models for early disease risk assessment.
  2. Implement uncertainty quantification methods like conformal prediction in clinical AI models to provide reliable prediction intervals.
  3. Conduct thorough subgroup fairness analyses for all AI models deployed in healthcare to identify and mitigate biases.
  4. Collaborate with data scientists to leverage population-representative datasets for developing more robust and equitable health AI solutions.
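The subgroup analysis in the steps above can start as a simple coverage audit: for each demographic group, count how often the true value falls inside its prediction interval. The sketch below is one hypothetical way to do that. The function name `coverage_by_group`, the group labels "A" and "B", the toy numbers, and the `tol` threshold are all assumptions, and the audit ignores survey weights, which a real NHANES analysis would need to account for.

```python
import numpy as np

def coverage_by_group(y_true, lower, upper, groups, target=0.9, tol=0.05):
    """Per-group interval coverage, flagging groups below target - tol."""
    y_true, lower, upper, groups = map(np.asarray, (y_true, lower, upper, groups))
    covered = (y_true >= lower) & (y_true <= upper)
    report = {}
    for g in np.unique(groups):
        mask = groups == g
        cov = float(covered[mask].mean())
        report[str(g)] = {
            "n": int(mask.sum()),
            "coverage": cov,
            "undercovered": cov < target - tol,
        }
    return report

# Toy example: one fixed interval half-width applied to two hypothetical groups.
y_true = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0])
y_pred = np.array([1.2, 1.8, 3.5, 4.1, 5.0, 8.0, 9.5, 8.3])
groups = np.array(["A", "A", "A", "A", "B", "B", "B", "B"])
lower, upper = y_pred - 1.0, y_pred + 1.0

# Overall coverage pooled across both groups.
overall = float(np.mean((y_true >= lower) & (y_true <= upper)))
report = coverage_by_group(y_true, lower, upper, groups, target=0.75)

print(f"overall coverage: {overall:.2f}")
for name, stats in report.items():
    print(name, stats)
```

In this toy data the pooled coverage matches the 0.75 target exactly, yet group "B" is covered only half the time while group "A" is always covered. With small subgroups the per-group estimates are noisy, so a flagged group is a prompt for closer inspection rather than proof of bias.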

Who benefits

Healthcare, Wearable Technology, Insurance, Pharmaceuticals

Key takeaways

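  • Accelerometry and lifestyle data can predict some cardiometabolic risk biomarkers, notably HbA1c and CRP; fasting triglycerides remained largely unpredictable.
  • TabPFN v2 outperformed ridge regression and XGBoost for HbA1c and CRP on this new population-representative benchmark.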
  • Uncertainty quantification and subgroup fairness are crucial for clinical AI deployment.
  • Marginal coverage guarantees do not always translate to equitable conditional coverage across demographics.
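
The last takeaway can be stated more precisely. The notation below is introduced here for illustration and is not taken from the paper: \hat{C}_\alpha is the conformal prediction set at miscoverage level \alpha, (X_{n+1}, Y_{n+1}) is a new observation assumed exchangeable with the calibration data, and G is a subgroup label.

```latex
% Marginal guarantee of split conformal prediction (holds under exchangeability):
\[
\mathbb{P}\bigl(Y_{n+1} \in \hat{C}_\alpha(X_{n+1})\bigr) \;\ge\; 1 - \alpha
\]

% Group-conditional coverage, which the marginal guarantee does NOT imply:
\[
\mathbb{P}\bigl(Y_{n+1} \in \hat{C}_\alpha(X_{n+1}) \,\big|\, G_{n+1} = g\bigr) \;\ge\; 1 - \alpha
\quad \text{for every subgroup } g
\]

% Law of total probability: marginal coverage is a weighted average over groups.
\[
\mathbb{P}\bigl(Y_{n+1} \in \hat{C}_\alpha(X_{n+1})\bigr)
= \sum_{g} \mathbb{P}(G_{n+1} = g)\,
  \mathbb{P}\bigl(Y_{n+1} \in \hat{C}_\alpha(X_{n+1}) \,\big|\, G_{n+1} = g\bigr)
\]
```

Because the marginal probability is a weighted average of the per-group probabilities, undercoverage in a small subgroup can be offset by overcoverage in larger ones while the overall target is still met.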

Original post by Federico Felizzi

"arXiv:2606.30702v1 Announce Type: new Abstract: Structured tabular data dominates clinical medicine, yet existing benchmarks fail to reflect real-world properties like complex survey sampling, demographic oversampling, and subgroup fairness. We introduce the NHANES Accelerometry…"

Originally posted by Federico Felizzi on X