DECIDE-AI: A Reporting Guideline for Early-Stage Clinical Evaluation of AI Decision Support Systems
DECIDE-AI is a reporting guideline comprising 17 AI-specific items and 10 generic items, developed through multi-stakeholder consensus for the early-stage clinical evaluation of artificial intelligence decision support systems in healthcare. 1
Core Purpose and Structure
The DECIDE-AI guideline was created to address a critical gap in AI research reporting, one that has obstructed the reproducibility of published studies and made evidence-based AI implementation harder. 1 It standardizes how early-stage clinical evaluations of AI systems are reported, before those systems are deployed at scale in clinical practice.
Key Components
The guideline includes:
- 17 AI-specific reporting items that address unique aspects of artificial intelligence systems 1
- 10 generic items applicable to broader clinical evaluation contexts 1
- A distinction between "live evaluation" (the AI system directly affects patient care) and "shadow mode" (it does not affect care); the two modes suit different stages of evaluation 1
Clinical Rationale
Early-stage clinical evaluation is critical to assess an AI system's actual clinical performance at small scale, ensure safety, evaluate human factors, and pave the way for larger trials. 1 This addresses a fundamental problem in AI implementation: few AI tools have demonstrated real benefit to patient care despite promising preclinical performance, a phenomenon known as the "AI chasm." 1
Why This Matters for Patient Outcomes
The guideline recognizes that the evaluation of AI systems must account for:
- Their nature as complex interventions 1
- User variability across different clinical settings 1
- Human-computer collaboration dynamics 1
- Changing system versions over time 1
These factors directly affect whether AI tools will actually reduce morbidity and mortality and improve quality of life when deployed in real-world clinical environments.
Implementation Framework
Learning Curve Assessment
DECIDE-AI recommends analyzing learning curves by plotting user performance against experience, so that readers can see how competency with an AI tool develops over time. 1 Evidence of this kind matters because, as the American Heart Association has emphasized, AI-based tools have yet to improve patient outcomes at scale despite enormous academic interest and industry financing. 2
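As a rough illustration of what such a learning curve could look like in practice, the sketch below computes a moving average of user performance over consecutive cases. It is not taken from the guideline: the function name `learning_curve`, the moving-average window, the binary correct/incorrect outcome measure, and the example data are all assumptions made for illustration.

```python
from statistics import mean


def learning_curve(outcomes, window=5):
    """Return (experience, performance) points for one user.

    outcomes: chronological list of 1 (decision judged correct) or
              0 (judged incorrect); this binary measure is an assumption.
    window:   number of most recent cases averaged at each point
              (a hypothetical smoothing choice, not a guideline value).
    """
    if window < 1:
        raise ValueError("window must be at least 1")
    points = []
    for i in range(len(outcomes)):
        # Average over up to `window` cases ending at case i + 1.
        recent = outcomes[max(0, i + 1 - window): i + 1]
        points.append((i + 1, mean(recent)))
    return points


# Hypothetical data: one clinician's first ten cases with the AI tool.
outcomes = [0, 1, 0, 1, 1, 1, 0, 1, 1, 1]
curve = learning_curve(outcomes, window=5)

# Crude text plot: experience (case number) against performance.
for n_cases, performance in curve:
    print(f"case {n_cases:2d}: {'#' * round(performance * 20)} {performance:.2f}")
```

The resulting points can be passed to any plotting library; a real analysis would choose the performance measure and the unit of experience to fit the study design.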
Evaluation Modes
The guideline distinguishes between two evaluation approaches:
- Live evaluation: AI system actively affects patient care decisions 1
- Shadow mode: AI system runs in parallel without affecting care, allowing safety assessment 1
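The difference between the two modes can be sketched in code. The example below is a minimal illustration under stated assumptions, not an implementation described by DECIDE-AI: the `Mode` enum, the `Evaluation` class, and the toy `toy_model` function are hypothetical names. In both modes the AI output is logged for later analysis; only in live mode is it returned to the clinician.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Callable, Optional


class Mode(Enum):
    LIVE = "live"      # AI output is shown to clinicians and can affect care
    SHADOW = "shadow"  # AI output is recorded only and never shown


@dataclass
class Evaluation:
    """Hypothetical wrapper that runs a model in live or shadow mode."""
    model: Callable[[dict], str]
    mode: Mode
    log: list = field(default_factory=list)

    def run(self, case: dict) -> Optional[str]:
        suggestion = self.model(case)
        # Always record the output so safety can be assessed afterwards.
        self.log.append((case["id"], suggestion))
        # Only live evaluation exposes the suggestion to the care team.
        return suggestion if self.mode is Mode.LIVE else None


def toy_model(case: dict) -> str:
    # Placeholder rule standing in for a real decision support system.
    return "flag" if case["score"] >= 0.5 else "no flag"


shadow = Evaluation(model=toy_model, mode=Mode.SHADOW)
live = Evaluation(model=toy_model, mode=Mode.LIVE)

shadow_result = shadow.run({"id": "case-1", "score": 0.9})
live_result = live.run({"id": "case-1", "score": 0.9})
```

Here `shadow_result` is `None` while `live_result` carries the suggestion, yet both logs contain the same entry, which is what lets shadow-mode output be compared with actual care after the fact.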
Context Within Broader AI Implementation
The DECIDE-AI guideline aligns with broader recommendations from the American Heart Association, which emphasizes that AI development should incorporate patient-centered outcomes research (PCOR) principles to ensure tools address meaningful clinical questions. 1 The guideline also supports the recommendation that multidisciplinary teams including bioinformatics experts, medical specialists, and patient representatives should develop AI tools. 1
Critical Gap Being Addressed
The lack of appropriate reporting guidelines has obstructed reproducibility of published studies, creating a significant barrier to evidence-based AI integration in clinical practice. 1 DECIDE-AI provides the structured framework needed to overcome this obstacle and ensure that AI systems are rigorously evaluated before widespread deployment.
Relevance to Cardiovascular Medicine
While DECIDE-AI is applicable across medical specialties, it has particular relevance to cardiovascular medicine, where AI applications include automated segmentation, volumetric analysis, ejection fraction calculation, and automated disease detection. 1 The American Heart Association has recognized AI's potential to advance precision medicine by enabling more precise approaches to cardiovascular research and individualized care. 3 Realizing that potential depends on rigorous evaluation and reporting, which guidelines such as DECIDE-AI are designed to support.