Clinical Accuracy Metrics
Triage Accuracy
Measures the percentage of cases where the AI assigned the correct urgency level.| Metric | Formula | Interpretation |
|---|---|---|
| Raw Accuracy | Correct / Total | Simple correctness rate |
| Weighted Accuracy | Σ(w × correct) / Σw | Accounts for severity differences |
| Under-triage Rate | Under-triaged / Total | Cases assigned less urgent than needed |
| Over-triage Rate | Over-triaged / Total | Cases assigned more urgent than needed |
Sensitivity & Specificity
Critical metrics for evaluating detection of specific conditions or red flags.Clinical Context
| Condition | Priority Metric | Target | Rationale |
|---|---|---|---|
| Chest Pain → MI | Sensitivity | ≥ 99% | Cannot miss heart attacks |
| Stroke Symptoms | Sensitivity | ≥ 99% | Time-critical intervention |
| Pediatric Sepsis | Sensitivity | ≥ 98% | Rapid deterioration risk |
| Routine Follow-up | Specificity | ≥ 85% | Avoid unnecessary burden |
| Mental Health Crisis | Sensitivity | ≥ 95% | Safety-critical detection |
sensitivity_calculation.py
Rubric-Based Scoring
For complex outputs like clinical notes or patient communications, rubric-based scoring provides structured evaluation across multiple dimensions.rubric_definition.py
Custom Metrics
Define metrics specific to your clinical domain and use cases.custom_metrics.py
Confidence Intervals
All metrics include confidence intervals to quantify uncertainty, especially important for small sample sizes.Sample Size Matters: For rare conditions, you may need larger datasets to achieve narrow confidence intervals. Rubric warns when sample sizes are too small for reliable conclusions.
Metric Aggregation
Combine multiple metrics into composite scores for overall model assessment.| Method | Description | Use Case |
|---|---|---|
| Weighted Average | Sum of (weight × metric) | General performance score |
| Minimum | Lowest individual metric | Ensure no weak spots |
| Geometric Mean | ∏(metric)^(1/n) | Balance across metrics |
| Threshold Gate | Pass only if all thresholds met | Safety-critical deployment |
