Overview

Patient triage is one of the highest-stakes applications of healthcare AI. An incorrect triage level can lead to delayed care for serious conditions or unnecessary emergency visits. Rubric provides specialized evaluators to assess triage quality.

Triage Levels

Rubric supports standard triage classification:
| Level | Code | Response Time | Examples |
| --- | --- | --- | --- |
| Emergent | E | Immediate | Chest pain + arm radiation, stroke symptoms, severe bleeding |
| Urgent | U | 1-2 hours | High fever, moderate pain, concerning symptoms |
| Semi-urgent | S | 24-48 hours | Persistent symptoms, medication questions |
| Routine | R | Standard | Prescription refills, follow-up scheduling |

Triage Accuracy Evaluator

The triage accuracy evaluator compares the AI’s decision against expected outcomes:
{
    "type": "triage_accuracy",
    "config": {
        "severity_weights": {
            "under_triage": 5.0,  # Critical: missed urgency
            "over_triage": 1.0,   # Costly but safe
            "correct": 0.0
        },
        "levels": ["emergent", "urgent", "semi-urgent", "routine"]
    }
}
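
One way to read this config: the order of `levels` runs from most to least urgent, so a prediction that lands later in the list than the expected level is an under-triage. The sketch below illustrates that reading. `classify_error` is a hypothetical helper, not part of Rubric's API, and the idea that list order encodes severity is an assumption.

```python
# Assumed ordering: most urgent first, matching the "levels" config above.
LEVELS = ["emergent", "urgent", "semi-urgent", "routine"]


def classify_error(expected, predicted, levels=LEVELS):
    """Return 'correct', 'under_triage', or 'over_triage' for one sample."""
    expected_rank = levels.index(expected)
    predicted_rank = levels.index(predicted)
    if predicted_rank == expected_rank:
        return "correct"
    # A higher index means a less urgent level than the case required.
    return "under_triage" if predicted_rank > expected_rank else "over_triage"


print(classify_error("emergent", "routine"))  # under_triage
print(classify_error("routine", "urgent"))    # over_triage
```

The returned labels match the keys of `severity_weights`, so they can be used to look up the weight for each sample.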

Scoring Logic

Under-triage is weighted more heavily than over-triage. Classifying a patient with chest pain as “routine” is far more dangerous than sending a routine case to urgent care.
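Score = 100 - (weighted_errors / total_samples) * 100

where weighted_errors = Σ(under_triage * 5.0 + over_triage * 1.0), summed over all samples, and 5.0 and 1.0 are the under_triage and over_triage values from severity_weights.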
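
As a worked example of the formula above, the sketch below computes the score from error counts. `triage_score` is a hypothetical helper, and treating `under_triage` and `over_triage` as counts of affected samples is an assumption. The page does not say whether the score is clamped, so this sketch does not clamp it; with a weight of 5.0 the result can fall below zero.

```python
def triage_score(under_triage, over_triage, total_samples,
                 under_weight=5.0, over_weight=1.0):
    """Score = 100 - (weighted_errors / total_samples) * 100."""
    if total_samples <= 0:
        raise ValueError("total_samples must be positive")
    weighted_errors = under_triage * under_weight + over_triage * over_weight
    # Multiply before dividing; algebraically identical to the formula.
    return 100 - (weighted_errors * 100 / total_samples)


# 100 samples: 2 under-triaged, 5 over-triaged, 93 correct.
print(triage_score(under_triage=2, over_triage=5, total_samples=100))  # 85.0

# Every sample under-triaged: the unclamped score is negative.
print(triage_score(under_triage=10, over_triage=0, total_samples=10))  # -400.0
```

Two under-triaged cases cost twice as much as five over-triaged ones here, which is the asymmetry the weights are meant to produce.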

Confusion Matrix

Rubric generates a triage confusion matrix showing:
| Predicted → | Emergent | Urgent | Semi-urgent | Routine |
| --- | --- | --- | --- | --- |
| Actual: Emergent | ✓ Correct | ⚠️ Under | 🔴 Under | 🔴 Under |
| Actual: Urgent | ✓ Over | ✓ Correct | ⚠️ Under | 🔴 Under |
| Actual: Semi-urgent | ✓ Over | ✓ Over | ✓ Correct | ⚠️ Under |
| Actual: Routine | ✓ Over | ✓ Over | ✓ Over | ✓ Correct |
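
A minimal sketch of how such a matrix could be tallied from (actual, predicted) pairs. `confusion_matrix` and the sample data are illustrative, not Rubric's implementation. Cells where the predicted level is less urgent than the actual one (to the right of the diagonal) are the under-triage cells.

```python
from collections import Counter

LEVELS = ["emergent", "urgent", "semi-urgent", "routine"]  # most to least urgent


def confusion_matrix(pairs, levels=LEVELS):
    """Build matrix[actual][predicted] -> count from (actual, predicted) pairs."""
    counts = Counter(pairs)
    return {a: {p: counts[(a, p)] for p in levels} for a in levels}


pairs = [
    ("emergent", "emergent"),  # correct
    ("emergent", "urgent"),    # under-triage
    ("urgent", "emergent"),    # over-triage
    ("routine", "routine"),    # correct
]
matrix = confusion_matrix(pairs)

# Sum the cells to the right of the diagonal: predicted less urgent than actual.
under_triaged = sum(
    matrix[a][p]
    for i, a in enumerate(LEVELS)
    for j, p in enumerate(LEVELS)
    if j > i
)

for actual in LEVELS:
    print(actual.ljust(12), [matrix[actual][p] for p in LEVELS])
print("under-triaged:", under_triaged)
```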

Red Flag Detection

Critical symptoms that should trigger immediate escalation:

Cardiac

  • Chest pain with radiation (arm, jaw, back)
  • Chest pain with shortness of breath
  • Chest pain with sweating, nausea
  • Known cardiac history + new symptoms

Neurological

  • “Worst headache of my life”
  • Sudden vision changes
  • Facial drooping, arm weakness, speech difficulty (FAST)
  • Sudden confusion or altered mental status

Respiratory

  • Severe difficulty breathing
  • Cyanosis (blue lips/fingertips)
  • Choking/airway obstruction

Pediatric

  • Fever in infant < 3 months
  • Lethargy/unresponsiveness
  • Signs of dehydration
  • Possible abuse indicators
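
The red flag detection evaluator is configured with the protocols to check and a minimum sensitivity:
{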
    "type": "red_flag_detection",
    "config": {
        "protocols": ["cardiac", "neurological", "respiratory", "pediatric"],
        "sensitivity_threshold": 0.95  # Must catch 95%+ of red flags
    }
}
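
`sensitivity_threshold` reads most naturally as a floor on recall: of the red flags present in the ground truth, the fraction the AI also detected. The sketch below works under that assumption. `red_flag_sensitivity` is a hypothetical helper, the flag names in the sample data are illustrative, and Rubric's actual computation may differ.

```python
def red_flag_sensitivity(samples):
    """samples: iterable of (expected_flags, detected_flags) pairs.

    Returns detected-and-expected / expected, or None if no flags were expected.
    """
    true_positives = 0
    total_expected = 0
    for expected, detected in samples:
        expected, detected = set(expected), set(detected)
        true_positives += len(expected & detected)
        total_expected += len(expected)
    if total_expected == 0:
        return None  # sensitivity is undefined without any expected red flags
    return true_positives / total_expected


samples = [
    (["exertional_chest_pain"], ["exertional_chest_pain"]),
    (["facial_drooping", "speech_difficulty"], ["facial_drooping"]),  # one missed
    ([], []),
]
sensitivity = red_flag_sensitivity(samples)
print(f"{sensitivity:.2f}", sensitivity >= 0.95)  # 0.67 False
```

Extra flags the AI raises do not lower sensitivity; they would show up in a precision measure instead, which is consistent with treating over-escalation as the safer error.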

Expected Data Format

client.calls.log(
    project="triage",
    transcript=[...],
    
    ai_decision={
        "triage_level": "urgent",
        "extracted_symptoms": [
            {"name": "chest_pain", "severity": "moderate", "duration": "2_days"},
            {"name": "shortness_of_breath", "severity": "mild", "exertional": True}
        ],
        "red_flags_detected": ["exertional_chest_pain"],
        "recommended_action": "same_day_appointment",
        "reasoning": "Chest pain with exertional component warrants cardiac workup"
    },
    
    # Ground truth (if available)
    expected={
        "triage_level": "urgent",
        "red_flags": ["exertional_chest_pain"],
        "actual_outcome": "ACS_ruled_out"  # From follow-up
    }
)
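
Before or after logging, the `ai_decision` and `expected` payloads can be compared locally. The sketch below is a hypothetical helper (`compare_to_expected` is not a Rubric function) that uses only the field names shown above and reports whether the triage level matches and which red flags were missed.

```python
def compare_to_expected(ai_decision, expected):
    """Summarize agreement between an AI decision and its ground truth."""
    detected = set(ai_decision.get("red_flags_detected", []))
    required = set(expected.get("red_flags", []))
    return {
        "level_match": ai_decision.get("triage_level") == expected.get("triage_level"),
        "missed_red_flags": sorted(required - detected),
        "extra_red_flags": sorted(detected - required),
    }


ai_decision = {
    "triage_level": "urgent",
    "red_flags_detected": ["exertional_chest_pain"],
}
expected = {
    "triage_level": "urgent",
    "red_flags": ["exertional_chest_pain"],
}
print(compare_to_expected(ai_decision, expected))
```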

Best Practices

  • Configure evaluators to heavily penalize under-triage. It’s better for the AI to escalate unnecessarily than to miss a serious condition.
  • Capture duration, severity, and associated symptoms, because they change the appropriate level: “headache for 5 minutes” differs from “sudden severe headache.”
  • When possible, log actual patient outcomes to improve evaluation accuracy over time.
  • Set up automatic flagging for cases near decision boundaries (e.g., borderline urgent/semi-urgent).
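
Boundary flagging could be sketched as follows, assuming the AI exposes a score per triage level. That `level_scores` mapping is hypothetical and not part of the data format shown earlier, and `is_borderline` and the 0.15 margin are illustrative choices, not Rubric settings.

```python
def is_borderline(level_scores, margin=0.15):
    """True when the two highest-scoring levels are within `margin` of each other."""
    ranked = sorted(level_scores.values(), reverse=True)
    if len(ranked) < 2:
        return False
    return (ranked[0] - ranked[1]) < margin


# Hypothetical per-level scores for one call: urgent and semi-urgent are close.
scores = {"emergent": 0.05, "urgent": 0.48, "semi-urgent": 0.42, "routine": 0.05}
print(is_borderline(scores))  # True
```

Flagged cases can then be routed to human review rather than scored automatically.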

Next Steps