Overview
Voice AI for healthcare involves a complex pipeline: speech recognition, natural language understanding, clinical reasoning, and response generation. Rubric helps you evaluate the entire pipeline end-to-end, or drill down into individual components.
Patient Triage Calls
Nurse hotlines, symptom assessment, urgent care routing
Voice Assistants
In-clinic voice AI, patient intake, follow-up calls
Call Center AI
Appointment scheduling, prescription refills, results delivery
Ambient Documentation
Visit recording, note generation, clinical summarization
The Voice AI Pipeline
A typical healthcare voice AI system has multiple stages, each requiring evaluation:
Audio Input
Patient speech captured via phone, app, or device
Metrics: Quality, noise level, duration
Rubric evaluates the final clinical decision, but also lets you log intermediate outputs to pinpoint where errors originate in your pipeline.
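One way to picture logging intermediate outputs is a per-call record that stores each stage's output next to a reference value, so the first stage that diverges can be located. The sketch below is illustrative only and is not Rubric's API: `VoiceInteraction`, `StageOutput`, `log_stage`, and `first_divergence` are hypothetical names, and the stage list simply mirrors the pipeline stages named in the Overview.

```python
from dataclasses import dataclass, field
from typing import Any, List, Optional

# Pipeline stages as named in the Overview (the identifiers are assumptions).
STAGES = ["asr", "nlu", "clinical_reasoning", "response_generation"]


@dataclass
class StageOutput:
    stage: str
    output: Any
    expected: Any = None  # reference value, when the case is labeled


@dataclass
class VoiceInteraction:
    """Hypothetical per-call record; not part of any real SDK."""
    call_id: str
    stages: List[StageOutput] = field(default_factory=list)

    def log_stage(self, stage: str, output: Any, expected: Any = None) -> None:
        if stage not in STAGES:
            raise ValueError(f"unknown stage: {stage}")
        self.stages.append(StageOutput(stage, output, expected))

    def first_divergence(self) -> Optional[str]:
        """Return the earliest pipeline stage whose output differs from its reference."""
        ordered = sorted(self.stages, key=lambda s: STAGES.index(s.stage))
        for s in ordered:
            if s.expected is not None and s.output != s.expected:
                return s.stage
        return None


call = VoiceInteraction(call_id="demo-001")
call.log_stage("asr", "i have chest pain", expected="i have chest pain")
call.log_stage("nlu", ["chest pain"], expected=["chest pain"])
call.log_stage("clinical_reasoning", "routine", expected="emergent")
print(call.first_divergence())  # clinical_reasoning
```

In this made-up call the transcript and the extracted symptom both match their references, so the wrong urgency level is attributed to the clinical reasoning stage and not to speech recognition.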
Logging Voice Interactions
Log the complete interaction, including audio, transcript, and AI decisions:
Voice-Specific Evaluators
Configure evaluators designed for voice AI healthcare applications:
ASR (Speech-to-Text) Evaluation
Evaluate your speech recognition separately:
Multimodal Evaluation
For systems that combine voice with other modalities (images, documents):
Critical Metrics for Voice AI
Key metrics to track for healthcare voice systems:

| Metric | Target | Why It Matters |
|---|---|---|
| Triage Accuracy | > 95% | Core safety metric - correct urgency level |
| Red Flag Recall | > 99% | Must catch nearly all danger signs |
| Under-Triage Rate | < 1% | Missing urgent cases is unacceptable |
| Escalation Latency | < 60s | How quickly urgent cases are escalated |
| ASR Medical WER | < 5% | Transcription accuracy for clinical terms |
| Symptom F1 | > 90% | Correct symptom extraction |
| Guideline Adherence | > 90% | Following clinical protocols |
| Patient Satisfaction | > 4.0/5 | Empathy, clarity, helpfulness |
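
Several of the metrics in the table have standard definitions that can be computed from labeled calls. The following is a minimal sketch, not Rubric's implementation: the function names, the `URGENCY` ordering, and the sample data are all assumptions made for illustration. WER is computed as word-level edit distance divided by the number of reference words, under-triage as the share of calls assigned a lower urgency than the reference, and red flag recall as the share of labeled danger signs that were detected.

```python
from typing import List, Set

# Hypothetical urgency scale, ordered from least to most urgent.
URGENCY = ["self_care", "routine", "urgent", "emergent"]


def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the ref words seen so far and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1] / len(ref)


def under_triage_rate(expected: List[str], predicted: List[str]) -> float:
    """Fraction of calls where the predicted urgency is below the reference."""
    if not expected or len(expected) != len(predicted):
        raise ValueError("expected and predicted must be non-empty and equal length")
    rank = {level: i for i, level in enumerate(URGENCY)}
    under = sum(1 for e, p in zip(expected, predicted) if rank[p] < rank[e])
    return under / len(expected)


def red_flag_recall(expected: List[Set[str]], detected: List[Set[str]]) -> float:
    """Fraction of labeled red flags that were detected, pooled across calls."""
    total = sum(len(e) for e in expected)
    if total == 0:
        raise ValueError("no red flags in the reference labels")
    caught = sum(len(e & d) for e, d in zip(expected, detected))
    return caught / total


# Made-up sample data, for illustration only.
wer = word_error_rate("patient takes metoprolol daily",
                      "patient takes metro pole daily")
utr = under_triage_rate(["emergent", "urgent", "routine", "self_care"],
                        ["urgent", "urgent", "urgent", "self_care"])
rfr = red_flag_recall([{"chest pain", "shortness of breath"}, {"slurred speech"}],
                      [{"chest pain"}, {"slurred speech", "headache"}])
print(wer, utr)  # 0.5 0.25
```

Under-triage is counted separately from overall triage accuracy because the two kinds of error carry different risk: the third sample call is over-triaged (routine labeled urgent), which lowers accuracy but does not count toward the under-triage rate.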
