Overview
Voice AI for healthcare involves a complex pipeline: speech recognition, natural language understanding, clinical reasoning, and response generation. Rubric helps you evaluate the entire pipeline end-to-end, or drill down into individual components.
Patient Triage Calls
Nurse hotlines, symptom assessment, urgent care routing
Voice Assistants
In-clinic voice AI, patient intake, follow-up calls
Call Center AI
Appointment scheduling, prescription refills, results delivery
Ambient Documentation
Visit recording, note generation, clinical summarization
The Voice AI Pipeline
A typical healthcare voice AI system has multiple stages, each requiring evaluation:
Audio Input
Patient speech captured via phone, app, or device
Metrics: Quality, noise level, duration
Rubric evaluates the final clinical decision, but also lets you log intermediate outputs to pinpoint where errors originate in your pipeline.
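
To make the idea of logging intermediate outputs concrete, here is a minimal sketch in plain Python. It is not Rubric's API: `StageRecord`, `InteractionTrace`, and the stage names are hypothetical, chosen to mirror the pipeline stages named in the overview. The sketch assumes a reviewer has supplied a reference value for each stage, and it reports the earliest stage whose output differs from that reference.

```python
from dataclasses import dataclass, field
from typing import Any, List, Optional


@dataclass
class StageRecord:
    """Output of one pipeline stage, with an optional reference value."""
    name: str
    output: Any
    expected: Optional[Any] = None  # reference supplied by a reviewer, if any


@dataclass
class InteractionTrace:
    """Ordered stage records for a single call (hypothetical structure)."""
    call_id: str
    stages: List[StageRecord] = field(default_factory=list)

    def log(self, name: str, output: Any, expected: Optional[Any] = None) -> None:
        """Append one stage's output in pipeline order."""
        self.stages.append(StageRecord(name, output, expected))

    def first_divergence(self) -> Optional[str]:
        """Return the earliest stage whose output differs from its reference."""
        for stage in self.stages:
            if stage.expected is not None and stage.output != stage.expected:
                return stage.name
        return None


# Illustrative data only: the transcript and symptoms match their references,
# so the error is attributed to the clinical reasoning stage.
trace = InteractionTrace(call_id="call-001")
trace.log("speech_recognition", "i have chest pain", expected="i have chest pain")
trace.log("language_understanding", ["chest pain"], expected=["chest pain"])
trace.log("clinical_reasoning", "routine", expected="emergent")
print(trace.first_divergence())  # clinical_reasoning
```

Exact equality is the simplest possible comparison. A real pipeline would likely use a stage-specific comparison, such as word error rate for transcripts or set overlap for extracted symptoms.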
Logging Voice Interactions
Log the complete interaction, including audio, transcript, and AI decisions.
Voice-Specific Evaluators
Configure evaluators designed for voice AI healthcare applications.
ASR (Speech-to-Text) Evaluation
Evaluate your speech recognition separately.
Multimodal Evaluation
Use multimodal evaluation for systems that combine voice with other modalities, such as images or documents.
Critical Metrics for Voice AI
Key metrics to track for healthcare voice systems:

| Metric | Target | Why It Matters |
|---|---|---|
| Triage Accuracy | > 95% | Core safety metric - correct urgency level |
| Red Flag Recall | > 99% | Must catch nearly all danger signs |
| Under-Triage Rate | < 1% | Missing urgent cases is unacceptable |
| Escalation Latency | < 60s | How quickly urgent cases are escalated |
| ASR Medical WER | < 5% | Transcription accuracy for clinical terms |
| Symptom F1 | > 90% | Correct symptom extraction |
| Guideline Adherence | > 90% | Following clinical protocols |
| Patient Satisfaction | > 4.0/5 | Empathy, clarity, helpfulness |
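
Several of the metrics in the table can be computed from labeled calls with a few lines of code. The following is a sketch under stated assumptions, not Rubric's implementation: the function names and the four-level `URGENCY_ORDER` scale are hypothetical, under-triage is defined here as the share of all calls assigned a lower urgency than the reference label, and word error rate is computed over all words rather than medical terms only.

```python
from typing import Sequence, Set

# Hypothetical urgency scale, lowest to highest; substitute your own levels.
URGENCY_ORDER = ["self_care", "routine", "urgent", "emergent"]


def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference transcript must not be empty")
    # prev[j] is the edit distance between the reference words consumed
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1] / len(ref)


def triage_accuracy(predicted: Sequence[str], actual: Sequence[str]) -> float:
    """Share of calls where the predicted urgency equals the reference."""
    if not actual or len(predicted) != len(actual):
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(p == a for p, a in zip(predicted, actual)) / len(actual)


def under_triage_rate(predicted: Sequence[str], actual: Sequence[str]) -> float:
    """Share of calls assigned a lower urgency than the reference."""
    if not actual or len(predicted) != len(actual):
        raise ValueError("inputs must be non-empty and of equal length")
    rank = {level: i for i, level in enumerate(URGENCY_ORDER)}
    under = sum(rank[p] < rank[a] for p, a in zip(predicted, actual))
    return under / len(actual)


def red_flag_recall(detected: Set[str], present: Set[str]) -> float:
    """Share of red flags actually present that the system detected."""
    if not present:
        return 1.0  # nothing to miss
    return len(detected & present) / len(present)


# Illustrative labels only, not real evaluation results.
predicted = ["emergent", "routine", "urgent", "routine"]
actual = ["emergent", "urgent", "urgent", "routine"]
print(triage_accuracy(predicted, actual))    # 0.75
print(under_triage_rate(predicted, actual))  # 0.25
print(word_error_rate("patient takes metoprolol daily",
                      "patient takes metropolis daily"))  # 0.25
print(red_flag_recall({"chest pain"},
                      {"chest pain", "shortness of breath"}))  # 0.5
```

Triage accuracy alone hides the direction of the errors, which is why under-triage is tracked separately: in the example above, the single misclassified call is also an under-triaged one.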
Setting Up Clinician Review
Route flagged calls to physicians and nurses for expert review.
Integration Example
Complete example integrating Rubric into a voice triage pipeline.
Next Steps
Create Your First Evaluation
Run evaluators on your voice data
Clinician Review Workflow
Set up human-in-the-loop review
Transcript Formats
Supported audio and transcript formats
Evaluating LLMs
LLM-specific evaluation
