Evaluations
Create Evaluation
POST
Create a new evaluation to assess a dataset using one or more evaluators. The evaluation runs asynchronously — use the Get Evaluation endpoint to check status.
Authentication
Bearer token with write scope. Example: Bearer gr_live_xxxxxxxx
Request Body
A descriptive name for this evaluation run. Useful for identifying evaluations in the dashboard.
The project ID to run this evaluation in. Must be a valid project you have access to.
The dataset ID containing samples to evaluate. All samples in the dataset will be processed.
List of evaluator configurations to run against each sample.
Arbitrary key-value pairs to attach to this evaluation for filtering and organization.
Whether to run the evaluation asynchronously.
Evaluator Types
See the Evaluators Reference for a complete list of built-in evaluator types and their configuration options.
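As a rough illustration of how the request body fields above might fit together, the following Python sketch assembles the headers and JSON payload for a create request. The field names (`name`, `project_id`, `dataset_id`, `evaluators`, `metadata`, `async`), the evaluator configuration shape, and the example IDs are all assumptions made for illustration; this page does not show the actual keys or the endpoint path, so check them against the full reference before use. The sketch only builds the request and does not send it.

```python
import json


def build_create_evaluation_request(token, name, project_id, dataset_id,
                                    evaluators, metadata=None, run_async=True):
    """Return (headers, json_body) for a create-evaluation request.

    Every JSON key used here is a hypothetical name chosen to mirror the
    field descriptions in the reference; the real API may differ.
    """
    # The endpoint assesses a dataset using one or more evaluators,
    # so an empty list is rejected before any request is made.
    if not evaluators:
        raise ValueError("at least one evaluator configuration is required")

    headers = {
        # Bearer token with write scope, as described under Authentication.
        "Authorization": f"Bearer {token}",
        "Content-Type": "application/json",
    }
    body = {
        "name": name,                      # assumed key
        "project_id": project_id,          # assumed key
        "dataset_id": dataset_id,          # assumed key
        "evaluators": list(evaluators),    # assumed key and item shape
        "metadata": dict(metadata or {}),  # assumed key
        "async": run_async,                # assumed key
    }
    return headers, json.dumps(body)


# Example usage with placeholder values (none of these IDs are real).
headers, payload = build_create_evaluation_request(
    token="gr_live_xxxxxxxx",
    name="nightly-regression",
    project_id="proj_example",
    dataset_id="ds_example",
    evaluators=[{"type": "example_evaluator"}],
    metadata={"team": "search"},
)
```

Keeping request construction separate from the HTTP call makes the payload easy to inspect or log before it is sent with whichever HTTP client you prefer.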
Response
Unique identifier for the evaluation. Example: eval_def456
Always evaluation
The name provided for this evaluation
The project ID this evaluation belongs to
The dataset ID being evaluated
Current status: pending, running, completed, failed, cancelled
The evaluator configurations for this evaluation
ISO 8601 timestamp when the evaluation was created
ISO 8601 timestamp when processing started (null if pending)
ISO 8601 timestamp when processing completed (null if not finished)
Custom metadata attached to this evaluation
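Because the evaluation runs asynchronously, a client typically polls until the status settles. The sketch below shows one way to do that in Python, under several assumptions: the response keys `status`, `started_at`, and `completed_at` are hypothetical names for the fields described above, `completed`, `failed`, and `cancelled` are treated as final states, and `fetch` is a caller-supplied function that returns the evaluation as a dict (for example, a wrapper around the Get Evaluation endpoint).

```python
import time
from datetime import datetime

# Assumption: any status other than pending or running is final.
TERMINAL_STATUSES = {"completed", "failed", "cancelled"}


def parse_timestamp(value):
    """Parse an ISO 8601 timestamp string; null fields map to None."""
    if value is None:
        return None
    # Older Python versions reject a trailing "Z", so normalize it first.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def wait_for_evaluation(fetch, evaluation_id, interval=2.0, max_polls=30,
                        sleep=time.sleep):
    """Call fetch(evaluation_id) until the evaluation reaches a final status.

    `fetch` is a hypothetical callable supplied by the caller; it must
    return a dict that contains a "status" key.
    """
    for attempt in range(max_polls):
        evaluation = fetch(evaluation_id)
        if evaluation["status"] in TERMINAL_STATUSES:
            return evaluation
        # Do not sleep after the last allowed poll.
        if attempt < max_polls - 1:
            sleep(interval)
    raise TimeoutError(f"{evaluation_id} did not finish after {max_polls} polls")


def run_duration_seconds(evaluation):
    """Seconds between processing start and completion, or None if unknown."""
    started = parse_timestamp(evaluation.get("started_at"))
    completed = parse_timestamp(evaluation.get("completed_at"))
    if started is None or completed is None:
        return None
    return (completed - started).total_seconds()
```

Passing `fetch` and `sleep` in as arguments keeps the loop independent of any particular HTTP client and lets it be exercised in tests without network access or real delays.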
Related Endpoints
Get Evaluation
Retrieve evaluation details and results
List Evaluations
List all evaluations in a project
Get Status
Check evaluation progress
