Evaluation
Use this async workflow to start model evaluations and poll for completion.
Start async evaluation
POST /cr/finetune_model/evaluate_async
Queues an evaluation and returns immediately with a job_id.
| Field | Type | Required | Description |
|---|---|---|---|
| dataset_name | string | yes | Dataset id. |
| model_name | string | yes | Model name to evaluate. |
Returns
200 OK with the evaluation job identifier.
{
  "job_id": "5b6273d2-c7e6-4a8f-a8cd-c4b2f9be17cd",
  "status": "queued"
}
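A minimal Python sketch of the submit call, using only the standard library. Several details here are assumptions not stated in this reference: the base URL `https://api.example.com` is a placeholder, the fields are assumed to travel as a JSON request body, and no authentication headers are shown. The helper names `build_evaluate_request` and `start_evaluation` are hypothetical.

```python
import json
import urllib.request

# Assumption: placeholder host; this reference does not state the base URL.
BASE_URL = "https://api.example.com"


def build_evaluate_request(dataset_name, model_name, base_url=BASE_URL):
    """Build the POST request for /cr/finetune_model/evaluate_async."""
    # Assumption: the two required fields are sent as a JSON body.
    payload = json.dumps(
        {"dataset_name": dataset_name, "model_name": model_name}
    ).encode("utf-8")
    return urllib.request.Request(
        base_url + "/cr/finetune_model/evaluate_async",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def start_evaluation(dataset_name, model_name, base_url=BASE_URL,
                     opener=urllib.request.urlopen):
    """Submit the evaluation and return the job_id from the response body."""
    request = build_evaluate_request(dataset_name, model_name, base_url)
    # `opener` is injectable so the call can be exercised without a network.
    with opener(request) as response:
        body = json.load(response)
    return body["job_id"]
```

Calling `start_evaluation("your_dataset_name", "Generalist: Base")` would return the `job_id` string to pass to the status endpoint, provided the server accepts the request in this shape.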
Poll evaluation status
GET /cr/finetune_model/evaluate_status?job_id=...
Poll until status reaches completed or failed.
| Field | Type | Required | Description |
|---|---|---|---|
| job_id | query string | yes | Job id from async submit. |
Returns
200 OK with job status.
Terminal states include completed (with result) and failed (with error).
{
  "status": "completed",
  "result": {
    "dataset_name": "your_dataset_name",
    "model_name": "Generalist: Base",
    "split": "test",
    "correct": 81,
    "total": 100,
    "accuracy": 0.81,
    "judge_model": "gpt-5.4-nano"
  }
}
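The polling step could be sketched in Python as follows. This is an illustration under assumptions, not a prescribed client: the base URL is a placeholder, the polling interval and timeout are arbitrary choices, and the helper names `fetch_status` and `wait_for_evaluation` are hypothetical. Any status other than completed or failed is treated as "still in progress".

```python
import json
import time
import urllib.parse
import urllib.request

# Assumption: placeholder host; this reference does not state the base URL.
BASE_URL = "https://api.example.com"


def fetch_status(job_id, base_url=BASE_URL):
    """GET /cr/finetune_model/evaluate_status for one job and parse the JSON."""
    query = urllib.parse.urlencode({"job_id": job_id})
    url = f"{base_url}/cr/finetune_model/evaluate_status?{query}"
    with urllib.request.urlopen(url) as response:
        return json.load(response)


def wait_for_evaluation(job_id, fetch=fetch_status, interval=2.0,
                        timeout=600.0, sleep=time.sleep):
    """Poll until the job is completed or failed, or the timeout elapses."""
    deadline = time.monotonic() + timeout
    while True:
        body = fetch(job_id)
        status = body["status"]
        if status == "completed":
            return body["result"]
        if status == "failed":
            raise RuntimeError(f"evaluation {job_id} failed: {body.get('error')}")
        # Any other status (for example "queued") means the job is not done.
        if time.monotonic() >= deadline:
            raise TimeoutError(f"evaluation {job_id} still {status!r} after {timeout}s")
        sleep(interval)
```

With the sample response above, `wait_for_evaluation(job_id)["accuracy"]` would yield `0.81`, which matches `correct / total` (81 / 100). The `fetch` and `sleep` parameters are injectable so the loop can be tested without network access or real delays.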