Structural Grader
Deterministic Zod grader.
The structural grader validates an agent's output against a caller-supplied Zod schema. It runs synchronously, requires no LLM call, and is the only grader that is required in BFSI strict mode for development and QA sessions.
Use the structural grader for any assertion that can be expressed as a schema - required fields, types, value constraints, format checks. Reserve model graders for semantic quality questions that cannot be reduced to a schema.
Grader ID
structural/zod-v1
How it works
StructuralZodGrader calls schema.safeParse(ctx.output). The result is binary:
| Outcome | Score | Pass |
|---|---|---|
| result.success === true | 1.0 | true (when threshold ≤ 1.0) |
| result.success === false | 0.0 | false (unless threshold is 0) |
The grader surfaces Zod's error message in reasoning when validation fails, giving the operator actionable detail on what the agent output violated.
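The grading step described above can be sketched as follows. This is an illustrative re-implementation, not the actual StructuralZodGrader source: the function name gradeStructural, the SafeParseable interface, and the stand-in approvedSchema are all assumptions. The interface mirrors the shape of Zod's safeParse result so that the sketch runs without importing zod, and the pass rule (score >= threshold) is inferred from the outcome table.

```typescript
// Minimal structural types mirroring the shape of Zod's safeParse result,
// so this sketch runs without importing zod. (Assumption: illustrative only.)
type Issue = { path: (string | number)[]; message: string };
type SafeParseResult =
  | { success: true; data: unknown }
  | { success: false; error: { issues: Issue[] } };
interface SafeParseable {
  safeParse(value: unknown): SafeParseResult;
}

interface GradeResult {
  graderId: string;
  score: number;
  pass: boolean;
  reasoning: string;
}

// Hypothetical re-implementation of the binary grading step.
function gradeStructural(
  schema: SafeParseable,
  output: unknown,
  threshold = 1.0,
): GradeResult {
  const result = schema.safeParse(output);
  const score = result.success ? 1.0 : 0.0;
  // On failure, join every issue as `<message> at "<path>"`.
  const reasoning = result.success
    ? 'Output matches schema.'
    : result.error.issues
        .map((issue) => `${issue.message} at "${issue.path.join('.')}"`)
        .join('; ');
  // Assumed pass rule: pass when score >= threshold.
  return { graderId: 'structural/zod-v1', score, pass: score >= threshold, reasoning };
}

// Stand-in schema for the demo: requires a boolean `approved` field.
const approvedSchema: SafeParseable = {
  safeParse(value) {
    const v = value as { approved?: unknown } | null;
    if (v !== null && typeof v === 'object' && typeof v.approved === 'boolean') {
      return { success: true, data: value };
    }
    return {
      success: false,
      error: { issues: [{ path: ['approved'], message: 'Expected boolean' }] },
    };
  },
};

const ok = gradeStructural(approvedSchema, { approved: true });
const bad = gradeStructural(approvedSchema, { approved: 'yes' });
```

Here ok carries score 1.0 with pass: true, while bad carries score 0.0, pass: false, and a reasoning string built from the single reported issue.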
Usage
import { z } from 'zod'
import { StructuralZodGrader } from '@/lib/evals/graders/structural-zod'
// Define the schema the agent output must satisfy
const PrReviewSchema = z.object({
  summary: z.string().min(10),
  approved: z.boolean(),
  concerns: z.array(z.string()),
})
const grader = new StructuralZodGrader(PrReviewSchema)
// Evaluate
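const result = await grader.evaluate({
  input: { prUrl: 'https://github.com/...' },
  // summary must be at least 10 characters to satisfy z.string().min(10)
  output: { summary: 'Looks good to merge, no blocking concerns.', approved: true, concerns: [] },
  traceRef: 'evt_abc123',
})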
// result:
// {
//   graderId: 'structural/zod-v1',
//   score: 1.0,
//   pass: true,
//   reasoning: 'Output matches schema.'
// }
Failure output
const result = await grader.evaluate({
  input: { prUrl: '...' },
  output: { approved: 'yes' }, // string instead of boolean, missing fields
  traceRef: 'evt_abc123',
})
// result:
// {
//   graderId: 'structural/zod-v1',
//   score: 0.0,
//   pass: false,
//   reasoning: 'Required at "summary"; Expected boolean, received string at "approved"; Required at "concerns"'
// }
Constructor
new StructuralZodGrader(schema: ZodTypeAny, threshold?: number)
| Parameter | Type | Default | Notes |
|---|---|---|---|
| schema | ZodTypeAny | required | Any Zod schema: objects, arrays, unions, transforms |
| threshold | number | 1.0 | Minimum score for pass: true. Because the score is binary, any value above 0 and up to 1.0 behaves like 1.0. |
The structural grader is inherently binary: score is always exactly 0.0 or 1.0. Consistent with the outcome table above, only a threshold of 0 makes pass: true on a full validation failure, which is unusual but valid (for example, when the grader is used purely for scoring without a hard pass/fail gate).
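As a quick check on how threshold interacts with a binary score, the sketch below assumes the pass rule implied by the outcome table (pass when score >= threshold). The helper name passes is hypothetical and not part of the grader's API.

```typescript
// Assumed pass rule: a result passes when score >= threshold.
// `passes` is a hypothetical helper, not part of StructuralZodGrader.
function passes(score: 0 | 1, threshold = 1.0): boolean {
  return score >= threshold;
}

// Evaluate both possible scores against a few thresholds.
const thresholdTable = [1.0, 0.5, 0].map((threshold) => ({
  threshold,
  onSuccess: passes(1, threshold), // schema validated, score 1.0
  onFailure: passes(0, threshold), // schema rejected, score 0.0
}));
```

Under this assumption, thresholds of 1.0 and 0.5 produce identical pass values for both scores, and only a threshold of 0 lets a failed validation pass.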
GradeResult schema
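The result examples above show these fields:
| Prop | Type |
|---|---|
| graderId | string |
| score | number |
| pass | boolean |
| reasoning | string |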
Registering in an AgentCard
Reference the grader by ID in the AgentCard's evalConfig.graders list:
evalConfig:
  enabled: true
  graders:
    - structural/zod-v1
    - model-grader/llm-judge-v1 # optional - runs after structural
Graders run in the listed order, so the structural grader runs first (sync, zero cost) before any LLM grader is invoked.
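One way to picture the ordering guarantee is a sequential runner that awaits each grader before starting the next. Everything in this sketch is an assumption for illustration: the Grader and EvalContext interfaces, the runInListedOrder function, and both stub graders are hypothetical, and the real runner may differ.

```typescript
// Hypothetical types; the real eval runner's interfaces may differ.
interface EvalContext { input: unknown; output: unknown; traceRef: string }
interface GradeResult { graderId: string; score: number; pass: boolean; reasoning: string }
interface Grader {
  id: string;
  evaluate(ctx: EvalContext): GradeResult | Promise<GradeResult>;
}

// Run graders strictly one after another, preserving the configured order.
async function runInListedOrder(graders: Grader[], ctx: EvalContext): Promise<GradeResult[]> {
  const results: GradeResult[] = [];
  for (const grader of graders) {
    // `await` also accepts the synchronous result of a structural grader.
    results.push(await grader.evaluate(ctx));
  }
  return results;
}

// Stub graders that record when they are invoked.
const callOrder: string[] = [];
const structuralStub: Grader = {
  id: 'structural/zod-v1',
  evaluate() {
    callOrder.push('structural/zod-v1');
    return { graderId: 'structural/zod-v1', score: 1.0, pass: true, reasoning: 'stub' };
  },
};
const llmStub: Grader = {
  id: 'model-grader/llm-judge-v1',
  async evaluate() {
    callOrder.push('model-grader/llm-judge-v1');
    await new Promise<void>((resolve) => setTimeout(resolve, 1)); // simulate network latency
    return { graderId: 'model-grader/llm-judge-v1', score: 1.0, pass: true, reasoning: 'stub' };
  },
};

const pending = runInListedOrder(
  [structuralStub, llmStub],
  { input: {}, output: {}, traceRef: 'evt_abc123' },
);
```

Once pending resolves, callOrder lists the structural grader before the LLM grader, matching the order given in the graders list.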
Performance characteristics
| Property | Value |
|---|---|
| Mode | sync |
| LLM call | None |
| Marginal cost | Zero |
| Latency | Microseconds (Zod safeParse) |
| Throughput | Suitable for high-frequency regression suites |
Because the structural grader is zero-cost and synchronous, it is well-suited for regression test suites that run on every session completion.
BFSI strict mode
In BFSI strict mode, the structural grader is required for the development and qa session types. Dispatch is blocked for sessions of those types if no structural grader is configured in the AgentCard's evalConfig.
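The dispatch rule can be read as a simple predicate. The sketch below is hypothetical: the canDispatch function, the EvalConfig shape, and the structural/ ID-prefix check are assumptions made for illustration, not the actual dispatcher logic.

```typescript
// Hypothetical shapes; the real AgentCard and dispatcher types may differ.
interface EvalConfig { enabled: boolean; graders: string[] }

// Session types for which BFSI strict mode requires a structural grader.
const STRUCTURAL_REQUIRED_FOR = new Set(['development', 'qa']);

function canDispatch(
  sessionType: string,
  evalConfig: EvalConfig | undefined,
  bfsiStrict: boolean,
): boolean {
  // Outside strict mode, or for other session types, this rule does not apply.
  if (!bfsiStrict || !STRUCTURAL_REQUIRED_FOR.has(sessionType)) return true;
  // Assumption: structural graders are identified by a `structural/` ID prefix.
  return (evalConfig?.graders ?? []).some((id) => id.startsWith('structural/'));
}

const llmOnly: EvalConfig = { enabled: true, graders: ['model-grader/llm-judge-v1'] };
const withStructural: EvalConfig = {
  enabled: true,
  graders: ['structural/zod-v1', 'model-grader/llm-judge-v1'],
};

const blocked = canDispatch('qa', llmOnly, true);
const allowed = canDispatch('qa', withStructural, true);
```

In this sketch, blocked is false because the qa session has no structural grader configured, and allowed is true once structural/zod-v1 is listed.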
Related pages
- Eval Emission - how graders are triggered after session completion
- Model Grader - LLM-as-judge grader for semantic quality
- Human Grader - operator review queue for acceptance-critical sessions
- Eval Replay - re-run any grader against a frozen trace
- BFSI Eval Mode - mandatory structural grader requirements