SDK Overview

The Judgeval SDK provides a complete toolkit for evaluating, monitoring, and optimizing AI agents. Trace agent execution in production, run evaluations with custom scoring rubrics, manage test datasets, and version prompts with integrated evaluation workflows.

Core SDK Components

Judgeval

Primary client for evaluation, datasets, and prompt management

Tracer

Capture and monitor agent execution traces for debugging and analysis

Dataset

Manage collections of examples for batch evaluation

Judge

Create custom local scorers with typed responses

Prompt

Version and manage prompts with integrated evaluation

Judgeval

The main client for running evaluations and managing projects

Tracer

Capture and monitor agent execution traces for debugging and analysis

JudgmentTracerProvider

Global singleton managing tracer registration and context propagation
