SDK Overview
Reference documentation for the Judgeval SDK
The Judgeval SDK is a toolkit for evaluating, monitoring, and optimizing AI agents. Use it to trace agent execution in production, run evaluations with custom scoring rubrics, manage test datasets, and version prompts with integrated evaluation workflows.
Core SDK Components
Judgeval: Primary client for evaluation, datasets, and prompt management
Tracer: Capture and monitor agent execution traces for debugging and analysis
Dataset: Manage collections of examples for batch evaluation
Judge: Create custom local scorers with typed responses
Prompt: Version and manage prompts with integrated evaluation
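To make the idea of a custom local scorer with a typed response more concrete, here is a minimal sketch in plain Python. It does not use the Judgeval SDK and does not reflect its actual API: the names `Example`, `BinaryResponse`, `max_length_scorer`, and `evaluate` are all hypothetical, chosen only to illustrate the pattern of scoring a small dataset of examples and getting back structured results instead of bare numbers.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class Example:
    """One evaluation case (hypothetical type, not the SDK's)."""
    input: str
    actual_output: str


@dataclass(frozen=True)
class BinaryResponse:
    """A typed scorer response: a pass/fail verdict plus a reason."""
    passed: bool
    reason: str


# A scorer is any callable that maps an Example to a typed response.
Scorer = Callable[[Example], BinaryResponse]


def max_length_scorer(limit: int) -> Scorer:
    """Build a local scorer that passes outputs of at most `limit` characters."""

    def score(example: Example) -> BinaryResponse:
        length = len(example.actual_output)
        if length <= limit:
            return BinaryResponse(True, f"{length} chars is within the {limit}-char limit")
        return BinaryResponse(False, f"{length} chars exceeds the {limit}-char limit")

    return score


def evaluate(dataset: List[Example], scorer: Scorer) -> List[BinaryResponse]:
    """Run one scorer over every example in a dataset (batch evaluation)."""
    return [scorer(example) for example in dataset]


if __name__ == "__main__":
    dataset = [
        Example(input="Say hi", actual_output="Hello!"),
        Example(input="Summarize the report", actual_output="x" * 50),
    ]
    for result in evaluate(dataset, max_length_scorer(limit=20)):
        print(result.passed, "-", result.reason)
```

Returning a small typed object rather than a raw float keeps the verdict and its explanation together, which tends to make batch results easier to filter and inspect. Consult the Judge and Dataset reference pages for the SDK's real classes and signatures.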