Judgeval
The main entry point for interacting with the Judgment platform.
Judgeval connects to your Judgment project and gives you access to
evaluations, datasets, and prompt versioning through
convenient properties.
Credentials are resolved in order: explicit arguments first, then
environment variables JUDGMENT_API_KEY, JUDGMENT_ORG_ID, and
JUDGMENT_API_URL.
Raises ValueError if any required credential or project_name is missing.
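For the environment-variable path, the credentials can be exported before constructing the client. A minimal sketch; the values below are placeholders, and JUDGMENT_API_URL is only needed when overriding the default endpoint:

```shell
# Placeholder values; substitute your real credentials.
export JUDGMENT_API_KEY="jdg_..."
export JUDGMENT_ORG_ID="org_..."
# Optional: override the default API endpoint.
# export JUDGMENT_API_URL="..."
```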
Minimal setup (credentials from environment variables):

```python
from judgeval import Judgeval

client = Judgeval(project_name="search-assistant")
```

Explicit credentials:

```python
client = Judgeval(
    project_name="search-assistant",
    api_key="jdg_...",
    organization_id="org_...",
)
```

Once initialized, use the evaluation, datasets, and prompts properties:

```python
eval_runner = client.evaluation.create()
dataset = client.datasets.get(name="golden-set")
prompt = client.prompts.get(name="system-prompt", tag="production")
```

Attributes
evaluation
Access evaluations for scoring examples with hosted or custom judges.
Use .create() to get an Evaluation you
can call .run() on.
```python
eval_runner = client.evaluation.create()
results = eval_runner.run(
    examples=examples,
    scorers=["faithfulness", "answer_relevancy"],
    eval_run_name="nightly-eval",
)
```

datasets
Manage datasets of evaluation examples.
Use .create(), .get(), or .list() to work
with datasets.
```python
dataset = client.datasets.create(
    name="golden-set",
    examples=[
        Example.create(input="What is 2+2?", expected_output="4"),
    ],
)
```

prompts
Manage versioned prompt templates with tagging support.
Use .create(), .get(), .tag(), or .list()
to work with prompts.
```python
prompt = client.prompts.create(
    name="system-prompt",
    prompt="You are a helpful assistant for {{product}}.",
    tags=["v1"],
)
compiled = prompt.compile(product="Acme Search")
```
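The {{product}} placeholder is filled in by compile using keyword arguments. As an illustration of double-brace substitution, here is a minimal standalone sketch (a hypothetical helper, not the library's actual implementation):

```python
import re


def compile_template(template: str, **values: str) -> str:
    # Replace each {{name}} placeholder with the matching keyword value.
    # Raises KeyError if a placeholder has no corresponding argument.
    return re.sub(r"\{\{(\w+)\}\}", lambda m: values[m.group(1)], template)


result = compile_template(
    "You are a helpful assistant for {{product}}.",
    product="Acme Search",
)
print(result)
# → You are a helpful assistant for Acme Search.
```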