
Judgeval

The main entry point for interacting with the Judgment platform.

Judgeval connects to your Judgment project and gives you access to evaluations, datasets, and prompt versioning through convenient properties.

Credentials are resolved in order: explicit arguments first, then environment variables JUDGMENT_API_KEY, JUDGMENT_ORG_ID, and JUDGMENT_API_URL.

Raises

ValueError: If any required credential or project_name is missing.
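The resolution order described above can be sketched in plain Python. This is an illustrative model of the documented behavior, not Judgeval's actual implementation; `resolve_credential` is a hypothetical helper:

```python
import os

def resolve_credential(explicit, env_var):
    """Sketch of the documented resolution order: an explicit argument
    wins, otherwise fall back to the environment variable; if neither
    is set, raise ValueError (hypothetical helper, not the real API)."""
    value = explicit if explicit is not None else os.environ.get(env_var)
    if not value:
        raise ValueError(
            f"Missing credential: pass it explicitly or set {env_var}"
        )
    return value

# Explicit arguments take precedence over the environment:
os.environ["JUDGMENT_API_KEY"] = "jdg_from_env"
assert resolve_credential("jdg_explicit", "JUDGMENT_API_KEY") == "jdg_explicit"
assert resolve_credential(None, "JUDGMENT_API_KEY") == "jdg_from_env"
```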

Minimal setup (credentials from environment variables):

from judgeval import Judgeval

client = Judgeval(project_name="search-assistant")

Explicit credentials:

client = Judgeval(
    project_name="search-assistant",
    api_key="jdg_...",
    organization_id="org_...",
)

Once initialized, use the evaluation, datasets, and prompts properties:

eval_runner = client.evaluation.create()
dataset = client.datasets.get(name="golden-set")
prompt = client.prompts.get(name="system-prompt", tag="production")

Attributes

evaluation

Access evaluations for scoring examples with hosted or custom judges.

Use .create() to get an Evaluation you can call .run() on.

eval_runner = client.evaluation.create()
results = eval_runner.run(
    examples=examples,
    scorers=["faithfulness", "answer_relevancy"],
    eval_run_name="nightly-eval",
)

datasets

Manage datasets of evaluation examples.

Use .create(), .get(), or .list() to work with datasets.

dataset = client.datasets.create(
    name="golden-set",
    examples=[
        Example.create(input="What is 2+2?", expected_output="4"),
    ],
)

prompts

Manage versioned prompt templates with tagging support.

Use .create(), .get(), .tag(), or .list() to work with prompts.

prompt = client.prompts.create(
    name="system-prompt",
    prompt="You are a helpful assistant for {{product}}.",
    tags=["v1"],
)
compiled = prompt.compile(product="Acme Search")
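The substitution that `compile()` performs on `{{...}}` placeholders can be sketched as a minimal mustache-style replacement. This is an illustrative model only, assuming simple keyword-for-placeholder substitution; `compile_template` is a hypothetical stand-in, not part of the Judgeval API:

```python
import re

def compile_template(template, **values):
    """Replace each {{name}} placeholder with the matching keyword
    argument (sketch of the assumed behavior of prompt.compile())."""
    return re.sub(
        r"\{\{\s*(\w+)\s*\}\}",
        lambda m: str(values[m.group(1)]),
        template,
    )

compiled = compile_template(
    "You are a helpful assistant for {{product}}.",
    product="Acme Search",
)
# compiled == "You are a helpful assistant for Acme Search."
```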
