
AgentJudgeFactory

Create and update prompt-based Agent Judges on the Judgment platform.

Access it via client.agent_judges; you don't instantiate it directly.

from judgeval import Judgeval  # import path assumed; adjust to your installed SDK version

client = Judgeval(project_name="my-project")

judge = client.agent_judges.create(
    name="helpfulness",
    prompt="Rate the assistant's helpfulness on a scale of 0 to 1.",
    model="gpt-5.2",
    score_type="numeric",
)

client.agent_judges.update(
    judge_id=judge.judge_id,
    prompt="Updated rubric prompt.",
)
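Both create() and update() are documented below as returning None when the project is unresolved, so the example above could fail at judge.judge_id. The following is a minimal sketch of a guard, not part of the SDK: require_judge and the FakeJudge stand-in are hypothetical names used only so the snippet runs on its own.

```python
from dataclasses import dataclass
from typing import Optional, TypeVar

T = TypeVar("T")


def require_judge(result: Optional[T], action: str) -> T:
    """Return result unchanged, or raise if the SDK call returned None."""
    if result is None:
        raise RuntimeError(f"{action} returned None; check that the project exists")
    return result


@dataclass
class FakeJudge:
    """Hypothetical stand-in for the SDK's AgentJudge, for illustration only."""
    judge_id: str


# With a real client this would wrap client.agent_judges.create(...).
judge = require_judge(FakeJudge(judge_id="judge-123"), "create")
print(judge.judge_id)
```

Failing fast here keeps a None result from surfacing later as an AttributeError on judge.judge_id.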

__init__()

def __init__(client, project_id, project_name):

Parameters

client (required): JudgmentSyncClient

project_id (required): Optional[str]

project_name (required): str


create()

Create a new Agent Judge.

def create(*, name, prompt, model, score_type, description=None, judge_description=None, categories=None, min_score=None, max_score=None) -> Optional[AgentJudge]:

Parameters

name (required): str
Unique judge name within the project.

prompt (required): str
Rubric prompt template used by the agent judge.

model (required): str
LiteLLM model id (e.g. "gpt-5.2").

score_type (required): ScoreType
One of "numeric", "binary", or "categorical".

description: Optional[str]
Description stored on the underlying scorer version.
Default: None

judge_description: Optional[str]
Description shown in the UI.
Default: None

categories: Optional[List[Dict[str, Any]]]
Choice list for categorical judges.
Default: None

min_score: Optional[float]
Lower bound for numeric judges (defaults to 0).
Default: None

max_score: Optional[float]
Upper bound for numeric judges (defaults to 1).
Default: None

Returns

Optional[AgentJudge] - The created AgentJudge, or None if the project is unresolved.
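The parameter descriptions above imply that categories belongs to categorical judges and that min_score and max_score belong to numeric ones. As a rough sketch, a local pre-check could catch mismatches before calling create(). The helper build_create_kwargs is hypothetical, and the rules it enforces are assumptions drawn from those descriptions; the server may validate differently.

```python
from typing import Any, Dict, List, Optional

VALID_SCORE_TYPES = {"numeric", "binary", "categorical"}


def build_create_kwargs(
    name: str,
    prompt: str,
    model: str,
    score_type: str,
    categories: Optional[List[Dict[str, Any]]] = None,
    min_score: Optional[float] = None,
    max_score: Optional[float] = None,
) -> Dict[str, Any]:
    """Validate arguments locally (assumed rules) and return only those that were set."""
    if score_type not in VALID_SCORE_TYPES:
        raise ValueError(f"score_type must be one of {sorted(VALID_SCORE_TYPES)}")
    if score_type == "categorical" and not categories:
        raise ValueError("categorical judges need a non-empty categories list")
    if score_type != "categorical" and categories is not None:
        raise ValueError("categories only applies to categorical judges")
    if score_type != "numeric" and (min_score is not None or max_score is not None):
        raise ValueError("min_score and max_score only apply to numeric judges")
    # Documented defaults for numeric bounds are 0 and 1 when omitted.
    low = 0.0 if min_score is None else min_score
    high = 1.0 if max_score is None else max_score
    if score_type == "numeric" and low >= high:
        raise ValueError("min_score must be less than max_score")

    kwargs: Dict[str, Any] = {
        "name": name,
        "prompt": prompt,
        "model": model,
        "score_type": score_type,
    }
    optional = {"categories": categories, "min_score": min_score, "max_score": max_score}
    kwargs.update({k: v for k, v in optional.items() if v is not None})
    return kwargs


kwargs = build_create_kwargs(
    name="helpfulness",
    prompt="Rate the assistant's helpfulness on a scale of 0 to 5.",
    model="gpt-5.2",
    score_type="numeric",
    max_score=5.0,
)
print(kwargs)
# With a real client: judge = client.agent_judges.create(**kwargs)
```

Omitting unset optional arguments lets the documented defaults apply instead of sending explicit None values.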


update()

Update an existing Agent Judge.

Passing any of prompt, model, categories, min_score, or max_score writes a new version of the underlying prompt scorer. When target_major_version and target_minor_version are omitted, the server increments the latest version's minor number by 1, matching the UI's default "save" behaviour.

def update(*, judge_id, prompt=None, model=None, score_type=None, description=None, judge_description=None, categories=None, min_score=None, max_score=None, source_major_version=None, source_minor_version=None, target_major_version=None, target_minor_version=None) -> Optional[AgentJudge]:

Parameters

judge_id (required): str
ID of the judge to update.

prompt: Optional[str]
New rubric prompt template.
Default: None

model: Optional[str]
New LiteLLM model id.
Default: None

score_type: Optional[ScoreType]
New score type (numeric, binary, categorical).
Default: None

description: Optional[str]
New scorer-version description.
Default: None

judge_description: Optional[str]
New UI-facing description.
Default: None

categories: Optional[List[Dict[str, Any]]]
New choices for categorical judges.
Default: None

min_score: Optional[float]
New lower bound for numeric judges.
Default: None

max_score: Optional[float]
New upper bound for numeric judges.
Default: None

source_major_version: Optional[int]
Major version to copy unspecified fields from. Defaults to the latest version.
Default: None

source_minor_version: Optional[int]
Minor version to copy unspecified fields from. Defaults to the latest version.
Default: None

target_major_version: Optional[int]
Major version to write to. Defaults to the current latest major.
Default: None

target_minor_version: Optional[int]
Minor version to write to. Defaults to latest minor + 1.
Default: None

Returns

Optional[AgentJudge] - The updated AgentJudge, or None if the project is unresolved.
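The versioning rules described for update() can be pictured with a small local model. This is only a sketch of the documented defaults, not the server's implementation: writes_new_version and resolve_target_version are hypothetical names, and the sketch applies the "latest minor + 1" default literally even when only target_major_version is given, a case the description does not cover.

```python
from typing import Optional, Tuple

# Fields the description says cause a new scorer version to be written.
VERSIONED_FIELDS = ("prompt", "model", "categories", "min_score", "max_score")


def writes_new_version(**changes: object) -> bool:
    """True if any version-triggering field is set to a non-None value."""
    return any(changes.get(field) is not None for field in VERSIONED_FIELDS)


def resolve_target_version(
    latest: Tuple[int, int],
    target_major: Optional[int] = None,
    target_minor: Optional[int] = None,
) -> Tuple[int, int]:
    """Apply the documented defaults: latest major, latest minor + 1."""
    latest_major, latest_minor = latest
    major = latest_major if target_major is None else target_major
    minor = latest_minor + 1 if target_minor is None else target_minor
    return major, minor


print(writes_new_version(prompt="Updated rubric prompt."))  # True
print(writes_new_version(judge_description="Shown in the UI"))  # False
print(resolve_target_version((1, 3)))  # (1, 4)
print(resolve_target_version((1, 3), target_major=2, target_minor=0))  # (2, 0)
```

Under this reading, changing only description or judge_description would not be expected to create a new scorer version.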