AgentJudgeFactory
Create and update prompt-based Agent Judges on the Judgment platform.
Access it through client.agent_judges; you do not instantiate it directly.
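from judgeval import Judgeval

client = Judgeval(project_name="my-project")

# Create a numeric judge that scores helpfulness.
judge = client.agent_judges.create(
    name="helpfulness",
    prompt="Rate the assistant's helpfulness on a scale of 0 to 1.",
    model="gpt-5.2",
    score_type="numeric",
)

# Changing the prompt writes a new version of the underlying prompt scorer.
client.agent_judges.update(
    judge_id=judge.judge_id,
    prompt="Updated rubric prompt.",
)
__init__()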
def __init__(client, project_id, project_name):
Parameters
client
required:JudgmentSyncClient
project_id
required:Optional[str]
project_name
required:str
create()
Create a new Agent Judge.
def create(*, name, prompt, model, score_type, description=None, judge_description=None, categories=None, min_score=None, max_score=None) -> typing.Optional[AgentJudge]:
Parameters
name
required:str
Unique judge name within the project.
prompt
required:str
Rubric prompt template used by the agent judge.
model
required:str
LiteLLM model id (e.g. "gpt-5.2").
score_type
required:ScoreType
One of "numeric", "binary", or "categorical".
description
:Optional[str]
Description stored on the underlying scorer version.
None
judge_description
:Optional[str]
Description shown in the UI.
None
categories
:Optional[List[Dict[str, Any]]]
Choice list for categorical judges.
None
min_score
:Optional[float]
Lower bound for numeric judges (defaults to 0).
None
max_score
:Optional[float]
Upper bound for numeric judges (defaults to 1).
None
Returns
typing.Optional[AgentJudge] - The created AgentJudge, or None if the project is unresolved.
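The bounds and the None return described above could be handled in calling code along the following lines. This is only a sketch: the helpers numeric_judge_kwargs and create_numeric_judge are hypothetical names, not part of judgeval, and the check that min_score is below max_score is an assumption made here rather than documented SDK behaviour. The client argument is assumed to be the Judgeval client from the example at the top of this page.

```python
from typing import Any, Dict, Optional


def numeric_judge_kwargs(
    name: str,
    prompt: str,
    model: str,
    min_score: Optional[float] = None,
    max_score: Optional[float] = None,
) -> Dict[str, Any]:
    """Build create() keyword arguments for a numeric judge (hypothetical helper)."""
    # Documented defaults: 0 for the lower bound, 1 for the upper bound.
    low = 0.0 if min_score is None else min_score
    high = 1.0 if max_score is None else max_score
    # Assumption: a local sanity check, not something the SDK is known to enforce.
    if low >= high:
        raise ValueError(f"min_score ({low}) must be below max_score ({high})")

    kwargs: Dict[str, Any] = {
        "name": name,
        "prompt": prompt,
        "model": model,
        "score_type": "numeric",
    }
    # Only pass bounds the caller set explicitly; omitted ones stay at None.
    if min_score is not None:
        kwargs["min_score"] = min_score
    if max_score is not None:
        kwargs["max_score"] = max_score
    return kwargs


def create_numeric_judge(client: Any, **kwargs: Any) -> Any:
    """Call create() and fail loudly on a None result (hypothetical helper)."""
    judge = client.agent_judges.create(**numeric_judge_kwargs(**kwargs))
    if judge is None:
        # Per the Returns note, None means the project is unresolved.
        raise RuntimeError("create() returned None: the project is unresolved")
    return judge
```

Raising on None keeps a later judge.judge_id access from failing with a less descriptive AttributeError.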
update()
Update an existing Agent Judge.
Passing any of prompt, model, categories, min_score, or max_score writes a new version of the underlying prompt scorer.
When target_major_version and target_minor_version are omitted, the server increments the latest version's minor number by 1, matching the UI's default "save" behaviour.
def update(*, judge_id, prompt=None, model=None, score_type=None, description=None, judge_description=None, categories=None, min_score=None, max_score=None, source_major_version=None, source_minor_version=None, target_major_version=None, target_minor_version=None) -> typing.Optional[AgentJudge]:
Parameters
judge_id
required:str
ID of the judge to update.
prompt
:Optional[str]
New rubric prompt template.
None
model
:Optional[str]
New LiteLLM model id.
None
score_type
:Optional[ScoreType]
New score type (numeric, binary, categorical).
None
description
:Optional[str]
New scorer-version description.
None
judge_description
:Optional[str]
New UI-facing description.
None
categories
:Optional[List[Dict[str, Any]]]
New choices for categorical judges.
None
min_score
:Optional[float]
New lower bound for numeric judges.
None
max_score
:Optional[float]
New upper bound for numeric judges.
None
source_major_version
:Optional[int]
Major version to copy unspecified fields from. Defaults to the latest version.
None
source_minor_version
:Optional[int]
Minor version to copy unspecified fields from. Defaults to the latest version.
None
target_major_version
:Optional[int]
Major version to write to. Defaults to the current latest major.
None
target_minor_version
:Optional[int]
Minor version to write to. Defaults to latest minor + 1.
None
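The target-version defaults can be illustrated with a small local function. This is a sketch of the documented rule only: the server performs the real resolution, and resolve_target_version is a hypothetical name. The text does not say what happens when only one of the two target parameters is given, so applying each default independently is an assumption.

```python
from typing import Optional, Tuple


def resolve_target_version(
    latest_major: int,
    latest_minor: int,
    target_major_version: Optional[int] = None,
    target_minor_version: Optional[int] = None,
) -> Tuple[int, int]:
    """Illustrate the documented defaults for the version that update() writes."""
    # Major defaults to the current latest major.
    major = latest_major if target_major_version is None else target_major_version
    # Minor defaults to latest minor + 1.
    minor = latest_minor + 1 if target_minor_version is None else target_minor_version
    return major, minor


# Both targets omitted: the minor version is bumped by 1.
print(resolve_target_version(2, 3))  # (2, 4)
# Both targets given: they are used as-is.
print(resolve_target_version(2, 3, target_major_version=3, target_minor_version=0))  # (3, 0)
```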
Returns
typing.Optional[AgentJudge] - The updated AgentJudge, or None if the project is unresolved.
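Before calling update(), it can help to know whether the arguments include one of the fields that the description lists as writing a new scorer version. A minimal sketch, assuming only what that description states; touches_versioned_field is a hypothetical helper, and a False result means only that none of the five listed fields was passed, since the text does not say how the other fields are handled.

```python
from typing import Any

# Fields that, per the description of update(), write a new scorer version.
VERSIONED_FIELDS = ("prompt", "model", "categories", "min_score", "max_score")


def touches_versioned_field(**update_kwargs: Any) -> bool:
    """Return True if any listed version-writing field is set to a non-None value."""
    return any(update_kwargs.get(field) is not None for field in VERSIONED_FIELDS)


# A prompt change is listed as writing a new version.
print(touches_versioned_field(judge_id="abc", prompt="Updated rubric prompt."))  # True
# judge_description is not among the listed fields.
print(touches_versioned_field(judge_id="abc", judge_description="Shown in the UI"))  # False
```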