
OfflineTestResult

The outcome of an offline test run.

Returned by client.offline_tests.run(). Contains the per-example scoring results plus run-level metadata.

Attributes

test_run_id: str

The test run ID.

status: str

Final run status.

ui_results_url: Optional[str]

Link to the results page in the dashboard.

Default: None

results: List[ScoringResult]

One ScoringResult per dataset example, with per-judge ScorerData entries. When a pass_condition_fn was supplied, each ScorerData.success carries the per-row outcome.

Default: field(default_factory=list)
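As a rough illustration of reading per-row outcomes from results, the sketch below walks each ScoringResult and collects the judges whose ScorerData.success is False. The classes are hypothetical stand-ins so the snippet runs on its own; the attribute names scorers_data and name are assumptions that this page does not confirm, so check them against the actual ScoringResult and ScorerData types.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


# Hypothetical stand-ins for the SDK types, defined here only so the
# sketch is self-contained. `scorers_data` and `name` are assumed names.
@dataclass
class ScorerData:
    name: str
    success: Optional[bool] = None


@dataclass
class ScoringResult:
    scorers_data: List[ScorerData] = field(default_factory=list)


def failed_rows(results: List[ScoringResult]) -> List[Tuple[int, str]]:
    """Return (row_index, judge_name) pairs whose success is explicitly False."""
    failures = []
    for index, result in enumerate(results):
        for scorer in result.scorers_data:
            # `is False` skips entries where success was never set (None).
            if scorer.success is False:
                failures.append((index, scorer.name))
    return failures


sample_results = [
    ScoringResult([ScorerData("correctness", True)]),
    ScoringResult([ScorerData("correctness", False), ScorerData("tone", True)]),
]
failures = failed_rows(sample_results)
```

Comparing with `is False` rather than `not scorer.success` keeps rows with no recorded outcome out of the failure list.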

agent_offline_trace_ids: Dict[str, str]

Mapping of example ID to the offline trace produced by the agent entrypoint (agent testing only).

Default: field(default_factory=dict)
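The field(default_factory=...) defaults suggest a dataclass-style container. The sketch below is a stand-in with the documented attribute names, not the SDK's own class, and shows that each instance gets its own empty list and dict. The class name, the run IDs, the "completed" status value, and the trace ID are made up for illustration; passed is left out because this page does not say how it is computed.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


# Stand-in mirroring the documented attributes; not the SDK class itself.
@dataclass
class OfflineTestResultSketch:
    test_run_id: str
    status: str
    ui_results_url: Optional[str] = None
    results: List[object] = field(default_factory=list)
    agent_offline_trace_ids: Dict[str, str] = field(default_factory=dict)


# Hypothetical values, used only to build two separate instances.
first = OfflineTestResultSketch(test_run_id="run-1", status="completed")
second = OfflineTestResultSketch(test_run_id="run-2", status="completed")

# Writing to one instance's mapping leaves the other instance untouched,
# because default_factory builds a fresh dict per instance.
first.agent_offline_trace_ids["example-1"] = "trace-abc"

# Look up a trace by example ID; .get returns None for unknown IDs.
trace_id = first.agent_offline_trace_ids.get("example-1")
missing = second.agent_offline_trace_ids.get("example-1")
```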

passed: Optional[bool]

Whether every row passed its pass condition. Returns None when no pass condition was evaluated.
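Because passed has three possible values, a plain truthiness check such as `if not result.passed` would treat "no pass condition evaluated" the same as a failure. A minimal sketch of handling the three cases separately follows; describe_outcome is a hypothetical helper, not part of the SDK.

```python
from typing import Optional


def describe_outcome(passed: Optional[bool]) -> str:
    """Map the tri-state `passed` value to a label (hypothetical helper)."""
    if passed is None:
        # No pass condition was evaluated, so there is nothing to report as failed.
        return "not evaluated"
    return "passed" if passed else "failed"


labels = [describe_outcome(value) for value in (True, False, None)]
```

In a CI gate, for instance, one might fail the job only when passed is explicitly False and log a notice when it is None.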
