Judgeval Python SDK: Response Types
ScoringResult
Complete evaluation results for a single example
Contains the output of one or more scorers applied to a single example. It represents the complete evaluation results for one input: its actual output, its expected output, and the result of every scorer that was applied.
success : bool (required)
    Whether the evaluation was successful. True when all scorers applied to this example returned a success.
scorers_data : List[ScorerData]
    List of individual scorer results for this evaluation.
data_object : Example
    The original example object that was evaluated.
name : str
    Optional name identifier for this scoring result.
trace_id : str
    Unique identifier linking this result to trace data.
run_duration : float
    Time taken to complete the evaluation, in seconds.
evaluation_cost : float
    Estimated cost of running the evaluation (e.g., API costs).
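The field list above can be pictured as a small data structure. The following is a minimal sketch, not the SDK's actual class definitions: `ScorerDataSketch`, `ScoringResultSketch`, and `build_result` are hypothetical names, the `success` field on the scorer stand-in is an assumption inferred from the phrase "returned a success", and treating every field other than `success` as optional is also an assumption. It only illustrates how `success` can follow from the individual scorer results.

```python
from dataclasses import dataclass, field
from typing import Any, List, Optional


@dataclass
class ScorerDataSketch:
    """Hypothetical stand-in for ScorerData (not the SDK class)."""
    name: str
    score: float
    success: bool  # assumed field: whether this scorer passed


@dataclass
class ScoringResultSketch:
    """Hypothetical mirror of the documented ScoringResult fields."""
    success: bool
    scorers_data: List[ScorerDataSketch] = field(default_factory=list)
    data_object: Any = None  # the Example that was evaluated
    name: Optional[str] = None
    trace_id: Optional[str] = None
    run_duration: Optional[float] = None  # seconds
    evaluation_cost: Optional[float] = None


def build_result(scorers: List[ScorerDataSketch], **fields: Any) -> ScoringResultSketch:
    """Derive success as 'every applied scorer succeeded'."""
    overall = all(s.success for s in scorers)
    return ScoringResultSketch(success=overall, scorers_data=scorers, **fields)


if __name__ == "__main__":
    scorers = [
        ScorerDataSketch(name="scorer_a", score=0.9, success=True),
        ScorerDataSketch(name="scorer_b", score=0.2, success=False),
    ]
    result = build_result(scorers, run_duration=1.5)
    print(result.success)
```

Note that in this sketch an empty scorer list yields `success=True`, because `all()` of an empty sequence is true; whether the SDK behaves the same way is not stated on this page.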
Usage Examples
from judgeval import JudgmentClient

client = JudgmentClient()
results = client.evaluate(examples=[...], scorers=[...])

# Each item in results is a ScoringResult for one example.
for result in results:
    if result.success:
        print(f"Evaluation succeeded in {result.run_duration:.2f}s")
        # Report the score produced by each individual scorer.
        for scorer_data in result.scorers_data:
            print(f"  {scorer_data.name}: {scorer_data.score}")
    else:
        print("Evaluation failed")
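When many examples are evaluated at once, it can help to roll the per-example results up into one summary. The sketch below assumes only the attributes documented on this page (`success`, `run_duration`, `evaluation_cost`). The `summarize` helper is hypothetical and not part of the SDK, and it treats a missing or `None` duration or cost as zero, which is an assumption. Stand-in objects are used so that the example runs without the SDK installed.

```python
from types import SimpleNamespace
from typing import Any, Dict, Iterable


def summarize(results: Iterable[Any]) -> Dict[str, float]:
    """Aggregate pass/fail counts, total duration, and total cost.

    Hypothetical helper: accepts any objects exposing the documented
    ScoringResult attributes.
    """
    results = list(results)
    passed = sum(1 for r in results if r.success)
    # Assumption: a missing or None value counts as 0.
    total_duration = sum(getattr(r, "run_duration", None) or 0.0 for r in results)
    total_cost = sum(getattr(r, "evaluation_cost", None) or 0.0 for r in results)
    return {
        "total": len(results),
        "passed": passed,
        "failed": len(results) - passed,
        "total_duration": total_duration,
        "total_cost": total_cost,
    }


if __name__ == "__main__":
    # Stand-in results; in real use these would come from the client.
    fake_results = [
        SimpleNamespace(success=True, run_duration=1.5, evaluation_cost=0.25),
        SimpleNamespace(success=False, run_duration=0.5, evaluation_cost=None),
    ]
    print(summarize(fake_results))
```

Accepting any object with the right attributes keeps the helper independent of the SDK's class definitions, so it can also be exercised in unit tests.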