Python

ScoringResult

The combined result of running scorers against a single example.

The combined result of running scorers against a single example.

Returned by Evaluation.run(). Check success to see if all scorers passed, or inspect scorers_data for per-scorer details.

results = evaluation.run(
    examples=examples,
    scorers=["faithfulness", "answer_relevancy"],
    eval_run_name="nightly",
)
for result in results:
    if not result.success:
        for scorer in result.scorers_data:
            print(f"{scorer.name}: {scorer.score} - {scorer.reason}")

Attributes

success

:

bool

True only if every scorer met its threshold.

scorers_data

:

List[ScorerData]

Per-scorer results (see ScorerData).

data_object

:

Union[TraceSpan, Example]

The Example or TraceSpan that was scored.

name

:

Optional[str]

The evaluation run name.

Default:

None

trace_id

:

Optional[str]

Associated trace ID, if applicable.

Default:

None

run_duration

:

Optional[float]

How long the evaluation took (seconds).

Default:

None

evaluation_cost

:

Optional[float]

Total cost in USD.

Default:

None


to_dict()

def to_dict() -> APIScoringResult:

Returns

APIScoringResult