ExampleScorer

A scorer class that extends BaseScorer. Subclass it to implement custom evaluation logic for individual examples.

score_type (required)

:str

Type identifier for the scorer. Defaults to "Custom".

Example: "Custom"

required_params

:List[str]

List of required parameters for the scorer

Example: ["temperature", "model_version"]
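This page does not say how required_params is enforced, so the following is only a sketch of one plausible check: before scoring, confirm that every listed name has a value. The helper `missing_params` and the `supplied` dictionary are hypothetical names for illustration and are not part of judgeval.

```python
from typing import Any, Dict, List


def missing_params(required_params: List[str], supplied: Dict[str, Any]) -> List[str]:
    """Return the required names that are absent (or None) in `supplied`.

    Hypothetical helper for illustration; not a judgeval function.
    """
    return [name for name in required_params if supplied.get(name) is None]


required_params = ["temperature", "model_version"]
supplied = {"temperature": 0.2}  # assumed input; "model_version" is left out on purpose

print(missing_params(required_params, supplied))  # ['model_version']
```

A scorer could run a check like this at the start of a_score_example and fail early when something is missing. Whether the library already does so is not stated here.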

Methods

a_score_example (required)

:async def

Asynchronously measures the score on a single example. Must be implemented by subclasses.

a_score_example.py

async def a_score_example(self, example: Example, *args, **kwargs) -> float:
    # Custom scoring logic here; compute `score` as a float
    return score
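To try the subclassing contract without installing judgeval, here is a minimal sketch built on stand-in classes. `StubExample` and `StubExampleScorer` are hypothetical substitutes for `Example` and `ExampleScorer`. They mirror only what this page describes (a `score_type` attribute, a `reason` attribute, and an async `a_score_example` that subclasses must override) and say nothing about how the real base class is implemented.

```python
import asyncio
from dataclasses import dataclass
from typing import Optional


@dataclass
class StubExample:
    """Hypothetical stand-in for judgeval.data.Example (not the real class)."""
    input: str
    actual_output: str
    expected_output: Optional[str] = None


class StubExampleScorer:
    """Hypothetical stand-in for ExampleScorer; the real base class does more."""
    score_type: str = "Custom"
    reason: Optional[str] = None

    async def a_score_example(self, example, *args, **kwargs) -> float:
        raise NotImplementedError("Subclasses must implement a_score_example")


class ExactMatchScorer(StubExampleScorer):
    score_type: str = "ExactMatch"

    async def a_score_example(self, example, *args, **kwargs) -> float:
        # Compare outputs after trimming surrounding whitespace.
        matched = example.actual_output.strip() == (example.expected_output or "").strip()
        self.reason = "Outputs match exactly." if matched else "Outputs differ."
        return 1.0 if matched else 0.0


example = StubExample(input="2 + 2?", actual_output="4 ", expected_output="4")
scorer = ExactMatchScorer()
score = asyncio.run(scorer.a_score_example(example))
print(score, scorer.reason)  # 1.0 Outputs match exactly.
```

Because the method is a coroutine, calling it directly requires an event loop. `asyncio.run` is the simplest way to drive it in a script or a unit test.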

Usage Examples

from judgeval import JudgmentClient
from judgeval.data import Example
from judgeval.scorers.example_scorer import ExampleScorer

client = JudgmentClient()

class CorrectnessScorer(ExampleScorer):
    score_type: str = "Correctness"

    async def a_score_example(self, example: Example) -> float:
        if "Washington, D.C." in example.actual_output:
            self.reason = "The answer is correct because it contains 'Washington, D.C.'."
            return 1.0

        self.reason = "The answer is incorrect because it does not contain 'Washington, D.C.'."
        return 0.0

example = Example(
    input="What is the capital of the United States?",
    expected_output="Washington, D.C.",
    actual_output="The capital of the U.S. is Washington, D.C."
)

client.run_evaluation(
    examples=[example],
    scorers=[CorrectnessScorer()],
    project_name="default_project",
)