Evaluation
Example
An `Example` is a basic unit of data in judgeval that allows you to run evaluation scorers on your agents.
An `Example` can be composed of a mixture of the following fields:
Field | Type | Description |
---|---|---|
input | Optional[Union[str, Dict[str, Any]]] | Sample input to your agent/task |
actual_output | Optional[Union[str, List[str]]] | What your agent outputs based on the input |
expected_output | Optional[Union[str, List[str]]] | The ideal output of your agent |
retrieval_context | Optional[List[str]] | Context retrieved from a vector database |
expected_tools | Optional[List[str]] | Tools you expect your agent to use |
additional_metadata | Optional[Dict[str, Any]] | Extra information attached to the example |
Creating an Example
from judgeval.data import Example

example = Example(
    input="Who founded Microsoft?",
    actual_output="Bill Gates and Paul Allen.",
    expected_output="Bill Gates and Paul Allen founded Microsoft in New Mexico in 1975.",
    retrieval_context=["Bill Gates co-founded Microsoft with Paul Allen in 1975."],
    expected_tools=["research_person", "research_company"],
)
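
Because every field in the table is typed as `Optional`, an example only needs the fields that the scorers you run will read. The sketch below illustrates that idea with a stand-in dataclass rather than judgeval itself: `ExampleSketch` and `populated_fields` are hypothetical names that mirror the field table, and the `None` defaults are an assumption, not a statement about how judgeval's `Example` is implemented.

```python
from dataclasses import asdict, dataclass
from typing import Any, Dict, List, Optional, Union


@dataclass
class ExampleSketch:
    """Hypothetical stand-in mirroring the field table; not judgeval's Example."""

    # Assumption: unset fields default to None.
    input: Optional[Union[str, Dict[str, Any]]] = None
    actual_output: Optional[Union[str, List[str]]] = None
    expected_output: Optional[Union[str, List[str]]] = None
    retrieval_context: Optional[List[str]] = None
    expected_tools: Optional[List[str]] = None
    additional_metadata: Optional[Dict[str, Any]] = None


def populated_fields(example: ExampleSketch) -> List[str]:
    """Return the names of the fields that were actually set, in declaration order."""
    return [name for name, value in asdict(example).items() if value is not None]


# A dict-valued input plus free-form metadata, with the other fields left unset.
sketch = ExampleSketch(
    input={"question": "Who founded Microsoft?"},
    actual_output="Bill Gates and Paul Allen.",
    additional_metadata={"source": "manual-test"},
)

print(populated_fields(sketch))
# → ['input', 'actual_output', 'additional_metadata']
```

A check like `populated_fields` can be handy before running a scorer, to confirm that the fields it depends on (for instance `retrieval_context`) are present on each example.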