Judgeval Python SDK

Dataset

Dataset class for managing datasets of Examples and Traces in Judgeval

The Dataset class lets you create, retrieve, and manage reusable evaluation datasets that are visible on the Judgment platform.

from judgeval.dataset import Dataset
from judgeval.data import Example

dataset = Dataset.create(
  name="qa_dataset",
  project_name="default_project",
  examples=[Example(input="What is the powerhouse of the cell?", actual_output="The mitochondria.")]
)

dataset = Dataset.get(
  name="qa_dataset",
  project_name="default_project",
)

examples = []
example = Example(input="Sample question?", actual_output="Sample answer.")
examples.append(example)
dataset.add_examples(examples=examples)

Static Method

Dataset.create()

Create a new evaluation dataset for storage and reuse across multiple evaluation runs.

This method also saves the dataset to the Judgment platform.

Dataset.create(
  name: str,
  project_name: str,
  examples: Optional[List[Example]] = None,
  traces: Optional[List[Trace]] = None,
  overwrite: bool = False
)

Parameters

name (required): str
Name of the dataset.
Example: "qa_dataset"

project_name (required): str
Name of the project.
Example: "question_answering"

examples: Optional[List[Example]]
List of examples to include in the dataset. See Example for details on the structure.
Example: [Example(input="...", actual_output="...")]

traces: Optional[List[Trace]]
List of traces to include in the dataset. See Trace for details on the structure.
Example: [Trace(...)]

overwrite: bool
Whether to overwrite an existing dataset with the same name.
Default: False

Returns

A Dataset instance for further operations.

Example

dataset.py
from judgeval.dataset import Dataset
from judgeval.data import Example

dataset = Dataset.create(
  name="qa_dataset",
  project_name="default_project",
  examples=[Example(input="What is the powerhouse of the cell?", actual_output="The mitochondria.")]
)

Exceptions

JudgmentAPIError: Exception

Raised when a dataset with the same name already exists in the project and overwrite=False. See JudgmentAPIError for details.
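The overwrite flag and the exception described above suggest a "get or create" pattern: try to create the dataset, and fall back to retrieving it if the name is already taken. The following is a minimal sketch, not part of the SDK. get_or_create is a hypothetical helper that receives the create and get callables and the exception type as arguments, so it assumes nothing about where JudgmentAPIError is imported from. The fake_* functions and FakeAPIError are stand-ins that only mimic the documented behavior so the sketch runs without network access.

```python
from typing import Any, Callable, Dict, Tuple, Type


def get_or_create(
    create: Callable[..., Any],
    get: Callable[..., Any],
    already_exists_error: Type[Exception],
    name: str,
    project_name: str,
    **create_kwargs: Any,
) -> Any:
    """Try to create the dataset; retrieve the existing one if creation fails.

    Hypothetical helper, not an SDK function. It assumes `create` raises
    `already_exists_error` when the name is taken and overwrite is False.
    """
    try:
        return create(name=name, project_name=project_name, **create_kwargs)
    except already_exists_error:
        return get(name=name, project_name=project_name)


# Stand-ins that mimic the documented behavior (assumptions, for demo only).
class FakeAPIError(Exception):
    """Plays the role of JudgmentAPIError in this sketch."""


_store: Dict[Tuple[str, str], dict] = {}


def fake_create(name, project_name, examples=None, overwrite=False):
    key = (project_name, name)
    if key in _store and not overwrite:
        raise FakeAPIError(f"dataset {name!r} already exists")
    _store[key] = {
        "name": name,
        "project_name": project_name,
        "examples": list(examples or []),
    }
    return _store[key]


def fake_get(name, project_name):
    return _store[(project_name, name)]


# First call creates the dataset; second call hits the error path and retrieves it.
first = get_or_create(
    fake_create, fake_get, FakeAPIError,
    "qa_dataset", "default_project", examples=["e1"],
)
second = get_or_create(
    fake_create, fake_get, FakeAPIError,
    "qa_dataset", "default_project",
)
print(first is second)
```

With the real SDK, the call would presumably pass Dataset.create, Dataset.get, and JudgmentAPIError in place of the stand-ins. Because the same exception type may also signal other failures, it may be worth inspecting the error before falling back to retrieval.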


Static Method

Dataset.get()

Retrieve a dataset from the Judgment platform by its name and project name.

Dataset.get(
  name: str,
  project_name: str
)

Parameters

name (required): str
The name of the dataset to retrieve.
Example: "my_dataset"

project_name (required): str
The name of the project where the dataset is stored.
Example: "default_project"

Returns

A Dataset instance for further operations.

Example

retrieve_dataset.py
from judgeval.dataset import Dataset

dataset = Dataset.get(
  name="qa_dataset",
  project_name="default_project",
)

print(dataset.examples)

add_examples()

Add Examples to the dataset once you have created or retrieved it.

All instance methods of Dataset automatically update the dataset and push changes to the Judgment platform.

dataset.add_examples(
  examples: List[Example]
)

Parameters

examples (required): List[Example]
List of examples to add to the dataset.

Returns

True if examples were added successfully.

Example

add_examples.py
from judgeval.dataset import Dataset
from judgeval.data import Example

dataset = Dataset.get(
  name="qa_dataset",
  project_name="default_project",
)

example = Example(input="Sample question?", actual_output="Sample answer.")

dataset.add_examples(examples=[example])
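Since each add_examples() call pushes changes to the platform, a large upload may be easier to manage in batches. The sketch below shows one way to do that, checking the documented True return value after each call. chunked and add_in_batches are hypothetical helpers, the batch size is arbitrary (this page states no limit), and FakeDataset is a stand-in with the same add_examples(examples=...) signature so the code runs offline.

```python
from typing import Any, Iterator, List, Sequence


def chunked(items: Sequence[Any], size: int) -> Iterator[List[Any]]:
    """Yield consecutive slices of `items`, each with at most `size` elements."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield list(items[start:start + size])


def add_in_batches(dataset: Any, examples: Sequence[Any], batch_size: int = 100) -> int:
    """Call dataset.add_examples once per batch and return the number added.

    Hypothetical helper. It stops at the first batch for which
    add_examples does not return True.
    """
    added = 0
    for batch in chunked(examples, batch_size):
        if dataset.add_examples(examples=batch) is not True:
            break
        added += len(batch)
    return added


class FakeDataset:
    """Stand-in for a Dataset instance (assumption, for demo only)."""

    def __init__(self) -> None:
        self.examples: List[Any] = []
        self.calls = 0

    def add_examples(self, examples: List[Any]) -> bool:
        self.calls += 1
        self.examples.extend(examples)
        return True


fake = FakeDataset()
total = add_in_batches(fake, [f"example-{i}" for i in range(5)], batch_size=2)
print(total, fake.calls)  # 5 examples sent in 3 calls
```

With a real Dataset instance, the list would hold Example objects and the helper would be called as add_in_batches(dataset, examples).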

Return Types

Dataset

The Dataset object contains the following properties:

name (readonly): str
The name of the dataset.

project_name (readonly): str
The project name where the dataset is stored.

examples (readonly): List[Example]
List of examples contained in the dataset. See Example for details on the structure.

traces (readonly): List[Trace]
List of traces contained in the dataset (if any).

id (readonly): str
Unique identifier for the dataset on the Judgment platform.
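As an illustration of how these properties might be read together, the sketch below builds a small summary dictionary from any object that exposes them. summarize_dataset is a hypothetical helper, and the SimpleNamespace object is a stand-in for a real Dataset so the code runs without the SDK. Treating a missing or None examples or traces value as an empty list is an assumption, not documented behavior.

```python
from types import SimpleNamespace
from typing import Any, Dict


def summarize_dataset(dataset: Any) -> Dict[str, Any]:
    """Collect the documented read-only properties into a plain dict.

    Hypothetical helper. A missing or None `examples` / `traces`
    attribute is counted as zero items (an assumption).
    """
    examples = getattr(dataset, "examples", None) or []
    traces = getattr(dataset, "traces", None) or []
    return {
        "id": dataset.id,
        "name": dataset.name,
        "project_name": dataset.project_name,
        "num_examples": len(examples),
        "num_traces": len(traces),
    }


# Stand-in object with the same attribute names as the Dataset properties.
stand_in = SimpleNamespace(
    id="abc123",
    name="qa_dataset",
    project_name="default_project",
    examples=["e1", "e2"],
    traces=None,
)

summary = summarize_dataset(stand_in)
print(summary)
```

With the SDK, the same helper could be passed the object returned by Dataset.get() or Dataset.create().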