Dataset

A collection of `Example` objects stored on the Judgment platform.

Datasets are created and retrieved via client.datasets. Once you have a Dataset, you can add examples, iterate over them, export to JSON/YAML, or display a rich table preview.

Create a dataset and add examples:

dataset = client.datasets.create(name="golden-set")
dataset.add_examples([
    Example.create(input="What is AI?", expected_output="Artificial Intelligence"),
    Example.create(input="What is ML?", expected_output="Machine Learning"),
])

Retrieve and iterate:

dataset = client.datasets.get(name="golden-set")
for example in dataset:
    print(example["input"])

Export to file:

dataset.save_as("json", dir_path="./exports", save_name="golden-set")

Attributes

name: str

Dataset name.

project_id: str

Owning project ID.

project_name: str

Project name.

dataset_kind: str (default: 'example')

Kind of dataset.

examples: Optional[List[Example]] (default: None)

The loaded examples (populated when using .get()).

client: Optional[JudgmentSyncClient] (default: None)

Internal API client (set automatically).


add_from_json()

Upload examples from a JSON file.

The file should contain a JSON array of objects, where each object has the fields you want as example properties (e.g. input, actual_output, expected_output).
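As an illustration of the expected file shape, the following sketch writes a JSON array of objects using only the standard library. The records and the path are hypothetical; only the field names come from the description above.

```python
import json
from pathlib import Path

# Hypothetical records; each object becomes one example.
records = [
    {"input": "What is AI?", "expected_output": "Artificial Intelligence"},
    {"input": "What is ML?", "expected_output": "Machine Learning"},
]

# Hypothetical output location; the parent directory is created if missing.
json_path = Path("./data/golden-set.json")
json_path.parent.mkdir(parents=True, exist_ok=True)

# The top-level value must be an array (a Python list), not a single object.
json_path.write_text(json.dumps(records, indent=2), encoding="utf-8")
```

A file produced this way can then be passed to add_from_json() by path.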

dataset.add_from_json("./data/golden-set.json")

def add_from_json(file_path, batch_size=100) -> None:

Parameters

file_path: str (required)

Path to the JSON file.

batch_size: int (default: 100)

Number of examples uploaded per API call.

Returns

None


add_from_yaml()

Upload examples from a YAML file.

Behaves like add_from_json() but reads a YAML file instead of JSON.
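The YAML layout is not spelled out here. A reasonable assumption is that it mirrors the JSON case: a top-level sequence of mappings, one mapping per example. The sketch below writes such a file by hand with the standard library; the path and records are hypothetical.

```python
from pathlib import Path

# Assumed layout: a top-level YAML sequence of mappings,
# mirroring the JSON array of objects.
yaml_text = """\
- input: "What is AI?"
  expected_output: "Artificial Intelligence"
- input: "What is ML?"
  expected_output: "Machine Learning"
"""

# Hypothetical output location; the parent directory is created if missing.
yaml_path = Path("./data/golden-set.yaml")
yaml_path.parent.mkdir(parents=True, exist_ok=True)
yaml_path.write_text(yaml_text, encoding="utf-8")
```

If that assumption holds, the resulting file could be uploaded with dataset.add_from_yaml("./data/golden-set.yaml").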

def add_from_yaml(file_path, batch_size=100) -> None:

Parameters

file_path: str (required)

Path to the YAML file.

batch_size: int (default: 100)

Number of examples uploaded per API call.

Returns

None


add_examples()

Upload Example objects to this dataset.

Examples are uploaded in batches with a progress bar. Accepts any iterable, including generators.

Raises TypeError if a single Example is passed instead of a list or other iterable.
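To make the batching behavior concrete, here is a minimal sketch of how an arbitrary iterable can be split into batches of at most batch_size items. It is not the library's implementation; iter_batches is a hypothetical helper, and plain integers stand in for Example objects.

```python
from itertools import islice
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def iter_batches(items: Iterable[T], batch_size: int = 100) -> Iterator[List[T]]:
    """Yield consecutive lists of at most batch_size items (hypothetical helper)."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    iterator = iter(items)
    while True:
        # islice pulls only the next batch, so generators are consumed lazily.
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


# A generator works as well as a list.
batch_sizes = [len(batch) for batch in iter_batches((n for n in range(250)), batch_size=100)]
print(batch_sizes)  # [100, 100, 50]
```

With 250 items and the default batch size of 100, this yields three batches, which would correspond to three upload calls.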

def add_examples(examples, batch_size=100) -> None:

Parameters

examples: Iterable[Example] (required)

A list (or iterable) of Example objects.

batch_size: int (default: 100)

Number of examples per upload batch.

Returns

None


save_as()

Export the dataset to a local JSON or YAML file.

dataset = client.datasets.get(name="golden-set")
dataset.save_as("json", dir_path="./exports")

def save_as(file_type, dir_path, save_name=None) -> None:

Parameters

file_type: Literal['json', 'yaml'] (required)

Output format, either "json" or "yaml".

dir_path: str (required)

Directory to write into (created if it doesn't exist).

save_name: Optional[str] (default: None)

File name without extension. Defaults to a timestamp.

Returns

None
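The dir_path and save_name rules for save_as() can be sketched as follows. This is an illustration only: build_export_path is a hypothetical helper, and the timestamp format is an assumption, since the format the library actually uses is not stated here.

```python
import os
from datetime import datetime
from typing import Optional


def build_export_path(file_type: str, dir_path: str, save_name: Optional[str] = None) -> str:
    """Resolve the output path for an export (hypothetical helper)."""
    if file_type not in ("json", "yaml"):
        raise ValueError("file_type must be 'json' or 'yaml'")
    # Create the target directory if it does not exist yet.
    os.makedirs(dir_path, exist_ok=True)
    if save_name is None:
        # Assumed timestamp format; the library's real format may differ.
        save_name = datetime.now().strftime("%Y%m%d_%H%M%S")
    # save_name carries no extension, so append one based on file_type.
    return os.path.join(dir_path, f"{save_name}.{file_type}")


export_path = build_export_path("json", "./exports", save_name="golden-set")
print(export_path)
```

Passing an explicit save_name keeps export file names stable across runs, while omitting it produces a new timestamped file each time.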


display()

Print a formatted table preview to the terminal.

def display(max_examples=5) -> None:

Parameters

max_examples: int (default: 5)

Maximum number of examples to show.

Returns

None