Dataset
A schema-enforced collection of `Example` objects on the Judgment platform.
Datasets are created and retrieved via client.datasets. Every
dataset has a JSON Schema that all examples are validated against
server-side. Once you have a Dataset, you can append examples, add
trace-backed examples, list versions, iterate over examples, export
to JSON/YAML, or display a rich table preview.
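Because validation happens server-side, a malformed record only surfaces as an error at upload time. As a rough illustration, the sketch below pre-checks plain dictionaries against the `properties` and `required` keys of a schema before anything is sent. The `precheck` helper and the `SCHEMA` constant are hypothetical names introduced here and are not part of the SDK; the check covers only basic JSON Schema types and is no substitute for the platform's own validation.

```python
from typing import Any, Dict, List

# Illustrative schema, mirroring the "golden-set" example in this page.
SCHEMA: Dict[str, Any] = {
    "type": "object",
    "properties": {
        "input": {"type": "string"},
        "expected_output": {"type": "string"},
    },
}

# Basic JSON Schema type names mapped to Python types.
_JSON_TYPES = {
    "string": str,
    "integer": int,
    "number": (int, float),
    "boolean": bool,
    "object": dict,
    "array": list,
}


def precheck(record: Dict[str, Any], schema: Dict[str, Any]) -> List[str]:
    """Return human-readable problems found in `record` (hypothetical helper)."""
    problems: List[str] = []
    for field, spec in schema.get("properties", {}).items():
        if field not in record:
            continue  # absence is only an error if the field is required
        type_name = spec.get("type")
        expected = _JSON_TYPES.get(type_name)
        if expected is None:
            continue  # unknown or unspecified type: leave it to the server
        value = record[field]
        # bool is a subclass of int in Python, so reject it for numeric types.
        is_bool_as_number = isinstance(value, bool) and type_name in ("integer", "number")
        if is_bool_as_number or not isinstance(value, expected):
            problems.append(f"{field}: expected {type_name}, got {type(value).__name__}")
    for field in schema.get("required", []):
        if field not in record:
            problems.append(f"{field}: missing required field")
    return problems


good = {"input": "What is AI?", "expected_output": "Artificial Intelligence"}
bad = {"input": 42}
print(precheck(good, SCHEMA))
print(precheck(bad, SCHEMA))
```

Records that come back with an empty list can then be wrapped in Example objects and uploaded as usual.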
Create a dataset and add examples:
dataset = client.datasets.create(
    name="golden-set",
    schema={
        "type": "object",
        "properties": {
            "input": {"type": "string"},
            "expected_output": {"type": "string"},
        },
    },
)
dataset.add_examples([
    Example.create(input="What is AI?", expected_output="Artificial Intelligence"),
])
Retrieve and iterate:
dataset = client.datasets.get(name="golden-set")
for example in dataset:
    print(example["input"])
Attributes
name
:str
Dataset name.
project_id
:str
Owning project ID.
project_name
:str
Project name.
dataset_id
:Optional[str]
Unique dataset identifier (set when created/fetched).
None
schema
:Optional[Dict[str, Any]]
The dataset's JSON Schema. Examples must conform to it.
None
current_version
:Optional[int]
Latest dataset version number.
None
dataset_kind
:str
Kind of dataset (default "example").
'example'
examples
:Optional[List[Example]]
The loaded examples (populated when using .get()).
None
client
:Optional[JudgmentSyncClient]
Internal API client (set automatically).
None
_identifier
:str
add_from_json()
Upload examples from a JSON file.
The file should contain a JSON array of objects, where each object
has the fields you want as example properties (e.g. input,
actual_output, expected_output).
dataset.add_from_json("./data/golden-set.json")
def add_from_json(file_path, batch_size=100) -> None:
Parameters
file_path
required:str
Path to the JSON file.
batch_size
:int
Number of examples uploaded per API call.
100
Returns
None
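The file layout described for add_from_json() (a top-level JSON array of objects) can be produced with the standard library alone. The sketch below writes such a file to a temporary directory and reads it back to confirm its shape. The records and the file location are illustrative, not taken from a real dataset.

```python
import json
import os
import tempfile

# Illustrative records; each object's keys become example properties.
records = [
    {"input": "What is AI?", "expected_output": "Artificial Intelligence"},
    {"input": "What is ML?", "expected_output": "Machine Learning"},
]

# Write the records as a top-level JSON array.
out_dir = tempfile.mkdtemp()
file_path = os.path.join(out_dir, "golden-set.json")
with open(file_path, "w", encoding="utf-8") as f:
    json.dump(records, f, indent=2)

# Read the file back to confirm it is an array of objects.
with open(file_path, encoding="utf-8") as f:
    loaded = json.load(f)

print(type(loaded).__name__, len(loaded))
```

The resulting path can then be passed as `file_path` to add_from_json().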
add_from_yaml()
Upload examples from a YAML file.
Same as add_from_json but reads YAML format.
def add_from_yaml(file_path, batch_size=100) -> None:
Parameters
file_path
required:str
Path to the YAML file.
batch_size
:int
Number of examples uploaded per API call.
100
Returns
None
add_examples()
Append Example objects to this dataset.
Examples are validated server-side against the dataset schema and uploaded in batches with a progress bar. Each successful batch advances the dataset version. Accepts any iterable, including generators.
Raises
TypeError: If a single Example is passed instead of a list or other iterable.
JudgmentValidationError: If examples fail schema validation.
def add_examples(examples, batch_size=100) -> None:
Parameters
examples
required:Iterable[Example]
A list (or iterable) of Example objects.
batch_size
:int
Number of examples per upload batch.
100
Returns
None
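To make the role of `batch_size` concrete, here is a minimal sketch of how an arbitrary iterable, including a generator, can be split into fixed-size batches without loading everything into memory first. The `batched` helper is a hypothetical illustration and is not claimed to be how the SDK implements add_examples() internally.

```python
from itertools import islice
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def batched(items: Iterable[T], batch_size: int = 100) -> Iterator[List[T]]:
    """Yield consecutive lists of at most `batch_size` items (hypothetical helper)."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    iterator = iter(items)
    while True:
        # islice pulls at most batch_size items, so generators are consumed lazily.
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


# A generator of 250 illustrative rows, consumed in batches of 100.
rows = ({"input": f"question {i}"} for i in range(250))
sizes = [len(batch) for batch in batched(rows, batch_size=100)]
print(sizes)
```

With 250 items and the default batch size of 100, this chunking yields three batches, the last one partial. Since each successful batch advances the dataset version, a smaller `batch_size` means more version increments for the same upload.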
versions()
List all versions of this dataset, newest first.
def versions() -> typing.List:
Returns
typing.List - A list of DatasetVersion objects.
delete()
Delete this dataset from the platform.
Dependent test configs are deleted along with the dataset.
def delete() -> None:
Returns
None
save_as()
Export the dataset to a local JSON or YAML file.
dataset = client.datasets.get(name="golden-set")
dataset.save_as("json", dir_path="./exports")
def save_as(file_type, dir_path, save_name=None) -> None:
Parameters
file_type
required:Literal['json', 'yaml']
"json" or "yaml".
dir_path
required:str
Directory to write into (created if it doesn't exist).
save_name
:Optional[str]
File name without extension. Defaults to a timestamp.
None
Returns
None
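The parameter descriptions above imply a simple path-resolution rule: create the directory if needed, fall back to a timestamp when no name is given, and append the extension. The sketch below shows one way that rule could look. The `resolve_export_path` helper is hypothetical, and both the timestamp format and the use of the file type as the extension are assumptions rather than documented SDK behavior.

```python
import os
import tempfile
from datetime import datetime
from typing import Optional


def resolve_export_path(file_type: str, dir_path: str, save_name: Optional[str] = None) -> str:
    """Build an export path following the documented rules (hypothetical helper)."""
    if file_type not in ("json", "yaml"):
        raise ValueError("file_type must be 'json' or 'yaml'")
    # Create the target directory if it does not exist yet.
    os.makedirs(dir_path, exist_ok=True)
    if save_name is None:
        # Assumed timestamp format; the SDK's actual format is not documented here.
        save_name = datetime.now().strftime("%Y%m%d_%H%M%S")
    # Assumption: the extension matches file_type.
    return os.path.join(dir_path, f"{save_name}.{file_type}")


export_dir = os.path.join(tempfile.mkdtemp(), "exports")
named_path = resolve_export_path("json", export_dir, save_name="golden-set")
default_path = resolve_export_path("yaml", export_dir)
print(os.path.basename(named_path))
```

Passing an explicit `save_name` keeps repeated exports predictable, while the timestamp default avoids overwriting earlier files.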
display()
Print a formatted table preview to the terminal.
def display(max_examples=5) -> None:
Parameters
max_examples
:int
Maximum number of examples to show.
5
Returns
None