Response & Exception Types

Overview

Response and exception types define the structure of data returned by SDK methods and the errors that may occur during operation. Understanding these types helps with proper error handling and result processing.

Evaluation Result Types

`ScoringResult`

Contains the output of one or more scorers applied to a single example. It represents the complete evaluation result for one input: the actual output, the expected output, and the result of every scorer applied.

Properties

`success`

bool

Required

Whether the evaluation was successful. True only when every scorer applied to this example succeeded.

`scorers_data`

List[ScorerData]

Optional

List of individual scorer results for this evaluation

`data_object`

Example

Optional

The original example object that was evaluated

`name`

str

Optional

Optional name identifier for this scoring result

`trace_id`

str

Optional

Unique identifier linking this result to trace data

`run_duration`

float

Optional

Time taken to complete the evaluation in seconds

`evaluation_cost`

float

Optional

Estimated cost of running the evaluation (e.g., API costs)

Usage Examples

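from judgeval import JudgmentClient

client = JudgmentClient()
results = client.evaluate(examples=[...], scorers=[...])

for result in results:
    if result.success:
        # run_duration is Optional, so guard before formatting it
        if result.run_duration is not None:
            print(f"Evaluation succeeded in {result.run_duration:.2f}s")
        else:
            print("Evaluation succeeded")
        # scorers_data is Optional as well; treat None as an empty list
        for scorer_data in result.scorers_data or []:
            print(f"  {scorer_data.name}: {scorer_data.score}")
    else:
        print("Evaluation failed")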
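
Because several `ScoringResult` fields are optional, aggregate reporting has to tolerate missing values. The following is a minimal sketch of one way to do that. `ScoringResultStub` and `summarize` are hypothetical names introduced for illustration and are not part of the SDK; the stub only mirrors the field names documented above.

```python
from dataclasses import dataclass
from typing import List, Optional


# Hypothetical stand-in for ScoringResult, reduced to the fields used here.
@dataclass
class ScoringResultStub:
    success: bool
    run_duration: Optional[float] = None
    evaluation_cost: Optional[float] = None


def summarize(results: List[ScoringResultStub]) -> dict:
    """Aggregate pass rate, duration, and cost, skipping fields that are None."""
    passed = sum(1 for r in results if r.success)
    durations = [r.run_duration for r in results if r.run_duration is not None]
    costs = [r.evaluation_cost for r in results if r.evaluation_cost is not None]
    return {
        "total": len(results),
        "passed": passed,
        "pass_rate": passed / len(results) if results else 0.0,
        "total_duration": sum(durations),
        "total_cost": sum(costs),
    }


summary = summarize([
    ScoringResultStub(success=True, run_duration=1.5, evaluation_cost=0.25),
    ScoringResultStub(success=False, run_duration=None, evaluation_cost=0.25),
])
print(summary)
```

The same function should work on real results as long as they expose `success`, `run_duration`, and `evaluation_cost` as attributes.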

`ScorerData`

Individual scorer result containing the score, reasoning, and metadata for a single scorer applied to an example.

Properties

`name`

str

Required

Name of the scorer that generated this result

`threshold`

float

Required

Threshold value used to determine pass/fail for this scorer

`success`

bool

Required

Whether this individual scorer succeeded (score >= threshold)

`score`

float

Optional

Numerical score returned by the scorer (typically 0.0-1.0)

`reason`

str

Optional

Human-readable explanation of why the scorer gave this result

`id`

str

Optional

Unique identifier for this scorer instance

`strict_mode`

bool

Optional

Whether the scorer was run in strict mode

`evaluation_model`

Union[List[str], str]

Optional

Model(s) used for evaluation (e.g., "gpt-4", ["gpt-4", "claude-3"])

`error`

str

Optional

Error message if the scorer failed to execute

`additional_metadata`

Dict[str, Any]

Optional

Extra information specific to this scorer or evaluation run

Usage Examples

# Access scorer data from a ScoringResult
scoring_result = client.evaluate(examples=[example], scorers=[faithfulness_scorer])[0]

# scorers_data is Optional; fall back to an empty list when it is None
for scorer_data in scoring_result.scorers_data or []:
    print(f"Scorer: {scorer_data.name}")
    print(f"Score: {scorer_data.score} (threshold: {scorer_data.threshold})")
    print(f"Success: {scorer_data.success}")
    print(f"Reason: {scorer_data.reason}")

    if scorer_data.error:
        print(f"Error: {scorer_data.error}")
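
A scorer can end up without a passing result for different reasons: it may have raised an error, returned no score, or scored below its threshold. The sketch below separates those cases. `ScorerDataStub` and `describe` are hypothetical names used only for illustration; the stub copies a subset of the fields documented above and is not the SDK class.

```python
from dataclasses import dataclass
from typing import Optional


# Hypothetical stand-in for ScorerData with a subset of the documented fields.
@dataclass
class ScorerDataStub:
    name: str
    threshold: float
    success: bool
    score: Optional[float] = None
    error: Optional[str] = None


def describe(sd: ScorerDataStub) -> str:
    """Return a one-line status, checking error and missing score first."""
    if sd.error:
        return f"{sd.name}: errored ({sd.error})"
    if sd.score is None:
        return f"{sd.name}: no score reported"
    status = "passed" if sd.success else "failed"
    return f"{sd.name}: {status} with {sd.score:.2f} (threshold {sd.threshold:.2f})"


print(describe(ScorerDataStub("faithfulness", 0.7, True, score=0.9)))
print(describe(ScorerDataStub("relevancy", 0.5, False, error="model timeout")))
```

Checking `error` before `score` avoids formatting a `None` score when the scorer never ran to completion.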

Dataset Operation Types

`DatasetInfo`

Information about a dataset after creation or retrieval operations.

Properties

`dataset_id`

str

Required

Unique identifier for the dataset

`name`

str

Required

Human-readable name of the dataset

`example_count`

int

Required

Number of examples in the dataset

`created_at`

datetime

Required

When the dataset was created

`updated_at`

datetime

Optional

When the dataset was last modified
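
Usage Examples

This page does not state which SDK call returns a `DatasetInfo`, so the sketch below uses a hypothetical stand-in dataclass (`DatasetInfoStub`) that mirrors the fields listed above. It shows one way to handle the optional `updated_at` field by falling back to `created_at`; `format_info` is likewise an illustrative helper, not an SDK function.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


# Hypothetical stand-in mirroring the documented DatasetInfo fields.
@dataclass
class DatasetInfoStub:
    dataset_id: str
    name: str
    example_count: int
    created_at: datetime
    updated_at: Optional[datetime] = None


def format_info(info: DatasetInfoStub) -> str:
    """Summarize a dataset, using created_at when updated_at is missing."""
    last_changed = info.updated_at if info.updated_at is not None else info.created_at
    return (
        f"{info.name} ({info.dataset_id}): {info.example_count} examples, "
        f"last changed {last_changed.date().isoformat()}"
    )


info = DatasetInfoStub("ds_123", "my_dataset", 42, datetime(2024, 1, 15))
print(format_info(info))
```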

Exception Types

`JudgmentAPIError`

Primary exception raised when API operations fail due to network, authentication, or server issues.

Properties

`message`

str

Required

Human-readable error description

`status_code`

int

Optional

HTTP status code from the failed API request

`response_data`

Dict[str, Any]

Optional

Additional details from the API response, if available

Common Scenarios

Authentication failures (401): Invalid API key or organization ID
Rate limiting (429): Too many requests in a short time period
Server errors (500+): Temporary issues with the Judgment platform
Bad requests (400): Invalid parameters or malformed data
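
The scenarios above can be turned into a small dispatch helper. This is a sketch only: `suggested_action` and the action labels it returns are hypothetical and not part of the SDK. It takes the optional `status_code` described above and returns a hint for the caller.

```python
from typing import Optional


def suggested_action(status_code: Optional[int]) -> str:
    """Map a status code to a coarse handling hint (labels are illustrative)."""
    if status_code is None:
        return "inspect"            # no HTTP status available
    if status_code == 401:
        return "check_credentials"  # invalid API key or organization ID
    if status_code == 429:
        return "retry_later"        # rate limited; wait before retrying
    if status_code >= 500:
        return "retry"              # likely a temporary server-side issue
    if status_code == 400:
        return "fix_request"        # invalid parameters or malformed data
    return "inspect"                # anything else needs a closer look


for code in (401, 429, 503, 400, None):
    print(code, suggested_action(code))
```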

Recommended Error Handling

Exception Hierarchy

import logging

logger = logging.getLogger(__name__)

try:
    # SDK operations
    result = client.evaluate(examples=[...], scorers=[...])
except JudgmentAPIError as api_error:
    # Handle API-specific errors
    logger.error(f"API error: {api_error.message}")
    # status_code is Optional, so check for None before comparing
    if api_error.status_code is not None and api_error.status_code >= 500:
        # Retry logic for server errors
        pass
except ConnectionError:
    # Handle network issues
    logger.error("Network connection failed")
except Exception as e:
    # Handle unexpected errors
    logger.error(f"Unexpected error: {e}")
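
The retry branch for server errors could be filled in with exponential backoff. The following is a minimal sketch under stated assumptions, not the SDK's own retry mechanism: `APIErrorStub` is a hypothetical stand-in exposing the `message` and `status_code` fields documented for `JudgmentAPIError`, and `call_with_retry`, `max_attempts`, and `base_delay` are illustrative names.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


class APIErrorStub(Exception):
    """Hypothetical stand-in with the fields documented for JudgmentAPIError."""

    def __init__(self, message: str, status_code: Optional[int] = None):
        super().__init__(message)
        self.message = message
        self.status_code = status_code


def call_with_retry(
    operation: Callable[[], T],
    max_attempts: int = 3,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run operation, retrying only on 5xx errors with doubling delays."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except APIErrorStub as err:
            retryable = err.status_code is not None and err.status_code >= 500
            # Re-raise client errors immediately and give up after the last attempt.
            if not retryable or attempt == max_attempts:
                raise
            sleep(base_delay * 2 ** (attempt - 1))
    raise AssertionError("unreachable")
```

With the real SDK, the `except` clause would catch `JudgmentAPIError` instead, and `operation` could be something like `lambda: client.evaluate(examples=[...], scorers=[...])`. The `sleep` parameter exists so the delay can be replaced in tests.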

Class Instance Types

Some SDK methods return class instances that also serve as API clients:

`Dataset`

Class instances returned by Dataset.create() and Dataset.get() that provide both data access and additional methods for dataset management.

Usage Pattern

# Static methods return Dataset instances
dataset = Dataset.create(name="my_dataset", project_name="default_project")
retrieved_dataset = Dataset.get(name="my_dataset", project_name="default_project")

# Both return Dataset instances with properties and methods
print(dataset.name)  # Access properties
dataset.add_examples([...])  # Call instance methods

Documentation

See Dataset for complete API documentation including:

Static methods (Dataset.create(), Dataset.get())
Instance methods (.add_examples(), .add_traces(), etc.)
Instance properties (.name, .examples, .traces, etc.)

`PromptScorer`

Class instances returned by PromptScorer.create() and PromptScorer.get() that provide scorer configuration and management methods.

Usage Pattern

# Static methods return PromptScorer instances
scorer = PromptScorer.create(
    name="positivity_scorer", 
    prompt="Is the response positive? Response: {{actual_output}}",
    options={"positive": 1, "negative": 0}
)
retrieved_scorer = PromptScorer.get(name="positivity_scorer")

# Both return PromptScorer instances with configuration methods
print(scorer.get_name())  # Read configuration through a getter method
scorer.set_threshold(0.8)  # Update configuration
scorer.append_to_prompt("Consider tone and sentiment.")  # Modify prompt

Documentation

See PromptScorer for complete API documentation including:

Static methods (PromptScorer.create(), PromptScorer.get())
Configuration methods (.set_prompt(), .set_options(), .set_threshold())
Getter methods (.get_prompt(), .get_options(), .get_config())
