Response & Exception Types
Return types and exceptions used throughout the JudgmentEval SDK
Overview
Response and exception types define the structure of data returned by SDK methods and the errors that may occur during operation. Understanding these types helps with proper error handling and result processing.
Evaluation Result Types
ScoringResult
Contains the output of one or more scorers applied to a single example. Represents the complete evaluation results for one input with its actual output, expected output, and all applied scorer results.
Properties
- success (bool, required): Whether the evaluation was successful. True when all scorers applied to this example returned a success.
- scorers_data (List[ScorerData], optional): List of individual scorer results for this evaluation
- data_object (Example, optional): The original example object that was evaluated
- name (str, optional): Name identifier for this scoring result
- trace_id (str, optional): Unique identifier linking this result to trace data
- run_duration (float, optional): Time taken to complete the evaluation in seconds
- evaluation_cost (float, optional): Estimated cost of running the evaluation (e.g., API costs)
Usage Examples
from judgeval import JudgmentClient
client = JudgmentClient()
results = client.evaluate(examples=[...], scorers=[...])
for result in results:
    if result.success:
        print(f"Evaluation succeeded in {result.run_duration:.2f}s")
        for scorer_data in result.scorers_data:
            print(f"  {scorer_data.name}: {scorer_data.score}")
    else:
        print("Evaluation failed")
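Because run_duration, evaluation_cost, and scorers_data are documented as optional, code that aggregates results may want to tolerate missing values. The following is a minimal sketch under that assumption. ScoringResultStub and summarize are hypothetical names introduced for illustration only; the stub mirrors a subset of the documented fields so the example runs without the SDK.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Union


@dataclass
class ScoringResultStub:
    """Hypothetical stand-in with a subset of the documented ScoringResult fields."""
    success: bool
    run_duration: Optional[float] = None
    evaluation_cost: Optional[float] = None


def summarize(results: List[ScoringResultStub]) -> Dict[str, Union[int, float]]:
    """Aggregate pass counts, duration, and cost.

    Assumption: a missing (None) duration or cost contributes zero.
    """
    passed = sum(1 for r in results if r.success)
    return {
        "total": len(results),
        "passed": passed,
        "failed": len(results) - passed,
        "duration_s": sum(r.run_duration or 0.0 for r in results),
        "cost": sum(r.evaluation_cost or 0.0 for r in results),
    }


if __name__ == "__main__":
    demo = [
        ScoringResultStub(success=True, run_duration=1.5, evaluation_cost=0.25),
        ScoringResultStub(success=False),  # optional fields left unset
    ]
    print(summarize(demo))
```

The same function should work on real result objects as long as they expose the attributes listed under Properties.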
ScorerData
Individual scorer result containing the score, reasoning, and metadata for a single scorer applied to an example.
Properties
- name (str, required): Name of the scorer that generated this result
- threshold (float, required): Threshold value used to determine pass/fail for this scorer
- success (bool, required): Whether this individual scorer succeeded (score >= threshold)
- score (float, optional): Numerical score returned by the scorer (typically 0.0-1.0)
- reason (str, optional): Human-readable explanation of why the scorer gave this result
- id (str, optional): Unique identifier for this scorer instance
- strict_mode (bool, optional): Whether the scorer was run in strict mode
- evaluation_model (Union[List[str], str], optional): Model(s) used for evaluation (e.g., "gpt-4", ["gpt-4", "claude-3"])
- error (str, optional): Error message if the scorer failed to execute
- additional_metadata (Dict[str, Any], optional): Extra information specific to this scorer or evaluation run
Usage Examples
# Access scorer data from a ScoringResult
scoring_result = client.evaluate(examples=[example], scorers=[faithfulness_scorer])[0]
for scorer_data in scoring_result.scorers_data:
    print(f"Scorer: {scorer_data.name}")
    print(f"Score: {scorer_data.score} (threshold: {scorer_data.threshold})")
    print(f"Success: {scorer_data.success}")
    print(f"Reason: {scorer_data.reason}")
    if scorer_data.error:
        print(f"Error: {scorer_data.error}")
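The two success flags described on this page follow simple rules: a scorer succeeds when score >= threshold, and a ScoringResult succeeds when all of its scorers do. A rough sketch of those rules is shown below. The function names are hypothetical, and the handling of a missing score or an empty scorer list is an assumption, since the page does not specify either case.

```python
from typing import Iterable, Optional, Tuple


def scorer_passed(score: Optional[float], threshold: float) -> bool:
    """Documented rule: a scorer succeeds when score >= threshold.

    Assumption: a missing score (None) is treated as a failure.
    """
    return score is not None and score >= threshold


def example_passed(pairs: Iterable[Tuple[Optional[float], float]]) -> bool:
    """Documented rule: the result succeeds when every scorer succeeds.

    Each pair is (score, threshold). Note that all() returns True for an
    empty iterable; whether the SDK behaves that way is not stated.
    """
    return all(scorer_passed(score, threshold) for score, threshold in pairs)


if __name__ == "__main__":
    print(scorer_passed(0.8, 0.8))
    print(example_passed([(0.9, 0.5), (0.4, 0.5)]))
```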
Dataset Operation Types
DatasetInfo
Information about a dataset after creation or retrieval operations.
Properties
- dataset_id (str, required): Unique identifier for the dataset
- name (str, required): Human-readable name of the dataset
- example_count (int, required): Number of examples in the dataset
- created_at (datetime, required): When the dataset was created
- updated_at (datetime, optional): When the dataset was last modified
Exception Types
JudgmentAPIError
Primary exception raised when API operations fail due to network, authentication, or server issues.
Properties
- message (str, required): Human-readable error description
- status_code (int, optional): HTTP status code from the failed API request
- response_data (Dict[str, Any], optional): Additional details from the API response, if available
Common Scenarios
- Authentication failures (401): Invalid API key or organization ID
- Rate limiting (429): Too many requests in a short time period
- Server errors (500+): Temporary issues with the Judgment platform
- Bad requests (400): Invalid parameters or malformed data
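The scenarios above can be turned into a small lookup that maps a status code to a category. This is an illustrative sketch only: classify_status and its category labels are hypothetical and not part of the SDK. The "client_error" bucket for other 4xx codes is an assumption, as is returning "unknown" when status_code is absent.

```python
from typing import Optional


def classify_status(status_code: Optional[int]) -> str:
    """Map an HTTP status code to one of the scenarios listed above."""
    if status_code is None:
        return "unknown"  # status_code is documented as optional
    if status_code == 401:
        return "authentication"
    if status_code == 429:
        return "rate_limit"
    if status_code == 400:
        return "bad_request"
    if status_code >= 500:
        return "server"
    if 400 <= status_code < 500:
        return "client_error"  # assumption: other 4xx codes, not listed above
    return "unknown"


if __name__ == "__main__":
    for code in (401, 429, 400, 503, None):
        print(code, classify_status(code))
```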
Exception Hierarchy
import logging
logger = logging.getLogger(__name__)
try:
    # SDK operations
    result = client.evaluate([...])
except JudgmentAPIError as api_error:
    # Handle API-specific errors
    logger.error(f"API error: {api_error.message}")
    # status_code is optional, so check for None before comparing
    if api_error.status_code is not None and api_error.status_code >= 500:
        # Retry logic for server errors
        pass
except ConnectionError:
    # Handle network issues
    logger.error("Network connection failed")
except Exception as e:
    # Handle unexpected errors
    logger.error(f"Unexpected error: {e}")
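One way to fill in the "retry logic for server errors" placeholder is exponential backoff. The sketch below is a generic pattern, not the SDK's own retry mechanism. APIErrorStub is a hypothetical stand-in that carries the documented message and status_code attributes so the example runs without the SDK, and call_with_retries, max_attempts, and base_delay are illustrative names.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


class APIErrorStub(Exception):
    """Hypothetical stand-in with the documented message and status_code fields."""

    def __init__(self, message: str, status_code: Optional[int] = None):
        super().__init__(message)
        self.message = message
        self.status_code = status_code


def call_with_retries(
    operation: Callable[[], T],
    max_attempts: int = 3,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run operation, retrying only on 5xx errors with exponential backoff."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except APIErrorStub as err:
            is_server_error = err.status_code is not None and err.status_code >= 500
            # Re-raise client errors immediately, and server errors once
            # the attempt budget is exhausted.
            if not is_server_error or attempt == max_attempts:
                raise
            # Delays grow as base_delay, 2 * base_delay, 4 * base_delay, ...
            sleep(base_delay * 2 ** (attempt - 1))
    raise AssertionError("unreachable")
```

With the real SDK, the except clause would catch JudgmentAPIError instead of the stub. The sleep parameter is injectable so the backoff can be exercised in tests without waiting.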
Class Instance Types
Some SDK methods return class instances that also serve as API clients:
Dataset
Class instances returned by Dataset.create() and Dataset.get() that provide both data access and additional methods for dataset management.
Usage Pattern
# Static methods return Dataset instances
dataset = Dataset.create(name="my_dataset", project_name="default_project")
retrieved_dataset = Dataset.get(name="my_dataset", project_name="default_project")
# Both return Dataset instances with properties and methods
print(dataset.name) # Access properties
dataset.add_examples([...]) # Call instance methods
Documentation
See Dataset for complete API documentation including:
- Static methods (Dataset.create(), Dataset.get())
- Instance methods (.add_examples(), .add_traces(), etc.)
- Instance properties (.name, .examples, .traces, etc.)
PromptScorer
Class instances returned by PromptScorer.create() and PromptScorer.get() that provide scorer configuration and management methods.
Usage Pattern
# Static methods return PromptScorer instances
scorer = PromptScorer.create(
    name="positivity_scorer",
    prompt="Is the response positive? Response: {{actual_output}}",
    options={"positive": 1, "negative": 0}
)
retrieved_scorer = PromptScorer.get(name="positivity_scorer")
# Both return PromptScorer instances with configuration methods
print(scorer.get_name())  # Read configuration via getter
scorer.set_threshold(0.8)  # Update configuration
scorer.append_to_prompt("Consider tone and sentiment.")  # Modify prompt
Documentation
See PromptScorer for complete API documentation including:
- Static methods (PromptScorer.create(), PromptScorer.get())
- Configuration methods (.set_prompt(), .set_options(), .set_threshold())
- Getter methods (.get_prompt(), .get_options(), .get_config())