June 16, 2025
New Features
Tool Order Scoring
- Now supports vanilla Python and LangGraph
- Ordering Match mode (default): Checks that the expected tools are called in the right sequence, ignoring any extra calls, and gives a score of 1.0 only when they appear in that order
- Exact Match mode: Requires the call list to match exactly, with no omissions or additions. Additionally, when you include parameter expectations, it also verifies that each tool was invoked with the correct arguments.
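
The difference between the two modes can be sketched in a few lines of plain Python. This is an illustration of the semantics described above, not judgeval's implementation: the function names `ordering_match` and `exact_match` are hypothetical, the 0.0 score on failure is an assumption, and parameter checking is left out for brevity.

```python
from typing import Sequence


def ordering_match(expected: Sequence[str], actual: Sequence[str]) -> float:
    """Return 1.0 if `expected` appears in `actual` in order; extras are allowed."""
    calls = iter(actual)
    # `name in calls` advances the iterator, so each expected tool must be
    # found after the position where the previous one was found.
    return 1.0 if all(name in calls for name in expected) else 0.0


def exact_match(expected: Sequence[str], actual: Sequence[str]) -> float:
    """Return 1.0 only if the two call lists are identical."""
    return 1.0 if list(expected) == list(actual) else 0.0


expected = ["search", "summarize"]
actual = ["search", "fetch_page", "summarize"]  # one extra call in the middle

print(ordering_match(expected, actual))  # 1.0
print(exact_match(expected, actual))     # 0.0
```

The extra `fetch_page` call is tolerated by the ordering check but fails the exact check, which is the practical reason to pick one mode over the other.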
New @observe_tools Decorator
judgeval #319
- Simplified Tool Tracing: Automatically trace all methods within a class as 'tool' spans
- Eliminates the need to decorate each method individually
- Perfect for users who organize their tools under a class structure
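
To show the idea behind a class-level decorator, here is a minimal sketch in plain Python. It is not judgeval's `@observe_tools` implementation: the decorator name `observe_tools_sketch`, the `RECORDED_SPANS` list, and the `WeatherTools` class are all hypothetical stand-ins, and the sketch assumes that only public instance methods should be wrapped.

```python
import functools
import inspect

# Stand-in for a tracing backend: collects (span_type, name) pairs.
RECORDED_SPANS = []


def observe_tools_sketch(cls):
    """Hypothetical class decorator: record a 'tool' span for each public method call."""
    for name, func in list(vars(cls).items()):
        # Skip private/dunder names and anything that is not a plain function
        # (staticmethod and classmethod objects are skipped by this check).
        if name.startswith("_") or not inspect.isfunction(func):
            continue

        def make_wrapper(f):
            @functools.wraps(f)
            def wrapper(*args, **kwargs):
                RECORDED_SPANS.append(("tool", f"{cls.__name__}.{f.__name__}"))
                return f(*args, **kwargs)
            return wrapper

        setattr(cls, name, make_wrapper(func))
    return cls


@observe_tools_sketch
class WeatherTools:
    def get_forecast(self, city: str) -> str:
        return f"Sunny in {city}"

    def get_alerts(self, city: str) -> list:
        return []


tools = WeatherTools()
tools.get_forecast("Paris")
tools.get_alerts("Paris")
print(RECORDED_SPANS)
```

Applying one decorator to the class replaces a per-method decorator on every tool, so newly added methods are covered without further changes.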
Trace Metadata Management
judgeval #261
- New set_metadata() function to update and set metadata of a trace
- Supported keys:
  - customer_id: ID of the customer using this trace
  - tags: List of tags for this trace
  - has_notification: Whether this trace has a notification
  - overwrite: Whether to overwrite existing traces
  - rules: Rules for this trace
  - name: Name of the trace
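
As a rough illustration of how the supported keys might be used, the sketch below models a trace with a `set_metadata()` method that accepts only the keys listed above. The `TraceSketch` class, the keyword-argument signature, the rejection of unknown keys, and the example values are assumptions made for illustration; consult the metadata documentation for the actual call.

```python
from typing import Any, Dict

# Keys listed in this changelog entry as supported.
SUPPORTED_KEYS = {
    "customer_id", "tags", "has_notification", "overwrite", "rules", "name",
}


class TraceSketch:
    """Hypothetical stand-in for a trace object; not judgeval's trace class."""

    def __init__(self) -> None:
        self.metadata: Dict[str, Any] = {}

    def set_metadata(self, **kwargs: Any) -> None:
        # Reject anything outside the supported key set (an assumed behavior).
        unknown = set(kwargs) - SUPPORTED_KEYS
        if unknown:
            raise ValueError(f"Unsupported metadata keys: {sorted(unknown)}")
        # Later calls update earlier values for the same key.
        self.metadata.update(kwargs)


trace = TraceSketch()
trace.set_metadata(customer_id="cust_123", tags=["checkout", "beta"])
trace.set_metadata(name="checkout-flow")
print(trace.metadata)
```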
PagerDuty Alert Support
judgeval #389
- Users can now integrate with PagerDuty as a notification option for Rules
Improvements
Tracing Configuration
- Disabled deep tracing by default, reducing confusion for users when adding @judgment.observe judgeval #324
- Added support for tracing both synchronous and asynchronous OpenAI client.beta.chat.completions.parse judgeval #322
Bug Fixes
- Fixed AWS S3 Bucket US-East-1 bug judgeval #321
  - For us-east-1, the LocationConstraint parameter is now omitted, as it is not permitted by AWS S3 for buckets created in this specific region
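
The region handling behind this fix can be sketched as a small helper that builds the arguments for an S3 CreateBucket call. This is an illustration of the rule described above, not the code from the judgeval fix; the helper name `create_bucket_kwargs` and the bucket name `my-traces` are hypothetical.

```python
from typing import Any, Dict


def create_bucket_kwargs(bucket_name: str, region: str) -> Dict[str, Any]:
    """Build keyword arguments for an S3 CreateBucket call."""
    kwargs: Dict[str, Any] = {"Bucket": bucket_name}
    # us-east-1 is the default location, and S3 rejects an explicit
    # LocationConstraint of "us-east-1", so omit the configuration there.
    if region != "us-east-1":
        kwargs["CreateBucketConfiguration"] = {"LocationConstraint": region}
    return kwargs


print(create_bucket_kwargs("my-traces", "us-east-1"))
print(create_bucket_kwargs("my-traces", "eu-west-1"))
```

With boto3, the resulting dictionary would be passed as `boto3.client("s3", region_name=region).create_bucket(**kwargs)`.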
Documentation Updates
- New documentation on @observe_tools is now available
- Added PagerDuty alert setup guide
- Updated metadata documentation