v0.14 Release Notes (Sep 28, 2025)

2025-09-28
v0.14.0

New Features

Work with trace datasets in the SDK

The Dataset class now supports trace datasets. Use Dataset.get() to retrieve trace datasets with their full OpenTelemetry structure, including spans, scores, and triggered rules. This makes it easy to export production traces for optimization (e.g., SFT, DPO, RFT) or to build test datasets from real agent executions for sanity-checking agent updates.
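
As a rough sketch of the export-for-optimization workflow described above, the snippet below turns trace records into SFT-style JSONL lines. It does not call the SDK: the `traces` list stands in for whatever Dataset.get() returns, and the field names (`spans`, `span_type`, `input`, `output`, `scores`, `passed`) are assumptions made for illustration, not the documented schema.

```python
import json

# Hypothetical trace records; the real structure returned by Dataset.get() may differ.
traces = [
    {
        "trace_id": "t1",
        "spans": [
            {"span_type": "llm", "input": "What is 2 + 2?", "output": "4"},
            {"span_type": "tool", "input": "calc(2+2)", "output": "4"},
        ],
        "scores": [{"name": "correctness", "passed": True}],
    },
    {
        "trace_id": "t2",
        "spans": [{"span_type": "llm", "input": "Capital of France?", "output": "Lyon"}],
        "scores": [{"name": "correctness", "passed": False}],
    },
]


def to_sft_records(traces):
    """Keep LLM spans from traces whose scores all passed, as prompt/completion pairs."""
    records = []
    for trace in traces:
        if not all(score["passed"] for score in trace["scores"]):
            continue  # skip traces with any failing score
        for span in trace["spans"]:
            if span["span_type"] == "llm":
                records.append({"prompt": span["input"], "completion": span["output"]})
    return records


sft_records = to_sft_records(traces)
jsonl = "\n".join(json.dumps(record) for record in sft_records)
print(jsonl)
```

Filtering on passing scores is only one possible selection policy; for DPO-style data you would more likely pair passing and failing outputs for the same prompt.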

Export datasets and traces

Export datasets and traces for data portability, offline analysis, or integration with external tools. This gives you full control over your evaluation data and production traces.

Fixes

Cumulative cost tracking issues

Fixed issues with cumulative cost tracking for better billing insights.

Column rendering in example datasets

Fixed column rendering in example datasets.

Improvements

Accurate, up-to-date LLM cost tracking

LLM costs are now calculated server-side with the latest pricing information, ensuring accurate cost tracking as providers update their rates.

Simpler rule configuration

Rules now trigger based on whether scores pass or fail, replacing the previous custom threshold system. This makes it easier to set up alerts without tuning specific score values.
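
To illustrate the pass/fail model, here is a minimal sketch of a rule that fires on a score's outcome rather than on a numeric threshold. The `Score` and `Rule` classes and the `trigger_on` field are hypothetical names invented for this example; they are not the product's actual rule API.

```python
from dataclasses import dataclass


@dataclass
class Score:
    name: str
    passed: bool  # pass/fail outcome reported by the scorer


@dataclass
class Rule:
    name: str
    scorer_names: list  # names of the scorers this rule watches
    trigger_on: str = "fail"  # "fail" or "pass"

    def is_triggered(self, scores):
        """Trigger when any watched score has the configured outcome."""
        wanted = self.trigger_on == "pass"
        return any(
            s.passed == wanted for s in scores if s.name in self.scorer_names
        )


scores = [Score("faithfulness", True), Score("toxicity", False)]
alert = Rule("toxicity-alert", ["toxicity"])
print(alert.is_triggered(scores))
```

Because the rule only inspects `passed`, any notion of what counts as a pass stays with the scorer, and the rule itself has no numeric value to tune.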

Better multimodal content display

Enhanced display for multimodal OpenAI chat completions, with proper formatting for images and text. Added a fullscreen view for large content, with scroll-to-bottom functionality.

Configure models per scorer

Trace prompt scorers now include model configuration, so you can see which model evaluates each trace. This gives you more control over the tradeoff between scorer quality and cost.

Improved form validation

Annotation forms now make comments optional while requiring at least one scorer. Clear error messages and visual indicators guide you when required fields are missing.
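
The validation rule above can be sketched as a small function. The form shape (a dict with `comment` and `scorers` keys) and the error message text are assumptions for illustration only and do not reflect the actual form implementation.

```python
def validate_annotation(form):
    """Return a dict mapping field name to error message; empty means valid."""
    errors = {}
    scorers = form.get("scorers") or {}
    if not scorers:
        errors["scorers"] = "Select at least one scorer."
    # Comments are optional, so a missing or empty comment is not an error.
    return errors


valid = validate_annotation({"comment": "", "scorers": {"helpfulness": True}})
invalid = validate_annotation({"comment": "Looks off"})
print(valid)
print(invalid)
```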

Performance and visual polish

Optimized keyboard navigation for traces and improved span loading states with better icons.