v0.5 Release Notes (Aug 4, 2025)

2025-08-04
v0.5.0

New Features

Annotation Queue System

  • Automated Queue Management: Failed traces are automatically added to an annotation queue for manual review and scoring
  • Human Evaluation Workflow: Add comments and scores to queued traces, with automatic removal from queue upon completion
  • Dataset Integration: Export annotated traces to datasets for long-term storage and analysis
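
The queue lifecycle described above (failed traces enter the queue, an annotation removes them, annotated traces can be exported) can be pictured with the minimal in-memory sketch below. It is not the platform's implementation or API. The names `AnnotationQueue`, `QueuedTrace`, `on_trace_finished`, `annotate`, and `export` are hypothetical and exist only for illustration.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class QueuedTrace:
    """Hypothetical record for a trace awaiting human review."""
    trace_id: str
    comment: Optional[str] = None
    score: Optional[float] = None


@dataclass
class AnnotationQueue:
    """Hypothetical in-memory model of the queue lifecycle."""
    pending: dict = field(default_factory=dict)  # trace_id -> QueuedTrace
    dataset: list = field(default_factory=list)  # exported annotations

    def on_trace_finished(self, trace_id: str, failed: bool) -> None:
        # Only failed traces are queued for manual review.
        if failed:
            self.pending[trace_id] = QueuedTrace(trace_id)

    def annotate(self, trace_id: str, comment: str, score: float) -> QueuedTrace:
        # Completing an annotation removes the trace from the queue.
        item = self.pending.pop(trace_id)
        item.comment = comment
        item.score = score
        return item

    def export(self, item: QueuedTrace) -> None:
        # Stand-in for exporting an annotated trace to a dataset.
        self.dataset.append(item)
```

In this sketch a reviewer would call `annotate` once per queued trace and then `export` the returned item; a real deployment would persist both collections rather than hold them in memory.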

Enhanced Async Evaluations

  • Sampling Control: Async evaluations now accept a sampling rate parameter that controls what fraction of your production traces is evaluated (e.g., evaluate 5% of production traces for hallucinations)
  • Easier Async Evaluations: Simplified the async evaluation interface to make running evaluations on live traces smoother
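
To make the sampling rate concrete, here is one common way to decide whether a given trace falls inside the sampled fraction. It is a sketch under assumptions, not the SDK's actual mechanism: the function name `should_evaluate` and its `sampling_rate` argument are hypothetical, and hashing the trace ID is just one technique (a random draw per trace would work too).

```python
import hashlib


def should_evaluate(trace_id: str, sampling_rate: float) -> bool:
    """Return True for roughly `sampling_rate` of all trace IDs.

    Hypothetical helper: hashing the ID makes the decision deterministic,
    so the same trace is always either sampled or skipped.
    """
    if not 0.0 <= sampling_rate <= 1.0:
        raise ValueError("sampling_rate must be between 0 and 1")
    digest = hashlib.sha256(trace_id.encode("utf-8")).digest()
    # Map the first 4 bytes of the hash to a float in [0, 1).
    bucket = int.from_bytes(digest[:4], "big") / 2**32
    return bucket < sampling_rate
```

With `sampling_rate=0.05`, about 5 in every 100 traces would be evaluated, which matches the hallucination example in the bullet above.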

Local Scorer Execution

  • Local Execution: Custom scorers for online evaluations now run locally and are processed asynchronously in the background, so evaluation results arrive faster without slowing down the critical path
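
The idea of scoring in the background while the request path returns immediately can be sketched with the standard library alone. This is an illustration under assumptions, not the SDK's scorer interface: `exact_match_scorer` and `handle_request` are hypothetical names, and the "model call" is a stand-in.

```python
from concurrent.futures import Future, ThreadPoolExecutor

# Small worker pool that runs scorers off the request path.
_executor = ThreadPoolExecutor(max_workers=2)


def exact_match_scorer(output: str, expected: str) -> float:
    """Hypothetical custom scorer: 1.0 on an exact match, else 0.0."""
    return 1.0 if output.strip() == expected.strip() else 0.0


def handle_request(prompt: str, expected: str) -> tuple[str, Future]:
    """Produce a response and schedule scoring without waiting for it."""
    output = prompt.upper()  # stand-in for the real model call
    # submit() returns immediately; the scorer runs on a worker thread.
    pending_score = _executor.submit(exact_match_scorer, output, expected)
    return output, pending_score
```

The caller gets `output` right away and can read the score later with `pending_score.result()`, so a slow scorer does not add latency to the response.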

PromptScorer Website Management

  • Platform-Based PromptScorer Creation: Create, edit, and delete custom prompt-based evaluation scorers on the platform, with an interactive playground for testing configurations in real time before deployment

Fixes

No bug fixes in this release.

Improvements

Platform Reliability

  • Improved Data Serialization: Standardized JSON encoding across the platform on FastAPI's serialization methods, making trace data handling and API communication more reliable
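
As a rough picture of what routing everything through one JSON encoder buys, the sketch below uses only the Python standard library. It is a stand-in and does not reproduce FastAPI's serialization or the platform's code. `SpanRecord`, `to_jsonable`, and `encode` are hypothetical names introduced for illustration.

```python
import json
from dataclasses import asdict, dataclass, is_dataclass
from datetime import datetime, timezone
from enum import Enum
from uuid import UUID


@dataclass
class SpanRecord:
    """Hypothetical trace payload containing a non-JSON-native type."""
    name: str
    started: datetime


def to_jsonable(value):
    """Fallback used by json.dumps for types it cannot encode itself."""
    if is_dataclass(value) and not isinstance(value, type):
        return asdict(value)
    if isinstance(value, datetime):
        return value.isoformat()
    if isinstance(value, UUID):
        return str(value)
    if isinstance(value, Enum):
        return value.value
    raise TypeError(f"Cannot serialize {type(value).__name__}")


def encode(payload) -> str:
    # One shared entry point, so every caller produces the same format.
    return json.dumps(payload, default=to_jsonable, sort_keys=True)


record = SpanRecord("llm_call", datetime(2025, 8, 4, tzinfo=timezone.utc))
print(encode({"span": record}))
```

Because every payload passes through `encode`, a datetime or dataclass is rendered the same way wherever it is produced, and an unsupported type fails with a clear error instead of being encoded inconsistently.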

Community Contributions

Special thanks to @dedsec995 and our other community contributors for helping improve the platform's data serialization capabilities.