v0.1 Release Notes (July 1, 2025)
New Features
Trace Management
- Custom Trace Tagging: Add and remove custom tags on individual traces to better organize and categorize your trace data (e.g., environment, feature, or workflow type)
Fixes
Improved Markdown Display
Fixed layout issues where markdown content wasn't properly fitting container width, improving readability.
Improvements
No improvements in this release.
New Features
Enhanced Prompt Scorer Integration
- Automatic Database Sync: Prompt scorers automatically push to database when created or updated through the SDK. Learn about PromptScorers →
- Smart Initialization: Initialize ClassifierScorer objects with automatic slug generation or fetch existing scorers from database using slugs
Fixes
No bug fixes in this release.
Improvements
Performance
- Faster Evaluations: All evaluations now route through optimized async worker servers for improved experiment speed
- Industry-Standard Span Export: Migrated to batch OpenTelemetry span exporter in C++ from custom Python implementation for better reliability, scalability, and throughput
- Enhanced Network Resilience: Added intelligent timeout handling for network requests, preventing blocking threads and potential starvation in production environments
- Advanced Span Lifecycle Management: Improved span object lifecycle management for better span ingestion event handling
Developer Experience
- Updated Cursor Rules: Enhanced Cursor integration rules to assist with building agents using Judgeval. Set up Cursor rules →
User Experience
- Consistent Error Pages: Standardized error and not-found page designs across the platform for a more polished user experience
New Features
Role-Based Access Control
- Multi-Tier Permissions: Implement viewer, developer, admin, and owner roles to control user access within organizations
- Granular Access Control: Viewers get read-only access to non-sensitive data, developers handle all non-administrative tasks, with finer controls coming soon
Customer Usage Analytics
- Usage Monitoring Dashboard: Track and monitor customer usage trends with visual graphs showing usage vs time and top customers by cost and token consumption
- SDK Customer ID Assignment: Set customer id to track customer usage by using
tracer.set_customer_id(). Track customer LLM usage →
API Integrations
- Enhanced Token Tracking: Added support for input cache tokens across OpenAI, Gemini, and Anthropic APIs
- Together API Support: Extended
wrap()functionality to include Together API clients. Set up Together tracing →
Fixes
No bug fixes in this release.
Improvements
Platform Reliability
- Standardized Parameters: Consistent naming conventions across evaluation and tracing methods
- Improved Database Performance: Optimized trace span ingestion for increased throughput and decreased latency
Initial Release
- Initial platform launch!