v0.1 Release Notes (July 1, 2025)
New Features
Trace Management
- Custom Trace Tagging: Add and remove custom tags on individual traces to better organize and categorize your trace data (e.g., environment, feature, or workflow type)
Bug Fixes
- Improved Markdown Display: Fixed layout issues where markdown content did not fit the container width, improving readability
New Features
Enhanced Prompt Scorer Integration
- Automatic Database Sync: Prompt scorers are automatically pushed to the database when created or updated through the SDK. Learn about PromptScorers →
- Smart Initialization: Initialize ClassifierScorer objects with automatic slug generation, or fetch existing scorers from the database by slug (see the sketch below)
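The snippet below is a minimal sketch of the flow described above: constructing a ClassifierScorer (which, per these notes, generates a slug and syncs to the database) and re-fetching it by slug. The import path, keyword arguments, and slug-based lookup are assumptions for illustration, not the verified Judgeval API; check the PromptScorer docs for exact names.

```python
# Illustrative sketch only -- kwargs and lookup style are assumptions, not the verified API.
from judgeval.scorers import ClassifierScorer  # import path may differ by SDK version

# Creating a scorer through the SDK generates a slug and pushes it to the
# database automatically (per the notes above).
tone_scorer = ClassifierScorer(
    name="tone-classifier",                                   # assumed kwarg
    conversation=[{"role": "system",
                   "content": "Is the response polite? Answer Y or N."}],
    options={"Y": 1.0, "N": 0.0},                             # assumed: choice -> score mapping
)

# "Smart Initialization": an existing scorer can be fetched from the database by slug.
existing = ClassifierScorer(slug=tone_scorer.slug)            # assumed slug-based lookup
```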
Improvements
Performance
- Faster Evaluations: All evaluations now route through optimized async worker servers for improved experiment speed
- Industry-Standard Span Export: Migrated span export from a custom implementation to the batched OpenTelemetry span exporter for better reliability, scalability, and throughput (see the sketch after this list)
- Enhanced Network Resilience: Added intelligent timeout handling for network requests, preventing blocked threads and potential thread starvation in production environments
- Advanced Span Lifecycle Management: Improved span object lifecycle management for better span ingestion event handling
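For context on the span-export change above, the pattern below shows what batched OpenTelemetry export looks like in Python: spans are buffered and flushed in batches on a background thread instead of being exported synchronously on the caller's thread. This is a generic OpenTelemetry sketch, not Judgeval's internal configuration; the console exporter and batch size are placeholders.

```python
# Generic OpenTelemetry batch-export pattern (not Judgeval's internal wiring).
from opentelemetry import trace
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor, ConsoleSpanExporter

provider = TracerProvider()
# Spans are queued and exported in batches by a background thread.
provider.add_span_processor(
    BatchSpanProcessor(ConsoleSpanExporter(), max_export_batch_size=512)
)
trace.set_tracer_provider(provider)

otel_tracer = trace.get_tracer("batch-export-demo")
with otel_tracer.start_as_current_span("demo-span"):
    pass  # exported asynchronously when the batch flushes
```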
Developer Experience
- Updated Cursor Rules: Enhanced Cursor integration rules to assist with building agents using Judgeval. Set up Cursor rules →
User Experience
- Consistent Error Pages: Standardized error and not-found page designs across the platform for a more polished user experience
New Features
Role-Based Access Control
- Multi-Tier Permissions: Introduced viewer, developer, admin, and owner roles to control user access within organizations
- Granular Access Control: Viewers get read-only access to non-sensitive data, while developers can perform all non-administrative tasks; finer-grained controls are coming soon
Customer Usage Analytics
- Usage Monitoring Dashboard: Track and monitor customer usage trends with visual graphs showing usage over time and top customers by cost and token consumption
- SDK Customer ID Assignment: Set a customer ID with tracer.set_customer_id() to attribute LLM usage to individual customers (as shown below). Track customer LLM usage →
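A minimal sketch of per-customer usage tracking follows. Only set_customer_id() is taken directly from these notes; the import path, Tracer arguments, observe decorator usage, and model name are assumptions based on typical Judgeval examples and may differ in your SDK version.

```python
# Sketch only: imports, Tracer args, and decorator usage are assumptions; verify against the docs.
from judgeval.common.tracer import Tracer, wrap    # import path may vary by SDK version
from openai import OpenAI

tracer = Tracer(project_name="billing-demo")        # placeholder project name
client = wrap(OpenAI())                             # wrapped client reports token usage per trace

@tracer.observe(span_type="function")
def answer(customer_id: str, question: str) -> str:
    # Attribute the LLM usage in this trace to a specific customer.
    tracer.set_customer_id(customer_id)
    resp = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": question}],
    )
    return resp.choices[0].message.content
```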
API Integrations
- Enhanced Token Tracking: Added support for input cache tokens across OpenAI, Gemini, and Anthropic APIs
- Together API Support: Extended wrap() functionality to include Together API clients (see the sketch below). Set up Together tracing →
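The sketch below illustrates the Together integration described above: wrapping a Together client with wrap() so its chat completions are traced. The import paths, Tracer arguments, and model ID are assumptions for illustration; see the Together tracing setup guide for exact steps.

```python
# Sketch only: assumes the Together Python SDK and that wrap() accepts a Together client.
from judgeval.common.tracer import Tracer, wrap     # import path may vary by SDK version
from together import Together

tracer = Tracer(project_name="together-demo")        # placeholder project name
client = wrap(Together())                            # calls through this client are traced

resp = client.chat.completions.create(
    model="meta-llama/Llama-3.3-70B-Instruct-Turbo", # example Together model id
    messages=[{"role": "user", "content": "Say hello."}],
)
print(resp.choices[0].message.content)
```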
Improvements
Platform Reliability
- Standardized Parameters: Consistent naming conventions across evaluation and tracing methods
- Improved Database Performance: Optimized trace span ingestion for increased throughput and decreased latency
Initial Release
- Initial platform launch!