The Tracer class provides comprehensive observability for AI agents and LLM applications. It automatically captures execution traces, spans, and performance metrics while enabling real-time evaluation and monitoring through the Judgment platform.
The Tracer is your primary interface for adding observability to your AI agents. It provides methods for tracing function execution, evaluating performance, and collecting comprehensive environment interaction data.
OpenTelemetry resource attributes to attach to all spans. Resource attributes describe the entity producing the telemetry data (e.g., service name, version, environment). See the OpenTelemetry Resource specification for standard attributes.
Basic Example
tracer.py
from judgeval.tracer import Tracer

judgment = Tracer(project_name="default_project")

@judgment.observe(span_type="function")
def answer_question(question: str) -> str:
    answer = "The capital of the United States is Washington, D.C."
    return answer

@judgment.observe(span_type="tool")
def process_request(question: str) -> str:
    answer = answer_question(question)
    return answer

if __name__ == "__main__":
    print(process_request("What is the capital of the United States?"))
OpenTelemetry Integration
tracer_otel.py
from judgeval.tracer import Tracer
from opentelemetry.sdk.trace import TracerProvider

tracer_provider = TracerProvider()

# Initialize tracer with OpenTelemetry configuration
judgment = Tracer(
    project_name="default_project",
    resource_attributes={
        "service.name": "my-ai-agent",
        "service.version": "1.2.0",
        "deployment.environment": "production",
    },
)

tracer_provider.add_span_processor(judgment.get_processor())
tracer = tracer_provider.get_tracer(__name__)

def answer_question(question: str) -> str:
    with tracer.start_as_current_span("answer_question_span") as span:
        span.set_attribute("question", question)
        answer = "The capital of the United States is Washington, D.C."
        span.set_attribute("answer", answer)
        return answer

def process_request(question: str) -> str:
    with tracer.start_as_current_span("process_request_span") as span:
        span.set_attribute("input", question)
        answer = answer_question(question)
        span.set_attribute("output", answer)
        return answer

if __name__ == "__main__":
    print(process_request("What is the capital of the United States?"))
Records an observation or output during a trace. This is useful for capturing intermediate steps, tool results, or decisions made by the agent. Optionally, provide a scorer config to run an evaluation on the trace.
Configuration for running an evaluation on the trace or sub-trace. When scorer_config is provided, a trace evaluation will be run for the sub-trace/span tree with the decorated function as the root. See TraceScorerConfig for more details.
Example:
# Retrieve/create a trace scorer to be used with the TraceScorerConfig
trace_scorer = TracePromptScorer.get(name="sample_trace_scorer")

TraceScorerConfig(
    scorer=trace_scorer,
    sampling_rate=0.5,
)
Example Code
trace.py
from openai import OpenAI
from judgeval.tracer import Tracer

client = OpenAI()
tracer = Tracer(project_name='default_project', deep_tracing=False)

@tracer.observe(span_type="tool")
def search_web(query):
    return f"Results for: {query}"

@tracer.observe(span_type="retriever")
def get_database(query):
    return f"Database results for: {query}"

@tracer.observe(span_type="function")
def run_agent(user_query):
    # Use tools based on query
    if "database" in user_query:
        info = get_database(user_query)
    else:
        info = search_web(user_query)
    prompt = f"Context: {info}, Question: {user_query}"

    # Generate response
    response = client.chat.completions.create(
        model="gpt-5",
        messages=[{"role": "user", "content": prompt}]
    )
    return response.choices[0].message.content
Wraps an API client to add tracing capabilities. Supports OpenAI, Together, Anthropic, and Google GenAI clients. Patches methods such as .create, Anthropic's .stream, and OpenAI's .responses.create and .beta.chat.completions.parse using a wrapper class.
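The patching approach described above can be sketched as a minimal, library-agnostic wrapper class. This is an illustration of the technique only, not judgeval's implementation: the `TracedClient` and `FakeClient` names are hypothetical, and the real wrapper records spans on the Judgment platform rather than a local list.

```python
import time
from functools import wraps

class TracedClient:
    """Illustrative sketch: intercept a client's callable attributes
    and record call metadata, forwarding everything else unchanged."""

    def __init__(self, client):
        self._client = client
        self.calls = []  # recorded (method_name, duration_seconds) pairs

    def __getattr__(self, name):
        attr = getattr(self._client, name)
        if not callable(attr):
            return attr

        @wraps(attr)
        def traced(*args, **kwargs):
            start = time.perf_counter()
            try:
                return attr(*args, **kwargs)  # forward the real call
            finally:
                self.calls.append((name, time.perf_counter() - start))

        return traced

# Usage with a stand-in client (hypothetical, for demonstration)
class FakeClient:
    def create(self, prompt):
        return f"response to {prompt}"

client = TracedClient(FakeClient())
result = client.create("hello")   # the call is forwarded and timed
print(result)                     # response to hello
print(client.calls[0][0])         # create
```

Because interception happens in `__getattr__`, the wrapper needs no prior knowledge of which methods exist, which is one reasonable way to support several provider clients with a single class.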
Runs quality evaluations on the current trace/span using specified scorers. You can provide either an Example object or individual evaluation parameters (input, actual_output, etc.).
A float between 0 and 1 representing the probability that the evaluation will run.
Example:
0.75 # Eval occurs 75% of the time
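The sampling semantics can be illustrated with a plain-Python sketch. This is the standard Bernoulli-sampling idiom, shown only to clarify what a rate of 0.75 means; it is not judgeval's internal implementation.

```python
import random

def should_sample(sampling_rate: float) -> bool:
    """Return True with probability `sampling_rate` (0.0 to 1.0)."""
    return random.random() < sampling_rate

# With sampling_rate=0.75, roughly 75% of calls trigger the eval
random.seed(0)  # seeded for reproducibility
hits = sum(should_sample(0.75) for _ in range(10_000))
print(hits / 10_000)  # close to 0.75
```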
Example Code
async_evaluate.py
from judgeval.scorers import AnswerRelevancyScorer
from judgeval.data import Example
from judgeval.tracer import Tracer

judgment = Tracer(project_name="default_project")

@judgment.observe(span_type="function")
def agent(question: str) -> str:
    answer = "Paris is the capital of France"

    # Create example object
    example = Example(
        input=question,
        actual_output=answer,
    )

    # Evaluate using Example
    judgment.async_evaluate(
        scorer=AnswerRelevancyScorer(threshold=0.5),
        example=example,
        model="gpt-5",
        sampling_rate=0.9
    )
    return answer

if __name__ == "__main__":
    print(agent("What is the capital of France?"))
Method decorator for agentic systems that assigns an identifier to each agent and enables tracking of their internal state variables. Essential for monitoring and debugging single or multi-agent systems where you need to track each agent's behavior and state separately. This decorator should be used on the entry point method of your agent class.
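The idea of binding each traced entry point to a stable per-instance agent identifier, while snapshotting selected state variables, can be sketched without the SDK. Everything below is illustrative: the `agent_entry` decorator, the `call_log` registry, and `ResearchAgent` are hypothetical names, not part of the judgeval API.

```python
import functools
import itertools

_agent_ids = itertools.count(1)
call_log = []  # (agent_id, class_name, method_name, state_snapshot)

def agent_entry(track_state=()):
    """Illustrative decorator: assign each agent instance an identifier
    on first call and record the listed state variables per call."""
    def decorator(method):
        @functools.wraps(method)
        def wrapper(self, *args, **kwargs):
            if not hasattr(self, "_agent_id"):
                self._agent_id = next(_agent_ids)
            snapshot = {name: getattr(self, name) for name in track_state}
            call_log.append(
                (self._agent_id, type(self).__name__, method.__name__, snapshot)
            )
            return method(self, *args, **kwargs)
        return wrapper
    return decorator

class ResearchAgent:
    def __init__(self):
        self.queries_handled = 0

    @agent_entry(track_state=("queries_handled",))
    def run(self, query):
        self.queries_handled += 1
        return f"answer to {query}"

a, b = ResearchAgent(), ResearchAgent()
a.run("q1")
b.run("q2")
a.run("q3")
# call_log now distinguishes the two agent instances by identifier,
# so each agent's behavior and state can be inspected separately
```

The snapshot is taken on entry, before the method mutates state, which is why the third logged call shows `queries_handled` at 1 rather than 2.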
A function that returns a boolean indicating whether the eval should be run. When TraceScorerConfig is used in @tracer.observe(), run_condition is called with the decorated function's arguments.
Example:
lambda x: x > 10
For the above example, if this TraceScorerConfig instance is passed into a @tracer.observe() that decorates a function taking x as an argument, the trace eval will only run when the decorated function is called with x > 10.
Example Code
trace_scorer_config.py
judgment = Tracer(project_name="default_project")

# Retrieve a trace scorer to be used with the TraceScorerConfig
trace_scorer = TracePromptScorer.get(name="sample_trace_scorer")

# A trace eval is only triggered if process_request() is called with x > 10
@judgment.observe(span_type="function", scorer_config=TraceScorerConfig(
    scorer=trace_scorer,
    sampling_rate=1.0,
    run_condition=lambda x: x > 10
))
def process_request(x):
    return x + 1
In the above example, a trace eval will be run for the trace/sub-trace with the process_request() function as the root.