Tracer
Track agent behavior and evaluate performance in real time with the Tracer class.
The Tracer class provides comprehensive observability for AI agents and LLM applications. It automatically captures execution traces, spans, and performance metrics while enabling real-time evaluation and monitoring through the Judgment platform.
from judgeval.tracer import Tracer, wrap
from openai import OpenAI
judgment = Tracer(
    project_name="default_project"
)
client = wrap(OpenAI())

class QAAgent:
    def __init__(self, client):
        self.client = client

    @judgment.observe(span_type="tool")
    def process_query(self, query):
        response = self.client.chat.completions.create(
            model="gpt-5",
            messages=[
                {"role": "system", "content": "You are a helpful assistant"},
                {"role": "user", "content": f"I have a query: {query}"}
            ]
        )
        return f"Response: {response.choices[0].message.content}"

    @judgment.agent()
    @judgment.observe(span_type="agent")
    def invoke_agent(self, query):
        result = self.process_query(query)
        return result

if __name__ == "__main__":
    agent = QAAgent(client)
    print(agent.invoke_agent("What is the capital of the United States?"))
Tracer()
Initialize a Tracer object.
Tracer(
    api_key: str,
    organization_id: str,
    project_name: str,
    enable_monitoring: bool = True,
    enable_evaluations: bool = True,
    resource_attributes: Optional[Dict[str, Any]] = None
)
Parameters
api_key: Your Judgment API key. Recommended: set with the JUDGMENT_API_KEY environment variable.
organization_id: Your Judgment organization ID. Recommended: set with the JUDGMENT_ORG_ID environment variable.
enable_evaluations: Whether evaluations triggered via async_evaluate() are enabled.
resource_attributes: OpenTelemetry resource attributes to attach to all spans. Resource attributes describe the entity producing the telemetry data (e.g., service name, version, environment). See the OpenTelemetry Resource specification for standard attributes.
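As a sketch of how these parameters fit together (reading the key and organization ID explicitly from the environment here is illustrative; setting the environment variables and omitting the arguments is the recommended approach):
import os
from judgeval.tracer import Tracer

judgment = Tracer(
    api_key=os.environ["JUDGMENT_API_KEY"],         # illustrative; normally set via the environment
    organization_id=os.environ["JUDGMENT_ORG_ID"],  # illustrative; normally set via the environment
    project_name="default_project",
    enable_monitoring=True,    # capture traces and spans
    enable_evaluations=True,   # allow evaluations such as async_evaluate() to run
)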
Example
from judgeval.tracer import Tracer
judgment = Tracer(
    project_name="default_project"
)
@tracer.observe()
Decorate a function to record an observation or output during a trace. This is useful for capturing intermediate steps, tool results, or decisions made by the agent.
Optionally, provide a scorer config to run an evaluation on the trace.
Parameters
func: The function to decorate (automatically provided when used as a decorator).
name: Optional custom name for the span (defaults to the function name).
span_type: Type of span to create. Available options:
"span": General span (default)
"tool": For functions that should be tracked and exported as agent tools
"function": For main functions or entry points
"llm": For language model calls (automatically applied to wrapped clients)
LLM clients wrapped using wrap() automatically use the "llm" span type without needing manual decoration.
scorer_config: Configuration for running an evaluation on the trace or sub-trace. When scorer_config is provided, a trace evaluation will be run for the sub-trace/span tree with the decorated function as the root. See TraceScorerConfig for more details.
Examples
from judgeval.tracer import Tracer
judgment = Tracer(project_name="default_project")
@judgment.observe(span_type="function")
def answer_question(question: str) -> str:
answer = "The capital of the United States is Washington, D.C."
return answer
@judgment.observe(span_type="tool")
def process_request(question: str) -> str:
answer = answer_question(question)
return answer
if __name__ == "__main__":
print(process_request("What is the capital of the United States?"))from openai import OpenAI
from judgeval.tracer import Tracer
client = OpenAI()
tracer = Tracer(project_name='default_project')
@tracer.observe(span_type="tool")
def search_web(query):
return f"Results for: {query}"
@tracer.observe(span_type="retriever")
def get_database(query):
return f"Database results for: {query}"
@tracer.observe(span_type="function")
def run_agent(user_query):
# Use tools based on query
if "database" in user_query:
info = get_database(user_query)
else:
info = search_web(user_query)
prompt = f"Context: {info}, Question: {user_query}"
# Generate response
response = client.chat.completions.create(
model="gpt-5",
messages=[{"role": "user", "content": prompt}]
)
return response.choices[0].message.contentwrap()
Wraps an API client to add tracing capabilities. Supports OpenAI, Together, Anthropic, Google GenAI, and Groq clients. Patches methods such as .create, Anthropic's .stream, and OpenAI's .responses.create and .beta.chat.completions.parse using a wrapper class.
Parameters
client: The API client to wrap (OpenAI, Anthropic, Together, Google GenAI, or Groq).
Example
from openai import OpenAI
from judgeval.tracer import wrap
client = OpenAI()
wrapped_client = wrap(client)
# All API calls are now automatically traced
response = wrapped_client.chat.completions.create(
    model="gpt-5",
    messages=[{"role": "user", "content": "Hello"}]
)

# Streaming calls are also traced
stream = wrapped_client.chat.completions.create(
    model="gpt-5",
    messages=[{"role": "user", "content": "Hello"}],
    stream=True
)
tracer.async_evaluate()
Runs quality evaluations on the current trace/span using specified scorers. You can provide either an Example object or individual evaluation parameters (input, actual_output, etc.).
Parameters
scorer: An evaluation scorer to run. See Configuration Types for available scorer options.
example: An Example object containing evaluation data. See Example for structure details.
sampling_rate: A float between 0 and 1 giving the probability that the evaluation will be sampled.
Example
from judgeval.scorers import AnswerRelevancyScorer
from judgeval.data import Example
from judgeval.tracer import Tracer
judgment = Tracer(project_name="default_project")
@judgment.observe(span_type="function")
def agent(question: str) -> str:
answer = "Paris is the capital of France"
# Create example object
example = Example(
input=question,
actual_output=answer,
)
# Evaluate using Example
judgment.async_evaluate(
scorer=AnswerRelevancyScorer(threshold=0.5),
example=example,
model="gpt-5",
sampling_rate=0.9
)
return answer
if __name__ == "__main__":
print(agent("What is the capital of France?"))@tracer.agent()
@tracer.agent()
Method decorator for agentic systems that assigns an identifier to each agent and enables tracking of their internal state variables. Essential for monitoring and debugging single- or multi-agent systems where you need to track each agent's behavior and state separately. Use this decorator on the entry-point method of your agent class.
Parameters
identifier: The identifier to associate with the class whose method is decorated. This will be used as the instance name in traces.
Examples
Basic Usage
from judgeval.tracer import Tracer
judgment = Tracer(project_name="default_project")
class TravelAgent:
    def __init__(self, id):
        self.id = id

    @judgment.observe(span_type="tool")
    def book_flight(self, destination):
        return f"Flight booked to {destination}!"

    @judgment.agent(identifier="id")
    @judgment.observe(span_type="function")
    def invoke_agent(self, destination):
        flight_info = self.book_flight(destination)
        return f"Here is your requested flight info: {flight_info}"

if __name__ == "__main__":
    agent = TravelAgent("travel_agent_1")
    print(agent.invoke_agent("Paris"))
    agent2 = TravelAgent("travel_agent_2")
    print(agent2.invoke_agent("New York"))
Multi-Agent System Tracing
# main.py
from planning_agent import PlanningAgent

if __name__ == "__main__":
    planning_agent = PlanningAgent("planner-1")
    goal = "Build a multi-agent system"
    result = planning_agent.invoke_agent(goal)
    print(result)

# utils.py
from judgeval.tracer import Tracer

judgment = Tracer(project_name="multi-agent-system")

# planning_agent.py
from utils import judgment
from research_agent import ResearchAgent
from task_agent import TaskAgent

class PlanningAgent:
    def __init__(self, id):
        self.id = id

    @judgment.agent()  # Only add @judgment.agent() to the entry point method of the agent
    @judgment.observe()
    def invoke_agent(self, goal):
        print(f"Agent {self.id} is planning for goal: {goal}")
        research_agent = ResearchAgent("Researcher1")
        task_agent = TaskAgent("Tasker1")
        research_results = research_agent.invoke_agent(goal)
        task_result = task_agent.invoke_agent(research_results)
        return f"Results from planning and executing for goal '{goal}': {task_result}"

    @judgment.observe()  # No need to add @judgment.agent() here
    def random_tool(self):
        pass

# research_agent.py
from utils import judgment

class ResearchAgent:
    def __init__(self, id):
        self.id = id

    @judgment.agent()
    @judgment.observe()
    def invoke_agent(self, topic):
        return f"Research notes for topic: {topic}: Findings and insights include..."

# task_agent.py
from utils import judgment

class TaskAgent:
    def __init__(self, id):
        self.id = id

    @judgment.agent()
    @judgment.observe()
    def invoke_agent(self, task):
        result = f"Performed task: {task}, here are the results: Results include..."
        return result
The trace will show up in the Judgment platform, clearly indicating which agent called which method.
Each agent's tool calls are clearly associated with their respective classes, making it easy to follow the execution flow across your multi-agent system.
TraceScorerConfig()
Initialize a TraceScorerConfig object for running an evaluation on the trace.
Parameters
sampling_rate: A float between 0 and 1 giving the probability that the evaluation will be sampled.
run_condition: A function that returns a boolean indicating whether the eval should be run. When TraceScorerConfig is used in @tracer.observe(), run_condition is called with the decorated function's arguments. For example, lambda x: x > 10 will only run the eval when x > 10.
Example
A trace eval will be run for the trace/sub-trace with the process_request() function as the root.
# Import paths for TracePromptScorer and TraceScorerConfig are assumed here
from judgeval.tracer import Tracer, TraceScorerConfig
from judgeval.scorers import TracePromptScorer

judgment = Tracer(project_name="default_project")

# Retrieve a trace scorer to be used with the TraceScorerConfig
trace_scorer = TracePromptScorer.get(name="sample_trace_scorer")

# A trace eval is only triggered if process_request() is called with x > 10
@judgment.observe(
    span_type="function",
    scorer_config=TraceScorerConfig(
        scorer=trace_scorer,
        sampling_rate=1.0,
        run_condition=lambda x: x > 10
    )
)
def process_request(x):
    return x + 1
tracer.get_current_span()
Returns the current span object for direct access to span properties and methods, useful for debugging and inspection.
Attributes
The current span object provides these properties for inspection and debugging:
span_type: Type of span ("span", "tool", "llm", "evaluation", "chain")
Example Usage
@tracer.observe(span_type="tool")
def debug_tool(query):
span = tracer.get_current_span()
if span:
# Access span properties for debugging
print(f"Executing {span.function} (ID: {span.span_id})")
print(f"Depth: {span.depth}, Type: {span.span_type}")
print(f"Inputs: {span.inputs}")
# Check parent relationship
if span.parent_span_id:
print(f"Parent span: {span.parent_span_id}")
# Monitor execution state
if span.agent_name:
print(f"Agent: {span.agent_name}")
result = perform_search(query)
# Check span after execution
if span:
print(f"Output: {span.output}")
print(f"Duration: {span.duration}s")
if span.has_evaluation:
print(f"Has {len(span.evaluation_runs)} evaluations")
if span.error:
print(f"Error: {span.error}")
return resultAdvanced Tracing
tracer.set_customer_id()
You can associate a span and its child spans with a customer_id, which enables further insights through our usage dashboards. Note that for the trace to appear in the monitoring tables under that customer_id, you must set it on the root span of the trace.
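A minimal sketch, assuming set_customer_id takes the customer ID as a string and is called while the root span is active:
from judgeval.tracer import Tracer

judgment = Tracer(project_name="default_project")

@judgment.observe(span_type="function")
def handle_request(customer_id: str, query: str) -> str:
    # Called in the root span so the trace shows up in the usage dashboards
    judgment.set_customer_id(customer_id)  # signature assumed
    return f"Handled: {query}"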
tracer.set_tag()
Coming soon.
OpenTelemetry TracerProvider
You can attach the Judgment span processor to your own OpenTelemetry TracerProvider, combining judgeval tracing with existing OpenTelemetry instrumentation:
from judgeval.tracer import Tracer
from opentelemetry.sdk.trace import TracerProvider
tracer_provider = TracerProvider()
# Initialize tracer with OpenTelemetry configuration
judgment = Tracer(
    project_name="default_project",
    resource_attributes={
        "service.name": "my-ai-agent",
        "service.version": "1.2.0",
        "deployment.environment": "production"
    }
)
tracer_provider.add_span_processor(judgment.get_processor())
tracer = tracer_provider.get_tracer(__name__)

def answer_question(question: str) -> str:
    with tracer.start_as_current_span("answer_question_span") as span:
        span.set_attribute("question", question)
        answer = "The capital of the United States is Washington, D.C."
        span.set_attribute("answer", answer)
        return answer

def process_request(question: str) -> str:
    with tracer.start_as_current_span("process_request_span") as span:
        span.set_attribute("input", question)
        answer = answer_question(question)
        span.set_attribute("output", answer)
        return answer

if __name__ == "__main__":
    print(process_request("What is the capital of the United States?"))