Agent Frameworks

OpenAI Agents SDK

Automatically trace OpenAI Agents SDK executions, handoffs, and agent interactions.

The OpenAI Agents SDK integration captures traces from your OpenAI Agents applications, including the agent execution flow, handoffs between agents, and individual agent calls.

Quickstart

Install Dependencies

uv add openai-agents judgeval openinference-instrumentation-openai-agents python-dotenv
pip install openai-agents judgeval openinference-instrumentation-openai-agents python-dotenv
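
Both judgeval and the OpenAI Agents SDK read their credentials from environment variables, which is why the examples below call dotenv.load_dotenv(). A minimal .env sketch, assuming the variables are named JUDGMENT_API_KEY, JUDGMENT_ORG_ID, and OPENAI_API_KEY (check your Judgment dashboard for the exact keys your account uses):

.env
JUDGMENT_API_KEY=your_judgment_api_key
JUDGMENT_ORG_ID=your_judgment_org_id
OPENAI_API_KEY=your_openai_api_key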

Initialize Integration

setup.py
from judgeval.tracer import Tracer
from openinference.instrumentation.openai_agents import OpenAIAgentsInstrumentor

tracer = Tracer(project_name="openai_agents_project")
OpenAIAgentsInstrumentor().instrument()

Always initialize the Tracer before calling OpenAIAgentsInstrumentor().instrument() to ensure proper trace routing.

This integration uses OpenInference instrumentation to provide a robust, production-ready path for exporting telemetry from OpenAI Agents into Judgment's platform. OpenInference handles the work of capturing and formatting trace data, giving you reliable observability for your agent workflows without manual span management.
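
Instrumentation is applied process-wide, so instrument() only needs to be called once at startup. If you need to disable tracing again, for example in unit tests, OpenInference instrumentors also expose an uninstrument() method. A minimal sketch (the filename is just illustrative):

teardown_example.py
from openinference.instrumentation.openai_agents import OpenAIAgentsInstrumentor

instrumentor = OpenAIAgentsInstrumentor()
instrumentor.instrument()    # start capturing OpenAI Agents spans

# ... run your agents here ...

instrumentor.uninstrument()  # stop capturing spans (e.g., in test teardown)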

Add to Existing Code

Add these lines to your existing OpenAI Agents application:

import dotenv

# Load environment variables (API keys) from .env
dotenv.load_dotenv()

# Initialize the Tracer before instrumenting
from judgeval.tracer import Tracer
from openinference.instrumentation.openai_agents import OpenAIAgentsInstrumentor

tracer = Tracer(project_name="openai-agents-app")
OpenAIAgentsInstrumentor().instrument()

# Your existing OpenAI Agents code
from agents import Agent, Runner
import asyncio

agent = Agent(
    name="Assistant",
    instructions="You are a helpful assistant.",
)

async def main():
    result = await Runner.run(agent, input="Hello, how are you?")
    print(result.final_output)

if __name__ == "__main__":
    asyncio.run(main())

All agent executions and handoffs are automatically traced.

Example: Multi-Agent Triage System

multi_agent_example.py
import dotenv
import asyncio

dotenv.load_dotenv()

from judgeval.tracer import Tracer
from openinference.instrumentation.openai_agents import OpenAIAgentsInstrumentor

tracer = Tracer(project_name="multi_agent_triage")
OpenAIAgentsInstrumentor().instrument()

from agents import Agent, Runner

# Define specialized agents
spanish_agent = Agent(
    name="Spanish Agent",
    instructions="You only speak Spanish. Answer all questions in Spanish.",
)

english_agent = Agent(
    name="English Agent",
    instructions="You only speak English. Answer all questions in English.",
)

french_agent = Agent(
    name="French Agent",
    instructions="You only speak French. Answer all questions in French.",
)

# Define triage agent with handoffs
triage_agent = Agent(
    name="Triage Agent",
    instructions="Determine the language of the user's request and handoff to the appropriate language-specific agent.",
    handoffs=[spanish_agent, english_agent, french_agent],
)

@tracer.observe(span_type="function")  
async def main():
    # Process multiple requests
    requests = [
        "Hola, ¿cómo estás?",
        "What is the weather like today?",
        "Bonjour, comment allez-vous?",
    ]

    for request in requests:
        result = await Runner.run(triage_agent, input=request)
        print(f"Request: {request}")
        print(f"Response: {result.final_output}\n")

if __name__ == "__main__":
    asyncio.run(main())

Tracking Non-Agent Operations: Use @tracer.observe() to track any function or method that's not part of your OpenAI Agents workflow. This is especially useful for monitoring utility functions, API calls, or other operations that happen outside the agent execution but are part of your overall application flow.

complete_example.py
from agents import Agent, Runner
from judgeval.tracer import Tracer
from openinference.instrumentation.openai_agents import OpenAIAgentsInstrumentor
import asyncio

tracer = Tracer(project_name="my_agent")
OpenAIAgentsInstrumentor().instrument()

@tracer.observe(span_type="function")
def preprocess_input(data: str) -> str:
    # Helper function tracked with @tracer.observe()
    return f"Preprocessed: {data}"

async def run_agent(user_input: str):
    # Preprocess input (traced with @tracer.observe)
    processed_input = preprocess_input(user_input)

    # Agent execution (automatically traced)
    agent = Agent(
        name="Assistant",
        instructions="You are a helpful assistant.",
    )

    result = await Runner.run(agent, input=processed_input)
    return result.final_output

# Execute - both helper functions and agents are traced
result = asyncio.run(run_agent("Hello World"))
print(result)

Next Steps