LiteLLM
If you use LiteLLM within your application, you can trace, monitor, and analyze all LLM calls with Judgment. LiteLLM provides a unified interface to call 100+ LLM providers using a consistent API.
Because LiteLLM exposes a functional API (litellm.completion), the standard wrap() approach does not apply. Instead, register an OpenTelemetry instrumentor to automatically capture every LiteLLM call — including model name, token usage, and cost.
Install Dependencies
uv add judgeval opentelemetry-instrumentation-litellm litellm
pip install judgeval opentelemetry-instrumentation-litellm litellm
Initialize Tracing
from judgeval import Tracer
from opentelemetry.instrumentation.litellm import LiteLLMInstrumentor
Tracer.init(project_name="litellm_project")
Tracer.registerOTELInstrumentation(LiteLLMInstrumentor())

LiteLLMInstrumentor monkeypatches litellm.completion and related functions so every call emits an OTEL span with the model name, prompt/completion tokens, and cost.
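To make the monkeypatching idea concrete, here is a minimal, standard-library-only sketch of how a function on a module can be swapped for a wrapper that records one span-like record per call. It is not the instrumentor's actual code: `fake_litellm`, `fake_completion`, `instrument`, and `recorded_spans` are hypothetical names invented for illustration, and a real instrumentor would emit OpenTelemetry spans rather than append dictionaries to a list.

```python
import functools
import time
from types import SimpleNamespace


def fake_completion(model, messages):
    """Hypothetical stand-in for litellm.completion; returns a canned response."""
    return {"model": model, "usage": {"prompt_tokens": 3, "completion_tokens": 5}}


# Stand-in for the litellm module so the sketch runs without network access.
fake_litellm = SimpleNamespace(completion=fake_completion)

# A real instrumentor would export OTEL spans; here we just collect dicts.
recorded_spans = []


def instrument(module, func_name):
    """Replace module.func_name with a wrapper that records one entry per call."""
    original = getattr(module, func_name)

    @functools.wraps(original)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        response = original(*args, **kwargs)
        recorded_spans.append({
            "name": func_name,
            "model": response["model"],
            "prompt_tokens": response["usage"]["prompt_tokens"],
            "completion_tokens": response["usage"]["completion_tokens"],
            "duration_s": time.perf_counter() - start,
        })
        return response

    # The patch itself: callers keep using module.func_name unchanged.
    setattr(module, func_name, wrapper)


instrument(fake_litellm, "completion")

# Application code is untouched, yet the call is now recorded.
fake_litellm.completion(model="gpt-4o", messages=[{"role": "user", "content": "Hi"}])
print(recorded_spans[0]["model"], recorded_spans[0]["prompt_tokens"])
```

The point of the pattern is that application code keeps calling the same function name, which is why no changes to existing litellm.completion call sites are needed once the instrumentor is registered.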
Use LiteLLM as Normal
import litellm
from judgeval import Tracer
from opentelemetry.instrumentation.litellm import LiteLLMInstrumentor
Tracer.init(project_name="litellm_project")
Tracer.registerOTELInstrumentation(LiteLLMInstrumentor())

response = litellm.completion(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Hello, world!"}]
)
print(response.choices[0].message.content)

All LiteLLM calls are automatically traced and exported to the Judgment platform.
Multi-Agent / Swarm Use Case
LiteLLM is commonly used inside multi-agent frameworks (e.g. swarm orchestrators) where each sub-agent calls litellm.completion directly. The instrumentor ensures that cost is attributed to the spans where inference actually happens — not the parent orchestrator.
import litellm
from judgeval import Tracer
from opentelemetry.instrumentation.litellm import LiteLLMInstrumentor
Tracer.init(project_name="swarm_project")
Tracer.registerOTELInstrumentation(LiteLLMInstrumentor())

@Tracer.observe(span_type="agent")
def data_worker(query: str) -> str:
    response = litellm.completion(
        model="gpt-4o",
        messages=[{"role": "user", "content": f"Process this data: {query}"}]
    )
    return response.choices[0].message.content

@Tracer.observe(span_type="agent")
def review_worker(data: str) -> str:
    response = litellm.completion(
        model="gpt-4o",
        messages=[{"role": "user", "content": f"Review this output: {data}"}]
    )
    return response.choices[0].message.content

@Tracer.observe(span_type="agent")
def orchestrator(query: str) -> str:
    data = data_worker(query)
    review = review_worker(data)
    return review

result = orchestrator("Analyze Q4 revenue trends")

In the Judgment trace view, each data_worker and review_worker span carries its own cost (from the underlying litellm.completion call), while the orchestrator span shows $0.00 since it makes no direct model calls.
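The attribution described above can be pictured with a small toy model of a span tree. This is only an illustrative sketch, not Judgment's data model: the `Span` class, the `own_cost_micro_usd` field, and the cost figures are all made up, and whether the trace view also shows a rolled-up total is not something the sketch asserts. Costs are kept as integer micro-dollars so the arithmetic is exact.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Span:
    """Hypothetical span node: a name, its directly attributed cost, and children."""
    name: str
    own_cost_micro_usd: int = 0  # cost of model calls made directly in this span
    children: List["Span"] = field(default_factory=list)

    def total_cost_micro_usd(self) -> int:
        """Own cost plus the cost of every descendant span."""
        return self.own_cost_micro_usd + sum(
            child.total_cost_micro_usd() for child in self.children
        )


# Made-up figures: each worker carries the cost of its own completion call.
data_worker = Span("data_worker", own_cost_micro_usd=1250)
review_worker = Span("review_worker", own_cost_micro_usd=750)

# The orchestrator makes no direct model calls, so its own cost stays at zero.
orchestrator = Span("orchestrator", children=[data_worker, review_worker])

for span in (orchestrator, data_worker, review_worker):
    print(span.name, span.own_cost_micro_usd, span.total_cost_micro_usd())
```

Keeping the directly attributed cost separate from any rolled-up total is what makes it possible to see which sub-agent is actually spending money, rather than having everything appear under the top-level span.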