Performance Monitoring
Online Evals
Run real-time evaluations on your agents in production.
Quickstart
Online evals are exposed as a method on the Tracer class. They can be attached to any trace and are executed asynchronously, so they add no latency to your agent's response time.
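Before running the quickstart below, make sure your credentials are available to the SDK. Here is a minimal pre-flight check, assuming credentials are supplied through environment variables (JUDGMENT_API_KEY and JUDGMENT_ORG_ID are the names used in a typical setup, and OPENAI_API_KEY is read by the wrapped OpenAI client); check your Judgment dashboard if your configuration uses different names:

import os

# Assumption: credentials are supplied via environment variables.
# Adjust the names below if your deployment uses different ones.
for var in ("JUDGMENT_API_KEY", "JUDGMENT_ORG_ID", "OPENAI_API_KEY"):
    if not os.environ.get(var):
        raise RuntimeError(f"Missing environment variable: {var}")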
from judgeval.common.tracer import Tracer, wrap
from judgeval.scorers import AnswerRelevancyScorer
from judgeval.data import Example
from openai import OpenAI

# Wrap the OpenAI client so its calls are captured in the trace.
client = wrap(OpenAI())
judgment = Tracer(project_name="my_project")

@judgment.observe(span_type="tool")
def my_tool():
    return "Hello world!"

@judgment.observe(span_type="function")
def main():
    task_input = my_tool()
    res = client.chat.completions.create(
        model="gpt-4.1",
        messages=[{"role": "user", "content": f"{task_input}"}]
    ).choices[0].message.content

    # Attach an online eval to the current span; it runs asynchronously
    # and does not block the agent's response.
    judgment.async_evaluate(
        scorers=[AnswerRelevancyScorer(threshold=0.5)],
        example=Example(
            input=task_input,
            actual_output=res
        ),
        model="gpt-4.1"
    )
    return res

main()
You should see the online eval results attached to the relevant trace span on the Judgment platform shortly after the trace is recorded.
Evals take time to execute, so results may show up on the UI with a slight delay. Once an eval completes, it appears attached to the corresponding trace span.
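async_evaluate is not limited to a single scorer. The sketch below extends the body of the quickstart's main function with a second scorer and a richer Example; it reuses judgment, task_input, and res from the quickstart, and FaithfulnessScorer and the retrieval_context field are assumptions based on judgeval's other scorers, so the exact names may differ in your installed version:

from judgeval.scorers import AnswerRelevancyScorer, FaithfulnessScorer  # FaithfulnessScorer: assumed scorer name
from judgeval.data import Example

# Attach several online evals to the current span in one call.
# retrieval_context is assumed to be an accepted Example field; adapt to your SDK version.
judgment.async_evaluate(
    scorers=[
        AnswerRelevancyScorer(threshold=0.5),
        FaithfulnessScorer(threshold=0.7),
    ],
    example=Example(
        input=task_input,
        actual_output=res,
        retrieval_context=["Hello world!"],
    ),
    model="gpt-4.1",
)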
