MCP Server
Use Judgment's MCP server to query traces, behaviors, sessions, and more directly from your AI code editor
The Judgment MCP server exposes your production data (traces, sessions, behaviors, projects, and automations) to AI-powered code editors via the Model Context Protocol. Your AI assistant can query that data, analyze agent performance, and apply the findings to your code without leaving your editor.
Setup
Connect the MCP Server
In Cursor, add the following to your ~/.cursor/mcp.json (global) or .cursor/mcp.json (project-level):
{
  "mcpServers": {
    "judgment-mcp": {
      "url": "https://mcp.judgmentlabs.ai",
      "headers": {
        "Authorization": "Bearer <YOUR_JUDGMENT_API_KEY>",
        "X-Organization-Id": "<YOUR_ORGANIZATION_ID>"
      }
    }
  }
}

In Claude Code, run the following command to add the Judgment MCP server:
claude mcp add judgment-mcp \
  --transport http \
  --url https://mcp.judgmentlabs.ai \
  --header "Authorization: Bearer <YOUR_JUDGMENT_API_KEY>" \
  --header "X-Organization-Id: <YOUR_ORGANIZATION_ID>"

In Windsurf, add the following to your ~/.codeium/windsurf/mcp_config.json:
{
  "mcpServers": {
    "judgment-mcp": {
      "serverUrl": "https://mcp.judgmentlabs.ai",
      "headers": {
        "Authorization": "Bearer <YOUR_JUDGMENT_API_KEY>",
        "X-Organization-Id": "<YOUR_ORGANIZATION_ID>"
      }
    }
  }
}

Add the Best Practices Skill
To help your AI assistant use the MCP server effectively, add the Judgment MCP best practices skill. This teaches your assistant optimal patterns like batching queries, using full-text search first, and deduplicating results.
mkdir -p .cursor/skills/judgment-mcp
curl -fLo .cursor/skills/judgment-mcp/SKILL.md \
  https://docs.judgmentlabs.ai/skills/mcp-server-best-practices.md

mkdir -p .agents/skills/mcp-server-best-practices
curl -fLo .agents/skills/mcp-server-best-practices/SKILL.md \
  https://docs.judgmentlabs.ai/skills/mcp-server-best-practices.md

mkdir -p .windsurf/skills/judgment-mcp
curl -fLo .windsurf/skills/judgment-mcp/SKILL.md \
  https://docs.judgmentlabs.ai/skills/mcp-server-best-practices.md

What You Can Do
Once connected, your AI assistant can query Judgment data through natural language. Here are some examples:
- "Show me the slowest traces from the last 24 hours"
- "Find traces where users asked about billing"
- "What behaviors were detected in session X?"
- "List all automations configured for this project"
- "Show me traces with errors that cost more than $0.10"
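Behind a prompt like the first one, your editor issues an MCP tool call on your behalf. The Python sketch below shows roughly what such a request could look like. The `tools/call` method and the `name`/`arguments` envelope come from the Model Context Protocol, and the two headers match the setup above. The argument names (`queries`, `sort_by`, `time_range`, `limit`) are hypothetical placeholders: the real input schema is whatever the server advertises through `tools/list`.

```python
import json

MCP_URL = "https://mcp.judgmentlabs.ai"  # server URL from the setup section


def build_headers(api_key: str, organization_id: str) -> dict:
    """Return the two auth headers from the setup section, plus a JSON content type."""
    return {
        "Authorization": f"Bearer {api_key}",
        "X-Organization-Id": organization_id,
        "Content-Type": "application/json",
    }


def build_tool_call(request_id: int, tool_name: str, arguments: dict) -> dict:
    """Wrap a tool invocation in the JSON-RPC 2.0 envelope that MCP uses."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }


# Hypothetical arguments for "Show me the slowest traces from the last 24 hours".
# The field names inside `arguments` are assumptions, not the documented schema.
slowest_traces = build_tool_call(
    request_id=1,
    tool_name="search_traces",
    arguments={
        "queries": [{"sort_by": "duration", "time_range": "24h", "limit": 10}]
    },
)

if __name__ == "__main__":
    print(json.dumps(slowest_traces, indent=2))
```

The sketch only builds the message and sends nothing. A real client also performs the MCP `initialize` handshake before calling tools, which your editor handles for you.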
Use Production Data to Optimize Your Code
Beyond querying data, the MCP server enables a powerful workflow: use real production insights to improve your agent code. Your AI assistant can:
- Find failing patterns — search traces for errors, high latency, or unexpected behaviors, then fix the underlying code
- Analyze behavior trends — check which behaviors are firing most often and adjust your prompts or logic accordingly
- Optimize costs — identify expensive traces and refactor the agent flows that produce them
- Debug sessions — walk through an entire user session's traces to understand where things went wrong
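As an illustration of the cost-optimization step, here is a minimal Python sketch of the kind of aggregation your assistant might run over search results. It assumes each trace arrives as a dictionary with `trace_id`, `span_name`, `llm_cost`, and `error` keys. Those field names and the sample records are invented for the example and may not match the server's actual response shape.

```python
from collections import defaultdict


def rank_costly_flows(traces, min_cost=0.10):
    """Group traces costing more than min_cost by span name and rank by total cost."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for trace in traces:
        if trace["llm_cost"] > min_cost:
            totals[trace["span_name"]] += trace["llm_cost"]
            counts[trace["span_name"]] += 1
    # Most expensive flow first.
    ranked = sorted(totals, key=totals.get, reverse=True)
    return [(name, counts[name], round(totals[name], 4)) for name in ranked]


# Invented sample records (hypothetical field names and values).
sample = [
    {"trace_id": "t1", "span_name": "plan_trip", "llm_cost": 0.25, "error": False},
    {"trace_id": "t2", "span_name": "plan_trip", "llm_cost": 0.5, "error": True},
    {"trace_id": "t3", "span_name": "summarize", "llm_cost": 0.125, "error": False},
    {"trace_id": "t4", "span_name": "summarize", "llm_cost": 0.0625, "error": False},
]

if __name__ == "__main__":
    for name, count, total in rank_costly_flows(sample):
        print(f"{name}: {count} traces over threshold, ${total:.2f} total")
```

Ranking by total cost rather than per-trace cost points at the flows where a refactor is likely to save the most overall.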
Available Tools
The MCP server provides 16 tools organized into five categories.
Traces
| Tool | Description |
|---|---|
| search_traces | Batch search up to 10 queries per call. Filter by duration, error, span name, customer ID, session ID, tags, LLM cost, behaviors, scores, or full-text search. |
| get_trace_detail | Get duration, cost, and session info for a single trace. |
| get_trace_spans | Get all spans for a trace. |
| get_trace_span | Batch get span details (including scores and annotations) for up to 20 trace/span pairs. |
| get_trace_tags | Get tags for a trace. |
| get_trace_behaviors | Get behavior results (binary/categorical scores) for a trace. |
| get_span_feedback | Get annotation feedback for a specific span. |
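The per-call limits in this table (10 queries for search_traces, 20 trace/span pairs for get_trace_span) mean larger workloads have to be split into batches, and overlapping queries can return the same trace more than once. The following Python sketch shows one way to plan the batches and deduplicate results. The `queries` and `pairs` argument keys and the `trace_id` result field are assumptions made for illustration, not the documented schema.

```python
SEARCH_TRACES_MAX_QUERIES = 10  # per-call limit stated for search_traces
GET_TRACE_SPAN_MAX_PAIRS = 20   # per-call limit stated for get_trace_span


def chunked(items, size):
    """Yield consecutive slices of at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield items[start:start + size]


def plan_search_calls(queries):
    """Split queries into argument dicts, one per search_traces call."""
    return [{"queries": batch} for batch in chunked(queries, SEARCH_TRACES_MAX_QUERIES)]


def plan_span_calls(trace_span_pairs):
    """Split (trace_id, span_id) pairs into argument dicts for get_trace_span."""
    return [{"pairs": batch} for batch in chunked(trace_span_pairs, GET_TRACE_SPAN_MAX_PAIRS)]


def dedupe_by_trace_id(results):
    """Keep the first occurrence of each trace, preserving order."""
    seen = set()
    unique = []
    for result in results:
        if result["trace_id"] not in seen:
            seen.add(result["trace_id"])
            unique.append(result)
    return unique


if __name__ == "__main__":
    queries = [{"text": f"query {i}"} for i in range(23)]
    print([len(call["queries"]) for call in plan_search_calls(queries)])
```

Sending 23 queries this way takes three calls instead of 23, which is the batching pattern the best practices skill encourages.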
Sessions
| Tool | Description |
|---|---|
| search_sessions | Search and filter sessions by session ID, trace count, latency, total cost, or behaviors. Supports time ranges, sorting, and pagination. |
| get_session_detail | Get session timestamps, trace count, latency, cost, and token usage. |
| get_session_trace_ids | Get all trace IDs in a session. |
| get_session_trace_behaviors | Get behaviors detected across traces in a session, grouped by behavior. |
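Because search_sessions supports pagination, a client that wants every matching session has to request pages until the results run out. The Python sketch below shows that loop under several assumptions: `call_tool` is a hypothetical callable standing in for whatever MCP client you use, and the `limit`/`offset` arguments and the `sessions` response key are guesses, since the table does not specify how pagination is expressed.

```python
def iter_sessions(call_tool, filters=None, page_size=50):
    """Yield sessions page by page until the server returns a short page."""
    offset = 0
    while True:
        # `limit` and `offset` are assumed parameter names, not a documented schema.
        arguments = {**(filters or {}), "limit": page_size, "offset": offset}
        page = call_tool("search_sessions", arguments)
        sessions = page.get("sessions", [])
        yield from sessions
        if len(sessions) < page_size:
            break  # a short (or empty) page means there is nothing left
        offset += page_size


def make_fake_call_tool(all_sessions):
    """Build an in-memory stand-in for an MCP client, for local experimentation."""
    def call_tool(name, arguments):
        start = arguments["offset"]
        end = start + arguments["limit"]
        return {"sessions": all_sessions[start:end]}
    return call_tool


if __name__ == "__main__":
    fake = make_fake_call_tool([{"session_id": f"s{i}"} for i in range(5)])
    for session in iter_sessions(fake, page_size=2):
        print(session["session_id"])
```

Passing the client in as a parameter keeps the loop independent of any particular MCP library and lets you exercise it locally with the fake.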
Behaviors
| Tool | Description |
|---|---|
| list_behaviors | List all behaviors for the project with stats. |
| get_behavior_detail | Get full details for a behavior including scorer prompt, configuration, and stats. |
| get_judge_settings | Get advanced evaluation settings for a judge. |
Projects
| Tool | Description |
|---|---|
| list_projects | List all projects in your organization with summary stats (datasets, experiment runs, traces, behaviors). |
Automations
| Tool | Description |
|---|---|
| list_automations | List all automations (rules) with their conditions, actions, and active status. |