MCP Server

Use Judgment's MCP server to query traces, behaviors, sessions, and more directly from your AI code editor

The Judgment MCP server exposes your production data — traces, sessions, behaviors, projects, and automations — directly to AI-powered code editors via the Model Context Protocol. This lets your AI assistant query real production data, analyze agent performance, and use those insights to optimize your code — all without leaving your editor.

Setup

Connect the MCP Server

For Cursor, add the following to ~/.cursor/mcp.json (global) or .cursor/mcp.json (project-level):

{
  "mcpServers": {
    "judgment-mcp": {
      "url": "https://mcp.judgmentlabs.ai",
      "headers": {
        "Authorization": "Bearer <YOUR_JUDGMENT_API_KEY>",
        "X-Organization-Id": "<YOUR_ORGANIZATION_ID>"
      }
    }
  }
}

For Claude Code, run the following command to add the Judgment MCP server:

claude mcp add --transport http judgment-mcp https://mcp.judgmentlabs.ai \
  --header "Authorization: Bearer <YOUR_JUDGMENT_API_KEY>" \
  --header "X-Organization-Id: <YOUR_ORGANIZATION_ID>"

For Windsurf, add the following to ~/.codeium/windsurf/mcp_config.json:

{
  "mcpServers": {
    "judgment-mcp": {
      "serverUrl": "https://mcp.judgmentlabs.ai",
      "headers": {
        "Authorization": "Bearer <YOUR_JUDGMENT_API_KEY>",
        "X-Organization-Id": "<YOUR_ORGANIZATION_ID>"
      }
    }
  }
}
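The Cursor and Windsurf snippets above differ only in the key that holds the server URL (url versus serverUrl), so both files can be generated from one helper. The following is a minimal Python sketch under that assumption; the function name build_mcp_config is illustrative and not part of any Judgment tooling.

```python
import json

MCP_URL = "https://mcp.judgmentlabs.ai"

# Key that holds the server URL in each editor's config file,
# taken from the two JSON snippets above.
URL_KEY = {"cursor": "url", "windsurf": "serverUrl"}


def build_mcp_config(editor: str, api_key: str, organization_id: str) -> dict:
    """Return the mcpServers document for the given editor (hypothetical helper)."""
    if editor not in URL_KEY:
        raise ValueError(f"unsupported editor: {editor!r}")
    return {
        "mcpServers": {
            "judgment-mcp": {
                URL_KEY[editor]: MCP_URL,
                "headers": {
                    "Authorization": f"Bearer {api_key}",
                    "X-Organization-Id": organization_id,
                },
            }
        }
    }


if __name__ == "__main__":
    # Placeholders stand in for real credentials; avoid committing real keys.
    config = build_mcp_config(
        "windsurf", "<YOUR_JUDGMENT_API_KEY>", "<YOUR_ORGANIZATION_ID>"
    )
    print(json.dumps(config, indent=2))
```

If the target file already defines other servers, merge the judgment-mcp entry into the existing mcpServers object rather than overwriting the file.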

Add the Best Practices Skill

To help your AI assistant use the MCP server effectively, add the Judgment MCP best practices skill. This teaches your assistant optimal patterns like batching queries, using full-text search first, and deduplicating results.

# Cursor
mkdir -p .cursor/skills/judgment-mcp
curl -fLo .cursor/skills/judgment-mcp/SKILL.md \
  https://docs.judgmentlabs.ai/skills/mcp-server-best-practices.md

# Agents that load skills from .agents/skills
mkdir -p .agents/skills/mcp-server-best-practices
curl -fLo .agents/skills/mcp-server-best-practices/SKILL.md \
  https://docs.judgmentlabs.ai/skills/mcp-server-best-practices.md

# Windsurf
mkdir -p .windsurf/skills/judgment-mcp
curl -fLo .windsurf/skills/judgment-mcp/SKILL.md \
  https://docs.judgmentlabs.ai/skills/mcp-server-best-practices.md

What You Can Do

Once connected, your AI assistant can query Judgment data through natural language. Here are some examples:

  • "Show me the slowest traces from the last 24 hours"
  • "Find traces where users asked about billing"
  • "What behaviors were detected in session X?"
  • "List all automations configured for this project"
  • "Show me traces with errors that cost more than $0.10"

Use Production Data to Optimize Your Code

Beyond querying data, the MCP server enables a powerful workflow: use real production insights to improve your agent code. Your AI assistant can:

  1. Find failing patterns — search traces for errors, high latency, or unexpected behaviors, then fix the underlying code
  2. Analyze behavior trends — check which behaviors are firing most often and adjust your prompts or logic accordingly
  3. Optimize costs — identify expensive traces and refactor the agent flows that produce them
  4. Debug sessions — walk through an entire user session's traces to understand where things went wrong
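The cost-optimization step can be sketched offline once trace data is in hand. The Python example below assumes each trace has been reduced to a dict with the hypothetical keys span_name and llm_cost; the actual shape returned by the MCP tools may differ.

```python
from collections import defaultdict


def rank_flows_by_cost(traces: list[dict]) -> list[tuple[str, float]]:
    """Sum LLM cost per span name and return the most expensive flows first."""
    totals: dict[str, float] = defaultdict(float)
    for trace in traces:
        # "span_name" and "llm_cost" are assumed field names.
        totals[trace["span_name"]] += trace["llm_cost"]
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # Hypothetical sample data, not real production traces.
    sample = [
        {"span_name": "billing_agent", "llm_cost": 0.12},
        {"span_name": "search_agent", "llm_cost": 0.03},
        {"span_name": "billing_agent", "llm_cost": 0.15},
    ]
    for name, cost in rank_flows_by_cost(sample):
        print(f"{name}: ${cost:.2f}")
```

Flows at the top of the ranking are the first candidates for refactoring.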

Available Tools

The MCP server provides 16 tools organized into five categories.

Traces

Tool                 Description
search_traces        Batch search up to 10 queries per call. Filter by duration, error, span name, customer ID, session ID, tags, LLM cost, behaviors, scores, or full-text search.
get_trace_detail     Get duration, cost, and session info for a single trace.
get_trace_spans      Get all spans for a trace.
get_trace_span       Batch get span details (including scores and annotations) for up to 20 trace/span pairs.
get_trace_tags       Get tags for a trace.
get_trace_behaviors  Get behavior results (binary/categorical scores) for a trace.
get_span_feedback    Get annotation feedback for a specific span.
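Because search_traces accepts at most 10 queries per call, a client holding a longer list has to split it into batches. The Python sketch below shows that splitting and wraps each batch in the JSON-RPC tools/call request shape that MCP defines. The argument key queries and the contents of each query are assumptions, since the tool's input schema is not shown here.

```python
from itertools import count

SEARCH_TRACES_BATCH_LIMIT = 10  # per-call limit stated in the Traces table


def chunked(items: list, size: int) -> list[list]:
    """Split items into consecutive lists of at most `size` elements."""
    return [items[i:i + size] for i in range(0, len(items), size)]


def build_search_requests(queries: list[dict]) -> list[dict]:
    """Build one MCP tools/call request per batch of up to 10 queries."""
    ids = count(1)
    return [
        {
            "jsonrpc": "2.0",
            "id": next(ids),
            "method": "tools/call",
            "params": {
                "name": "search_traces",
                # "queries" is an assumed argument name.
                "arguments": {"queries": batch},
            },
        }
        for batch in chunked(queries, SEARCH_TRACES_BATCH_LIMIT)
    ]
```

The same chunked helper applies to get_trace_span with a size of 20. In an editor, the assistant issues these calls itself; the sketch only makes the batching limits concrete.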

Sessions

Tool                         Description
search_sessions              Search and filter sessions by session ID, trace count, latency, total cost, or behaviors. Supports time ranges, sorting, and pagination.
get_session_detail           Get session timestamps, trace count, latency, cost, and token usage.
get_session_trace_ids        Get all trace IDs in a session.
get_session_trace_behaviors  Get behaviors detected across traces in a session, grouped by behavior.
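Debugging a session usually chains these tools: list the session's trace IDs, then inspect each trace. The Python sketch below finds the slowest trace in a session. It assumes a call_tool(name, arguments) callable supplied by whatever MCP client you use, and the argument names (session_id, trace_id) and result keys (trace_ids, duration) are hypothetical; check the tool schemas your client reports before relying on them.

```python
from typing import Callable, Optional

# Any function that sends a tool call and returns the parsed result.
ToolCaller = Callable[[str, dict], dict]


def slowest_trace(call_tool: ToolCaller, session_id: str) -> Optional[str]:
    """Return the ID of the longest-running trace in a session, or None if empty."""
    # "session_id" and "trace_ids" are assumed names.
    result = call_tool("get_session_trace_ids", {"session_id": session_id})
    trace_ids = result["trace_ids"]
    if not trace_ids:
        return None
    durations = {
        # "trace_id" and "duration" are assumed names.
        trace_id: call_tool("get_trace_detail", {"trace_id": trace_id})["duration"]
        for trace_id in trace_ids
    }
    return max(durations, key=durations.get)
```

Passing the caller in as a parameter keeps the logic independent of any particular MCP client and makes it easy to exercise with a stub.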

Behaviors

Tool                 Description
list_behaviors       List all behaviors for the project with stats.
get_behavior_detail  Get full details for a behavior including scorer prompt, configuration, and stats.
get_judge_settings   Get advanced evaluation settings for a judge.

Projects

Tool           Description
list_projects  List all projects in your organization with summary stats (datasets, experiment runs, traces, behaviors).

Automations

Tool              Description
list_automations  List all automations (rules) with their conditions, actions, and active status.