TypeScript

PromptScorer

Evaluate agent behavior against a rubric that you define and iterate on in the Judgment platform.

A PromptScorer is a powerful tool for evaluating your agent's behavior in production with use-case-specific, natural-language rubrics.

import { Judgeval } from "judgeval";

const client = Judgeval.create();

const tracer = await client.nodeTracer.create({
  projectName: "qa_assistant",
  enableEvaluation: true,
});

const scorer = await client.scorers.tracePromptScorer.get(
  "QA Answer Quality Scorer",
);

const processQuery = tracer.observe(async function (query: string) {
  // generateResponse stands in for your own application logic (e.g. an LLM call).
  const result = await generateResponse(query);

  // Queue an asynchronous evaluation of this trace with the fetched scorer.
  tracer.asyncTraceEvaluate(scorer, "gpt-4");

  return result;
});

All scorer changes automatically sync with the Judgment platform.


client.scorers.promptScorer.get() | client.scorers.tracePromptScorer.get()

Fetches a Prompt Scorer or Trace Prompt Scorer configuration from the Judgment platform.

async get(name: string): Promise<PromptScorer>

Parameters

name (required): string

The name of the PromptScorer you would like to retrieve from the platform.

Returns

Promise<PromptScorer> - The fetched PromptScorer instance

Throws

  • Error if the scorer is a TracePromptScorer instead of a PromptScorer
  • Error if the scorer is not found or API call fails

Example

Prompt Scorer:

import { Judgeval } from "judgeval";

const client = Judgeval.create();

const scorer = await client.scorers.promptScorer.get("My Prompt Scorer");

Trace Prompt Scorer:

import { Judgeval } from "judgeval";

const client = Judgeval.create();

const scorer = await client.scorers.tracePromptScorer.get("My Trace Prompt Scorer");
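
If the scorer might not exist yet, or might have been created as the other scorer type, you can wrap the call in a try/catch. A minimal sketch, where the scorer name and the fallback behavior are illustrative:

import { Judgeval } from "judgeval";

const client = Judgeval.create();

try {
  const scorer = await client.scorers.promptScorer.get("My Prompt Scorer");
  // ... use the scorer
} catch (error) {
  // Thrown when the scorer is missing, is a TracePromptScorer, or the API call fails.
  console.error("Could not fetch scorer:", error);
}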

client.scorers.promptScorer.create() | client.scorers.tracePromptScorer.create()

Creates a new PromptScorer or Trace Prompt Scorer with custom configuration.

create(config: PromptScorerConfig): PromptScorer

Parameters

config (required): PromptScorerConfig

Configuration object for the PromptScorer.

PromptScorerConfig

name (required): string

The name of the PromptScorer.

prompt (required): string

The prompt used by the LLM judge to make an evaluation.

threshold: number

Threshold value for success (typically 0-1).

Default: 0.5

model: string

Model to use for scoring.

Default: "gpt-4o-mini"

options: Record<string, number>

If specified, the LLM judge will pick one of the choices, and the score will be the value corresponding to the chosen option.

description: string

Description of the scorer.

Default: ""

Returns

PromptScorer - The created PromptScorer instance

Throws

  • Error if name or prompt is not provided

Example

Prompt Scorer:

import { Judgeval } from "judgeval";

const client = Judgeval.create();

const scorer = client.scorers.promptScorer.create({
  name: "Rhyme Scorer",
  prompt: "Evaluate whether the two inputs rhyme: {{word_1}}, {{word_2}}",
  threshold: 0.5,
  model: "gpt-5",
  options: {
    "does not rhyme": 0,
    "nearly rhymes": 0.75,
    "rhymes": 1,
  },
  description: "Evaluates whether the two words rhyme",
});

Trace Prompt Scorer:

import { Judgeval } from "judgeval";

const client = Judgeval.create();

const scorer = client.scorers.tracePromptScorer.create({
  name: "Workflow Coherence",
  prompt: `
    Evaluate the coherence of this multi-step workflow.
    Consider: logical flow, error handling, and completeness.
  `,
  threshold: 0.7,
  model: "gpt-4",
  options: {
    "poor coherence": 0,
    "acceptable coherence": 0.5,
    "good coherence": 0.8,
    "excellent coherence": 1,
  },
  description: "Evaluates overall workflow execution quality",
});
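
A created Trace Prompt Scorer is attached to a traced function in the same way as a fetched one. A minimal sketch, assuming a tracer created with enableEvaluation: true as in the example at the top of this page; handleRequest and runWorkflow are illustrative names for your own code:

const handleRequest = tracer.observe(async function (input: string) {
  // runWorkflow stands in for your own multi-step agent logic.
  const output = await runWorkflow(input);

  // Queue an asynchronous evaluation of this trace with the "Workflow Coherence" scorer.
  tracer.asyncTraceEvaluate(scorer, "gpt-4");

  return output;
});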

Complete Usage Example

Here's a complete example showing how to use scorers with tracing and evaluation:

complete_example.ts
import { Judgeval } from "judgeval";
import { OpenAIInstrumentation } from "@opentelemetry/instrumentation-openai";
import OpenAI from "openai";

const client = Judgeval.create();

const tracer = await client.nodeTracer.create({
  projectName: "research_assistant",
  enableEvaluation: true,
  enableMonitoring: true,
  instrumentations: [new OpenAIInstrumentation()],
  resourceAttributes: {
    "service.name": "research-api",
    "service.version": "1.0.0",
  },
});

const scorer = await client.scorers.tracePromptScorer.get("answer-quality");

const openai = new OpenAI({
  apiKey: process.env.OPENAI_API_KEY,
});

const processQuery = tracer.observe(async function (query: string) {
  const response = await openai.chat.completions.create({
    model: "gpt-4o-mini",
    messages: [{ role: "user", content: query }],
  });

  const result = response.choices[0].message.content || "";

  // Queue an asynchronous evaluation of this trace with the fetched scorer.
  tracer.asyncTraceEvaluate(scorer, "gpt-4");

  return result;
}, "llm");

const result = await processQuery("What is machine learning?");
console.log(result);

await tracer.shutdown();