OfflineTestRunner

Executes the offline-test lifecycle for a test config (the TypeScript port of the Python `OfflineTestRunner`): resolve the dataset version, optionally run the agent to produce offline traces, create the test run, wait for terminal status, fetch results, evaluate the pass condition, and report successes.
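The lifecycle steps listed above can be sketched roughly as follows. This is only an illustration of the ordering, not the runner's actual implementation: `LifecycleClient`, `runLifecycle`, the `RunStatus` values, the polling interval, and the pass condition (mean score at or above a threshold) are all assumptions made for the example.

```typescript
// Hypothetical shapes: none of these names come from the real SDK.
type RunStatus = "pending" | "running" | "completed" | "failed";

interface LifecycleClient {
  resolveDatasetVersion(datasetId: string): Promise<string>;
  createTestRun(datasetVersion: string, traceIds: Record<string, string>): Promise<string>;
  getStatus(runId: string): Promise<RunStatus>;
  fetchResults(runId: string): Promise<number[]>; // assumed: one score per example
}

async function runLifecycle(
  client: LifecycleClient,
  datasetId: string,
  passThreshold: number,
  produceTraces?: () => Promise<Record<string, string>>,
  pollMs: number = 10,
): Promise<boolean> {
  // 1. Resolve the dataset version.
  const version = await client.resolveDatasetVersion(datasetId);

  // 2. Optionally run the agent to produce offline traces.
  const traceIds = produceTraces ? await produceTraces() : {};

  // 3. Create the test run.
  const runId = await client.createTestRun(version, traceIds);

  // 4. Poll until the run reaches a terminal status.
  let status = await client.getStatus(runId);
  while (status === "pending" || status === "running") {
    await new Promise<void>((resolve) => setTimeout(resolve, pollMs));
    status = await client.getStatus(runId);
  }
  if (status === "failed") {
    return false;
  }

  // 5. Fetch results and evaluate the (assumed) pass condition.
  const scores = await client.fetchResults(runId);
  const mean =
    scores.length === 0 ? 0 : scores.reduce((a, b) => a + b, 0) / scores.length;

  // 6. Report the outcome to the caller.
  return mean >= passThreshold;
}
```

Passing the client in as an interface keeps the sketch testable without a live backend; a real runner would likely also bound the polling loop with a timeout.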

runAgent()

Run the agent once per dataset example, producing one offline trace each.

NOTE: the offline-tracer lifecycle here (active-tracer swap, async observe, per-example trace attribution) still needs validation against a live run.

async function runAgent(agentFunction: AgentFunction, examples: ExampleRow[]): Promise<Record<string, string>>

Parameters

agentFunction (required): AgentFunction

examples (required): ExampleRow[]

Returns

Promise<Record<string, string>>
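As a rough illustration of the contract described above (one agent call and one trace per example), the sketch below uses a stand-in, `runAgentSketch`, rather than the real method. The `ExampleRow` and `AgentFunction` shapes, the `trace-N` identifiers, and the reading of the returned record as a map from example id to trace id are all assumptions; this reference does not define them.

```typescript
// Hypothetical shapes: the real AgentFunction and ExampleRow types are not
// shown in this reference.
type ExampleRow = { id: string; input: string };
type AgentFunction = (input: string) => Promise<string>;

// Stand-in for the documented method: runs the agent once per example and
// records one (made-up) trace id per example id.
async function runAgentSketch(
  agentFunction: AgentFunction,
  examples: ExampleRow[],
): Promise<Record<string, string>> {
  const traceIds: Record<string, string> = {};
  let index = 0;
  // Examples run sequentially so each call maps to exactly one trace.
  for (const example of examples) {
    await agentFunction(example.input);
    traceIds[example.id] = `trace-${index}`;
    index += 1;
  }
  return traceIds;
}

// Usage: a trivial agent that echoes its input in upper case.
const echoAgent: AgentFunction = async (input) => input.toUpperCase();

runAgentSketch(echoAgent, [
  { id: "ex-1", input: "hello" },
  { id: "ex-2", input: "world" },
]).then((traceIds) => {
  console.log(Object.keys(traceIds).length); // one entry per example
});
```

Running the examples one at a time is a deliberate choice in this sketch: it sidesteps the per-example trace attribution question raised in the note above, at the cost of throughput.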
