Behaviors
View and manage behaviors (the binary or categorical labels judges assign).
Commands
| Command | Description |
|---|---|
| behaviors create-binary | Create a binary (yes/no) behavior. |
| behaviors create-classifier | Create a classifier (multi-label) behavior. |
| behaviors delete | Delete a behavior. |
| behaviors get | Get a behavior with judge details and stats. |
| behaviors list | List behaviors. |
| behaviors update | Update a behavior’s description. |
behaviors create-binary
Create a binary (yes/no) behavior. The judge LLM uses your prompt to decide true/false on each qualifying span.
judgment behaviors create-binary [OPTIONS] <PROJECT_ID> <NAME> <PROMPT>
Arguments
| Name | Required |
|---|---|
| PROJECT_ID | yes |
| NAME | yes |
| PROMPT | yes |
Options
| Flag | Type | Required | Description |
|---|---|---|---|
| --description | text | no | Human-readable description shown in the UI. |
| --model | text | no | LLM model ID used by the judge prompt. Defaults to "gpt-5.2" when omitted. |
| --category-ids | text | no | UUIDs of categories to attach the behavior to. Pass an array of category UUIDs. |
| --advanced-settings | text | no | JSON object overriding the judge's online-evaluation configuration. All four fields are required when this is supplied. For online_evaluation_mode, continuous runs the judge automatically on qualifying spans, while on_demand requires a manual judgment traces evaluate call. online_sampling_rate is the percentage (0–100) of matching spans to score. For the full shape, see the --advanced-settings shape section. |
| --judge-id | text | no | Attach the new behavior to an existing judge instead of creating one. The judge must be score_type=binary and have no existing behaviors. |
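Because all four fields are required whenever --advanced-settings is supplied, it can help to check the JSON before calling the CLI. The following is a minimal Python sketch under stated assumptions, not part of the CLI itself: the helper names validate_advanced_settings and build_create_binary_argv, the project ID, the behavior name, and the prompt are all hypothetical. It checks only what this page documents (the four fields, the two modes, the 0–100 range, and the allowed field and operator values) and assembles the argument list without running the command.

```python
import json
import shlex

REQUIRED_FIELDS = {
    "online_evaluation_mode",
    "online_sampling_rate",
    "online_span_triggers",
    "online_session_scoring",
}
MODES = {"continuous", "on_demand"}
TRIGGER_FIELDS = {"span_name", "span_attribute"}
TRIGGER_OPERATORS = {"contains", "equals", "exists"}


def validate_advanced_settings(settings):
    """Raise ValueError if settings does not match the documented shape."""
    missing = REQUIRED_FIELDS - set(settings)
    if missing:
        raise ValueError(f"missing fields: {sorted(missing)}")
    if settings["online_evaluation_mode"] not in MODES:
        raise ValueError("online_evaluation_mode must be continuous or on_demand")
    rate = settings["online_sampling_rate"]
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(rate, bool) or not isinstance(rate, (int, float)):
        raise ValueError("online_sampling_rate must be a number")
    if not 0 <= rate <= 100:
        raise ValueError("online_sampling_rate must be between 0 and 100")
    triggers = settings["online_span_triggers"]
    if not isinstance(triggers, list):
        raise ValueError("online_span_triggers must be a list")
    for trigger in triggers:
        if trigger.get("field") not in TRIGGER_FIELDS:
            raise ValueError("trigger field must be span_name or span_attribute")
        if trigger.get("operator") not in TRIGGER_OPERATORS:
            raise ValueError("trigger operator must be contains, equals, or exists")
    if not isinstance(settings["online_session_scoring"], bool):
        raise ValueError("online_session_scoring must be a boolean")


def build_create_binary_argv(project_id, name, prompt, settings):
    """Return the argument list for create-binary, options before positionals."""
    validate_advanced_settings(settings)
    return [
        "judgment", "behaviors", "create-binary",
        "--advanced-settings", json.dumps(settings),
        project_id, name, prompt,
    ]


# Hypothetical values, for illustration only.
settings = {
    "online_evaluation_mode": "continuous",
    "online_sampling_rate": 25,
    "online_span_triggers": [
        {"field": "span_name", "operator": "contains", "value": "chat"}
    ],
    "online_session_scoring": False,
}
argv = build_create_binary_argv(
    "my-project-id", "refusal", "Did the assistant refuse the request?", settings
)
# shlex.join quotes the JSON so the line can be pasted into a POSIX shell.
print(shlex.join(argv))
```

Passing the list to subprocess.run would avoid shell quoting entirely; printing it first makes the JSON easy to review.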
--advanced-settings shape
{
  "online_evaluation_mode": "continuous" | "on_demand",
  "online_sampling_rate": <number 0-100>,
  "online_span_triggers": [
    {"field":"span_name"|"span_attribute","operator":"contains"|"equals"|"exists","value":"<string>","key":"<attr-key>"?}
  ],
  "online_session_scoring": <bool>
}
behaviors create-classifier
Create a classifier (multi-label) behavior. The judge LLM picks one of the supplied options for each qualifying span.
judgment behaviors create-classifier [OPTIONS] <PROJECT_ID> <NAME> <PROMPT>
Arguments
| Name | Required |
|---|---|
| PROJECT_ID | yes |
| NAME | yes |
| PROMPT | yes |
Options
| Flag | Type | Required | Description |
|---|---|---|---|
| --options | text | yes | JSON array of the allowed output categories the classifier judge can return. Must contain at least one option. For the full shape, see the --options shape section. |
| --model | text | no | LLM model ID used by the judge prompt. Defaults to "gpt-5.2" when omitted. |
| --category-ids | text | no | UUIDs of categories to attach the behavior to. Pass an array of category UUIDs. |
| --advanced-settings | text | no | JSON object overriding the judge's online-evaluation configuration. All four fields are required when this is supplied. For online_evaluation_mode, continuous runs the judge automatically on qualifying spans, while on_demand requires a manual judgment traces evaluate call. online_sampling_rate is the percentage (0–100) of matching spans to score. For the full shape, see the --advanced-settings shape section. |
| --judge-id | text | no | Attach the new behavior to an existing judge instead of creating one. The judge must be score_type=categorical and have no existing behaviors. |
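The --options value is a JSON array with at least one entry, so building it programmatically avoids hand-quoting mistakes. The following is a rough Python sketch rather than an official client: make_option and build_create_classifier_argv are hypothetical helper names, and the project ID, behavior name, prompt, and labels are made up. It assumes that category_ids, like description, may be left out of an option, which this page does not state explicitly. The command is only assembled, not executed.

```python
import json
import shlex
import uuid


def make_option(name, description=None, category_ids=None):
    """Build one entry of the --options array."""
    if not name:
        raise ValueError("each option needs a non-empty name")
    option = {"name": name}
    if description is not None:
        option["description"] = description
    if category_ids is not None:
        # uuid.UUID raises ValueError for strings that are not valid UUIDs.
        option["category_ids"] = [str(uuid.UUID(cid)) for cid in category_ids]
    return option


def build_create_classifier_argv(project_id, name, prompt, options):
    """Return the argument list for create-classifier."""
    if len(options) < 1:
        raise ValueError("--options must contain at least one option")
    return [
        "judgment", "behaviors", "create-classifier",
        "--options", json.dumps(options),
        project_id, name, prompt,
    ]


# Hypothetical labels, for illustration only.
options = [
    make_option("positive", "The user sounds satisfied."),
    make_option("neutral"),
    make_option("negative", "The user sounds frustrated."),
]
argv = build_create_classifier_argv(
    "my-project-id",
    "sentiment",
    "Classify the user's sentiment in this span.",
    options,
)
print(shlex.join(argv))
```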
--options shape
[
  {"name":"<label>", "description":"<optional human description>", "category_ids":["<uuid>", ...]},
  ...
]
--advanced-settings shape
{
  "online_evaluation_mode": "continuous" | "on_demand",
  "online_sampling_rate": <number 0-100>,
  "online_span_triggers": [
    {"field":"span_name"|"span_attribute","operator":"contains"|"equals"|"exists","value":"<string>","key":"<attr-key>"?}
  ],
  "online_session_scoring": <bool>
}
behaviors delete
Delete a behavior.
judgment behaviors delete [OPTIONS] <PROJECT_ID> <BEHAVIOR_ID>
Arguments
| Name | Required |
|---|---|
| PROJECT_ID | yes |
| BEHAVIOR_ID | yes |
Options
| Flag | Type | Required | Description |
|---|---|---|---|
| --delete-scorer | boolean | no | When true, also delete the underlying prompt scorer if no other behaviors reference it. |
| --delete-all-values | boolean | no | For classifier behaviors, when true deletes every category row for this judge (not just the provided behavior_id). Ignored for binary behaviors. |
behaviors get
Get a behavior with judge details and stats.
judgment behaviors get [OPTIONS] <PROJECT_ID> <BEHAVIOR_ID>
Arguments
| Name | Required |
|---|---|
| PROJECT_ID | yes |
| BEHAVIOR_ID | yes |
Options
| Flag | Type | Required | Description |
|---|---|---|---|
| --start-date | text | no | Optional ISO 8601 start date for stats. |
| --end-date | text | no | Optional ISO 8601 end date for stats. |
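Both date options take ISO 8601 strings, which Python's datetime.isoformat() produces directly. The sketch below builds a seven-day stats window and assembles the command without running it. It rests on assumptions: build_get_argv is a hypothetical helper, the IDs are placeholders, and this page does not say whether the CLI expects a full timestamp with a UTC offset or a date-only value, so adjust the formatting if the CLI rejects it.

```python
import shlex
from datetime import datetime, timedelta, timezone


def build_get_argv(project_id, behavior_id, start, end):
    """Return the argument list for `behaviors get` over a stats window."""
    if start > end:
        raise ValueError("start must not be after end")
    return [
        "judgment", "behaviors", "get",
        "--start-date", start.isoformat(),
        "--end-date", end.isoformat(),
        project_id, behavior_id,
    ]


# A fixed, timezone-aware end date keeps the example reproducible.
end = datetime(2025, 1, 8, tzinfo=timezone.utc)
start = end - timedelta(days=7)
argv = build_get_argv("my-project-id", "my-behavior-id", start, end)
print(shlex.join(argv))
```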
behaviors list
List every behavior in a project along with rolled-up trace counts and last-seen stats.
judgment behaviors list <PROJECT_ID>
Arguments
| Name | Required |
|---|---|
| PROJECT_ID | yes |
behaviors update
Update a behavior’s description.
judgment behaviors update [OPTIONS] <PROJECT_ID> <BEHAVIOR_ID>
Arguments
| Name | Required |
|---|---|
| PROJECT_ID | yes |
| BEHAVIOR_ID | yes |
Options
| Flag | Type | Required | Description |
|---|---|---|---|
| --description | text | no | New human-readable description for the behavior. |