judgment behaviors CLI Reference

The Judgment CLI is deprecated and no longer under active development. This reference is retained for existing users. See the CLI deprecation notice for supported alternatives.

View and manage behaviors (the binary or categorical labels judges assign).

Commands

Command	Description
`behaviors create-binary`	Create a binary (yes/no) behavior.
`behaviors create-classifier`	Create a classifier (multi-label) behavior.
`behaviors delete`	Delete a behavior.
`behaviors get`	Get a behavior with judge details and stats.
`behaviors list`	List behaviors.
`behaviors update`	Update a behavior’s description.

`behaviors create-binary`

Create a binary (yes/no) behavior. The judge LLM uses your prompt to decide true or false on each qualifying span.

judgment behaviors create-binary [OPTIONS] [[[ORG_ID] PROJECT_ID] NAME PROMPT...]

Arguments

Name	Required
`[[ORG_ID] PROJECT_ID] NAME PROMPT`	no

Options

Flag	Type	Required	Description
`--organization-id`, `--org-id`	text	no	Organization ID. Defaults to JUDGMENT_ORG_ID or saved context.
`--organization`, `--org`	text	no	Organization name to resolve.
`--project-id`	text	no	Project ID. Defaults to JUDGMENT_PROJECT_ID or saved context.
`--project`	text	no	Project name to resolve.
`--description`	text	no	Human-readable description shown in the UI.
`--model`	text	no	LLM model ID used by the judge prompt. Defaults to "gpt-5.3-codex" when omitted.
`--category-ids`	text	no	UUIDs of categories to attach the behavior to. Pass an array of category UUIDs.
`--advanced-settings`	text	no	JSON object overriding the judge's online-evaluation configuration. All four fields are required when this is supplied; see the --advanced-settings shape below. For `online_evaluation_mode`, `continuous` runs the judge automatically on qualifying spans and `on_demand` requires a manual `judgment traces evaluate` call. `online_sampling_rate` is the percentage (0–100) of matching spans to score.
`--judge-id`	text	no	Attach the new behavior to an existing judge instead of creating one. The judge must be `score_type=binary` and have no existing behaviors.
`-o`, `--output`	`yaml`, `json`	no	Output format.

--advanced-settings shape

{
  "online_evaluation_mode": "continuous" | "on_demand",
  "online_sampling_rate": <number 0-100>,
  "online_span_triggers": [
    {"field":"span_name"|"span_attribute","operator":"contains"|"equals"|"exists","value":"<string>","key":"<attr-key>"?}
  ],
  "online_session_scoring": <bool>
}
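The shape above can be checked locally before it is passed to the CLI. The following is a minimal Python sketch, not part of the Judgment CLI: `check_advanced_settings` is a hypothetical helper that only mirrors the four fields and the enumerated values listed above, and the CLI's own validation may differ. The behavior name, prompt, and trigger value are placeholders, and the project is assumed to come from JUDGMENT_PROJECT_ID or saved context.

```python
import json
import shlex

# Field names and allowed values are copied from the --advanced-settings shape.
REQUIRED_FIELDS = {
    "online_evaluation_mode",
    "online_sampling_rate",
    "online_span_triggers",
    "online_session_scoring",
}
MODES = {"continuous", "on_demand"}
TRIGGER_FIELDS = {"span_name", "span_attribute"}
TRIGGER_OPERATORS = {"contains", "equals", "exists"}


def check_advanced_settings(settings):
    """Hypothetical helper: return a list of problems (empty if none found)."""
    problems = []
    missing = REQUIRED_FIELDS - settings.keys()
    if missing:
        problems.append(f"missing fields: {sorted(missing)}")
    if settings.get("online_evaluation_mode") not in MODES:
        problems.append("online_evaluation_mode must be 'continuous' or 'on_demand'")
    rate = settings.get("online_sampling_rate")
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(rate, bool) or not isinstance(rate, (int, float)) or not 0 <= rate <= 100:
        problems.append("online_sampling_rate must be a number from 0 to 100")
    triggers = settings.get("online_span_triggers")
    if not isinstance(triggers, list):
        problems.append("online_span_triggers must be a list")
    else:
        for i, trigger in enumerate(triggers):
            if trigger.get("field") not in TRIGGER_FIELDS:
                problems.append(f"trigger {i}: unknown field")
            if trigger.get("operator") not in TRIGGER_OPERATORS:
                problems.append(f"trigger {i}: unknown operator")
    if not isinstance(settings.get("online_session_scoring"), bool):
        problems.append("online_session_scoring must be true or false")
    return problems


settings = {
    "online_evaluation_mode": "continuous",
    "online_sampling_rate": 25,
    "online_span_triggers": [
        {"field": "span_name", "operator": "contains", "value": "checkout"}
    ],
    "online_session_scoring": False,
}

problems = check_advanced_settings(settings)
if problems:
    raise SystemExit("\n".join(problems))

# Options first, then NAME and PROMPT, following the usage line above.
argv = [
    "judgment", "behaviors", "create-binary",
    "--advanced-settings", json.dumps(settings),
    "--output", "json",
    "refund-promised",                   # NAME (placeholder)
    "Did the agent promise a refund?",   # PROMPT (placeholder)
]
print(shlex.join(argv))  # shell-quoted command line, ready to review or run
```

Serializing with `json.dumps` and quoting with `shlex.join` avoids hand-escaping the JSON on the command line. The same list can be passed to `subprocess.run(argv)` without going through a shell.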

`behaviors create-classifier`

Create a classifier behavior. The judge LLM picks one of the supplied options for each qualifying span.

judgment behaviors create-classifier [OPTIONS] [[[ORG_ID] PROJECT_ID] NAME PROMPT...]

Arguments

Name	Required
`[[ORG_ID] PROJECT_ID] NAME PROMPT`	no

Options

Flag	Type	Required	Description
`--organization-id`, `--org-id`	text	no	Organization ID. Defaults to JUDGMENT_ORG_ID or saved context.
`--organization`, `--org`	text	no	Organization name to resolve.
`--project-id`	text	no	Project ID. Defaults to JUDGMENT_PROJECT_ID or saved context.
`--project`	text	no	Project name to resolve.
`--options`	text	yes	JSON array of the allowed output categories the classifier judge can return. Must contain at least one option. See the --options shape below.
`--model`	text	no	LLM model ID used by the judge prompt. Defaults to "gpt-5.5" when omitted.
`--category-ids`	text	no	UUIDs of categories to attach the behavior to. Pass an array of category UUIDs.
`--advanced-settings`	text	no	JSON object overriding the judge's online-evaluation configuration. All four fields are required when this is supplied; see the --advanced-settings shape below. For `online_evaluation_mode`, `continuous` runs the judge automatically on qualifying spans and `on_demand` requires a manual `judgment traces evaluate` call. `online_sampling_rate` is the percentage (0–100) of matching spans to score.
`--judge-id`	text	no	Attach the new behavior to an existing judge instead of creating one. The judge must be `score_type=categorical` and have no existing behaviors.
`-o`, `--output`	`yaml`, `json`	no	Output format.

--options shape

[
  {"name":"<label>", "description":"<optional human description>", "category_ids":["<uuid>", ...]},
  ...
]
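Because `--options` takes a JSON array as a single text value, it can be easier to build it in code than to type it by hand. The sketch below is illustrative only: `make_option` and `build_options` are hypothetical helpers, and the labels and descriptions are placeholders. It assumes that an unset `description` may be left out of the object and that `category_ids` may be an empty list; this reference does not confirm either.

```python
import json
from typing import List, Optional


def make_option(name: str,
                description: Optional[str] = None,
                category_ids: Optional[List[str]] = None) -> dict:
    """Build one entry of the --options array (hypothetical helper)."""
    if not name:
        raise ValueError("each option needs a non-empty name")
    option = {"name": name}
    if description is not None:
        # Assumption: an unset description is omitted rather than sent as null.
        option["description"] = description
    # Assumption: category_ids is always sent, as an empty list when unused.
    option["category_ids"] = list(category_ids or [])
    return option


def build_options(options: List[dict]) -> str:
    """Serialize the options, enforcing the documented 'at least one' rule."""
    if not options:
        raise ValueError("--options must contain at least one option")
    return json.dumps(options)


options_json = build_options([
    make_option("positive", "The user sounds satisfied."),
    make_option("negative", "The user sounds frustrated."),
    make_option("neutral"),
])
print(options_json)  # pass this string as the value of --options
```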

--advanced-settings shape

{
  "online_evaluation_mode": "continuous" | "on_demand",
  "online_sampling_rate": <number 0-100>,
  "online_span_triggers": [
    {"field":"span_name"|"span_attribute","operator":"contains"|"equals"|"exists","value":"<string>","key":"<attr-key>"?}
  ],
  "online_session_scoring": <bool>
}

`behaviors delete`

Delete a behavior.

judgment behaviors delete [OPTIONS] [[[ORG_ID] PROJECT_ID] BEHAVIOR_ID...]

Arguments

Name	Required
`[[ORG_ID] PROJECT_ID] BEHAVIOR_ID`	no

Options

Flag	Type	Required	Description
`--organization-id`, `--org-id`	text	no	Organization ID. Defaults to JUDGMENT_ORG_ID or saved context.
`--organization`, `--org`	text	no	Organization name to resolve.
`--project-id`	text	no	Project ID. Defaults to JUDGMENT_PROJECT_ID or saved context.
`--project`	text	no	Project name to resolve.
`--delete-scorer`	boolean	no	When true, also delete the underlying prompt scorer if no other behaviors reference it.
`--delete-all-values`	boolean	no	For classifier behaviors, when true, deletes every category row for this judge (not just the provided behavior_id). Ignored for binary behaviors.
`-o`, `--output`	`yaml`, `json`	no	Output format.

`behaviors get`

Get a behavior with judge details and stats.

judgment behaviors get [OPTIONS] [[[ORG_ID] PROJECT_ID] BEHAVIOR_ID...]

Arguments

Name	Required
`[[ORG_ID] PROJECT_ID] BEHAVIOR_ID`	no

Options

Flag	Type	Required	Description
`--organization-id`, `--org-id`	text	no	Organization ID. Defaults to JUDGMENT_ORG_ID or saved context.
`--organization`, `--org`	text	no	Organization name to resolve.
`--project-id`	text	no	Project ID. Defaults to JUDGMENT_PROJECT_ID or saved context.
`--project`	text	no	Project name to resolve.
`--start-date`	text	no	Optional ISO 8601 start date for stats.
`--end-date`	text	no	Optional ISO 8601 end date for stats.
`-o`, `--output`	`yaml`, `json`	no	Output format.
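The `--start-date` and `--end-date` values are ISO 8601 strings. As a rough sketch, they can be generated for a trailing window as shown below. `stats_window` is a hypothetical helper and `BEHAVIOR_ID` is a placeholder. This reference does not say whether the CLI expects a date only or a full timestamp, so the UTC timestamps with an offset used here are an assumption.

```python
import shlex
from datetime import datetime, timedelta, timezone


def stats_window(days, now=None):
    """Return (start, end) ISO 8601 strings covering the last `days` days in UTC.

    `now` should be a timezone-aware datetime; it defaults to the current time.
    """
    if days <= 0:
        raise ValueError("days must be positive")
    end = (now or datetime.now(timezone.utc)).astimezone(timezone.utc)
    end = end.replace(microsecond=0)  # keep the strings short and stable
    start = end - timedelta(days=days)
    return start.isoformat(), end.isoformat()


start, end = stats_window(7)

# Options first, then the positional argument, following the usage line above.
argv = [
    "judgment", "behaviors", "get",
    "--start-date", start,
    "--end-date", end,
    "--output", "json",
    "BEHAVIOR_ID",  # placeholder: replace with a real behavior ID
]
print(shlex.join(argv))
```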

`behaviors list`

List every behavior in a project along with rolled-up trace counts and last-seen stats.

judgment behaviors list [OPTIONS] [[[ORG_ID] PROJECT_ID]...]

Arguments

Name	Required
`[[ORG_ID] PROJECT_ID]`	no

Options

Flag	Type	Required	Description
`--organization-id`, `--org-id`	text	no	Organization ID. Defaults to JUDGMENT_ORG_ID or saved context.
`--organization`, `--org`	text	no	Organization name to resolve.
`--project-id`	text	no	Project ID. Defaults to JUDGMENT_PROJECT_ID or saved context.
`--project`	text	no	Project name to resolve.
`-o`, `--output`	`table`, `yaml`, `json`	no	Output format.

`behaviors update`

Update a behavior’s description.

judgment behaviors update [OPTIONS] [[[ORG_ID] PROJECT_ID] BEHAVIOR_ID...]

Arguments

Name	Required
`[[ORG_ID] PROJECT_ID] BEHAVIOR_ID`	no

Options

Flag	Type	Required	Description
`--organization-id`, `--org-id`	text	no	Organization ID. Defaults to JUDGMENT_ORG_ID or saved context.
`--organization`, `--org`	text	no	Organization name to resolve.
`--project-id`	text	no	Project ID. Defaults to JUDGMENT_PROJECT_ID or saved context.
`--project`	text	no	Project name to resolve.
`--description`	text	no	New human-readable description for the behavior.
`-o`, `--output`	`yaml`, `json`	no	Output format.
