Unit Testing
CI pipelines are the core of all mature software engineering practices.
With LLMs, developers should expect nothing less.
Using judgeval
, you can easily unit test your LLM applications for consistency and quality in any metric of your choice.
Unit testing is natively supported in judgeval
through the client.assert_test
(Python) or client.assertTest
(Typescript) method.
This also integrates with popular testing frameworks like pytest
(Python) or jest
/vitest
(Typescript), meaning you won't have to learn any new testing frameworks!
Single Step Testing
import pytest # Added import
from judgeval import JudgmentClient
from judgeval.data import Example
from judgeval.scorers import FaithfulnessScorer
def test_faithfulness():
client = JudgmentClient()
example = Example(
input="What is the capital of France?",
actual_output="The capital of France is Lyon.", # Hallucinated output
retrieval_context=["Come tour Paris' museums in the capital of France!"],
)
# Example contains a hallucination, so we should expect an exception/assertion error
# when the threshold is 1.0 (expecting perfect faithfulness)
with pytest.raises(AssertionError):
client.assert_test(
eval_run_name="test_faithfulness_fail",
examples=[example],
scorers=[FaithfulnessScorer(threshold=1.0)],
model="gpt-4o" # Added model parameter
)
# This should pass as the threshold is low
client.assert_test(
eval_run_name="test_faithfulness_pass",
examples=[example],
scorers=[FaithfulnessScorer(threshold=0.1)],
model="gpt-4o" # Added model parameter
)
judgeval
naturally integrates into your CI pipelines, allowing you to execute robust unit tests across your entire codebase.
This allows you to catch regressions in your LLM applications before they make it to production!