Overview
The Statsig Python AI SDK lets you manage your prompts, run online and offline evals, and debug your LLM applications in production. It depends on the Statsig Python Server SDK, but provides convenient hooks for AI-specific functionality.
Install the SDK
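A typical install looks like the following; the package name shown is an assumption for illustration, so confirm the exact name in the Statsig console or on PyPI.

```bash
# Package name is illustrative -- confirm it in the Statsig console or on PyPI.
pip install statsig-python-ai
```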
Initialize the SDK
If you already have a Statsig instance, you can pass it into the SDK. Otherwise, we’ll create an instance for you internally.
Initialize the AI SDK with a Server Secret Key from the Statsig console.
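The minimal sketch below covers both paths: letting the AI SDK create the Statsig instance for you, and passing in an instance you already manage. The StatsigAI class name and its import path are assumptions used for illustration; the Statsig instance follows the Statsig Python Server SDK pattern. Check the SDK reference for the exact names.

```python
# Sketch only: StatsigAI and its import path are illustrative assumptions.
from statsig_python_core import Statsig   # Statsig Python Server SDK (core)
from statsig_ai import StatsigAI          # hypothetical AI SDK import

# Option 1: no existing Statsig instance -- the AI SDK creates one internally.
ai = StatsigAI("server-secret-key")

# Option 2: you already have a Statsig instance -- pass it in.
statsig = Statsig("server-secret-key")
statsig.initialize().wait()
ai = StatsigAI(statsig=statsig)
```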
Initializing With Options
Optionally, you can configure StatsigOptions for your Statsig instance:
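For example (the option shown is illustrative; see the StatsigOptions reference for the fields supported by your SDK version):

```python
from statsig_python_core import Statsig, StatsigOptions

# The option name below is an example -- check the StatsigOptions reference
# for the full set of supported fields.
options = StatsigOptions(environment="staging")

statsig = Statsig("server-secret-key", options)
statsig.initialize().wait()
```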
Using the SDK
Getting a Prompt
Statsig can act as the control plane for your LLM prompts, allowing you to version and change them without deploying code. For more information, see the Prompts documentation.
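A rough sketch, continuing from the initialization above; the get_prompt method and the attributes on the returned object are assumptions, so refer to the Prompts documentation for the actual API.

```python
# Illustrative only: get_prompt and the returned fields are assumptions.
prompt = ai.get_prompt("support_agent_system_prompt")

# Use the versioned prompt text when calling your model; edits made in the
# Statsig console take effect without a code deploy.
messages = [{"role": "system", "content": prompt.text}]
```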
Logging Eval Results
When running an online eval, you can log results back to Statsig for analysis. Provide a score between 0 and 1, along with the grader name and any useful metadata (e.g., session IDs). Currently, you must provide the grader manually; automated grading options are planned for future releases.
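A hedged sketch of what logging a result might look like; the method name and parameters are illustrative assumptions, not the documented signature.

```python
# Illustrative only: log_eval_result and its parameters are assumptions.
ai.log_eval_result(
    grader="relevance_grader",             # grader name (provided manually for now)
    score=0.85,                            # score between 0 and 1
    metadata={"session_id": "sess_1234"},  # any useful metadata
)
```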
Programmatic Evaluation
Programmatic evaluation lets you run evaluations on datasets in code, automatically scoring outputs and sending results to Statsig for analysis. With programmatic evaluation, you can do the following (a sketch follows the list):
- Run evaluations on datasets: Process arrays, iterators, or async generators of input/expected pairs
- Define custom tasks: Create functions that generate outputs from inputs (supports both sync and async)
- Score outputs: Use single or multiple named scorer functions to evaluate outputs (supports boolean, numeric, or metadata-rich scores)
- Use parameters: Pass dynamic parameters to tasks using Zod schemas (Node) or dictionaries (Python)
- Categorize data: Group evaluation records by categories for better analysis
- Compute summary scores: Aggregate results across all records with custom summary functions
- Handle errors gracefully: Task and scorer errors are caught and reported without stopping the evaluation
The `expected` field in data records is optional; scorers can evaluate outputs without expected values. Task and scorer errors are automatically caught and reported in the results.
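The sketch below shows the overall shape of a programmatic evaluation: a dataset of input/expected records with categories, a task, and named scorers. The run_eval entry point and its argument names are assumptions for illustration, not the documented signature; see the programmatic evaluation reference for the real API.

```python
# Illustrative sketch only: run_eval and its keyword arguments are assumptions.
data = [
    {"input": "2 + 2", "expected": "4", "category": "math"},
    {"input": "Capital of France?", "expected": "Paris", "category": "geography"},
]

def task(input, params=None):
    # Generate an output for each input; call your model here. A canned
    # lookup keeps the sketch self-contained.
    return {"2 + 2": "4", "Capital of France?": "Paris"}.get(input, "")

def exact_match(output, expected=None, **_):
    # Boolean scorer; expected may be missing from a record.
    return expected is not None and output.strip() == expected

def length_score(output, expected=None, **_):
    # Numeric scorer between 0 and 1; metadata-rich scores are also supported.
    return min(len(output), 100) / 100

results = ai.run_eval(
    name="smoke-test",
    data=data,                                        # list, iterator, or async generator
    task=task,                                        # sync or async
    scorers={"exact_match": exact_match, "length": length_score},
    params={"temperature": 0.2},                      # plain dict in Python (Zod schema in Node)
)
```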
OpenTelemetry (OTEL)
The AI SDK works with OpenTelemetry for sending telemetry to Statsig. You can enable OTel tracing by calling the initializeTracing function.
You can also provide a custom TracerProvider to the initializeTracing function if you want to customize the tracing behavior.
More advanced OTel configuration and exporter support are on the way.
OTel is not yet supported in the Python AI SDK. Coming soon!