Test new prompt variants and run evals using scenarios.
The prompt playground, prompt management, and the optimizer are closely related. The playground is a tool for testing new prompt variants and running custom metrics on scenarios by hand. The optimizer uses LLMs to optimize prompts automatically. Often, after you're done optimizing, you'll want to store the best-performing prompt in prompt management.
We're going to merge the playground with the Prompt Optimizer to give you a more unified way to optimize your prompts.
You can store, manage and access your prompts here.
Start a new playground session
You can create a playground session from scratch by visiting the playground.
You can also take an existing prompt from the prompt management and start iterating on it in the playground.
Add tools
We support OpenAI and Anthropic function calling, including syntax highlighting, auto-complete, and auto-formatting.
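For reference, a tool added in the playground follows the provider's function-calling schema. Below is a minimal sketch in the OpenAI function-calling format; the `get_weather` tool name and its parameters are hypothetical, purely for illustration:

```python
import json

# A hypothetical "get_weather" tool in the OpenAI function-calling format.
# The name, description, and parameters are illustrative; any function
# describable with JSON Schema can be declared the same way.
get_weather_tool = {
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Look up the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {
                "city": {"type": "string", "description": "City name, e.g. 'Berlin'"},
                "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]},
            },
            "required": ["city"],
        },
    },
}

# The playground's syntax highlighting and auto-formatting apply to
# exactly this kind of JSON definition.
print(json.dumps(get_weather_tool, indent=2))
```

Anthropic's tool-use format is structurally similar (a name, a description, and a JSON Schema for the input), so the same definition carries over with minor renaming.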
Run evals
You can re-use all existing datasets of scenarios and run evals on them in one click. In the example below, we're using the facts compare evaluator to compare the factual content of the LLM responses against the expected responses for each row in the dataset of scenarios.
Save as a prompt
You can save the work-in-progress prompt as a new prompt.