Example: Sam’s Car Dealership 🚗

Sam’s Car Dealership offers a range of vehicle services, including maintenance and repairs, and also provides leasing options and sells both new and used cars. Their AI voice agent can schedule appointments and collect customer information.

In this example, we will build a Hamming test agent that takes on the role of a customer to evaluate how well Sam’s AI voice agent schedules appointments.

Components of the Voice Test Agent

Step-by-Step Tutorial

Step 1: Create A Dataset

Pro Tip: Hamming accepts CSV & JSON files up to 10MB.

  1. Visit app.hamming.ai/datasets to create a dataset.
  2. Click on +Add New Dataset. You can either upload an existing dataset or create an empty one and then build it in the JSON editor. The dataset should contain all the phone call scenarios you’d like Hamming’s test agent to simulate.
  3. Enter a dataset name and description.
  4. Select the input columns. The input columns are the data points you’d like the Hamming test agent to reference (see the example dataset below).
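
As a point of reference, here is a minimal sketch of what a JSON dataset for this scenario could look like. The column names (customer_name, service_type, preferred_time, special_request) are illustrative assumptions, not required fields; use whatever input columns match the scenarios you want to simulate.

```json
[
  {
    "customer_name": "Alex Rivera",
    "service_type": "oil change",
    "preferred_time": "Saturday morning",
    "special_request": "needs a loaner car"
  },
  {
    "customer_name": "Jordan Lee",
    "service_type": "brake inspection",
    "preferred_time": "weekday after 5pm",
    "special_request": "asks about pricing before booking"
  }
]
```
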
Step 2: Create A Scoring Prompt

Pro Tip: We recommend using gpt-4o with temperature 0 to get the best results.

  1. Visit app.hamming.ai/prompts to create a scoring prompt. The prompt should contain concise instructions for the scorer.
  2. Click on +Add Prompt.
  3. Enter a prompt name and description.
  4. Select the model that you want to use to evaluate your AI voice agent.
  5. Add a system prompt.
  6. Add a user prompt.

Important: The result of the prompt should be structured in XML format, as outlined below.

  7. Save the prompt and deploy it to production.
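
The exact XML template isn’t reproduced in this excerpt; below is a minimal sketch of an XML-structured result, assuming a simple yes/no evaluation. The element names (reasoning, score) are illustrative assumptions; what matters is that the score values match the labels you define in your scorer in Step 3.

```xml
<result>
  <reasoning>The agent confirmed the customer's name, collected the requested service, and booked a Saturday morning appointment.</reasoning>
  <score>1</score>
</result>
```
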
Step 3: Create A Scorer

Important: Ensure that the values in the scorer align with the labels from your scoring prompt. For example, if the value “0” represents “No” in your prompt, make sure the scorer also labels “0” as “No.”

  1. Visit app.hamming.ai/evals to create a scorer, which simply acts as a wrapper for the scoring prompt.
  2. Click on +Add Scorer.
  3. Enter a scorer name and description.
  4. Add the values and labels that will be used to display the test results.
  5. Select the scoring prompt you created in Step 2 and set the prompt label to production.
  6. Change the Variable Mappings field to Output and enter transcript in the field on the right.
Step 4: Create A Hamming Test Agent

Pro Tip: If you want to speak to Hamming’s test agent yourself, enter your own phone number and the test agent will call you.

  1. Visit app.hamming.ai/voice-agents to set up a Hamming test agent (customer).
  2. Click on +Add Voice Agent.
  3. Enter a name.
  4. Select Function from the Scoring dropdown and select the scorer you created in the previous step.
  5. Configure a prompt for the Hamming test agent with input variables from your dataset, allowing it to simulate a variety of customer situations.

It is critical that you reference the input variables from the dataset you created in Step 1 in the Test Agent (Customer) prompt, as shown in the template below. This ensures that the Hamming test agent uses those values when speaking to your voice agent.
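
The original template isn’t included in this excerpt; the sketch below illustrates the idea, assuming the example dataset columns from Step 1 and double-curly-brace syntax for input variables. Adjust the variable names and syntax to match your own dataset columns.

```text
You are {{customer_name}}, a customer calling Sam's Car Dealership.
You want to schedule a {{service_type}} appointment, ideally {{preferred_time}}.
During the call: {{special_request}}.
Answer the agent's questions naturally, one piece of information at a time,
and confirm the final appointment details before ending the call.
```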

Step 5: Configure Post-Call Webhooks

If your agent doesn’t have any function calling, then you can skip to the next step.

Most voice agents are dynamic and perform actions during the call. To evaluate the performance of your voice agent, you need to configure a post-call webhook so we can capture the output of the call (audio file, transcript, traces, etc.).

  1. For the Hamming test agent you created in the previous step, click on Edit.
  2. Find the webhook for your platform (Retell or Vapi) and copy the webhook URL.
  3. Log in to your Retell or Vapi account and paste our webhook URL into the Post-call webhook field.
  4. This allows Hamming to capture the complete output of the call (what was said and what was done) and evaluate the performance of your voice agent.
Step 6: Run Hamming Test Agent

  1. Under Dataset, select the dataset you created in Step 1.
  2. Click on Run.
  3. Enter the phone number of your AI voice agent.
  4. Click on Run Calls.
  5. Once your calls have finished, you will be able to see the results in the evaluation column.