Install Chroma from PyPI so you can run the backend server:
pip install chromadb
Run the Chroma server:
chroma run --path ./getting-started
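Before moving on, it can help to confirm that the server is accepting connections. The following is a minimal sketch, assuming the server listens on localhost port 8000 (the default for `chroma run` unless a different port is passed). `is_port_open` is a hypothetical helper written for this illustration, not part of Chroma.

```python
import socket


def is_port_open(host: str = "localhost", port: int = 8000, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and name resolution failures.
        return False


if __name__ == "__main__":
    # Assumption: `chroma run` is listening on its default port, 8000.
    print("Chroma reachable:", is_port_open())
```

This only checks that something is listening on the port; it does not verify that the listener is actually Chroma.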
Populate ChromaDB with a set of documents
Make sure to replace the placeholders with your actual keys and dataset ID created in the previous step.
setup-chroma.js
const { ChromaClient, OpenAIEmbeddingFunction } = require("chromadb");

// Connects to the local Chroma server started with `chroma run`.
const client = new ChromaClient();

// Embeds documents and queries with OpenAI.
const embedder = new OpenAIEmbeddingFunction({
  openai_api_key: "<your-openai-key>",
});

// Creates the collection if it does not exist and upserts the travel documents.
async function setupChroma() {
  const collection = await client.getOrCreateCollection({
    name: "travel_collection",
    embeddingFunction: embedder,
  });

  await collection.upsert({
    documents: [
      "Booking Confirmations: You are required to send a booking confirmation for each reservation. When reserving outside the XYZ Portal, please ensure to email the confirmation to [email protected].",
      "EU Travel Restrictions: Travelers from other countries are not allowed to enter Austria unless they are residents or meet certain exemptions.",
      "Traveling with Pets: Although pets used to be allowed on international flights, due to recent changes in regulations, you cannot bring your pet on an international flight.",
      "Travel Insurance: It is recommended that you purchase travel insurance to protect yourself from unexpected events such as trip cancellations or medical emergencies.",
      "Traveling with Medication: If you are traveling with medication, make sure to carry a copy of your prescription and keep the medication in its original packaging.",
    ],
    ids: ["id1", "id2", "id3", "id4", "id5"],
  });

  return collection;
}

// Returns up to nResults documents most similar to the query.
async function searchChroma(query, nResults = 3) {
  const collection = await client.getCollection({
    name: "travel_collection",
    embeddingFunction: embedder,
  });

  const results = await collection.query({
    queryTexts: [query],
    nResults,
  });

  // `documents` holds one array per query text; only one query was sent.
  if (results && results.documents && results.documents.length > 0) {
    return results.documents[0];
  }

  return [];
}

module.exports = { setupChroma, searchChroma };
Create a Q/A dataset of scenarios
This dataset contains a list of questions and answers. The answers can be found within the ChromaDB collection we just created.
For Input Columns, select “question”, and for Output Columns, select “answer”.
Name it “Q/A Scenarios” and click Create.
Copy the dataset ID by clicking on the Copy ID button.
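To make the expected shape of the dataset concrete, here is a rough sketch that writes a two-column file with `question` and `answer` headers, matching the input and output columns selected above. It assumes the dataset is supplied as a CSV file, which this section does not state. The rows are hypothetical examples derived from the travel documents, not the tutorial's actual dataset, and `write_dataset` is an illustrative helper name.

```python
import csv

# Hypothetical example rows based on the travel documents in the collection.
ROWS = [
    {
        "question": "Can I bring my pet on an international flight?",
        "answer": "No. Due to recent changes in regulations, pets are not allowed on international flights.",
    },
    {
        "question": "Is travel insurance recommended?",
        "answer": "Yes. It protects against unexpected events such as trip cancellations or medical emergencies.",
    },
]


def write_dataset(path, rows=ROWS):
    """Write rows to a CSV file with `question` and `answer` columns."""
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=["question", "answer"])
        writer.writeheader()
        writer.writerows(rows)
```

Each answer should be recoverable from a document in the collection, since the evaluation checks whether retrieval surfaces the right passage.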
Modify the evaluation script for RAG
Make sure to replace the placeholders with your actual keys and dataset ID created in the previous step.
Modify the evals.js file from the Quickstart tutorial so that it performs a semantic search with ChromaDB when answering questions. See the newly added answerQuestion function:
import os

import chromadb

# Disable tokenizers parallelism to avoid fork-related warnings.
os.environ["TOKENIZERS_PARALLELISM"] = "false"

# In-memory client: the collection is not persisted between runs.
chroma_client = chromadb.Client()
# No embedding function is passed, so Chroma falls back to its default one.
collection = chroma_client.create_collection(name="travel_collection")

collection.add(
    documents=[
        "Booking Confirmations: You are required to send a booking confirmation for each reservation. When reserving outside the XYZ Portal, please ensure to email the confirmation to [email protected].",
        "EU Travel Restrictions: Travelers from other countries are not allowed to enter Austria unless they are residents or meet certain exemptions.",
        "Traveling with Pets: Although pets used to be allowed on international flights, due to recent changes in regulations, you cannot bring your pet on an international flight.",
        "Travel Insurance: It is recommended that you purchase travel insurance to protect yourself from unexpected events such as trip cancellations or medical emergencies.",
        "Traveling with Medication: If you are traveling with medication, make sure to carry a copy of your prescription and keep the medication in its original packaging.",
    ],
    ids=["id1", "id2", "id3", "id4", "id5"],
)


def search_chroma(query, n_results=3):
    """Return up to n_results documents most similar to the query."""
    results = collection.query(
        query_texts=[query],
        n_results=n_results,
    )
    # `documents` holds one list per query text; only one query was sent.
    if results and results['documents'] and len(results['documents']) > 0:
        return results['documents'][0]
    return []
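The `[0]` index in `search_chroma` reflects the shape of a Chroma query result: `documents` contains one inner list per entry in `query_texts`. The sketch below mimics that shape with a hand-written dictionary, so the unpacking logic can be tried without a running collection. The values are abbreviated stand-ins, not real query output, and `first_query_documents` is a hypothetical helper name.

```python
def first_query_documents(results):
    """Return the documents matched for the first query text, or [] if none."""
    documents = (results or {}).get("documents")
    if documents:
        return documents[0]
    return []


# Hand-written stand-in for a query result produced by a single query text.
fake_results = {
    "ids": [["id3", "id5"]],
    "documents": [[
        "Traveling with Pets: ...",
        "Traveling with Medication: ...",
    ]],
}

print(first_query_documents(fake_results))
```

Passing several query texts at once would yield several inner lists, in which case indexing with `[0]` would silently drop the results for every query after the first.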
Modify the evaluation script for RAG
Make sure to replace the placeholders with your actual keys and dataset ID created in the previous step.
Modify the evals.py file from the Quickstart tutorial so that it performs a semantic search with ChromaDB when answering questions. See the new changes to the answer_question function:
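The tutorial's actual `answer_question` code is not included in this section. As a rough illustration only, the retrieval step could be wired in along the following lines. The `search` parameter stands for a function such as `search_chroma`, while `complete` is a hypothetical wrapper around whichever model client evals.py already uses. Both names, and the prompt wording, are assumptions made for this sketch.

```python
from typing import Callable, List


def build_rag_prompt(question: str, passages: List[str]) -> str:
    """Combine retrieved passages and the user question into one prompt."""
    context = "\n".join(f"- {p}" for p in passages) or "(no relevant documents found)"
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )


def answer_question(
    question: str,
    search: Callable[[str], List[str]],
    complete: Callable[[str], str],
) -> str:
    """Retrieve passages, build a prompt, and delegate to the model call."""
    passages = search(question)  # e.g. search_chroma from the script above
    prompt = build_rag_prompt(question, passages)
    return complete(prompt)  # hypothetical wrapper around your LLM client
```

Injecting `search` and `complete` as arguments keeps the function easy to exercise with stubs, which is convenient when checking the retrieval step separately from the model call.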