What is Chroma?

Chroma (often called ChromaDB) is an open-source, AI-native vector database designed to store and query vector embeddings. It can serve as the “long-term memory” for Large Language Models (LLMs) and other AI applications. Traditional relational (SQL) databases are built for structured data such as names and numbers; Chroma is built for the high-dimensional numerical vectors that represent the “meaning” of text, images, and audio.
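
To make the idea of vectors that capture “meaning” more concrete, here is a minimal sketch in plain Python. The three-dimensional vectors are hypothetical values chosen for illustration (real embedding models produce hundreds of dimensions), and cosine similarity is only one common way to compare embeddings; it is not necessarily the metric a given Chroma collection uses.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hypothetical toy "embeddings"; the values are made up for this sketch.
embeddings = {
    "cat": [0.9, 0.8, 0.1],
    "kitten": [0.85, 0.9, 0.15],
    "invoice": [0.1, 0.2, 0.95],
}

# Rank the other words by similarity to "cat", most similar first.
query = embeddings["cat"]
ranked = sorted(
    (name for name in embeddings if name != "cat"),
    key=lambda name: cosine_similarity(query, embeddings[name]),
    reverse=True,
)
print(ranked)
```

Words with related meanings end up with vectors pointing in similar directions, which is what lets a vector database retrieve by meaning rather than by exact keyword.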

Key Features

  • Built-in Embeddings: Automatically converts text into vectors with a default embedding model, with optional integrations for providers such as OpenAI, HuggingFace, and Cohere.
  • Flexible Storage: Runs entirely in-memory for rapid prototyping or as a persistent client-server database for production.
  • Metadata Filtering: Allows you to tag vectors with additional data and filter search results for higher precision.
  • Rich Integrations: Seamlessly connects with popular AI frameworks like LangChain, LlamaIndex, and AutoGPT.
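
As an illustration of the metadata filtering feature above, the sketch below builds a `where` filter in the dictionary style used by the Chroma Python client and passes it to `collection.query`. The helper names `build_where` and `filtered_query` are hypothetical, as is the `year` metadata key; `source` is borrowed from the Node.js example further down.

```python
def build_where(source=None, min_year=None):
    """Build a Chroma-style `where` filter from optional conditions."""
    conditions = []
    if source is not None:
        conditions.append({"source": {"$eq": source}})
    if min_year is not None:
        conditions.append({"year": {"$gte": min_year}})
    if not conditions:
        return None  # no metadata filtering
    if len(conditions) == 1:
        return conditions[0]  # a single condition needs no "$and" wrapper
    return {"$and": conditions}


def filtered_query(collection, text, n_results=3, **filters):
    """Run a semantic query restricted to records whose metadata matches."""
    return collection.query(
        query_texts=[text],
        n_results=n_results,
        where=build_where(**filters),
    )


print(build_where(source="wiki", min_year=2023))
```

With a collection obtained from `client.get_or_create_collection(...)`, a call such as `filtered_query(collection, "How can I store AI data?", source="wiki")` would only consider records tagged with `source: "wiki"`.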

Use Cases

  • Powers real-time chatbots by retrieving relevant knowledge from company PDFs or technical manuals to answer user questions accurately.
  • Enables semantic document search that finds files based on their actual meaning rather than just matching specific keywords.
  • Creates long-term memory for AI agents so they can remember past conversations and user preferences across multiple sessions.
  • Supports anomaly detection in cybersecurity by identifying data points that deviate significantly from “normal” behavior patterns stored in the database.
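
The anomaly detection use case can be sketched without any database at all: a point is flagged when it lies farther than some threshold from every stored “normal” vector. The two-dimensional vectors and the threshold of 0.1 below are hypothetical; in practice the nearest-neighbor distance would likely come from the `distances` field of a Chroma query result, and the threshold would need tuning on real data.

```python
import math


def nearest_distance(point, known_points):
    """Euclidean distance from `point` to its closest stored vector."""
    return min(math.dist(point, known) for known in known_points)


def is_anomaly(point, known_points, threshold):
    """Flag a point that is farther than `threshold` from all stored vectors."""
    return nearest_distance(point, known_points) > threshold


# Hypothetical embeddings of "normal" behavior.
normal_behavior = [[0.10, 0.20], [0.12, 0.18], [0.09, 0.22]]

print(is_anomaly([0.11, 0.19], normal_behavior, threshold=0.1))  # close to stored points
print(is_anomaly([0.90, 0.95], normal_behavior, threshold=0.1))  # far from every stored point
```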

Getting Started

  1. Go to Vector Database Service in your dashboard.
  2. Select Chroma as the database type.
  3. Choose an Engine Version from the available options.
  4. Enter a Connection Name; choose a friendly name (e.g. staging-db).
  5. Create a Database User with appropriate privileges.
  6. Set a Password for the Database User and keep it secure.
  7. Enter the Default Database/Schema name to connect to.
  8. Pick a region in which to deploy your database instance.

[Image: Chroma Instance Creation]

Connection Examples

Node.js

import { ChromaClient } from "chromadb";

async function main() {
  // 1. Initialize the client (defaults to http://localhost:8000)
  const client = new ChromaClient();

  // 2. Create or get a collection
  const collection = await client.getOrCreateCollection({
    name: "my_javascript_collection",
  });

  // 3. Add data (Chroma handles embeddings via default model or API)
  await collection.add({
    ids: ["id1", "id2"],
    metadatas: [{ source: "wiki" }, { source: "news" }],
    documents: ["Chroma is an AI-native database", "JavaScript is a versatile language"],
  });

  // 4. Perform a semantic search
  const results = await collection.query({
    queryTexts: ["How can I store AI data?"],
    nResults: 1,
  });

  console.log(results);
}

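// Surface connection or query errors instead of leaving the rejection unhandled
main().catch(console.error);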

Python

import chromadb

# 1. Set up connection details
# Replace the API key and host with your actual service details
api_key = "YOUR_ANTRYK_API_KEY"
host_url = "api.antryk.com"  # Example host for Antryk

client = chromadb.HttpClient(
    host=host_url,
    port=443,
    ssl=True,
    headers={
        "Authorization": f"Bearer {api_key}"
    }
)

# 2. Use the client as usual
collection = client.get_or_create_collection(name="cloud_collection")

collection.add(
    documents=["This is a document stored in the cloud"],
    ids=["doc_cloud_1"]
)

print(f"Collection count: {collection.count()}")
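
The Python example above stops at counting records. As a rough sketch of the next step, the helper below unpacks the nested lists that `collection.query(...)` returns (one inner list per query text) into flat tuples. The helper name `top_matches` is hypothetical, and `sample_results` is a hand-written stand-in with the same shape as a real result; its distance value is made up.

```python
def top_matches(results, query_index=0):
    """Flatten one query's hits into (id, document, distance) tuples."""
    ids = results["ids"][query_index]
    documents = results["documents"][query_index]
    distances = results["distances"][query_index]
    return list(zip(ids, documents, distances))


# Stand-in for the dictionary returned by collection.query(...);
# the distance is illustrative, not a measured value.
sample_results = {
    "ids": [["doc_cloud_1"]],
    "documents": [["This is a document stored in the cloud"]],
    "distances": [[0.42]],
}

for doc_id, document, distance in top_matches(sample_results):
    print(f"{doc_id} (distance {distance:.2f}): {document}")
```

Against a live collection, the input would come from a call such as `collection.query(query_texts=["cloud storage"], n_results=1)`. Smaller distances indicate closer matches.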

Scaling

  • Vertical Scaling: Increase CPU and memory
  • Horizontal Scaling: Add read replicas
  • Storage: Automatic storage scaling

Security

  • SSL/TLS: Encrypted connections required
  • VPC Integration: Private network connectivity
  • IP Whitelisting: Restrict access by IP
  • Authentication: Username/password auth
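
One way to keep credentials out of source code while still enforcing an encrypted connection is to read them from environment variables. The sketch below assumes hypothetical variable names (`CHROMA_API_KEY`, `CHROMA_HOST`, `CHROMA_PORT`) and reuses the bearer-token header and example host from the Python connection example; check your service details for the actual authentication scheme.

```python
import os


def http_client_kwargs(env=os.environ):
    """Collect keyword arguments for chromadb.HttpClient from the environment."""
    api_key = env.get("CHROMA_API_KEY")  # hypothetical variable name
    if not api_key:
        raise RuntimeError("CHROMA_API_KEY is not set")
    return {
        "host": env.get("CHROMA_HOST", "api.antryk.com"),  # example host
        "port": int(env.get("CHROMA_PORT", "443")),
        "ssl": True,  # always connect over SSL/TLS
        "headers": {"Authorization": f"Bearer {api_key}"},
    }


# Usage, assuming the chromadb package is installed:
#   import chromadb
#   client = chromadb.HttpClient(**http_client_kwargs())
```

Failing fast when the key is missing avoids sending unauthenticated requests by accident.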

Backups

  • Automatic Backups: Daily at scheduled time
  • Manual Backups: On-demand backups

Monitoring

Track database performance with:
  • Query Performance: Slow query identification
  • Storage: Disk usage and growth trends
  • CPU & Memory: Resource utilization
