This recipe combines environments, chat, and prompts into a complete multi-tenant RAG pattern. Use it as a blueprint if you’re adding AI features to an existing SaaS product.

The architecture

User (logged in)
  ↓
Your app (auth, session)
  ↓
Your backend (loads the right Memic key per customer)
  ↓
Memic API (scoped to one environment)
  ↓
Documents indexed per customer

Each customer has exactly one environment. Your backend translates “this authenticated user belongs to Customer X” into “use Memic API key X”. Customer data never crosses environment boundaries.

Prerequisites

  • Working app with authentication and a customer/tenant concept
  • Memic project with at least one environment
  • Python 3.10+ (examples use FastAPI)
  • pip install memic fastapi uvicorn

Step 1 — Provisioning

See the Per-customer isolation recipe for provisioning code. At the end, every customer row in your DB has a memic_api_key_encrypted column.
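The heart of that column is a per-request lookup: map the authenticated user's customer to its decrypted Memic API key. Here is a minimal sketch of that scoping logic. The in-memory table and base64 "decryption" are illustrative stand-ins only — in production the encrypted key comes from your customers table and decryption uses your secrets manager; `decrypt_key` and `resolve_memic_key` are hypothetical helper names, not part of the Memic SDK.

```python
import base64

# Stand-in for the customers table: customer_id -> memic_api_key_encrypted.
CUSTOMER_KEYS = {
    "cust_a": base64.b64encode(b"mk_live_customer_a").decode(),
    "cust_b": base64.b64encode(b"mk_live_customer_b").decode(),
}


def decrypt_key(encrypted: str) -> str:
    # Placeholder "decryption" -- swap in real crypto (e.g. Fernet with a
    # KMS-held key). Never store plaintext keys.
    return base64.b64decode(encrypted).decode()


def resolve_memic_key(customer_id: str) -> str:
    """Translate 'this user belongs to Customer X' into 'use API key X'."""
    encrypted = CUSTOMER_KEYS[customer_id]  # raises KeyError for unknown tenants
    return decrypt_key(encrypted)
```

A `get_memic_client` dependency would then do no more than `Memic(api_key=resolve_memic_key(current_customer_id))`, so every downstream call is tenant-scoped.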

Step 2 — Document upload endpoint

Let customers upload their own files:
import os
import tempfile

from fastapi import FastAPI, Depends, UploadFile, File
from memic import Memic

app = FastAPI()


@app.post("/documents")
async def upload_document(
    file: UploadFile = File(...),
    memic: Memic = Depends(get_memic_client),  # returns customer-scoped client
):
    # Save to a temp file first, then upload. Don't build the path from
    # file.filename -- it is client-controlled, so it allows path traversal
    # and collisions between uploads.
    with tempfile.NamedTemporaryFile(delete=False) as f:
        f.write(await file.read())
        temp_path = f.name

    try:
        result = memic.files.upload(temp_path)
    finally:
        os.unlink(temp_path)

    return {"file_id": result.file_id, "status": "processing"}

Step 3 — Status polling endpoint

Customers need to know when their file is searchable:
@app.get("/documents/{file_id}/status")
def document_status(
    file_id: str,
    memic: Memic = Depends(get_memic_client),
):
    return memic.files.get_status(file_id)
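A caller of this endpoint (or a background job on your side) typically polls until the file is indexed. Here is a small generic helper, assuming the status payload is a dict with a "status" field that reaches "ready" or "failed" — check those names against the actual response before relying on them.

```python
import time


def wait_until_ready(get_status, timeout=120.0, interval=2.0):
    """Poll a status callable until it reports 'ready' or 'failed'.

    get_status is any zero-argument callable returning a dict with a
    'status' key -- e.g. lambda: memic.files.get_status(file_id).
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        status = get_status()
        if status["status"] == "ready":
            return status
        if status["status"] == "failed":
            raise RuntimeError(f"indexing failed: {status}")
        time.sleep(interval)
    raise TimeoutError("file was not ready in time")
```

Keeping the poll on your backend (rather than hammering Memic from the browser) also preserves the proxy pattern described in Step 5.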

Step 4 — Chat endpoint

The core of the RAG experience:
from pydantic import BaseModel


class ChatRequest(BaseModel):
    messages: list[dict]


@app.post("/chat")
def chat(
    req: ChatRequest,
    memic: Memic = Depends(get_memic_client),
):
    # Load the live version of the system prompt from Memic
    system_prompt = memic.prompts.get("chatbot-system")
    rendered = system_prompt.render(
        company_name=get_current_customer().name,
    )

    messages = [
        {"role": "system", "content": rendered},
        *req.messages,
    ]

    result = memic.chat(messages=messages)
    return {
        "answer": result.answer,
        "citations": result.citations,
    }
Notice two things:
  1. The system prompt lives in Memic, not in your code. You can tune the assistant’s persona from the dashboard without shipping new code.
  2. memic is scoped to the current customer via Depends(get_memic_client). Search, chat, prompts — everything is automatically tenant-isolated.

Step 5 — Frontend integration

Your frontend calls your backend (/documents, /chat) the same as it would any other API. It doesn’t know Memic exists. Memic is an implementation detail of your backend, not a direct frontend dependency.
Do NOT call Memic directly from your frontend. API keys would be exposed in the browser. Always proxy through your backend.

Testing

At minimum, test:
  1. Isolation — upload a file as Customer A, query as Customer B, assert zero results
  2. Prompt rendering — fetch chatbot-system, render with a fake company_name, assert the output contains the company name
  3. End-to-end chat — upload a known file, wait for ready, chat, assert the answer references the file
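The isolation check (test 1) can be expressed as a small reusable assertion. The sketch below is duck-typed: it assumes each client exposes a `search(query) -> list` method, which is an assumption — substitute whatever retrieval call your Memic client actually exposes (for instance, the citations returned from `memic.chat`).

```python
def check_tenant_isolation(client_a, client_b, probe_text: str) -> bool:
    """Return True if client_b cannot see content indexed under client_a.

    client_a and client_b are customer-scoped clients; probe_text is a
    distinctive string from a document uploaded only as Customer A.
    """
    hits_a = client_a.search(probe_text)
    hits_b = client_b.search(probe_text)
    # Customer A must find its own document; Customer B must find nothing.
    return len(hits_a) > 0 and len(hits_b) == 0
```

Run it with real scoped clients in a staging project; a failure here means keys or environments are being mixed up somewhere in your backend.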

What’s next?

  • Rate limiting your API — even though Memic has its own rate limits, add your own limits per customer to prevent abuse
  • Usage tracking — log Memic calls per customer so you can bill or monitor usage
  • Caching — for frequently-repeated queries, add a Redis cache in front of Memic
  • Observability — instrument the Memic calls with OpenTelemetry so you can trace end-to-end latency
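To make the caching idea concrete, here is a tiny in-process TTL cache for chat responses. It only illustrates the shape of the solution — in production you would back the same key scheme with Redis (e.g. SETEX) so all workers share it. Note the customer id is part of the cache key, so cached answers can never leak across tenants.

```python
import time
from hashlib import sha256


class ChatCache:
    """Tiny in-process TTL cache keyed by (customer, messages)."""

    def __init__(self, ttl_seconds: float = 300.0):
        self.ttl = ttl_seconds
        self._store: dict[str, tuple[float, object]] = {}

    def _key(self, customer_id: str, messages) -> str:
        # Including the customer id keeps tenants isolated in the cache too.
        return sha256(f"{customer_id}:{messages!r}".encode()).hexdigest()

    def get(self, customer_id: str, messages):
        entry = self._store.get(self._key(customer_id, messages))
        if entry is None:
            return None
        expires_at, value = entry
        if time.monotonic() > expires_at:
            del self._store[self._key(customer_id, messages)]
            return None
        return value

    def set(self, customer_id: str, messages, value) -> None:
        key = self._key(customer_id, messages)
        self._store[key] = (time.monotonic() + self.ttl, value)
```

In the /chat handler you would consult the cache before calling memic.chat and store the result afterwards; keep the TTL short so prompt edits made in the dashboard show up quickly.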

Related recipes

  • Per-customer isolation — deep dive on the isolation pattern
  • Production checklist — before going live