Rate limits

Memic applies rate limits per API key to protect the service and ensure fair usage. Limits vary by endpoint — ingestion endpoints are more generous than search/chat because indexing is long-running but happens less often.

Rate limits are evolving as Memic scales. The numbers below are defaults; if your workload needs higher limits, contact us via the dashboard.

Default limits (per API key)

Endpoint category	Limit	Window
Search (`POST /search`)	60 requests	per minute
Chat (`POST /chat`)	30 requests	per minute
File ingestion (`POST /files/init`, `/confirm`)	120 requests	per minute
Status polls (`GET /files/{id}/status`)	300 requests	per minute
List endpoints (`GET /files`, `GET /projects`)	120 requests	per minute
Prompts (`GET /prompts/{name}`)	600 requests	per minute
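
One way to stay under these defaults is to throttle on the client before a request is ever sent. The following is a minimal sketch, not part of the Memic SDK: the `SlidingWindowLimiter` class and the category keys in `DEFAULT_LIMITS` are hypothetical names introduced here, and the sketch assumes a rolling 60-second window, since this page does not say how the server counts its windows.

```python
import time
from collections import deque

# Per-minute defaults copied from the table above. The keys are hypothetical
# labels chosen for this sketch, not identifiers used by the API.
DEFAULT_LIMITS = {
    "search": 60,
    "chat": 30,
    "ingest": 120,
    "status": 300,
    "list": 120,
    "prompts": 600,
}


class SlidingWindowLimiter:
    """Allow at most `limit` calls in any rolling `window` seconds."""

    def __init__(self, limit, window=60.0, clock=time.monotonic, sleep=time.sleep):
        self.limit = limit
        self.window = window
        self._clock = clock
        self._sleep = sleep
        self._sent = deque()  # timestamps of calls still inside the window

    def acquire(self):
        """Block until one more call is allowed, then record it."""
        now = self._clock()
        # Forget calls that have already aged out of the window.
        while self._sent and now - self._sent[0] >= self.window:
            self._sent.popleft()
        if len(self._sent) >= self.limit:
            # Window is full: wait until the oldest call ages out.
            remaining = self.window - (now - self._sent[0])
            self._sleep(max(remaining, 0.0))
            self._sent.popleft()
            now = self._clock()
        self._sent.append(now)
```

A caller would create one limiter per category, for example `search_limiter = SlidingWindowLimiter(DEFAULT_LIMITS["search"])`, and call `search_limiter.acquire()` before each `POST /search`. Because limits apply per API key, processes that share a key would each need a smaller share of the budget.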

When you hit a limit

The API returns 429 Too Many Requests with a Retry-After header that gives the number of seconds to wait:

HTTP/1.1 429 Too Many Requests
Retry-After: 12
Content-Type: application/json

{
  "detail": "Rate limit exceeded. Retry after 12 seconds.",
  "code": "rate_limit_exceeded",
  "request_id": "req_01J9..."
}

Respect the Retry-After value. Do not retry faster — doing so will extend your penalty window.

Handling 429 in code

The Python SDK handles this automatically with exponential backoff + jitter. If you’re hitting the API directly, use the Retry-After header:

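import time
import requests

def call_with_retry(url, headers, json, max_attempts=5):
    """POST to url, retrying on 429 until max_attempts is used up."""
    for attempt in range(max_attempts):
        r = requests.post(url, headers=headers, json=json)
        if r.status_code != 429:
            return r
        if attempt == max_attempts - 1:
            break  # out of attempts; don't sleep before giving up
        # Prefer the server's Retry-After; fall back to exponential backoff.
        wait = int(r.headers.get("Retry-After", 2 ** attempt))
        time.sleep(wait)
    # Still rate limited after the final attempt: raises requests.HTTPError.
    r.raise_for_status()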
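
The function above waits exactly as long as the server asks, so many clients limited at the same moment will all retry at the same moment. Adding jitter spreads them out. The sketch below is one possible way to compute such a delay; it is not the SDK's actual implementation, and `backoff_delay` with its `base` and `cap` parameters is a hypothetical helper introduced here.

```python
import random


def backoff_delay(attempt, retry_after=None, base=1.0, cap=60.0, rng=random.random):
    """Return the seconds to wait before retry number `attempt` (0-based)."""
    # Exponential ceiling: base, 2*base, 4*base, ... never more than `cap`.
    ceiling = min(cap, base * 2 ** attempt)
    # Jitter: a random delay in [0, ceiling).
    jitter = rng() * ceiling
    try:
        # Retry-After is documented above as a whole number of seconds.
        floor = max(int(retry_after), 0)
    except (TypeError, ValueError):
        floor = 0  # header missing, or not a plain number of seconds
    # Never retry sooner than the server asked; add jitter on top.
    return floor + jitter
```

To use it in `call_with_retry`, the `wait` line could become `wait = backoff_delay(attempt, r.headers.get("Retry-After"))`. Adding the jitter to the server's value, rather than taking the larger of the two, is a design choice: the client never retries earlier than requested, at the cost of sometimes waiting a little longer.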

Best practices

Cache identical search/chat queries — if you’re running the same query repeatedly (e.g. for every page view), add a short client-side cache
Batch where you can — use the list endpoint with page_size=100 instead of polling item-by-item
Back off on failure — don’t tight-loop retries
Use the Python SDK — it handles retry and backoff automatically
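
The caching suggestion can be sketched with a small time-to-live cache. This is a minimal illustration under stated assumptions: `TTLCache` and `get_or_call` are hypothetical names introduced here, the 30-second default is arbitrary, and the cache lives in one process and never evicts old keys, so it suits a small set of repeated queries.

```python
import time


class TTLCache:
    """Tiny in-memory cache whose entries expire after `ttl` seconds."""

    def __init__(self, ttl=30.0, clock=time.monotonic):
        self.ttl = ttl
        self._clock = clock
        self._entries = {}  # key -> (expires_at, value)

    def get_or_call(self, key, fetch):
        """Return the cached value for `key`, calling `fetch()` on a miss."""
        now = self._clock()
        hit = self._entries.get(key)
        if hit is not None and hit[0] > now:
            return hit[1]  # fresh entry: no API request is made
        value = fetch()  # only this path spends rate-limit budget
        self._entries[key] = (now + self.ttl, value)
        return value
```

A page handler could then wrap its search call as `cache.get_or_call(("search", query), run_search)`, where `run_search` is whatever zero-argument function performs the real request. Repeated page views within the TTL reuse the stored result instead of counting against the search limit.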

Need higher limits?

Reach out via the dashboard. Most workloads comfortably fit within the defaults, but we regularly adjust for customers with legitimate traffic needs.

Related

Errors

Full error format reference.
