Memic applies rate limits per API key to protect the service and ensure fair usage. Limits vary by endpoint: ingestion endpoints have higher limits than search and chat because indexing is long-running but happens less often.
Rate limits are evolving as Memic scales. The numbers below are defaults; if your workload needs higher limits, contact us via the dashboard.

Default limits (per API key)

Endpoint category                            Limit         Window
Search (POST /search)                        60 requests   per minute
Chat (POST /chat)                            30 requests   per minute
File ingestion (POST /files/init, /confirm)  120 requests  per minute
Status polls (GET /files/{id}/status)        300 requests  per minute
List endpoints (GET /files, GET /projects)   120 requests  per minute
Prompts (GET /prompts/{name})                600 requests  per minute
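
One way to stay under these defaults is to pace requests on the client before they are sent. The sketch below is a minimal rolling-window limiter; the class name `SlidingWindowLimiter` and its parameters are hypothetical and not part of the Memic SDK. How the server counts its one-minute window is not specified here, so treat this as a conservative approximation rather than a mirror of the server's logic.

```python
import time
from collections import deque


class SlidingWindowLimiter:
    """Allow at most `limit` calls in any rolling `window` seconds (hypothetical helper)."""

    def __init__(self, limit, window=60.0, clock=time.monotonic, sleep=time.sleep):
        self.limit = limit
        self.window = window
        self._clock = clock
        self._sleep = sleep
        self._sent = deque()  # timestamps of recent calls, oldest first

    def acquire(self):
        """Block until a call is allowed, then record it."""
        while True:
            now = self._clock()
            # Forget calls that have aged out of the window.
            while self._sent and now - self._sent[0] >= self.window:
                self._sent.popleft()
            if len(self._sent) < self.limit:
                self._sent.append(now)
                return
            # Wait until the oldest recorded call leaves the window.
            self._sleep(self.window - (now - self._sent[0]))
```

For example, creating `SlidingWindowLimiter(60)` once and calling `acquire()` before each POST /search keeps a single process within the default search limit. Several processes sharing one API key would each need a smaller share of the limit.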

When you hit a limit

The API returns 429 Too Many Requests with a Retry-After header that gives the number of seconds to wait:
HTTP/1.1 429 Too Many Requests
Retry-After: 12
Content-Type: application/json

{
  "detail": "Rate limit exceeded. Retry after 12 seconds.",
  "code": "rate_limit_exceeded",
  "request_id": "req_01J9..."
}
Respect the Retry-After value. Do not retry faster — doing so will extend your penalty window.

Handling 429 in code

The Python SDK handles this automatically with exponential backoff and jitter. If you’re calling the API directly, honor the Retry-After header:
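import time
import requests

def call_with_retry(url, headers, json, max_attempts=5):
    """POST to url, retrying on 429 and honoring Retry-After."""
    for attempt in range(max_attempts):
        r = requests.post(url, headers=headers, json=json)
        if r.status_code != 429:
            return r
        if attempt == max_attempts - 1:
            break  # out of attempts; don't sleep before giving up
        try:
            wait = int(r.headers["Retry-After"])
        except (KeyError, ValueError):
            # Header missing or not an integer: fall back to exponential backoff.
            wait = 2 ** attempt
        time.sleep(wait)
    # Still rate limited after max_attempts: raise requests.HTTPError.
    r.raise_for_status()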
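
If a 429 arrives without a usable Retry-After value, adding random jitter to the fallback delay helps keep many clients from retrying in lockstep. The SDK's exact backoff schedule is not documented on this page, so the following is only a sketch of one common variant ("full jitter"); `backoff_delay` and its `base` and `cap` defaults are illustrative assumptions, not Memic values.

```python
import random


def backoff_delay(attempt, base=1.0, cap=30.0, rng=random.random):
    """Return a sleep time in seconds for a zero-based retry attempt.

    Full jitter: pick uniformly between 0 and min(cap, base * 2**attempt).
    """
    ceiling = min(cap, base * (2 ** attempt))
    return rng() * ceiling
```

A retry loop could call `time.sleep(backoff_delay(attempt))` whenever the header is missing, and should still prefer the server's Retry-After value when one is present.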

Best practices

  • Cache identical search/chat queries — if you’re running the same query repeatedly (e.g. for every page view), add a short client-side cache
  • Batch where you can — use the list endpoint with page_size=100 instead of polling item-by-item
  • Back off on failure — don’t tight-loop retries
  • Use the Python SDK — it handles retry and backoff automatically
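
The caching advice above can be sketched with a small time-to-live cache. `TTLCache` and `get_or_call` are hypothetical names for illustration, and the 30-second default is an arbitrary assumption; pick a lifetime that matches how stale a search or chat result may be in your application.

```python
import time


class TTLCache:
    """Tiny in-memory cache whose entries expire after `ttl` seconds (hypothetical helper)."""

    def __init__(self, ttl=30.0, clock=time.monotonic):
        self.ttl = ttl
        self._clock = clock
        self._entries = {}  # key -> (expiry time, value)

    def get_or_call(self, key, fetch):
        """Return the cached value for `key`, calling `fetch()` on a miss or after expiry."""
        now = self._clock()
        hit = self._entries.get(key)
        if hit is not None and hit[0] > now:
            return hit[1]
        value = fetch()
        self._entries[key] = (now + self.ttl, value)
        return value
```

Here `key` would typically be the query text plus any parameters that change the result, and `fetch` a zero-argument callable that performs the actual request. Expired entries are only overwritten, never evicted, so a long-running process with many distinct queries would also need a size bound.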

Need higher limits?

Reach out via the dashboard. Most workloads comfortably fit within the defaults, but we regularly adjust for customers with legitimate traffic needs.

Errors

Full error format reference.