Rate limits are evolving as Memic scales. The numbers below are defaults; if
your workload needs higher limits, contact us via the dashboard.
## Default limits (per API key)

| Endpoint category | Limit | Window |
|---|---|---|
| Search (`POST /search`) | 60 requests | per minute |
| Chat (`POST /chat`) | 30 requests | per minute |
| File ingestion (`POST /files/init`, `/confirm`) | 120 requests | per minute |
| Status polls (`GET /files/{id}/status`) | 300 requests | per minute |
| List endpoints (`GET /files`, `GET /projects`) | 120 requests | per minute |
| Prompts (`GET /prompts/{name}`) | 600 requests | per minute |
## When you hit a limit

The API returns `429 Too Many Requests` with a `Retry-After` header, in seconds. Wait at least the `Retry-After` value before retrying. Do not retry faster; doing so will extend your penalty window.
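For illustration, a rate-limited response might look like the following (the exact JSON body shape here is an assumption; see the error format reference for the authoritative structure):

```http
HTTP/1.1 429 Too Many Requests
Retry-After: 17
Content-Type: application/json

{"error": {"type": "rate_limit_exceeded", "message": "Rate limit exceeded. Retry after 17 seconds."}}
```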
## Handling 429 in code

The Python SDK handles this automatically with exponential backoff plus jitter. If you're hitting the API directly, use the `Retry-After` header:
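A minimal sketch of that logic (`backoff_delay` is a hypothetical helper, and the SDK's actual internals may differ): honor `Retry-After` when the server sends it, and fall back to exponential backoff with full jitter otherwise.

```python
import random


def backoff_delay(headers, attempt, base=1.0, cap=60.0):
    """Seconds to wait before the next retry.

    If the server sent Retry-After, honor it exactly. Otherwise use
    exponential backoff with full jitter: a random delay in
    [0, min(cap, base * 2**attempt)].
    """
    retry_after = headers.get("Retry-After")
    if retry_after is not None:
        return float(retry_after)  # server-mandated wait, in seconds
    return random.uniform(0, min(cap, base * (2 ** attempt)))
```

In a request loop, call `backoff_delay(response.headers, attempt)` after each 429 and `time.sleep()` for that long before retrying.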
## Best practices

- Cache identical search/chat queries: if you're running the same query repeatedly (e.g. for every page view), add a short client-side cache.
- Batch where you can: use the list endpoint with `page_size=100` instead of polling item by item.
- Back off on failure: don't tight-loop retries.
- Use the Python SDK: it handles retry and backoff automatically.
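The first practice above can be as simple as a small TTL cache keyed by the query string. A sketch (the `TTLCache` class is hypothetical, not part of the SDK):

```python
import time


class TTLCache:
    """Minimal client-side cache for identical search/chat responses."""

    def __init__(self, ttl_seconds=30.0):
        self.ttl = ttl_seconds
        self._store = {}  # key -> (value, expiry timestamp)

    def get(self, key):
        hit = self._store.get(key)
        if hit is None:
            return None
        value, expires = hit
        if time.monotonic() > expires:
            del self._store[key]  # expired; drop it
            return None
        return value

    def set(self, key, value):
        self._store[key] = (value, time.monotonic() + self.ttl)
```

Check the cache before calling `POST /search`, and store the response on a miss; even a 30-second TTL eliminates most duplicate calls from page views.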
## Need higher limits?

Reach out via the dashboard. Most workloads comfortably fit within the defaults, but we regularly adjust limits for customers with legitimate traffic needs.

## Related

- Errors: full error format reference.