Rate Limiting
Edge and application-level limits that protect the runtime and your LLM budget.
Layer 1: Nginx
Layer 2: Gateway
Error shape
{
"error": "Rate limited",
"retryAfterMs": 45000
}Why two layers matter
Last updated