Rate limits

Rate limits are applied per API key, per minute.

Endpoints	Limit
Image creates - `generate`, `edit`, `resize`, `upscale`	20 / min
Video creates - `image-to-video`, `frames`	5 / min
Style extraction - `POST /styles/extract`	3 / min
Uploads - `POST /uploads` (presign)	30 / min
Everything else - `account`, job polling/listing, style reads, cancel	120 / min
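The table above can be mirrored on the client so requests are spaced out before the server has to reject them. The following is a minimal sketch, not part of the API: the bucket names and the `SlidingWindowLimiter` class are hypothetical, and it assumes a 60-second sliding window, which may not match how the server actually counts.

```python
import time
from collections import deque

# Per-minute limits copied from the table above. The bucket names are
# hypothetical labels chosen for this sketch, not identifiers from the API.
LIMITS_PER_MINUTE = {
    "image_create": 20,
    "video_create": 5,
    "style_extract": 3,
    "upload": 30,
    "other": 120,
}


class SlidingWindowLimiter:
    """Client-side guard that tracks recent request timestamps per bucket."""

    def __init__(self, limits=LIMITS_PER_MINUTE, window=60.0, clock=time.monotonic):
        self.limits = dict(limits)
        self.window = window  # assumed window length in seconds
        self.clock = clock    # injectable so the limiter can be tested
        self.sent = {name: deque() for name in self.limits}

    def delay_for(self, bucket):
        """Return how many seconds to wait before sending in `bucket`."""
        now = self.clock()
        stamps = self.sent[bucket]
        # Drop timestamps that have aged out of the window.
        while stamps and now - stamps[0] >= self.window:
            stamps.popleft()
        if len(stamps) < self.limits[bucket]:
            return 0.0
        # Bucket is full: wait until the oldest request leaves the window.
        return self.window - (now - stamps[0])

    def record(self, bucket):
        """Note that a request in `bucket` was just sent."""
        self.sent[bucket].append(self.clock())
```

A caller would check `delay_for(bucket)`, sleep for that long, send the request, then call `record(bucket)`. This only reduces the chance of a 429; the server's response remains the authority.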

Reading the limit

Responses include standard rate-limit headers, including X-RateLimit-Remaining, so you can pace bursts off the server’s signal.

When you exceed a limit, you get a 429 response with a Retry-After header (in seconds):

{ "error": { "code": "RATE_LIMIT_ERROR", "message": "Rate limit exceeded" } }
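One way to read such a response is sketched below. The function name `parse_429` and the one-second fallback are assumptions made for illustration; only the status code, the Retry-After header, and the error body shape come from the text above.

```python
import json


def parse_429(status, headers, body):
    """Return (error_code, wait_seconds) for a 429 response, else None."""
    if status != 429:
        return None
    # HTTP header names are case-insensitive, so normalise before lookup.
    lowered = {k.lower(): v for k, v in headers.items()}
    try:
        wait = float(lowered.get("retry-after", ""))
    except ValueError:
        wait = 1.0  # assumed fallback when the header is missing or not numeric
    try:
        code = json.loads(body)["error"]["code"]
    except (ValueError, KeyError, TypeError):
        code = None  # body was not the expected JSON error object
    return code, max(wait, 0.0)
```

Returning the error code alongside the wait lets a caller branch on the code, for example to tell a per-minute limit from any other 429.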

Wait the number of seconds in Retry-After, then retry.
For bursts, watch X-RateLimit-Remaining and slow down before you hit zero.
Concurrency is also bounded - starting too many in-flight generations returns 429 CONCURRENT_LIMIT_REACHED; poll your existing jobs before starting more. Video has a lower concurrency ceiling than images and runs on a small dedicated pool, so clips may queue (stay pending) before they start processing - keep polling.
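The guidance above can be sketched as a small retry helper plus a pacing check. Everything named here is hypothetical: `send` stands for whatever function issues the request and returns `(status, headers, body)` with lower-cased header names, and the `low_water`, `pause`, and `default_wait` values are arbitrary illustration defaults rather than documented numbers.

```python
import time


def call_with_backoff(send, max_attempts=5, sleep=time.sleep, default_wait=1.0):
    """Call `send()` until it returns a non-429 response or attempts run out."""
    for attempt in range(1, max_attempts + 1):
        status, headers, body = send()
        if status != 429:
            return status, headers, body
        if attempt == max_attempts:
            break
        try:
            # Retry-After is given in seconds.
            wait = float(headers.get("retry-after", default_wait))
        except ValueError:
            wait = default_wait  # assumed fallback for an unparseable header
        sleep(max(wait, 0.0))
    raise RuntimeError("still rate limited after %d attempts" % max_attempts)


def pacing_delay(headers, low_water=2, pause=3.0):
    """Suggest a pause when X-RateLimit-Remaining is close to zero."""
    try:
        remaining = int(headers.get("x-ratelimit-remaining", ""))
    except ValueError:
        return 0.0  # header absent or unparseable: no signal to act on
    return pause if remaining <= low_water else 0.0
```

This sketch treats every 429 the same way. A fuller client would likely check the error code and, on CONCURRENT_LIMIT_REACHED, poll its existing jobs before submitting again rather than simply sleeping.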

Long-polling with ?wait=true counts as a single request against these limits while it’s held - see Polling & wait.