GET /v1/async/status returns a live snapshot of everything currently in flight for your API key: how deep your admission queue is versus its capacity, how much of your sync concurrency budget is in use, and every inflight child grouped by surface and by region. It takes no parameters — it always reports the current state of the key making the request. Use it as a pacing signal. Before you fire a large batch of captures, read this endpoint to see how much headroom you have, then throttle your submissions to stay under your limits instead of discovering them through 429 responses.
This is a read-only snapshot and never counts against your credits. It reflects state at the moment of the request and can change immediately as work is admitted and completed.

Request

curl https://api.aisearchapi.dev/v1/async/status \
  -H "Authorization: Bearer $AISEARCH_API_KEY"
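
For callers who script this check, the same request can be sketched in Python with only the standard library. The helper names `build_status_request` and `fetch_status` and the 10-second timeout are illustrative assumptions, not part of the API; the URL and the bearer header come from the curl example above.

```python
import json
import urllib.request

STATUS_URL = "https://api.aisearchapi.dev/v1/async/status"


def build_status_request(api_key: str) -> urllib.request.Request:
    """Build the parameterless GET request with a bearer token."""
    return urllib.request.Request(
        STATUS_URL,
        headers={"Authorization": f"Bearer {api_key}"},
        method="GET",
    )


def fetch_status(api_key: str, timeout: float = 10.0) -> dict:
    """Send the request and decode the JSON snapshot (timeout is an assumed default)."""
    request = build_status_request(api_key)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

Calling `fetch_status(os.environ["AISEARCH_API_KEY"])` would return the snapshot as a plain dictionary; a non-2xx response surfaces as `urllib.error.HTTPError`.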

Response

200 OK
{
  "admissionQueue": {
    "depth": 12,
    "capacity": 200
  },
  "syncConcurrency": {
    "limit": 8,
    "running": 3
  },
  "inflight": {
    "total": 27,
    "bySurface": {
      "chatgpt": 9,
      "claude": 4,
      "perplexity": 6,
      "gemini": 5,
      "copilot": 3
    },
    "byRegion": {
      "US": 18,
      "GB": 6,
      "DE": 3
    }
  }
}

admissionQueue (object)
The async admission queue for your key. New parent jobs wait here before their children begin processing.

syncConcurrency (object)
Your synchronous concurrency budget: the number of ?mode=sync (or Prefer: wait) captures that can run at the same time.

inflight (object)
Every child currently active (queued or processing) for your key.
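
One way to work with these three objects is to flatten them into a small typed record. The sketch below is an illustration only: `StatusSnapshot` and `parse_snapshot` are hypothetical names, and `SAMPLE` simply repeats the example response shown earlier.

```python
from dataclasses import dataclass

# The example response from this page, used here as sample input.
SAMPLE = {
    "admissionQueue": {"depth": 12, "capacity": 200},
    "syncConcurrency": {"limit": 8, "running": 3},
    "inflight": {
        "total": 27,
        "bySurface": {"chatgpt": 9, "claude": 4, "perplexity": 6, "gemini": 5, "copilot": 3},
        "byRegion": {"US": 18, "GB": 6, "DE": 3},
    },
}


@dataclass(frozen=True)
class StatusSnapshot:
    queue_depth: int
    queue_capacity: int
    sync_limit: int
    sync_running: int
    inflight_total: int
    by_surface: dict
    by_region: dict


def parse_snapshot(body: dict) -> StatusSnapshot:
    """Flatten the nested JSON body into one record; raises KeyError if a field is missing."""
    queue, sync, inflight = body["admissionQueue"], body["syncConcurrency"], body["inflight"]
    return StatusSnapshot(
        queue_depth=queue["depth"],
        queue_capacity=queue["capacity"],
        sync_limit=sync["limit"],
        sync_running=sync["running"],
        inflight_total=inflight["total"],
        by_surface=dict(inflight["bySurface"]),
        by_region=dict(inflight["byRegion"]),
    )
```

In the sample, the per-surface counts and the per-region counts each add up to `inflight.total`. That is an observation about this one example, not a documented guarantee, so the sketch does not enforce it.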

Pace submissions to avoid 429s

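Every 429 your key can receive maps to something you can check before submitting: the queue and concurrency errors map to fields in this snapshot, and the rate-limit error maps to the X-RateLimit-* response headers.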

| You want to avoid | Watch | Headroom formula |
| --- | --- | --- |
| QUEUE_CAPACITY_EXCEEDED | admissionQueue | capacity - depth |
| CONCURRENCY_LIMIT_EXCEEDED | syncConcurrency | limit - running |
| RATE_LIMIT_EXCEEDED | request rate (see X-RateLimit-* headers) | |

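
The two snapshot-based formulas can be sketched as a single function. The name `headroom` is hypothetical, and clamping negative results to zero is an added assumption rather than documented behavior. RATE_LIMIT_EXCEEDED is left out because it is tracked through response headers, not through this snapshot.

```python
def headroom(snapshot: dict) -> dict:
    """Return remaining headroom per 429 code, computed from a status snapshot."""
    queue = snapshot["admissionQueue"]
    sync = snapshot["syncConcurrency"]
    return {
        # capacity - depth, clamped at zero (the clamp is an assumption)
        "QUEUE_CAPACITY_EXCEEDED": max(0, queue["capacity"] - queue["depth"]),
        # limit - running, clamped at zero (the clamp is an assumption)
        "CONCURRENCY_LIMIT_EXCEEDED": max(0, sync["limit"] - sync["running"]),
    }
```

With the example response on this page (depth 12 of 200, 3 of 8 running), this yields 188 free queue slots and 5 free sync slots.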
1. Read the snapshot before a batch

   Poll GET /v1/async/status right before submitting a large set of captures.

2. Compute your headroom

   For async work, keep submissions under admissionQueue.capacity - admissionQueue.depth. For sync work, keep concurrent calls under syncConcurrency.limit - syncConcurrency.running.

3. Throttle, then submit

   Submit up to your headroom, let some children drain, and re-read the snapshot before the next wave rather than retrying blindly.
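
The three steps above can be sketched as a wave-based submission loop for async work. Everything named here is hypothetical: `submit_in_waves`, the injected `read_snapshot` and `submit` callables, and the 2-second pause between reads. The sketch also assumes that each submission occupies one admission-queue slot, which this page does not state explicitly.

```python
import time
from typing import Callable, Sequence


def submit_in_waves(
    jobs: Sequence[dict],
    read_snapshot: Callable[[], dict],
    submit: Callable[[dict], None],
    poll_seconds: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Submit jobs in waves sized to the admission-queue headroom; return the wave count."""
    pending = list(jobs)
    waves = 0
    while pending:
        # Step 1: read the snapshot right before each wave.
        queue = read_snapshot()["admissionQueue"]
        # Step 2: compute headroom (assumes one queue slot per submission).
        room = max(0, queue["capacity"] - queue["depth"])
        if room == 0:
            sleep(poll_seconds)  # queue is full: let work drain, then re-read
            continue
        # Step 3: submit up to the headroom, keep the rest for a later wave.
        wave, pending = pending[:room], pending[room:]
        for job in wave:
            submit(job)
        waves += 1
        if pending:
            sleep(poll_seconds)  # give children time to drain before re-reading
    return waves
```

Passing the reader and the submitter in as callables keeps the pacing logic independent of any HTTP client, so it can be exercised with canned snapshots before it touches the real API.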
Prefer async submission for large batches. A single POST /v1/search fans out into one child per surface × region and only occupies the admission queue, leaving your sync concurrency budget free for latency-sensitive, one-surface calls.

This snapshot is advisory, not a reservation. Between reading it and submitting, other requests on the same key can consume headroom. Even after a preflight check, handle 429 responses by honoring the Retry-After header.
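
A minimal sketch of honoring Retry-After follows. It assumes the header carries either a number of seconds or an HTTP date, the two forms the HTTP specification allows; this page does not say which one the API sends. The names `retry_after_seconds` and `send_with_retry`, the 1-second fallback, the 5-attempt cap, and the exact-case `Retry-After` key lookup are all illustrative choices.

```python
import time
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime
from typing import Callable, Mapping, Optional, Tuple


def retry_after_seconds(
    value: Optional[str], default: float = 1.0, now: Optional[datetime] = None
) -> float:
    """Convert a Retry-After value into a non-negative delay in seconds."""
    if value is None:
        return default
    value = value.strip()
    if value.isdigit():  # delta-seconds form, e.g. "7"
        return float(value)
    try:
        when = parsedate_to_datetime(value)  # HTTP-date form
    except (TypeError, ValueError):
        return default  # unparseable header: fall back to the assumed default
    if when.tzinfo is None:
        when = when.replace(tzinfo=timezone.utc)
    now = now or datetime.now(timezone.utc)
    return max(0.0, (when - now).total_seconds())


def send_with_retry(
    send: Callable[[], Tuple[int, Mapping[str, str]]],
    max_attempts: int = 5,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Call send() until it stops returning 429 or attempts run out; return the last status."""
    status = 429
    for attempt in range(1, max_attempts + 1):
        status, headers = send()
        if status != 429 or attempt == max_attempts:
            break
        sleep(retry_after_seconds(headers.get("Retry-After")))
    return status
```

Here `send` is any callable that performs one request and returns the status code with the response headers, which keeps the backoff logic separate from the HTTP client in use.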

Concurrency & limits

How the rate limit, sync concurrency budget, and admission queue fit together — and how to back off cleanly.

Asynchronous captures

Submit a parent job, fan out across surfaces and regions, and poll children as they complete.