Async status - AI Search API

GET /v1/async/status returns a live snapshot of everything currently in flight for your API key: how deep your admission queue is versus its capacity, how much of your sync concurrency budget is in use, and every inflight child grouped by surface and by region. It takes no parameters — it always reports the current state of the key making the request. Use it as a pacing signal. Before you fire a large batch of captures, read this endpoint to see how much headroom you have, then throttle your submissions to stay under your limits instead of discovering them through 429 responses.

This is a read-only snapshot and never counts against your credits. It reflects state at the moment of the request and can change immediately as work is admitted and completed.

Request

curl https://api.aisearchapi.dev/v1/async/status \
  -H "Authorization: Bearer $AISEARCH_API_KEY"
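The same request can be made from Python's standard library. This is a minimal sketch, not an official client: `build_status_request` and `fetch_status` are hypothetical helper names, and the 10-second timeout is an arbitrary assumption.

```python
import json
import urllib.request

STATUS_URL = "https://api.aisearchapi.dev/v1/async/status"


def build_status_request(api_key: str) -> urllib.request.Request:
    """Build the parameterless GET request with a bearer token."""
    return urllib.request.Request(
        STATUS_URL,
        headers={"Authorization": f"Bearer {api_key}"},
        method="GET",
    )


def fetch_status(api_key: str, timeout: float = 10.0) -> dict:
    """Send the request and decode the JSON snapshot (performs network I/O)."""
    request = build_status_request(api_key)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

A caller would typically pass the key from the environment, for example `fetch_status(os.environ["AISEARCH_API_KEY"])`.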

Response

200 OK

{
  "admissionQueue": {
    "depth": 12,
    "capacity": 200
  },
  "syncConcurrency": {
    "limit": 8,
    "running": 3
  },
  "inflight": {
    "total": 27,
    "bySurface": {
      "chatgpt": 9,
      "claude": 4,
      "perplexity": 6,
      "gemini": 5,
      "copilot": 3
    },
    "byRegion": {
      "US": 18,
      "GB": 6,
      "DE": 3
    }
  }
}

admissionQueue

object

The async admission queue for your key. New parent jobs wait here before their children begin processing.


depth

integer

Number of children currently waiting in the admission queue.

capacity

integer

Maximum number of children the queue can hold. When depth reaches capacity, new submissions are rejected with 429 QUEUE_CAPACITY_EXCEEDED.

syncConcurrency

object

Your synchronous concurrency budget — the number of ?mode=sync (or Prefer: wait) captures that can run at the same time.


limit

integer

Maximum sync captures that may run concurrently for your key.

running

integer

Sync captures running right now. When running reaches limit, further sync requests return 429 CONCURRENCY_LIMIT_EXCEEDED.

inflight

object

Every child currently active (queued or processing) for your key.


total

integer

Total inflight children across all surfaces and regions.

bySurface

object

Inflight child counts keyed by surface (chatgpt, claude, perplexity, gemini, copilot, google_ai_overview, google_ai_mode). Surfaces with nothing in flight are omitted.

byRegion

object

Inflight child counts keyed by region country code (ISO 3166-1 alpha-2). Regions with nothing in flight are omitted.
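Because surfaces and regions with nothing in flight are omitted, a missing key means zero rather than an error. A minimal sketch of reading the counts defensively, using the `inflight` values from the sample response above; the helper names are hypothetical:

```python
# The "inflight" portion of the sample response shown earlier.
SAMPLE = {
    "inflight": {
        "total": 27,
        "bySurface": {"chatgpt": 9, "claude": 4, "perplexity": 6, "gemini": 5, "copilot": 3},
        "byRegion": {"US": 18, "GB": 6, "DE": 3},
    }
}


def inflight_for_surface(snapshot: dict, surface: str) -> int:
    """Inflight children for one surface; an omitted key means 0."""
    return snapshot["inflight"]["bySurface"].get(surface, 0)


def inflight_for_region(snapshot: dict, region: str) -> int:
    """Inflight children for one ISO 3166-1 alpha-2 region; an omitted key means 0."""
    return snapshot["inflight"]["byRegion"].get(region, 0)


print(inflight_for_surface(SAMPLE, "chatgpt"))         # 9
print(inflight_for_surface(SAMPLE, "google_ai_mode"))  # 0 (omitted from the snapshot)
print(inflight_for_region(SAMPLE, "FR"))               # 0 (omitted from the snapshot)
```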

Pace submissions to avoid 429s

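The 429s your key can receive are listed below. The queue and concurrency errors each map to a field in this snapshot, so you can preflight against them; the rate limit is tracked through response headers instead: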

You want to avoid	Watch	Headroom formula
`QUEUE_CAPACITY_EXCEEDED`	`admissionQueue`	`capacity - depth`
`CONCURRENCY_LIMIT_EXCEEDED`	`syncConcurrency`	`limit - running`
`RATE_LIMIT_EXCEEDED`	request rate (see `X-RateLimit-*` headers)	—
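The two headroom formulas in the table can be sketched as plain functions over the decoded snapshot. The function names are hypothetical, and clamping at zero is a defensive assumption rather than documented behavior:

```python
def async_headroom(snapshot: dict) -> int:
    """Free admission-queue slots: capacity - depth, never below zero."""
    queue = snapshot["admissionQueue"]
    return max(0, queue["capacity"] - queue["depth"])


def sync_headroom(snapshot: dict) -> int:
    """Free sync concurrency slots: limit - running, never below zero."""
    sync = snapshot["syncConcurrency"]
    return max(0, sync["limit"] - sync["running"])


# Values taken from the sample response above.
snapshot = {
    "admissionQueue": {"depth": 12, "capacity": 200},
    "syncConcurrency": {"limit": 8, "running": 3},
}
print(async_headroom(snapshot))  # 188
print(sync_headroom(snapshot))   # 5
```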

Read the snapshot before a batch

Poll GET /v1/async/status right before submitting a large set of captures.

Compute your headroom

For async work, keep the number of children you submit at or below admissionQueue.capacity - admissionQueue.depth. For sync work, keep new concurrent calls at or below syncConcurrency.limit - syncConcurrency.running.

Throttle, then submit

Submit up to your headroom, let some children drain, and re-read the snapshot before the next wave rather than retrying blindly.
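The three steps above can be sketched as a small wave loop. Everything here is illustrative: `submit_in_waves` is a hypothetical helper, `fetch_status` and `submit` are callables you supply (for example, wrappers around GET /v1/async/status and POST /v1/search), `children_per_job` is your own estimate of how many children each parent fans out into, and the 5-second poll interval is an arbitrary assumption. The sketch does not handle a 429 raised by `submit`; that still needs its own handling.

```python
import time
from typing import Any, Callable, Sequence


def submit_in_waves(
    jobs: Sequence[Any],
    children_per_job: int,
    fetch_status: Callable[[], dict],
    submit: Callable[[Any], None],
    poll_seconds: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Submit jobs in waves sized to the admission queue headroom.

    Returns the number of waves that were submitted.
    """
    if children_per_job < 1:
        raise ValueError("children_per_job must be at least 1")
    pending = list(jobs)
    waves = 0
    while pending:
        # Step 1: read the snapshot before each wave.
        queue = fetch_status()["admissionQueue"]
        if children_per_job > queue["capacity"]:
            raise ValueError("a single job can never fit in the admission queue")
        # Step 2: compute headroom and how many whole jobs fit in it.
        headroom = max(0, queue["capacity"] - queue["depth"])
        fit = headroom // children_per_job
        if fit == 0:
            # No room yet: let some children drain, then re-read.
            sleep(poll_seconds)
            continue
        # Step 3: submit up to the headroom, then loop to re-read.
        wave, pending = pending[:fit], pending[fit:]
        for job in wave:
            submit(job)
        waves += 1
    return waves
```

Passing `sleep` and the two callables as parameters keeps the loop testable without network access or real delays.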

Prefer async submission for large batches. A single POST /v1/search fans out into one child per surface × region and only occupies the admission queue, leaving your sync concurrency budget free for latency-sensitive, one-surface calls.
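To size a batch against the queue, it helps to work out the fan-out first. The sketch below assumes each child occupies one admission-queue slot, which the `depth` description suggests but does not state outright; the function names are hypothetical.

```python
from typing import Iterable


def children_per_parent(surfaces: Iterable[str], regions: Iterable[str]) -> int:
    """One child per distinct surface x region pair."""
    return len(set(surfaces)) * len(set(regions))


def parents_that_fit(headroom: int, surfaces: Iterable[str], regions: Iterable[str]) -> int:
    """How many whole parent jobs fit into the given queue headroom."""
    per_parent = children_per_parent(surfaces, regions)
    return headroom // per_parent if per_parent else 0


surfaces = ["chatgpt", "claude", "perplexity", "gemini", "copilot"]
regions = ["US", "GB", "DE"]
print(children_per_parent(surfaces, regions))    # 15
print(parents_that_fit(188, surfaces, regions))  # 12
```

With the sample snapshot's headroom of 188 (200 - 12), twelve such parents use 180 slots and a thirteenth would not fit.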

This snapshot is advisory, not a reservation. Between reading it and submitting, other requests on the same key can consume headroom. Always handle 429 responses by honoring the Retry-After header.
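A minimal sketch of turning a Retry-After value into a wait time. This page does not say which form the API sends, so the helper accepts both forms that HTTP allows (a number of seconds or an HTTP date); `retry_after_seconds` is a hypothetical name and the 1-second fallback is an assumption.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime
from typing import Optional


def retry_after_seconds(
    header_value: Optional[str],
    now: Optional[datetime] = None,
    default: float = 1.0,
) -> float:
    """Seconds to wait before retrying, derived from a Retry-After header."""
    if not header_value:
        return default  # header missing: fall back to the assumed default
    value = header_value.strip()
    if value.isdigit():
        return float(value)  # delta-seconds form, e.g. "7"
    try:
        when = parsedate_to_datetime(value)  # HTTP-date form
    except (TypeError, ValueError):
        return default  # unparseable value: fall back to the assumed default
    if when.tzinfo is None:
        when = when.replace(tzinfo=timezone.utc)
    current = now or datetime.now(timezone.utc)
    return max(0.0, (when - current).total_seconds())
```

On a 429, sleep for the returned number of seconds, then re-read the snapshot before submitting again.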

Related

Concurrency & limits

How the rate limit, sync concurrency budget, and admission queue fit together — and how to back off cleanly.

Asynchronous captures

Submit a parent job, fan out across surfaces and regions, and poll children as they complete.
