Search before crawl

Find candidates before spending crawl work

Search keeps agents from crawling blindly. Start from a query, pick the right channel, and return normalized result collections that can become crawl targets, review queues, or research inputs.

Public web results · URL discovery · Ranked snippets · Crawl-ready links

curl

Run page search

curl -X POST "https://api.anycrawler.com/v1/search" \
  -H "Authorization: Bearer $ANYCRAWLER_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "channel": "page",
    "query": "browser automation documentation",
    "country": "us",
    "page": 1,
    "results_per_page": 10
  }'

Use the unified /v1/search endpoint and select the search type with the channel field.

Use cases

Web Page Search API use cases

Treat page search results as a shortlist: dedupe domains, prefer authoritative sources, then call fetch for HTML-first pages or render for JavaScript-heavy results.
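The domain dedupe step can be sketched as follows. This is a minimal illustration, assuming the results.organic shape shown in the response example on this page; shortlistByDomain is a hypothetical helper, not part of the AnyCrawler API, and it does not attempt to judge which sources are authoritative.

```javascript
// Hypothetical helper: keep the best-ranked organic result for each hostname.
// Assumes each item carries `position` and `link`, as in the response example.
function shortlistByDomain(organic) {
  const bestByHost = new Map();
  for (const result of organic) {
    let host;
    try {
      // Treat "www.example.com" and "example.com" as the same domain.
      host = new URL(result.link).hostname.replace(/^www\./, "");
    } catch {
      continue; // skip results whose link is not a valid absolute URL
    }
    const current = bestByHost.get(host);
    if (!current || result.position < current.position) {
      bestByHost.set(host, result);
    }
  }
  // Return the survivors in ranked order.
  return [...bestByHost.values()].sort((a, b) => a.position - b.position);
}
```

Pass data.results.organic from a search response and send the returned links on to fetch or render.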

Build a crawl seed list

Start from a topic and collect candidate URLs before deciding which pages deserve fetch or render credits.
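One way to collect a seed list is to page through results and keep the unique links. The sketch below assumes the page and results_per_page request fields and the results.organic response shape documented on this page. collectSeedUrls and its injected search function are hypothetical names; each call it makes is a separate search request and settles credits accordingly.

```javascript
// Hypothetical helper: page through search results and collect unique links.
// `search` is any async function that takes a request body and resolves to the
// response envelope shown on this page (for example, a wrapper around
// POST /v1/search).
async function collectSeedUrls(search, query, maxPages = 2) {
  const seeds = new Set();
  for (let page = 1; page <= maxPages; page += 1) {
    const data = await search({
      channel: "page",
      query,
      page,
      results_per_page: 10,
    });
    const organic = data?.results?.organic ?? [];
    if (organic.length === 0) break; // stop when a page comes back empty
    for (const result of organic) seeds.add(result.link);
  }
  return [...seeds];
}
```

Injecting the search function keeps the paging logic independent of how the HTTP call is made, so it can be exercised with a stub before any credits are spent.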

Compare public product pages

Find pricing, documentation, changelog, or competitor pages, then crawl the strongest results into Markdown.

Recover from stale URLs

When a known page moves or disappears, search for fresh alternatives and route the best result into the crawler.

How it works

Query, normalize, route the strongest results

The gateway handles request validation, account checks, search provider routing, response normalization, caching, and credit settlement.

1. Submit a query

Send a POST request with query, channel=page, and optional localization or pagination fields.

2. AnyCrawler routes the search

The gateway validates your API key, checks credits, applies cache policy, and sends the request to the selected search collection.

3. Results are normalized

Upstream results are returned in a predictable response envelope: ok, query, credits_used, cache_timestamp, and status_code at the top level, with the channel and other search parameters nested under results.search_parameters.

4. You choose the next step

Use result links as crawl targets, visual review candidates, research inputs, or source lists for a larger agent workflow.

Response

Normalized page results

{
  "ok": true,
  "query": "browser automation documentation",
  "credits_used": 20,
  "cache_timestamp": 0,
  "status_code": 200,
  "results": {
    "search_parameters": {
      "channel": "page",
      "country": "us",
      "page": 1,
      "query": "browser automation documentation",
      "results_per_page": 10
    },
    "organic": [
      {
        "position": 1,
        "title": "Playwright Documentation",
        "link": "https://playwright.dev/docs/intro",
        "snippet": "Fast and reliable browser automation documentation."
      }
    ]
  }
}

Result collections vary by channel, but the public response envelope stays consistent.

What you get

Search results that can become crawl targets

The page search response is built for routing. Use snippets for triage, links for follow-up crawl requests, and credit/cache fields for production observability.

  • Organic links for direct fetch or render follow-up
  • Snippets for quick source triage
  • Page and result-count fields for repeatable discovery
  • Credit and cache fields for operational planning
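As a rough illustration of reading the response for routing and observability, the sketch below reduces an envelope to the fields listed above. summarizeSearch is a hypothetical helper, and it assumes the envelope shape from the response example on this page. The meaning of a cache_timestamp of 0 is not stated here, so the value is passed through unchanged.

```javascript
// Hypothetical helper: reduce a search envelope to the fields used for
// triage, follow-up crawl requests, and logging.
function summarizeSearch(data) {
  if (!data || data.ok !== true) {
    throw new Error(`search failed with status ${data?.status_code ?? "unknown"}`);
  }
  const organic = data.results?.organic ?? [];
  return {
    query: data.query,
    page: data.results?.search_parameters?.page,
    // Links become fetch or render targets.
    links: organic.map((r) => r.link),
    // One line per result for quick source triage.
    triage: organic.map((r) => `${r.position}. ${r.title}: ${r.snippet}`),
    creditsUsed: data.credits_used,
    // Passed through as-is; how to interpret 0 is not documented on this page.
    cacheTimestamp: data.cache_timestamp,
  };
}
```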

Request fields

One search endpoint, explicit channel selection

The stable public contract covers the fields below. Undocumented passthrough fields are outside that contract and should not be used in new integrations.

| Field | Type | Required | Notes |
| --- | --- | --- | --- |
| channel | "page", "images", "news", "videos", or "scholar" | Yes | Selects the search vertical on the unified public search endpoint. |
| query | string | Yes | The search query to run. |
| country | string | No | Optional country hint for localized search results. |
| page | integer | No | Search result page to request. |
| results_per_page | integer | No | Requested result count. Billing is calculated in blocks of 10 results. |
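A client-side check of the request fields can catch obvious mistakes before a request is sent. The sketch below is an assumption-laden illustration: buildSearchRequest is a hypothetical helper, and the lower bound of 1 for page and results_per_page is a guess, since this page only says the fields are integers. The gateway remains the source of truth and answers 400 for invalid input.

```javascript
// Channel values documented in the request fields above.
const CHANNELS = ["page", "images", "news", "videos", "scholar"];

// Hypothetical helper: build a request body containing only documented fields.
function buildSearchRequest({ channel, query, country, page, results_per_page }) {
  if (!CHANNELS.includes(channel)) {
    throw new TypeError(`channel must be one of: ${CHANNELS.join(", ")}`);
  }
  if (typeof query !== "string" || query.trim() === "") {
    throw new TypeError("query must be a non-empty string");
  }
  const body = { channel, query };
  if (country !== undefined) {
    if (typeof country !== "string") throw new TypeError("country must be a string");
    body.country = country;
  }
  for (const [name, value] of [["page", page], ["results_per_page", results_per_page]]) {
    if (value === undefined) continue;
    // Assumption: values below 1 are rejected; the documented type is only "integer".
    if (!Number.isInteger(value) || value < 1) {
      throw new TypeError(`${name} must be a positive integer`);
    }
    body[name] = value;
  }
  return body;
}
```

The returned object can be passed to JSON.stringify as the POST body for /v1/search.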

Response fields

Fields your integration can rely on

AnyCrawler keeps the search envelope consistent while the result collection reflects the selected channel.

| Field | Meaning |
| --- | --- |
| ok | Whether the gateway completed the search successfully. |
| query | The query submitted in the request. |
| credits_used | Credits settled for the request. |
| cache_timestamp | Cache marker for reused responses. Search cache does not change pricing. |
| status_code | The upstream search provider status code returned through the gateway. |
| results.search_parameters.channel | The search vertical used for the request. |
| results.search_parameters.country | The country hint used for localized search results. |
| results.search_parameters.query | The query mirrored inside the normalized search parameters. |
| results.search_parameters.page | The search result page returned. |
| results.search_parameters.results_per_page | The requested result count used for billing. |
| results.organic | Organic web results with titles, links, and snippets for source triage. |

JavaScript

Use the unified endpoint

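// Top-level await requires an ES module (for example, a .mjs file on Node 18+).
const response = await fetch("https://api.anycrawler.com/v1/search", {
  method: "POST",
  headers: {
    authorization: `Bearer ${process.env.ANYCRAWLER_API_KEY}`,
    "content-type": "application/json",
  },
  body: JSON.stringify({
    channel: "page",
    query: "browser automation documentation",
    country: "us",
    page: 1,
    results_per_page: 10,
  }),
});

// Do not treat an error response (400, 401, 402, 403, 429, 502, 504) as results.
if (!response.ok) {
  throw new Error(`Search failed with HTTP ${response.status}`);
}

const data = await response.json();
console.log(data.results);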

Change channel to switch between page, images, news, videos, and scholar search.

Search vs crawl

Use page search as the first crawl routing step

Once you have promising result links, call fetch or render to turn selected pages into Markdown, metadata, links, and optionally visual evidence.

| API | Best for | Returns |
| --- | --- | --- |
| search | Finding candidate pages, sources, media, news, videos, or papers | Search result collections |
| fetch | Fast, low-cost extraction when useful content is already in HTML | Markdown and page fields |
| render | JavaScript-loaded pages that need browser execution | Rendered Markdown and page fields |
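The choice between fetch and render in the table above can be sketched as a small planning step. planCrawl and the needsBrowser predicate are hypothetical: this page does not say how to detect JavaScript-heavy pages, so the decision is left to the caller, and the sketch only labels each link with the API name rather than calling any endpoint.

```javascript
// Hypothetical planner: tag each result link with the API suggested by the
// search/fetch/render comparison. `needsBrowser` is a caller-supplied
// predicate that returns true for pages that need browser execution.
function planCrawl(links, needsBrowser) {
  return links.map((url) => ({
    url,
    api: needsBrowser(url) ? "render" : "fetch",
  }));
}

// Example predicate, assuming you maintain your own list of hosts known to
// need rendering (the host below is a placeholder).
const renderHosts = new Set(["app.example.com"]);
const plan = planCrawl(
  ["https://playwright.dev/docs/intro", "https://app.example.com/dashboard"],
  (url) => renderHosts.has(new URL(url).hostname),
);
console.log(plan);
```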

Errors

Status codes are explicit enough for production handling

Use response status plus error_code and request headers to route retries, credit issues, authentication problems, and upstream provider failures.

200 Search completed and normalized.
400 Invalid JSON, missing channel, missing query, or unsupported field value.
401 Missing, invalid, or revoked API key.
402 Account does not have enough credits.
403 Account inactive.
429 Rate limit reached.
502 Upstream search provider connection failed.
504 Search provider or gateway timed out.
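The status codes above can be mapped to client actions along these lines. This is a sketch under assumptions: the page does not say which codes are safe to retry, so treating 429, 502, and 504 as retryable is a common convention rather than documented behavior. classifyStatus, backoffMs, and the delay values are hypothetical.

```javascript
// Hypothetical mapping from the listed status codes to a client action.
function classifyStatus(status) {
  switch (status) {
    case 200:
      return "ok";
    case 400:
      return "fix_request"; // invalid JSON, missing or unsupported field
    case 401:
      return "check_api_key"; // missing, invalid, or revoked key
    case 402:
      return "add_credits"; // not enough credits
    case 403:
      return "check_account"; // account inactive
    case 429:
    case 502:
    case 504:
      return "retry"; // assumption: transient, worth retrying with backoff
    default:
      return "unknown";
  }
}

// Exponential backoff with a cap. The base delay and cap are illustrative.
function backoffMs(attempt, baseMs = 500, capMs = 8000) {
  return Math.min(capMs, baseMs * 2 ** attempt);
}
```

A caller would retry only when classifyStatus returns "retry", waiting backoffMs(attempt) between attempts and giving up after a fixed number of tries.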

Billing

Predictable result-count pricing

Search requests settle by requested result blocks. Cache can improve response consistency, but search cache hits do not switch to crawl cache-hit pricing.

Pricing

Start free, scale with explicit search rules

Search requests use 20 credits per block of 10 requested results. The same AnyCrawler plans apply here.
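The pricing rule can be turned into a quick estimate. The sketch below uses the stated 20 credits per block of 10 requested results. It assumes a partial block is billed as a full block, which this page does not confirm, and estimateSearchCredits is a hypothetical helper.

```javascript
const CREDITS_PER_BLOCK = 20; // from the pricing text
const RESULTS_PER_BLOCK = 10; // from the pricing text

// Hypothetical estimate of credits for one search request.
// Assumption: partial blocks round up to a full block.
function estimateSearchCredits(resultsPerPage) {
  if (!Number.isInteger(resultsPerPage) || resultsPerPage < 1) {
    throw new RangeError("resultsPerPage must be a positive integer");
  }
  return Math.ceil(resultsPerPage / RESULTS_PER_BLOCK) * CREDITS_PER_BLOCK;
}

console.log(estimateSearchCredits(10)); // 20, matching credits_used in the response example
console.log(estimateSearchCredits(30)); // 60
```

At 20 credits per ten-result search, the Free plan's 10,000 one-time credits cover 500 such searches.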

Free

$0/mo

For early validation with one-time credits, 30 fetch requests per minute, and 1 browser concurrency slot.

  • 10,000 one-time credits
  • $0 / cr
  • Fetch 30 req / min
  • Browser 1 concurrent

Agent Lite

$5/mo

For low-frequency AI agents and occasional automations that need 15,000 monthly credits.

  • 15,000 credits / mo
  • $0.000333 / cr
  • Fetch 60 req / min
  • Browser 2 concurrent
  • Low-frequency AI agent usage
  • Persistent top-ups available separately

Builder

$20/mo

For builders shipping serious crawler workflows with higher limits and better unit economics.

  • 80,000 credits / mo
  • $0.000250 / cr
  • Fetch 180 req / min
  • Browser 3 concurrent
  • Best upgrade after repeated Agent Lite usage

Growth

$80/mo

For teams scaling production scraping with better unit economics.

  • 960,000 credits / mo
  • $0.0000833 / cr
  • Fetch 480 req / min
  • Browser 8 concurrent

Scale

$200/mo

For high-volume pipelines with pricing aligned to large ongoing usage.

  • 3,000,000 credits / mo
  • $0.0000667 / cr
  • Fetch 1,200 req / min
  • Browser 20 concurrent

FAQ

Common page search questions

Short answers for teams using search to choose better crawl targets.

When should I use page search?

Use page search when the target URL is unknown. It gives agents a shortlist of public pages before you spend crawl credits on the pages that matter.

Can I use the same endpoint for other search types?

Yes. POST to /v1/search and set the channel field to page, images, news, videos, or scholar. Each channel also has its own dedicated page for integration planning.

Does search cache reduce the credit cost?

Search responses can be cached for speed and operational consistency, but search billing still follows the result-count formula rather than cache-hit pricing.

How do search results fit with page crawling?

Use search to find candidate URLs, then call fetch or render on the strongest results when you need Markdown, metadata, links, or screenshots.