Crawler / Page / Fetch

Fetch pages fast for 2 credits.

Use the fetch path when a page's useful content is already available in HTML. It is the fastest, lowest-cost crawl mode in AnyCrawler: retrieve the URL, follow redirects, strip noisy markup, and return Markdown, metadata, links, and cache-aware usage fields in one API call.

Fast low-cost path

Spend fewer credits when HTML is enough

Fetch avoids browser startup when JavaScript execution is not required, so it is faster and cheaper for docs, articles, blogs, reference pages, and repeatable content checks. Send a URL, choose the fetch method, and receive a normalized response for agents, LLM pipelines, content indexes, and internal data products.

2-credit base request Fast HTTP retrieval 1-credit cache hits Markdown normalized

curl

Fetch a page

curl -X POST "https://api.anycrawler.com/v1/crawl/page" \
  -H "Authorization: Bearer $ANYCRAWLER_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://example.com/docs",
    "method": "fetch",
    "accept_cache": true,
    "include_metadata": true,
    "include_links": true,
    "markdown_variant": "readability"
  }'

Use Authorization: Bearer or x-api-key. Set method to fetch for the lightweight crawl path.

How it works

Fetch, clean, return structured fields

The gateway handles request validation, account checks, cache lookup, upstream fetch, response filtering, and credit settlement.

1

Submit the target URL

Send a POST request with url and method=fetch. Optional flags control metadata, links, media, Markdown variant, and cache reads.

2

AnyCrawler fetches the page

The gateway validates your API key, checks credits, applies cache policy, and retrieves the page through the fetch path.

3

HTML becomes Markdown

Noisy HTML is normalized into Markdown and token counts, with optional metadata and extracted links attached under results.

4

You get usage evidence

The response includes status, final URL, credits_used, cache_timestamp, and request headers for production debugging.

Use cases

Where fetch gives the best SEO and agent economics

Use the fetch page API when speed, repeatability, and low credit cost matter more than browser-only page state.

Use

Documentation ingestion

Pull docs, changelogs, help centers, and API references into RAG pipelines without paying for browser execution on every page.

Use

Low-cost content monitoring

Check pricing, policy, product, or competitor pages frequently while keeping the base request cost at the lightweight fetch rate.

Use

Article and blog extraction

Convert long-form HTML pages into Markdown and token counts so research agents can read the useful body instead of raw markup.

Response

Normalized crawl output

{
  "ok": true,
  "requested_url": "https://example.com/docs",
  "canonical_url": "https://example.com/docs",
  "final_url": "https://example.com/docs",
  "status_code": 200,
  "cache_timestamp": 0,
  "credits_used": 4,
  "results": {
    "title": "Example Docs",
    "description": "Product documentation",
    "markdown": "# Example Docs\n\nInstall the SDK...",
    "markdown_tokens": 428,
    "metadata": {
      "description": "Product documentation"
    },
    "links": [
      {
        "text": "API reference",
        "href": "https://example.com/docs/api"
      }
    ]
  }
}

metadata, links, and media appear only when requested; each included group adds its published modifier credits.

What you get

Cleaner than raw HTML, smaller than the full page

The fetch response is built for downstream tools. Use Markdown for model context, URLs for provenance, status fields for debugging, and optional extraction fields when you need to build a follow-up crawl queue.

  • Markdown and token counts for agent context
  • Final URL and status code for traceability
  • Optional metadata, links, and media for enrichment
  • Credit and cache fields for cost control

Request fields

Keep the payload small, add fields only when you need them

The stable public contract focuses on the fields below. Undocumented passthrough fields are not part of new integrations.

Field Type Required Notes
url string Yes The public page URL to fetch and normalize.
method "fetch" Recommended Selects the lightweight fetch path instead of the browser render path.
accept_cache boolean No Allows a valid cached response to return at the cache-hit rate.
include_metadata boolean No Adds page metadata to results.metadata when available.
include_links boolean No Adds extracted links to results.links for follow-up crawl planning.
include_media boolean No Adds extracted images and media references to results.media.
markdown_variant "markdown" or "readability" No Controls the extraction path. Both variants return results.markdown.
user_agent string or null No Optional paid-plan override for the outbound user agent.

Response fields

Fields your integration can rely on

AnyCrawler strips internal timing and debug fields before returning the public response.

Field Meaning
ok Whether the gateway completed the crawl successfully.
requested_url The URL submitted in the request.
canonical_url Canonical URL discovered from the page when present.
final_url The final URL after redirects.
status_code HTTP status code returned by the target page.
cache_timestamp Cache marker for reused responses. Zero means a fresh upstream result.
credits_used Credits settled for the request.
results.markdown Clean Markdown body normalized for AI context and downstream parsing.
results.markdown_tokens Estimated token count for the Markdown payload.
results.metadata Optional page metadata when include_metadata is true.
results.links Optional extracted links when include_links is true.
results.media Optional extracted media when include_media is true.

TypeScript

Use the SDK shape

import { AnyCrawlerClient } from "@anycrawler/sdk";

const client = new AnyCrawlerClient({
  apiKey: process.env.ANYCRAWLER_API_KEY!,
});

const { data, meta } = await client.crawlPage({
  url: "https://example.com/docs",
  method: "fetch",
  accept_cache: true,
  include_metadata: true,
});

console.log(data.results?.markdown);
console.log(meta.requestId);

The SDK request type includes the stable public crawl fields.

Fetch vs render

Use fetch for direct HTML. Use render for JavaScript pages.

Fetch is the lower-cost, faster path for pages that already expose useful HTML. If the page needs client-side rendering, browser behavior, or JavaScript-loaded content, switch the same endpoint to method=render.

Mode Best for Base cost
fetch Docs, articles, reference pages, static HTML, cached content checks 2 credits
render JavaScript apps, client-side routing, browser-dependent pages 10 credits
cache hit Repeated reads when accept_cache=true and a valid cached result exists 1 credit

Errors

Status codes are explicit enough for production handling

Use response status plus error_code and request headers to route retries, credit issues, and authentication problems.

200 Page fetched and normalized.
400 Invalid JSON or unsupported request field value.
401 Missing, invalid, or revoked API key.
402 Account does not have enough credits.
403 Account inactive or paid-plan-only field requested.
429 Rate limit reached.
502 Upstream connection failed.
504 Target or worker timed out.

Headers

Track every request

Successful and error responses include gateway metadata headers such as x-request-id, x-credits-reserved, and x-credits-used. Store them when you need support or cost audits.

Pricing

Start free, scale with explicit credit rules

Fetch requests use 2 credits before cache hits. The same pricing model from the main AnyCrawler plans applies here.

Free

$0/mo

For early validation with one-time credits, 30 fetch requests per minute, and 1 browser concurrency slot.

  • 10,000 one-time credits
  • $0 / cr
  • Fetch 30 req / min
  • Browser 1 concurrent

Agent Lite

$5/mo

For low-frequency AI agents and occasional automations that need 15,000 monthly credits.

  • 15,000 credits / mo
  • $0.000333 / cr
  • Fetch 60 req / min
  • Browser 2 concurrent
  • Low-frequency AI agent usage
  • Persistent top-ups available separately

Builder

$20/mo

For builders shipping serious crawler workflows with higher limits and better unit economics.

  • 80,000 credits / mo
  • $0.000250 / cr
  • Fetch 180 req / min
  • Browser 3 concurrent
  • Best upgrade after repeated Agent Lite usage

Growth

$80/mo

For teams scaling production scraping with better unit economics.

  • 960,000 credits / mo
  • $0.0000833 / cr
  • Fetch 480 req / min
  • Browser 8 concurrent

Scale

$200/mo

For high-volume pipelines with pricing aligned to large ongoing usage.

  • 3,000,000 credits / mo
  • $0.0000667 / cr
  • Fetch 1,200 req / min
  • Browser 20 concurrent

FAQ

Common fetch API questions

Short answers for teams deciding whether this is the right crawl mode.

When should I use fetch instead of render?

Use fetch for pages where the useful content is available in the returned HTML, such as docs, blogs, articles, reference pages, and many public content sites. Use render when JavaScript execution is required.

Does fetch still return Markdown?

Yes. The fetch path still normalizes output into results.markdown and results.markdown_tokens, so AI agents and RAG pipelines can consume cleaner text instead of raw HTML.

Can I get links and metadata in the same response?

Yes. Set include_metadata, include_links, or include_media to true. These optional fields are omitted by default to keep responses lean.

How does cache acceptance work?

When accept_cache is true and a valid cached response exists, the gateway can return the cached result and settle at the cache-hit base rate instead of the uncached fetch rate.