Sitemap Extractor API
Extract every page URL from any website sitemap with a single API call.
Pass a domain name and get back deduplicated page URLs from its sitemaps. Sitemap index files are crawled recursively. Non-page resources are filtered out automatically.
Perfect for building content indexes, seeding crawlers, or auditing a competitor's full site structure in seconds.
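As a rough illustration of the "single API call", the sketch below builds the request URL for a domain. Only the path `/v1/web/scrape/sitemap` and the `domain` query parameter come from this page. The host in `BASE_URL` is a hypothetical placeholder, the helper names are made up for this example, and authentication is not shown here because this page does not describe it. The API is said to normalize domains itself, so `normalize_domain` is just a local convenience.

```python
from urllib.parse import urlencode, urlsplit

# Hypothetical placeholder host: this page shows only the path, not the API host.
BASE_URL = "https://api.example.invalid"
# Path as shown in the API Response example on this page.
ENDPOINT_PATH = "/v1/web/scrape/sitemap"


def normalize_domain(raw: str) -> str:
    """Reduce input such as 'https://Blog.Example.com/path' to a bare lowercase host."""
    text = raw.strip()
    # urlsplit only fills in .hostname when the input has a '//' authority marker.
    if "://" not in text:
        text = "//" + text
    host = urlsplit(text).hostname
    if not host or "." not in host:
        raise ValueError(f"not a usable domain: {raw!r}")
    return host.rstrip(".")


def build_sitemap_request_url(raw_domain: str, base_url: str = BASE_URL) -> str:
    """Return the GET URL for the sitemap endpoint (no request is sent here)."""
    query = urlencode({"domain": normalize_domain(raw_domain)})
    return f"{base_url}{ENDPOINT_PATH}?{query}"
```

Calling `build_sitemap_request_url("https://Context.dev/pricing")` yields a URL ending in `/v1/web/scrape/sitemap?domain=context.dev`, which can then be sent with whatever HTTP client and credentials your account requires.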
What You Get
Each request crawls a domain's sitemaps and returns all discoverable page URLs.
Deduplicated page URLs
Deduplicated, page-only results with non-page resources like images and PDFs automatically filtered out.
Sitemap index support
Recursively crawls nested sitemap index files with parallel fetching and concurrency control.
Crawl metadata
See how many sitemaps were discovered, fetched, skipped, and errored in a single response.
Normalized domain input
Pass just the domain name, no protocol needed. The API validates and normalizes domains automatically.
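One way to consume these fields on the client side is sketched below. It assumes the response shape shown under API Response on this page (`success`, `domain`, `urls`, and a `meta` object with four counters). The `SitemapResult` class, the `complete` check, and the error handling are illustrative choices, not part of the API.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class SitemapResult:
    domain: str
    urls: list[str]
    discovered: int
    fetched: int
    skipped: int
    errors: int

    @property
    def complete(self) -> bool:
        """True when every discovered sitemap was fetched and none errored."""
        return self.errors == 0 and self.fetched == self.discovered


def parse_sitemap_response(body: str) -> SitemapResult:
    """Parse a JSON body shaped like the sample response on this page."""
    payload = json.loads(body)
    if not payload.get("success"):
        # Assumption: a falsy 'success' flag marks a failed request.
        raise RuntimeError("sitemap request did not succeed")
    meta = payload.get("meta", {})
    return SitemapResult(
        domain=payload["domain"],
        # dict.fromkeys keeps first-seen order while dropping any repeats.
        urls=list(dict.fromkeys(payload.get("urls", []))),
        discovered=int(meta.get("sitemapsDiscovered", 0)),
        fetched=int(meta.get("sitemapsFetched", 0)),
        skipped=int(meta.get("sitemapsSkipped", 0)),
        errors=int(meta.get("errors", 0)),
    )


# Shortened copy of the sample response from this page.
SAMPLE_BODY = """
{
  "success": true,
  "domain": "context.dev",
  "urls": ["https://context.dev/", "https://context.dev/pricing"],
  "meta": {"sitemapsDiscovered": 3, "sitemapsFetched": 3,
           "sitemapsSkipped": 0, "errors": 0}
}
"""

result = parse_sitemap_response(SAMPLE_BODY)
print(result.domain, len(result.urls), result.complete)
```

Checking `complete` before trusting the URL list is a cheap guard: a response with skipped or errored sitemaps may still be useful, but it is probably not the full site.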
How It Works
We discover every relevant sitemap, follow sitemap indexes, and return a clean, deduplicated list of page URLs.
Send a domain
Pass the domain name (e.g., "example.com" or "blog.example.com"). No protocol required.
Sitemaps discovered
The API checks robots.txt and common sitemap paths, then recursively follows sitemap index files.
URLs extracted and deduplicated
All page URLs are collected from every sitemap, deduplicated, and filtered to exclude non-page resources.
Clean URL list returned
You get normalized page URLs plus crawl metadata about the sitemap discovery process.
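To make the four steps concrete, here is a minimal local sketch of the same idea: read `Sitemap:` lines from robots.txt, follow sitemap index files, then deduplicate and filter the collected URLs. It is not the service's implementation. The suffix list, the `max_sitemaps` cap, and all function names are assumptions, and it fetches sequentially through an injected `fetch` callable rather than in parallel.

```python
import xml.etree.ElementTree as ET
from typing import Callable, Optional
from urllib.parse import urlsplit

# Assumed list of non-page file types; the real filter is not documented here.
NON_PAGE_SUFFIXES = (".jpg", ".jpeg", ".png", ".gif", ".svg", ".webp", ".pdf")


def sitemaps_from_robots(robots_txt: str) -> list[str]:
    """Collect the targets of 'Sitemap:' lines in a robots.txt body."""
    found = []
    for line in robots_txt.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip().lower() == "sitemap" and value.strip():
            found.append(value.strip())
    return found


def local_name(tag: str) -> str:
    """Drop the '{namespace}' prefix ElementTree puts on tag names."""
    return tag.rsplit("}", 1)[-1]


def is_page_url(url: str) -> bool:
    """Treat URLs whose path ends in a known asset suffix as non-pages."""
    return not urlsplit(url).path.lower().endswith(NON_PAGE_SUFFIXES)


def collect_page_urls(
    start: list[str],
    fetch: Callable[[str], Optional[str]],
    max_sitemaps: int = 50,
) -> list[str]:
    """Walk sitemaps breadth-first and return deduplicated page URLs."""
    pending = list(start)
    seen_sitemaps: set[str] = set()
    pages: dict[str, None] = {}  # dict keys keep order and deduplicate
    while pending and len(seen_sitemaps) < max_sitemaps:
        sitemap_url = pending.pop(0)
        if sitemap_url in seen_sitemaps:
            continue
        seen_sitemaps.add(sitemap_url)
        body = fetch(sitemap_url)
        if body is None:
            continue  # fetch failed: skip this sitemap
        try:
            root = ET.fromstring(body)
        except ET.ParseError:
            continue  # malformed XML: skip this sitemap
        locs = [
            el.text.strip()
            for el in root.iter()
            if local_name(el.tag) == "loc" and el.text
        ]
        if local_name(root.tag) == "sitemapindex":
            pending.extend(locs)  # nested sitemaps: queue them for fetching
        else:
            for url in locs:
                if is_page_url(url):
                    pages[url] = None
    return list(pages)
```

Passing `fetch` in as a parameter keeps the sketch testable without network access: a `dict.get` over canned XML works for experiments, and a real HTTP function can be swapped in later.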
API Response
Discovered URLs for context.dev
GET /v1/web/scrape/sitemap?domain=context.dev

{
  "success": true,
  "domain": "context.dev",
  "urls": [
    "https://context.dev/",
    "https://context.dev/pricing",
    "https://context.dev/blog",
    "https://context.dev/data/logo-api",
    "https://context.dev/use-cases/logo-link",
    "... more discovered URLs"
  ],
  "meta": {
    "sitemapsDiscovered": 3,
    "sitemapsFetched": 3,
    "sitemapsSkipped": 0,
    "errors": 0
  }
}

Frequently asked questions
Common questions about the Context.dev Sitemap Extractor API.
Am I billed for failed requests?
How does the sitemap extractor find sitemaps?
What's the difference between a sitemap extractor and a website crawler?
Is the sitemap extractor API free?
How do I extract every URL from a website?
Does it support sitemap index files (nested sitemaps)?
Why use this instead of writing my own sitemap parser?
Ship an agent that actually knows things.
Free tier, 10-minute integration, and the same API powering agents at Mintlify, daily.dev, and Propane. No credit card to start.













