TL;DR
Request Markdown‑formatted output from AlterLab’s scraping API to strip HTML noise before feeding data to LLMs. This cuts token usage, lowers cost, and simplifies parsing in AI‑driven pipelines.
Why HTML Inflates LLM Costs
LLM providers charge per token. Raw HTML from a typical page includes tags, attributes, whitespace, and scripts that add little semantic value but inflate the token count. For example, a product listing page might deliver 12 KB of HTML, which translates to roughly 3 000 tokens, most of it noise. When you chain multiple pages or run retrieval-augmented generation (RAG) workflows, these extra tokens multiply quickly, raising both latency and expense.
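The arithmetic behind that estimate can be sketched as follows. The 4-bytes-per-token ratio is an assumption implied by the 12 KB and 3 000 token figures above, not a measured value; real tokenizers vary by model and content, and `estimate_tokens` is a hypothetical helper.

```python
def estimate_tokens(size_bytes: int, bytes_per_token: float = 4.0) -> int:
    """Rough token estimate from payload size (assumed ratio, not a real tokenizer)."""
    return round(size_bytes / bytes_per_token)


html_size = 12_000  # the 12 KB product listing page, counting 1 KB as 1 000 bytes
print(estimate_tokens(html_size))  # 3000
```

For exact counts, run the text through your model provider's own tokenizer instead of a byte-based ratio.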
The Markdown Alternative
AlterLab’s API supports an optional formats parameter. Setting formats=["markdown"] returns the page’s main content converted to clean Markdown. Headings become #, lists become -, and tables retain a simple pipe-delimited structure. The resulting text is typically 60-80 % smaller than the raw HTML equivalent, which directly reduces the token count sent to your LLM.
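To make the size difference concrete, here is a small sketch that compares a hand-written HTML fragment with the Markdown it might convert to. Both strings are illustrative assumptions, not actual AlterLab output, and the reduction on real pages will vary.

```python
# Illustrative only: a hand-written HTML fragment and a plausible Markdown equivalent.
html = (
    '<div class="article"><h1 class="title">Latest News</h1>'
    '<ul class="items"><li><span>First item</span></li>'
    '<li><span>Second item</span></li></ul></div>'
)
markdown = "# Latest News\n\n- First item\n- Second item\n"

# Fraction of characters saved by dropping tags and attributes.
reduction = 1 - len(markdown) / len(html)
print(f"HTML: {len(html)} chars, Markdown: {len(markdown)} chars, {reduction:.0%} smaller")
```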
The example below requests Markdown output through the optional formats parameter:
```python title="scrape_markdown.py" {5-8}
import alterlab

client = alterlab.Client("YOUR_API_KEY")  # API key from dashboard

# Request Markdown formatted output
response = client.scrape(
    url="https://example.com/articles/latest",
    formats=["markdown"],  # highlighted: ask for Markdown
)

# The cleaned Markdown is ready for LLM consumption
print(response.text[:500])  # preview first 500 characters
```
```bash title="Terminal"
curl -X POST https://api.alterlab.io/v1/scrape \
  -H "X-API-Key: YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '{"url": "https://example.com/articles/latest", "formats": ["markdown"]}'
```
Integrating with LLM Pipelines
Once you have the Markdown string, you can feed it directly into your LLM call. Because the text is already structured, you often need less prompting to extract insights. For retrieval‑augmented generation, store the Markdown in your vector database; the reduced size means more chunks fit within your index’s token limits, improving recall without increasing storage costs.
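For the RAG case, one possible way to prepare the Markdown for a vector database is to split it at headings so each chunk stays on one topic. This is a minimal sketch: `chunk_markdown_by_heading` is a hypothetical helper, not part of AlterLab's client, and it does not handle `#` lines inside fenced code blocks.

```python
import re


def chunk_markdown_by_heading(markdown: str) -> list[str]:
    """Split Markdown into chunks, starting a new chunk at each heading line."""
    chunks: list[str] = []
    current: list[str] = []
    for line in markdown.splitlines():
        # A heading is 1-6 '#' characters followed by a space.
        if re.match(r"#{1,6} ", line) and current:
            chunks.append("\n".join(current).strip())
            current = []
        current.append(line)
    if current:
        chunks.append("\n".join(current).strip())
    return [c for c in chunks if c]


sample = "# Title\n\nIntro text.\n\n## Section A\n\n- item 1\n- item 2\n\n## Section B\n\nMore text."
for chunk in chunk_markdown_by_heading(sample):
    print(repr(chunk))
```

Each chunk can then be embedded and stored separately; splitting on headings is only one strategy, and fixed-size or sentence-based chunking may suit other content better.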
Consider a simple summarization flow:
- Scrape target page with formats=["markdown"].
- Pass the Markdown to a summarization model (e.g., gpt-4o-mini).
- Use the summary downstream; no extra HTML stripping step is required.
This eliminates a custom HTML‑to‑text preprocessing step, reducing both code complexity and potential bugs.
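The flow above can be sketched as a single function. To keep the example runnable without network access or API keys, the scraper and the model are passed in as callables; `summarize_page`, `fake_scrape`, and `fake_summarize` are hypothetical names, and in practice the first callable would wrap the client.scrape call shown earlier while the second would wrap your LLM provider's SDK.

```python
from typing import Callable


def summarize_page(
    url: str,
    scrape_markdown: Callable[[str], str],
    summarize: Callable[[str], str],
) -> str:
    """Scrape a page as Markdown, then summarize it; no HTML stripping in between."""
    markdown = scrape_markdown(url)  # step 1: fetch the page as Markdown
    return summarize(markdown)       # steps 2-3: model call, summary used downstream


# Stand-in callables so the sketch runs offline.
def fake_scrape(url: str) -> str:
    return "# Latest\n\n- Markdown output cuts token usage.\n"


def fake_summarize(markdown: str) -> str:
    # Placeholder for a real model call: returns the first non-heading, non-empty line.
    body = [ln for ln in markdown.splitlines() if ln and not ln.startswith("#")]
    return body[0].lstrip("- ")


print(summarize_page("https://example.com/articles/latest", fake_scrape, fake_summarize))
```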
Combining Markdown Output with Cortex AI Extraction
AlterLab’s Cortex AI can extract structured fields (prices, dates, SKUs) from raw HTML. When you first request Markdown, you strip noise, then let Cortex work on the cleaner text. This two‑step approach can lower the token count sent to Cortex as well, because the model sees less irrelevant markup.
```python title="cortex_markdown.py" {5-9}
import alterlab

client = alterlab.Client("YOUR_API_KEY")
response = client.scrape(
    url="https://example.com/products/listing",
    formats=["markdown"],  # get clean Markdown first
    extract={"model": "cortex-v1"},  # then run AI extraction on that Markdown
)
print(response.json)  # structured data, minimal token overhead
```
Cost Impact Example
Assume you scrape 10 000 product pages per month. Average raw HTML size: 12 KB (~3 000 tokens). Average Markdown size: 4 KB (~1 000 tokens).
- HTML route: 10 000 × 3 000 = 30 M tokens → at $0.000015 per token ≈ $450/month.
- Markdown route: 10 000 × 1 000 = 10 M tokens → ≈ $150/month.
Savings of roughly $300/month, plus reduced egress bandwidth and faster LLM inference.
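The same calculation, written as a small sketch you can adapt to your own volumes. The page count, token counts, and per-token price are the example figures above, not actual provider pricing, and `monthly_cost` is a hypothetical helper.

```python
def monthly_cost(pages: int, tokens_per_page: int, price_per_token: float) -> float:
    """Monthly LLM input cost in dollars for a given page volume."""
    return pages * tokens_per_page * price_per_token


PRICE = 0.000015  # dollars per token, the rate assumed in the example above
html_cost = monthly_cost(10_000, 3_000, PRICE)
markdown_cost = monthly_cost(10_000, 1_000, PRICE)
savings = html_cost - markdown_cost
print(f"HTML: ${html_cost:.2f}, Markdown: ${markdown_cost:.2f}, saved: ${savings:.2f}")
```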
Best Practices
- Always request the minimal format you need: formats=["markdown"] or formats=["json"] when downstream code expects structured data.
- Combine formats with extract to let AlterLab perform both cleaning and AI extraction in one request.
- Monitor your token usage via your LLM provider’s dashboard; you should see a noticeable drop after switching to Markdown.
- If you need the original HTML for archival, keep a separate request without the formats parameter, but use it sparingly.
Internal Resources
For a full list of supported output formats, see the API documentation. To get started quickly, follow the quickstart guide. For pricing details on our pay‑as‑you‑go model, visit the pricing page.
Takeaway
Asking AlterLab for Markdown‑formatted scraped data is a simple, effective way to reduce LLM token consumption and lower operating costs. The cleaned output removes HTML noise, speeds up downstream processing, and works seamlessly with AlterLab’s AI extraction features. Start using the formats parameter today and see immediate savings on your AI‑driven scraping pipelines.













