
llms.txt — The Quiet Standard Reshaping AI Crawlability

The first time most teams hear about llms.txt, the reaction is the same: "Is that real, or is someone selling SEO snake oil with a new acronym?" Fair question. The answer is: it's real, the spec is open, the adoption curve is starting to bend upward, and ignoring it for another six months will probably cost you citations.

This post is the explanation we wish had existed when we first added llms.txt to menra.ai: the spec, the rationale, the trade-offs, and a copy-paste starter you can ship today.

What llms.txt actually is

llms.txt is a plain Markdown file that lives at the root of your domain — https://example.com/llms.txt — and provides a curated, structured summary of your site's most important content for large language model consumption. It was proposed by Jeremy Howard in late 2024 as a complement to existing crawlability files (robots.txt, sitemap.xml) specifically for the LLM retrieval era.

The structure is intentionally simple:

# Brand Name

> One-line description of what the brand does.

Optional paragraph providing more context about the brand,
its positioning, and what a reader should know up front.

## Section Header

- [Page title](https://example.com/page): One-line summary
- [Another page](https://example.com/other): One-line summary

## Optional

- [Lower-priority page](https://example.com/extra): summary

That's it. No XML, no schema validation step, no structured-data markup. The whole file is meant to be read by both humans and language models without an intermediate parser.

The spec is at llmstxt.org. It's still a draft, but it's stable enough that the major answer engines have started reading it.
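Because the format is just Markdown with a fixed shape, a consumer needs almost no machinery to read it. Here's a minimal Python sketch of a parser for the structure shown above — an illustration, not part of the spec:

```python
import re

def parse_llms_txt(text: str) -> dict:
    """Parse an llms.txt file into name, description, and link sections.

    A minimal sketch: covers only the H1 / blockquote / H2 / link-bullet
    structure shown above, not every variation the draft spec allows.
    """
    result = {"name": None, "description": None, "sections": {}}
    current = None
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("# ") and result["name"] is None:
            result["name"] = line[2:].strip()
        elif line.startswith("> ") and result["description"] is None:
            result["description"] = line[2:].strip()
        elif line.startswith("## "):
            current = line[3:].strip()
            result["sections"][current] = []
        elif line.startswith("- ") and current is not None:
            # "- [Title](url): optional one-line summary"
            m = re.match(r"- \[(.+?)\]\((.+?)\)(?::\s*(.*))?", line)
            if m:
                title, url, summary = m.groups()
                result["sections"][current].append(
                    {"title": title, "url": url, "summary": summary or ""}
                )
    return result
```

That the whole grammar fits in one small loop is the point: no XML parser, no schema validator, no tooling dependency.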

Why this is not just another robots.txt

robots.txt answers the question "what URLs are you allowed to crawl?" — a permission boundary for traditional search bots. llms.txt answers a different question: "given that you're going to read my site anyway, what's the minimum-viable summary you'd need to represent it accurately?"

That's a real distinction. Traditional crawlers index your pages to build a search graph. Answer engines retrieve content into a context window to synthesize a response. The retrieval step is fundamentally constrained by token budget — the engine cannot read your entire site for every query, and it cannot afford to retrieve a 12,000-word pillar page when it only needs the abstract.

llms.txt solves the token-budget problem. By reading a single 2–4 KB Markdown file, an answer engine can build an accurate picture of what your site offers, which sub-pages cover which topics, and where to retrieve the deep content if the query justifies it. It's a sitemap optimized for synthesis, not for indexing.
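To make the budget math concrete, here's a back-of-the-envelope comparison using the common rough heuristic of ~4 characters per token for English prose (real tokenizers vary by model, so treat the numbers as order-of-magnitude only):

```python
def approx_tokens(char_count: int) -> int:
    # Rough heuristic: ~4 characters per token for English prose.
    # Actual counts depend on the model's tokenizer.
    return char_count // 4

llms_txt_tokens = approx_tokens(3 * 1024)   # a ~3 KB llms.txt -> 768 tokens
pillar_tokens = approx_tokens(12_000 * 6)   # 12,000 words at ~6 chars/word -> 18,000 tokens
```

Roughly a 20x difference: the summary file costs a fraction of the budget the pillar page would, which is exactly why an engine will read it first.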

What gets cited differently

In our internal tracking — running matched query sets against domains with and without llms.txt — domains that ship a clean llms.txt see two measurable effects:

  1. Higher first-mention rate on category-level queries. When the prompt is broad ("best AI visibility tools") and the engine is choosing which sources to introduce first, llms.txt-equipped domains get cited earlier in the response. The signal is small but consistent — roughly a 6–9 percentage point lift on first-mention rate for the queries we tracked.

  2. More accurate brand framing. The engines describe llms.txt-equipped brands using language that closely tracks the file's wording. This is good and bad. Good, because you can shape how the engine describes you. Bad, because if your llms.txt summary is sloppy, the engines will repeat that sloppiness verbatim.

The second effect is the underrated one. Most brands assume answer engines will write their own description from the open web. They will — but if you give them a curated summary, they'll lean on it heavily, especially on the first query of a session before the retrieval layer has had a chance to pull deeper context.

That makes llms.txt a brand-positioning tool as much as a crawlability tool. Treat the writing seriously.

The minimum viable llms.txt

Here's the structure we recommend for a first version:

# [Your Brand Name]

> [One-sentence description in 15–25 words. Should answer: who is this for, what does it do, and what's the differentiator.]

[Optional paragraph: 2–4 sentences expanding on the one-liner.
Mention the core value proposition, the user, and any
must-know context like pricing model or geographic focus.]

## Product

- [Pricing](https://example.com/pricing): One-line summary of how pricing works
- [Features](https://example.com/features): What the product does, at a glance
- [Onboarding](https://example.com/onboarding): How to start

## Guides

- [Pillar guide title](https://example.com/guides/pillar): One-line summary
- [Second pillar](https://example.com/guides/two): One-line summary

## Company

- [About](https://example.com/about): Who we are
- [Blog](https://example.com/blog): Recent posts on [topic]

## Optional

- [Lower-priority but useful page](https://example.com/legal/privacy): What's here

A few practical notes:

  • The blockquote line is the most important sentence on your domain. Many engines will lift it directly when asked "what is [brand]?". Iterate on it the way you'd iterate on a hero headline.
  • Lead with Product, not Company. The engines decide whether to dig deeper based on the early sections. If your "About" comes first, you're asking the engine to retrieve marketing copy when it could be retrieving feature differentiators.
  • Keep summaries to one line. Long summaries get truncated unpredictably. A clean one-liner that survives chunking intact is more likely to be quoted accurately.
  • Update the file when your positioning changes. The engines will eventually re-read it; you want them to read the current version, not the version you shipped six months ago.

Once you've published llms.txt, also publish llms-full.txt if your site is content-heavy. llms-full.txt is the same Markdown structure but expanded — full page descriptions instead of one-liners, optional inline summaries of pillar content. Engines that have token budget will pull this; engines that don't will fall back to the lighter file.
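If your page list already lives in code or a CMS, generating both variants from one source of truth keeps them in sync. A hypothetical Python sketch — the Page type and build_llms_txt helper are our illustration, not a published library:

```python
from dataclasses import dataclass

@dataclass
class Page:
    title: str
    url: str
    summary: str       # one-liner for llms.txt
    detail: str = ""   # expanded description for llms-full.txt

def build_llms_txt(brand: str, tagline: str, context: str,
                   sections: dict[str, list[Page]], full: bool = False) -> str:
    """Render the llms.txt structure; full=True emits llms-full.txt detail."""
    lines = [f"# {brand}", "", f"> {tagline}", "", context, ""]
    for header, pages in sections.items():
        lines.append(f"## {header}")
        lines.append("")
        for p in pages:
            desc = p.detail if (full and p.detail) else p.summary
            lines.append(f"- [{p.title}]({p.url}): {desc}")
        lines.append("")
    return "\n".join(lines).rstrip() + "\n"
```

Run it once with full=False and once with full=True, write the two outputs to /llms.txt and /llms-full.txt, and a positioning change becomes a one-line edit instead of two files drifting apart.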

Adoption curve, as of April 2026

Adoption is growing but uneven. As of late April 2026, we're seeing roughly:

  • Among Y Combinator-backed B2B SaaS launched in the last 18 months: ~38% have a published llms.txt
  • Among traditional Fortune 500 marketing sites: ~6%
  • Among technical documentation sites (developer-focused): ~52%
  • Among news publishers: ~11%, with most adoption from outlets that already publish a robots.txt with explicit AI bot rules

The pattern is what you'd expect: technical and AI-native teams are early; mainstream marketing is late. The opportunity for a non-AI-native brand is the same opportunity that existed in 2002 for SEO — being early is cheap and the returns compound.

What llms.txt does not do

To be honest about the limitations:

  • llms.txt does not force engines to cite you. It makes your content easier to summarize accurately; it does not increase your retrieval rank if the underlying pages aren't worth retrieving.
  • llms.txt does not replace structural markup on individual pages. You still need clean schema, proper headings, and FAQ blocks where appropriate.
  • llms.txt does not solve the freshness problem. If your pillar pages are stale, listing them in llms.txt won't make them feel fresh to a retrieval layer.
  • The spec is still pre-1.0 and will likely evolve. Plan to revisit your file every six months.

In other words: llms.txt is a high-leverage, low-effort improvement to your AI crawlability surface. It is not a silver bullet, and treating it as one will leave you disappointed.

How to ship one this week

The whole project is a one-day task for a marketing team that knows its own positioning:

  1. Draft the one-line description and 2–4 sentence context paragraph. Get it through one round of internal review.
  2. List the 8–15 pages on your site that matter most for AI consumption — pillar guides, pricing, key features, onboarding, recent strategic blog posts.
  3. Write a one-line summary for each page.
  4. Save as llms.txt, deploy at the root of your domain, and verify it loads at https://yourdomain.com/llms.txt.
  5. Add the file to your sitemap.xml so engines discover it during their next crawl.
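Steps 4 and 5 are easy to half-ship, so a quick smoke test is worth the five minutes. A small Python sketch — fetch_llms_txt and check_llms_txt are hypothetical helpers we made up for illustration, not a standard tool:

```python
import re
import urllib.request

def fetch_llms_txt(domain: str) -> str:
    """Step 4: verify the file actually loads at the domain root."""
    url = f"https://{domain}/llms.txt"
    with urllib.request.urlopen(url, timeout=10) as resp:
        return resp.read().decode("utf-8")

def check_llms_txt(text: str) -> list[str]:
    """Return a list of problems; an empty list means the file looks sane."""
    problems = []
    if not re.search(r"^# .+", text, re.M):
        problems.append("missing H1 brand name")
    if not re.search(r"^> .+", text, re.M):
        problems.append("missing blockquote one-liner")
    if not re.search(r"^- \[.+?\]\(https?://.+?\)", text, re.M):
        problems.append("no Markdown link bullets found")
    if len(text.encode("utf-8")) > 50 * 1024:
        problems.append("file is large; consider moving detail to llms-full.txt")
    return problems
```

Something like `check_llms_txt(fetch_llms_txt("yourdomain.com"))` in CI catches the classic failure mode: the file deploys, then a framework rewrite or redirect silently serves your 404 page at /llms.txt.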

You can ship the first version today. Iterate on the wording over the next month as you watch how the engines describe your brand in citation responses.

If you want to see a working example, ours lives at menra.ai/llms.txt. It's a couple kilobytes of Markdown, and shipping it was the cheapest GEO investment we made all quarter.

— The Menra Team
