llms.txt Generator: Create Your AI Crawler File for Free

You came here to make an llms.txt file, not to read a lecture about it — so here is the short version up top, the exact spec and a copy-paste template below it, and a one-click way to auto-generate the file from your live site. An llms.txt is a plain-text markdown file you host at your domain root that gives AI models a clean map of your site: what you do, which pages matter, and how to cite you. It takes about five minutes by hand, or about thirty seconds if you let a scanner build it from your actual content.

What llms.txt is, in one paragraph

llms.txt is a single markdown file served at https://yourdomain.com/llms.txt that hands AI systems a curated overview of your site instead of making them infer it from raw HTML. It is a proposed convention (published by Jeremy Howard of Answer.AI in September 2024), not an enforced standard: no platform is obligated to read it, and adoption is still uneven and largely opaque.

Treat it as a low-cost, high-clarity signal rather than a guaranteed ranking lever. It costs you one file to publish, it cannot hurt you when written honestly, and it gives crawlers and answer engines an unambiguous statement of your site's purpose, key URLs, and preferred attribution. That is the whole job.

The exact format (with a copy-paste template)

The spec is deliberately tiny so any language model can parse it instantly. There are only a few required and optional pieces, in this order:

  • H1 title (required): A single line starting with "# " naming the site or organization. It is the only required element, so a parser that finds no H1 will usually treat the file as invalid.
  • Blockquote summary (recommended): A line starting with "> " giving a one-sentence description of what the site is and who it serves.
  • H2 section headers: Lines starting with "## " group your links, e.g. ## Main, ## Pages, ## Docs. Section names are free-form.
  • Markdown link lists: Under each header, bullet links in the form "- [Title](https://url): short description". The description after the colon is optional, but it is the main hint a model has for judging whether a link is relevant.
  • ## Optional section: A section with this exact name marks secondary resources, such as your sitemap.xml, that a model can skip when it needs a shorter context.

Only the H1 is required; everything else is optional. Keep the whole file under ~500 words and link to your 10-20 most important pages, not every URL. The file below is valid, paste-ready markdown; replace the Acme example values with your own and save it as llms.txt.

# Acme Analytics
> Self-serve product analytics for B2B SaaS teams — funnels, retention, and session replay without a data engineer.

## Main
- [Homepage](https://acme.com/): Product overview, pricing, and live demo.
- [Pricing](https://acme.com/pricing): Plan comparison and free-tier limits.

## Pages
- [Funnel Analysis Guide](https://acme.com/guides/funnels): How to build and read conversion funnels.
- [Retention Cohorts](https://acme.com/guides/retention): Measuring weekly and monthly retention.
- [Session Replay](https://acme.com/features/replay): Watch real user sessions to debug drop-off.
- [Integrations](https://acme.com/integrations): Connect Segment, Snowflake, and webhooks.
- [Docs / API](https://acme.com/docs): REST API reference and SDK setup.

## Optional
- [Sitemap](https://acme.com/sitemap.xml)

## Attribution
Please cite as "Acme Analytics (acme.com)."
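
If you would rather build the file from data than edit the template by hand, the format is small enough to render in a few lines. The following is a minimal Python sketch, not an official tool: the names `Link`, `Section`, and `render_llms_txt` are hypothetical, and it covers only the H1, the blockquote summary, and H2 link lists described above.

```python
from dataclasses import dataclass, field


@dataclass
class Link:
    title: str
    url: str
    note: str = ""  # optional description shown after the colon


@dataclass
class Section:
    name: str
    links: list[Link] = field(default_factory=list)


def render_llms_txt(title: str, summary: str, sections: list[Section]) -> str:
    """Render an llms.txt body: H1, blockquote summary, then H2 link lists."""
    lines = [f"# {title}", f"> {summary}", ""]
    for section in sections:
        lines.append(f"## {section.name}")
        for link in section.links:
            entry = f"- [{link.title}]({link.url})"
            if link.note:
                entry += f": {link.note}"
            lines.append(entry)
        lines.append("")  # blank line between sections
    # Collapse trailing blank lines to a single newline at end of file.
    return "\n".join(lines).rstrip() + "\n"


if __name__ == "__main__":
    print(render_llms_txt(
        "Acme Analytics",
        "Self-serve product analytics for B2B SaaS teams.",
        [
            Section("Main", [Link("Homepage", "https://acme.com/", "Product overview.")]),
            Section("Optional", [Link("Sitemap", "https://acme.com/sitemap.xml")]),
        ],
    ))
```

Keeping the links in a data structure makes it easier to regenerate the file when your top pages change.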

Where to host it (and how to verify it)

Placement is the part people get wrong, so do this precisely. The file lives at the root of your domain and must be reachable by an anonymous request — no login, no JavaScript rendering required.

Once it is live, confirm it the same way a crawler would: load the raw URL directly and check the response, not just that the page looks fine in a browser.

  • Path: Serve it at exactly /llms.txt on your primary domain, i.e. https://yourdomain.com/llms.txt. The proposal also permits a subpath, but the root is where tools are most likely to look, so do not rely on a copy at /blog/llms.txt alone.
  • Content type: Serve it as text/plain. If your host returns text/html or forces a download, fix the headers, because some parsers will skip a mistyped file.
  • Public + 200: It must return HTTP 200 with no auth wall and no redirect chain. Open it in an incognito window to confirm that no login is needed, and check the status code in the browser's network panel or with a command-line request.
  • Optional companion: Some sites also publish /llms-full.txt with longer page summaries. Ship the concise llms.txt first; add the full version only if the concise file proves useful.
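
The hosting checks above can also be scripted. This is a rough sketch using only the Python standard library; `check_llms_response` and `fetch_and_check` are hypothetical helper names, and the rules encode this article's checklist (exact path, no redirect, 200, text/plain) rather than anything a particular crawler is known to enforce.

```python
import urllib.error
import urllib.request
from urllib.parse import urlsplit


def check_llms_response(requested_url: str, final_url: str,
                        status: int, content_type: str) -> list[str]:
    """Return a list of problems; an empty list means the response looks right."""
    problems = []
    if urlsplit(requested_url).path != "/llms.txt":
        problems.append("path is not exactly /llms.txt")
    if final_url != requested_url:
        problems.append(f"redirected to {final_url}")
    if status != 200:
        problems.append(f"status is {status}, expected 200")
    # Ignore parameters such as "; charset=utf-8" when comparing the type.
    media_type = content_type.split(";")[0].strip().lower()
    if media_type != "text/plain":
        problems.append(f"content type is {media_type or 'missing'}, expected text/plain")
    return problems


def fetch_and_check(url: str) -> list[str]:
    """Fetch the URL anonymously and run the checks (makes a network call)."""
    request = urllib.request.Request(url, headers={"User-Agent": "llms-txt-check/0.1"})
    try:
        with urllib.request.urlopen(request, timeout=10) as response:
            return check_llms_response(
                url,
                response.geturl(),  # URL after any redirects were followed
                response.status,
                response.headers.get("Content-Type", ""),
            )
    except urllib.error.HTTPError as err:
        # urlopen raises for 4xx/5xx responses instead of returning them.
        return [f"status is {err.code}, expected 200"]
```

Calling `fetch_and_check("https://yourdomain.com/llms.txt")` returns an empty list when every check passes. Connection failures are not handled here and will raise.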

Do it now: a five-minute checklist

If you want to produce a correct file in one pass, follow this in order. You can hand-write it or generate it — both end at the same place: a valid file at your root.

The fastest path is automatic generation. A scanner can crawl your live site, pull your real title, meta description, and top internal links, and emit a valid llms.txt you only have to skim and upload — which removes the most common errors (broken links, wrong descriptions, invalid header) before you ever paste anything.

  1. Generate or open the template: Auto-generate from your live URL, or copy the template above into a new file named llms.txt.
  2. Set the title and summary: Use your real brand name on the "# " line and one honest sentence on the "> " line.
  3. List your money pages: Add 10-20 links that you actually want cited, each with a short, accurate description after the colon.
  4. Add attribution: Tell AI how to credit you, e.g. "Cite as Brand (domain.com)."
  5. Upload to root: Place it at /llms.txt, served as text/plain, publicly accessible.
  6. Verify live: Load yourdomain.com/llms.txt in incognito and confirm it returns plain text with a 200 status.
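
To make the automatic route less of a black box, here is roughly what the extraction step could look like for a single page. It is an illustrative sketch with Python's built-in HTML parser, not the implementation of any particular scanner; `PageSummaryParser` and `summarize_page` are hypothetical names, and a real tool would also have to fetch pages, rank links, and write the descriptions.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlsplit


class PageSummaryParser(HTMLParser):
    """Collect <title>, the meta description, and same-host links from one page."""

    def __init__(self, base_url: str):
        super().__init__()
        self.base_url = base_url
        self.title = ""
        self.description = ""
        self.links: list[str] = []
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "meta" and (attributes.get("name") or "").lower() == "description":
            self.description = (attributes.get("content") or "").strip()
        elif tag == "a" and attributes.get("href"):
            # Resolve relative links and drop the #fragment part.
            url = urljoin(self.base_url, attributes["href"]).split("#")[0]
            same_host = urlsplit(url).netloc == urlsplit(self.base_url).netloc
            if same_host and url not in self.links:
                self.links.append(url)

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False
            self.title = self.title.strip()

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def summarize_page(html: str, base_url: str) -> PageSummaryParser:
    """Parse one HTML document and return the populated parser."""
    parser = PageSummaryParser(base_url)
    parser.feed(html)
    return parser
```

The resulting `title`, `description`, and `links` map onto the "# " line, the "> " line, and the link list, which you would still review by hand before publishing.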

Common mistakes that make llms.txt useless

Most broken llms.txt files fail for boring, fixable reasons. Check yours against these before you call it done.

  • Missing the H1: No "# Title" line means many validators treat the file as invalid. It is the one strictly required element.
  • Wrong location or type: Hosting it anywhere but the root, or serving it as HTML, is often no better than not having it.
  • Dead or indiscriminate links: Linking to 404s, or to every URL on the site, dilutes the signal. Curate, and keep the links live.
  • Keyword stuffing: Padding descriptions with keywords reads as manipulation and adds no clarity. Write it for a human skimming your sitemap.
  • Letting it go stale: An outdated file actively misleads AI about what you do. Update it when your top pages or focus change.
  • Treating it as your only move: llms.txt does not override robots.txt access rules, and it is no substitute for structured data or content quality. It complements them; it does not replace them.
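
Several of these mistakes can be caught offline before you upload anything. The sketch below is a small, hypothetical linter (`lint_llms_txt` is not part of any standard tool) that checks the content-level rules from this article: H1 first, a blockquote summary, well-formed link lines, and the suggested 20-link and ~500-word limits. It does not detect dead links or stale content, which need a live request and human judgment.

```python
import re

# "- [Title](https://url)" with an optional ": description" suffix.
LINK_LINE = re.compile(r"^- \[[^\]]+\]\([^)\s]+\)(: .+)?$")


def lint_llms_txt(text: str, max_links: int = 20, max_words: int = 500) -> list[str]:
    """Return warnings for common content-level mistakes in an llms.txt body."""
    warnings = []
    lines = [line.rstrip() for line in text.splitlines()]
    non_empty = [line for line in lines if line]

    if not non_empty or not non_empty[0].startswith("# "):
        warnings.append("first non-empty line is not an H1 ('# Title')")
    if not any(line.startswith("> ") for line in lines):
        warnings.append("no blockquote summary ('> ...')")

    bullets = [line for line in lines if line.startswith("- ")]
    for line in bullets:
        if not LINK_LINE.match(line):
            warnings.append(f"malformed link line: {line}")
    if len(bullets) > max_links:
        warnings.append(f"{len(bullets)} links listed; consider curating to {max_links} or fewer")

    if len(text.split()) > max_words:
        warnings.append(f"file is over {max_words} words")
    return warnings
```

An empty list means none of these checks fired. It does not prove that any AI platform will read or use the file.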

How llms.txt fits with robots.txt and AI crawlers

These files do different jobs and you generally want both. robots.txt controls access — it can allow or block named AI crawlers like GPTBot, OAI-SearchBot, PerplexityBot, ClaudeBot, and Google-Extended. llms.txt provides context once a crawler is already allowed in.

If your robots.txt blocks the AI user-agents, an llms.txt will not help, because the crawler never reads it. So the dependency order is: confirm crawler access first, then publish llms.txt to shape what those crawlers understand. Honesty matters more than cleverness here — the field is new and platforms are deliberately vague about how they weight any of these signals.
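
A quick way to confirm the first half of that dependency is to test your robots.txt against the user-agents named above. The following sketch uses Python's standard `urllib.robotparser`; the helper name `blocked_ai_agents` is hypothetical, and the agent list is simply the one from this article, not an exhaustive registry.

```python
from urllib.robotparser import RobotFileParser

# User-agent tokens mentioned in this article; extend as needed.
AI_USER_AGENTS = ["GPTBot", "OAI-SearchBot", "PerplexityBot", "ClaudeBot", "Google-Extended"]


def blocked_ai_agents(robots_txt: str, url: str) -> list[str]:
    """Return the AI user-agents that the given robots.txt text disallows for url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [agent for agent in AI_USER_AGENTS if not parser.can_fetch(agent, url)]


if __name__ == "__main__":
    sample = "User-agent: GPTBot\nDisallow: /\n\nUser-agent: *\nAllow: /\n"
    print(blocked_ai_agents(sample, "https://example.com/llms.txt"))
```

Paste in the text of your own robots.txt and the URL of your llms.txt. Any agent in the returned list is being told not to fetch the file.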

See your AI search readiness score

The fastest way to get a correct llms.txt is to let the free Am I Citable scanner build it for you. Enter your URL and it crawls your live site, generates a valid, paste-ready llms.txt from your real title, description, and top internal links, and also reports whether you already have one. In the same pass it checks AI-crawler access (GPTBot, PerplexityBot, ClaudeBot, Google-Extended), content structure, schema markup, and citation signals, and returns a 0-100 AI-readiness score so you know where llms.txt fits among your other fixes. Scan, copy the generated file to your root, and you are done.


FAQ

Where should llms.txt be hosted?

Root only. Serve it at exactly https://yourdomain.com/llms.txt, the same way robots.txt lives at the root. A file at /blog/llms.txt or any other subdirectory is unlikely to be found by crawlers that look for it at the root. It also needs to return HTTP 200 as text/plain with no login wall.

Does llms.txt get you cited by AI?

It can help, but no platform guarantees it. llms.txt is a proposed convention with uneven, opaque adoption, so treat it as one clarity signal among several, alongside crawler access, clean HTML, and structured data. The honest expectation: it cannot hurt you when written truthfully, and it removes guesswork about your site, but it is not a magic citation switch.

Can llms.txt be generated automatically?

Yes. A scanner can crawl your live site, read your title, meta description, and top internal links, and output a valid, correctly formatted llms.txt that you only need to review and upload. That avoids the most common errors (invalid headers, broken links, wrong descriptions) and gets you a paste-ready file in seconds.