How to Make Your Website LLM-Friendly
A practical guide to making your website content accessible to AI systems like ChatGPT, Claude, Perplexity, and other LLM-powered tools.
Table of Contents
- Why: The Third Audience
- What: The Three Components
- When: Should You Implement This?
- How: Implementation Overview
- Part 1: Markdown Auto-Discovery
- Part 2: llms.txt (AI Sitemap)
- Part 3: Bot Tracking
- Platform-Specific Instructions
- Testing & Validation
- Real-World Results
- Using AI Assistants to Implement
- Resources
Why: The Third Audience
Websites have traditionally served two audiences:
- Humans -- who read and interact with your pages
- Search engine crawlers -- which index your pages for Google, Bing, and others
There is now a third audience: AI systems.
Large Language Models (LLMs) like GPT-4, Claude, and others are increasingly used to answer questions, summarise information, and assist with research. When these systems reference your content, they benefit from clean, structured formats rather than complex HTML with navigation, ads, and scripts.
Reference: The Third Audience by Dries Buytaert (creator of Drupal)
Benefits
- Accurate AI responses -- Clean content means fewer hallucinations when AI references your site
- Brand mentions -- AI tools may cite your site more often when they can easily parse your content
- Future-proofing -- As AI search grows, optimised sites will have an advantage
- No SEO downside -- This complements traditional SEO, doesn't replace it
What: The Three Components
A fully LLM-optimised site has three things:
1. Markdown Auto-Discovery
For every page on your site, provide a Markdown version at a predictable URL and tell crawlers about it:
https://yoursite.com/about --> HTML page (for humans)
https://yoursite.com/about.md --> Markdown version (for AI)
Plus a discovery link in your HTML <head>:
<link rel="alternate" type="text/markdown" href="/about.md" />
2. llms.txt (AI Sitemap)
A single file at your site root following the llmstxt.org standard. Think of it as a sitemap specifically for AI crawlers -- it describes your site and links to all your .md pages:
https://yoursite.com/llms.txt
3. Bot Tracking (Optional)
Log which AI crawlers visit your .md files so you can measure whether it's working.
When: Should You Implement This?
Good Candidates
- Content-heavy sites -- Blogs, documentation, news, educational content
- B2B websites -- Product pages, pricing, features that people research
- Reference sites -- APIs, technical docs, knowledge bases
- Sites that want AI visibility -- If you want AI to accurately represent your brand
Less Critical
- E-commerce product listings -- Structured data (JSON-LD) may be more valuable
- Highly interactive apps -- Where content is dynamic/personalised
- Private/gated content -- Unless you want AI to access it
Time Investment
- Simple sites: 2-4 hours
- WordPress/CMS: 4-8 hours
- Large custom sites: 1-2 days
How: Implementation Overview
Regardless of platform, the implementation follows these steps:
- Generate Markdown files -- Convert your HTML content to clean Markdown with YAML frontmatter
- Serve them -- Configure your web server to serve .md files at predictable URLs
- Add discovery links -- Inject <link rel="alternate" type="text/markdown"> into every page
- Create llms.txt -- Build an AI sitemap listing your key pages
- Add bot tracking -- (Optional) Log AI crawler visits to measure adoption
- Test everything -- Verify endpoints, headers, and discovery links
Part 1: Markdown Auto-Discovery
What Your Markdown Files Should Look Like
Each .md file should have YAML frontmatter followed by clean content:
---
title: "About Us"
date: 2024-01-15
url: https://yoursite.com/about
type: page
description: "Learn about our company and mission"
---
# About Us
Your content here in clean Markdown format...
Key points:
- Strip all HTML boilerplate (navigation, footers, ads, scripts)
- Keep the actual content -- headings, paragraphs, lists, links
- Remove decorative images (icons, logos) but keep meaningful ones
- Remove duplicate CTAs and testimonial sections
- The goal is clean, readable text that an LLM can consume efficiently
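If you are converting rendered HTML, much of this cleanup can be automated at conversion time. Below is a minimal sketch using the Turndown library; the function name htmlToCleanMarkdown and the exact list of stripped tags are illustrative choices, not part of any standard:
// clean-page.ts -- minimal sketch: convert rendered HTML into clean Markdown with Turndown
import TurndownService from 'turndown';
const turndown = new TurndownService({ headingStyle: 'atx', codeBlockStyle: 'fenced' });
// Drop boilerplate elements entirely so they never reach the Markdown output.
turndown.remove(['script', 'style', 'nav', 'footer', 'form', 'iframe']);
export function htmlToCleanMarkdown(pageHtml: string): string {
  // Decorative images, duplicate CTAs, and testimonial blocks usually need
  // site-specific rules (turndown.addRule) or a post-processing pass -- omitted here.
  return turndown.turndown(pageHtml).trim();
}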
Content-Type Header
Always serve .md files with:
Content-Type: text/markdown; charset=utf-8
Discovery Link Format
Add this to the <head> of every HTML page that has a Markdown counterpart:
<link rel="alternate" type="text/markdown" href="/page-slug.md" />
This is how AI crawlers find the Markdown version -- similar to how RSS feeds are discovered.
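To make the mechanism concrete, here is a rough sketch of what a crawler effectively does with that tag: fetch the HTML, locate the alternate link, then fetch the Markdown. It assumes Node 18+ for the built-in fetch and uses a deliberately naive regex:
// discover-markdown.ts -- sketch of how a crawler finds and fetches the Markdown alternate
async function fetchMarkdownAlternate(pageUrl: string): Promise<string | null> {
  const html = await (await fetch(pageUrl)).text();
  // Naive match for: <link rel="alternate" type="text/markdown" href="...">
  const match = html.match(/<link[^>]+type=["']text\/markdown["'][^>]*href=["']([^"']+)["']/i);
  if (!match) return null;
  const mdUrl = new URL(match[1], pageUrl).toString(); // resolve relative hrefs
  const res = await fetch(mdUrl);
  return res.ok ? await res.text() : null;
}
// Example: fetchMarkdownAlternate('https://yoursite.com/about').then(console.log);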
Part 2: llms.txt (AI Sitemap)
The llmstxt.org standard defines a file at /llms.txt that helps AI systems understand your site at a glance.
Format
# Your Company Name
> A one-paragraph summary of what your company does and what this site contains.
## Core Pages
- [About](https://yoursite.com/about.md): Company background and mission
- [Pricing](https://yoursite.com/pricing.md): Plans and pricing details
- [Contact](https://yoursite.com/contact.md): How to get in touch
## Products
- [Product One](https://yoursite.com/product-one.md): Description of product one
- [Product Two](https://yoursite.com/product-two.md): Description of product two
## Blog
- [Recent Article](https://yoursite.com/recent-article.md): Article description
Rules (from the spec):
- H1 with your project/company name (required)
- Blockquote summary (required)
- H2 sections grouping related pages
- Markdown list items with links to .md endpoints
- Optional descriptions after each link
Where to Put It
- Primary: https://yoursite.com/llms.txt (root of your site)
- Alternative: https://yoursite.com/.well-known/llms.txt
Generating It
For small sites, write it by hand. For larger sites, generate it from your existing .md files by reading their frontmatter (title, description, type) and grouping them into sections.
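As a sketch of that approach, the script below scans a directory of .md files, reads title/description/type from their frontmatter, and writes a grouped llms.txt. The directory name output/md, the section mapping, and the tiny frontmatter parser are assumptions to adapt to your own setup:
// generate-llms-txt.ts -- sketch: build llms.txt by grouping existing .md files by their type field
import fs from 'node:fs';
import path from 'node:path';
const MD_DIR = 'output/md';           // where your generated .md files live (assumed)
const BASE_URL = 'https://yoursite.com';
// Very small frontmatter reader: grabs key: value pairs between the first two --- lines.
function readFrontmatter(file: string): Record<string, string> {
  const text = fs.readFileSync(file, 'utf8');
  const fm = text.match(/^---\n([\s\S]*?)\n---/);
  const fields: Record<string, string> = {};
  for (const line of (fm ? fm[1].split('\n') : [])) {
    const m = line.match(/^(\w+):\s*"?(.*?)"?\s*$/);
    if (m) fields[m[1]] = m[2];
  }
  return fields;
}
const sections: Record<string, string[]> = { page: [], post: [] };
for (const name of fs.readdirSync(MD_DIR).filter((f) => f.endsWith('.md'))) {
  const fm = readFrontmatter(path.join(MD_DIR, name));
  const entry = `- [${fm.title ?? name}](${BASE_URL}/${name}): ${fm.description ?? ''}`;
  (sections[fm.type ?? 'page'] ??= []).push(entry);
}
const llmsTxt = [
  '# Your Company Name',
  '',
  '> A one-paragraph summary of what your company does and what this site contains.',
  '',
  '## Core Pages',
  ...sections.page,
  '',
  '## Blog',
  ...sections.post,
  '',
].join('\n');
fs.writeFileSync('llms.txt', llmsTxt);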
Part 3: Bot Tracking
Knowing which AI crawlers visit your content helps you measure whether the implementation is working.
Known AI Bot User Agents
| Bot | Company | User Agent String |
|-----|---------|-------------------|
| GPTBot | OpenAI | GPTBot/1.x |
| ChatGPT-User | OpenAI | ChatGPT-User |
| ClaudeBot | Anthropic | ClaudeBot/1.0 |
| Claude-Web | Anthropic | Claude-Web |
| PerplexityBot | Perplexity | PerplexityBot |
| Google-Extended | Google | Google-Extended |
| Applebot-Extended | Apple | Applebot-Extended |
| Meta-ExternalAgent | Meta | meta-externalagent |
| Bytespider | ByteDance | Bytespider |
| CCBot | Common Crawl | CCBot |
| YouBot | You.com | YouBot |
Simple Tracking Approach
Log requests to .md URLs where the user agent matches a known AI bot. Write each hit to a CSV file:
timestamp,bot_name,page,ip,user_agent
2026-01-22 13:32:03,GPTBot,/about.md,74.7.243.200,"Mozilla/5.0 ... GPTBot/1.3 ..."
2026-01-23 20:38:27,Meta-AI,/pricing.md,2a03:2880:f80e:5b::,"meta-externalagent/1.1 ..."
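Outside WordPress (a PHP version appears later), the same idea is a few lines in any request handler. A sketch in TypeScript; the pattern list mirrors the table above, and the CSV path is an assumption:
// llm-bot-log.ts -- sketch: detect AI crawlers and append .md hits to a CSV
import fs from 'node:fs';
const AI_BOTS: Record<string, string> = {
  GPTBot: 'GPTBot', 'ChatGPT-User': 'ChatGPT-User', ClaudeBot: 'ClaudeBot',
  'Claude-Web': 'Claude-Web', PerplexityBot: 'PerplexityBot',
  'Google-Extended': 'Google-Extended', 'Applebot-Extended': 'Applebot-Extended',
  'meta-externalagent': 'Meta-AI', Bytespider: 'Bytespider', CCBot: 'CCBot', YouBot: 'YouBot',
};
export function detectAiBot(userAgent: string): string | null {
  for (const [pattern, name] of Object.entries(AI_BOTS)) {
    if (userAgent.toLowerCase().includes(pattern.toLowerCase())) return name;
  }
  return null;
}
export function logMdHit(pagePath: string, userAgent: string, ip: string, csvPath = 'llm-visits.csv') {
  if (!pagePath.endsWith('.md')) return;          // only track Markdown endpoints
  const bot = detectAiBot(userAgent);
  if (!bot) return;
  if (!fs.existsSync(csvPath)) {
    fs.writeFileSync(csvPath, 'timestamp,bot_name,page,ip,user_agent\n');
  }
  const ua = userAgent.replace(/"/g, '""');       // escape quotes for CSV
  fs.appendFileSync(csvPath, `${new Date().toISOString()},${bot},${pagePath},${ip},"${ua}"\n`);
}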
What to Look For
- Are bots visiting? Any hits at all means your discovery links and/or sitemap are working
- Which bots? GPTBot and ClaudeBot are the most common
- Which pages? See what content AI systems find most interesting
- Frequency? Daily visits vs occasional crawls
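A quick way to answer those questions is a small summary script over the tracking CSV. A sketch, assuming the format shown above and a local file named llm-visits.csv:
// summarise-llm-visits.ts -- count AI bot hits per bot and per page from the tracking CSV
import fs from 'node:fs';
const rows = fs.readFileSync('llm-visits.csv', 'utf8').trim().split('\n').slice(1);
const byBot: Record<string, number> = {};
const byPage: Record<string, number> = {};
for (const row of rows) {
  const [, bot, page] = row.split(','); // naive split; the first three fields contain no commas
  byBot[bot] = (byBot[bot] ?? 0) + 1;
  byPage[page] = (byPage[page] ?? 0) + 1;
}
console.log('Visits by bot:', byBot);
console.log('Top pages:', Object.entries(byPage).sort((a, b) => b[1] - a[1]).slice(0, 10));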
Platform-Specific Instructions
WordPress
WordPress is one of the easiest platforms to implement this on.
Step 1: Export Content and Generate Markdown
Export your posts/pages from the WordPress database, then convert HTML to Markdown locally using Turndown (Node.js):
// generate-markdown.js
const fs = require('fs');
const TurndownService = require('turndown');
const turndown = new TurndownService({ headingStyle: 'atx', codeBlockStyle: 'fenced' });
// For each post/page from your database export:
function processPost(post) {
const markdown = turndown.turndown(post.content);
return `---
title: "${post.title}"
date: ${post.date}
url: https://yoursite.com/${post.slug}
type: ${post.type}
---
# ${post.title}
${markdown}`;
}
// Write one file per post from your export, e.g.:
// fs.writeFileSync(`output/md/${post.slug}.md`, processPost(post));
Tip: Export content via MySQL query as a TSV file -- more reliable than JSON for content with special characters:
SELECT ID, post_title, post_name, post_date, post_type, post_excerpt,
REPLACE(REPLACE(REPLACE(post_content, '\t', ' '), '\r', ''), '\n', '{{NEWLINE}}')
FROM wp_posts WHERE post_status = 'publish' AND post_type IN ('post', 'page')
ORDER BY post_date DESC;
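Putting the export and the converter together, here is a sketch that reads such a TSV dump, restores the {{NEWLINE}} placeholders, and writes one Markdown file per row. The file name wp-export.tsv and the output directory are assumptions; the column order follows the SELECT above:
// convert-export.ts -- sketch: read the TSV export, restore newlines, write one .md file per row
import fs from 'node:fs';
import TurndownService from 'turndown';
const turndown = new TurndownService({ headingStyle: 'atx', codeBlockStyle: 'fenced' });
const rows = fs.readFileSync('wp-export.tsv', 'utf8').trim().split('\n');
fs.mkdirSync('output/md', { recursive: true });
for (const row of rows) {
  // Columns follow the SELECT above: ID, title, slug, date, type, excerpt, content
  const [, title, slug, date, type, excerpt, rawContent] = row.split('\t');
  const html = rawContent.replace(/\{\{NEWLINE\}\}/g, '\n');
  const md = `---
title: "${title}"
date: ${date}
url: https://yoursite.com/${slug}
type: ${type}
description: "${excerpt}"
---

# ${title}

${turndown.turndown(html)}
`;
  fs.writeFileSync(`output/md/${slug}.md`, md);
}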
Step 2: Upload to Server
scp -r output/md/* user@server:/var/www/html/md/
For Bitnami WordPress on Lightsail, the path is /opt/bitnami/wordpress/md/.
Step 3: Add .htaccess Rules
Add these before the WordPress rewrite rules:
# BEGIN Markdown Auto-Discovery
<IfModule mod_mime.c>
AddType text/markdown .md
</IfModule>
<IfModule mod_rewrite.c>
RewriteEngine On
RewriteBase /
RewriteCond %{REQUEST_URI} ^/([^/]+)\.md$
RewriteCond %{DOCUMENT_ROOT}/md/%1.md -f
RewriteRule ^([^/]+)\.md$ /md/$1.md [L]
</IfModule>
# END Markdown Auto-Discovery
This routes yoursite.com/about.md to yoursite.com/md/about.md if the file exists.
Step 4: Add Discovery Link (functions.php)
Add to your child theme's functions.php:
function add_markdown_discovery_link() {
if (is_singular(array('post', 'page'))) {
$slug = get_post_field('post_name', get_post());
if ($slug) {
echo '<link rel="alternate" type="text/markdown" href="/' . esc_attr($slug) . '.md" />' . "\n";
}
}
}
add_action('wp_head', 'add_markdown_discovery_link', 2);
Step 5: Add Bot Tracker (functions.php)
add_action('init', 'track_llm_bot_visits');
function track_llm_bot_visits() {
$request_uri = $_SERVER['REQUEST_URI'] ?? '';
if (!preg_match('/\.md$/i', $request_uri)) return;
$user_agent = $_SERVER['HTTP_USER_AGENT'] ?? '';
$ai_bots = [
'GPTBot' => 'GPTBot', 'ChatGPT-User' => 'ChatGPT-User',
'ClaudeBot' => 'ClaudeBot', 'Claude-Web' => 'Claude-Web',
'anthropic-ai' => 'Anthropic', 'Google-Extended' => 'Google-Extended',
'Applebot-Extended' => 'Applebot-Extended', 'PerplexityBot' => 'PerplexityBot',
'Bytespider' => 'Bytespider', 'CCBot' => 'CCBot', 'cohere-ai' => 'Cohere',
'YouBot' => 'YouBot', 'Meta-ExternalAgent' => 'Meta-AI',
];
$detected_bot = null;
foreach ($ai_bots as $pattern => $name) {
if (stripos($user_agent, $pattern) !== false) { $detected_bot = $name; break; }
}
if (!$detected_bot) return;
$csv_path = ABSPATH . 'md/llm-visits.csv';
if (!file_exists($csv_path)) {
file_put_contents($csv_path, "timestamp,bot_name,page,ip,user_agent\n");
}
$log_entry = sprintf("%s,%s,%s,%s,\"%s\"\n",
date('Y-m-d H:i:s'), $detected_bot,
parse_url($request_uri, PHP_URL_PATH),
$_SERVER['REMOTE_ADDR'] ?? 'unknown',
str_replace('"', '""', $user_agent)
);
file_put_contents($csv_path, $log_entry, FILE_APPEND | LOCK_EX);
}
Step 6: Create llms.txt
Write your llms.txt file and upload it to the web root:
scp llms.txt user@server:/var/www/html/llms.txt
No server config changes needed -- Apache serves static files from the web root by default.
Next.js / React
Markdown API Route
// app/api/[slug]/route.ts (App Router; the rewrite below maps /about.md here)
import { NextRequest, NextResponse } from 'next/server';
import TurndownService from 'turndown';
import { getPostBySlug } from '@/lib/posts';
const turndown = new TurndownService();
export async function GET(
request: NextRequest,
{ params }: { params: { slug: string } }
) {
const post = await getPostBySlug(params.slug);
if (!post) return new NextResponse('Not found', { status: 404 });
const markdown = `---
title: "${post.title}"
date: ${post.date}
url: https://yoursite.com/${post.slug}
description: "${post.description}"
---
# ${post.title}
${turndown.turndown(post.content)}`;
return new NextResponse(markdown, {
headers: { 'Content-Type': 'text/markdown; charset=utf-8' },
});
}
Discovery Link
// In your page component's <head> or metadata
<link rel="alternate" type="text/markdown" href={`/${post.slug}.md`} />
Rewrites (next.config.js)
module.exports = {
async rewrites() {
return [{ source: '/:slug.md', destination: '/api/:slug' }];
},
};
llms.txt
For Next.js, create public/llms.txt -- it will be served automatically at /llms.txt.
Hugo / Static Site Generators
Hugo can output Markdown alongside HTML natively:
# config.toml
[outputFormats.MD]
mediaType = "text/markdown"
baseName = "index"
isPlainText = true
[outputs]
page = ["HTML", "MD"]
Create a template at layouts/_default/single.md:
---
title: "{{ .Title }}"
date: {{ .Date.Format "2006-01-02" }}
url: {{ .Permalink }}
---
# {{ .Title }}
{{ .RawContent }}
Add discovery link to layouts/_default/baseof.html:
{{ if .IsPage }}
<link rel="alternate" type="text/markdown" href="{{ .RelPermalink }}index.md" />
{{ end }}
For llms.txt, create static/llms.txt and it will be copied to the build output.
Custom / Other Frameworks
The pattern is the same regardless of framework:
- Create a /slug.md endpoint that returns Markdown with Content-Type: text/markdown
- Add <link rel="alternate" type="text/markdown"> to your HTML <head>
- Put llms.txt at your web root
- (Optional) Log AI bot visits to a file
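As a framework-free illustration of the first three steps, here is a sketch using Node's built-in http module; the ./md directory, the llms.txt location, and the port are assumptions. Sample server configs for Nginx and Apache follow.
// serve-md.ts -- minimal sketch: serve /slug.md from ./md and /llms.txt from the project root
import http from 'node:http';
import fs from 'node:fs';
import path from 'node:path';
const server = http.createServer((req, res) => {
  const urlPath = (req.url ?? '/').split('?')[0];
  if (urlPath === '/llms.txt') {
    res.writeHead(200, { 'Content-Type': 'text/plain; charset=utf-8' });
    return res.end(fs.readFileSync('llms.txt'));
  }
  const match = urlPath.match(/^\/([\w-]+)\.md$/);
  if (match) {
    const file = path.join('md', `${match[1]}.md`);
    if (fs.existsSync(file)) {
      res.writeHead(200, { 'Content-Type': 'text/markdown; charset=utf-8' });
      return res.end(fs.readFileSync(file));
    }
  }
  res.writeHead(404).end('Not found');
});
server.listen(3000);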
Server config for Nginx:
location ~ ^/(.+)\.md$ {
    alias /var/www/html/md/$1.md;
    default_type text/markdown;
    charset utf-8;
    charset_types text/markdown;
}
Server config for Apache:
AddType text/markdown .md
Testing & Validation
Run these checks after implementation:
# 1. Markdown endpoint returns 200
curl -I https://yoursite.com/about.md
# Expected: HTTP/2 200
# 2. Content-Type is correct
curl -I https://yoursite.com/about.md | grep -i content-type
# Expected: content-type: text/markdown
# 3. Content has frontmatter
curl https://yoursite.com/about.md | head -10
# Expected: starts with ---
# 4. Discovery link exists in HTML
curl -s https://yoursite.com/about | grep 'text/markdown'
# Expected: <link rel="alternate" type="text/markdown" href="/about.md" />
# 5. llms.txt is accessible
curl -I https://yoursite.com/llms.txt
# Expected: HTTP/2 200
# 6. llms.txt content looks right
curl https://yoursite.com/llms.txt | head -10
# Expected: starts with # Your Company Name
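These checks are easy to script so they can be re-run after every deployment. A sketch using Node 18+'s built-in fetch; BASE and the sample slug are placeholders for your own site:
// check-llm-setup.ts -- re-run the validation checks above against a deployed site
const BASE = 'https://yoursite.com';
const SLUG = 'about'; // any page that should have a Markdown counterpart
async function main() {
  const md = await fetch(`${BASE}/${SLUG}.md`);
  console.log('1. .md status:', md.status);                                 // expect 200
  console.log('2. Content-Type:', md.headers.get('content-type'));          // expect text/markdown
  console.log('3. Has frontmatter:', (await md.text()).startsWith('---'));  // expect true
  const html = await (await fetch(`${BASE}/${SLUG}`)).text();
  console.log('4. Discovery link:', html.includes('type="text/markdown"')); // expect true
  const llms = await fetch(`${BASE}/llms.txt`);
  console.log('5. llms.txt status:', llms.status);                          // expect 200
  console.log('6. Starts with H1:', (await llms.text()).startsWith('# '));  // expect true
}
main().catch(console.error);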
Real-World Results
We implemented this on two production WordPress sites in January 2026. Within 5 weeks:
UK Site (289 pages):
- 701 AI bot visits to .md endpoints
- ClaudeBot: 275 visits (39%)
- GPTBot: 231 visits (33%)
- Meta-AI: 195 visits (28%)
US Site (119 pages):
- 187 AI bot visits to .md endpoints
- GPTBot: 95 visits (51%)
- ClaudeBot: 91 visits (49%)
- PerplexityBot: 1 visit
AI crawlers started hitting the .md endpoints within hours of deployment. GPTBot in particular crawled aggressively once it discovered the first few pages, often visiting dozens of pages in a single session.
Using AI Assistants to Implement
This entire implementation can be done with help from AI coding assistants like Claude Code or Cursor.
Example Prompts
For WordPress:
I have a WordPress site hosted on [provider]. I want to implement
markdown auto-discovery for AI crawlers. Please:
1. Export my posts/pages content
2. Generate markdown files with frontmatter
3. Configure .htaccess to serve them
4. Add discovery links to my theme
5. Create an llms.txt file
6. Add bot tracking
My SSH access is [details]. My theme is [theme-name].
For Next.js:
Add LLM-friendly content to my Next.js site. I want:
1. An API route serving markdown versions of pages at /[slug].md
2. Discovery <link> tags in the <head> of each page
3. An llms.txt file in public/
4. Proper Content-Type headers
My content comes from [CMS/database/files].
Tips
- Share your project structure so the AI understands your codebase
- Go step by step -- don't try to do everything at once
- Test after each change -- verify before moving on
- Keep backups of any files you modify (especially .htaccess and functions.php)
Quick Reference
Files You'll Create/Modify
| Platform | Markdown Files | Server Config | Discovery Link | llms.txt |
|----------|---------------|---------------|----------------|----------|
| WordPress | /md/*.md | .htaccess | functions.php | Web root |
| Next.js | API route | next.config.js | <Head> component | public/llms.txt |
| Gatsby | public/md/*.md | _redirects | <Helmet> component | public/llms.txt |
| Hugo | Built-in output | N/A | Template | static/llms.txt |
| Custom | /md/*.md | nginx/apache conf | Template | Web root |
Key URLs to Verify
https://yoursite.com/llms.txt -- AI sitemap
https://yoursite.com/about.md -- Example markdown page
https://yoursite.com/.well-known/llms.txt -- Alternative AI sitemap location
Resources
- The Third Audience -- The concept article by Dries Buytaert
- llmstxt.org -- The llms.txt standard specification
- Turndown -- HTML to Markdown converter (JavaScript)
- markdownify -- HTML to Markdown (Python)
Last updated: February 2026