How to Make Your Website LLM-Friendly
A practical guide to making your website content accessible to AI systems like ChatGPT, Claude, Perplexity, and other LLM-powered tools.
Table of Contents
- Why: The Third Audience
- What: The Three Components
- When: Should You Implement This?
- How: Implementation Overview
- Part 1: Markdown Auto-Discovery
- Part 2: llms.txt (AI Sitemap)
- Part 3: Bot Tracking
- Platform-Specific Instructions
- Testing & Validation
- Real-World Results
- Using AI Assistants to Implement
- Resources
Why: The Third Audience
Websites have traditionally served two audiences:
- Humans -- who read and interact with your pages
- Search engine crawlers -- which index your pages for Google, Bing, and others
There is now a third audience: AI systems.
Large Language Models (LLMs) like GPT-4, Claude, and others are increasingly used to answer questions, summarise information, and assist with research. When these systems reference your content, they benefit from clean, structured formats rather than complex HTML with navigation, ads, and scripts.
Reference: The Third Audience by Dries Buytaert (creator of Drupal)
Benefits
- Accurate AI responses -- Clean content means fewer hallucinations when AI references your site
- Brand mentions -- AI tools may cite your site more often when they can easily parse your content
- Future-proofing -- As AI search grows, optimised sites will have an advantage
- No SEO downside -- This complements traditional SEO, doesn't replace it
What: The Three Components
A fully LLM-optimised site has three things:
1. Markdown Auto-Discovery
For every page on your site, provide a Markdown version at a predictable URL and tell crawlers about it:
https://yoursite.com/about --> HTML page (for humans)
https://yoursite.com/about.md --> Markdown version (for AI)
Plus a discovery link in your HTML <head>:
<link rel="alternate" type="text/markdown" href="/about.md" />
2. llms.txt (AI Sitemap)
A single file at your site root following the llmstxt.org standard. Think of it as a sitemap specifically for AI crawlers -- it describes your site and links to all your .md pages:
https://yoursite.com/llms.txt
3. Bot Tracking (Optional)
Log which AI crawlers visit your .md files so you can measure whether it's working.
When: Should You Implement This?
Good Candidates
- Content-heavy sites -- Blogs, documentation, news, educational content
- B2B websites -- Product pages, pricing, features that people research
- Reference sites -- APIs, technical docs, knowledge bases
- Sites that want AI visibility -- If you want AI to accurately represent your brand
Less Critical
- E-commerce product listings -- Structured data (JSON-LD) may be more valuable
- Highly interactive apps -- Where content is dynamic/personalised
- Private/gated content -- Unless you want AI to access it
Time Investment
- Simple sites: 2-4 hours
- WordPress/CMS: 4-8 hours
- Large custom sites: 1-2 days
How: Implementation Overview
Regardless of platform, the implementation follows these steps:
- Generate Markdown files -- Convert your HTML content to clean Markdown with YAML frontmatter
- Serve them -- Configure your web server to serve .md files at predictable URLs
- Add discovery links -- Inject <link rel="alternate" type="text/markdown"> into every page
- Create llms.txt -- Build an AI sitemap listing your key pages
- Add bot tracking -- (Optional) Log AI crawler visits to measure adoption
- Test everything -- Verify endpoints, headers, and discovery links
Part 1: Markdown Auto-Discovery
What Your Markdown Files Should Look Like
Each .md file should have YAML frontmatter followed by clean content:
---
title: "About Us"
date: 2024-01-15
url: https://yoursite.com/about
type: page
description: "Learn about our company and mission"
---
# About Us
Your content here in clean Markdown format...
Key points:
- Strip all HTML boilerplate (navigation, footers, ads, scripts)
- Keep the actual content -- headings, paragraphs, lists, links
- Remove decorative images (icons, logos) but keep meaningful ones
- Remove duplicate CTAs and testimonial sections
- The goal is clean, readable text that an LLM can consume efficiently
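If you are converting rendered HTML, much of this cleanup can be automated at conversion time. Below is a minimal sketch using the Turndown library; the function name htmlToCleanMarkdown and the exact list of stripped tags are illustrative choices, not part of any standard:
// clean-page.ts -- minimal sketch: convert rendered HTML into clean Markdown with Turndown
import TurndownService from 'turndown';
const turndown = new TurndownService({ headingStyle: 'atx', codeBlockStyle: 'fenced' });
// Drop boilerplate elements entirely so they never reach the Markdown output.
turndown.remove(['script', 'style', 'nav', 'footer', 'form', 'iframe']);
export function htmlToCleanMarkdown(pageHtml: string): string {
  // Decorative images, duplicate CTAs, and testimonial blocks usually need
  // site-specific rules (turndown.addRule) or a post-processing pass -- omitted here.
  return turndown.turndown(pageHtml).trim();
}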
Content-Type Header
Always serve .md files with:
Content-Type: text/markdown; charset=utf-8
Discovery Link Format
Add this to the <head> of every HTML page that has a Markdown counterpart:
<link rel="alternate" type="text/markdown" href="/page-slug.md" />
This is how AI crawlers find the Markdown version -- similar to how RSS feeds are discovered.
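To make the mechanism concrete, here is a rough sketch of what a crawler effectively does with that tag: fetch the HTML, locate the alternate link, then fetch the Markdown. It assumes Node 18+ for the built-in fetch and uses a deliberately naive regex:
// discover-markdown.ts -- sketch of how a crawler finds and fetches the Markdown alternate
async function fetchMarkdownAlternate(pageUrl: string): Promise<string | null> {
  const html = await (await fetch(pageUrl)).text();
  // Naive match for: <link rel="alternate" type="text/markdown" href="...">
  const match = html.match(/<link[^>]+type=["']text\/markdown["'][^>]*href=["']([^"']+)["']/i);
  if (!match) return null;
  const mdUrl = new URL(match[1], pageUrl).toString(); // resolve relative hrefs
  const res = await fetch(mdUrl);
  return res.ok ? await res.text() : null;
}
// Example: fetchMarkdownAlternate('https://yoursite.com/about').then(console.log);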
Part 2: llms.txt (AI Sitemap)
The llmstxt.org standard defines a file at /llms.txt that helps AI systems understand your site at a glance.
Format
# Your Company Name
> A one-paragraph summary of what your company does and what this site contains.
## Core Pages
- [About](https://yoursite.com/about.md): Company background and mission
- [Pricing](https://yoursite.com/pricing.md): Plans and pricing details
- [Contact](https://yoursite.com/contact.md): How to get in touch
## Products
- [Product One](https://yoursite.com/product-one.md): Description of product one
- [Product Two](https://yoursite.com/product-two.md): Description of product two
## Blog
- [Recent Article](https://yoursite.com/recent-article.md): Article description
Rules (from the spec):
- H1 with your project/company name (required)
- Blockquote summary (required)
- H2 sections grouping related pages
- Markdown list items with links to .md endpoints
- Optional descriptions after each link
Where to Put It
- Primary: https://yoursite.com/llms.txt (root of your site)
- Alternative: https://yoursite.com/.well-known/llms.txt
Generating It
For small sites, write it by hand. For larger sites, generate it from your existing .md files by reading their frontmatter (title, description, type) and grouping them into sections.
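As a sketch of that approach, the script below scans a directory of .md files, reads title/description/type from their frontmatter, and writes a grouped llms.txt. The directory name output/md, the section mapping, and the tiny frontmatter parser are assumptions to adapt to your own setup:
// generate-llms-txt.ts -- sketch: build llms.txt by grouping existing .md files by their type field
import fs from 'node:fs';
import path from 'node:path';
const MD_DIR = 'output/md';           // where your generated .md files live (assumed)
const BASE_URL = 'https://yoursite.com';
// Very small frontmatter reader: grabs key: value pairs between the first two --- lines.
function readFrontmatter(file: string): Record<string, string> {
  const text = fs.readFileSync(file, 'utf8');
  const fm = text.match(/^---\n([\s\S]*?)\n---/);
  const fields: Record<string, string> = {};
  for (const line of (fm ? fm[1].split('\n') : [])) {
    const m = line.match(/^(\w+):\s*"?(.*?)"?\s*$/);
    if (m) fields[m[1]] = m[2];
  }
  return fields;
}
const sections: Record<string, string[]> = { page: [], post: [] };
for (const name of fs.readdirSync(MD_DIR).filter((f) => f.endsWith('.md'))) {
  const fm = readFrontmatter(path.join(MD_DIR, name));
  const entry = `- [${fm.title ?? name}](${BASE_URL}/${name}): ${fm.description ?? ''}`;
  (sections[fm.type ?? 'page'] ??= []).push(entry);
}
const llmsTxt = [
  '# Your Company Name',
  '',
  '> A one-paragraph summary of what your company does and what this site contains.',
  '',
  '## Core Pages',
  ...sections.page,
  '',
  '## Blog',
  ...sections.post,
  '',
].join('\n');
fs.writeFileSync('llms.txt', llmsTxt);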
Part 3: Bot Tracking
Knowing which AI crawlers visit your content helps you measure whether the implementation is working.
Known AI Bot User Agents
| Bot | Company | User Agent String |
|-----|---------|-------------------|
| GPTBot | OpenAI | GPTBot/1.x |
| ChatGPT-User | OpenAI | ChatGPT-User |
| ClaudeBot | Anthropic | ClaudeBot/1.0 |
| Claude-Web | Anthropic | Claude-Web |
| PerplexityBot | Perplexity | PerplexityBot |
| Google-Extended | Google | Google-Extended |
| Applebot-Extended | Apple | Applebot-Extended |
| Meta-ExternalAgent | Meta | meta-externalagent |
| Bytespider | ByteDance | Bytespider |
| CCBot | Common Crawl | CCBot |
| YouBot | You.com | YouBot |
Simple Tracking Approach
Log requests to .md URLs where the user agent matches a known AI bot. Write each hit to a CSV file:
timestamp,bot_name,page,ip,user_agent
2026-01-22 13:32:03,GPTBot,/about.md,74.7.243.200,"Mozilla/5.0 ... GPTBot/1.3 ..."
2026-01-23 20:38:27,Meta-AI,/pricing.md,2a03:2880:f80e:5b::,"meta-externalagent/1.1 ..."
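Outside WordPress (a PHP version appears later), the same idea is a few lines in any request handler. A sketch in TypeScript; the pattern list mirrors the table above, and the CSV path is an assumption:
// llm-bot-log.ts -- sketch: detect AI crawlers and append .md hits to a CSV
import fs from 'node:fs';
const AI_BOTS: Record<string, string> = {
  GPTBot: 'GPTBot', 'ChatGPT-User': 'ChatGPT-User', ClaudeBot: 'ClaudeBot',
  'Claude-Web': 'Claude-Web', PerplexityBot: 'PerplexityBot',
  'Google-Extended': 'Google-Extended', 'Applebot-Extended': 'Applebot-Extended',
  'meta-externalagent': 'Meta-AI', Bytespider: 'Bytespider', CCBot: 'CCBot', YouBot: 'YouBot',
};
export function detectAiBot(userAgent: string): string | null {
  for (const [pattern, name] of Object.entries(AI_BOTS)) {
    if (userAgent.toLowerCase().includes(pattern.toLowerCase())) return name;
  }
  return null;
}
export function logMdHit(pagePath: string, userAgent: string, ip: string, csvPath = 'llm-visits.csv') {
  if (!pagePath.endsWith('.md')) return;          // only track Markdown endpoints
  const bot = detectAiBot(userAgent);
  if (!bot) return;
  if (!fs.existsSync(csvPath)) {
    fs.writeFileSync(csvPath, 'timestamp,bot_name,page,ip,user_agent\n');
  }
  const ua = userAgent.replace(/"/g, '""');       // escape quotes for CSV
  fs.appendFileSync(csvPath, `${new Date().toISOString()},${bot},${pagePath},${ip},"${ua}"\n`);
}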
What to Look For
- Are bots visiting? Any hits at all means your discovery links and/or sitemap are working
- Which bots? GPTBot and ClaudeBot are the most common
- Which pages? See what content AI systems find most interesting
- Frequency? Daily visits vs occasional crawls
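A quick way to answer those questions is a small summary script over the tracking CSV. A sketch, assuming the format shown above and a local file named llm-visits.csv:
// summarise-llm-visits.ts -- count AI bot hits per bot and per page from the tracking CSV
import fs from 'node:fs';
const rows = fs.readFileSync('llm-visits.csv', 'utf8').trim().split('\n').slice(1);
const byBot: Record<string, number> = {};
const byPage: Record<string, number> = {};
for (const row of rows) {
  const [, bot, page] = row.split(','); // naive split; the first three fields contain no commas
  byBot[bot] = (byBot[bot] ?? 0) + 1;
  byPage[page] = (byPage[page] ?? 0) + 1;
}
console.log('Visits by bot:', byBot);
console.log('Top pages:', Object.entries(byPage).sort((a, b) => b[1] - a[1]).slice(0, 10));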
Platform-Specific Instructions
WordPress
WordPress is one of the easiest platforms to implement this on.
Step 1: Export Content and Generate Markdown
Export your posts/pages from the WordPress database, then convert HTML to Markdown locally using Turndown (Node.js):
// generate-markdown.js
const fs = require('fs');
const TurndownService = require('turndown');
const turndown = new TurndownService({ headingStyle: 'atx', codeBlockStyle: 'fenced' });
// For each post/page from your database export:
function processPost(post) {
const markdown = turndown.turndown(post.content);
return `---
title: "${post.title}"
date: ${post.date}
url: https://yoursite.com/${post.slug}
type: ${post.type}
---
# ${post.title}
${markdown}`;
}
// Write one file per post from your export, e.g.:
// fs.writeFileSync(`output/md/${post.slug}.md`, processPost(post));
Tip: Export content via MySQL query as a TSV file -- more reliable than JSON for content with special characters:
SELECT ID, post_title, post_name, post_date, post_type, post_excerpt,
REPLACE(REPLACE(REPLACE(post_content, '\t', ' '), '\r', ''), '\n', '{{NEWLINE}}')
FROM wp_posts WHERE post_status = 'publish' AND post_type IN ('post', 'page')
ORDER BY post_date DESC;
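Putting the export and the converter together, here is a sketch that reads such a TSV dump, restores the {{NEWLINE}} placeholders, and writes one Markdown file per row. The file name wp-export.tsv and the output directory are assumptions; the column order follows the SELECT above:
// convert-export.ts -- sketch: read the TSV export, restore newlines, write one .md file per row
import fs from 'node:fs';
import TurndownService from 'turndown';
const turndown = new TurndownService({ headingStyle: 'atx', codeBlockStyle: 'fenced' });
const rows = fs.readFileSync('wp-export.tsv', 'utf8').trim().split('\n');
fs.mkdirSync('output/md', { recursive: true });
for (const row of rows) {
  // Columns follow the SELECT above: ID, title, slug, date, type, excerpt, content
  const [, title, slug, date, type, excerpt, rawContent] = row.split('\t');
  const html = rawContent.replace(/\{\{NEWLINE\}\}/g, '\n');
  const md = `---
title: "${title}"
date: ${date}
url: https://yoursite.com/${slug}
type: ${type}
description: "${excerpt}"
---

# ${title}

${turndown.turndown(html)}
`;
  fs.writeFileSync(`output/md/${slug}.md`, md);
}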
Step 2: Upload to Server
scp -r output/md/* user@server:/var/www/html/md/
For Bitnami WordPress on Lightsail, the path is /opt/bitnami/wordpress/md/.
Step 3: Add .htaccess Rules
Add these before the WordPress rewrite rules:
# BEGIN Markdown Auto-Discovery
<IfModule mod_mime.c>
AddType text/markdown .md
</IfModule>
<IfModule mod_rewrite.c>
RewriteEngine On
RewriteBase /
RewriteCond %{REQUEST_URI} ^/([^/]+)\.md$
RewriteCond %{DOCUMENT_ROOT}/md/%1.md -f
RewriteRule ^([^/]+)\.md$ /md/$1.md [L]
</IfModule>
# END Markdown Auto-Discovery
This routes yoursite.com/about.md to yoursite.com/md/about.md if the file exists.
Step 4: Add Discovery Link (functions.php)
Add to your child theme's functions.php:
function add_markdown_discovery_link() {
if (is_singular(array('post', 'page'))) {
$slug = get_post_field('post_name', get_post());
if ($slug) {
echo '<link rel="alternate" type="text/markdown" href="/' . esc_attr($slug) . '.md" />' . "\n";
}
}
}
add_action('wp_head', 'add_markdown_discovery_link', 2);
Step 5: Add Bot Tracker (functions.php)
add_action('init', 'track_llm_bot_visits');
function track_llm_bot_visits() {
$request_uri = $_SERVER['REQUEST_URI'] ?? '';
if (!preg_match('/\.md$/i', $request_uri)) return;
$user_agent = $_SERVER['HTTP_USER_AGENT'] ?? '';
$ai_bots = [
'GPTBot' => 'GPTBot', 'ChatGPT-User' => 'ChatGPT-User',
'ClaudeBot' => 'ClaudeBot', 'Claude-Web' => 'Claude-Web',
'anthropic-ai' => 'Anthropic', 'Google-Extended' => 'Google-Extended',
'Applebot-Extended' => 'Applebot-Extended', 'PerplexityBot' => 'PerplexityBot',
'Bytespider' => 'Bytespider', 'CCBot' => 'CCBot', 'cohere-ai' => 'Cohere',
'YouBot' => 'YouBot', 'Meta-ExternalAgent' => 'Meta-AI',
];
$detected_bot = null;
foreach ($ai_bots as $pattern => $name) {
if (stripos($user_agent, $pattern) !== false) { $detected_bot = $name; break; }
}
if (!$detected_bot) return;
$csv_path = ABSPATH . 'md/llm-visits.csv';
if (!file_exists($csv_path)) {
file_put_contents($csv_path, "timestamp,bot_name,page,ip,user_agent\n");
}
$log_entry = sprintf("%s,%s,%s,%s,\"%s\"\n",
date('Y-m-d H:i:s'), $detected_bot,
parse_url($request_uri, PHP_URL_PATH),
$_SERVER['REMOTE_ADDR'] ?? 'unknown',
str_replace('"', '""', $user_agent)
);
file_put_contents($csv_path, $log_entry, FILE_APPEND | LOCK_EX);
}
Step 6: Create llms.txt
Write your llms.txt file and upload it to the web root:
scp llms.txt user@server:/var/www/html/llms.txt
No server config changes needed -- Apache serves static files from the web root by default.
Next.js / React
Markdown API Route
// app/api/[slug]/route.ts (App Router; the rewrite below maps /about.md here)
import { NextRequest, NextResponse } from 'next/server';
import TurndownService from 'turndown';
import { getPostBySlug } from '@/lib/posts';
const turndown = new TurndownService();
export async function GET(
request: NextRequest,
{ params }: { params: { slug: string } }
) {
const post = await getPostBySlug(params.slug);
if (!post) return new NextResponse('Not found', { status: 404 });
const markdown = `---
title: "${post.title}"
date: ${post.date}
url: https://yoursite.com/${post.slug}
description: "${post.description}"
---
# ${post.title}
${turndown.turndown(post.content)}`;
return new NextResponse(markdown, {
headers: { 'Content-Type': 'text/markdown; charset=utf-8' },
});
}
Discovery Link
// In your page component's <head> or metadata
<link rel="alternate" type="text/markdown" href={`/${post.slug}.md`} />
Rewrites (next.config.js)
module.exports = {
async rewrites() {
return [{ source: '/:slug.md', destination: '/api/:slug' }];
},
};
llms.txt
For Next.js, create public/llms.txt -- it will be served automatically at /llms.txt.
Hugo / Static Site Generators
Hugo can output Markdown alongside HTML natively:
# config.toml
[outputFormats.MD]
mediaType = "text/markdown"
baseName = "index"
isPlainText = true
[outputs]
page = ["HTML", "MD"]
Create a template at layouts/_default/single.md:
---
title: "{{ .Title }}"
date: {{ .Date.Format "2006-01-02" }}
url: {{ .Permalink }}
---
# {{ .Title }}
{{ .RawContent }}
Add discovery link to layouts/_default/baseof.html:
{{ if .IsPage }}
<link rel="alternate" type="text/markdown" href="{{ .RelPermalink }}index.md" />
{{ end }}
For llms.txt, create static/llms.txt and it will be copied to the build output.
Custom / Other Frameworks
The pattern is the same regardless of framework:
- Create a /slug.md endpoint that returns Markdown with Content-Type: text/markdown
- Add <link rel="alternate" type="text/markdown"> to your HTML <head>
- Put llms.txt at your web root
- (Optional) Log AI bot visits to a file
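As a framework-free illustration of the first three steps, here is a sketch using Node's built-in http module; the ./md directory, the llms.txt location, and the port are assumptions. Sample server configs for Nginx and Apache follow.
// serve-md.ts -- minimal sketch: serve /slug.md from ./md and /llms.txt from the project root
import http from 'node:http';
import fs from 'node:fs';
import path from 'node:path';
const server = http.createServer((req, res) => {
  const urlPath = (req.url ?? '/').split('?')[0];
  if (urlPath === '/llms.txt') {
    res.writeHead(200, { 'Content-Type': 'text/plain; charset=utf-8' });
    return res.end(fs.readFileSync('llms.txt'));
  }
  const match = urlPath.match(/^\/([\w-]+)\.md$/);
  if (match) {
    const file = path.join('md', `${match[1]}.md`);
    if (fs.existsSync(file)) {
      res.writeHead(200, { 'Content-Type': 'text/markdown; charset=utf-8' });
      return res.end(fs.readFileSync(file));
    }
  }
  res.writeHead(404).end('Not found');
});
server.listen(3000);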
Server config for Nginx:
location ~ ^/(.+)\.md$ {
    alias /var/www/html/md/$1.md;
    default_type text/markdown;
    charset utf-8;
    charset_types text/markdown;
}
Server config for Apache:
AddType text/markdown .md
Testing & Validation
Run these checks after implementation:
# 1. Markdown endpoint returns 200
curl -I https://yoursite.com/about.md
# Expected: HTTP/2 200
# 2. Content-Type is correct
curl -I https://yoursite.com/about.md | grep -i content-type
# Expected: content-type: text/markdown
# 3. Content has frontmatter
curl https://yoursite.com/about.md | head -10
# Expected: starts with ---
# 4. Discovery link exists in HTML
curl -s https://yoursite.com/about | grep 'text/markdown'
# Expected: <link rel="alternate" type="text/markdown" href="/about.md" />
# 5. llms.txt is accessible
curl -I https://yoursite.com/llms.txt
# Expected: HTTP/2 200
# 6. llms.txt content looks right
curl https://yoursite.com/llms.txt | head -10
# Expected: starts with # Your Company Name
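These checks are easy to script so they can be re-run after every deployment. A sketch using Node 18+'s built-in fetch; BASE and the sample slug are placeholders for your own site:
// check-llm-setup.ts -- re-run the validation checks above against a deployed site
const BASE = 'https://yoursite.com';
const SLUG = 'about'; // any page that should have a Markdown counterpart
async function main() {
  const md = await fetch(`${BASE}/${SLUG}.md`);
  console.log('1. .md status:', md.status);                                 // expect 200
  console.log('2. Content-Type:', md.headers.get('content-type'));          // expect text/markdown
  console.log('3. Has frontmatter:', (await md.text()).startsWith('---'));  // expect true
  const html = await (await fetch(`${BASE}/${SLUG}`)).text();
  console.log('4. Discovery link:', html.includes('type="text/markdown"')); // expect true
  const llms = await fetch(`${BASE}/llms.txt`);
  console.log('5. llms.txt status:', llms.status);                          // expect 200
  console.log('6. Starts with H1:', (await llms.text()).startsWith('# '));  // expect true
}
main().catch(console.error);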
Real-World Results
We implemented this on two production WordPress sites in January 2026. Within 5 weeks:
UK Site (289 pages):
- 701 AI bot visits to .md endpoints
- ClaudeBot: 275 visits (39%)
- GPTBot: 231 visits (33%)
- Meta-AI: 195 visits (28%)
US Site (119 pages):
- 187 AI bot visits to .md endpoints
- GPTBot: 95 visits (51%)
- ClaudeBot: 91 visits (49%)
- PerplexityBot: 1 visit
AI crawlers started hitting the .md endpoints within hours of deployment. GPTBot in particular crawled aggressively once it discovered the first few pages, often visiting dozens of pages in a single session.
Using AI Assistants to Implement
This entire implementation can be done with help from AI coding assistants like Claude Code or Cursor.
Example Prompts
For WordPress:
I have a WordPress site hosted on [provider]. I want to implement
markdown auto-discovery for AI crawlers. Please:
1. Export my posts/pages content
2. Generate markdown files with frontmatter
3. Configure .htaccess to serve them
4. Add discovery links to my theme
5. Create an llms.txt file
6. Add bot tracking
My SSH access is [details]. My theme is [theme-name].
For Next.js:
Add LLM-friendly content to my Next.js site. I want:
1. An API route serving markdown versions of pages at /[slug].md
2. Discovery <link> tags in the <head> of each page
3. An llms.txt file in public/
4. Proper Content-Type headers
My content comes from [CMS/database/files].
Tips
- Share your project structure so the AI understands your codebase
- Go step by step -- don't try to do everything at once
- Test after each change -- verify before moving on
- Keep backups of any files you modify (especially .htaccess and functions.php)
Quick Reference
Files You'll Create/Modify
| Platform | Markdown Files | Server Config | Discovery Link | llms.txt |
|----------|---------------|---------------|----------------|----------|
| WordPress | /md/*.md | .htaccess | functions.php | Web root |
| Next.js | API route | next.config.js | <Head> component | public/llms.txt |
| Gatsby | public/md/*.md | _redirects | <Helmet> component | public/llms.txt |
| Hugo | Built-in output | N/A | Template | static/llms.txt |
| Custom | /md/*.md | nginx/apache conf | Template | Web root |
Key URLs to Verify
https://yoursite.com/llms.txt -- AI sitemap
https://yoursite.com/about.md -- Example markdown page
https://yoursite.com/.well-known/llms.txt -- Alternative AI sitemap location
Resources
- The Third Audience -- The concept article by Dries Buytaert
- llmstxt.org -- The llms.txt standard specification
- Turndown -- HTML to Markdown converter (JavaScript)
- markdownify -- HTML to Markdown (Python)
Last updated: February 2026