TL;DR
Add explicit robots.txt Allow rules for PerplexityBot and Perplexity-User to earn citations and product cards in Perplexity AI's “Ask Anything” mode. The bot ignores crawl-delay, so rely on 429 + Retry-After or CDN rate controls for bursts. Follow the quick-start snippet below, enrich pages with Product/FAQ schema, and monitor logs for ROI.
Why PerplexityBot Matters to Revenue
Perplexity processed 780 million user queries in May 2025—quadruple last year's volume. Third-party trackers place Perplexity at 22 million monthly active users and a 6.3% share of the AI-chat market. Its crawler, PerplexityBot, is a gateway to a fast-growing discovery channel that surfaces direct links—often above traditional SERPs.
2025 Adoption & Traffic Stats
| Metric | Value | Why It Matters |
|---|---|---|
| Total Perplexity queries/month | 400M (2024) → ~3B (2025) | More queries mean more chances to surface your brand |
| Sites explicitly allowing PerplexityBot | 31% of top-10K domains | First movers dominate answer boxes |
| Ad revenue pilot | Sponsored follow-up Q's live | Early visibility may convert to paid placements |
Meet Perplexity's Crawlers
PerplexityBot vs Perplexity-User
| Purpose | User-agent | Behaviour | Respects robots.txt? |
|---|---|---|---|
| Model + index | PerplexityBot | Wide, continuous scrape | Yes – obeys Allow/Disallow |
| Real-time fetch | Perplexity-User | Burst traffic per query | Yes |
Controversy: Some researchers claim the bot sometimes hits disallowed paths during spikes. Mitigate with server-side 403s on sensitive routes.
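The server-side 403 idea can be sketched in a few lines of Python. This is a minimal illustration, not a drop-in middleware: the path prefixes and the `status_for` helper are hypothetical names chosen for the example, and matching on the User-Agent header assumes the crawler identifies itself honestly.

```python
# Hypothetical values: adjust the prefixes to the sensitive routes on your own site.
SENSITIVE_PREFIXES = ("/admin/", "/account/", "/internal/")
PERPLEXITY_TOKENS = ("perplexitybot", "perplexity-user")


def is_perplexity_agent(user_agent: str) -> bool:
    """Return True when the User-Agent header mentions a Perplexity crawler."""
    ua = user_agent.lower()
    return any(token in ua for token in PERPLEXITY_TOKENS)


def status_for(path: str, user_agent: str) -> int:
    """Return 403 for Perplexity agents on sensitive paths, otherwise 200."""
    if is_perplexity_agent(user_agent) and path.startswith(SENSITIVE_PREFIXES):
        return 403
    return 200
```

Enforcing the rule on the server means it still holds if a crawler skips or misreads robots.txt.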
How to Spot Them in Logs
grep -E "PerplexityBot|Perplexity-User" access.log | cut -d' ' -f1,12 | head
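For a per-agent tally rather than a raw listing, the same filter can be sketched in Python. The sketch assumes the combined log format, where the User-Agent is the last double-quoted field; `count_perplexity_hits` is an illustrative name, not part of any tool.

```python
import re
from collections import Counter

# Assumption: combined log format, so the User-Agent is the last quoted field.
UA_PATTERN = re.compile(r'"([^"]*)"\s*$')
AGENTS = ("PerplexityBot", "Perplexity-User")


def count_perplexity_hits(lines):
    """Count log lines per Perplexity agent name found in the User-Agent field."""
    counts = Counter()
    for line in lines:
        match = UA_PATTERN.search(line)
        if not match:
            continue  # line does not end with a quoted field
        user_agent = match.group(1)
        for agent in AGENTS:
            if agent in user_agent:
                counts[agent] += 1
                break
    return counts
```

Calling `count_perplexity_hits(open("access.log"))` returns a `Counter` keyed by agent name, which is easy to compare week over week.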
Robots.txt Configuration
Quick-Start Allow Block
# — Perplexity allow-list (training + search) —
User-agent: PerplexityBot
Allow: /
User-agent: Perplexity-User
Allow: /
Place these groups above any User-agent: * group to avoid override.
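Before deploying, the allow block can be sanity-checked offline with Python's standard-library robots.txt parser. In the sketch below, the wildcard group with `Disallow: /private/` is a hypothetical addition for the test, and Python's parser is only one implementation, so Perplexity's own matching may differ in edge cases.

```python
from urllib.robotparser import RobotFileParser

# The two Perplexity groups from the quick-start block, plus a hypothetical
# wildcard group that blocks /private/ for every other crawler.
ROBOTS_TXT = """\
User-agent: PerplexityBot
Allow: /

User-agent: Perplexity-User
Allow: /

User-agent: *
Disallow: /private/
"""


def allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Parse robots.txt text and report whether user_agent may fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)
```

With these rules, `allowed(ROBOTS_TXT, "PerplexityBot", "https://example.com/private/x")` is true, while the same URL is refused for an agent that only matches the wildcard group.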
Throttling & Defensive Rules
Perplexity ignores crawl-delay. Use 429 rate limiting at your CDN or web server. After testing, raise the threshold gradually to avoid blocking genuine bursts.
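If your CDN does not offer rate controls, the 429 behaviour can be approximated in application code. The following is a minimal fixed-window sketch, assuming a single process and an in-memory counter; `FixedWindowLimiter` and its parameters are illustrative names, and production setups usually rely on the CDN or web server instead.

```python
import math
import time


class FixedWindowLimiter:
    """Allow at most `limit` requests per `window` seconds for each client key."""

    def __init__(self, limit: int, window: float = 1.0, clock=time.monotonic):
        self.limit = limit
        self.window = window
        self.clock = clock  # injectable so the limiter can be tested
        self._windows = {}  # key -> (window_start, request_count)

    def check(self, key: str):
        """Return (status, headers): 200 with no headers, or 429 with Retry-After."""
        now = self.clock()
        start, count = self._windows.get(key, (now, 0))
        if now - start >= self.window:
            start, count = now, 0  # the previous window expired
        if count >= self.limit:
            # Retry-After is expressed in whole seconds, rounded up.
            retry_after = max(1, math.ceil(start + self.window - now))
            return 429, {"Retry-After": str(retry_after)}
        self._windows[key] = (start, count + 1)
        return 200, {}
```

Keying the limiter on the user-agent token (for example `limiter.check("Perplexity-User")`) throttles the crawler without affecting human visitors; start with a generous `limit` and tighten it after watching the logs.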
Troubleshooting Flowchart
- Add rules to robots.txt
- Test: curl -I -A "PerplexityBot" https://yoursite.com/robots.txt and expect a 200 status line
- Observe logs for Perplexity-User within 12 hours
- Burst >12 req/s? Enable 429 gating
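The burst check in the last step can be estimated from the access log. This sketch assumes combined-log timestamps such as `[10/Jun/2025:10:00:01 +0000]` and an English locale for month names; `peak_requests_per_second` is an illustrative helper, not an existing tool.

```python
import re
from collections import Counter
from datetime import datetime

# Assumption: timestamps look like [10/Jun/2025:10:00:01 +0000].
TIMESTAMP = re.compile(r"\[([^\]]+)\]")
TIME_FORMAT = "%d/%b/%Y:%H:%M:%S %z"


def peak_requests_per_second(lines, agent="Perplexity-User"):
    """Return the highest number of matching log lines seen in any one second."""
    per_second = Counter()
    for line in lines:
        if agent not in line:
            continue
        match = TIMESTAMP.search(line)
        if match:
            per_second[datetime.strptime(match.group(1), TIME_FORMAT)] += 1
    return max(per_second.values(), default=0)
```

If the returned peak exceeds the threshold you chose (12 req/s in the flow above), that is the signal to enable 429 gating.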
Schema & Content Optimisation
Product / Article JSON-LD Essentials
Perplexity lifts price, rating, and description into its answer cards directly from Product schema. Keep JSON-LD under 32 KB with aggregateRating for maximum citation potential:
{
  "@context": "https://schema.org",
  "@type": "Product",
  "name": "Zero-Waste Bamboo Toothbrush",
  "sku": "BAMB-TB-01",
  "offers": {
    "@type": "Offer",
    "price": "6.00",
    "priceCurrency": "USD"
  },
  "aggregateRating": {
    "@type": "AggregateRating",
    "ratingValue": "4.7",
    "reviewCount": "214"
  }
}
Attach Article plus FAQPage schemas. Perplexity quotes FAQ answers verbatim and back-links the source, which makes them prime real estate for your messaging.
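One way to produce the FAQPage markup is to generate it from your existing question and answer pairs and check the serialized size against the 32 KB guideline mentioned earlier. The helper names below are hypothetical, and the structure follows the public schema.org FAQPage vocabulary rather than anything Perplexity-specific.

```python
import json

MAX_JSONLD_BYTES = 32 * 1024  # the 32 KB ceiling suggested in this section


def build_faq_jsonld(pairs):
    """Build a schema.org FAQPage dict from (question, answer) tuples."""
    return {
        "@context": "https://schema.org",
        "@type": "FAQPage",
        "mainEntity": [
            {
                "@type": "Question",
                "name": question,
                "acceptedAnswer": {"@type": "Answer", "text": answer},
            }
            for question, answer in pairs
        ],
    }


def jsonld_size_ok(document, limit=MAX_JSONLD_BYTES):
    """Return True when the serialized JSON-LD fits within `limit` bytes."""
    return len(json.dumps(document).encode("utf-8")) <= limit
```

The resulting dict can be passed to `json.dumps` and embedded in a `<script type="application/ld+json">` tag alongside the Product block.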
Cross-Industry Mini Cases
| Sector | Quick Win | Result |
|---|---|---|
| Retail / D2C | Add live stock schema | 7% lift in Perplexity-referral sessions (GWA client, Q2 2025) |
| B2B SaaS | Allow docs, set 429 on >10 req/s | 33% less bot bandwidth, citations intact |
| Healthcare | Include peer-review citations | Aligns with June 2025 core update trust focus |
Risk & Compliance
Bandwidth: Use 429 + Retry-After; the bot ignores crawl-delay.
IP Ranges: Check Perplexity's crawler documentation for any published ranges; user-agent strings can be spoofed, so UA filtering alone is a weak control.
Licensing: Perplexity revenue-shares with publishers but faces ongoing copyright lawsuits. Clarify AI-training permissions in your Terms.
Privacy: The bot will not log in or solve CAPTCHAs; gate sensitive data accordingly.
Implementation Checklist
- Back up robots.txt
- Add allow block for both Perplexity agents
- Test with curl -A "PerplexityBot"
- Monitor logs for bursts
- Create GA4 filter for utm_source=perplexity.ai
- Audit schema coverage
- Review server load after 14 days
- Update SOP
- Schedule quarterly crawl audit
- Book an expert SEO Audit to benchmark AI readiness
FAQs
What is PerplexityBot?
PerplexityBot is Perplexity AI's main web crawler, collecting public pages for model training and answer generation.
Is PerplexityBot safe for SEO?
Yes—it obeys robots.txt Allow/Disallow but ignores crawl-delay. Blocking it does not affect Google rankings.
How do I block PerplexityBot?
Add User-agent: PerplexityBot plus Disallow: /. Consider allowing Perplexity-User if you still want real-time answer visibility.
Does Perplexity respect crawl-delay?
No. Use HTTP 429 and CDN rate limits instead.
Where can I see PerplexityBot traffic?
Filter logs for its user-agent or create a GA4 dimension for utm_source=perplexity.ai.
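Outside GA4, the same check can be run against landing URLs in your own logs or exports. This is a small sketch that assumes referrals carry the `utm_source=perplexity.ai` parameter named in this article; `is_perplexity_referral` is an illustrative helper.

```python
from urllib.parse import parse_qs, urlparse


def is_perplexity_referral(url: str) -> bool:
    """Return True when the landing URL carries utm_source=perplexity.ai."""
    params = parse_qs(urlparse(url).query)
    return "perplexity.ai" in params.get("utm_source", [])
```

Applying it to each logged landing URL gives a session count you can compare against the GA4 figure.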
Is Your Site Ready for AI Search?
Configuring robots.txt is just one piece of the puzzle. Our AI Search Optimization service ensures your site is structured, cited, and visible across Google AI Overviews, ChatGPT, Perplexity, and Gemini.
Next Steps
Ready to harness AI crawler traffic? Start with a comprehensive SEO Audit—we benchmark crawl health, schema depth, and Perplexity eligibility in two weeks. Need magnetic content? Our SEO programs turn data into demand.