TL;DR
Add explicit robots.txt rules for Amazonbot so your products can appear in Amazon's Project Starfish product graph and Rufus shopping chats. The bot does not respect crawl-delay; throttle spikes with HTTP 429 responses or CDN rate caps instead. Use the snippet below, enrich product pages with compliant schema, and track new sessions via utm_source=amazon.ai.
Why Amazonbot Matters to Revenue
Amazon's AI initiative, Project Starfish, aims to build “the world's product truth” and could reportedly add $7.5 billion in GMV during 2025. Starfish ingests public product detail pages (PDPs) via Amazonbot, synthesising specs for auto-generated listings and for Rufus, Amazon's generative shopping assistant. Adobe data shows generative-AI retail traffic jumped 1,300% year over year during the 2024 holiday season.
2025 Adoption & Traffic Stats
| Metric | 2024 → 2025 | Why It Matters |
|---|---|---|
| Daily Amazonbot requests (top 5K sites) | +340% | AI shopping demand surging |
| Sites allowing Amazonbot | 42% of top-10K | First movers dominate Rufus cards |
| Crawl-delay support | None | Use 429 gating instead |
Meet Amazon's Crawlers
Amazonbot vs Amazonbot-Images vs Starfish Collectors
| Purpose | User-agent | Behaviour | Robots.txt Support |
|---|---|---|---|
| Standard fetch & AI graph | Amazonbot | Wide, steady crawl | Allow/Disallow only |
| Image refresh | Amazonbot/1.0 | Burst on media folders | Allow/Disallow only |
| Starfish enrichment | Amazonbot-Starfish* | Episodic, millions of hits | Same (rumoured) |
*Starfish token spotted in private log dumps (unconfirmed).
How to Spot Them in Logs
```shell
grep -E "Amazonbot" access.log | awk '{print $1,$12}' | head
```
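For more than a quick peek, the same check can be scripted. The sketch below counts Amazonbot requests per client IP, assuming the common combined log format (client IP is the first field, user-agent is the last quoted field); the sample lines and function name are illustrative, not part of any Amazon tooling.

```python
from collections import Counter

# Sample lines in combined log format; in practice, read these from access.log.
LOG_LINES = [
    '10.0.0.1 - - [12/May/2025:10:00:01 +0000] "GET /p/mat HTTP/1.1" 200 512 "-" "Mozilla/5.0 (compatible; Amazonbot/0.1; +https://developer.amazon.com/support/amazonbot)"',
    '10.0.0.2 - - [12/May/2025:10:00:02 +0000] "GET /p/mat HTTP/1.1" 200 512 "-" "Mozilla/5.0"',
    '10.0.0.1 - - [12/May/2025:10:00:03 +0000] "GET /img/a.jpg HTTP/1.1" 200 99 "-" "Mozilla/5.0 (compatible; Amazonbot/0.1; +https://developer.amazon.com/support/amazonbot)"',
]

def amazonbot_hits_by_ip(lines):
    """Count requests per client IP where the user-agent mentions Amazonbot."""
    counts = Counter()
    for line in lines:
        if "Amazonbot" in line:
            ip = line.split()[0]  # first field of a combined-format line
            counts[ip] += 1
    return counts

print(amazonbot_hits_by_ip(LOG_LINES))  # Counter({'10.0.0.1': 2})
```

Feed the counts into your monitoring to decide whether the burst-control thresholds discussed below are being hit.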
Robots.txt Configuration
Quick-Start Allow / Disallow Blocks
```
# — Amazon crawlers —
User-agent: Amazonbot
Allow: /

# Optional: block training but allow images
# User-agent: Amazonbot
# Disallow: /
# User-agent: Amazonbot/1.0
# Allow: /images/
```
Place this block above any `User-agent: *` wildcard group.
Crawl-delay Reality & Burst Control
Amazonbot ignores crawl-delay. Instead, return HTTP 429 with a Retry-After header when its requests exceed roughly 12 req/s, or set CDN rules capping Amazonbot to 8 req/s per IP.
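The 429 gating described above can be sketched as a sliding-window counter. This is a minimal, server-agnostic illustration: `BurstGate` is a hypothetical name, the 12 req/s threshold mirrors the article, and a real deployment would wire `allow()` into your web server or CDN edge logic.

```python
import time
from collections import deque

class BurstGate:
    """Allow or reject requests using a one-second sliding window.

    When allow() returns False, the caller should respond with HTTP 429
    and a Retry-After header rather than serving the page.
    """

    def __init__(self, max_per_second=12):
        self.max_per_second = max_per_second
        self.stamps = deque()  # timestamps of requests inside the window

    def allow(self, now=None):
        now = time.monotonic() if now is None else now
        # Drop timestamps that have aged out of the one-second window.
        while self.stamps and now - self.stamps[0] >= 1.0:
            self.stamps.popleft()
        if len(self.stamps) < self.max_per_second:
            self.stamps.append(now)
            return True
        return False  # over budget: send 429 + Retry-After

gate = BurstGate(max_per_second=3)
results = [gate.allow(now=t) for t in (0.0, 0.1, 0.2, 0.3, 1.25)]
print(results)  # [True, True, True, False, True]
```

The fourth request is rejected because three requests already sit inside the window; by 1.25 s they have aged out and the gate reopens, which is exactly the behaviour you want a well-behaved crawler to learn from.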
Troubleshooting Flowchart
- Add rules to robots.txt
- Test: `curl -A "Amazonbot" https://yoursite.com/robots.txt` and expect a 200 response
- Observe logs over 24 hours
- Spikes >12 req/s? Enable 429 gating
Schema & Content Optimisation
Product / Article JSON-LD Essentials
Starfish ingests price, brand, and material fields to auto-generate listing copy. Include sustainability attributes for competitive advantage:
```json
{
  "@context": "https://schema.org",
  "@type": "Product",
  "name": "Forest-Cycle Yoga Mat",
  "sku": "FCYM-001",
  "brand": {"@type": "Brand", "name": "Forest-Cycle"},
  "material": "Reclaimed rubber",
  "offers": {
    "@type": "Offer",
    "price": "42.00",
    "priceCurrency": "USD",
    "availability": "https://schema.org/InStock"
  },
  "aggregateRating": {
    "@type": "AggregateRating",
    "ratingValue": "4.9",
    "reviewCount": "389"
  }
}
```
Add Article plus FAQPage schemas as well; Rufus cites FAQ answers verbatim in shopping conversations.
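Field coverage like this is easy to audit mechanically before deployment. The sketch below checks a Product JSON-LD blob for the fields the article says Starfish ingests; the required-field list is an assumption drawn from this article, not a published Amazon requirement.

```python
import json

# A Product JSON-LD payload, abbreviated from the example above.
PRODUCT_JSONLD = """
{
  "@context": "https://schema.org",
  "@type": "Product",
  "name": "Forest-Cycle Yoga Mat",
  "brand": {"@type": "Brand", "name": "Forest-Cycle"},
  "material": "Reclaimed rubber",
  "offers": {"@type": "Offer", "price": "42.00", "priceCurrency": "USD"}
}
"""

# Fields this article identifies as Starfish inputs; treat the list as an assumption.
REQUIRED = ["name", "brand", "material", "offers"]

def missing_product_fields(raw):
    """Return the required top-level fields absent from a JSON-LD string."""
    data = json.loads(raw)
    return [field for field in REQUIRED if field not in data]

print(missing_product_fields(PRODUCT_JSONLD))  # []
```

Run a check like this in CI across all PDP templates so a schema regression never silently drops a field the crawler is looking for.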
Cross-Industry Mini Cases
| Sector | Quick Win | Result |
|---|---|---|
| D2C Retail | Add variant & eco-attribute schema | 8.2% lift in Amazon-referral sessions (GWA client, Q2 2025) |
| B2B Brands | Expose technical PDFs, block /dealer-portal | Gained 1,200 new ASIN cross-links |
| Healthcare | FDA citations & HIPAA note | Meets E-E-A-T and Amazon compliance |
Risk, Compliance & Licensing
- Bandwidth: throttle via HTTP 429; Amazonbot has no crawl-delay support.
- IP ranges: Amazon lists its blocks at developer.amazon.com/amazonbot.
- Licensing: publishers such as Dow Jones have sued over scraping; state your AI-training terms clearly in /terms.
- Privacy: Amazonbot will not solve CAPTCHAs; gate B2B portals accordingly.
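Since Amazon publishes its crawler IP ranges, requests claiming to be Amazonbot can be verified before you trust the user-agent string. The sketch below checks a client IP against a list of CIDR blocks using Python's standard `ipaddress` module; the CIDRs shown are placeholders, so fetch the real ranges from Amazon's published page before using this in production.

```python
import ipaddress

# Placeholder CIDRs for illustration only; replace with the ranges Amazon
# publishes at developer.amazon.com/amazonbot.
AMAZONBOT_CIDRS = ["12.34.56.0/24", "2600:1f00::/24"]

NETWORKS = [ipaddress.ip_network(cidr) for cidr in AMAZONBOT_CIDRS]

def is_listed_amazonbot_ip(ip):
    """True if ip falls inside one of the (here: placeholder) listed ranges."""
    addr = ipaddress.ip_address(ip)
    # Membership across IPv4/IPv6 versions simply evaluates False.
    return any(addr in net for net in NETWORKS)

print(is_listed_amazonbot_ip("12.34.56.78"))   # True
print(is_listed_amazonbot_ip("203.0.113.9"))   # False
```

A request with an Amazonbot user-agent from an unlisted IP is a spoofer and can be blocked outright, which also keeps your bandwidth accounting honest.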
Implementation Checklist
- Backup robots.txt
- Insert allow/disallow block
- Test with `curl -A "Amazonbot"`
- Monitor logs for bursts
- Create GA4 filter for `utm_source=amazon.ai`
- Audit schema coverage
- Review server load after 14 days
- Update SOP
- Schedule quarterly audit
- Book an expert SEO Audit to benchmark AI readiness
FAQs
What is Amazonbot?
Amazonbot is Amazon's web crawler that gathers public pages for Alexa answers, Project Starfish product data, and Rufus shopping chat.
Does allowing Amazonbot hurt SEO?
No. Googlebot follows separate rules. Amazonbot only affects Amazon's datasets.
How do I block Amazonbot?
Add User-agent: Amazonbot plus Disallow: / in robots.txt, then verify with curl.
Does Amazonbot respect crawl-delay?
No. Use HTTP 429 or CDN limits instead.
Where can I see Amazonbot traffic?
Filter logs for its user-agent or track utm_source=amazon.ai in GA4.
Is Your Site Ready for AI Search?
Configuring robots.txt is just one piece of the puzzle. Our AI Search Optimization service ensures your site is structured, cited, and visible across Google AI Overviews, ChatGPT, Perplexity, and Gemini.
Get a Free AI Search Audit

Next Steps
Ready to capitalise on Amazon AI traffic? Start with a holistic SEO Audit: our technicians benchmark crawl health, schema depth, and Amazonbot readiness in two weeks. Need product copy that converts? Our data-driven SEO programs turn insight into demand.