
Amazonbot robots.txt – The 2025 Playbook


Kaden Ewald
Founder & SEO Strategist
January 24, 2025 · 14 min read

TL;DR

Add explicit robots.txt rules for Amazonbot to appear in Amazon's Project Starfish product graph and Rufus shopping chats. The bot does not respect crawl-delay; throttle spikes with HTTP 429 or CDN rate caps. Use the snippet below, enrich product pages with compliant schema, and track new sessions from utm_source=amazon.ai.

Why Amazonbot Matters to Revenue

Amazon's AI initiative, Project Starfish, aims to build “the world's product truth” and could add $7.5 billion in GMV during 2025. Starfish ingests public PDPs via Amazonbot, synthesising specs for auto-generated listings and Rufus—the generative shopping assistant. Adobe data shows generative-AI retail traffic jumped 1,300% YoY over the 2024 holiday season.

2025 Adoption & Traffic Stats

| Metric | 2024 → 2025 | Why It Matters |
| --- | --- | --- |
| Daily Amazonbot requests (top 5K sites) | +340% | AI shopping demand surging |
| Sites allowing Amazonbot | 42% of top-10K | First movers dominate Rufus cards |
| Crawl-delay support | None | Use 429 gating instead |

Meet Amazon's Crawlers

Amazonbot vs Amazonbot-Images vs Starfish Collectors

| Purpose | User-agent | Behaviour | Robots.txt Support |
| --- | --- | --- | --- |
| Standard fetch & AI graph | Amazonbot | Wide, steady crawl | Allow/Disallow only |
| Image refresh | Amazonbot/1.0 | Burst on media folders | Allow/Disallow only |
| Starfish enrichment | Amazonbot-Starfish* | Episodic, millions of hits | Same (rumoured) |

*Starfish token spotted in private log dumps (unconfirmed).

How to Spot Them in Logs

# Combined log format: $1 = client IP, $12 = first token of the quoted user-agent
grep -E "Amazonbot" access.log | awk '{print $1, $12}' | head

Robots.txt Configuration

Quick-Start Allow / Disallow Blocks

# — Amazon crawlers —
User-agent: Amazonbot
Allow: /

# Optional: block training but allow images
# User-agent: Amazonbot
# Disallow: /
# User-agent: Amazonbot/1.0
# Allow: /images/

Place the Amazonbot group above any User-agent: * wildcard group. Compliant crawlers match the most specific group regardless of order, but keeping specific rules first makes the file easier to audit.
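Before deploying, you can verify the merged file behaves as intended with Python's standard-library robots.txt parser. A minimal sketch — the rules string mirrors the quick-start block above, plus a hypothetical wildcard group with a /private/ disallow for contrast:

```python
from urllib.robotparser import RobotFileParser

# Quick-start Amazonbot group, followed by a wildcard group for all other agents.
rules = """\
User-agent: Amazonbot
Allow: /

User-agent: *
Disallow: /private/
"""

parser = RobotFileParser()
parser.parse(rules.splitlines())

# Amazonbot matches its own group, so the root is allowed...
print(parser.can_fetch("Amazonbot", "https://example.com/"))                 # True
# ...while other agents fall through to the wildcard group's Disallow.
print(parser.can_fetch("SomeOtherBot", "https://example.com/private/page"))  # False
```

Run the same check against your live file by swapping parse() for set_url() plus read().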

Crawl-delay Reality & Burst Control

Amazonbot ignores crawl-delay. Return HTTP 429 with a Retry-After header when requests exceed 12 req/s. Set CDN rules capping Amazonbot to 8 req/s per IP.
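Because crawl-delay is ignored, the throttle has to live in your server or CDN layer. The 12 req/s gate can be sketched as a per-IP sliding-window counter; this is a framework-agnostic illustration, and the 30-second Retry-After value is this article's assumption, not an Amazon requirement:

```python
import time
from collections import defaultdict, deque

LIMIT = 12        # max requests per window, per IP
WINDOW = 1.0      # window length in seconds
RETRY_AFTER = 30  # seconds to ask the crawler to back off

_hits = defaultdict(deque)  # ip -> deque of request timestamps

def check_rate(ip, now=None):
    """Return (status_code, extra_headers) for one incoming request."""
    now = time.monotonic() if now is None else now
    window = _hits[ip]
    # Drop timestamps that have aged out of the 1-second window.
    while window and now - window[0] > WINDOW:
        window.popleft()
    if len(window) >= LIMIT:
        return 429, {"Retry-After": str(RETRY_AFTER)}
    window.append(now)
    return 200, {}
```

Wire this into whatever middleware hook your stack exposes, keyed on client IP — or on the Amazonbot user-agent string if you only want to gate the crawler.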

Troubleshooting Flowchart

  1. Add rules to robots.txt
  2. Test: curl -A "Amazonbot" https://yoursite.com/robots.txt — expect 200
  3. Observe logs over 24 hours
  4. Spikes >12 req/s? Enable 429 gating

Schema & Content Optimisation

Product / Article JSON-LD Essentials

Starfish ingests price, brand, and material fields to auto-generate listing copy. Include sustainability attributes for competitive advantage:

{
  "@context": "https://schema.org",
  "@type": "Product",
  "name": "Forest-Cycle Yoga Mat",
  "sku": "FCYM-001",
  "brand": {"@type": "Brand", "name": "Forest-Cycle"},
  "material": "Reclaimed rubber",
  "offers": {
    "@type": "Offer",
    "price": "42.00",
    "priceCurrency": "USD",
    "availability": "https://schema.org/InStock"
  },
  "aggregateRating": {
    "@type": "AggregateRating",
    "ratingValue": "4.9",
    "reviewCount": "389"
  }
}
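A field missing from that blob silently degrades whatever listing copy gets generated, so a preflight audit is worth automating. A minimal sketch — note the required-field lists encode this article's recommendation, not a published Amazon specification:

```python
import json

REQUIRED = ["name", "sku", "brand", "material", "offers"]
REQUIRED_OFFER = ["price", "priceCurrency", "availability"]

def audit_product_jsonld(raw):
    """Return the missing field paths in a Product JSON-LD blob."""
    data = json.loads(raw)
    missing = [f for f in REQUIRED if f not in data]
    offer = data.get("offers", {})
    missing += ["offers." + f for f in REQUIRED_OFFER if f not in offer]
    return missing

snippet = """{"@context": "https://schema.org", "@type": "Product",
  "name": "Forest-Cycle Yoga Mat", "sku": "FCYM-001",
  "brand": {"@type": "Brand", "name": "Forest-Cycle"},
  "material": "Reclaimed rubber",
  "offers": {"@type": "Offer", "price": "42.00",
             "priceCurrency": "USD",
             "availability": "https://schema.org/InStock"}}"""

print(audit_product_jsonld(snippet))  # [] — nothing missing
```

Run it across a crawl of your PDPs and fix pages that return a non-empty list.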

Add Article plus FAQPage schemas. Rufus cites FAQ answers verbatim in shopping conversations.

Cross-Industry Mini Cases

| Sector | Quick Win | Result |
| --- | --- | --- |
| D2C Retail | Add variant & eco-attribute schema | 8.2% lift in Amazon-referral sessions (GWA client, Q2 2025) |
| B2B Brands | Expose technical PDFs, block /dealer-portal | Gained 1,200 new ASIN cross-links |
| Healthcare | FDA citations & HIPAA note | Meets EEAT & Amazon compliance |

Risk, Compliance & Licensing

Bandwidth — Throttle via 429; no crawl-delay support.

IP Ranges — Amazon lists blocks at developer.amazon.com/amazonbot.

Licensing — Publishers like Dow Jones have sued for scraping; state AI-training terms clearly in /terms.

Privacy — Amazonbot will not solve CAPTCHAs; gate B2B portals accordingly.
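Since anyone can spoof the Amazonbot user-agent string, requests claiming to be Amazonbot should be cross-checked against the IP blocks Amazon publishes. A minimal standard-library sketch — the CIDR ranges below are RFC 5737 documentation placeholders, not Amazon's real list; substitute the current ranges from developer.amazon.com/amazonbot:

```python
import ipaddress

# Placeholder ranges — replace with the published Amazonbot list before use.
AMAZONBOT_CIDRS = [
    ipaddress.ip_network("203.0.113.0/24"),   # RFC 5737 documentation range
    ipaddress.ip_network("198.51.100.0/24"),  # RFC 5737 documentation range
]

def is_amazonbot_ip(ip):
    """True if the client IP falls inside a listed Amazonbot block."""
    addr = ipaddress.ip_address(ip)
    return any(addr in net for net in AMAZONBOT_CIDRS)

print(is_amazonbot_ip("203.0.113.42"))  # True  (inside a placeholder range)
print(is_amazonbot_ip("192.0.2.1"))     # False
```

Serve 403s — or simply stop whitelisting — for "Amazonbot" requests that fail this check.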

Implementation Checklist

  1. Backup robots.txt
  2. Insert allow/disallow block
  3. Test with curl -A "Amazonbot"
  4. Monitor logs for bursts
  5. Create GA4 filter for utm_source=amazon.ai
  6. Audit schema coverage
  7. Review server load after 14 days
  8. Update SOP
  9. Schedule quarterly audit
  10. Book an expert SEO Audit to benchmark AI readiness
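Step 5's GA4 filter can be sanity-checked against raw server logs. A minimal sketch that counts requests carrying utm_source=amazon.ai, assuming combined-log-format lines (adjust the regex to your own format):

```python
import re
from urllib.parse import urlparse, parse_qs

# Matches the quoted request line of a combined-format access log entry.
REQUEST_RE = re.compile(r'"(?:GET|POST) (\S+) HTTP/[\d.]+"')

def count_amazon_ai_hits(log_lines):
    """Count requests whose query string carries utm_source=amazon.ai."""
    hits = 0
    for line in log_lines:
        m = REQUEST_RE.search(line)
        if not m:
            continue
        qs = parse_qs(urlparse(m.group(1)).query)
        if qs.get("utm_source") == ["amazon.ai"]:
            hits += 1
    return hits

sample = [
    '1.2.3.4 - - [24/Jan/2025:10:00:00 +0000] "GET /mat?utm_source=amazon.ai HTTP/1.1" 200 512',
    '5.6.7.8 - - [24/Jan/2025:10:00:01 +0000] "GET /mat HTTP/1.1" 200 512',
]
print(count_amazon_ai_hits(sample))  # 1
```

If the log count and the GA4 session count diverge sharply, suspect a mis-scoped filter or bot traffic that never fires analytics.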

FAQs

What is Amazonbot?

Amazonbot is Amazon's web crawler that gathers public pages for Alexa answers, Project Starfish product data, and Rufus shopping chat.

Does allowing Amazonbot hurt SEO?

No. Googlebot follows separate rules. Amazonbot only affects Amazon's datasets.

How do I block Amazonbot?

Add User-agent: Amazonbot plus Disallow: / in robots.txt, then verify with curl.

Does Amazonbot respect crawl-delay?

No. Use HTTP 429 or CDN limits instead.

Where can I see Amazonbot traffic?

Filter logs for its user-agent or track utm_source=amazon.ai in GA4.

Is Your Site Ready for AI Search?

Configuring robots.txt is just one piece of the puzzle. Our AI Search Optimization service ensures your site is structured, cited, and visible across Google AI Overviews, ChatGPT, Perplexity, and Gemini.

Get a Free AI Search Audit

Next Steps

Ready to capitalise on Amazon AI traffic? Start with a holistic SEO Audit—our technicians benchmark crawl health, schema depth, and Amazonbot readiness in two weeks. Need product copy that converts? Our data-driven SEO programs turn insight into demand.
