# As a condition of accessing this website, you agree to abide by the following
# content signals:

# (a) If a Content-Signal = yes, you may collect content for the corresponding
# use.
# (b) If a Content-Signal = no, you may not collect content for the
# corresponding use.
# (c) If the website operator does not include a Content-Signal for a
# corresponding use, the website operator neither grants nor restricts
# permission via Content-Signal with respect to the corresponding use.

# The content signals and their meanings are:

# search: building a search index and providing search results (e.g., returning
# hyperlinks and short excerpts from your website's contents). Search does not
# include providing AI-generated search summaries.
# ai-input: inputting content into one or more AI models (e.g., retrieval
# augmented generation, grounding, or other real-time taking of content for
# generative AI search answers).
# ai-train: training or fine-tuning AI models.

# ANY RESTRICTIONS EXPRESSED VIA CONTENT SIGNALS ARE EXPRESS RESERVATIONS OF
# RIGHTS UNDER ARTICLE 4 OF THE EUROPEAN UNION DIRECTIVE 2019/790 ON COPYRIGHT
# AND RELATED RIGHTS IN THE DIGITAL SINGLE MARKET.

# BEGIN Cloudflare Managed content

User-agent: *
Content-Signal: search=yes,ai-train=no
Allow: /

User-agent: Amazonbot
Disallow: /

User-agent: Applebot-Extended
Disallow: /

User-agent: Bytespider
Disallow: /

User-agent: CCBot
Disallow: /

User-agent: ClaudeBot
Disallow: /

User-agent: CloudflareBrowserRenderingCrawler
Disallow: /

User-agent: Google-Extended
Disallow: /

User-agent: GPTBot
Disallow: /

User-agent: meta-externalagent
Disallow: /

# END Cloudflare Managed Content

# ============================================================
# robots.txt - diddydesign.com
# Last updated: May 2026
# Strategy: Maximum AI search visibility (AEO) + block scrapers
# Public design canonicals live under /u/ and are allowed by the global Allow rules.
# /preview/ is reserved for private noindex previews and remains blocked.
# ============================================================

# ============================================================
# SEARCH ENGINES - full access to public pages
# ============================================================

User-agent: Googlebot
Allow: /
Disallow: /preview/
Disallow: /auth
Disallow: /reset-password
Disallow: /admin
Disallow: /admindashboard
Disallow: /team/
Disallow: /workspace/
Disallow: /billing/return

User-agent: Bingbot
Allow: /
Disallow: /preview/
Disallow: /auth
Disallow: /reset-password
Disallow: /admin
Disallow: /admindashboard
Disallow: /team/
Disallow: /workspace/
Disallow: /billing/return

User-agent: DuckDuckBot
Allow: /
Disallow: /preview/
Disallow: /auth
Disallow: /admin
Disallow: /workspace/
Disallow: /billing/return

User-agent: Slurp
Allow: /
Disallow: /preview/
Disallow: /auth
Disallow: /admin
Disallow: /workspace/
Disallow: /billing/return

User-agent: Applebot
Allow: /
Disallow: /preview/
Disallow: /auth
Disallow: /admin
Disallow: /workspace/
Disallow: /billing/return

User-agent: PetalBot
Allow: /
Disallow: /preview/
Disallow: /auth
Disallow: /admin
Disallow: /workspace/
Disallow: /billing/return

User-agent: Yandex
Allow: /
Disallow: /preview/
Disallow: /auth
Disallow: /admin
Disallow: /workspace/
Disallow: /billing/return

User-agent: Baiduspider
Allow: /
Disallow: /preview/
Disallow: /auth
Disallow: /admin
Disallow: /workspace/
Disallow: /billing/return

# ============================================================
# AI SEARCH & INFERENCE BOTS
# Allow: These bots cite your pages in AI answers (AEO gold)
# Perplexity, ChatGPT Search, Claude, Gemini, Copilot, You.com
# ============================================================

# OpenAI - ChatGPT search results (NOT training)
User-agent: OAI-SearchBot
Allow: /
Disallow: /preview/
Disallow: /auth
Disallow: /admin
Disallow: /workspace/
Disallow: /billing/return

# OpenAI - when a ChatGPT user browses your page
User-agent: ChatGPT-User
Allow: /
Disallow: /preview/
Disallow: /auth
Disallow: /admin
Disallow: /workspace/
Disallow: /billing/return

# Perplexity - search citations
User-agent: PerplexityBot
Allow: /
Disallow: /preview/
Disallow: /auth
Disallow: /admin
Disallow: /workspace/
Disallow: /billing/return

# Perplexity - when a Perplexity user browses your page
User-agent: Perplexity-User
Allow: /
Disallow: /preview/
Disallow: /auth
Disallow: /admin
Disallow: /workspace/
Disallow: /billing/return

# Anthropic - ClaudeBot (per Anthropic, its training-data crawler; Claude-SearchBot handles search)
User-agent: ClaudeBot
Allow: /
Disallow: /preview/
Disallow: /auth
Disallow: /admin
Disallow: /workspace/
Disallow: /billing/return

# Anthropic - Claude.ai browsing
User-agent: Claude-User
Allow: /
Disallow: /preview/
Disallow: /auth
Disallow: /admin
Disallow: /workspace/
Disallow: /billing/return

# Google - Gemini model training and grounding
# Google-Extended does not affect inclusion or ranking in Google Search
User-agent: Google-Extended
Allow: /
Disallow: /preview/
Disallow: /auth
Disallow: /admin
Disallow: /workspace/
Disallow: /billing/return

# Apple - Apple Intelligence answers
User-agent: Applebot-Extended
Allow: /
Disallow: /preview/
Disallow: /auth
Disallow: /admin
Disallow: /workspace/
Disallow: /billing/return

# Microsoft Copilot - AI search (Bing AI, Edge sidebar)
User-agent: Bingbot
Allow: /

# You.com AI search
User-agent: YouBot
Allow: /
Disallow: /preview/
Disallow: /auth
Disallow: /admin
Disallow: /workspace/
Disallow: /billing/return

# Cohere AI - used for enterprise AI search
User-agent: cohere-ai
Allow: /
Disallow: /preview/
Disallow: /auth
Disallow: /admin
Disallow: /workspace/
Disallow: /billing/return

# Meta AI search (WhatsApp, Instagram, Facebook AI answers)
User-agent: Meta-ExternalAgent
Allow: /
Disallow: /preview/
Disallow: /auth
Disallow: /admin
Disallow: /workspace/
Disallow: /billing/return

# Grok (xAI) - X/Twitter AI search
User-agent: Grok
Allow: /
Disallow: /preview/
Disallow: /auth
Disallow: /admin
Disallow: /workspace/
Disallow: /billing/return

# Amazon Alexa / Rufus AI
User-agent: Amazonbot
Allow: /
Disallow: /preview/
Disallow: /auth
Disallow: /admin
Disallow: /workspace/
Disallow: /billing/return

# ============================================================
# AI TRAINING-ONLY BOTS
# These harvest your content for model training.
# They do not cite you or send traffic.
# We block them to protect our content.
# ============================================================

# OpenAI training crawler (NOT search) - block to protect content
# Note: OAI-SearchBot above (search) is still allowed
User-agent: GPTBot
Disallow: /

# Common Crawl - large-scale dataset scraper, no referral traffic
User-agent: CCBot
Disallow: /

# ByteDance (TikTok) - aggressive scraper, minimal referral value
User-agent: Bytespider
Disallow: /

# Diffbot - scrapes structured data, no meaningful traffic return
User-agent: Diffbot
Disallow: /

# AI2 (Allen Institute) - research training data only
User-agent: ai2-crawler
Disallow: /

# Omgili - scrapes for datasets
User-agent: omgili
Disallow: /

# DataForSEO - SEO tool scraper (not a search engine)
User-agent: DataForSEOBot
Disallow: /

# PetalBot - rate-limit crawling (access rules are in the search-engine section above)
User-agent: PetalBot
Crawl-delay: 10

# ============================================================
# SEO AUDIT TOOLS
# Allow for your own audits; they consume crawl budget
# when crawling excessively, so rate-limit them.
# ============================================================

User-agent: SemrushBot
Allow: /
Crawl-delay: 10

User-agent: AhrefsBot
Allow: /
Crawl-delay: 10

User-agent: MJ12bot
Crawl-delay: 30

User-agent: DotBot
Crawl-delay: 30

User-agent: BLEXBot
Crawl-delay: 30

# ============================================================
# SOCIAL MEDIA CRAWLERS
# Allow: These generate link previews on sharing
# (Twitter cards, Facebook OG, LinkedIn previews)
# ============================================================

User-agent: Twitterbot
Allow: /
Disallow: /preview/
Disallow: /auth
Disallow: /admin
Disallow: /workspace/

User-agent: facebookexternalhit
Allow: /
Disallow: /preview/
Disallow: /auth
Disallow: /admin
Disallow: /workspace/

User-agent: LinkedInBot
Allow: /
Disallow: /preview/
Disallow: /auth
Disallow: /admin
Disallow: /workspace/

User-agent: WhatsApp
Allow: /
Disallow: /preview/
Disallow: /auth
Disallow: /admin
Disallow: /workspace/

User-agent: Slackbot
Allow: /
Disallow: /preview/
Disallow: /auth
Disallow: /admin
Disallow: /workspace/

User-agent: Discordbot
Allow: /
Disallow: /preview/
Disallow: /auth
Disallow: /admin
Disallow: /workspace/

User-agent: TelegramBot
Allow: /
Disallow: /preview/
Disallow: /auth
Disallow: /admin
Disallow: /workspace/

# ============================================================
# ARCHIVE
# ============================================================

# Internet Archive Wayback Machine - good for credibility
User-agent: ia_archiver
Allow: /
Disallow: /preview/
Disallow: /auth
Disallow: /admin
Disallow: /workspace/

# ============================================================
# CATCH-ALL - default rules for any crawler not listed above
# ============================================================

User-agent: *
Allow: /
Disallow: /preview/
Disallow: /auth
Disallow: /reset-password
Disallow: /admin
Disallow: /admindashboard
Disallow: /team/
Disallow: /workspace/
Disallow: /billing/return
Disallow: /api/

# ============================================================
# SITEMAPS - sitemap.xml is the main index for Astro static pages and the Cloudflare dynamic showcase sitemap.
# ============================================================

Sitemap: https://diddydesign.com/sitemap.xml