# =========================================================
# robots.txt for https://www.protocols.io
# Purpose:
#   - Allow discovery of public scientific content
#   - Protect private, authenticated, and system areas
#   - Provide explicit guidance to AI crawlers
# =========================================================

# -------------------------
# Default rule (all crawlers)
# -------------------------
User-agent: *
Disallow: /private/
Disallow: /blind/
Disallow: /api/
Disallow: /download
Disallow: /pubchase
Disallow: /spectro
Disallow: /neb
Disallow: /career/
Disallow: /essays
Disallow: /editorials
Disallow: /test
Disallow: /flux

# -------------------------
# AI crawlers
# Note: a crawler that matches one of these groups follows
# only that group, not the default rules above.
# -------------------------
User-agent: GPTBot
Disallow: /private/
Disallow: /api/

User-agent: anthropic-ai
Disallow: /private/
Disallow: /api/

User-agent: CCBot
Disallow: /private/
Disallow: /api/

# -------------------------
# Sitemap
# -------------------------
Sitemap: https://www.protocols.io/sitemaps/sitemap.xml