# BCC Research robots.txt: AI-friendly with guardrails

# --- Global defaults (keep existing hygiene) ---
User-agent: *
Crawl-Delay: 2
Disallow: /index/advancedresult/
Disallow: /index/advancedsearch/
Disallow: /*?*
Disallow: /*?
Disallow: /aboutus/hubspotlead
Disallow: /statsana/piwik/
Allow: /public/
Allow: /public/images2017/

# --- AI crawlers: explicit allow (faster crawl) ---
User-agent: GPTBot
Allow: /
Crawl-Delay: 2

User-agent: OAI-SearchBot
Allow: /
Crawl-Delay: 2

User-agent: ClaudeBot
Allow: /
Crawl-Delay: 2

User-agent: Claude-SearchBot
Allow: /
Crawl-Delay: 2

User-agent: PerplexityBot
Allow: /
Crawl-Delay: 2

User-agent: CCBot
Allow: /
Crawl-Delay: 2

# Broader "infra" user-agents used for AI indexing/pipeline work
User-agent: GoogleOther
Allow: /
Crawl-Delay: 2

User-agent: Amazonbot
Allow: /
Crawl-Delay: 2

# Traditional search bots (often feed LLM features)
User-agent: Googlebot
Allow: /

User-agent: bingbot
Allow: /

# Training/usage policy toggles (not crawlers, but respected by some vendors)
User-agent: Google-Extended
Allow: /

User-agent: Applebot-Extended
Allow: /

# --- Machine Content Policy for AI & LLMs (already live) ---
MCP: https://www.bccresearch.com/mcp.yaml

# --- Sitemaps (already live) ---
Sitemap: https://www.bccresearch.com/public/sitemap/indexsitemap/sitemapindex.xml