# BigDataCloud robots.txt
# Strategy: allow AI assistants that generate referral/brand value (ChatGPT, Perplexity, Claude, Gemini)
# Block pure scrapers, data harvesters, and SEO tools with no marketing return.

# ── AI assistants with marketing/referral value ──────────────────────────────
# These power AI search and answer engines; appearing in their answers drives traffic and brand awareness.
# We allow content pages but block live lookup results (no value, burns quota) and account pages.
User-agent: GPTBot
User-agent: ChatGPT-User
User-agent: ClaudeBot
User-agent: Claude-Web
User-agent: PerplexityBot
User-agent: Google-Extended
User-agent: Applebot-Extended
User-agent: Googlebot-Extended
Disallow: /account/
Disallow: /ip-lookup/
Disallow: /asn-lookup/
Disallow: /network-lookup/
Disallow: /graphql/
Disallow: /auth/
Disallow: /login
Disallow: /signup

# ── Pure scrapers and data harvesters: block completely ──────────────────────
# No marketing return, high volume, no attribution.
User-agent: Bytespider
Disallow: /

User-agent: ImagesiftBot
Disallow: /

User-agent: CCBot
Disallow: /

User-agent: DataForSeoBot
Disallow: /

User-agent: AhrefsBot
Disallow: /

User-agent: SemrushBot
Disallow: /

User-agent: MJ12bot
Disallow: /

User-agent: DotBot
Disallow: /

User-agent: BLEXBot
Disallow: /

User-agent: PetalBot
Disallow: /

User-agent: Amazonbot
Disallow: /

# ── All other crawlers (Google, Bing, etc.) ──────────────────────────────────
User-agent: *
Allow: /
Disallow: /account/
Disallow: /auth/
Disallow: /login
Disallow: /signup
Disallow: /graphql/

# ── Sitemaps ─────────────────────────────────────────────────────────────────
Sitemap: https://www.bigdatacloud.com/sitemap.xml
Sitemap: https://www.bigdatacloud.com/sitemap-index.xml