# robots.txt for pickfu.com
#
# AI / agent readiness: Content Signals
# See https://contentsignals.org/ and https://blog.cloudflare.com/content-signals-policy/
#
# As a condition of accessing this website, you agree to abide by the
# following content signals:
#   content-signal = yes: you may collect content for the corresponding use.
#   content-signal = no:  you may not collect content for the corresponding use.
#   Missing signal:       no position granted or restricted via content signal.
#
# Signals:
#   search    Building a search index and returning search results (hyperlinks
#             and short excerpts). Does NOT include AI-generated summaries.
#   ai-input  Feeding content into AI models in real time (RAG, grounding,
#             live retrieval for generative AI answers).
#   ai-train  Training or fine-tuning AI models.
#
# Restrictions expressed via Content-Signal are express reservations of rights
# under Article 4 of the European Union Directive 2019/790 on Copyright and
# Related Rights in the Digital Single Market.
#
# Per RFC 9309 §2.2.1, User-agent-specific groups replace the `*` group for
# the matching crawler; they do NOT inherit. Every per-UA block below
# therefore repeats the core Disallow and Allow directives; common entries
# are factored into the ALL-CRAWLER-BASELINE section so the rules stay in
# sync when edited. If you add a Disallow to the default `*` block, copy it
# to every per-UA block too.

# ==============================================================================
# Default User-agent: covers every crawler without a more specific block,
# including Googlebot, Bingbot, etc.
# ==============================================================================
User-agent: *
# Core Disallow list (paths unsafe for any crawler).
Disallow: /questions/new
Disallow: /ask
Disallow: /payment
Disallow: /rpx
Disallow: /polls
Disallow: /polls/extend/
Disallow: /?*coupon=*
# Agent discovery assets (served by cloudflare-reverse-proxies/www-proxy).
Allow: /llms.txt
Allow: /llms-full.txt
Allow: /.well-known/mcp
Allow: /api-catalog.json
Content-Signal: ai-train=yes, search=yes, ai-input=yes

# ==============================================================================
# AI training crawlers: block /results to protect user-submitted poll content.
# Each block must repeat the core Disallow/Allow lines above because RFC 9309
# specifies that matching crawlers use ONLY their specific block, not `*`.
#
# Content-Signal here overrides the default: ai-train=no specifically for
# these training-oriented crawlers, while still allowing real-time retrieval
# (ai-input=yes) and search indexing of public content (search=yes).
# ==============================================================================
User-agent: GPTBot
Disallow: /results
Disallow: /questions/new
Disallow: /ask
Disallow: /payment
Disallow: /rpx
Disallow: /polls
Disallow: /polls/extend/
Disallow: /?*coupon=*
Allow: /llms.txt
Allow: /llms-full.txt
Allow: /.well-known/mcp
Allow: /api-catalog.json
Content-Signal: ai-train=no, search=yes, ai-input=yes

User-agent: ClaudeBot
Disallow: /results
Disallow: /questions/new
Disallow: /ask
Disallow: /payment
Disallow: /rpx
Disallow: /polls
Disallow: /polls/extend/
Disallow: /?*coupon=*
Allow: /llms.txt
Allow: /llms-full.txt
Allow: /.well-known/mcp
Allow: /api-catalog.json
Content-Signal: ai-train=no, search=yes, ai-input=yes

User-agent: Google-Extended
Disallow: /results
Disallow: /questions/new
Disallow: /ask
Disallow: /payment
Disallow: /rpx
Disallow: /polls
Disallow: /polls/extend/
Disallow: /?*coupon=*
Allow: /llms.txt
Allow: /llms-full.txt
Allow: /.well-known/mcp
Allow: /api-catalog.json
Content-Signal: ai-train=no, search=yes, ai-input=yes

# ==============================================================================
# Full-site blocks (retained from prior policy).
# ==============================================================================
User-agent: Yandex
Disallow: / # blocks access to the entire site

User-agent: CompSpyBot
User-agent: Curious George
User-agent: CybEye.com
User-agent: DoCoMo
User-agent: ExB Language Crawler
User-agent: Ezooms
User-agent: Flamingo_SearchEngine
User-agent: Genieo
User-agent: Genio
User-agent: LWNutch
User-agent: LexxeBot
User-agent: OpenWebIndex
User-agent: RediffNewsBot
User-agent: SEOENGWorldBot
User-agent: Scanmine
User-agent: ShopWiki
User-agent: ShowyouBot
User-agent: Sosospider
User-agent: WocBot
User-agent: YoudaoBot
User-agent: daumoa
User-agent: gsa-crawler
User-agent: libcrawl
User-agent: linkdex
User-agent: magpie-crawler
User-agent: repparser
User-agent: sindice-site-manager
User-agent: sogou spider
User-agent: sogou
User-agent: woriobot
User-agent: yacybot
User-agent: yolinkBot
Disallow: /

Sitemap: https://www.pickfu.com/sitemap.xml