# robots.txt for https://www.cuatrecasas.com/

# General crawling rules for search engines
User-agent: *
Disallow: /admin/
Disallow: /bundles/
Disallow: /bundles_old/
Disallow: /erecruiting/
Disallow: /images/
Allow: /images/cache/
Disallow: /img/
Disallow: /media_repository/
Allow: /media_repository/images/
Allow: /media_repository/docs/
Allow: /summernote/
Allow: /resources/
Disallow: /web/
Allow: /web/assets/
Allow: /web/vendor/

# Social media crawler exception to render shared images
User-agent: Twitterbot
Allow: /images/

# AI search and AI features crawlers (allowed with same restrictions as the general block)
User-agent: OAI-SearchBot
User-agent: ChatGPT-User
User-agent: PerplexityBot
User-agent: Claude-User
User-agent: Claude-SearchBot
User-agent: Google-Extended
User-agent: Applebot-Extended
User-agent: Meta-ExternalAgent
Disallow: /admin/
Disallow: /bundles/
Disallow: /bundles_old/
Disallow: /erecruiting/
Disallow: /images/
Allow: /images/cache/
Disallow: /img/
Disallow: /media_repository/
Allow: /media_repository/images/
Allow: /media_repository/docs/
Allow: /summernote/
Allow: /resources/
Disallow: /web/
Allow: /web/assets/
Allow: /web/vendor/

# AI training crawlers (allow only selected content sections)
User-agent: GPTBot
User-agent: ClaudeBot
Allow: /es/*/art/
Allow: /en/*/art/
Allow: /pt/*/art/
Allow: /*/conocimiento
Allow: /*/servicios
Allow: /*/services
Disallow: /

# AI training crawlers (blocked)
User-agent: CCBot
User-agent: Bytespider
User-agent: Amazonbot
User-agent: cohere-training-data-crawler
Disallow: /

Sitemap: https://www.cuatrecasas.com/sitemap.xml
Sitemap: https://www.cuatrecasas.com/sitemap.spain_es.xml
Sitemap: https://www.cuatrecasas.com/sitemap.portugal_pt.xml
Sitemap: https://www.cuatrecasas.com/sitemap.latam_es.xml
Sitemap: https://www.cuatrecasas.com/sitemap.latam_pt.xml
Sitemap: https://www.cuatrecasas.com/sitemap.global_en.xml
Sitemap: https://www.cuatrecasas.com/sitemap.portugal_es.xml
Sitemap: https://www.cuatrecasas.com/sitemap.portugal_en.xml
Sitemap: https://www.cuatrecasas.com/sitemap.spain_en.xml
Sitemap: https://www.cuatrecasas.com/sitemap.spain_pt.xml
Sitemap: https://www.cuatrecasas.com/sitemap.latam_en.xml
Sitemap: https://www.cuatrecasas.com/sitemap.global_es.xml
Sitemap: https://www.cuatrecasas.com/sitemap.global_pt.xml

LLMS: https://www.cuatrecasas.com/llms.txt