# Crawlers Setup
User-agent: *
# Website Sitemap
Sitemap: https://www.sugarfreeshops.com/media/sitemap/google_sitemap_en.xml
Sitemap: https://www.sugarfreeshops.com/media/sitemap/google_sitemap_el.xml
#
# Allowable Index
Allow: /*?p=
Allow: /media/
# Directories
Disallow: /app/
Disallow: /bin/
Disallow: /dev/
Disallow: /lib/
Disallow: /var/
Disallow: /404/
Disallow: /js/
#Disallow: /media/
Disallow: /phpserver/
Disallow: /pkginfo/
Disallow: /report/
Disallow: /setup/
Disallow: /update/
Disallow: /vendor/
Disallow: /cgi-bin/
Disallow: /downloader/
Disallow: /errors/
Disallow: /includes/
Disallow: /scripts/
Disallow: /shell/
Disallow: /skin/
Disallow: /stats/
# Paths (clean URLs)
Disallow: */index.php/
Disallow: */catalog/product_compare/
Disallow: */catalog/category/view/
Disallow: */catalog/product/view/
Disallow: */catalog/product/gallery/
Disallow: */catalogsearch/
Disallow: */search/
#Disallow: */checkout/
Disallow: */control/
Disallow: */contacts/
Disallow: */customer/
Disallow: */customize/
Disallow: */newsletter/
Disallow: */review/
Disallow: */tag/
Disallow: */sendfriend/
Disallow: */wishlist/
Disallow: *//*
# Files
Disallow: /composer.json
Disallow: /composer.lock
Disallow: /CONTRIBUTING.md
Disallow: /CONTRIBUTOR_LICENSE_AGREEMENT.html
Disallow: /COPYING.txt
Disallow: /Gruntfile.js
Disallow: /LICENSE.txt
Disallow: /LICENSE_AFL.txt
Disallow: /nginx.conf.sample
Disallow: /package.json
Disallow: /php.ini.sample
Disallow: /RELEASE_NOTES.txt
# Do not index pages that are sorted or filtered.
Disallow: /*?*product_list_mode=
Disallow: /*?*product_list_order=
Disallow: /*?*product_list_limit=
Disallow: /*?*product_list_dir=
# Paths (no clean URLs)
#Disallow: /*.js$
#Disallow: /*.css$
#Disallow: /*?
#Disallow: /*?SID=
Disallow: /*.php$
Disallow: /*?p=*&
# CVS, SVN directory and dump files
Disallow: /*.CVS
Disallow: /*.Zip$
Disallow: /*.zip$
Disallow: /*.Svn$
Disallow: /*.svn$
Disallow: /*.Idea$
Disallow: /*.idea$
Disallow: /*.Sql$
Disallow: /*.sql$
Disallow: /*.Tgz$
Disallow: /*.tgz$

# Google Image Crawler Setup
User-agent: Googlebot-Image
Disallow:

User-agent: AhrefsBot
Disallow: /
Disallow: /lib/
Disallow: /*.php$
Disallow: /pkginfo/
Disallow: /report/
Disallow: /var/
Disallow: /catalog/
Disallow: /customer/
Disallow: /sendfriend/
Disallow: /review/
Disallow: /*SID=
Disallow: /*?
# Disable checkout & customer account
Disallow: /checkout/
Disallow: /onestepcheckout/
Disallow: /customer/
Disallow: /customer/account/
Disallow: /customer/account/login/
# Disable Search pages
Disallow: /catalogsearch/
Disallow: */search/
Disallow: /catalog/product_compare/
Disallow: /catalog/category/view/
Disallow: /catalog/product/view/
# Disable common folders
Disallow: /app/
Disallow: /bin/
Disallow: /dev/
Disallow: /lib/
Disallow: /phpserver/
Disallow: /pub/
# Disable Tag & Review (Avoid duplicate content)
Disallow: /tag/
Disallow: /review/
# Common files
Disallow: /composer.json
Disallow: /composer.lock
Disallow: /CONTRIBUTING.md
Disallow: /CONTRIBUTOR_LICENSE_AGREEMENT.html
Disallow: /COPYING.txt
Disallow: /Gruntfile.js
Disallow: /LICENSE.txt
Disallow: /LICENSE_AFL.txt
Disallow: /nginx.conf.sample
Disallow: /package.json
Disallow: /php.ini.sample
Disallow: /RELEASE_NOTES.txt
# Disable sorting (Avoid duplicate content)
Disallow: /*?*product_list_mode=
Disallow: /*?*product_list_order=
Disallow: /*?*product_list_limit=
Disallow: /*?*product_list_dir=
# Disable version control folders and others
Disallow: /*.git
Disallow: /*.CVS
Disallow: /*.Zip$
Disallow: /*.Svn$
Disallow: /*.Idea$
Disallow: /*.Sql$
Disallow: /*.Tgz$

# sem https://www.semrush.com/bot/
# To block SEMrushBot from crawling your site for different SEO and technical issues:
User-agent: SemrushBot-SA
Disallow: /

# To block SEMrushBot from crawling your site for Backlink Audit tool:
User-agent: SemrushBot-BA
Disallow: /

# To block SEMrushBot from crawling your site for On Page SEO Checker tool and similar tools:
User-agent: SemrushBot-SI
Disallow: /

# To block SEMrushBot from checking URLs on your site for SWA tool:
User-agent: SemrushBot-SWA
Disallow: /

# To block SEMrushBot from crawling your site for Content Analyzer and Post Tracking tools:
User-agent: SemrushBot-CT
Disallow: /

# To block SEMrushBot from crawling your site for Brand Monitoring:
User-agent: SemrushBot-BM
Disallow: /

# To block SEMrushBot from crawling your site for SEO A/B Testing tool:
User-agent: SemrushBot-SEOAB
Disallow: /

# === 1. Default Allow ===
User-agent: *
Allow: /

# === 2. Google Crawlers (Search + Gemini AI) ===
User-agent: Googlebot
Allow: /

User-agent: Googlebot-Image
Allow: /

User-agent: GoogleOther
Allow: /

User-agent: Google-Extended
Allow: /

# === 3. OpenAI / ChatGPT ===
User-agent: GPTBot
Allow: /

User-agent: ChatGPT-User
Allow: /

# === 4. Meta AI / Facebook / Instagram ===
User-agent: FacebookBot
Allow: /

User-agent: Facebot
Allow: /

User-agent: Meta-ExternalBot
Allow: /

User-agent: MetaBot
Allow: /

# === 5. Anthropic Claude ===
User-agent: anthropic-ai
Allow: /

User-agent: ClaudeBot
Allow: /

# === 6. Perplexity AI ===
User-agent: PerplexityBot
Allow: /

# === 7. Bing / Microsoft AI ===
User-agent: Bingbot
Allow: /

User-agent: BingPreview
Allow: /

# === 8. Other AI / Emerging Crawlers ===
User-agent: YouBot
Allow: /

User-agent: NeevaBot
Allow: /

User-agent: PhindBot
Allow: /

User-agent: DuckDuckBot
Allow: /

User-agent: KagiBot
Allow: /

User-agent: AndiBot
Allow: /

User-agent: TurnitinBot
Allow: /

User-agent: CCBot
Allow: /

# Tools Bots
User-agent: AhrefsBot
Allow: /

User-agent: Applebot
Allow: /