BR104 - News from Alagoas, Maceió and the Interior in real time

Machine Readiness

Stored receipt and evidence

Overall: 16
Readable: 55
Callable: 0
Commerce: 0
Payment: 0

Machine Access

Inspect the site's MCP endpoint

DialtoneApp can scan the stored discovery files for this domain, try the MCP initialize handshake, and show the raw protocol transcript.
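
For orientation, the MCP initialize handshake mentioned above starts with a JSON-RPC 2.0 request whose method is `initialize`. The following is a minimal sketch of that first message and a loose check of the reply, not DialtoneApp's own implementation. The client name `example-probe`, the helper names, and the chosen protocol version string are assumptions made for illustration, and no network call is made.

```python
import json


def build_initialize_request(request_id=1, protocol_version="2025-03-26"):
    """Build the JSON-RPC 2.0 'initialize' request that opens an MCP session.

    The protocol version and clientInfo values are illustrative assumptions.
    """
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "initialize",
        "params": {
            "protocolVersion": protocol_version,
            "capabilities": {},
            "clientInfo": {"name": "example-probe", "version": "0.1.0"},
        },
    }


def is_initialize_result(message, request_id=1):
    """Loosely check that a parsed reply looks like a successful initialize result."""
    result = message.get("result")
    return (
        message.get("jsonrpc") == "2.0"
        and message.get("id") == request_id
        and isinstance(result, dict)
        and "protocolVersion" in result
    )


request = build_initialize_request()
print(json.dumps(request, indent=2))
```

A raw protocol transcript of the kind described above would pair this request with whatever the server sends back; an error object or a non-JSON body would fail the `is_initialize_result` check.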

Purchase boundary: read only
Control boundary: unknown
Payment rails: None
Payment providers: None
Payment methods: None
Payment protocols: None
Payment assets: None
Payment networks: None
Capabilities: None
Verified payment surface: No
Crypto only: No
Readable docs: robots, llms
Products: 0
Variants: 0
Priced variants: 0
Currencies: 0
Offers: 0
Priced offers: 0
Priced actions: 0

Samples

Offer samples: No stored offer samples.
Action samples: No stored action samples.
Product samples: No stored product samples.

Document

robots.txt

# As a condition of accessing this website, you agree to abide by the following
# content signals:

# (a)  If a Content-Signal = yes, you may collect content for the corresponding
#      use.
# (b)  If a Content-Signal = no, you may not collect content for the
#      corresponding use.
# (c)  If the website operator does not include a Content-Signal for a
#      corresponding use, the website operator neither grants nor restricts
#      permission via Content-Signal with respect to the corresponding use.

# The content signals and their meanings are:

# search:   building a search index and providing search results (e.g., returning
#           hyperlinks and short excerpts from your website's contents). Search does not
#           include providing AI-generated search summaries.
# ai-input: inputting content into one or more AI models (e.g., retrieval
#           augmented generation, grounding, or other real-time taking of content for
#           generative AI search answers).
# ai-train: training or fine-tuning AI models.

# ANY RESTRICTIONS EXPRESSED VIA CONTENT SIGNALS ARE EXPRESS RESERVATIONS OF
# RIGHTS UNDER ARTICLE 4 OF THE EUROPEAN UNION DIRECTIVE 2019/790 ON COPYRIGHT
# AND RELATED RIGHTS IN THE DIGITAL SINGLE MARKET.

# BEGIN Cloudflare Managed content

User-agent: *
Content-Signal: search=yes,ai-train=no
Allow: /

User-agent: Amazonbot
Disallow: /

User-agent: Applebot-Extended
Disallow: /

User-agent: Bytespider
Disallow: /

User-agent: CCBot
Disallow: /

User-agent: ClaudeBot
Disallow: /

User-agent: CloudflareBrowserRenderingCrawler
Disallow: /

User-agent: Google-Extended
Disallow: /

User-agent: GPTBot
Disallow: /

User-agent: meta-externalagent
Disallow: /

# END Cloudflare Managed Content

########################################
# SPECIFIC BLOCKING OF BOTS/AI/SCRAPERS
# These must not access ANY part of the site
########################################

User-agent: anthropic-ai
Disallow: /

User-agent: AwarioRssBot
Disallow: /

User-agent: AwarioSmartBot
Disallow: /

User-agent: Bytespider
Disallow: /

User-agent: CCBot
Disallow: /

#User-agent: ChatGPT-User
#Disallow: /

User-agent: ClaudeBot
Disallow: /

User-agent: Claude-Web
Disallow: /

User-agent: cohere-ai
Disallow: /

User-agent: DataForSeoBot
Disallow: /

User-agent: Diffbot
Disallow: /

# IMPORTANT:
# We will NOT block FacebookBot, to allow preview/social generation

# IMPORTANT:
# We will NOT block anything from Google here.
# (Googlebot, Googlebot-News, Google-Extended, etc.)
# Blocking Google-Extended in robots.txt should not kill Discover,
# but, since your WAF may use this list to return 403,
# the safest option is NOT to put Google-Extended in the forbidden block.
# That way you avoid accidentally blocking Discover rendering.

#User-agent: GPTBot
#Disallow: /

User-agent: magpie-crawler
Disallow: /

User-agent: NewsNow
Disallow: /

User-agent: news-please
Disallow: /

User-agent: omgili
Disallow: /

User-agent: omgilibot
Disallow: /

User-agent: PerplexityBot
Disallow: /

User-agent: PiplBot
Disallow: /

User-agent: Scrapy
Disallow: /

User-agent: TurnitinBot
Disallow: /


########################################
# GENERAL RULES (Google, Bing, etc.)
########################################

User-agent: *
# Block technical files and routes that should not appear on Google
Disallow: /*.jsx$
Disallow: *.jsx$
Disallow: /*.jsx/
Disallow: *.jsx?
Disallow: /ads/
Disallow: /api/
Disallow: /search
Disallow: /busca
Disallow: /?queryId=
Disallow: /*/0

# WordPress internals / sensitive routes
Disallow: /wp-admin/
Disallow: /readme.html
Disallow: /license.txt
Disallow: /xmlrpc.php
Disallow: /wp-login.php
Disallow: /wp-register.php

# Blocks the /amp/ versions (if you do not want separate AMP indexed)
Disallow: /*/amp/

# Allow required resources
Allow: /wp-admin/admin-ajax.php
Allow: /wp-content/uploads/

########################################
# SITEMAP
########################################
Sitemap: https://www.br104.com.br/sitemap_index.xml
Sitemap: https://www.br104.com.br/sitemap-news.xml/
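
The stored file above combines Content-Signal declarations with per-agent Disallow groups. As a rough sketch of how a client might read both, the snippet below parses the `Content-Signal` pairs by hand and checks a few agents with Python's standard `urllib.robotparser`. The excerpt is abbreviated from the file above, and `parse_content_signals` is a hypothetical helper that ignores which user-agent group a signal belongs to, a simplification made for illustration.

```python
from urllib.robotparser import RobotFileParser

# Abbreviated excerpt of the stored robots.txt shown above.
ROBOTS_EXCERPT = """\
User-agent: *
Content-Signal: search=yes,ai-train=no
Allow: /

User-agent: ClaudeBot
Disallow: /

User-agent: PerplexityBot
Disallow: /
"""


def parse_content_signals(robots_text):
    """Collect Content-Signal pairs, e.g. {'search': 'yes', 'ai-train': 'no'}.

    Hypothetical helper: it merges signals from every group into one dict.
    """
    signals = {}
    for line in robots_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments
        key, sep, value = line.partition(":")
        if sep and key.strip().lower() == "content-signal":
            for pair in value.split(","):
                name, eq, setting = pair.partition("=")
                if eq:
                    signals[name.strip().lower()] = setting.strip().lower()
    return signals


signals = parse_content_signals(ROBOTS_EXCERPT)

parser = RobotFileParser()
parser.parse(ROBOTS_EXCERPT.splitlines())

home = "https://www.br104.com.br/"
print(signals)
for agent in ("ClaudeBot", "PerplexityBot", "Googlebot"):
    print(agent, parser.can_fetch(agent, home))
```

Here `ai-input` is absent from the parsed signals, which under clause (c) of the file's preamble means the operator neither grants nor restricts that use via Content-Signal. The standard-library parser is a simple prefix matcher, so wildcard patterns such as `/*.jsx$` in the general rules are better checked with a dedicated library.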

Document

llms.txt

Not stored for this site.

Document

llms-full.txt

Not stored for this site.