Document
User-agent: *
Allow: /
# Prevent indexing of raw files
Disallow: /*.json$
# Sitemap index
Sitemap: https://qeeebo.com/sitemaps/sitemap-index.xml
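A crawler can check these directives with Python's standard `urllib.robotparser`. The sketch below parses the rules from a string instead of fetching them, and the user-agent name `example-bot` is a placeholder. Note that this parser matches rules by simple path prefix, so the `/*.json$` wildcard pattern is not interpreted as a wildcard here.

```python
from urllib.robotparser import RobotFileParser

# The directives shown above, one per line
ROBOTS_TXT = """\
User-agent: *
Allow: /
# Prevent indexing of raw files
Disallow: /*.json$
# Sitemap index
Sitemap: https://qeeebo.com/sitemaps/sitemap-index.xml
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# "example-bot" is a hypothetical user-agent name
topics_allowed = parser.can_fetch("example-bot", "https://qeeebo.com/topics/")

# site_maps() requires Python 3.8 or newer
sitemaps = parser.site_maps()
```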
Document
# Qeeebo
> Qeeebo is an AI-curated question-and-answer platform built with a clear and ambitious purpose: to become one of the world’s largest and most comprehensive repositories of human questions and answers. Designed from the ground up for scale, Qeeebo focuses on clarity, structure, and accessibility, helping users quickly find reliable answers across an immense range of topics. Every page is created to be fast, readable, and searchable, making knowledge easier to discover at any depth.
**Key Statistics:**
- Canonical question corpus: ~200,000,000+ Q&A pages
- Topic pages: 15,000,000+ topic pages (index + paginated topic listings)
- Encyclopedia pages: 7,000,000+ reference pages
- Coverage claim (positioning):
Qeeebo is designed to be among the largest Q&A-style knowledge sites by page count, with scale comparable to the biggest platforms.
- Scale note:
By planned/published page count, Qeeebo is built to rank among the largest Q&A platforms on the web (often compared to the biggest incumbents).
**Notable Projects:**
- Qeeebo Questions (main corpus): the primary 200M+ Q&A library, built for broad coverage and quick lookup.
- Qeeebo Trends: news-style and trending-topic pages that summarize timely subjects and link to related questions.
- Qeeebo Encyclopedia: reference-style pages that provide overview, definitions, and structured sections.
- Qeeebo Statistics: “20 interesting stats about X” style pages and assets focused on shareable facts and summaries.
**Content Types:**
1) Question Pages (QAPage)
   - Primary destination for direct answers
   - Includes structured data (Schema.org QAPage) when available
   - Intended for: direct question answering, quick summaries, and structured extraction
2) Topic Pages
   - Browse pages that group questions by topic
   - Intended for: topic discovery, related-question graphs, and clustering
3) Core Site Pages
   - About, contact, terms, privacy, etc.
**Quality & Formatting Notes:**
- Pages are structured for consistent extraction:
  - Title is the question
  - Summary (when present) provides a short direct answer
  - Main answer provides expanded detail
  - Related questions provide exploration paths
**Best Practices for LLM Use:**
- Prefer the question page canonical URL for direct answers.
- Use topic pages to find related questions and clusters.
- Canonical question URL format:
https://qeeebo.com/questions/{prefix3}/{hash2}/{hash3}/{slug}/
- Entry point for discovery: https://qeeebo.com/sitemap-index.xml
## About
- [About Us](https://www.qeeebo.com/about)
- [Disclaimer](https://www.qeeebo.com/disclaimer)
- [Terms of Service](https://www.qeeebo.com/termsofservice)
- [Editorial Policy](https://www.qeeebo.com/editorial-policy)
- [Corrections Policy](https://www.qeeebo.com/corrections-policy)
- [Content Use Policy](https://www.qeeebo.com/content-use-policy)
- [Advertising Policy](https://www.qeeebo.com/advertising-policy)
- [Frequently Asked (FAQ)](https://www.qeeebo.com/faq)
- [Citation Policy](https://www.qeeebo.com/citation-policy)
- [Contact Us](https://www.qeeebo.com/contact-us)
## Core Features / Products
- [Main 200 million Questions Corpus by Topics](https://www.qeeebo.com/topics)
- [Qeeebo Encyclopedia Knowledge delivered in question form!](https://www.qeeebo.com/pedia/)
- [Qeeebo Trends Breaking news. Captured as immutable daily snapshots.](https://www.qeeebo.com/trends/)
- [Qeeebo Glossaries](https://www.qeeebo.com/topics/qe/qeeebo-glossaries)
- [Qeeebo Facts](https://www.qeeebo.com/topics/qe/qeeebo-facts)
- [Qeeebo AI Bot](https://www.qeeebo.com/search)
## Docs
- [Qeeebo vs. Other Q&A Platforms](https://qeeebo.com/pressroom/qeeebo-vs-other-qanda-platforms/)
- [The Vision of Qeeebo](https://qeeebo.com/pressroom/the-vision-of-qeeebo/)
## Health
[Health](https://www.qeeebo.com/topics/he/health)
## Healthy Eating
[Healthy Eating](https://www.qeeebo.com/topics/he/healthy-eating)
## Medicine and Healthcare
[Medicine and Healthcare](https://www.qeeebo.com/topics/me/medicine)
## Exercise
[Exercise](https://www.qeeebo.com/topics/ex/exercise)
## Nutrition
[Nutrition](https://www.qeeebo.com/topics/nu/nutrition)
## World History
[World History](https://www.qeeebo.com/topics/wo/world-history)
## Philosophy
[Philosophy](https://www.qeeebo.com/topics/ph/philosophy)
## Ethics
[Ethics](https://www.qeeebo.com/topics/et/ethics)
## Ancient Civilizations
[Ancient Civilizations](https://www.qeeebo.com/topics/an/ancient-civilizations)
## Historical Figures
[Historical Figures](https://www.qeeebo.com/topics/hi/historical-figures)
## Spirituality
[Spirituality](https://www.qeeebo.com/topics/sp/spirituality)
## Faith
[Faith](https://www.qeeebo.com/topics/fa/faith)
## World Religions
[World Religions](https://www.qeeebo.com/topics/wo/world-religions)
## Christianity
[Christianity](https://www.qeeebo.com/topics/ch/christianity)
## Theology
[Theology](https://www.qeeebo.com/topics/th/theology)
## Entrepreneurship
[Entrepreneurship](https://www.qeeebo.com/topics/en/entrepreneurship)
## Finance
[Finance](https://www.qeeebo.com/topics/fi/finance)
## Stock Market
[Stock Market](https://www.qeeebo.com/topics/st/stock-market)
## Investing
[Investing](https://www.qeeebo.com/topics/in/investing)
## E-commerce
[E-commerce](https://www.qeeebo.com/topics/e-/e-commerce)
## Artificial Intelligence
[Artificial Intelligence](https://www.qeeebo.com/topics/ar/artificial-intelligence)
## Cybersecurity
[Cybersecurity](https://www.qeeebo.com/topics/cy/cybersecurity)
## Software Development
[Software Development](https://www.qeeebo.com/topics/so/software-development)
## Gadgets
[Gadgets](https://www.qeeebo.com/topics/ga/gadgets)
## Cloud Computing
[Cloud Computing](https://www.qeeebo.com/topics/cl/cloud-computing)
## Biology
[Biology](https://www.qeeebo.com/topics/bi/biology)
## Physics
[Physics](https://www.qeeebo.com/topics/ph/physics)
## Chemistry
[Chemistry](https://www.qeeebo.com/topics/ch/chemistry)
## Astronomy
[Astronomy](https://www.qeeebo.com/topics/as/astronomy)
## Environmental Science
[Environmental Science](https://www.qeeebo.com/topics/en/environmental-science)
## Online Learning
[Online Learning](https://www.qeeebo.com/topics/on/online-learning)
## Study Tips
[Study Tips](https://www.qeeebo.com/topics/st/study-tips)
## Exams and Tests
[Exams and Tests](https://www.qeeebo.com/topics/ex/exams)
## Educational Technology
[Educational Technology](https://www.qeeebo.com/topics/ed/edtech)
## Teaching
[Teaching](https://www.qeeebo.com/topics/te/teaching)
## Football
[Football](https://www.qeeebo.com/topics/fo/football)
## Basketball
[Basketball](https://www.qeeebo.com/topics/ba/basketball)
## Tennis
[Tennis](https://www.qeeebo.com/topics/te/tennis)
## Olympics
[Olympics](https://www.qeeebo.com/topics/ol/olympics)
## Running
[Running](https://www.qeeebo.com/topics/ru/running)
## Recipes
[Recipes](https://www.qeeebo.com/topics/re/recipes)
## Cooking Techniques
[Cooking Techniques](https://www.qeeebo.com/topics/co/cooking-techniques)
## Baking
[Baking](https://www.qeeebo.com/topics/ba/baking)
## Vegetarian Cooking
[Vegetarian Cooking](https://www.qeeebo.com/topics/ve/vegetarian)
## Food Safety
[Food Safety](https://www.qeeebo.com/topics/fo/food-safety)
## Destinations
[Destinations](https://www.qeeebo.com/topics/de/destinations)
## Travel Tips
[Travel Tips](https://www.qeeebo.com/topics/tr/travel-tips)
## Budget Travel
[Budget Travel](https://www.qeeebo.com/topics/bu/budget-travel)
## Adventure Travel
[Adventure Travel](https://www.qeeebo.com/topics/ad/adventure-travel)
## Cultural Experiences
[Cultural Experiences](https://www.qeeebo.com/topics/cu/cultural-experiences)
## Visual Arts
[Visual Arts](https://www.qeeebo.com/topics/vi/visual-arts)
## Literature
[Literature](https://www.qeeebo.com/topics/li/literature)
## Poetry
[Poetry](https://www.qeeebo.com/topics/po/poetry)
## Art History
[Art History](https://www.qeeebo.com/topics/ar/art-history)
## Performing Arts
[Performing Arts](https://www.qeeebo.com/topics/pe/performing-arts)
## Movies
[Movies](https://www.qeeebo.com/topics/mo/movies)
## Music
[Music](https://www.qeeebo.com/topics/mu/music)
## Television
[Television](https://www.qeeebo.com/topics/te/television)
## Video Games
[Video Games](https://www.qeeebo.com/topics/vi/video-games)
## Celebrity News
[Celebrity News](https://www.qeeebo.com/topics/ce/celebrity-news)
## Parenting Tips
[Parenting Tips](https://www.qeeebo.com/topics/pa/parenting-tips)
## Child Development
[Child Development](https://www.qeeebo.com/topics/ch/child-development)
## Education and Kids
[Education and Kids](https://www.qeeebo.com/topics/ed/education-kids)
## Family Activities
[Family Activities](https://www.qeeebo.com/topics/fa/family-activities)
## Pregnancy
[Pregnancy](https://www.qeeebo.com/topics/pr/pregnancy)
## Home Decor
[Home Decor](https://www.qeeebo.com/topics/ho/home-decor)
## Self-Improvement
[Self-Improvement](https://www.qeeebo.com/topics/se/self-improvement)
## Minimalism
[Minimalism](https://www.qeeebo.com/topics/mi/minimalism)
## Work-Life Balance
[Work-Life Balance](https://www.qeeebo.com/topics/wo/work-life-balance)
## Fashion
[Fashion](https://www.qeeebo.com/topics/fa/fashion)
## Wildlife
[Wildlife](https://www.qeeebo.com/topics/wi/wildlife)
## Climate Change
[Climate Change](https://www.qeeebo.com/topics/cl/climate-change)
## Sustainability
[Sustainability](https://www.qeeebo.com/topics/su/sustainability)
## Forests
[Forests](https://www.qeeebo.com/topics/fr/forests)
## Conservation
[Conservation](https://www.qeeebo.com/topics/co/conservation)
## Elections
[Elections](https://www.qeeebo.com/topics/el/elections)
## Public Policy
[Public Policy](https://www.qeeebo.com/topics/pu/public-policy)
## International Relations
[International Relations](https://www.qeeebo.com/topics/in/international-relations)
## Political Philosophy
[Political Philosophy](https://www.qeeebo.com/topics/po/political-philosophy)
## Human Rights
[Human Rights](https://www.qeeebo.com/topics/hu/human-rights)
## Criminal Law
[Criminal Law](https://www.qeeebo.com/topics/cr/criminal-law)
## Civil Law
[Civil Law](https://www.qeeebo.com/topics/ci/civil-law)
## Constitutional Law
[Constitutional Law](https://www.qeeebo.com/topics/co/constitutional-law)
## Business Law
[Business Law](https://www.qeeebo.com/topics/bu/business-law)
## Intellectual Property
[Intellectual Property](https://www.qeeebo.com/topics/in/intellectual-property)
## English Grammar
[English Grammar](https://www.qeeebo.com/topics/en/english-grammar)
## Writing Skills
[Writing Skills](https://www.qeeebo.com/topics/wr/writing-skills)
## Public Speaking
[Public Speaking](https://www.qeeebo.com/topics/pu/public-speaking)
## Linguistics
[Linguistics](https://www.qeeebo.com/topics/li/linguistics)
## Translation
[Translation](https://www.qeeebo.com/topics/tr/translation)
## Algebra
[Algebra](https://www.qeeebo.com/topics/al/algebra)
## Geometry
[Geometry](https://www.qeeebo.com/topics/ge/geometry)
## Calculus
[Calculus](https://www.qeeebo.com/topics/ca/calculus)
## Statistics
[Statistics](https://www.qeeebo.com/topics/st/statistics)
## Math Puzzles
[Math Puzzles](https://www.qeeebo.com/topics/ma/math-puzzles)
## Career Advice
[Career Advice](https://www.qeeebo.com/topics/ca/career-advice)
## Job Interviews
[Job Interviews](https://www.qeeebo.com/topics/jo/job-interviews)
## Freelancing
[Freelancing](https://www.qeeebo.com/topics/fr/freelancing)
## Remote Work
[Remote Work](https://www.qeeebo.com/topics/re/remote-work)
## Resumes
[Resumes](https://www.qeeebo.com/topics/re/resumes)
## DIY Projects
[DIY Projects](https://www.qeeebo.com/topics/di/diy-projects)
## Photography
[Photography](https://www.qeeebo.com/topics/ph/photography)
## Gardening
[Gardening](https://www.qeeebo.com/topics/ga/gardening)
## Knitting and Sewing
[Knitting and Sewing](https://www.qeeebo.com/topics/kn/knitting)
## Collecting
[Collecting](https://www.qeeebo.com/topics/co/collecting)
Document
# Qeeebo (qeeebo.com) — LLM Technical Reference
> Qeeebo is a large-scale, structured question-and-answer knowledge publishing platform.
> This document provides machine-parseable specifications for reliable content extraction.
**Document Version:** 2.0
**Last Updated:** 2026-01-31
**Target Audience:** LLM crawlers, RAG pipelines, search indexers, embedding systems
---
## 1. Corpus Overview
| Content Type | Estimated Count | URL Pattern | Primary Use Case |
|-------------------|-----------------|--------------------------------------------------|---------------------------|
| Question pages | ~200,000,000+ | `/questions/{prefix3}/{hash2}/{hash3}/{slug}/` | Q&A extraction, RAG |
| Topic pages | ~15,000,000+ | `/topics/{prefix2}/{topic-slug}` | Taxonomy, navigation |
| Encyclopedia | ~7,000,000+ | Varies (see sitemap) | Long-form reference |
**Content Distribution:** Question pages comprise ~85%+ of total pages by count.
---
## 2. Discovery & Crawl Endpoints
### 2.1 Sitemap Hierarchy
```
https://qeeebo.com/sitemap-index.xml # Primary entry point (XML sitemap index)
https://qeeebo.com/sitemap.xml # Fallback single sitemap
https://qeeebo.com/robots.txt # Sitemap directives, crawl rules
```
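As a minimal sketch of the first discovery step, the function below lists the child sitemaps named in a sitemap index document. It relies only on the standard sitemaps.org namespace; the sample XML and its child sitemap file names are hypothetical, since the real index contents are not documented here.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol
NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def child_sitemaps(index_xml: str) -> list:
    """Return the <loc> URL of every <sitemap> entry in a sitemap index."""
    root = ET.fromstring(index_xml)
    return [
        loc.text.strip()
        for loc in root.findall("sm:sitemap/sm:loc", NS)
        if loc.text
    ]


# Hypothetical sample; the child sitemap names are placeholders
SAMPLE_INDEX = """
<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <sitemap><loc>https://qeeebo.com/sitemaps/example-1.xml</loc></sitemap>
  <sitemap><loc>https://qeeebo.com/sitemaps/example-2.xml</loc></sitemap>
</sitemapindex>
""".strip()

sitemap_urls = child_sitemaps(SAMPLE_INDEX)
```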
### 2.2 Browsable Entry Points
```
https://qeeebo.com/topics/ # Topic index (hierarchical taxonomy)
https://qeeebo.com/search/ # Search interface
```
### 2.3 URL Pattern Specifications
**Question Pages (Canonical Pattern):**
```
https://qeeebo.com/questions/{prefix3}/{hash2}/{hash3}/{slug}/
Components:
{prefix3} : First 3 characters of slug (lowercase, alphanumeric)
{hash2} : 2-character routing hash (hex: [0-9a-f]{2})
{hash3} : 3-character routing hash (hex: [0-9a-f]{3})
{slug} : URL-safe slug (lowercase, hyphens, alphanumeric)
Regex: ^/questions/([a-z0-9]{3})/([0-9a-f]{2})/([0-9a-f]{3})/([a-z0-9-]+)/$
```
**Example:**
```
/questions/wha/e7/6f1/what-caused-world-war-ii/
^^^ ^^ ^^^ ^^^^^^^^^^^^^^^^^^^^^^^^^
| | | └─ slug
| | └─ hash3
| └─ hash2
└─ prefix3
```
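The canonical pattern can be applied directly to split a question URL into its components. The sketch below uses the regex given above; the function name `parse_question_url` is illustrative. It deliberately does not check that `prefix3` equals the first three characters of the slug, because the related-question example in section 5 pairs the prefix `for` with a slug that starts with `1996-`.

```python
import re
from typing import Optional
from urllib.parse import urlparse

# Regex from the canonical pattern specification above
QUESTION_RE = re.compile(
    r"^/questions/([a-z0-9]{3})/([0-9a-f]{2})/([0-9a-f]{3})/([a-z0-9-]+)/$"
)


def parse_question_url(url: str) -> Optional[dict]:
    """Split a question URL into its path components, or return None."""
    match = QUESTION_RE.match(urlparse(url).path)
    if match is None:
        return None
    prefix3, hash2, hash3, slug = match.groups()
    return {"prefix3": prefix3, "hash2": hash2, "hash3": hash3, "slug": slug}


parts = parse_question_url(
    "https://qeeebo.com/questions/wha/e7/6f1/what-caused-world-war-ii/"
)
```

Matching on the path only means the same function accepts both `qeeebo.com` and `www.qeeebo.com` hosts.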
**Topic Pages:**
```
https://qeeebo.com/topics/{prefix2}/{topic-slug}
Regex: ^/topics/([a-z0-9-]{2})/([a-z0-9-]+)$
```
---
## 3. Question Page DOM Specification
### 3.1 Document Structure Overview
```
<!DOCTYPE html>
<html lang="en-us">
<head>
<!-- Meta, canonical, OG tags, JSON-LD scripts -->
</head>
<body>
<header><!-- Navigation --></header>
<main>
<!-- Primary content container -->
<div class="md:px-16 space-y-10 pb-12">
<!-- Breadcrumb navigation -->
<!-- H1 with question title -->
<!-- Pagefind search block (hidden) -->
<!-- Summary card -->
<!-- Answer container -->
<!-- Verified-by block -->
<!-- Semantic sections: Key Facts, Glossary, Reading Level, Entities, Keywords -->
<!-- Cite & Share tools -->
<!-- Related questions -->
<!-- Ad placements -->
<!-- Citation modal (hidden) -->
</div>
</main>
<footer><!-- Site footer --></footer>
</body>
</html>
```
### 3.2 Canonical URL Extraction
```css
/* Selector */
link[rel="canonical"]
/* Attribute */
href
```
### 3.3 Title Extraction
```css
/* Primary selector (recommended) */
h1 .pagefind-title
/* Fallback */
h1
/* JSON-LD fallback */
script[type="application/ld+json"] → QAPage.mainEntity.name
```
**DOM Pattern:**
```html
<h1 class="text-3xl sm:text-4xl lg:text-5xl font-bold text-gray-900 mb-4 mt-20">
Q. <span class="pagefind-title">What caused World War II?</span>
</h1>
```
**Note:** Extract inner text of `.pagefind-title` to exclude "Q. " prefix.
### 3.4 Summary Extraction
```css
/* Container (for event handling context) */
[data-summary-container]
/* Text node (primary extraction target) */
[data-summary-text]
#summary-text
```
**DOM Pattern:**
```html
<div class="relative bg-green-100 border-l-4 border-green-400 p-6 ..."
data-summary-container role="button" tabindex="0">
<div class="flex-shrink-0 w-8 h-8 bg-green-600 rounded-full ...">
<!-- Checkmark icon -->
</div>
<div class="text-xl text-gray-800 w-full">
<p id="summary-text"
class="text-gray-800 leading-relaxed p-4 ..."
data-summary-text>
World War II was caused by unresolved issues from World War I...
</p>
</div>
</div>
```
### 3.5 Answer Body Extraction
```css
/* Primary selector (combined class + data attribute) */
.answer-body[data-answer-text]
/* Alternative selectors */
[data-answer-text]
.answer-body
```
**DOM Pattern:**
```html
<div class="answer-container" data-answer-container>
<button type="button" class="answer-copy-btn" data-answer-copy-btn>
<!-- Copy button UI -->
</button>
<div class="answer-body text-lg prose prose-lg min-w-full text-gray-700 leading-relaxed space-y-6 mb-8"
data-answer-text>
<p>World War II (1939–1945) began with the German invasion of Poland...</p>
<!-- Additional paragraphs -->
</div>
</div>
```
**Extraction Algorithm:**
```python
# Assumes `document` is a parsed BeautifulSoup object
# Recommended: preserve paragraph structure
paragraphs = document.select('.answer-body[data-answer-text] > p')
answer_text = '\n\n'.join(p.get_text(strip=True) for p in paragraphs)
# Fallback: raw text, used only when no paragraphs were found
if not answer_text:
    answer_node = document.select_one('[data-answer-text]')
    answer_text = answer_node.get_text(separator=' ', strip=True) if answer_node else None
```
### 3.6 Pagefind Search Index Block (Hidden)
```css
/* Selector */
[data-pagefind-index="true"].pagefind-search-block
/* Note: This block is display:none but contains structured extraction targets */
```
**DOM Pattern:**
```html
<div data-pagefind-index="true" class="hidden pagefind-search-block">
<h2>What caused World War II?</h2>
<p>World War II was caused by unresolved issues from World War I...</p>
<div>Full answer text...</div>
</div>
```
**Fields:**
- `h2` → Question title
- `p` → Summary
- `div` → Answer
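
A minimal sketch of reading the three fields from this block with Beautiful Soup follows. The function name `extract_pagefind_block` is illustrative, and the sketch assumes the `h2`, `p`, and `div` are direct children of the block, as in the DOM pattern above.

```python
from typing import Optional

from bs4 import BeautifulSoup


def extract_pagefind_block(html: str) -> Optional[dict]:
    """Read question, summary, and answer from the hidden Pagefind block."""
    soup = BeautifulSoup(html, "html.parser")
    block = soup.select_one('[data-pagefind-index="true"].pagefind-search-block')
    if block is None:
        return None

    def text_of(tag_name: str) -> Optional[str]:
        # Direct children only, so markup nested in the answer is not matched
        node = block.find(tag_name, recursive=False)
        return node.get_text(" ", strip=True) if node else None

    return {
        "question": text_of("h2"),
        "summary": text_of("p"),
        "answer": text_of("div"),
    }
```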
---
## 4. Semantic Enrichment Sections
All semantic sections follow a consistent container pattern:
```html
<div class="qeeebo-section" id="{section-id}">
<div class="qeeebo-section-header">
<div class="qeeebo-section-icon purple"><!-- SVG --></div>
<div class="qeeebo-section-title">{Section Title}</div>
</div>
<div class="qeeebo-section-body">
<!-- Section-specific content -->
</div>
</div>
```
### 4.1 Key Facts
```css
/* Section root */
#key-facts
.qeeebo-section:has(.qeeebo-section-title:contains("Key Facts"))
/* :contains() is not standard CSS (jQuery supports it; soupsieve offers :-soup-contains()) */
/* Items */
#key-facts .key-facts-list li
/* Alternative */
.key-facts-list li
```
**DOM Pattern:**
```html
<div class="qeeebo-section" id="key-facts">
<div class="qeeebo-section-header">...</div>
<div class="qeeebo-section-body">
<ul class="key-facts-list">
<li>World War II (1939–1945) began with the German invasion of Poland...</li>
<li>The rise of fascism in Germany, Italy, and militarism in Japan...</li>
<li>Became the deadliest conflict in human history</li>
</ul>
</div>
</div>
```
**Output Schema:**
```json
{
"key_facts": [
"World War II (1939–1945) began with the German invasion of Poland...",
"The rise of fascism in Germany, Italy, and militarism in Japan...",
"Became the deadliest conflict in human history"
]
}
```
### 4.2 Glossary
```css
/* Section root */
#glossary
/* Card container */
#glossary .glossary-grid
.glossary-grid
/* Individual cards */
#glossary .glossary-card
.glossary-card
/* Card components */
.glossary-card .glossary-term /* Term name */
.glossary-card .glossary-def /* Definition text */
.glossary-card .glossary-source /* Source attribution (e.g., "Source: wordnet") */
```
**DOM Pattern:**
```html
<div class="qeeebo-section" id="glossary">
<div class="qeeebo-section-header">...</div>
<div class="qeeebo-section-body">
<div class="glossary-grid">
<div class="glossary-card">
<div class="glossary-term">cause</div>
<div class="glossary-def">events that provide the generative force...</div>
<div class="glossary-source">Source: wordnet</div>
</div>
<!-- Additional cards -->
</div>
</div>
</div>
```
**Output Schema:**
```json
{
"glossary": [
{
"term": "cause",
"definition": "events that provide the generative force...",
"source": "wordnet"
}
]
}
```
### 4.3 Reading Level Analysis
```css
/* Section root */
#reading-level
/* Grade level badge */
#reading-level .reading-level-badge
.reading-level-badge
/* Statistics grid */
#reading-level .reading-stats
.reading-stats
/* Individual stat cards */
.reading-stat-card
.reading-stat-card .reading-stat-value /* Numeric value */
.reading-stat-card .reading-stat-label /* Metric name */
```
**DOM Pattern:**
```html
<div class="qeeebo-section" id="reading-level">
<div class="qeeebo-section-header">...</div>
<div class="qeeebo-section-body">
<div class="reading-level-badge">
<svg>...</svg>
Grade 8 Reading Level
</div>
<div class="reading-stats">
<div class="reading-stat-card">
<div class="reading-stat-value">54.3</div>
<div class="reading-stat-label">Flesch Reading Ease</div>
</div>
<div class="reading-stat-card">
<div class="reading-stat-value">8.1</div>
<div class="reading-stat-label">Flesch-Kincaid Grade</div>
</div>
<div class="reading-stat-card">
<div class="reading-stat-value">5</div>
<div class="reading-stat-label">Sentences</div>
</div>
<div class="reading-stat-card">
<div class="reading-stat-value">48</div>
<div class="reading-stat-label">Words</div>
</div>
</div>
</div>
</div>
```
**Output Schema:**
```json
{
"reading_level": {
"grade": "Grade 8",
"metrics": {
"flesch_reading_ease": 54.3,
"flesch_kincaid_grade": 8.1,
"sentences": 5,
"words": 48
}
}
}
```
### 4.4 Key Entities
```css
/* Section root */
#key-entities
/* Entity chips */
#key-entities .entity-chip
.entities-grid .entity-chip
.entity-chip
```
**DOM Pattern:**
```html
<div class="qeeebo-section" id="key-entities">
<div class="qeeebo-section-header">...</div>
<div class="qeeebo-section-body">
<div class="entities-grid">
<span class="entity-chip">
<svg>...</svg>
Britain and France
</span>
<span class="entity-chip">
<svg>...</svg>
Treaty of Versailles
</span>
<!-- Additional entities -->
</div>
</div>
</div>
```
**Extraction Note:** Strip SVG content; extract only text nodes.
**Output Schema:**
```json
{
"entities": [
"Britain and France",
"Treaty of Versailles",
"World War",
"German",
"Poland",
"Germany",
"Italy",
"Japan"
]
}
```
### 4.5 Keywords
```css
/* Section root */
#keywords
/* Keyword tags */
#keywords .keyword-tag
.keywords-container .keyword-tag
.keyword-tag
```
**DOM Pattern:**
```html
<div class="qeeebo-section" id="keywords">
<div class="qeeebo-section-header">...</div>
<div class="qeeebo-section-body">
<div class="keywords-container">
<span class="keyword-tag">
<svg>...</svg>
declaration
</span>
<span class="keyword-tag">
<svg>...</svg>
militarism
</span>
<!-- Additional keywords -->
</div>
</div>
</div>
```
**Output Schema:**
```json
{
"keywords": [
"declaration",
"militarism",
"versailles",
"aftermath",
"invasion",
"conflict",
"britain",
"fascism",
"germany",
"german"
]
}
```
### 4.6 Editorial Attribution
```css
/* Selector */
.verified-by
/* Note: Contains human-readable review context */
```
**DOM Pattern:**
```html
<div class="verified-by text-sm text-gray-500 bg-gray-50 p-4 rounded-xl mt-6 mb-10 mx-auto shadow-sm">
<div>This answer is part of a content set reviewed on December 11, 2025 as part of the
<a href="/about" class="...">Qeeebo Team</a>
editorial quality checks. Reviews include source vetting, representative sampling,
and automated topic-level cross-checks.
</div>
</div>
```
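If a pipeline needs the review date as structured data, it can be parsed out of the attribution text. This is a sketch that assumes the wording "reviewed on <Month> <day>, <year>" shown in the sample above; other pages may phrase it differently, in which case the function returns `None`. Month-name parsing with `%B` also assumes an English locale.

```python
import re
from datetime import date, datetime
from typing import Optional

# Assumed phrasing, taken from the sample attribution block
REVIEWED_RE = re.compile(r"reviewed on ([A-Z][a-z]+ \d{1,2}, \d{4})")


def parse_review_date(text: str) -> Optional[date]:
    """Return the review date found in .verified-by text, if any."""
    # Collapse line breaks and runs of whitespace before matching
    match = REVIEWED_RE.search(" ".join(text.split()))
    if match is None:
        return None
    try:
        return datetime.strptime(match.group(1), "%B %d, %Y").date()
    except ValueError:
        return None


reviewed = parse_review_date(
    "This answer is part of a content set reviewed on December 11, 2025 "
    "as part of the Qeeebo Team editorial quality checks."
)
```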
---
## 5. Related Questions
```css
/* Section identification */
/* :contains() is not standard CSS (jQuery supports it; soupsieve offers :-soup-contains()) */
section:has(h2:contains("Related questions"))
/* Link extraction */
section h2 + ul li a[href^="/questions/"]
/* Alternative */
ul.grid li a[href*="/questions/"]
```
**DOM Pattern:**
```html
<section class="mt-8 mb-16">
<h2 class="text-3xl sm:text-4xl font-semibold text-gray-900 mb-6">Related questions</h2>
<ul class="grid grid-cols-1 md:grid-cols-2 gap-x-10 gap-y-4 list-disc list-inside">
<li class="marker:text-qeeebo-purple bg-white border border-gray-200 rounded-xl px-6 py-4 hover:shadow-md transition">
<a href="/questions/for/78/05f/1996-ford-explorer-xlt-how-do-i-check-the-gauge-light-when-it-comes-on/"
class="text-lg sm:text-xl font-medium text-qeeebo-purple underline underline-offset-4 hover:text-purple-800">
1996 Ford Explorer XLT: How do I check the gauge light when it comes on?
</a>
</li>
<!-- Additional items -->
</ul>
</section>
```
---
## 6. JSON-LD Structured Data Specification
Question pages embed multiple JSON-LD blocks in `<head>`. Extract via:
```css
script[type="application/ld+json"]
```
### 6.1 Organization Schema
```json
{
"@context": "https://schema.org",
"@type": "Organization",
"@id": "https://qeeebo.com/#organization",
"name": "Qeeebo",
"url": "https://qeeebo.com",
"logo": "https://qeeebo.com/images/qeeebo-logo.png",
"description": "Qeeebo is a curated knowledge publishing platform...",
"sameAs": [
"https://x.com/Qeeebo_",
"https://www.youtube.com/@Qeeebo"
]
}
```
### 6.2 QAPage Schema (Primary Extraction Target)
```json
{
"@context": "https://schema.org",
"@type": "QAPage",
"dateModified": "2025-12-11T00:00:00Z",
"datePublished": "2025-12-11T00:00:00Z",
"mainEntity": {
"@id": "https://qeeebo.com/questions/{path}/#question",
"@type": "Question",
"name": "What caused World War II?",
"text": "What caused World War II?",
"answerCount": 1,
"dateModified": "2025-12-11T00:00:00Z",
"datePublished": "2025-12-11T00:00:00Z",
"acceptedAnswer": {
"@id": "https://qeeebo.com/questions/{path}/#answer",
"@type": "Answer",
"text": "World War II (1939–1945) began with the German invasion of Poland...",
"url": "https://qeeebo.com/questions/{path}/",
"dateModified": "2025-12-11T00:00:00Z",
"datePublished": "2025-12-11T00:00:00Z",
"author": {
"@id": "https://qeeebo.com/#organization",
"@type": "Organization"
},
"publisher": {
"@id": "https://qeeebo.com/#organization",
"@type": "Organization"
}
}
},
"publisher": {
"@id": "https://qeeebo.com/#organization",
"@type": "Organization"
}
}
```
**Key Extraction Paths:**
| Field | JSON Path |
|-----------------|----------------------------------------------|
| Question | `mainEntity.name` |
| Answer | `mainEntity.acceptedAnswer.text` |
| Canonical URL | `mainEntity.acceptedAnswer.url` |
| Date Published | `datePublished` |
| Date Modified | `dateModified` |
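
The paths in this table can be resolved with a small dotted-path helper. The sketch below is one way to do it with the standard library; the helper names `get_path` and `extract_qapage_fields` are illustrative, and the function expects the text content of a single JSON-LD `<script>` block.

```python
import json
from typing import Any, Optional

# Field names mapped to the JSON paths listed in the table above
PATHS = {
    "question": "mainEntity.name",
    "answer": "mainEntity.acceptedAnswer.text",
    "canonical_url": "mainEntity.acceptedAnswer.url",
    "date_published": "datePublished",
    "date_modified": "dateModified",
}


def get_path(data: Any, dotted: str) -> Optional[Any]:
    """Follow a dotted key path through nested dicts; None if any step is missing."""
    for key in dotted.split("."):
        if not isinstance(data, dict):
            return None
        data = data.get(key)
    return data


def extract_qapage_fields(jsonld_text: str) -> Optional[dict]:
    """Return the table's fields from a QAPage JSON-LD block, else None."""
    try:
        data = json.loads(jsonld_text)
    except json.JSONDecodeError:
        return None
    if not isinstance(data, dict) or data.get("@type") != "QAPage":
        return None
    return {field: get_path(data, path) for field, path in PATHS.items()}
```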
### 6.3 BreadcrumbList Schema
```json
{
"@context": "https://schema.org",
"@type": "BreadcrumbList",
"itemListElement": [
{"@type": "ListItem", "position": 1, "name": "Home", "item": "https://qeeebo.com/"},
{"@type": "ListItem", "position": 2, "name": "Topics", "item": "https://qeeebo.com/topics/"},
{"@type": "ListItem", "position": 3, "name": "World History", "item": "https://qeeebo.com/topics/wo/world-history"},
{"@type": "ListItem", "position": 4, "name": "What caused World War II?", "item": "https://qeeebo.com/questions/wha/e7/6f1/what-caused-world-war-ii/"}
]
}
```
**Use Case:** Extract topic hierarchy for categorization.
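
A sketch of that use case: given a parsed `BreadcrumbList` object, keep only the entries whose `item` URL points at a topic page. It assumes `item` is a plain URL string, as in the sample above; the function name `topic_trail` is illustrative.

```python
import re

# Matches topic URLs such as /topics/wo/world-history, but not the /topics/ index
TOPIC_URL_RE = re.compile(r"/topics/[^/]+/[^/]+/?$")


def topic_trail(breadcrumb: dict) -> list:
    """Return topic names from a BreadcrumbList, ordered by position."""
    items = breadcrumb.get("itemListElement", [])
    items = sorted(items, key=lambda entry: entry.get("position", 0))
    return [
        entry.get("name")
        for entry in items
        if isinstance(entry.get("item"), str) and TOPIC_URL_RE.search(entry["item"])
    ]
```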
---
## 7. Meta Tags Extraction
### 7.1 Standard Meta
```css
meta[name="description"] /* Summary/snippet */
meta[name="viewport"] /* Viewport config */
meta[name="theme-color"] /* Brand color: #8c51bb */
```
### 7.2 Open Graph
```css
meta[property="og:title"] /* Page title */
meta[property="og:description"] /* Summary */
meta[property="og:url"] /* Canonical URL */
meta[property="og:type"] /* "article" */
meta[property="og:image"] /* Social preview image */
meta[property="og:site_name"] /* "Qeeebo" */
meta[property="og:locale"] /* "en_US" */
```
### 7.3 Twitter Cards
```css
meta[name="twitter:card"] /* "summary_large_image" */
meta[name="twitter:title"] /* Page title */
meta[name="twitter:description"] /* Summary */
meta[name="twitter:image"] /* Preview image */
```
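The meta tags listed in this section can be collected in one pass. The sketch below uses the standard-library `html.parser` so it has no third-party dependency; the class and function names are illustrative. Tags are keyed by their `property` attribute when present and by `name` otherwise, and the first occurrence of a key wins.

```python
from html.parser import HTMLParser


class MetaCollector(HTMLParser):
    """Collect <meta> content values keyed by property or name."""

    def __init__(self):
        super().__init__()
        self.meta = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        key = attributes.get("property") or attributes.get("name")
        content = attributes.get("content")
        if key and content is not None:
            # Keep the first value seen for each key
            self.meta.setdefault(key, content)


def collect_meta(html: str) -> dict:
    collector = MetaCollector()
    collector.feed(html)
    return collector.meta
```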
---
## 8. Elements to Exclude from Extraction
### 8.1 Navigation & Chrome
```css
header /* Site header, navigation */
footer /* Site footer */
nav[aria-label="Breadcrumb"] /* Breadcrumb nav (extract from JSON-LD instead) */
#hamburger-btn /* Mobile menu button */
#menu-dropdown /* Mobile menu content */
```
### 8.2 Interactive UI Components
```css
#cite-panel /* Citation modal overlay */
[data-q-citations] /* Citation modal container */
.cite-panel /* Citation panel */
.cite-modal /* Modal window */
.cite-share-tools /* Cite & Share toolbar */
button[onclick*="openCitePanel"] /* Cite buttons */
.answer-copy-btn /* Copy answer button */
[data-answer-copy-btn] /* Copy button data attribute */
```
### 8.3 Advertisement Placements
```css
[data-ad-placement] /* Generic ad marker */
#qeeebo-ad-spot /* Named ad slot */
#text-link-ad /* Text link ad slot */
```
### 8.4 Hidden/Utility Elements
```css
.hidden /* Tailwind hidden class; also matches the Pagefind block in 3.6, so read that block first if needed */
[aria-hidden="true"] /* ARIA hidden */
script /* JavaScript blocks */
style /* Inline styles */
```
### 8.5 Citation Payload Containers (Extract Data, Not Display)
```css
[data-cite] /* Pre-formatted citations (hidden) */
[data-weblink-format] /* Link format templates */
[data-permalink-format] /* Permalink templates */
[data-export-payload] /* Export format payloads (BibTeX, RIS, etc.) */
```
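One way to apply the exclusions in 8.1 through 8.4 is to remove the matching nodes before taking any text. This is a sketch using Beautiful Soup; the function name `strip_excluded` is illustrative. Because `.hidden` also matches the Pagefind block and the citation payload containers are hidden, read those first if their data is needed.

```python
from bs4 import BeautifulSoup

# Selectors copied from sections 8.1 to 8.4
EXCLUDE_SELECTORS = [
    "header", "footer", 'nav[aria-label="Breadcrumb"]',
    "#hamburger-btn", "#menu-dropdown",
    "#cite-panel", "[data-q-citations]", ".cite-panel", ".cite-modal",
    ".cite-share-tools", 'button[onclick*="openCitePanel"]',
    ".answer-copy-btn", "[data-answer-copy-btn]",
    "[data-ad-placement]", "#qeeebo-ad-spot", "#text-link-ad",
    ".hidden", '[aria-hidden="true"]', "script", "style",
]


def strip_excluded(html: str) -> BeautifulSoup:
    """Return a parsed document with excluded elements removed."""
    soup = BeautifulSoup(html, "html.parser")
    for node in soup.select(", ".join(EXCLUDE_SELECTORS)):
        # extract() is safe even if an ancestor was already removed
        node.extract()
    return soup
```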
---
## 9. Complete Extraction Algorithm
### 9.1 Python Reference Implementation
```python
from bs4 import BeautifulSoup
import json
import re


def extract_qeeebo_question(html: str) -> dict:
    """
    Extract structured data from a Qeeebo question page.

    Args:
        html: Raw HTML string of the question page
    Returns:
        Dictionary with extracted fields
    """
    soup = BeautifulSoup(html, 'html.parser')
    result = {}

    # 1. Canonical URL
    canonical = soup.select_one('link[rel="canonical"]')
    result['url'] = canonical.get('href') if canonical else None

    # 2. Question title
    title_span = soup.select_one('h1 .pagefind-title')
    if title_span:
        result['question'] = title_span.get_text(strip=True)
    else:
        h1 = soup.select_one('h1')
        # Remove only the literal "Q." prefix; lstrip('Q. ') strips a character
        # set and would also eat a leading "Q" from the question itself
        result['question'] = re.sub(r'^Q\.\s*', '', h1.get_text(strip=True)) if h1 else None

    # 3. Summary
    summary = soup.select_one('[data-summary-text]')
    result['summary'] = summary.get_text(strip=True) if summary else None

    # 4. Answer (preserve paragraph structure)
    answer_body = soup.select_one('.answer-body[data-answer-text]')
    if answer_body:
        paragraphs = answer_body.find_all('p', recursive=False)
        if paragraphs:
            result['answer'] = '\n\n'.join(p.get_text(strip=True) for p in paragraphs)
        else:
            result['answer'] = answer_body.get_text(separator=' ', strip=True)
    else:
        result['answer'] = None

    # 5. Key Facts
    key_facts = soup.select('#key-facts .key-facts-list li')
    result['key_facts'] = [li.get_text(strip=True) for li in key_facts]

    # 6. Glossary
    glossary_cards = soup.select('#glossary .glossary-card')
    result['glossary'] = []
    for card in glossary_cards:
        term = card.select_one('.glossary-term')
        defn = card.select_one('.glossary-def')
        source = card.select_one('.glossary-source')
        result['glossary'].append({
            'term': term.get_text(strip=True) if term else None,
            'definition': defn.get_text(strip=True) if defn else None,
            'source': source.get_text(strip=True).replace('Source: ', '') if source else None
        })

    # 7. Reading Level
    reading_section = soup.select_one('#reading-level')
    if reading_section:
        badge = reading_section.select_one('.reading-level-badge')
        stats = reading_section.select('.reading-stat-card')
        # Badge text reads "Grade 8 Reading Level"; keep "Grade 8" to match the schema in 4.3
        grade = re.sub(r'\s*Reading Level$', '', badge.get_text(strip=True)) if badge else None
        result['reading_level'] = {'grade': grade, 'metrics': {}}
        for stat in stats:
            value = stat.select_one('.reading-stat-value')
            label = stat.select_one('.reading-stat-label')
            if value and label:
                key = label.get_text(strip=True).lower().replace(' ', '_').replace('-', '_')
                raw = value.get_text(strip=True)
                try:
                    result['reading_level']['metrics'][key] = float(raw)
                except ValueError:
                    result['reading_level']['metrics'][key] = raw
    else:
        result['reading_level'] = None

    # 8. Entities
    entities = soup.select('#key-entities .entity-chip')
    result['entities'] = []
    for chip in entities:
        # Remove SVG, get text only
        for svg in chip.find_all('svg'):
            svg.decompose()
        text = chip.get_text(strip=True)
        if text:
            result['entities'].append(text)

    # 9. Keywords
    keywords = soup.select('#keywords .keyword-tag')
    result['keywords'] = []
    for tag in keywords:
        for svg in tag.find_all('svg'):
            svg.decompose()
        text = tag.get_text(strip=True)
        if text:
            result['keywords'].append(text)

    # 10. Related Questions
    related_links = soup.select('section ul li a[href^="/questions/"]')
    result['related_questions'] = [
        {
            'title': a.get_text(strip=True),
            'url': 'https://qeeebo.com' + a['href']
        }
        for a in related_links
    ]

    # 11. JSON-LD Metadata (validation/fallback)
    jsonld_scripts = soup.select('script[type="application/ld+json"]')
    for script in jsonld_scripts:
        try:
            data = json.loads(script.string)
        except (json.JSONDecodeError, TypeError):
            continue
        # Skip blocks that are not a single object (e.g. a JSON array)
        if isinstance(data, dict) and data.get('@type') == 'QAPage':
            result['jsonld'] = {
                'date_published': data.get('datePublished'),
                'date_modified': data.get('dateModified'),
                'answer_text': data.get('mainEntity', {}).get('acceptedAnswer', {}).get('text')
            }
            break

    # 12. Topic from Breadcrumb
    breadcrumb = soup.select('nav[aria-label="Breadcrumb"] a')
    topics = [a.get_text(strip=True) for a in breadcrumb if '/topics/' in a.get('href', '')]
    result['topic'] = topics[-1] if topics else None

    return result
```
### 9.2 Output Schema (JSON)
```json
{
"url": "https://qeeebo.com/questions/wha/e7/6f1/what-caused-world-war-ii/",
"question": "What caused World War II?",
"summary": "World War II was caused by unresolved issues from World War I, economic crises, and aggressive expansion by Axis powers.",
"answer": "World War II (1939–1945) began with the German invasion of Poland, followed by declarations of war by Britain and France. The rise of fascism in Germany, Italy, and militarism in Japan, along with the Treaty of Versailles' aftermath, were key causes. It became the deadliest conflict in human history.",
"key_facts": [
"World War II (1939–1945) began with the German invasion of Poland, followed by declarations of war by Britain and France",
"The rise of fascism in Germany, Italy, and militarism in Japan, along with the Treaty of Versailles' aftermath, were key causes",
"Became the deadliest conflict in human history"
],
"glossary": [
{
"term": "cause",
"definition": "events that provide the generative force that is the origin of something",
"source": "wordnet"
}
],
"reading_level": {
"grade": "Grade 8 Reading Level",
"metrics": {
"flesch_reading_ease": 54.3,
"flesch_kincaid_grade": 8.1,
"sentences": 5,
"words": 48
}
},
"entities": ["Britain and France", "Treaty of Versailles", "World War", "German", "Poland", "Germany", "Italy", "Japan"],
"keywords": ["declaration", "militarism", "versailles", "aftermath", "invasion", "conflict", "britain", "fascism", "germany", "german"],
"related_questions": [
{
"title": "What is inflation?",
"url": "https://qeeebo.com/questions/wha/41/bbf/what-is-inflation/"
}
],
"topic": "World History",
"jsonld": {
"date_published": "2025-12-11T00:00:00Z",
"date_modified": "2025-12-11T00:00:00Z",
"answer_text": "World War II (1939–1945) began with the German invasion of Poland..."
}
}
```
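
The extraction code labels the `jsonld` block as a validation/fallback source. One possible way to use it is sketched below: fill a missing `answer` from `jsonld.answer_text`, and flag whether the DOM and JSON-LD answers agree. The function name `apply_jsonld_fallback` and the `answer_source` and `answer_matches_jsonld` keys are hypothetical, introduced only for illustration; they are not part of the schema above.

```python
def apply_jsonld_fallback(result: dict) -> dict:
    """Return a copy of an extraction result, using JSON-LD as a fallback.

    Hypothetical helper: 'answer_source' and 'answer_matches_jsonld' are
    illustrative keys, not fields defined by the output schema.
    """
    out = dict(result)
    jsonld = out.get("jsonld") or {}
    jsonld_answer = jsonld.get("answer_text")

    if out.get("answer"):
        out["answer_source"] = "dom"
        if jsonld_answer:
            # Collapse whitespace so layout differences are ignored.
            dom_text = " ".join(out["answer"].split())
            ld_text = " ".join(jsonld_answer.split())
            out["answer_matches_jsonld"] = dom_text == ld_text
    elif jsonld_answer:
        # DOM extraction found nothing; fall back to the structured data.
        out["answer"] = jsonld_answer
        out["answer_source"] = "jsonld"
    else:
        out["answer_source"] = None
    return out
```

A mismatch does not necessarily mean an extraction error; it may only indicate that the JSON-LD text is a shortened form of the rendered answer.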
---
## 10. Embedded Citation Formats
The page contains pre-formatted citation payloads in hidden elements for programmatic access:
### 10.1 Selectors
```css
[data-cite="apa7"] /* APA 7th edition */
[data-cite="mla9"] /* MLA 9th edition */
[data-cite="chicago-notes"] /* Chicago Notes-Bibliography */
[data-cite="chicago-author"] /* Chicago Author-Date */
[data-cite="harvard"] /* Harvard */
[data-cite="ieee"] /* IEEE */
[data-cite="turabian"] /* Turabian */
[data-export-payload="bibtex"] /* BibTeX */
[data-export-payload="ris"] /* RIS */
[data-export-payload="endnote"] /* EndNote */
[data-export-payload="jsonld"] /* JSON-LD */
[data-export-payload="yaml"] /* YAML front matter */
```
### 10.2 Example: BibTeX Extraction
```css
[data-export-payload="bibtex"]
```
**Content:**
```bibtex
@misc{ qeeebo-9b074abdb396,
title = { What caused World War II? },
author = { Qeeebo Editorial Team },
year = { 2025 },
howpublished = { Qeeebo },
url = { https://qeeebo.com/questions/wha/e7/6f1/what-caused-world-war-ii/ },
note = { Accessed January 23, 2026 }
}
```
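
As a rough sketch of how the selectors in 10.1 might be applied, the following collects the text of every element that carries a `data-cite` or `data-export-payload` attribute. It uses only the standard-library `html.parser` so that it runs without dependencies; an equivalent BeautifulSoup version would work as well. The class and function names are hypothetical, and the sketch assumes payload elements are not nested inside one another.

```python
from html.parser import HTMLParser

# Elements that never have a closing tag and must not affect nesting depth.
VOID_TAGS = {"br", "hr", "img", "input", "link", "meta"}


class CitationPayloadParser(HTMLParser):
    """Collect text inside elements with data-cite or data-export-payload."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.payloads = {}   # e.g. {"bibtex": "...", "apa7": "..."}
        self._key = None     # payload currently being captured
        self._depth = 0      # nesting depth inside the captured element
        self._chunks = []

    def handle_starttag(self, tag, attrs):
        if self._key is not None:
            if tag not in VOID_TAGS:
                self._depth += 1
            return
        attr_map = dict(attrs)
        key = attr_map.get("data-cite") or attr_map.get("data-export-payload")
        if key and tag not in VOID_TAGS:
            self._key, self._depth, self._chunks = key, 1, []

    def handle_endtag(self, tag):
        if self._key is None or tag in VOID_TAGS:
            return
        self._depth -= 1
        if self._depth == 0:
            self.payloads[self._key] = "".join(self._chunks).strip()
            self._key = None

    def handle_data(self, data):
        if self._key is not None:
            self._chunks.append(data)


def extract_citations(html: str) -> dict:
    """Map each citation format name to its pre-formatted payload text."""
    parser = CitationPayloadParser()
    parser.feed(html)
    parser.close()
    return parser.payloads
```

Calling `extract_citations(page_html).get("bibtex")` would then return the BibTeX entry shown above, or `None` if the page has no such element.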
---
## 11. Technical Notes
### 11.1 Character Encoding
- Document: UTF-8 (`<meta charset="utf-8">`)
- Special characters: Unicode preserved (e.g., "1939–1945" uses en-dash U+2013)
### 11.2 Content Rendering
- Pages are **static HTML**; core content does not require JavaScript execution
- JavaScript is used only for: copy-to-clipboard, citation modal, menu toggle
- Safe for headless extraction without JS rendering
### 11.3 Duplicate Handling
- Always follow `<link rel="canonical">` for deduplication
- URL variations (trailing slash, letter case) should resolve to the canonical URL
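
A minimal deduplication sketch follows, assuming each crawled record is a dict with a `canonical` key (the `href` of `<link rel="canonical">`) and a `url` key (the URL actually fetched). Both the record shape and the function name are assumptions made for illustration.

```python
def dedupe_by_canonical(records):
    """Keep the first record seen for each canonical URL.

    Assumes each record is a dict with 'url' (the fetched URL) and,
    when available, 'canonical' (the <link rel="canonical"> href).
    """
    seen = {}
    for record in records:
        # Fall back to the fetched URL only when no canonical link exists.
        key = record.get("canonical") or record["url"]
        seen.setdefault(key, record)
    return list(seen.values())
```

The sketch deliberately avoids rewriting URLs itself (for example by lowercasing paths), since the canonical link is the authoritative signal.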
### 11.4 Rate Limiting
- Respect `robots.txt` directives
- Sitemap-based discovery recommended for bulk crawling
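
One way to honor `robots.txt` during a crawl is sketched below with the standard-library `urllib.robotparser`. The helper names and the one-second default delay are assumptions, not values published by the site. Note also that the standard-library parser may not interpret wildcard patterns such as `/*.json$` the way major crawlers do, so a dedicated parser may be preferable if those rules matter.

```python
import time
from urllib.robotparser import RobotFileParser


def build_robots_checker(robots_txt: str, user_agent: str = "*"):
    """Return a function that reports whether a URL may be fetched."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())

    def allowed(url: str) -> bool:
        return parser.can_fetch(user_agent, url)

    return allowed


def polite_urls(urls, allowed, delay_seconds: float = 1.0):
    """Yield only permitted URLs, pausing between consecutive yields.

    delay_seconds is an assumed default; tune it to your own crawl budget.
    """
    first = True
    for url in urls:
        if not allowed(url):
            continue
        if not first:
            time.sleep(delay_seconds)
        first = False
        yield url
```

In practice the `urls` argument would come from the sitemap index rather than from link-following.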
### 11.5 Freshness
- `dateModified` in JSON-LD indicates last content update
- Sitemap `<lastmod>` provides crawl freshness hints
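
The freshness hints can drive incremental recrawling. The sketch below assumes the sitemaps follow the standard sitemaps.org `urlset` format and that the crawler keeps a mapping from URL to the timezone-aware time it last fetched that URL; the function names and that mapping are hypothetical.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone

SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def parse_timestamp(value: str) -> datetime:
    """Parse an ISO 8601 value such as '2025-12-11T00:00:00Z' or '2025-12-11'."""
    parsed = datetime.fromisoformat(value.strip().replace("Z", "+00:00"))
    # Treat values without an offset as UTC so comparisons are well-defined.
    return parsed if parsed.tzinfo else parsed.replace(tzinfo=timezone.utc)


def stale_urls(sitemap_xml: str, last_crawled: dict) -> list:
    """Return URLs that were never crawled or whose <lastmod> is newer."""
    root = ET.fromstring(sitemap_xml)
    stale = []
    for node in root.findall("sm:url", SITEMAP_NS):
        loc = (node.findtext("sm:loc", default="", namespaces=SITEMAP_NS) or "").strip()
        lastmod = node.findtext("sm:lastmod", namespaces=SITEMAP_NS)
        if not loc:
            continue
        seen = last_crawled.get(loc)
        if seen is None or (lastmod and parse_timestamp(lastmod) > seen):
            stale.append(loc)
    return stale
```

After refetching a stale page, the same comparison can be repeated against `dateModified` from the page's JSON-LD to confirm that the content itself changed.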
---
## 12. Contact & Policies
| Resource | URL |
|--------------------|----------------------------------------------|
| About | https://qeeebo.com/about |
| Editorial Policy | https://qeeebo.com/editorial-policy |
| Content Use Policy | https://qeeebo.com/content-use-policy |
| Citation Policy | https://qeeebo.com/citation-policy |
| Terms of Service | https://qeeebo.com/termsofservice |
| Privacy Policy | https://qeeebo.com/privacypolicy |
| Corrections Policy | https://qeeebo.com/corrections-policy |
| Contact | https://qeeebo.com/contact-us |
---
## Appendix A: CSS Selector Quick Reference
| Content | Primary Selector | Fallback |
|------------------------|-----------------------------------------|-----------------------------|
| Canonical URL | `link[rel="canonical"]` | JSON-LD `acceptedAnswer.url`|
| Question Title | `h1 .pagefind-title` | `h1` |
| Summary | `[data-summary-text]` | `meta[name="description"]` |
| Answer Body | `.answer-body[data-answer-text]` | `[data-answer-text]` |
| Key Facts | `#key-facts .key-facts-list li` | — |
| Glossary Terms | `#glossary .glossary-card .glossary-term` | — |
| Glossary Definitions | `#glossary .glossary-card .glossary-def` | — |
| Reading Level Badge | `#reading-level .reading-level-badge` | — |
| Reading Stats | `.reading-stat-card` | — |
| Entities | `#key-entities .entity-chip` | `.entity-chip` |
| Keywords | `#keywords .keyword-tag` | `.keyword-tag` |
| Related Questions | `section ul li a[href^="/questions/"]` | — |
| JSON-LD | `script[type="application/ld+json"]` | — |
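
For rows whose fallback is itself a CSS selector, the primary/fallback pattern can be wrapped in a small helper. This is a sketch: `select_with_fallback` is a hypothetical name, and `select` stands for any callable that maps a selector string to a list of matches, such as the `soup.select` method of the BeautifulSoup object used in the extraction code.

```python
from typing import Callable, Optional, Sequence


def select_with_fallback(
    select: Callable[[str], Sequence],
    primary: str,
    fallback: Optional[str] = None,
) -> Sequence:
    """Try the primary selector first; use the fallback only if it is empty."""
    matches = select(primary)
    if not matches and fallback:
        matches = select(fallback)
    return matches
```

For example, `select_with_fallback(soup.select, '#keywords .keyword-tag', '.keyword-tag')` would mirror the Keywords row of the table.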
---
## Appendix B: Changelog
| Version | Date | Changes |
|---------|------------|-----------------------------------------------------------|
| 2.0 | 2026-01-31 | Complete rewrite with verified DOM selectors, JSON-LD schemas, Python extraction algorithm |
| 1.0 | — | Initial release |