Machine Readiness
Stored receipt and evidence
Samples
No stored offer samples.
Samples
No stored action samples.
Samples
No stored product samples.
Document
# robots.txt for https://www.operabase.com

User-agent: AliyunSecBot
Disallow: /

User-agent: Amazonbot
Disallow: /

User-agent: BaiduSpider
Disallow: /

User-agent: Bytespider
Disallow: /

User-agent: CCBot
Disallow: /

User-agent: DataForSeoBot
Disallow: /

User-agent: Diffbot
Disallow: /

User-agent: Eyeotabot
Disallow: /

User-agent: SeznamBot
Disallow: /

User-agent: Scrapy
Disallow: /

User-agent: TurnitinBot
Disallow: /

User-agent: AlibabaBot
Disallow: /

User-agent: PetalBot
Disallow: /

User-agent: *
Allow: /
Disallow: /login/
Disallow: /seo/
Disallow: /register/
Disallow: /settings/
Disallow: /v4/

Sitemap: https://www.operabase.com/sitemap_static.xml
Sitemap: https://www.operabase.com/sitemap_professions.xml
Sitemap: https://www.operabase.com/sitemap_artists.xml
Sitemap: https://www.operabase.com/sitemap_organizationtypes.xml
Sitemap: https://www.operabase.com/sitemap_organizations.xml
Sitemap: https://www.operabase.com/sitemap_productions.xml
Sitemap: https://www.operabase.com/sitemap_works.xml
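The catch-all group in this file lists `Allow: /` before its `Disallow` lines, so a parser that applies the first matching rule in file order would allow every path. RFC 9309 instead resolves conflicts by the longest matching path, with `Allow` winning a tie. The sketch below hard-codes the rules from the file above and evaluates them that way. The function name `is_allowed` is hypothetical, and user-agent matching is simplified to an exact, case-insensitive comparison of the product token.

```python
# Agents that the file blocks entirely (each group is "Disallow: /").
BLOCKED_AGENTS = {
    "aliyunsecbot", "amazonbot", "baiduspider", "bytespider", "ccbot",
    "dataforseobot", "diffbot", "eyeotabot", "seznambot", "scrapy",
    "turnitinbot", "alibababot", "petalbot",
}

# Rules of the catch-all "User-agent: *" group as (is_allow, path_prefix).
DEFAULT_RULES = [
    (True, "/"),
    (False, "/login/"),
    (False, "/seo/"),
    (False, "/register/"),
    (False, "/settings/"),
    (False, "/v4/"),
]


def is_allowed(user_agent: str, path: str) -> bool:
    """Longest-match evaluation; an Allow rule wins a tie in length."""
    # Simplification: exact token match, not full RFC 9309 agent matching.
    if user_agent.lower() in BLOCKED_AGENTS:
        return False
    best_len, verdict = -1, True  # no matching rule means allowed
    for is_allow, prefix in DEFAULT_RULES:
        if not path.startswith(prefix):
            continue
        if len(prefix) > best_len or (len(prefix) == best_len and is_allow):
            best_len, verdict = len(prefix), is_allow
    return verdict
```

For example, `is_allowed("ExampleBot", "/login/")` is `False` because `/login/` is a longer match than `/`, while `is_allowed("CCBot", "/artists/en")` is `False` because that agent is blocked site-wide.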
Document
# llms.txt — Operabase (operabase.com)
> Purpose: A single, human- and machine-readable reference to help LLMs, AI crawlers, and retrieval systems understand Operabase’s public content, URL patterns, and constraints. This file complements `robots.txt` and XML sitemaps.
Last-Updated: 2026-01-20
Owner: Arts Consolidated Aps — SEO / Platform team
Contact: [seo@operabase.com](mailto:seo@operabase.com)
Canonical Domain: [https://www.operabase.com](https://www.operabase.com)
Default canonical language: `en`
Indexable / SEO languages (canonical; included in sitemaps + hreflang):
* /en — English (default)
* /ru — Русский (Russian)
* /it — Italiano (Italian)
* /fr — Français (French)
* /es — Español (Spanish)
* /de — Deutsch (German)
* /nl — Nederlands (Dutch)
* /sv — Svenska (Swedish)
* /pl — Polski (Polish)
* /pt — Português (Portuguese)
UI languages (non-indexable; user-only; not included in hreflang/sitemaps):
* /bg — Български (Bulgarian)
* /ca — Català (Catalan)
* /cs — Čeština (Czech)
* /da — Dansk (Danish)
* /el — Ελληνικά (Greek)
* /et — Eesti (Estonian)
* /eu — Euskara (Basque)
* /fi — Suomi (Finnish)
* /ga — Gaeilge (Irish)
* /hu — Magyar (Hungarian)
* /is — Íslenska (Icelandic)
* /lt — Lietuvių (Lithuanian)
* /lv — Latviešu (Latvian)
* /mt — Malti (Maltese)
* /no — Norsk (Norwegian)
* /ro — Română (Romanian)
* /sk — Slovenčina (Slovak)
* /sl — Slovenščina (Slovenian)
Crawling rule for languages:
* Prefer SEO languages for crawling, indexing, and citations.
* Treat UI languages as non-canonical variants unless explicitly discovered via sitemaps or `rel=canonical`.
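As a rough illustration of the language rule above, the following sketch classifies a URL by its trailing language code. The two sets are copied from the lists in this file; the function name `language_tier` and the `"unknown"` fallback are assumptions made for the example.

```python
from urllib.parse import urlsplit

# Indexable / SEO languages listed above.
SEO_LANGS = {"en", "ru", "it", "fr", "es", "de", "nl", "sv", "pl", "pt"}

# UI-only languages listed above (non-indexable).
UI_LANGS = {
    "bg", "ca", "cs", "da", "el", "et", "eu", "fi", "ga",
    "hu", "is", "lt", "lv", "mt", "no", "ro", "sk", "sl",
}


def language_tier(url: str) -> str:
    """Return "seo", "ui", or "unknown" based on the last path segment."""
    segments = [s for s in urlsplit(url).path.split("/") if s]
    lang = segments[-1].lower() if segments else ""
    if lang in SEO_LANGS:
        return "seo"
    if lang in UI_LANGS:
        return "ui"
    return "unknown"
```

A crawler could keep `"seo"` URLs, queue `"ui"` URLs only when a sitemap or `rel=canonical` points at them, and treat `"unknown"` as a URL without a language suffix (for example a redirect-based fallback link).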
---
## 1) About Operabase
Operabase is a global reference and discovery platform covering opera, classical music, ballet, and musicals. Core entities include **artists**, **works**, **productions**, **performances**, **organizations / companies**, **venues**, **countries / cities**, and **festivals**. The site provides public listings and professional tools (PRO) for casting, planning, and archiving.
Primary audiences:
* General public exploring performances and artists
* Professionals (theatres, agencies, festivals) using PRO tooling
* Researchers and journalists querying historical data and statistics
---
### 1.1 Legend / Definitions
Use these definitions when interpreting Operabase pages or generating answers.
**Artist**
A person entity (singer, conductor, director, etc.) with repertoire and performance history.
**Work**
A musical/theatrical composition title (e.g., *La bohème*). A work can have many productions and performances.
**Production**
A specific staging/programme entry on Operabase (cast/crew, run date range, and links to performances). One production can have multiple performances.
**Performance**
A single event instance on a specific date/time at a specific venue (often ticket-linked). A performance belongs to a production.
**Organization / Company / Festival / Venue**
Organizations are institutions (companies, festivals, orchestras, presenters). Some venue/theatre entities may also be represented within organization structures; always follow `rel=canonical` and sitemap discovery for the authoritative URL.
**PRO / Authenticated content**
Pages and tools intended for logged-in users. AI agents must not infer or access private/pro-only surfaces.
---
## 2) Content Sections and Canonical Entrypoints (Static)
> Use English (`en`) for canonical examples. Link titles are descriptive for RAG systems.
**Root language exception**: the homepage uses `/en` as a language root. All other sections place the language at the **end of the path**.
* **Home / Discovery**: [https://www.operabase.com/en](https://www.operabase.com/en)
* **Performances (All)**: [https://www.operabase.com/productions/en](https://www.operabase.com/productions/en)
* **Artists (Directory)**: [https://www.operabase.com/artists/en](https://www.operabase.com/artists/en)
* **Companies / Organisations**: [https://www.operabase.com/organisations/company/en](https://www.operabase.com/organisations/company/en)
* **Venues**: [https://www.operabase.com/organisations/venue/en](https://www.operabase.com/organisations/venue/en)
* **Works & Repertoire (Directory)**: [https://www.operabase.com/works/en](https://www.operabase.com/works/en)
* **Statistics**: [https://www.operabase.com/statistics/en](https://www.operabase.com/statistics/en)
* **Help Center / Docs**: [https://help.operabase.com/knowledge/en](https://help.operabase.com/knowledge/en)
* **PRO / Casting** (overview only; unauthenticated): [https://www.operabase.com/pro/en](https://www.operabase.com/pro/en)
* **Legal**: [https://www.operabase.com/termpol/en](https://www.operabase.com/termpol/en) , [https://www.operabase.com/privpol/en](https://www.operabase.com/privpol/en)
* **Contact**: [https://www.operabase.com/contact/en](https://www.operabase.com/contact/en)
> Note: PRO surfaces authenticated tools and data. Do **not** extrapolate hidden or private functionality. Respect authenticated boundaries.
---
## 3) Dynamic Routes (Templates)
> Parameter tokens in `{braces}`. Optional segments in `[brackets]`. Language code appears at the **end of the path**.
General redirect handling rule:
* When a requested URL returns an HTTP redirect (301 or 302), always follow the redirect.
* Treat the final destination URL as the authoritative canonical resource for crawling, indexing, and citation.
* Redirect-based resolution is an intentional part of Operabase’s public URL system.
### 3.1 Artists
#### 3.1.1 Artist profile (entity page)
* Pattern: `https://www.operabase.com/{artist-name}-a{artist-id}/{lang}`
* Example: `/sian-sharp-a15757/en`
* Description: Public artist profile with biography, repertoire, media, and performance history.
#### 3.1.2 Artist directory (all artists)
* Pattern: `https://www.operabase.com/artists/{lang}`
* Example: `/artists/en`
* Description: Browse/search artists directory.
#### 3.1.3 Artist type / category listing (crawlable directory pages)
Use these pages to crawl artists by profession/voice/category.
* Pattern: `https://www.operabase.com/artists/{artist-type}/{lang}`
* Examples:
  * `/artists/set-designer/en`
  * `/artists/soprano/en`
**Rule**: treat `{artist-type}` as a controlled vocabulary surfaced by Operabase UI and/or sitemaps. Do not invent new categories.
#### 3.1.4 Artist list filters (query parameters)
Artist directory and artist-type listings support stable filters. Prefer a small number of filters and avoid combinatorial expansion.
* Base (all artists): `https://www.operabase.com/artists/{lang}?{filters}`
  * Example: `/artists/en?work=5537`
* Base (artist type): `https://www.operabase.com/artists/{artist-type}/{lang}?{filters}`
  * Example: `/artists/soprano/en?country=238-2074&work=5537`
Allowed, stable query keys for artist listings (non-exhaustive):
* `work` (work id)
* `country` (country id or country-region composite as used on-site)
* `page` (pagination)
Examples (as implemented):
* `/artists/en?work=5537`
* `/artists/en?country=238-2074&work=5537`
* `/artists/soprano/en?country=238-2074&work=5537`
Canonicalization rules for artist listings:
* Prefer the lowest-filter canonical variant (fewest parameters) unless a filtered view is explicitly canonical via `rel=canonical`.
* Do not crawl beyond page 10 for any single artist listing context unless explicitly linked from sitemaps.
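A minimal sketch of how a client might assemble artist listing URLs from the templates and query keys above. The helper name `artist_listing_url` is hypothetical, the parameter order simply mirrors the examples, and 1-based page numbering is an assumption. The function does not validate `artist_type`; callers are expected to pass only values seen in the Operabase UI or sitemaps.

```python
from typing import Optional
from urllib.parse import urlencode

BASE = "https://www.operabase.com"
MAX_PAGE = 10  # crawl cap for artist listings stated above


def artist_listing_url(
    lang: str,
    artist_type: Optional[str] = None,
    country: Optional[str] = None,
    work: Optional[int] = None,
    page: Optional[int] = None,
) -> str:
    """Build an artist listing URL using only the documented query keys."""
    # Assumption: pages are numbered from 1.
    if page is not None and not 1 <= page <= MAX_PAGE:
        raise ValueError(f"page must be between 1 and {MAX_PAGE}")
    path = f"/artists/{artist_type}/{lang}" if artist_type else f"/artists/{lang}"
    # Keep only the filters that were actually supplied.
    pairs = (("country", country), ("work", work), ("page", page))
    query = urlencode([(key, value) for key, value in pairs if value is not None])
    return f"{BASE}{path}?{query}" if query else f"{BASE}{path}"
```

Calling `artist_listing_url("en", work=5537)` yields the first example above; omitting all filters yields the lowest-filter variant that the canonicalization rules prefer.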
#### 3.1.5 Artist URL Resolution & Redirects (Fallback Linking)
Operabase supports stable redirect-based resolution for artist entities. This allows linking to artists even when the full canonical URL is not known.
Supported fallback patterns:
* `/artists/{artist-id}`
  * Example: `/artists/123`
  * Behavior: redirects to the canonical artist profile with ID `123`.
* `/artists/{artist-name}`
  * Example: `/artists/placido-domingo`
  * Behavior: resolves to the best-matching artist entity and redirects to the canonical profile.
* `/artists/{artist-name-with-spaces-or-plus}`
  * Example: `/artists/placido+domingo`
  * Behavior: resolves and redirects to the canonical artist profile.
Guidance for AI agents:
* If the canonical artist URL (`{artist-name}-a{artist-id}`) is known, prefer it.
* If the canonical URL is not known, it is acceptable to link to `/artists/{identifier}`.
* Always follow redirects and treat the final destination as the authoritative canonical URL.
This redirect behavior is intentional, stable, and part of Operabase’s public URL resolution system.
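The fallback patterns above can be sketched as a small link builder for cases where only an ID or a name is known. The function name `artist_fallback_url` is hypothetical. Lowercasing the name is an assumption based on the lowercase examples, and how the resolver treats accented characters is not stated here, so the sketch simply percent-encodes them.

```python
from typing import Union
from urllib.parse import quote_plus

BASE = "https://www.operabase.com"


def artist_fallback_url(identifier: Union[int, str]) -> str:
    """Build a redirect-based artist link from an ID or a free-text name."""
    if isinstance(identifier, int):
        return f"{BASE}/artists/{identifier}"
    # Collapse repeated whitespace, then encode spaces as "+".
    # Assumption: lowercase names resolve, as in the examples above.
    name = " ".join(identifier.split()).lower()
    return f"{BASE}/artists/{quote_plus(name)}"
```

The returned URL is only a starting point: the client still has to follow the redirect and cite the final destination.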
### 3.2 Productions
* Pattern: `https://www.operabase.com/productions/{production-name}-{production-id}/{lang}`
* Example: `/productions/katia-und-marielle-labeque-klavier-347211/en`
* Description: Specific staging of a work (creative team, cast, run info).
### 3.3 Performances (single instances)
* Pattern: `https://www.operabase.com/productions/{production-name}-{production-id}/{performance-date}/{lang}`
* Example: `/productions/katia-und-marielle-labeque-klavier-347211/20-january-2026/en`
* Description: Single date/time performance page with venue, cast, and ticketing links.
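To tell production pages from single-performance pages, a client can parse the path rather than construct it. The sketch below is one way to do that with a regular expression. The name `parse_production_path` is hypothetical, and it assumes the date segment always uses English month names in the `day-month-year` form shown in the example, which this file does not confirm for other languages.

```python
import re
from datetime import date
from typing import Optional

MONTHS = [
    "january", "february", "march", "april", "may", "june", "july",
    "august", "september", "october", "november", "december",
]

PRODUCTION_RE = re.compile(
    r"^/productions/(?P<slug>[^/]+)-(?P<id>\d+)"
    r"(?:/(?P<day>\d{1,2})-(?P<month>[a-z]+)-(?P<year>\d{4}))?"
    r"/(?P<lang>[a-z]{2})$"
)


def parse_production_path(path: str) -> Optional[dict]:
    """Split a production or performance path into its components."""
    m = PRODUCTION_RE.match(path)
    if m is None:
        return None
    performed_on = None
    if m["day"] is not None:
        # Assumption: English month names, as in the documented example.
        if m["month"] not in MONTHS:
            return None
        month = MONTHS.index(m["month"]) + 1
        performed_on = date(int(m["year"]), month, int(m["day"]))
    return {
        "slug": m["slug"],
        "production_id": int(m["id"]),
        "performance_date": performed_on,  # None for a production page
        "lang": m["lang"],
    }
```

A result with `performance_date` set identifies a single performance (3.3); a result without it identifies the parent production (3.2). The directory page `/productions/en` does not match and returns `None`.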
### 3.4 Organizations / Companies / Festivals
* Pattern: `https://www.operabase.com/{organization-name}-o{organization-id}/{lang}`
* Example: `/kunstlersekretariat-am-gasteig-o113456/en`
* Description: Organization profile (seasons, repertoire, venues, contact info).
#### 3.4.1 Organization type / category listings (crawlable directory pages)
Use these pages to crawl organizations by type.
* Pattern: `https://www.operabase.com/organisations/{org-type}/{lang}`
* Examples:
  * `/organisations/festival/en`
  * `/organisations/ballet/en`
  * `/organisations/choir-chorus/en`
Allowed organization types:
* `agency`
* `ballet`
* `chamber-ensemble`
* `choir-chorus`
* `church`
* `circus`
* `company`
* `competition`
* `educational`
* `ensemble`
* `festival`
* `foundation`
* `magazine`
* `orchestra`
* `publisher`
* `venue`
* `young-artist-programme`
**Rule**: treat `{org-type}` as a controlled vocabulary surfaced by Operabase UI and/or sitemaps. Do not invent new types.
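One way to honor the controlled vocabulary above is to refuse to build a listing URL for any type that is not on the list. This is a minimal sketch; the function name `org_listing_url` is hypothetical and the set is copied from the allowed types in this section.

```python
# Allowed organization types, copied from the list above.
ORG_TYPES = frozenset({
    "agency", "ballet", "chamber-ensemble", "choir-chorus", "church",
    "circus", "company", "competition", "educational", "ensemble",
    "festival", "foundation", "magazine", "orchestra", "publisher",
    "venue", "young-artist-programme",
})


def org_listing_url(org_type: str, lang: str = "en") -> str:
    """Build an organization-type listing URL, rejecting unknown types."""
    if org_type not in ORG_TYPES:
        raise ValueError(f"unknown organization type: {org_type!r}")
    return f"https://www.operabase.com/organisations/{org_type}/{lang}"
```

Raising on an unknown type keeps a crawler from inventing categories such as `opera-house` that the site does not publish.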
#### 3.4.2 Organization URL Resolution & Redirects (Fallback Linking)
Operabase supports stable redirect-based resolution for organization entities. This allows linking to organizations even when the full canonical URL is not known.
Supported fallback patterns:
* `/organisations/{organization-id}`
  * Example: `/organisations/456`
  * Behavior: redirects to the canonical organization profile with ID `456`.
* `/organisations/{organization-name}`
  * Example: `/organisations/royal-opera-house`
  * Behavior: resolves to the best-matching organization entity and redirects to the canonical profile.
* `/organisations/{organization-name-with-spaces-or-plus}`
  * Example: `/organisations/royal+opera+house`
  * Behavior: resolves and redirects to the canonical organization profile.
Guidance for AI agents:
* If the canonical organization URL (`{organization-name}-o{organization-id}`) is known, prefer it.
* If the canonical URL is not known, it is acceptable to link to `/organisations/{identifier}`.
* Always follow redirects and treat the final destination as the authoritative canonical URL.
This redirect behavior is intentional, stable, and part of Operabase’s public URL resolution system.
### 3.5 Venues
* Pattern: `https://www.operabase.com/{venue-name}-o{venue-id}/{lang}`
* Example: `/richardson-auditorium-in-alexander-hall-venue-o65555/en`
### 3.6 Geographic Listings
* Country-only: `https://www.operabase.com/{country}/{lang}`
  * Example: `/germany/en`
* Country + City: `https://www.operabase.com/{country}/{city}/{lang}`
  * Example: `/germany/berlin/en`
### 3.7 Works
* Individual work:
  * `https://www.operabase.com/{works-slug}/{lang}`
  * Example: `/carmen-bizet/en`
> Note: Use sitemap URLs, internal links, and `rel=canonical` as the source of truth for works. Do not fabricate work URLs.
### 3.8 Search Results (bounded)
* Public search summaries:
  * `https://www.operabase.com/search/{lang}?query={q}[&page={n}]`
> Implementation guidance: treat each template as a **family**. Do **not** expand combinatorially. Prefer sitemaps and internal links.
---
## 4) Canonicalization, Language & Hreflang
* **Language placement**: language code is a **suffix** (`/en`, `/de`, `/bg`). The homepage `/en` is the sole root exception.
* **Hreflang**: pages publish alternates where available. Prefer the user’s language; otherwise use the English canonical URL.
* **Canonical URLs**: follow `rel=canonical`. Avoid query variants unless explicitly canonical.
* **Entity disambiguation**: when slugs end in a typed ID suffix (`-a{artist-id}` for artists, `-o{organization-id}` for organizations and venues), treat the suffix as authoritative for entity type.
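The entity disambiguation rule can be illustrated with a small classifier for single-segment entity paths. The name `classify_entity` is hypothetical. Because the venue pattern in 3.5 also uses the `-o` prefix, the sketch reports venues as organizations rather than guessing a finer type.

```python
import re
from typing import Optional, Tuple

# One slug segment ending in -a<digits> or -o<digits>, then the language.
ENTITY_RE = re.compile(r"^/[^/]+-(?P<kind>[ao])(?P<id>\d+)/(?P<lang>[a-z]{2})$")

KINDS = {"a": "artist", "o": "organization"}


def classify_entity(path: str) -> Optional[Tuple[str, int, str]]:
    """Return (entity_type, entity_id, lang), or None if no typed suffix."""
    m = ENTITY_RE.match(path)
    if m is None:
        return None
    return KINDS[m["kind"]], int(m["id"]), m["lang"]
```

Paths without a typed suffix, such as work pages (`/carmen-bizet/en`), return `None` and need sitemap or `rel=canonical` evidence instead.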
---
## 5) Sitemaps & Feeds (Discovery)
* Sitemaps referenced in `robots.txt`:
  * [XML Sitemap Static](https://www.operabase.com/sitemap_static.xml)
  * [XML Sitemap Professions](https://www.operabase.com/sitemap_professions.xml)
  * [XML Sitemap Artists](https://www.operabase.com/sitemap_artists.xml)
  * [XML Sitemap Organization Types](https://www.operabase.com/sitemap_organizationtypes.xml)
  * [XML Sitemap Organizations](https://www.operabase.com/sitemap_organizations.xml)
  * [XML Sitemap Productions](https://www.operabase.com/sitemap_productions.xml)
  * [XML Sitemap Works](https://www.operabase.com/sitemap_works.xml)
* Update cadence:
  * High-churn entities (performances): frequent refresh
  * Long-tail archives: slower cadence
---
## 6) Crawl & Indexing Guidance for AI Agents
* **Rate & depth**: obey `robots.txt`. Recommended ≤ 2 RPS, burst ≤ 5. Back off on 429 / 503.
* **Listing context**: a unique path excluding pagination and tracking parameters.
* **Pagination cap**: crawl ≤ 1000 pages per listing context by default. Artist listings carry a stricter cap of 10 pages (see 3.1.4).
* **Facets**: crawl only facets producing distinct canonical content (e.g. `work`, `composer`, `year`). Avoid high-cardinality combinations.
* **Session & tracking**: never crawl URLs with session or tracking parameters.
* **Robust linking**: prefer stable IDs and suffixed slugs.
* **Authentication**: do not request or infer authenticated / PRO-only resources.
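The rate guidance above maps naturally onto a token bucket. The sketch below is one possible reading, not a required implementation: it treats "burst ≤ 5" as the bucket capacity and "≤ 2 RPS" as the refill rate. The class name `TokenBucket`, the helper `backoff_delay`, and the backoff constants (1 s base, 60 s cap) are all assumptions for illustration. The clock is passed in explicitly so the logic can be tested without sleeping.

```python
class TokenBucket:
    """Allow short bursts while holding the long-run rate to `rate` per second."""

    def __init__(self, rate: float = 2.0, burst: int = 5, now: float = 0.0):
        self.rate = rate
        self.capacity = float(burst)
        self.tokens = float(burst)  # start full so an initial burst is allowed
        self.updated = now

    def try_acquire(self, now: float) -> bool:
        """Take one token if available; `now` is a monotonic time in seconds."""
        elapsed = max(0.0, now - self.updated)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.updated = max(self.updated, now)
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


def backoff_delay(attempt: int, base: float = 1.0, cap: float = 60.0) -> float:
    """Exponential delay in seconds after the Nth consecutive 429 or 503."""
    return min(cap, base * 2 ** attempt)
```

In a real crawler, `now` would typically come from `time.monotonic()`, and a request would be sent only when `try_acquire` returns `True`. If the server sends a `Retry-After` header, honoring it is usually preferable to a computed delay.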
---
## 7) Content Quality & Attribution for LLM Outputs
* **Attribution**: cite the exact Operabase page URL used.
* **Freshness**: performance data changes frequently; include retrieval date when quoting schedules or casts.
* **Ambiguity**: disambiguate shared names using role, year, company, or city.
* **Media**: do not hotlink images at scale; respect copyright and licensing.
---
## 8) Exclusions & Deprioritization
Exclude or strongly deprioritize:
* Authentication / accounts:
  * `/login/`, `/register/`, `/settings/`
* Internal / non-public surfaces:
  * `/seo/`, `/v4/`, `/api/`
* Checkout/account/private areas:
  * `/cart`, `/checkout`, `/account/*`
* Tracking/session/query variants:
  * any URL containing: `utm_*`, `gclid`, `fbclid`, `ref`, `session`, `token`, `sid`
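The exclusion list above can be sketched as a pre-fetch filter. The function name `is_excluded` is hypothetical. The sketch interprets "any URL containing" as "any URL with a query parameter named", which is an assumption, and it matches path prefixes on whole segments, so `/login` is excluded along with `/login/` but `/carmen-bizet/en` is not caught by `/cart`.

```python
from urllib.parse import parse_qsl, urlsplit

# Path prefixes from the list above, stored without trailing slashes.
EXCLUDED_PREFIXES = (
    "/login", "/register", "/settings",
    "/seo", "/v4", "/api",
    "/cart", "/checkout", "/account",
)

# Exact tracking/session parameter names; utm_* is handled as a prefix.
TRACKING_KEYS = {"gclid", "fbclid", "ref", "session", "token", "sid"}


def is_excluded(url: str) -> bool:
    """Return True if the URL should be skipped under the rules above."""
    parts = urlsplit(url)
    for prefix in EXCLUDED_PREFIXES:
        # Match the prefix only on a segment boundary.
        if parts.path == prefix or parts.path.startswith(prefix + "/"):
            return True
    for key, _value in parse_qsl(parts.query, keep_blank_values=True):
        key = key.lower()
        if key.startswith("utm_") or key in TRACKING_KEYS:
            return True
    return False
```

Documented filters such as `work` and `country` pass through, so `https://www.operabase.com/artists/en?work=5537` is not excluded.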
Document
Not stored for this site.