# inworld.ai

> Markdown mirror of DialtoneApp's public top-site detail page for `inworld.ai`.

URL: https://dialtoneapp.com/top-sites/inworld.ai/index.md
Canonical HTML: https://dialtoneapp.com/top-sites/inworld.ai

## Summary

- Domain: `inworld.ai`
- Website: https://inworld.ai
- Description: ai readable | score 30 | purchase read only
- Label: ai_readable
- Payment surface: Not available
- Purchase boundary: read_only
- Control boundary: unknown
- Rank: 139593

## robots

~~~text
User-Agent: *
Allow: /
Disallow: /studio
Disallow: /test-hubspot-form
Disallow: /dev-on-prem-copy
Disallow: /tts-old
Disallow: /dev
Disallow: /ds

User-Agent: GPTBot
Allow: /
Disallow: /studio
Disallow: /test-hubspot-form
Disallow: /dev-on-prem-copy
Disallow: /tts-old
Disallow: /dev
Disallow: /ds

User-Agent: ChatGPT-User
Allow: /
Disallow: /studio
Disallow: /test-hubspot-form
Disallow: /dev-on-prem-copy
Disallow: /tts-old
Disallow: /dev
Disallow: /ds

User-Agent: ClaudeBot
Allow: /
Disallow: /studio
Disallow: /test-hubspot-form
Disallow: /dev-on-prem-copy
Disallow: /tts-old
Disallow: /dev
Disallow: /ds

User-Agent: PerplexityBot
Allow: /
Disallow: /studio
Disallow: /test-hubspot-form
Disallow: /dev-on-prem-copy
Disallow: /tts-old
Disallow: /dev
Disallow: /ds

User-Agent: GoogleOther
Allow: /
Disallow: /studio
Disallow: /test-hubspot-form
Disallow: /dev-on-prem-copy
Disallow: /tts-old
Disallow: /dev
Disallow: /ds

User-Agent: Google-Extended
Allow: /
Disallow: /studio
Disallow: /test-hubspot-form
Disallow: /dev-on-prem-copy
Disallow: /tts-old
Disallow: /dev
Disallow: /ds

User-Agent: Amazonbot
Allow: /
Disallow: /studio
Disallow: /test-hubspot-form
Disallow: /dev-on-prem-copy
Disallow: /tts-old
Disallow: /dev
Disallow: /ds

User-Agent: anthropic-ai
Allow: /
Disallow: /studio
Disallow: /test-hubspot-form
Disallow: /dev-on-prem-copy
Disallow: /tts-old
Disallow: /dev
Disallow: /ds

User-Agent: cohere-ai
Allow: /
Disallow: /studio
Disallow: /test-hubspot-form
Disallow: /dev-on-prem-copy
Disallow: /tts-old
Disallow: /dev
Disallow: /ds

User-Agent: Meta-ExternalAgent
Allow: /
Disallow: /studio
Disallow: /test-hubspot-form
Disallow: /dev-on-prem-copy
Disallow: /tts-old
Disallow: /dev
Disallow: /ds

Content-Signal: ai-train=no, search=yes, ai-input=no
Sitemap: https://inworld.ai/sitemap.xml
~~~
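
How a crawler might evaluate one of the groups above can be sketched in Python. This is a minimal sketch, not the site's or any crawler's actual logic: it assumes the longest-match precedence described in RFC 9309 (the most specific matching rule wins, and `Allow` wins a tie), it handles plain path prefixes only (this file uses no `*` or `$` patterns), and the helper names `parse_rules` and `is_allowed` are hypothetical.

```python
# One rule group copied from the robots.txt above (every group lists the same rules).
ROBOTS_GROUP = """\
Allow: /
Disallow: /studio
Disallow: /test-hubspot-form
Disallow: /dev-on-prem-copy
Disallow: /tts-old
Disallow: /dev
Disallow: /ds
"""


def parse_rules(group_text):
    """Return a list of (is_allow, path_prefix) tuples for one rule group."""
    rules = []
    for line in group_text.splitlines():
        field, _, value = line.partition(":")
        field = field.strip().lower()
        value = value.strip()
        if field in ("allow", "disallow") and value:
            rules.append((field == "allow", value))
    return rules


def is_allowed(path, rules):
    """Longest matching prefix wins; Allow wins when two matches tie in length."""
    best_len = -1
    allowed = True  # no matching rule means the path is allowed
    for allow, prefix in rules:
        if not path.startswith(prefix):
            continue
        if len(prefix) > best_len or (len(prefix) == best_len and allow):
            best_len = len(prefix)
            allowed = allow
    return allowed


rules = parse_rules(ROBOTS_GROUP)
for path in ("/tts", "/tts-old", "/studio/projects", "/developers", "/pricing"):
    print(path, "allowed" if is_allowed(path, rules) else "disallowed")
```

Because the rules are prefixes, `Disallow: /dev` also covers paths such as `/developers`. Precedence matters here: a parser that applies rules in file order (Python's standard `urllib.robotparser` appears to work this way) would stop at the leading `Allow: /` and treat every path as allowed.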

## llms

~~~text
# Inworld AI

> Inworld is a research lab focused on realtime voice AI. We build the infrastructure that enables understanding -- top-ranked text-to-speech, speech-to-text, intelligent LLM routing, and a realtime voice pipeline, all accessible through simple APIs. Most trusted by serious developers building voice-first applications.

Inworld TTS-1.5 Max holds the #1 ranking on the Artificial Analysis Speech Arena (ELO ~1,238, April 2026), with 3 of the top 5 positions. See https://inworld.ai/pricing for current rates.

## Products

- [TTS API (Text-to-Speech)](https://inworld.ai/tts): #1 ranked. Low-latency streaming TTS with word, phoneme, and viseme timestamps for lipsync. Supports emotion markup, voice cloning from 15 seconds of audio, and 15 production-quality languages. Models: inworld-tts-1.5-max, inworld-tts-1.5-mini.
- [STT API (Speech-to-Text)](https://inworld.ai/speech-to-text): Multi-provider transcription with voice profiling (emotion, accent, intent detection). 99+ languages via Whisper. Research Preview.
- [Router API](https://inworld.ai/router): OpenAI Chat Completions-compatible API that routes to hundreds of LLM models. Single endpoint, single API key. Free research preview.
- [Realtime API](https://inworld.ai/realtime-api): End-to-end voice pipeline combining STT + LLM + TTS in a single session. WebSocket and WebRTC transports.

## Key Specifications

- **TTS Models**: inworld-tts-1.5-max, inworld-tts-1.5-mini
- **Default Voice**: Sarah
- **TTS Latency**: P90 sub-130ms (Mini), P90 sub-200ms (Max)
- **Pricing**: See https://inworld.ai/pricing
- **Languages**: 15 (optimized for production quality)
- **Voice Cloning**: Single API call with 15 seconds of reference audio
- **Timestamp Data**: Word-level, phoneme-level, and viseme-level for real-time lipsync animation
- **Emotion Support**: Anger, joy, sadness, fear, disgust, surprise via audio markup tags
- **Deployment**: Cloud API + on-premise deployment
- **Router Models**: Hundreds of models (OpenAI, Anthropic, Google, Meta, Mistral, and more)
- **Authentication**: HTTP Basic (Authorization: Basic {KEY})
- **On-Premise**: Full on-premise deployment supported

## Quick Start (TTS)

```python
import requests
import base64

response = requests.post(
    "https://api.inworld.ai/tts/v1/voice",
    headers={"Authorization": "Basic YOUR_API_KEY"},
    json={
        "text": "Hello, I am Sarah.",
        "voiceId": "Sarah",
        "modelId": "inworld-tts-1.5-max"
    }
)
audio = base64.b64decode(response.json()["audioContent"])
```

## Documentation

- [Docs Home](https://docs.inworld.ai)
- [TTS Docs](https://docs.inworld.ai/tts/tts)
- [STT Docs](https://docs.inworld.ai/stt/overview)
- [Realtime API Docs](https://docs.inworld.ai/realtime/overview)
- [LLM Router Docs](https://docs.inworld.ai/router/introduction)
- [Complete API Reference (docs)](https://docs.inworld.ai/llms-full.txt)

## Resources

- [TTS API Quickstart](https://inworld.ai/resources/tts-api-quickstart)
- [Build a Voice Agent in 30 Minutes](https://inworld.ai/resources/build-voice-agent-30-minutes)
- [Migrate from ElevenLabs](https://inworld.ai/resources/migrate-from-elevenlabs)
- [Inworld vs ElevenLabs](https://inworld.ai/resources/inworld-vs-elevenlabs)
- [STT Voice Profiling](https://inworld.ai/resources/stt-voice-profiling-api)
- [Python TTS Tutorial](https://inworld.ai/resources/python-tts-api-tutorial)
- [Best TTS APIs](https://inworld.ai/resources/best-text-to-speech-apis)
- [Voice AI for AI Companions](https://inworld.ai/resources/voice-ai-for-ai-companions)
- [JavaScript TTS Tutorial](https://inworld.ai/resources/javascript-tts-api-tutorial)
- [Inworld vs Cartesia](https://inworld.ai/resources/inworld-vs-cartesia)
- [Inworld vs Deepgram](https://inworld.ai/resources/inworld-vs-deepgram)

## Agent Discovery

- [Full API Reference (marketing)](https://inworld.ai/llms-full.txt)
- [Agent Discovery (agents.json)](https://inworld.ai/.well-known/agents.json)
- [MCP Server](https://github.com/inworld-ai/inworld-mcp)
- [GitHub Organization](https://github.com/inworld-ai)

## Company

- **Website**: https://inworld.ai
- **Documentation**: https://docs.inworld.ai
- **GitHub**: https://github.com/inworld-ai
- **Focus**: Research lab focused on realtime voice AI. #1 ranked. Most trusted for serious developers.
~~~
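
An agent reading the file above would typically start by collecting its links per section. The following is a minimal sketch under stated assumptions: sections begin with `## ` headings, links are bullet items of the form `- [title](url)`, and the names `LINK_RE`, `links_by_section`, and `SAMPLE` are hypothetical. `SAMPLE` is a short excerpt of the capture above, not the full file.

```python
import re

# Matches bullet links such as "- [Docs Home](https://docs.inworld.ai)".
LINK_RE = re.compile(r"^- \[(?P<title>[^\]]+)\]\((?P<url>[^)\s]+)\)")


def links_by_section(llms_text):
    """Map each '## ' heading to the (title, url) links listed beneath it."""
    sections = {}
    current = None
    for line in llms_text.splitlines():
        if line.startswith("## "):
            current = line[3:].strip()
            sections[current] = []
        elif current is not None:
            match = LINK_RE.match(line)
            if match:
                sections[current].append((match["title"], match["url"]))
    return sections


SAMPLE = """\
# Inworld AI

## Documentation

- [Docs Home](https://docs.inworld.ai)
- [TTS Docs](https://docs.inworld.ai/tts/tts)

## Agent Discovery

- [MCP Server](https://github.com/inworld-ai/inworld-mcp)
"""

for section, links in links_by_section(SAMPLE).items():
    print(section, links)
```

Bullets that are not links, such as the `**Website**` entries under Company, are skipped, so those sections come back with empty lists.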

## llms-full

~~~text
# Inworld AI

> Complete API reference: https://docs.inworld.ai/llms-full.txt

Inworld is a research lab focused on realtime voice AI. We build the #1 ranked models and APIs: text-to-speech, speech-to-text, intelligent LLM routing, and a Realtime API for end-to-end voice conversations. Most trusted by serious developers building voice-first applications that make every user feel understood.

## Products

- [TTS API (Text-to-Speech)](https://inworld.ai/tts): #1 ranked on Artificial Analysis TTS Arena. Low-latency streaming TTS with word, phoneme, and viseme timestamps for lipsync. Supports emotion markup, voice cloning from 15 seconds of audio, and 15 production-quality languages. Models: inworld-tts-1.5-max, inworld-tts-1.5-mini.
- [STT API (Speech-to-Text)](https://inworld.ai/speech-to-text): Multi-provider transcription with voice profiling (emotion, accent, intent detection). 99+ languages via Whisper. Research Preview.
- [Router API](https://inworld.ai/router): OpenAI Chat Completions-compatible API that routes to 200+ LLM models (OpenAI, Anthropic, Google, open-source). Single endpoint, single API key, automatic fallback and cost optimization. Free research preview.
- [Realtime API](https://inworld.ai/realtime-api): End-to-end voice pipeline combining STT + LLM + TTS in a single session. WebSocket and WebRTC transports for real-time conversational AI.

## API Reference

### Authentication

All endpoints use HTTP Basic authentication:

```
Authorization: Basic {YOUR_API_KEY}
```

### TTS REST — Single Request

```
POST https://api.inworld.ai/tts/v1/voice
Content-Type: application/json
Authorization: Basic {YOUR_API_KEY}

{
  "text": "Hello, I am Sarah.",
  "voiceId": "Sarah",
  "modelId": "inworld-tts-1.5-max"
}
```

Returns JSON with base64-encoded audioContent: `{"audioContent": "base64..."}`

### TTS Streaming

```
POST https://api.inworld.ai/tts/v1/voice:stream
Content-Type: application/json
Authorization: Basic {YOUR_API_KEY}

{
  "text": "Hello, I am Sarah.",
  "voiceId": "Sarah",
  "modelId": "inworld-tts-1.5-max"
}
```

Returns a stream of JSON objects. Each line contains a JSON object with `result.audioContent` (base64-encoded audio):

```json
{"result":{"audioContent":"base64-encoded-audio-chunk..."}}
{"result":{"audioContent":"base64-encoded-audio-chunk..."}}
```

### List Voices

```
GET https://api.inworld.ai/voices/v1/voices
Authorization: Basic {YOUR_API_KEY}
```

Returns 271 available voices.

### Voice Cloning

```
POST https://api.inworld.ai/voices/v1/voices:clone
Authorization: Basic {YOUR_API_KEY}
Content-Type: application/json

{
  "displayName": "MyClonedVoice",
  "langCode": "EN_US",
  "voiceSamples": [{"audioData": "base64-encoded-audio"}]
}
```

### Speech-to-Text

```
POST https://api.inworld.ai/stt/v1/transcribe
Authorization: Basic {YOUR_API_KEY}
```

### Router (LLM)

OpenAI Chat Completions-compatible. Routes to 200+ models.

```
POST https://api.inworld.ai/v1/chat/completions
Content-Type: application/json
Authorization: Basic {YOUR_API_KEY}

{
  "model": "gpt-5.4",
  "messages": [
    {"role": "user", "content": "Hello"}
  ]
}
```

### Realtime API

**WebSocket:**
```
wss://api.inworld.ai/api/v1/realtime/session
```

**WebRTC:**
```
POST https://api.inworld.ai/v1/realtime/calls
```

Combines STT + LLM + TTS in a single persistent session for real-time voice conversations.

## Key Specifications

- **TTS Models**: inworld-tts-1.5-max, inworld-tts-1.5-mini
- **Default Voice**: Sarah
- **TTS Latency**: P90 sub-130ms (Mini), P90 sub-200ms (Max)
- **TTS Pricing**: See https://inworld.ai/pricing
- **STT Pricing**: See https://inworld.ai/pricing
- **Languages**: 15 (optimized for production quality)
- **Voice Cloning**: Single API call with 15 seconds of reference audio
- **Timestamp Data**: Word-level, phoneme-level, and viseme-level for real-time lipsync animation
- **Emotion Support**: Anger, joy, sadness, fear, disgust, surprise via audio markup tags
- **Deployment**: Cloud API + on-premise deployment
- **Router Models**: 200+ models from OpenAI, Anthropic, Google, Meta, Mistral, and more
- **Router Pricing**: Free research preview

## Rankings & Benchmarks

- #1 on Artificial Analysis TTS Arena
- Most trusted voice AI for serious developers

## Use Cases

- Conversational AI agents and voice bots
- AI companions and interactive entertainment
- Language learning applications
- Enterprise voice assistants and support
- Consumer apps with realtime voice

## Quick Start (Python)

```python
import requests
import base64
import json

# REST TTS
response = requests.post(
    "https://api.inworld.ai/tts/v1/voice",
    headers={"Authorization": "Basic YOUR_API_KEY"},
    json={
        "text": "Hello, I am Sarah.",
        "voiceId": "Sarah",
        "modelId": "inworld-tts-1.5-max"
    }
)
audio = base64.b64decode(response.json()["audioContent"])

# Streaming TTS

response = requests.post(
    "https://api.inworld.ai/tts/v1/voice:stream",
    headers={"Authorization": "Basic YOUR_API_KEY"},
    json={
        "text": "Hello, I am Sarah.",
        "voiceId": "Sarah",
        "modelId": "inworld-tts-1.5-max"
    },
    stream=True
)
for line in response.iter_lines():
    if line:
        chunk = json.loads(line)
        audio_b64 = chunk["result"]["audioContent"]

# Router (OpenAI-compatible)
response = requests.post(
    "https://api.inworld.ai/v1/chat/completions",
    headers={"Authorization": "Basic YOUR_API_KEY"},
    json={
        "model": "gpt-5.4",
        "messages": [{"role": "user", "content": "Hello"}]
    }
)
```

## Quick Start (JavaScript)

```javascript
// REST TTS
const response = await fetch('https://api.inworld.ai/tts/v1/voice', {
  method: 'POST',
  headers: {
    'Authorization': 'Basic YOUR_API_KEY',
    'Content-Type': 'application/json'
  },
  body: JSON.stringify({
    text: 'Hello, I am Sarah.',
    voiceId: 'Sarah',
    modelId: 'inworld-tts-1.5-max'
  })
});
const data = await response.json();
const audioBytes = Uint8Array.from(atob(data.audioContent), c => c.charCodeAt(0));

// Streaming TTS
const stream = await fetch('https://api.inworld.ai/tts/v1/voice:stream', {
  method: 'POST',
  headers: {
    'Authorization': 'Basic YOUR_API_KEY',
    'Content-Type': 'application/json'
  },
  body: JSON.stringify({
    text: 'Hello, I am Sarah.',
    voiceId: 'Sarah',
    modelId: 'inworld-tts-1.5-max'
  })
});
const reader = stream.body.getReader();
const decoder = new TextDecoder();
while (true) {
  const { done, value } = await reader.read();
  if (done) break;
  const lines = decoder.decode(value).split('\n').filter(Boolean);
  for (const line of lines) {
    const chunk = JSON.parse(line);
    const audioB64 = chunk.result.audioContent;
  }
}
```

## Documentation

- [Docs Home](https://docs.inworld.ai/introduction)
- [TTS (Text-to-Speech)](https://docs.inworld.ai/tts/tts)
- [STT (Speech-to-Text)](https://docs.inworld.ai/stt/overview)
- [Realtime API](https://docs.inworld.ai/realtime/overview)
- [LLM Router](https://docs.inworld.ai/router/introduction)
- [GitHub Organization](https://github.com/inworld-ai)

## Machine-Readable Data

- [Models JSON](https://inworld.ai/models.json): Machine-readable list of all LLM models available through Inworld Router.
- [agents.json](https://inworld.ai/.well-known/agents.json): Machine-readable agent capabilities description.

## Company

- **Website**: https://inworld.ai
- **Documentation**: https://docs.inworld.ai
- **GitHub**: https://github.com/inworld-ai
- **Founded**: 2021
- **Focus**: Research lab building realtime voice AI infrastructure
~~~
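
The streaming endpoint above is described as returning one JSON object per line, each carrying base64 audio in `result.audioContent`. A network chunk is not guaranteed to end on a line boundary, so a client that reads raw chunks needs to carry any partial line over to the next read. The sketch below shows that buffering on synthetic data and makes no network call; it assumes newline-delimited objects as described in the capture, and the helper names `iter_audio_chunks` and `_decode_line` are hypothetical.

```python
import base64
import json


def _decode_line(line):
    """Decode one JSON line into raw audio bytes."""
    payload = json.loads(line)
    return base64.b64decode(payload["result"]["audioContent"])


def iter_audio_chunks(byte_chunks):
    """Yield decoded audio from an iterable of arbitrarily split byte chunks."""
    buffer = b""
    for chunk in byte_chunks:
        buffer += chunk
        # Everything before the last newline is complete; keep the remainder.
        *lines, buffer = buffer.split(b"\n")
        for line in lines:
            if line.strip():
                yield _decode_line(line)
    if buffer.strip():
        # The final object may arrive without a trailing newline.
        yield _decode_line(buffer)


def _fake_line(audio):
    """Build one response line in the documented shape (test data only)."""
    encoded = base64.b64encode(audio).decode()
    return json.dumps({"result": {"audioContent": encoded}})


wire = (_fake_line(b"abc") + "\n" + _fake_line(b"defg") + "\n").encode()
# Split the bytes at positions that fall in the middle of a JSON object.
pieces = [wire[:10], wire[10:45], wire[45:]]
audio = b"".join(iter_audio_chunks(pieces))
print(audio)
```

With `requests`, the raw chunks could come from `response.iter_content(chunk_size=None)`. This is the same carry-over that `iter_lines()` performs in the Python quick start; a reader loop that splits each network chunk on newlines by itself, without a buffer, can fail when a JSON object straddles two chunks.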