# Inworld AI

> Complete API reference: https://docs.inworld.ai/llms-full.txt

Inworld is a research lab focused on realtime voice AI. We build the #1 ranked models and APIs: text-to-speech, speech-to-text, intelligent LLM routing, and a Realtime API for end-to-end voice conversations. Most trusted by serious developers building voice-first applications that make every user feel understood.

## Products

- [TTS API (Text-to-Speech)](https://inworld.ai/tts): #1 ranked on Artificial Analysis TTS Arena. Low-latency streaming TTS with word, phoneme, and viseme timestamps for lipsync. Supports emotion markup, voice cloning from 15 seconds of audio, and 15 production-quality languages. Models: inworld-tts-1.5-max, inworld-tts-1.5-mini.
- [STT API (Speech-to-Text)](https://inworld.ai/speech-to-text): Multi-provider transcription with voice profiling (emotion, accent, intent detection). 99+ languages via Whisper. Research Preview.
- [Router API](https://inworld.ai/router): OpenAI Chat Completions-compatible API that routes to 200+ LLM models (OpenAI, Anthropic, Google, open-source). Single endpoint, single API key, automatic fallback and cost optimization. Free research preview.
- [Realtime API](https://inworld.ai/realtime-api): End-to-end voice pipeline combining STT + LLM + TTS in a single session. WebSocket and WebRTC transports for real-time conversational AI.
## API Reference

### Authentication

All endpoints use HTTP Basic authentication:

```
Authorization: Basic {YOUR_API_KEY}
```

### TTS REST: Single Request

```
POST https://api.inworld.ai/tts/v1/voice
Content-Type: application/json
Authorization: Basic {YOUR_API_KEY}

{
  "text": "Hello, I am Sarah.",
  "voiceId": "Sarah",
  "modelId": "inworld-tts-1.5-max"
}
```

Returns JSON with base64-encoded audioContent: `{"audioContent": "base64..."}`

### TTS Streaming

```
POST https://api.inworld.ai/tts/v1/voice:stream
Content-Type: application/json
Authorization: Basic {YOUR_API_KEY}

{
  "text": "Hello, I am Sarah.",
  "voiceId": "Sarah",
  "modelId": "inworld-tts-1.5-max"
}
```

Returns a stream of JSON objects. Each line contains a JSON object with `result.audioContent` (base64-encoded audio):

```json
{"result":{"audioContent":"base64-encoded-audio-chunk..."}}
{"result":{"audioContent":"base64-encoded-audio-chunk..."}}
```

### List Voices

```
GET https://api.inworld.ai/voices/v1/voices
Authorization: Basic {YOUR_API_KEY}
```

Returns 271 available voices.

### Voice Cloning

```
POST https://api.inworld.ai/voices/v1/voices:clone
Authorization: Basic {YOUR_API_KEY}
Content-Type: application/json

{
  "displayName": "MyClonedVoice",
  "langCode": "EN_US",
  "voiceSamples": [{"audioData": "base64-encoded-audio"}]
}
```

### Speech-to-Text

```
POST https://api.inworld.ai/stt/v1/transcribe
Authorization: Basic {YOUR_API_KEY}
```

### Router (LLM)

OpenAI Chat Completions-compatible. Routes to 200+ models.

```
POST https://api.inworld.ai/v1/chat/completions
Content-Type: application/json
Authorization: Basic {YOUR_API_KEY}

{
  "model": "gpt-5.4",
  "messages": [
    {"role": "user", "content": "Hello"}
  ]
}
```

### Realtime API

**WebSocket:**

```
wss://api.inworld.ai/api/v1/realtime/session
```

**WebRTC:**

```
POST https://api.inworld.ai/v1/realtime/calls
```

Combines STT + LLM + TTS in a single persistent session for real-time voice conversations.
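### Example: Assembling Streamed TTS Audio (Python)

The TTS Streaming response described above arrives as one JSON object per line, each carrying a base64 chunk under `result.audioContent`. The following is a minimal sketch of how those lines might be collected into a single byte string. The helper name `assemble_streamed_audio` and the sample data are hypothetical and not part of the Inworld API. It also assumes that decoded chunks can simply be concatenated, which depends on the audio encoding the service returns and is not stated here.

```python
import base64
import json
from typing import Iterable, Union


def assemble_streamed_audio(lines: Iterable[Union[bytes, str]]) -> bytes:
    """Decode each newline-delimited JSON object and concatenate its audio bytes.

    Hypothetical helper: assumes every non-blank line has the shape
    {"result": {"audioContent": "<base64>"}} shown in the TTS Streaming section.
    """
    audio = bytearray()
    for line in lines:
        if isinstance(line, bytes):
            line = line.decode("utf-8")
        line = line.strip()
        if not line:
            continue  # skip blank lines between objects
        obj = json.loads(line)
        audio.extend(base64.b64decode(obj["result"]["audioContent"]))
    return bytes(audio)


# Offline demonstration with made-up chunks (not real API output).
sample_lines = [
    json.dumps({"result": {"audioContent": base64.b64encode(b"abc").decode("ascii")}}).encode(),
    b"",
    json.dumps({"result": {"audioContent": base64.b64encode(b"def").decode("ascii")}}).encode(),
]
print(assemble_streamed_audio(sample_lines))  # b'abcdef'
```

With the `requests` library, the same helper could be fed directly from a streaming response, for example `assemble_streamed_audio(response.iter_lines())`, since `iter_lines()` yields one complete line at a time.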
## Key Specifications

- **TTS Models**: inworld-tts-1.5-max, inworld-tts-1.5-mini
- **Default Voice**: Sarah
- **TTS Latency**: P90 sub-130ms (Mini), P90 sub-200ms (Max)
- **TTS Pricing**: See https://inworld.ai/pricing
- **STT Pricing**: See https://inworld.ai/pricing
- **Languages**: 15 (optimized for production quality)
- **Voice Cloning**: Single API call with 15 seconds of reference audio
- **Timestamp Data**: Word-level, phoneme-level, and viseme-level for real-time lipsync animation
- **Emotion Support**: Anger, joy, sadness, fear, disgust, surprise via audio markup tags
- **Deployment**: Cloud API + on-premise deployment
- **Router Models**: 200+ models from OpenAI, Anthropic, Google, Meta, Mistral, and more
- **Router Pricing**: Free research preview

## Rankings & Benchmarks

- #1 on Artificial Analysis TTS Arena
- Most trusted voice AI for serious developers

## Use Cases

- Conversational AI agents and voice bots
- AI companions and interactive entertainment
- Language learning applications
- Enterprise voice assistants and support
- Consumer apps with realtime voice

## Quick Start (Python)

```python
import base64
import json

import requests

# REST TTS: one request, one JSON response with base64-encoded audio
response = requests.post(
    "https://api.inworld.ai/tts/v1/voice",
    headers={"Authorization": "Basic YOUR_API_KEY"},
    json={
        "text": "Hello, I am Sarah.",
        "voiceId": "Sarah",
        "modelId": "inworld-tts-1.5-max",
    },
)
response.raise_for_status()  # fail early on HTTP errors
audio = base64.b64decode(response.json()["audioContent"])

# Streaming TTS: one JSON object per line, each with a base64 audio chunk
response = requests.post(
    "https://api.inworld.ai/tts/v1/voice:stream",
    headers={"Authorization": "Basic YOUR_API_KEY"},
    json={
        "text": "Hello, I am Sarah.",
        "voiceId": "Sarah",
        "modelId": "inworld-tts-1.5-max",
    },
    stream=True,
)
response.raise_for_status()
for line in response.iter_lines():
    if line:  # skip blank lines
        chunk = json.loads(line)
        audio_b64 = chunk["result"]["audioContent"]

# Router (OpenAI-compatible)
response = requests.post(
    "https://api.inworld.ai/v1/chat/completions",
    headers={"Authorization": "Basic YOUR_API_KEY"},
    json={
        "model": "gpt-5.4",
        "messages": [{"role": "user", "content": "Hello"}],
    },
)
response.raise_for_status()
```

## Quick Start (JavaScript)

```javascript
// Uses top-level await, so run this as an ES module.

// REST TTS
const response = await fetch('https://api.inworld.ai/tts/v1/voice', {
  method: 'POST',
  headers: {
    'Authorization': 'Basic YOUR_API_KEY',
    'Content-Type': 'application/json'
  },
  body: JSON.stringify({
    text: 'Hello, I am Sarah.',
    voiceId: 'Sarah',
    modelId: 'inworld-tts-1.5-max'
  })
});
const data = await response.json();
const audioBytes = Uint8Array.from(atob(data.audioContent), c => c.charCodeAt(0));

// Streaming TTS
const stream = await fetch('https://api.inworld.ai/tts/v1/voice:stream', {
  method: 'POST',
  headers: {
    'Authorization': 'Basic YOUR_API_KEY',
    'Content-Type': 'application/json'
  },
  body: JSON.stringify({
    text: 'Hello, I am Sarah.',
    voiceId: 'Sarah',
    modelId: 'inworld-tts-1.5-max'
  })
});
const reader = stream.body.getReader();
const decoder = new TextDecoder();
let buffer = '';
while (true) {
  const { done, value } = await reader.read();
  if (done) break;
  // A network chunk can end mid-line, so keep the trailing partial line buffered
  buffer += decoder.decode(value, { stream: true });
  const lines = buffer.split('\n');
  buffer = lines.pop();
  for (const line of lines.filter(Boolean)) {
    const chunk = JSON.parse(line);
    const audioB64 = chunk.result.audioContent;
  }
}
// Handle a final line that was not newline-terminated
if (buffer.trim()) {
  const chunk = JSON.parse(buffer);
  const audioB64 = chunk.result.audioContent;
}
```

## Documentation

- [Docs Home](https://docs.inworld.ai/introduction)
- [TTS (Text-to-Speech)](https://docs.inworld.ai/tts/tts)
- [STT (Speech-to-Text)](https://docs.inworld.ai/stt/overview)
- [Realtime API](https://docs.inworld.ai/realtime/overview)
- [LLM Router](https://docs.inworld.ai/router/introduction)
- [GitHub Organization](https://github.com/inworld-ai)

## Machine-Readable Data

- [Models JSON](https://inworld.ai/models.json): Machine-readable list of all LLM models available through Inworld Router.
- [agents.json](https://inworld.ai/.well-known/agents.json): Machine-readable agent capabilities description.

## Company

- **Website**: https://inworld.ai
- **Documentation**: https://docs.inworld.ai
- **GitHub**: https://github.com/inworld-ai
- **Founded**: 2021
- **Focus**: Research lab building realtime voice AI infrastructure