Integrations
ElevenLabs
AI voice generation, text-to-speech, and speech-to-text. Add natural-sounding voice capabilities to your generated apps.
How it works
When you mention voice, text-to-speech, or audio generation in your app prompt, the AI Builder generates an ElevenLabs integration. This includes NestJS API endpoints that proxy requests to the ElevenLabs API, audio streaming handlers, and frontend components for playback, voice selection, and transcription.
Key features
Text-to-speech
Convert text to natural-sounding speech using ElevenLabs' AI voice models. The AI generates audio player components and streaming playback logic.
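As a rough sketch of what a backend proxy might send for a text-to-speech call, the helper below builds the request without performing it. The function name `buildTtsRequest`, the `TtsRequest` shape, and the default model ID are illustrative assumptions, not the code the AI Builder actually emits. The endpoint path and `xi-api-key` header follow ElevenLabs' public REST API as commonly documented, so check them against the current API reference.

```typescript
// Hypothetical request descriptor; not part of any generated project.
interface TtsRequest {
  url: string;
  init: { method: "POST"; headers: Record<string, string>; body: string };
}

// Builds (but does not send) a text-to-speech request.
// Assumption: modelId defaults to "eleven_multilingual_v2"; use whichever model your account supports.
function buildTtsRequest(
  apiKey: string,
  voiceId: string,
  text: string,
  modelId: string = "eleven_multilingual_v2",
): TtsRequest {
  if (!apiKey) throw new Error("ELEVENLABS_API_KEY is not set");
  if (!text.trim()) throw new Error("text must not be empty");
  return {
    // The voice ID is URL-encoded so unexpected characters cannot alter the path.
    url: `https://api.elevenlabs.io/v1/text-to-speech/${encodeURIComponent(voiceId)}`,
    init: {
      method: "POST",
      headers: { "xi-api-key": apiKey, "Content-Type": "application/json" },
      body: JSON.stringify({ text, model_id: modelId }),
    },
  };
}
```

A server-side handler could pass the result to `fetch(request.url, request.init)` and forward the audio bytes in the response body to the client, which keeps the API key off the frontend.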
Voice selection / cloning
Choose from a library of pre-built voices or clone a custom voice. Rytora BuildLabs generates voice picker UI and cloning upload flows.
Real-time streaming
Stream audio as it's generated for low-latency playback. The AI generates WebSocket-based streaming with chunked audio rendering.
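Chunked rendering can be pictured as a buffer that accepts audio chunks in arrival order and exposes the bytes received so far. The class below is a minimal sketch under that assumption. `AudioChunkBuffer` is a hypothetical name, and the wire format the generated WebSocket code uses is not specified here.

```typescript
// Hypothetical client-side buffer for audio chunks arriving over a stream.
class AudioChunkBuffer {
  private chunks: Uint8Array[] = [];
  private total = 0;

  // Append one chunk in arrival order; empty chunks are ignored.
  push(chunk: Uint8Array): void {
    if (chunk.length === 0) return;
    this.chunks.push(chunk);
    this.total += chunk.length;
  }

  // Number of audio bytes received so far.
  get byteLength(): number {
    return this.total;
  }

  // Concatenate all chunks into one contiguous byte array.
  toBytes(): Uint8Array {
    const out = new Uint8Array(this.total);
    let offset = 0;
    for (const chunk of this.chunks) {
      out.set(chunk, offset);
      offset += chunk.length;
    }
    return out;
  }
}
```

In a browser, a WebSocket with `binaryType = "arraybuffer"` could call `buffer.push(new Uint8Array(event.data))` from its `message` handler and hand the accumulated bytes to the audio player.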
Transcription
Convert speech to text with ElevenLabs' transcription API. The AI generates audio upload components with real-time transcription display.
Multi-language
Generate speech in multiple languages and accents. The AI produces language selector components with voice previews for each locale.
Setup
Enable the ElevenLabs connector in Project Settings → Connectors. Add your API key through the connector UI. Credentials are encrypted and automatically injected as environment variables.
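Because the key arrives as an environment variable, backend code can validate it once at startup. The sketch below assumes the variable is named `ELEVENLABS_API_KEY`, matching the connector field. `loadElevenLabsConfig` and `ElevenLabsConfig` are hypothetical names used only for illustration.

```typescript
// Hypothetical config shape; extend it if your app needs more settings.
interface ElevenLabsConfig {
  apiKey: string;
}

// Reads the injected API key from an environment-like object and fails fast if it is absent.
function loadElevenLabsConfig(env: Record<string, string | undefined>): ElevenLabsConfig {
  const apiKey = env["ELEVENLABS_API_KEY"]?.trim();
  if (!apiKey) {
    throw new Error("ELEVENLABS_API_KEY is missing; enable the ElevenLabs connector and add your API key");
  }
  return { apiKey };
}
```

In a Node.js backend this would typically be called as `loadElevenLabsConfig(process.env)`. Taking the environment as a parameter keeps the function easy to test.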
Connector fields
ELEVENLABS_API_KEY: xi-...
Connector permissions
Each tool can be set to Always allow, Ask each time, or Never allow:
- Generate speech
- Clone voice
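One way to picture the three permission levels is as a lookup from a tool's setting to the action taken when that tool is invoked. This is an illustrative model only. The identifiers (`Permission`, `Tool`, `decideToolCall`) and the string values are assumptions, not the connector's actual configuration format.

```typescript
// Hypothetical identifiers mirroring the settings and tools listed above.
type Permission = "always_allow" | "ask_each_time" | "never_allow";
type Tool = "generate_speech" | "clone_voice";
type Decision = "run" | "prompt" | "block";

// Maps each permission level to what happens when the tool is invoked.
const DECISIONS: Record<Permission, Decision> = {
  always_allow: "run",
  ask_each_time: "prompt",
  never_allow: "block",
};

// Looks up the configured permission for a tool and returns the resulting action.
function decideToolCall(policy: Record<Tool, Permission>, tool: Tool): Decision {
  return DECISIONS[policy[tool]];
}
```

For example, a policy of `{ generate_speech: "always_allow", clone_voice: "ask_each_time" }` would run speech generation immediately and prompt before any voice cloning.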
Pro plan required
ElevenLabs integration is available on the Pro plan.