ElevenLabs

AI voice generation, text-to-speech, and speech-to-text. Add natural-sounding voice capabilities to your generated apps.

How it works

When you mention voice, text-to-speech, or audio generation in your app prompt, the AI Builder generates an ElevenLabs integration. This includes NestJS API endpoints that proxy requests to the ElevenLabs API, audio streaming handlers, and frontend components for playback, voice selection, and transcription.
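
As a rough illustration of the proxy step, the sketch below builds the server-side request that a generated endpoint might forward to ElevenLabs. The endpoint path and the `xi-api-key` header follow ElevenLabs' public text-to-speech API. The names `buildSpeechRequest` and `ProxyRequest` and the default model ID are assumptions made for this example, not the code the AI Builder actually emits.

```typescript
// Hypothetical shape of the outbound request (illustration only).
interface ProxyRequest {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

// Builds the request a backend proxy could forward to ElevenLabs.
// The API key is only ever used on the server, never sent to the browser.
function buildSpeechRequest(
  apiKey: string,
  voiceId: string,
  text: string,
  modelId: string = "eleven_multilingual_v2", // assumed default model
): ProxyRequest {
  if (!apiKey) throw new Error("Missing ElevenLabs API key");
  if (!text.trim()) throw new Error("Text must not be empty");
  return {
    url: `https://api.elevenlabs.io/v1/text-to-speech/${encodeURIComponent(voiceId)}`,
    method: "POST",
    headers: { "xi-api-key": apiKey, "Content-Type": "application/json" },
    body: JSON.stringify({ text, model_id: modelId }),
  };
}
```

Keeping the request builder separate from the HTTP call makes it easy to unit-test, and a controller can pass the result to whatever HTTP client the backend uses.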

Key features

Text-to-speech

Convert text to natural-sounding speech using ElevenLabs' AI voice models. The AI generates audio player components and streaming playback logic.

Voice selection / cloning

Choose from a library of pre-built voices or clone a custom voice. Rytora BuildLabs generates voice picker UI and cloning upload flows.

Real-time streaming

Stream audio as it's generated for low-latency playback. The AI generates WebSocket-based streaming with chunked audio rendering.
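
One way to picture chunked audio rendering is a small buffer that collects chunks as they arrive and hands the bytes received so far to the player. The sketch below is a minimal version under that assumption. `AudioChunkBuffer` is a hypothetical name, and the WebSocket handling and the actual audio player are deliberately left out.

```typescript
// Hypothetical helper: collects streamed audio chunks until the player
// is ready to consume them.
class AudioChunkBuffer {
  private chunks: Uint8Array[] = [];
  private size = 0;

  // Called once per chunk received from the stream.
  push(chunk: Uint8Array): void {
    if (chunk.length === 0) return; // ignore empty keep-alive frames
    this.chunks.push(chunk);
    this.size += chunk.length;
  }

  // Number of bytes waiting to be played.
  bytesBuffered(): number {
    return this.size;
  }

  // Returns everything buffered so far as one contiguous array
  // and resets the buffer for the next batch of chunks.
  drain(): Uint8Array {
    const out = new Uint8Array(this.size);
    let offset = 0;
    for (const chunk of this.chunks) {
      out.set(chunk, offset);
      offset += chunk.length;
    }
    this.chunks = [];
    this.size = 0;
    return out;
  }
}
```

A stream handler would call `push` for every incoming message and `drain` whenever the player asks for more data, so playback can begin before the full clip has been generated.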

Transcription

Convert speech to text with ElevenLabs' transcription API. The AI generates audio upload components with real-time transcription display.

Multi-language

Generate speech in multiple languages and accents. The AI produces language selector components with voice previews for each locale.

Setup

Enable the ElevenLabs connector in Project Settings → Connectors, then add your API key through the connector UI. Credentials are encrypted and automatically injected into your app as environment variables.
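
Because the key arrives as an environment variable, backend code can read it once at startup and fail fast if it is missing. The sketch below shows one possible way to do that. `loadElevenLabsConfig` and `ElevenLabsConfig` are hypothetical names, and only the variable name `ELEVENLABS_API_KEY` comes from the connector itself.

```typescript
// Hypothetical config shape (illustration only).
interface ElevenLabsConfig {
  apiKey: string;
}

// Reads the injected key from an environment-like object.
// Taking `env` as a parameter keeps the function easy to test.
function loadElevenLabsConfig(
  env: Record<string, string | undefined>,
): ElevenLabsConfig {
  const apiKey = env["ELEVENLABS_API_KEY"]?.trim();
  if (!apiKey) {
    throw new Error(
      "ELEVENLABS_API_KEY is not set. Enable the ElevenLabs connector and add your API key.",
    );
  }
  return { apiKey };
}
```

In a Node.js backend this would typically be called as `loadElevenLabsConfig(process.env)`.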

Connector fields

ELEVENLABS_API_KEY: xi-... (Required)

Connector permissions

Each tool can be set to Always allow, Ask each time, or Never allow:

- Generate speech
- Clone voice
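
The three permission levels can be modeled as a simple lookup from a tool's setting to the action taken when that tool is invoked. The sketch below is an assumption about how such a policy might look. All identifiers (`Permission`, `Tool`, `Decision`, `decide`) are hypothetical and are not the connector's actual API.

```typescript
// Hypothetical identifiers mirroring the settings and tools listed above.
type Permission = "always_allow" | "ask_each_time" | "never_allow";
type Tool = "generate_speech" | "clone_voice";
type Decision = "run" | "prompt" | "block";

// Maps each permission level to what happens when the tool is called.
const DECISIONS: Record<Permission, Decision> = {
  always_allow: "run",
  ask_each_time: "prompt",
  never_allow: "block",
};

// Looks up the configured permission for a tool and returns the action.
function decide(permissions: Record<Tool, Permission>, tool: Tool): Decision {
  return DECISIONS[permissions[tool]];
}
```

For example, a project might allow speech generation without prompting while requiring confirmation before every voice clone, since cloning uploads voice samples.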

Pro plan required

The ElevenLabs integration is available on the Pro plan.