ElevenLabs Voice Cloning

Clone any voice from an audio sample and generate speech with that cloned voice using ElevenLabs' advanced AI voice cloning technology. Upload an audio file containing the voice you want to clone, provide text to convert to speech, and receive high-quality audio output in the cloned voice.

Note: Upload the audio file through the Asset API before cloning. Set promptObject.audioUrl to the path that the Asset API upload returns.

Supported Models

elevenlabs-voice-cloning: ElevenLabs Voice Cloning service with Eleven Flash v2.5

Parameters

Parameter	Type	Required	Description
`type`	string	Yes	Feature type, must be "VOICE_CLONING"
`model`	string	Yes	Model identifier, use "elevenlabs-voice-cloning"
`promptObject.audioUrl`	string	Yes	Path to audio file containing voice to clone (uploaded via Asset API)
`promptObject.text`	string	Yes	Text to convert to speech using the cloned voice
`promptObject.output_format`	string	No	Output audio format (default: "mp3_44100_128")
`promptObject.model_id`	string	No	ElevenLabs model ID (default: "eleven_flash_v2_5")
`promptObject.language_code`	string	No	Language code for the voice (default: "en")
`promptObject.remove_background_noise`	boolean	No	Remove background noise from source audio (default: false)
`promptObject.voice_settings.stability`	number	No	Voice stability (0.0-1.0, default: 0.5)
`promptObject.voice_settings.similarity_boost`	number	No	Voice similarity boost (0.0-1.0, default: 0.75)
`promptObject.voice_settings.style`	number	No	Voice style exaggeration (0.0-1.0, default: 0.0)
`promptObject.voice_settings.use_speaker_boost`	boolean	No	Use speaker boost for better clarity (default: true)
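The parameter table above can be sketched as a small request-body builder. This is a minimal sketch, not part of the API: the function name build_voice_cloning_payload is hypothetical, and the defaults are copied from the table.

```python
from typing import Any, Dict, Optional


def build_voice_cloning_payload(
    audio_url: str,
    text: str,
    output_format: str = "mp3_44100_128",
    model_id: str = "eleven_flash_v2_5",
    language_code: str = "en",
    remove_background_noise: bool = False,
    voice_settings: Optional[Dict[str, Any]] = None,
) -> Dict[str, Any]:
    """Assemble a request body from the documented parameters (hypothetical helper)."""
    # audioUrl and text are the two required promptObject fields.
    if not audio_url or not text:
        raise ValueError("audioUrl and text are required")

    # Documented defaults; caller-supplied values override them.
    settings: Dict[str, Any] = {
        "stability": 0.5,
        "similarity_boost": 0.75,
        "style": 0.0,
        "use_speaker_boost": True,
    }
    settings.update(voice_settings or {})

    return {
        "type": "VOICE_CLONING",
        "model": "elevenlabs-voice-cloning",
        "promptObject": {
            "audioUrl": audio_url,
            "text": text,
            "output_format": output_format,
            "model_id": model_id,
            "language_code": language_code,
            "remove_background_noise": remove_background_noise,
            "voice_settings": settings,
        },
    }
```

The returned dictionary can be passed as the JSON body of the POST request described under Endpoint.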

Endpoint

POST https://api.1min.ai/api/features

Request Headers

Field	Value
API-KEY	`<api-key>`
Content-Type	`application/json`

Supported Audio Formats

Input Formats (Voice Source)

MP3 - MPEG Audio Layer III
WAV - Waveform Audio File Format
M4A - MPEG-4 Audio
FLAC - Free Lossless Audio Codec
MP4 - MPEG-4 Part 14 (audio only)
WEBM - WebM Audio
OGG - Ogg Vorbis
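A source file can be checked against the input formats above before it is uploaded. The sketch below assumes that the file extension reflects the actual format. The name is_supported_input is hypothetical, and the API itself may validate files differently.

```python
from pathlib import PurePosixPath

# Extensions taken from the input format list above.
SUPPORTED_INPUT_EXTENSIONS = {".mp3", ".wav", ".m4a", ".flac", ".mp4", ".webm", ".ogg"}


def is_supported_input(path: str) -> bool:
    """Return True if the file extension matches a listed input format."""
    return PurePosixPath(path).suffix.lower() in SUPPORTED_INPUT_EXTENSIONS
```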

Output Formats

mp3_44100_128 - MP3, 44.1kHz, 128kbps (default)
mp3_44100_64 - MP3, 44.1kHz, 64kbps
mp3_44100_96 - MP3, 44.1kHz, 96kbps
mp3_44100_192 - MP3, 44.1kHz, 192kbps
mp3_22050_32 - MP3, 22.05kHz, 32kbps
pcm_16000 - PCM, 16kHz
pcm_22050 - PCM, 22.05kHz
pcm_24000 - PCM, 24kHz
pcm_44100 - PCM, 44.1kHz
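The output format identifiers above follow a codec_samplerate[_bitrate] pattern. As an illustration only, a hypothetical parse_output_format helper can split an identifier into its parts. It assumes every identifier follows the pattern seen in this list.

```python
from typing import NamedTuple, Optional


class OutputFormat(NamedTuple):
    codec: str
    sample_rate_hz: int
    bitrate_kbps: Optional[int]  # None for PCM, which lists no bitrate


def parse_output_format(value: str) -> OutputFormat:
    """Split an identifier such as 'mp3_44100_128' into codec, sample rate, and bitrate."""
    parts = value.split("_")
    if len(parts) == 3 and parts[0] == "mp3":
        return OutputFormat("mp3", int(parts[1]), int(parts[2]))
    if len(parts) == 2 and parts[0] == "pcm":
        return OutputFormat("pcm", int(parts[1]), None)
    raise ValueError(f"Unrecognized output format: {value!r}")
```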

Supported Models

ElevenLabs Voice Models

eleven_flash_v2_5 - Eleven Flash v2.5 (default, fastest)
eleven_turbo_v2_5 - Eleven Turbo v2.5 (balanced speed and quality)
eleven_multilingual_v2 - Eleven Multilingual v2 (supports multiple languages)

Language Support

The API supports voice cloning and text-to-speech conversion in multiple languages, with automatic language detection. To set the language explicitly, pass one of these codes as language_code (default: "en"):

en - English
es - Spanish
fr - French
de - German
it - Italian
pt - Portuguese
ru - Russian
ja - Japanese
ko - Korean
zh - Chinese
ar - Arabic
hi - Hindi

Other languages supported by ElevenLabs are also available.

Voice Settings Explained

Stability (0.0 - 1.0)

Low (0.0-0.3): More variable and expressive, but may be inconsistent
Medium (0.4-0.7): Balanced stability and expressiveness (recommended)
High (0.8-1.0): Very stable but may sound monotone

Similarity Boost (0.0 - 1.0)

Low (0.0-0.3): More creative interpretation of the voice
Medium (0.4-0.7): Balanced similarity to original voice
High (0.8-1.0): Maximum similarity to the source voice (recommended)

Style (0.0 - 1.0)

Low (0.0): Natural speech patterns (recommended for most use cases)
High (1.0): Exaggerated style and emotion

Speaker Boost

Enabled: Enhances speaker similarity and audio clarity (recommended)
Disabled: Standard processing without additional enhancement
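The ranges above can be checked on the client before a request is sent. This is a minimal sketch with a hypothetical validate_voice_settings helper. It only enforces the documented 0.0-1.0 bounds and the boolean type of use_speaker_boost. How the API treats out-of-range values is not stated here.

```python
from typing import Any, Dict

# Settings documented with a 0.0-1.0 range.
RANGED_KEYS = ("stability", "similarity_boost", "style")


def validate_voice_settings(settings: Dict[str, Any]) -> Dict[str, Any]:
    """Raise if a provided setting falls outside its documented range or type."""
    for key in RANGED_KEYS:
        if key not in settings:
            continue
        value = settings[key]
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            raise TypeError(f"{key} must be a number")
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{key} must be between 0.0 and 1.0, got {value}")
    boost = settings.get("use_speaker_boost")
    if boost is not None and not isinstance(boost, bool):
        raise TypeError("use_speaker_boost must be a boolean")
    return settings
```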

Code Example

The same request in cURL, JavaScript, and Python:

curl --location 'https://api.1min.ai/api/features' \
--header 'API-KEY: <api-key>' \
--header 'Content-Type: application/json' \
--data '{
"type": "VOICE_CLONING",
"model": "elevenlabs-voice-cloning",
"promptObject": {
  "audioUrl": "audios/2025_10_21_10_25_35_749_short.mp3",
  "text": "Hello, this is a test of voice cloning technology. The AI has learned to speak in my voice.",
  "output_format": "mp3_22050_32",
  "model_id": "eleven_flash_v2_5",
  "language_code": "en",
  "remove_background_noise": true,
  "voice_settings": {
    "stability": 0.5,
    "similarity_boost": 0.75,
    "style": 0.0,
    "use_speaker_boost": true
  }
}
}'

// Replace YOUR_API_KEY with your API key.
fetch('https://api.1min.ai/api/features', {
  method: 'POST',
  headers: {
    'Content-Type': 'application/json',
    'API-KEY': 'YOUR_API_KEY'
  },
  body: JSON.stringify({
    type: 'VOICE_CLONING',
    model: 'elevenlabs-voice-cloning',
    promptObject: {
      audioUrl: 'audios/2025_10_21_10_25_35_749_short.mp3',
      text: 'Hello, this is a test of voice cloning technology. The AI has learned to speak in my voice.',
      output_format: 'mp3_22050_32',
      model_id: 'eleven_flash_v2_5',
      language_code: 'en',
      remove_background_noise: true,
      voice_settings: {
        stability: 0.5,
        similarity_boost: 0.75,
        style: 0.0,
        use_speaker_boost: true
      }
    }
  })
})
  .then((response) => response.json())
  // The generated audio path is listed in aiRecord.aiRecordDetail.resultObject.
  .then((data) => console.log(data.aiRecord.aiRecordDetail.resultObject))
  .catch((error) => console.error(error));

import requests

url = "https://api.1min.ai/api/features"
headers = {
  "Content-Type": "application/json",
  "API-KEY": "YOUR_API_KEY"
}

data = {
  "type": "VOICE_CLONING",
  "model": "elevenlabs-voice-cloning",
  "promptObject": {
      "audioUrl": "audios/2025_10_21_10_25_35_749_short.mp3",
      "text": "Hello, this is a test of voice cloning technology. The AI has learned to speak in my voice.",
      "output_format": "mp3_22050_32",
      "model_id": "eleven_flash_v2_5",
      "language_code": "en",
      "remove_background_noise": True,
      "voice_settings": {
          "stability": 0.5,
          "similarity_boost": 0.75,
          "style": 0.0,
          "use_speaker_boost": True
      }
  }
}

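response = requests.post(url, headers=headers, json=data)
response.raise_for_status()  # Raise an exception on an HTTP error status

# The generated audio path is listed in aiRecord.aiRecordDetail.resultObject.
print(response.json()["aiRecord"]["aiRecordDetail"]["resultObject"])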

Interactive Playground

Generated cURL Command:

curl -X POST "https://api.1min.ai/api/features" \
  -H "API-KEY: <your-api-key>" \
  -H "Content-Type: application/json" \
  -d '{
  "type": "VOICE_CLONING",
  "model": "elevenlabs-voice-cloning",
  "promptObject": {
    "audioUrl": "audios/2025_10_21_10_25_35_749_short.mp3",
    "model_id": "eleven_flash_v2_5",
    "text": "Hello, this is a test of voice cloning technology.",
    "outputFormat": "mp3_44100_128",
    "remove_background_noise": true,
    "stability": 0.5,
    "similarity_boost": 0.75,
    "style": 0,
    "use_speaker_boost": true
  }
}'

Response Format

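The API returns a JSON record. The path of the generated audio file is listed in aiRecord.aiRecordDetail.resultObject and can be accessed via the Asset API. aiRecord.temporaryUrl is a pre-signed URL for downloading the file directly: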

{
  "aiRecord": {
    "uuid": "ab6fc10b-53f6-46d5-9c43-119723922138",
    "userId": "c937fbcc-fa8f-4565-a440-c4d87f56fcb2",
    "teamId": "a4e176b2-dabb-451e-9c58-62b451fa9630",
    "teamUser": {
      "teamId": "a4e176b2-dabb-451e-9c58-62b451fa9630",
      "userId": "c937fbcc-fa8f-4565-a440-c4d87f56fcb2",
      "userName": "John Doe",
      "userAvatar": "https://lh3.googleusercontent.com/a/ACg8ocLqgsNsHRfmWF9d-E1RvJetVsEzxNOsOg-NXWNTpMxLDPJbwELI=s96-c",
      "status": "ACTIVE",
      "role": "ADMIN",
      "creditLimit": 100000000,
      "usedCredit": 324208,
      "createdAt": "2025-10-20T04:13:40.847Z",
      "createdBy": "SYSTEM",
      "updatedAt": "2025-10-21T10:36:11.166Z",
      "updatedBy": "SYSTEM"
    },
    "model": "elevenlabs-voice-cloning",
    "type": "VOICE_CLONING",
    "metadata": null,
    "rating": null,
    "feedback": null,
    "conversationId": null,
    "status": "SUCCESS",
    "createdAt": "2025-10-21T10:39:31.287Z",
    "aiRecordDetail": {
      "promptObject": {
        "text": "Hello, this is a test of voice cloning technology.",
        "style": 0,
        "audioUrl": "audios/2025_10_21_10_25_35_749_short.mp3",
        "model_id": "eleven_flash_v2_5",
        "stability": 0.5,
        "outputFormat": "mp3_44100_128",
        "similarity_boost": 0.75,
        "use_speaker_boost": true,
        "remove_background_noise": true
      },
      "resultObject": [
        "development/audios/2025_10_21_17_39_38_149_155254.mp3"
      ],
      "responseObject": {}
    },
    "additionalData": null,
    "temporaryUrl": "https://s3.us-east-1.amazonaws.com/asset.1min.ai/development/audios/2025_10_21_17_39_38_149_155254.mp3?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Content-Sha256=UNSIGNED-PAYLOAD&X-Amz-Credential=AKIAVRUVQEFIHSKAXGE7%2F20251021%2Fus-east-1%2Fs3%2Faws4_request&X-Amz-Date=20251021T103939Z&X-Amz-Expires=604800&X-Amz-Signature=e89abbc8095df3d3a857cd7840ba96f0ef0559dcadf7fc11be5393dc5e8fc308&X-Amz-SignedHeaders=host&x-amz-checksum-mode=ENABLED&x-id=GetObject"
  }
}
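The fields of interest in the response above can be pulled out with a short helper. This is a sketch under assumptions: the name extract_result is hypothetical, it treats any status other than "SUCCESS" as a failure (the other status values are not documented here), and it reads the lifetime of the pre-signed URL from its X-Amz-Expires query parameter, which is 604800 seconds (7 days) in the sample.

```python
from typing import Any, Dict, List, Optional, Tuple
from urllib.parse import parse_qs, urlparse


def extract_result(response_json: Dict[str, Any]) -> Tuple[List[str], Optional[str], Optional[int]]:
    """Return (result paths, temporary URL, URL lifetime in seconds) from a response."""
    record = response_json["aiRecord"]
    # Assumption: anything other than SUCCESS means no usable audio was produced.
    if record.get("status") != "SUCCESS":
        raise RuntimeError(f"Voice cloning did not succeed: {record.get('status')}")

    paths = record["aiRecordDetail"]["resultObject"]
    temporary_url = record.get("temporaryUrl")

    expires_seconds = None
    if temporary_url:
        # Pre-signed URLs carry their lifetime in the X-Amz-Expires parameter.
        values = parse_qs(urlparse(temporary_url).query).get("X-Amz-Expires")
        if values:
            expires_seconds = int(values[0])
    return paths, temporary_url, expires_seconds
```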

Use Cases

Content Creation: Create consistent voiceovers for videos, podcasts, and presentations
Personalization: Generate personalized audio messages and notifications
Accessibility: Convert text to speech using familiar voices for better user experience
Entertainment: Create character voices for games, animations, and interactive media
Education: Develop educational content with consistent narrator voices
Marketing: Create brand-consistent audio content and advertisements
Audiobooks: Generate audiobook narration in specific voice styles
Voice Assistants: Build custom voice assistants with unique personality voices

Tips for Best Results

Quality Source Audio: Use clear, high-quality recordings with minimal background noise
Speaker Duration: Provide 10-30 seconds or more of the target voice for better cloning quality
Clean Audio: Enable remove_background_noise for recordings with background sounds
Single Speaker: Use audio samples containing only one speaker for best results
Natural Speech: Source audio should contain natural conversational speech patterns
File Size: Keep source audio files under 50MB for optimal processing speed
Voice Settings: Start with default settings and adjust based on your needs:
- High similarity_boost (0.7-0.9) for close voice matching
- Medium stability (0.4-0.7) for balanced expression
- Low style (0.0-0.2) for natural speech
Text Length: Break long texts into shorter segments for better quality
Pronunciation: The cloned voice will follow the pronunciation patterns from the source audio
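The Text Length tip above can be sketched as a sentence-based splitter. The split_text helper and its 500-character limit are assumptions for illustration, as this page gives no maximum text length. Each returned segment would be sent as the text of its own request.

```python
import re
from typing import List


def split_text(text: str, max_chars: int = 500) -> List[str]:
    """Group whole sentences into segments of at most max_chars characters.

    A single sentence longer than max_chars is kept whole rather than cut mid-sentence.
    """
    # Split after sentence-ending punctuation followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks: List[str] = []
    current = ""
    for sentence in sentences:
        if not sentence:
            continue
        candidate = f"{current} {sentence}".strip()
        if current and len(candidate) > max_chars:
            # Adding this sentence would exceed the limit, so start a new segment.
            chunks.append(current)
            current = sentence
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks
```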
