ElevenLabs Voice Cloning
Clone any voice from an audio sample and generate speech with that cloned voice using ElevenLabs' advanced AI voice cloning technology. Upload an audio file containing the voice you want to clone, provide text to convert to speech, and receive high-quality audio output in the cloned voice.
Note: Upload the audio file through the Asset API before requesting voice cloning. The audioUrl parameter must contain the path returned by that upload.
Supported Models
elevenlabs-voice-cloning: ElevenLabs Voice Cloning service with Eleven Flash v2.5
Parameters
| Parameter | Type | Required | Description | 
|---|---|---|---|
| type | string | Yes | Feature type, must be "VOICE_CLONING" | 
| model | string | Yes | Model identifier, use "elevenlabs-voice-cloning" | 
| promptObject.audioUrl | string | Yes | Path to audio file containing voice to clone (uploaded via Asset API) | 
| promptObject.text | string | Yes | Text to convert to speech using the cloned voice | 
| promptObject.output_format | string | No | Output audio format (default: "mp3_44100_128") | 
| promptObject.model_id | string | No | ElevenLabs model ID (default: "eleven_flash_v2_5") | 
| promptObject.language_code | string | No | Language code for the voice (default: "en") | 
| promptObject.remove_background_noise | boolean | No | Remove background noise from source audio (default: false) | 
| promptObject.voice_settings.stability | number | No | Voice stability (0.0-1.0, default: 0.5) | 
| promptObject.voice_settings.similarity_boost | number | No | Voice similarity boost (0.0-1.0, default: 0.75) | 
| promptObject.voice_settings.style | number | No | Voice style exaggeration (0.0-1.0, default: 0.0) | 
| promptObject.voice_settings.use_speaker_boost | boolean | No | Use speaker boost for better clarity (default: true) | 
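As a rough illustration of how the table maps onto a request body, the sketch below assembles the JSON payload in Python. The function name build_voice_cloning_payload is hypothetical and not part of any SDK. The defaults are copied from the table, and the nested voice_settings object is assumed from the dotted parameter names.

```python
from typing import Any, Optional


def build_voice_cloning_payload(
    audio_url: str,
    text: str,
    output_format: str = "mp3_44100_128",
    model_id: str = "eleven_flash_v2_5",
    language_code: str = "en",
    remove_background_noise: bool = False,
    voice_settings: Optional[dict[str, Any]] = None,
) -> dict[str, Any]:
    """Assemble a request body from the documented parameters (hypothetical helper)."""
    # Defaults copied from the parameter table; caller-supplied values override them.
    settings: dict[str, Any] = {
        "stability": 0.5,
        "similarity_boost": 0.75,
        "style": 0.0,
        "use_speaker_boost": True,
    }
    settings.update(voice_settings or {})
    return {
        "type": "VOICE_CLONING",
        "model": "elevenlabs-voice-cloning",
        "promptObject": {
            "audioUrl": audio_url,  # path returned by the Asset API upload
            "text": text,
            "output_format": output_format,
            "model_id": model_id,
            "language_code": language_code,
            "remove_background_noise": remove_background_noise,
            "voice_settings": settings,
        },
    }
```

The returned dictionary can be passed as the JSON body of the request, so only the two required values and any overrides need to be supplied by the caller.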
Endpoint
POST https://api.1min.ai/api/features
Request Headers
| Field | Value | 
|---|---|
| API-KEY | <api-key> | 
| Content-Type | application/json | 
Supported Audio Formats
Input Formats (Voice Source)
- MP3 - MPEG Audio Layer III
 - WAV - Waveform Audio File Format
 - M4A - MPEG-4 Audio
 - FLAC - Free Lossless Audio Codec
 - MP4 - MPEG-4 Part 14 (audio only)
 - WEBM - WebM Audio
 - OGG - Ogg Vorbis
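A client might check the source file type before uploading. The sketch below is one way to do that in Python. It assumes the format can be judged from the file extension alone, which the API may or may not do, and the helper name is_supported_source is hypothetical.

```python
from pathlib import PurePosixPath

# Extensions derived from the input format list above.
SUPPORTED_INPUT_EXTENSIONS = {".mp3", ".wav", ".m4a", ".flac", ".mp4", ".webm", ".ogg"}


def is_supported_source(audio_url: str) -> bool:
    """Return True if the path ends in one of the listed input extensions."""
    # Comparison is case-insensitive, so "voice.WAV" is accepted.
    return PurePosixPath(audio_url).suffix.lower() in SUPPORTED_INPUT_EXTENSIONS
```

This only inspects the file name. It does not verify that the file actually contains audio in that format.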
 
Output Formats
- mp3_44100_128 - MP3, 44.1kHz, 128kbps (default)
 - mp3_44100_64 - MP3, 44.1kHz, 64kbps
 - mp3_44100_96 - MP3, 44.1kHz, 96kbps
 - mp3_44100_192 - MP3, 44.1kHz, 192kbps
 - mp3_22050_32 - MP3, 22.05kHz, 32kbps
 - pcm_16000 - PCM, 16kHz
 - pcm_22050 - PCM, 22.05kHz
 - pcm_24000 - PCM, 24kHz
 - pcm_44100 - PCM, 44.1kHz
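The format identifiers follow a visible pattern: codec, then sample rate in Hz, then for MP3 a bitrate in kbps. The sketch below decodes an identifier under that assumption, for example to pick a file extension or configure a player. The names OutputFormat and parse_output_format are illustrative only.

```python
from typing import NamedTuple, Optional


class OutputFormat(NamedTuple):
    codec: str
    sample_rate_hz: int
    bitrate_kbps: Optional[int]  # None for PCM, which lists no bitrate


def parse_output_format(value: str) -> OutputFormat:
    """Split an identifier such as "mp3_44100_128" into its parts."""
    parts = value.split("_")
    if parts[0] == "mp3" and len(parts) == 3:
        return OutputFormat("mp3", int(parts[1]), int(parts[2]))
    if parts[0] == "pcm" and len(parts) == 2:
        return OutputFormat("pcm", int(parts[1]), None)
    raise ValueError(f"Unrecognized output format: {value!r}")
```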
Supported Models
ElevenLabs Voice Models
- eleven_flash_v2_5 - Eleven Flash v2.5 (default, fastest)
 - eleven_turbo_v2_5 - Eleven Turbo v2.5 (balanced speed and quality)
 - eleven_multilingual_v2 - Eleven Multilingual v2 (supports multiple languages)
Language Support
The API supports automatic language detection for voice cloning and text-to-speech conversion in multiple languages:
- en - English
 - es - Spanish
 - fr - French
 - de - German
 - it - Italian
 - pt - Portuguese
 - ru - Russian
 - ja - Japanese
 - ko - Korean
 - zh - Chinese
 - ar - Arabic
 - hi - Hindi
And many more languages supported by ElevenLabs.
Voice Settings Explained
Stability (0.0 - 1.0)
- Low (0.0-0.3): More variable and expressive, but may be inconsistent
 - Medium (0.4-0.7): Balanced stability and expressiveness (recommended)
 - High (0.8-1.0): Very stable but may sound monotone
 
Similarity Boost (0.0 - 1.0)
- Low (0.0-0.3): More creative interpretation of the voice
 - Medium (0.4-0.7): Balanced similarity to original voice
 - High (0.8-1.0): Maximum similarity to the source voice (recommended)
 
Style (0.0 - 1.0)
- Low (0.0): Natural speech patterns (recommended for most use cases)
 - High (1.0): Exaggerated style and emotion
 
Speaker Boost
- Enabled: Enhances speaker similarity and audio clarity (recommended)
 - Disabled: Standard processing without additional enhancement
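Since every numeric setting above shares the 0.0 to 1.0 range, a client can catch out-of-range values before sending a request. The following is a minimal sketch of such a check. The function check_voice_settings is hypothetical, and how the API itself responds to invalid values is not documented here.

```python
def check_voice_settings(settings: dict) -> list[str]:
    """Return a list of problems found; an empty list means the settings look valid."""
    problems = []
    for key in ("stability", "similarity_boost", "style"):
        if key not in settings:
            continue  # optional setting, the documented default applies
        value = settings[key]
        # bool is a subclass of int in Python, so reject it explicitly.
        is_number = isinstance(value, (int, float)) and not isinstance(value, bool)
        if not is_number or not 0.0 <= value <= 1.0:
            problems.append(f"{key} must be a number between 0.0 and 1.0, got {value!r}")
    if "use_speaker_boost" in settings and not isinstance(settings["use_speaker_boost"], bool):
        problems.append("use_speaker_boost must be true or false")
    return problems
```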
 
Code Example
cURL
curl --location 'https://api.1min.ai/api/features' \
--header 'API-KEY: <api-key>' \
--header 'Content-Type: application/json' \
--data '{
"type": "VOICE_CLONING",
"model": "elevenlabs-voice-cloning",
"promptObject": {
  "audioUrl": "audios/2025_10_21_10_25_35_749_short.mp3",
  "text": "Hello, this is a test of voice cloning technology. The AI has learned to speak in my voice.",
  "output_format": "mp3_22050_32",
  "model_id": "eleven_flash_v2_5",
  "language_code": "en",
  "remove_background_noise": true,
  "voice_settings": {
    "stability": 0.5,
    "similarity_boost": 0.75,
    "style": 0.0,
    "use_speaker_boost": true
  }
}
}'
JavaScript
fetch('https://api.1min.ai/api/features', {
method: 'POST',
headers: {
  'Content-Type': 'application/json',
  'API-KEY': 'YOUR_API_KEY'
},
body: JSON.stringify({
  type: 'VOICE_CLONING',
  model: 'elevenlabs-voice-cloning',
  promptObject: {
    audioUrl: 'audios/2025_10_21_10_25_35_749_short.mp3',
    text: 'Hello, this is a test of voice cloning technology. The AI has learned to speak in my voice.',
    output_format: 'mp3_22050_32',
    model_id: 'eleven_flash_v2_5',
    language_code: 'en',
    remove_background_noise: true,
    voice_settings: {
      stability: 0.5,
      similarity_boost: 0.75,
      style: 0.0,
      use_speaker_boost: true
    }
  }
})
});
Python
import requests
url = "https://api.1min.ai/api/features"
headers = {
  "Content-Type": "application/json",
  "API-KEY": "YOUR_API_KEY"
}
data = {
  "type": "VOICE_CLONING",
  "model": "elevenlabs-voice-cloning",
  "promptObject": {
      "audioUrl": "audios/2025_10_21_10_25_35_749_short.mp3",
      "text": "Hello, this is a test of voice cloning technology. The AI has learned to speak in my voice.",
      "output_format": "mp3_22050_32",
      "model_id": "eleven_flash_v2_5",
      "language_code": "en",
      "remove_background_noise": True,
      "voice_settings": {
          "stability": 0.5,
          "similarity_boost": 0.75,
          "style": 0.0,
          "use_speaker_boost": True
      }
  }
}
response = requests.post(url, headers=headers, json=data)
Interactive Playground
The API Playground on the documentation site lets you try the endpoint https://api.1min.ai/api/features in your browser. A cURL command generated by the playground:
curl -X POST "https://api.1min.ai/api/features" \
  -H "API-KEY: <your-api-key>" \
  -H "Content-Type: application/json" \
  -d '{
  "type": "VOICE_CLONING",
  "model": "elevenlabs-voice-cloning",
  "promptObject": {
    "audioUrl": "audios/2025_10_21_10_25_35_749_short.mp3",
    "model_id": "eleven_flash_v2_5",
    "text": "Hello, this is a test of voice cloning technology.",
    "outputFormat": "mp3_44100_128",
    "remove_background_noise": true,
    "stability": 0.5,
    "similarity_boost": 0.75,
    "style": 0,
    "use_speaker_boost": true
  }
}'
Response Format
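The API returns the path of the generated audio file in aiRecord.aiRecordDetail.resultObject, which can be accessed via the Asset API, along with a pre-signed download link in aiRecord.temporaryUrl: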
{
  "aiRecord": {
    "uuid": "ab6fc10b-53f6-46d5-9c43-119723922138",
    "userId": "c937fbcc-fa8f-4565-a440-c4d87f56fcb2",
    "teamId": "a4e176b2-dabb-451e-9c58-62b451fa9630",
    "teamUser": {
      "teamId": "a4e176b2-dabb-451e-9c58-62b451fa9630",
      "userId": "c937fbcc-fa8f-4565-a440-c4d87f56fcb2",
      "userName": "John Doe",
      "userAvatar": "https://lh3.googleusercontent.com/a/ACg8ocLqgsNsHRfmWF9d-E1RvJetVsEzxNOsOg-NXWNTpMxLDPJbwELI=s96-c",
      "status": "ACTIVE",
      "role": "ADMIN",
      "creditLimit": 100000000,
      "usedCredit": 324208,
      "createdAt": "2025-10-20T04:13:40.847Z",
      "createdBy": "SYSTEM",
      "updatedAt": "2025-10-21T10:36:11.166Z",
      "updatedBy": "SYSTEM"
    },
    "model": "elevenlabs-voice-cloning",
    "type": "VOICE_CLONING",
    "metadata": null,
    "rating": null,
    "feedback": null,
    "conversationId": null,
    "status": "SUCCESS",
    "createdAt": "2025-10-21T10:39:31.287Z",
    "aiRecordDetail": {
      "promptObject": {
        "text": "Hello, this is a test of voice cloning technology.",
        "style": 0,
        "audioUrl": "audios/2025_10_21_10_25_35_749_short.mp3",
        "model_id": "eleven_flash_v2_5",
        "stability": 0.5,
        "outputFormat": "mp3_44100_128",
        "similarity_boost": 0.75,
        "use_speaker_boost": true,
        "remove_background_noise": true
      },
      "resultObject": [
        "development/audios/2025_10_21_17_39_38_149_155254.mp3"
      ],
      "responseObject": {}
    },
    "additionalData": null,
    "temporaryUrl": "https://s3.us-east-1.amazonaws.com/asset.1min.ai/development/audios/2025_10_21_17_39_38_149_155254.mp3?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Content-Sha256=UNSIGNED-PAYLOAD&X-Amz-Credential=AKIAVRUVQEFIHSKAXGE7%2F20251021%2Fus-east-1%2Fs3%2Faws4_request&X-Amz-Date=20251021T103939Z&X-Amz-Expires=604800&X-Amz-Signature=e89abbc8095df3d3a857cd7840ba96f0ef0559dcadf7fc11be5393dc5e8fc308&X-Amz-SignedHeaders=host&x-amz-checksum-mode=ENABLED&x-id=GetObject"
  }
}
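To show how a client might read this response, the sketch below pulls out the audio path and the temporary URL. It assumes the field layout of the sample above and treats any status other than "SUCCESS" as a failure. Other status values are not documented here, and the helper name extract_audio is hypothetical.

```python
from typing import Any


def extract_audio(response_json: dict[str, Any]) -> tuple[str, str]:
    """Return (asset_path, temporary_url) from a voice cloning response."""
    record = response_json["aiRecord"]
    # Assumption: anything other than "SUCCESS" means no usable audio was produced.
    if record.get("status") != "SUCCESS":
        raise RuntimeError(f"Voice cloning did not succeed: status={record.get('status')!r}")
    results = record["aiRecordDetail"]["resultObject"]
    if not results:
        raise RuntimeError("Response contains no audio path in resultObject")
    return results[0], record["temporaryUrl"]
```

The returned URL can then be fetched with any HTTP client, for example `requests.get(temporary_url).content`, and the bytes written to a local file.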
Use Cases
- Content Creation: Create consistent voiceovers for videos, podcasts, and presentations
 - Personalization: Generate personalized audio messages and notifications
 - Accessibility: Convert text to speech using familiar voices for better user experience
 - Entertainment: Create character voices for games, animations, and interactive media
 - Education: Develop educational content with consistent narrator voices
 - Marketing: Create brand-consistent audio content and advertisements
 - Audiobooks: Generate audiobook narration in specific voice styles
 - Voice Assistants: Build custom voice assistants with unique personality voices
 
Tips for Best Results
- Quality Source Audio: Use clear, high-quality recordings with minimal background noise
 - Speaker Duration: Provide at least 10-30 seconds of the target voice for better cloning quality
 - Clean Audio: Enable remove_background_noise for recordings with background sounds
 - Single Speaker: Use audio samples containing only one speaker for best results
 - Natural Speech: Source audio should contain natural conversational speech patterns
 - File Size: Keep source audio files under 50MB for optimal processing speed
 - Voice Settings: Start with default settings and adjust based on your needs:
   - High similarity_boost (0.7-0.9) for close voice matching
   - Medium stability (0.4-0.7) for balanced expression
   - Low style (0.0-0.2) for natural speech
 - Text Length: Break long texts into shorter segments for better quality
 - Pronunciation: The cloned voice will follow the pronunciation patterns from the source audio
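The advice to break long texts into shorter segments could be implemented along the lines of the sketch below, which groups whole sentences into chunks. The 500-character limit is an arbitrary placeholder, not a documented API limit, and split_text is a hypothetical helper.

```python
import re


def split_text(text: str, max_chars: int = 500) -> list[str]:
    """Group sentences into chunks of up to max_chars characters each."""
    # Split after sentence-ending punctuation followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks: list[str] = []
    current = ""
    for sentence in sentences:
        if not sentence:
            continue
        candidate = f"{current} {sentence}".strip()
        if len(candidate) <= max_chars:
            current = candidate
        else:
            if current:
                chunks.append(current)
            # A single sentence longer than max_chars is kept whole, not cut mid-word.
            current = sentence
    if current:
        chunks.append(current)
    return chunks
```

Each chunk can be sent as the text of a separate request with the same audioUrl, and the resulting audio files joined afterwards.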
 
Error Handling
Common error scenarios and solutions:
- File not found: Ensure the source audio file was uploaded via Asset API first
 - Invalid audioUrl: Verify the path matches exactly what was returned from Asset API upload
 - Poor cloning quality: Try using cleaner source audio or adjusting voice settings
 - Voice creation failed: Check that the source audio contains clear speech from a single speaker
 - Text too long: Break long texts into smaller chunks for better processing
 
Rate Limits
- Voice cloning operations are resource-intensive and may have lower rate limits
 - Consider implementing queuing for bulk voice cloning operations
 - Monitor your usage to avoid hitting concurrent processing limits
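One simple way to queue bulk operations is to cap how many requests run at once. The sketch below does this with a thread pool from the Python standard library. The limit of 2 is a placeholder, since the actual concurrent processing limits are not stated here, and run_bounded is a hypothetical helper.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable, TypeVar

T = TypeVar("T")
R = TypeVar("R")


def run_bounded(
    jobs: Iterable[T],
    worker: Callable[[T], R],
    max_concurrent: int = 2,
) -> list[R]:
    """Run worker over jobs with at most max_concurrent calls in flight."""
    # The pool size caps concurrency; pool.map returns results in input order.
    with ThreadPoolExecutor(max_workers=max_concurrent) as pool:
        return list(pool.map(worker, jobs))
```

Here worker would be a function that sends one voice cloning request and returns its result. Remaining jobs wait in the pool's internal queue until a slot frees up.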