Telephony - Speech to Text
Convert telephony recordings to text with a model optimized for audio from telephone systems and communication networks. Typical sources include call center calls, PBX recordings, and other telecommunication infrastructure.
Note: Audio files must first be uploaded using the Asset API before transcription. The audioUrl parameter should contain the path returned from the Asset API upload.
Supported Models
telephony: Specialized for telephony audio with enhanced processing for telecommunication system characteristics
Endpoint
POST https://api.1min.ai/api/features
Request Headers
| Field | Value |
|---|---|
| API-KEY | <api-key> |
| Content-Type | application/json |
Supported Audio Formats
- MP3 - MPEG Audio Layer III
- WAV - Waveform Audio File Format (preferred for telephony)
- M4A - MPEG-4 Audio
- FLAC - Free Lossless Audio Codec
- MP4 - MPEG-4 Part 14 (audio only)
- WEBM - WebM Audio
- OGG - Ogg Vorbis
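A quick client-side check can catch an unsupported file type before you upload. The sketch below only compares the file extension against the list above; the helper name `is_supported_audio` is hypothetical and not part of the API, and a matching extension does not guarantee the service will accept the file.

```python
from pathlib import PurePosixPath

# Extensions taken from the "Supported Audio Formats" list above.
SUPPORTED_EXTENSIONS = {".mp3", ".wav", ".m4a", ".flac", ".mp4", ".webm", ".ogg"}


def is_supported_audio(path: str) -> bool:
    """Return True if the path ends in one of the listed extensions.

    Hypothetical helper: it inspects the name only, not the file contents.
    """
    return PurePosixPath(path).suffix.lower() in SUPPORTED_EXTENSIONS


if __name__ == "__main__":
    print(is_supported_audio("audios/2025_02_20_16_26_08_652_New_Recording.m4a"))
```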
Language Support
The API supports various languages including:
- en-US - English (US)
- en-GB - English (UK)
- vi-VN - Vietnamese
- es-ES - Spanish
- fr-FR - French
- de-DE - German
- it-IT - Italian
- pt-PT - Portuguese
- ru-RU - Russian
- ja-JP - Japanese
- ko-KR - Korean
- zh-CN - Chinese (Simplified)
- ar-SA - Arabic
Note: For a complete list of all supported languages and their language codes, please refer to the Google Cloud Speech-to-Text documentation.
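Language codes follow the `ll-CC` pattern (lowercase language, uppercase region). The sketch below normalizes a user-supplied code and checks it against the codes listed on this page. The function name is hypothetical, and because the list here is not exhaustive, rejecting unlisted codes is a conservative assumption rather than the API's actual behavior.

```python
# Codes and names copied from the list on this page (not the complete set).
LISTED_LANGUAGES = {
    "en-US": "English (US)",
    "en-GB": "English (UK)",
    "vi-VN": "Vietnamese",
    "es-ES": "Spanish",
    "fr-FR": "French",
    "de-DE": "German",
    "it-IT": "Italian",
    "pt-PT": "Portuguese",
    "ru-RU": "Russian",
    "ja-JP": "Japanese",
    "ko-KR": "Korean",
    "zh-CN": "Chinese (Simplified)",
    "ar-SA": "Arabic",
}


def normalize_language(code: str) -> str:
    """Return the code in canonical ll-CC form, or raise ValueError.

    Hypothetical helper: accepts inputs such as "en_us" or " EN-us ".
    """
    parts = code.strip().replace("_", "-").split("-")
    if len(parts) != 2:
        raise ValueError(f"expected a code like 'en-US', got {code!r}")
    canonical = f"{parts[0].lower()}-{parts[1].upper()}"
    if canonical not in LISTED_LANGUAGES:
        raise ValueError(f"{canonical} is not in the list on this page")
    return canonical
```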
Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| type | string | Yes | Feature type, must be "SPEECH_TO_TEXT" |
| model | string | Yes | Model identifier, use "telephony" |
| promptObject.audioUrl | string | Yes | Path to audio file (uploaded via Asset API) |
| promptObject.language | string | Yes | Language code for transcription (e.g., "en-US", "vi-VN") |
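The dotted names in the table (`promptObject.audioUrl`, `promptObject.language`) map to keys nested inside a `promptObject` object in the JSON body. A minimal sketch of assembling that body follows; `build_transcription_request` is a hypothetical helper name, and the empty-value checks are a client-side convenience, not documented API behavior.

```python
import json


def build_transcription_request(audio_url: str, language: str) -> dict:
    """Assemble the JSON body described in the parameters table.

    Hypothetical helper: audio_url must be the path returned by the
    Asset API upload, and language a code such as "en-US".
    """
    if not audio_url:
        raise ValueError("audio_url is required")
    if not language:
        raise ValueError("language is required")
    return {
        "type": "SPEECH_TO_TEXT",  # fixed value for this feature
        "model": "telephony",      # the only model listed on this page
        "promptObject": {
            "audioUrl": audio_url,
            "language": language,
        },
    }


if __name__ == "__main__":
    body = build_transcription_request(
        "audios/2025_02_20_16_26_08_652_New_Recording.m4a", "en-US"
    )
    print(json.dumps(body, indent=2))
```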
Code Examples
cURL:
curl --location 'https://api.1min.ai/api/features' \
  --header 'API-KEY: <api-key>' \
  --header 'Content-Type: application/json' \
  --data '{
    "type": "SPEECH_TO_TEXT",
    "model": "telephony",
    "promptObject": {
      "audioUrl": "audios/2025_02_20_16_26_08_652_New_Recording.m4a",
      "language": "en-US"
    }
  }'
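JavaScript:
// Submit the transcription request and log the parsed JSON response.
fetch('https://api.1min.ai/api/features', {
  method: 'POST',
  headers: {
    'Content-Type': 'application/json',
    'API-KEY': 'YOUR_API_KEY'
  },
  body: JSON.stringify({
    type: 'SPEECH_TO_TEXT',
    model: 'telephony',
    promptObject: {
      audioUrl: 'audios/2025_02_20_16_26_08_652_New_Recording.m4a',
      language: 'en-US'
    }
  })
})
  .then((response) => response.json())
  .then((result) => console.log(result))
  .catch((error) => console.error('Request failed:', error));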
Python:
import requests

url = "https://api.1min.ai/api/features"
headers = {
    "Content-Type": "application/json",
    "API-KEY": "YOUR_API_KEY"
}
data = {
    "type": "SPEECH_TO_TEXT",
    "model": "telephony",
    "promptObject": {
        # Path returned by the Asset API upload
        "audioUrl": "audios/2025_02_20_16_26_08_652_New_Recording.m4a",
        "language": "en-US"
    }
}

# Send the request and print the parsed JSON response
response = requests.post(url, headers=headers, json=data)
print(response.json())
Response Format
{
  "success": true,
  "data": {
    "transcription": "Welcome to our customer service line. Your call is important to us. Please hold while we connect you to the next available agent. Thank you for your patience.",
    "duration": "00:03:45",
    "language": "en-US",
    "confidence": 0.93,
    "telephony_metadata": {
      "call_quality": "standard",
      "codec": "G.711",
      "sample_rate": "8kHz"
    }
  }
}
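To work with the response in code, it can help to pull the fields into a typed structure. The sketch below assumes the response has exactly the shape shown above; the `Transcript` class and function names are hypothetical, and treating `telephony_metadata` as optional is an assumption.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Transcript:
    text: str
    duration_seconds: int
    language: str
    confidence: float
    codec: Optional[str]


def parse_duration(hms: str) -> int:
    """Convert an "HH:MM:SS" string, as in the sample response, to seconds."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def parse_transcription_response(body: dict) -> Transcript:
    """Extract the documented fields from a decoded response body."""
    data = body["data"]
    # Assumption: telephony_metadata may be absent, so default to None.
    metadata = data.get("telephony_metadata") or {}
    return Transcript(
        text=data["transcription"],
        duration_seconds=parse_duration(data["duration"]),
        language=data["language"],
        confidence=float(data["confidence"]),
        codec=metadata.get("codec"),
    )
```

With the sample response above, `duration_seconds` is 225 (3 minutes 45 seconds).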
Use Cases
- Call Center Operations: Transcribe customer service and support calls
- PBX Recordings: Convert private branch exchange system recordings
- VoIP Communications: Transcribe Voice over IP call recordings
- Telecommunication Analytics: Analyze call patterns and customer interactions
- Quality Assurance: Monitor and evaluate call center performance
- Compliance Recording: Document regulatory compliance calls
- Conference Bridge: Transcribe multi-party conference calls
- IVR Integration: Convert interactive voice response system interactions
Tips for Best Results
- Upload First: Use the Asset API to upload your telephony recording before transcription
- Telephony Quality: Works well with standard telephony audio quality (8kHz sampling)
- Format Compatibility: WAV format is preferred for telephony recordings
- Language Selection: Choose the correct language for optimal accuracy
- Call Quality: Standard telephony codecs (G.711, G.729) are well supported
- Processing Time: Longer or more complex calls take longer to process
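To confirm that a WAV recording matches the telephony profile mentioned in the tips (for example 8 kHz sampling), you can inspect its header locally before uploading. The sketch below uses Python's standard `wave` module; `describe_wav` is a hypothetical helper name. Note that `wave` reads only uncompressed PCM files, so a WAV container holding G.711 or other compressed audio raises `wave.Error` and would need a different tool.

```python
import wave


def describe_wav(path_or_file) -> dict:
    """Read basic header properties from a PCM WAV file.

    Accepts a file path or a binary file-like object, as wave.open does.
    """
    with wave.open(path_or_file, "rb") as wav:
        rate = wav.getframerate()
        frames = wav.getnframes()
        return {
            "sample_rate_hz": rate,
            "channels": wav.getnchannels(),
            "sample_width_bytes": wav.getsampwidth(),
            "duration_seconds": frames / rate if rate else 0.0,
        }


if __name__ == "__main__":
    import sys

    # Usage: python describe_wav.py recording.wav
    info = describe_wav(sys.argv[1])
    print(info)
    if info["sample_rate_hz"] != 8000:
        print("Note: sample rate differs from the 8 kHz telephony standard")
```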
Error Handling
Common error scenarios and solutions:
- File not found: Ensure the audio file was uploaded via the Asset API first
- Invalid audioUrl: Verify the path exactly matches the one returned by the Asset API upload
- Language not supported: Check that the language code is in the supported list
- Poor telephony quality: Very low-quality recordings may reduce transcription accuracy
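In client code, these failures are easiest to handle in one place. This page does not document the shape of an error response, so the sketch below rests on assumptions: that a failed call returns an HTTP status of 400 or above or `"success": false`, and that a human-readable explanation may appear under a `message` or `error` key. `TranscriptionError` and `check_response` are hypothetical names.

```python
class TranscriptionError(Exception):
    """Raised when the API reports that a transcription request failed."""


def check_response(status_code: int, body: dict) -> dict:
    """Return the "data" object on success, otherwise raise TranscriptionError.

    Assumption: error details, if any, live under "message" or "error".
    """
    if status_code >= 400 or not body.get("success", False):
        detail = body.get("message") or body.get("error") or "no details provided"
        raise TranscriptionError(f"HTTP {status_code}: {detail}")
    return body["data"]
```

With the `requests` library, this could be called as `check_response(response.status_code, response.json())`, and the caller can then match the raised message against the scenarios listed above.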